Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upRFC: CStr, the dereferenced complement to CString #592
Conversation
mzabaluev
added some commits
Jan 18, 2015
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
I think this could be added as a motivating usecase: // C library:
const char* f1(); // return value has static lifetime
void f2(const char*);/// Safe rust bindings:
fn f1() -> &'static CStr { ... }
fn f2(s: &CStr) { ... }
// Caller using only safe code:
f2(f1())If I also like the 'irrelevantly-sized type': my approach using a DST CStr would involve an unnecessary However, this proposal seems to be missing some details around the type:
|
This comment has been minimized.
This comment has been minimized.
|
@dgrunwald, thanks for your comments.
These address a special case: they are non-copying ways to get
In this proposal, I view unsafe fn c_str_from_raw_ptr<'a, T: ?Sized>(ptr: *const libc::c_char,
life_anchor: &'a T)
-> &'a CStr
Yes, and trivially. This, however, is not essential to the RFC and can be added later.
I'm hesitant about these, and I don't quite see the purpose of An alternative is to make |
mzabaluev
added some commits
Jan 19, 2015
This comment has been minimized.
This comment has been minimized.
|
How do you prevent instances of CStr being created on the stack without making it unsized? |
This comment has been minimized.
This comment has been minimized.
|
@Diggsey, you can't safely move a borrowed non- |
This comment has been minimized.
This comment has been minimized.
|
It still seems like there's potential for problems: for example, if DerefMut is ever implemented for CString (not unlikely, as it doesn't enforce UTF8 afaict) then you'll have to special-case "swap" to prevent it being used with CStr. You're allowing the programmer to bypass any "Sized" constraints on type parameters: for example, "sizeof" would give incorrect results when applied to CStr. What's really needed is an unsized type without fat references: maybe there should be an "Unsized" marker class, or a way to negatively implement the Sized trait for a type, which doesn't affect its in-memory representation. |
This comment has been minimized.
This comment has been minimized.
FWIW
Are there any examples of safe code demonstrating how this could be unintentionally misused? My understanding is that, as long as you can't safely initialize, mutate, copy, or move a value, its size should not be important. I also use this technique when given a public view into an object that has a private part allocated alongside: #[repr(C)]
pub struct Object {
raw: ffi::RawObject, // we know it's not all there is
no_copy: std::marker::NoCopy
}
impl Object {
// an example with a static ref, just for the sake of brevity
pub fn get_static() -> &'static Object {
unsafe {
let ptr = ffi::raw_object_get_static();
std::mem::copy_lifetime("", &*(ptr as *const Object))
}
}
}Being able to opt out of |
This comment has been minimized.
This comment has been minimized.
|
Example with safe code: let a: &'a mut CStr = foo.as_c_str();
let b: &'b mut CStr = bar.as_c_str();
mem::swap(a, b);This swaps the actual |
This comment has been minimized.
This comment has been minimized.
|
If we don't allow any method to create a mutable #[repr(C)]
pub struct CStringRef<'a> {
raw: core::nonzero::NonZero<*const libc::c_char>,
marker: std::marker::ContravariantLifetime<'a>,
}That is, instead of trying to use an If |
This comment has been minimized.
This comment has been minimized.
|
CStr is not the only reason to want unsized types with thin references: the PIMPL idiom in C++ requires a similar ability. You can get away with just not having a dereferenced type (like quantheory described), but it makes for some weird semantics: CStr and CString are both reference types, and yet CString dereferences to a CStr? Either that or you have to duplicate CStr functionality on a CString. |
This comment has been minimized.
This comment has been minimized.
That's not great for
Aside from the |
This comment has been minimized.
This comment has been minimized.
I don't think it's that strange, any more than
However, I think that this pretty much kills the idea. I don't know why I didn't think of that problem, since I've had the exact same issue with the |
This comment has been minimized.
This comment has been minimized.
|
Some ideation on the forum: http://discuss.rust-lang.org/t/unsized-types-distinct-from-dsts/1385 |
nrc
assigned
alexcrichton
Jan 22, 2015
This comment has been minimized.
This comment has been minimized.
|
Unsized types definitely the way to go for this. Though I think dynamic sized [lib::char] wrapper enforcing C invariant could be useful too. Your deallocation ideas should be subsumed by an allocator API, whenever we get it. |
mzabaluev
added some commits
Jan 30, 2015
mzabaluev
added some commits
Feb 9, 2015
This comment has been minimized.
This comment has been minimized.
|
I'm late to this party, but this is a great RFC! I love the idea of introducing a slice type for null-terminated data. I think the various motivations are well-covered in the RFC and this thread, and I buy them. This will really tighten up the That said, I agree with the original stance that @mzabaluev took that So that said, I'm interested in some means of exposing this type as unsized from the get-go. (Whereas the main proposal here exposes that |
This comment has been minimized.
This comment has been minimized.
|
@aturon, if |
This comment has been minimized.
This comment has been minimized.
I think only for unsafe code that depended on the exact representation. I will try to get some second opinions on this though, it's an important question.
I don't think so -- we want to stabilize |
This comment has been minimized.
This comment has been minimized.
|
I think in general using Rust types in FFI is pretty hazardous and I don't think either way that we should start defining FFI functions like: extern {
fn foo() -> &'static CStr;
}I would be ok changing the runtime representation of |
This comment has been minimized.
This comment has been minimized.
That's not an issue for me; I don't think any code that would e.g. transmute the reference into a slice could expect any guarantees on backwards compatibility. What's the general stance on binary compatibility in Rust 1.0 timeframe? Does anyone care? |
This comment has been minimized.
This comment has been minimized.
Completely off the table. ABI stability is going to be a very long road. |
This comment has been minimized.
This comment has been minimized.
|
@alexcrichton I don't suggest
|
This comment has been minimized.
This comment has been minimized.
|
@mzabaluev would you be ok with the |
This comment has been minimized.
This comment has been minimized.
|
@alexcrichton Yes, I'm convinced. I will update the RFC soon. |
This comment has been minimized.
This comment has been minimized.
|
I've got another opportunistic change to suggest. Having the conversion functions panic on inner NULs is a little... severe. How about |
This comment has been minimized.
This comment has been minimized.
|
@mzabaluev I would be in favor of such a change. We'll probably still panic in common cases when using |
This comment has been minimized.
This comment has been minimized.
|
I personally see the panic on inner nul as a contract of the |
This comment has been minimized.
This comment has been minimized.
|
@alexcrichton A potential problem is, it is very rare that a real-life string has NUL characters, and at the same time it is perfectly valid in the Rust world and in Unicode text in general. So developers may be blissfully forgetful of the panic until it hits their software in the field. DoS possibilities galore. |
This comment has been minimized.
This comment has been minimized.
|
I personally see the fact that it's so rare as a justification for panicking as it's otherwise that much more difficult to deal with for everyone using |
This comment has been minimized.
This comment has been minimized.
|
I posted a discuss thread on the topic of panicking here. |
This comment has been minimized.
This comment has been minimized.
|
I have updated the proposal to @aturon's suggestions, adding a disclaimer that the inner slice is not guaranteed to have any particular size. This way we can still avoid the |
mzabaluev
added a commit
to mzabaluev/rust-c-str
that referenced
this pull request
Feb 13, 2015
aturon
referenced this pull request
Feb 18, 2015
Closed
Tracking issue for CStr, the dereferenced complement to CString (RFC 592) #22469
aturon
merged commit e4b047c
into
rust-lang:master
Feb 18, 2015
This comment has been minimized.
This comment has been minimized.
|
@mzabaluev Thanks again for this RFC! After discussion on this thread, on the discuss forum, and amongst the team, it seems that almost everyone is in favor of this change. We have decided to move forward, and try to land these revisions ahead of alpha2 if possible. |
mzabaluev commentedJan 18, 2015
Change the
Dereftarget ofCStringto a token typeCStr.Add helper functions and macros to produce
CStrreferences out of static string data.Rendered