add CStr::with_ptr and deprecate CStr::as_ptr #1642
durka referenced this pull request Jun 6, 2016 (Closed: move `temporary_cstring_as_ptr` from clippy to rustc #34111)
durka force-pushed the durka:cstr-with-ptr branch 2 times, most recently from 8172b05 to 07783d1 Jun 6, 2016
durka force-pushed the durka:cstr-with-ptr branch from 07783d1 to d90c45c Jun 6, 2016
BurntSushi added the T-libs label Jun 6, 2016
Is every use of Are there any uses of |
No. If the entire expression is part of a function argument, then the current rustc makes sure that the drop call happens after the function returns. This is necessary for taking the reference to an owned value. It gets problematic with pointers, because they are just values. I'll do some optimization experiments when I'm back on a PC |
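A minimal sketch of the drop-order claim above, under the assumption that a logging type can stand in for `CString`. `Noisy`, `callee`, and the two helper functions are hypothetical names made up for illustration. The temporary used inside a call argument is dropped only after the call returns, while a temporary in a `let` initializer is dropped at the end of that `let` statement, before the pointer is ever used.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for CString: records when it is dropped.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Noisy<'a> {
    // Returns a raw pointer that carries no lifetime, like CStr::as_ptr.
    fn as_ptr(&self) -> *const u8 {
        self as *const Noisy<'a> as *const u8
    }
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

// Hypothetical callee; it never dereferences the pointer.
fn callee(_p: *const u8, log: &RefCell<Vec<&'static str>>) {
    log.borrow_mut().push("callee");
}

// The temporary lives until the end of the call statement.
fn temporary_in_argument() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    callee(Noisy { log: &log }.as_ptr(), &log);
    log.borrow_mut().push("after call");
    log.into_inner()
}

// The temporary is dropped at the end of the `let`, so `p` dangles.
fn temporary_in_let() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let p = Noisy { log: &log }.as_ptr();
    callee(p, &log);
    log.into_inner()
}

fn main() {
    println!("{:?}", temporary_in_argument()); // ["callee", "drop", "after call"]
    println!("{:?}", temporary_in_let()); // ["drop", "callee"]
}
```

This only observes drop order; it says nothing about what an optimizer may do with a pointer that is actually dereferenced after the drop.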
No. If the call occurs in an expression, the
The Drawbacks section shows how to emulate

    fn get_ptr(s: &CString) -> *const c_char {
        let mut ptr = ptr::null();
        s.with_ptr(|p| { ptr = p; });
        ptr
    }
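For readers without the RFC text at hand, here is a rough sketch of what a closure-based `with_ptr` could look like. It is written as a hypothetical extension trait (`WithPtr`), because `with_ptr` is only proposed here and is not a standard library method; the exact signature in the RFC may differ. `count_bytes` is a made-up stand-in for a real FFI function.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical extension trait approximating the proposed method.
trait WithPtr {
    fn with_ptr<R, F: FnOnce(*const c_char) -> R>(&self, f: F) -> R;
}

impl WithPtr for CStr {
    fn with_ptr<R, F: FnOnce(*const c_char) -> R>(&self, f: F) -> R {
        // `self` is borrowed for the whole call, so the pointer is valid inside `f`.
        f(self.as_ptr())
    }
}

// Stand-in for an FFI function that only reads the string during the call.
fn count_bytes(p: *const c_char) -> usize {
    // Sound only while `p` points at a live, NUL-terminated buffer.
    unsafe { CStr::from_ptr(p).to_bytes().len() }
}

fn main() {
    // The temporary CString outlives the closure call, so this is fine.
    let n = CString::new("hello").unwrap().with_ptr(count_bytes);
    println!("{}", n);
}
```

Nothing stops a caller from returning the pointer out of the closure, which is exactly the `get_ptr` escape hatch shown above.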
From my watchtower, using the pointer returned by |
@nagisa I would say changing drop order in expressions is a breaking change. |
Not if llvm already does that after optimizations... Which we should definitely figure out |
My example prints the same in release configuration. |
If you have two Another alternative is "to mark as_ptr with unsafe", of course with introducing a new name and deprecating the old one. |
I guess we could add a function that works on sequences as well. Is working with slices or Vecs of CStrings a common use case?
The community has been very reluctant to use |
Well... the true way to do this is to add a (future miri-based) const fn that does the Maybe we could create a method that freezes the
That's a test, not a guarantee that UB doesn't occur. Especially since a use after free might not even get noticed in such a small example (the memory might not be used again due to other allocations having requirements placing them elsewhere). Have a look at this example: https://is.gd/AEiXxM Activating release mode and creating llvm IR shows the following fun IR (first block of the main function):
I forgot what combo caused it, but I've had
at some point or another. |
Has anybody thought of improving the docs for
WARNING: make sure that the

    let c_to_print = CString::new("Hello, world!").unwrap().as_ptr();

dereferencing
Use this instead:

    let hello_world = CString::new("Hello, world!").unwrap();
    let c_to_print = hello_world.as_ptr();

I would have done this myself, but I am not sure about my English :(
I'd be very interested to see how you get an undef there. null seems to be correct as the Docs are a part of the solution, but since all the examples show correct use of |
Ah, apparently this occurs, if the function doesn't use the argument: https://is.gd/dKLVoH |
durka added some commits Jun 9, 2016
One also wonders whether |
both |
That's a very good point. So those functions are less prone to misuse. |
bluss commented Jun 10, 2016
From experience in our user help channels, this feels like by far the most common use after free bug that users stumble into. Sometimes they find crashes and come ask because of it, sometimes the issue is found latently. It would be swell if a lint could catch this well, and very much worth it to introduce lints that help users to write sound Rust code in practice. |
starkat99 commented Jun 15, 2016
This gets really nasty if you have to work with multiple |
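To make the multi-string case concrete, here is a sketch of one common pattern with the existing `as_ptr`: keep the owning `Vec<CString>` alive in a named binding and build a separate NULL-terminated pointer array from it. `total_len` is a hypothetical stand-in for a C function taking `char**`, and `call_with_args` is a made-up helper name.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

// Stand-in for a C function that walks a NULL-terminated `char**`.
fn total_len(argv: *const *const c_char) -> usize {
    let mut total = 0;
    let mut i = 0;
    // Sound only while every pointer in the array is still valid.
    unsafe {
        loop {
            let p = *argv.add(i);
            if p.is_null() {
                break;
            }
            total += CStr::from_ptr(p).to_bytes().len();
            i += 1;
        }
    }
    total
}

fn call_with_args(args: &[&str]) -> usize {
    // The owners live in a named binding for the whole function body.
    // `unwrap` panics if an argument contains an interior NUL byte.
    let owned: Vec<CString> = args.iter().map(|a| CString::new(*a).unwrap()).collect();
    // The pointer array borrows from `owned` only by convention, not by type.
    let mut ptrs: Vec<*const c_char> = owned.iter().map(|c| c.as_ptr()).collect();
    ptrs.push(ptr::null());
    total_len(ptrs.as_ptr())
}

fn main() {
    println!("{}", call_with_args(&["ls", "-la", "/tmp"]));
}
```

With a closure-based API the same call would need one nested closure per string, or a helper that accepts a whole slice.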
@starkat99 I've added a simple macro to my example for that use case. |
kylewlacy commented Jun 15, 2016
IIUC, there was a time in Rust's history (long before I started using it) where
What about instead using a newtype over

    struct CCharRef<'a> {
        ptr: *const c_char,
        _phantom: PhantomData<&'a [c_char]>
    }

    impl<'a> Deref for CCharRef<'a> {
        type Target = *const c_char;
        fn deref(&self) -> &*const c_char {
            &self.ptr
        }
    }

    impl CStr {
        // ...
        #[allow(deprecated)]
        fn as_ref(&self) -> CCharRef {
            CCharRef {
                ptr: self.as_ptr(),
                _phantom: PhantomData
            }
        }
    }

Then, the following would be an error:

    let s = CString::new(...).unwrap().as_ref();
    //      ^~~~~~~~~~~~~~~~~~~~~~~~~~
    //      error: borrowed value does not live long enough

(Note that this would still allow
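The snippet above cannot be compiled outside the standard library, since it adds an inherent method to `CStr`. A standalone approximation of the same idea, assuming a free constructor `CCharRef::new` and an explicit `as_ptr` accessor instead of `Deref` (both are choices made for this sketch, not part of the proposal), might look like this. `byte_len` is a hypothetical stand-in for an FFI call.

```rust
use std::ffi::{CStr, CString};
use std::marker::PhantomData;
use std::os::raw::c_char;

// A raw pointer tagged with the lifetime of the string it came from.
struct CCharRef<'a> {
    ptr: *const c_char,
    _owner: PhantomData<&'a CStr>,
}

impl<'a> CCharRef<'a> {
    fn new(s: &'a CStr) -> Self {
        CCharRef { ptr: s.as_ptr(), _owner: PhantomData }
    }

    // Escape hatch: the returned pointer no longer carries the lifetime.
    fn as_ptr(&self) -> *const c_char {
        self.ptr
    }
}

// Stand-in for an FFI function that reads the string during the call.
fn byte_len(p: *const c_char) -> usize {
    unsafe { CStr::from_ptr(p).to_bytes().len() }
}

fn main() {
    let owner = CString::new("hello").unwrap();
    let r = CCharRef::new(&owner); // `r` cannot outlive `owner`
    println!("{}", byte_len(r.as_ptr()));

    // Rejected by the borrow checker, because the temporary CString is
    // dropped at the end of the `let` while `bad` is still used afterwards:
    // let bad = CCharRef::new(&CString::new("oops").unwrap());
    // byte_len(bad.as_ptr());
}
```

The wrapper moves the mistake from run time to compile time only as long as callers hold on to the `CCharRef` rather than the bare pointer.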
bors added a commit to rust-lang/rust that referenced this pull request Jun 20, 2016
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 20, 2016
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jun 21, 2016
matklad referenced this pull request Jun 24, 2016 (Closed: Implement `CString::new("hello").unwrap().as_ptr()` inspection #488)
There are docs now (thanks @matklad!), but I still think this is a good idea. Any other opinions? It's been quiet for a while. FCP time? |
I think the addition of The |
iliekturtles commented Jul 13, 2016
If accepted I would like to see |
-1 to removing I would very much like to see a definitive answer to this unresolved question:
Coming from C++, I would expect this to be valid, as temporaries there aren't destroyed until the end of the full expression (after |
The libs team discussed this briefly at the triage meeting yesterday. We felt that there definitely is a real problem here in terms of misusage of This may be a case where the motivation may want to be fleshed out a little more before the detailed design is tackled. For example what exact cases are we targeted at solving and are we willing to compromise cases unrelated to this? (things like that) |
aturon self-assigned this Jul 25, 2016
Here is another example of people writing unsound code, even though there is a giant warning in the documentation. |
Here is a more tricky example of this issue (for which An insane idea: What if |
Yeah, slices of strings have been brought up already as a clear case where Your insane idea doesn't work unless CString also always leaks its
The point is not to make this memory safe, but to make this fail at runtime with an easy to google reason. We obviously can't add runtime range checks, but overwriting the buffer with some garbage will almost certainly lead to a loud failure, which is much better than silent UB. And perhaps we can "leak" CStrings in debug mode? We can allocate them from some kind of free list, such that deallocated CStrings are not immediately overwritten by random garbage, and instead are overwritten by our deterministic garbage :)
Ah, looks like I can't simply

    #[cfg(debug_assertions)]
    impl Drop for CString {
        fn drop(&mut self) {
            let pattern = b"X_X DEAD MEMORY ";
            // Overwrite everything except the trailing NUL terminator.
            let bytes = &mut self.inner[..self.inner.len() - 1];
            for (d, s) in bytes.iter_mut().zip(pattern.iter().cycle()) {
                *d = *s;
            }
        }
    }

because the standard library is always built with
Adding a
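The poisoning idea can be tried outside the standard library with a wrapper type. The sketch below assumes a hypothetical `PoisonOnDrop` type that owns its own NUL-terminated buffer; it is not how `CString` is implemented, and it skips the interior-NUL check that `CString::new` performs. The `poison` helper is split out so that the fill pattern can be checked without reading freed memory.

```rust
use std::os::raw::c_char;

const PATTERN: &[u8] = b"X_X DEAD MEMORY ";

// Fill `bytes` with the repeating marker pattern.
fn poison(bytes: &mut [u8]) {
    for (d, s) in bytes.iter_mut().zip(PATTERN.iter().cycle()) {
        *d = *s;
    }
}

// Hypothetical C-string owner that scribbles over its buffer when dropped.
struct PoisonOnDrop {
    bytes: Vec<u8>, // always ends with a single NUL byte
}

impl PoisonOnDrop {
    // Assumption for brevity: `s` contains no interior NUL bytes.
    fn new(s: &str) -> Self {
        let mut bytes = s.as_bytes().to_vec();
        bytes.push(0);
        PoisonOnDrop { bytes }
    }

    fn as_ptr(&self) -> *const c_char {
        self.bytes.as_ptr() as *const c_char
    }
}

impl Drop for PoisonOnDrop {
    fn drop(&mut self) {
        // Keep the trailing NUL so a stale reader still stops at the end.
        let n = self.bytes.len() - 1;
        poison(&mut self.bytes[..n]);
    }
}

fn main() {
    let s = PoisonOnDrop::new("hello");
    println!("{:?}", s.as_ptr());

    let mut buf = [0u8; 20];
    poison(&mut buf);
    println!("{}", String::from_utf8_lossy(&buf));
}
```

One caveat with this approach: an optimizer may treat the writes just before deallocation as dead stores and remove them, so the pattern is not guaranteed to survive in optimized builds.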
arielb1 reviewed Sep 4, 2016
which might be convenient but would make it easier to "leak" the pointer (as easy as
`let ptr = s.with_ptr(|p| p);`).

- Does `f(CString::new(...).unwrap().as_ptr())` actually invoke undefined behavior, if `f` doesn't store the pointer? The author's reading of the Rust reference implies that the `CString` temporary is kept alive for the entire expression, so it's fine. However, some commenters in the RFC thread have opined that the behavior of this code is unspecified at best.
arielb1 commented Sep 4, 2016 (Contributor)
Sure. Temporaries stay alive at least up to the nearest arena (see nikomatsakis/rust-memory-model#17), and function calls do not create these.
bors added a commit to rust-lang/rust that referenced this pull request Sep 13, 2016
I'm going to close this as there seems to be significant resistance to adding friction to the |
durka closed this Oct 10, 2016
goertzenator commented Feb 1, 2017
Wouldn't implementing

    use std::ffi::{CString, NulError};
    use std::os::raw::c_char;

    fn ffi_function(_hello: *const c_char) {}

    // minimal wrapper for CString so we can add traits
    struct MyCString(CString);

    impl MyCString {
        fn new<T: Into<Vec<u8>>>(t: T) -> Result<MyCString, NulError> {
            CString::new(t).map(MyCString)
        }
    }

    impl AsRef<c_char> for MyCString {
        fn as_ref(&self) -> &c_char {
            unsafe { &*self.0.as_ptr() }
        }
    }

    fn main() {
        { // as_ptr() undefined behavior case
            let p = CString::new("some_string").unwrap().as_ptr();
            ffi_function(p);
        }
        { // as_ptr() proper use case
            let s = CString::new("some_string").unwrap();
            ffi_function(s.as_ptr());
        }
        { // as_ref() undefined behavior case, but borrow checker catches it (does not compile)
            let r = MyCString::new("some_string").unwrap().as_ref();
            ffi_function(r);
        }
        { // as_ref() proper use case, reference coerces to pointer
            let s = MyCString::new("some_string").unwrap();
            ffi_function(s.as_ref());
        }
    }
Ideally, |
BatmanAoD commented Jan 17, 2019
I don't quite understand why this issue is closed. Doesn't this invalidate Rust's guarantee that safe-rust cannot have undefined behavior, unless there is an error in an |
BatmanAoD commented Jan 17, 2019
...upon reflection, it seems that the unsafety is in the passing of a pointer by-value to a function. I realize this would be a breaking change, but perhaps the only way to truly uphold that guarantee would be to require any function that takes a raw pointer as a parameter to be marked |
It's only unsafe to dereference raw pointers, not pass them around. |
BatmanAoD commented Jan 17, 2019
@sfackler Right, but there can be a safe public function That said, when I wrote that comment I was under the impression (based on this old comment) that there existed an
Yes, that's correct. You can do things with raw pointers that don't involve dereferencing them, though. This function consumes a raw pointer and is totally safe to call with any value:

    fn is_pointer_even(x: *mut u8) -> bool {
        x as usize % 2 == 0
    }

    fn main() {
        is_pointer_even(1 as *mut u8);
    }
BatmanAoD commented Jan 17, 2019
Ah. You're right, of course, but that doesn't seem....particularly useful. In any case, there aren't any non- |
That's correct. |
durka commented Jun 6, 2016
Spawned from rust-lang/rust#34111 (cc @oli-obk).
Rendered