Rollup merge of rust-lang#94984 - ericseppanen:cstr_from_bytes, r=Mark-Simulacrum

add `CStr` method that accepts any slice containing a nul-terminated string

I haven't created an issue (tracking or otherwise) for this yet; apologies if my approach isn't correct. This is my first code contribution.

This change adds a member fn that converts a slice into a `CStr`; it is intended to be safer than `from_ptr` (which is unsafe and may read out of bounds), and more useful than `from_bytes_with_nul` (which requires that the caller already know where the nul byte is).

The reason I find this useful is for situations like this:
```rust
let mut buffer = [0u8; 32];
unsafe {
    some_c_function(buffer.as_mut_ptr(), buffer.len());
}
let result = CStr::from_bytes_with_nul(&buffer).unwrap();
```

The code above returns an error with `kind = InteriorNul`, because `from_bytes_with_nul` expects the caller to pass a slice whose only NUL byte is its last byte. But if I just got back a nul-terminated string from some FFI function, I probably don't know where the NUL byte is.
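A minimal sketch of the failure described above. `fill_buffer` is a hypothetical stand-in for the C function (it is not part of this change); it writes a short nul-terminated string at the start of a larger zeroed buffer.

```rust
use std::ffi::CStr;

/// Hypothetical stand-in for the FFI call: writes "hi\0" at the start of `buf`.
fn fill_buffer(buf: &mut [u8]) {
    let data = b"hi\0";
    buf[..data.len()].copy_from_slice(data);
}

fn main() {
    let mut buffer = [0u8; 32];
    fill_buffer(&mut buffer);

    // The first nul is at index 2, not at the end of the 32-byte slice,
    // so converting the whole buffer is rejected.
    assert!(CStr::from_bytes_with_nul(&buffer).is_err());

    // The conversion only succeeds if the caller already knows where the nul is.
    let c_str = CStr::from_bytes_with_nul(&buffer[..3]).unwrap();
    assert_eq!(c_str.to_bytes(), b"hi");
}
```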

I would like a `CStr` constructor with the following properties:
- Accept `&[u8]` as input
- Scan for the first NUL byte and return the `CStr` that spans the correct sub-slice (see [future note below](rust-lang#94984 (comment))).
- Return an error if no NUL byte is found within the input slice
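The three properties listed above can be sketched with existing stable APIs. The helper name `cstr_until_first_nul` is hypothetical, and the sketch returns `Option` rather than a dedicated error type; it is an illustration of the wished-for behavior, not the implementation in this change.

```rust
use std::ffi::CStr;

/// Hypothetical helper: returns the `CStr` ending at the first nul byte
/// in `bytes`, or `None` if the slice contains no nul byte.
fn cstr_until_first_nul(bytes: &[u8]) -> Option<&CStr> {
    // Scan for the first nul byte; bail out with `None` if there is none.
    let nul_pos = bytes.iter().position(|&b| b == 0)?;
    // The sub-slice ends at the first nul, so it contains exactly one nul,
    // in the final position, which is what `from_bytes_with_nul` requires.
    CStr::from_bytes_with_nul(&bytes[..=nul_pos]).ok()
}

fn main() {
    let found = cstr_until_first_nul(b"hello\0world!").unwrap();
    assert_eq!(found.to_bytes(), b"hello");
    assert!(cstr_until_first_nul(b"hello").is_none());
}
```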

I asked on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/CStr.20from.20.26.5Bu8.5D.20without.20knowing.20the.20NUL.20location.3F) whether this sounded like a good idea, and got a couple of positive-sounding responses from `@joshtriplett` and `@AzureMarker`.

This is my first draft, so feedback is welcome.

A few issues that definitely need feedback:

1. Naming. `@joshtriplett` called this `from_bytes_with_internal_nul` on Zulip, but after staring at all of the available methods, I believe that this function is probably what end users want (rather than the existing fn `from_bytes_with_nul`). Giving it a simpler name (**`from_bytes`**) implies that this should be their first choice.
2. Should I add a similar method on `CString` that accepts `Vec<u8>`? I'd assume the answer is probably yes, but I figured I'd try to get early feedback before making this change bigger.
3. What should the error type look like? I made a unit struct since `CStr::from_bytes` can only fail in one obvious way, but if I need to do this for `CString` as well then that one may want to return `FromVecWithNulError`. And maybe that should dictate the shape of the `CStr` error type also?
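For question 2, an owned counterpart could look roughly like the sketch below. It is only an illustration built on the existing `CString::from_vec_with_nul`; the helper name `cstring_until_first_nul` and the `Option` return type are assumptions, not something this change adds.

```rust
use std::ffi::CString;

/// Hypothetical helper: truncates `bytes` just after its first nul byte and
/// converts the result into a `CString`, or returns `None` if no nul is found.
fn cstring_until_first_nul(mut bytes: Vec<u8>) -> Option<CString> {
    let nul_pos = bytes.iter().position(|&b| b == 0)?;
    // Drop everything after the first nul, keeping the nul itself.
    bytes.truncate(nul_pos + 1);
    // The vector now has a single nul in the last position.
    CString::from_vec_with_nul(bytes).ok()
}

fn main() {
    let owned = cstring_until_first_nul(b"hello\0world!".to_vec()).unwrap();
    assert_eq!(owned.as_bytes(), b"hello");
    assert!(cstring_until_first_nul(b"hello".to_vec()).is_none());
}
```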

Also, cc `@poliorcetics` who wrote rust-lang#73139 containing similar fns.
Dylan-DPC committed Mar 19, 2022
2 parents c437730 + d5fe4ca commit 79c935a
Showing 2 changed files with 106 additions and 0 deletions.
69 changes: 69 additions & 0 deletions library/std/src/ffi/c_str.rs
@@ -328,6 +328,27 @@ impl FromVecWithNulError {
    }
}

/// An error indicating that no nul byte was present.
///
/// A slice used to create a [`CStr`] must contain a nul byte somewhere
/// within the slice.
///
/// This error is created by the [`CStr::from_bytes_until_nul`] method.
///
#[derive(Clone, PartialEq, Eq, Debug)]
#[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
pub struct FromBytesUntilNulError(());

#[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
impl Error for FromBytesUntilNulError {}

#[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
impl fmt::Display for FromBytesUntilNulError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "data provided does not contain a nul")
    }
}

/// An error indicating invalid UTF-8 when converting a [`CString`] into a [`String`].
///
/// `CString` is just a wrapper over a buffer of bytes with a nul terminator;
@@ -1239,12 +1260,60 @@ impl CStr {
        }
    }

    /// Creates a C string wrapper from a byte slice.
    ///
    /// This method will create a `CStr` from any byte slice that contains at
    /// least one nul byte. The caller does not need to know or specify where
    /// the nul byte is located.
    ///
    /// If the first byte is a nul character, this method will return an
    /// empty `CStr`. If multiple nul characters are present, the `CStr` will
    /// end at the first one.
    ///
    /// If the slice only has a single nul byte at the end, this method is
    /// equivalent to [`CStr::from_bytes_with_nul`].
    ///
    /// # Examples
    /// ```
    /// #![feature(cstr_from_bytes_until_nul)]
    ///
    /// use std::ffi::CStr;
    ///
    /// let mut buffer = [0u8; 16];
    /// unsafe {
    ///     // Here we might call an unsafe C function that writes a string
    ///     // into the buffer.
    ///     let buf_ptr = buffer.as_mut_ptr();
    ///     buf_ptr.write_bytes(b'A', 8);
    /// }
    /// // Attempt to extract a C nul-terminated string from the buffer.
    /// let c_str = CStr::from_bytes_until_nul(&buffer[..]).unwrap();
    /// assert_eq!(c_str.to_str().unwrap(), "AAAAAAAA");
    /// ```
    ///
    #[unstable(feature = "cstr_from_bytes_until_nul", issue = "95027")]
    pub fn from_bytes_until_nul(bytes: &[u8]) -> Result<&CStr, FromBytesUntilNulError> {
        let nul_pos = memchr::memchr(0, bytes);
        match nul_pos {
            Some(nul_pos) => {
                // SAFETY: We know there is a nul byte at nul_pos, so this slice
                // (ending at the nul byte) is a well-formed C string.
                let subslice = &bytes[..nul_pos + 1];
                Ok(unsafe { CStr::from_bytes_with_nul_unchecked(subslice) })
            }
            None => Err(FromBytesUntilNulError(())),
        }
    }

    /// Creates a C string wrapper from a byte slice.
    ///
    /// This function will cast the provided `bytes` to a `CStr`
    /// wrapper after ensuring that the byte slice is nul-terminated
    /// and does not contain any interior nul bytes.
    ///
    /// If the nul byte may not be at the end,
    /// [`CStr::from_bytes_until_nul`] can be used instead.
    ///
    /// # Examples
    ///
    /// ```
37 changes: 37 additions & 0 deletions library/std/src/ffi/c_str/tests.rs
@@ -117,6 +117,43 @@ fn from_bytes_with_nul_interior() {
    assert!(cstr.is_err());
}

#[test]
fn cstr_from_bytes_until_nul() {
    // Test an empty slice. This should fail because it
    // does not contain a nul byte.
    let b = b"";
    assert_eq!(CStr::from_bytes_until_nul(&b[..]), Err(FromBytesUntilNulError(())));

    // Test a non-empty slice, that does not contain a nul byte.
    let b = b"hello";
    assert_eq!(CStr::from_bytes_until_nul(&b[..]), Err(FromBytesUntilNulError(())));

    // Test an empty nul-terminated string
    let b = b"\0";
    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
    assert_eq!(r.to_bytes(), b"");

    // Test a slice with the nul byte in the middle
    let b = b"hello\0world!";
    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
    assert_eq!(r.to_bytes(), b"hello");

    // Test a slice with the nul byte at the end
    let b = b"hello\0";
    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
    assert_eq!(r.to_bytes(), b"hello");

    // Test a slice with two nul bytes at the end
    let b = b"hello\0\0";
    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
    assert_eq!(r.to_bytes(), b"hello");

    // Test a slice containing lots of nul bytes
    let b = b"\0\0\0\0";
    let r = CStr::from_bytes_until_nul(&b[..]).unwrap();
    assert_eq!(r.to_bytes(), b"");
}

#[test]
fn into_boxed() {
    let orig: &[u8] = b"Hello, world!\0";
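
For illustration, a caller of the method added in this diff might handle both outcomes roughly as follows. This sketch assumes a toolchain where `CStr::from_bytes_until_nul` is available without a feature gate (in this commit it is still unstable behind `cstr_from_bytes_until_nul`); the helper `buffer_to_string` is hypothetical.

```rust
use std::ffi::CStr;

/// Hypothetical helper: extracts the nul-terminated prefix of `buffer` as an
/// owned `String`, reporting a missing nul byte as an error message.
fn buffer_to_string(buffer: &[u8]) -> Result<String, String> {
    // Fails with `FromBytesUntilNulError` if `buffer` has no nul byte at all.
    let c_str = CStr::from_bytes_until_nul(buffer).map_err(|e| e.to_string())?;
    // Replace any invalid UTF-8 rather than failing a second time.
    Ok(c_str.to_string_lossy().into_owned())
}

fn main() {
    // Bytes after the first nul are ignored.
    assert_eq!(buffer_to_string(b"hello\0world!"), Ok("hello".to_string()));
    // A slice with no nul byte is an error, not an out-of-bounds read.
    assert!(buffer_to_string(b"hello").is_err());
}
```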
