Potential unsoundness in constructing Box<str> via raw pointers #137987

Closed
DiuDiu777 opened this issue Mar 4, 2025 · 1 comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues.

Comments

@DiuDiu777
Contributor

I observed that in the following code snippet, although it does not violate the current documented safety requirements of Box::from_raw, it ultimately triggers two instances of Undefined Behavior (UB):

  • Invalid memory layout during deallocation (when Drop)
  • Usage of invalid UTF-8 bytes (if print)

Although the cast to *mut str seems to be the root cause, it requires no unsafe block and compiles successfully. The UB appears only once the user constructs a Box<str> with Box::from_raw.

Does this suggest that the current safety documentation for Box::from_raw is incomplete? Should we explicitly mandate that the caller must ensure the memory contains valid UTF-8 when constructing Box<str>?

#![feature(box_into_boxed_slice)]
fn main() {
    let invalid_bytes = Box::new([0xFFu8, 0xFE, 0xFD]);
    let boxed_slice = Box::into_boxed_slice(invalid_bytes);
    let raw_slice = Box::into_raw(boxed_slice) as *mut str;
    let _boxed_str = unsafe {
        Box::from_raw(raw_slice)
    };
    // println!("{:?}", _boxed_str);
}

Miri will detect:

error: Undefined Behavior: incorrect layout on deallocation: alloc942 has size 3 and alignment 1, but gave size 1 and alignment 1.
@rustbot added the needs-triage label on Mar 4, 2025
@bjorn3
Member

bjorn3 commented Mar 4, 2025

Usage of invalid UTF-8 bytes (if print)

This UB is documented. The from_raw documentation says: "The safety conditions are described in the memory layout section." And the memory layout section says: "On top of these basic layout requirements, a Box must point to a valid value of T."

Which is not true in this case and thus you are violating the safety requirements of from_raw.
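
As an illustration of meeting that requirement, here is a minimal sketch that validates the bytes before the raw-pointer round trip. The helper name boxed_str_from_bytes is hypothetical and not part of the standard library; the sketch assumes the input is already a Box<[u8]>, so the length metadata matches the allocation.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: turns a boxed byte slice into a `Box<str>`,
/// rejecting input that is not valid UTF-8.
fn boxed_str_from_bytes(bytes: Box<[u8]>) -> Result<Box<str>, Utf8Error> {
    // Check validity first; on failure the box is dropped normally.
    std::str::from_utf8(&bytes)?;
    let raw = Box::into_raw(bytes) as *mut str;
    // SAFETY: the bytes were just validated as UTF-8, and the pointer comes
    // from a Box<[u8]>, so its length metadata equals the allocation size.
    Ok(unsafe { Box::from_raw(raw) })
}

fn main() {
    // The bytes from the report are rejected instead of becoming a Box<str>.
    let invalid: Box<[u8]> = Box::new([0xFFu8, 0xFE, 0xFD]);
    assert!(boxed_str_from_bytes(invalid).is_err());

    // Valid UTF-8 passes through unchanged.
    let valid: Box<[u8]> = Box::new(*b"abc");
    let s = boxed_str_from_bytes(valid).expect("ASCII is valid UTF-8");
    assert_eq!(&*s, "abc");
}
```

If raw pointers are not otherwise needed, String::from_utf8(bytes.into_vec()) followed by into_boxed_str() performs the same check without any unsafe code.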

Invalid memory layout during deallocation (when Drop)

This is because of the cast. Box::into_boxed_slice wraps the whole array as a single element, so boxed_slice is of type Box<[[u8; 3]]> and has length 1. It is not of type Box<[u8]> with length 3. When you cast the raw pointer to *mut str, it inherits length 1, so the resulting Box<str> describes a 1-byte allocation while the real one is 3 bytes. If you want to get from Box::new([0xFFu8, 0xFE, 0xFD]) to *mut str, the correct way is Box::into_raw(invalid_bytes as Box<[u8]>) as *mut str.
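
The length difference can be observed on stable Rust without triggering UB, because the *mut str is only inspected and never dereferenced. The following is a sketch under a few assumptions: the helper names are hypothetical, the one-element Box<[[u8; 3]]> is built directly to stand in for the result of the unstable Box::into_boxed_slice, and the len method on raw slice pointers requires Rust 1.79 or later.

```rust
/// Hypothetical helper: reads the length metadata of a `*mut str`
/// without dereferencing it (raw slice `len` is stable since Rust 1.79).
fn str_ptr_len(p: *mut str) -> usize {
    // Casts between slice-like raw pointers keep the metadata unchanged.
    (p as *mut [u8]).len()
}

/// Mirrors the report: a slice holding ONE element of type [u8; 3].
fn len_via_array_wrapping() -> usize {
    let one_array: Box<[[u8; 3]]> = Box::new([[0xFFu8, 0xFE, 0xFD]]);
    let raw_arrays: *mut [[u8; 3]] = Box::into_raw(one_array);
    let len = str_ptr_len(raw_arrays as *mut str);
    // SAFETY: reclaimed with the exact type it was created from.
    drop(unsafe { Box::from_raw(raw_arrays) });
    len
}

/// The suggested fix: unsize Box<[u8; 3]> to Box<[u8]> before the cast.
fn len_via_unsizing() -> usize {
    let bytes = Box::new([0xFFu8, 0xFE, 0xFD]) as Box<[u8]>;
    let raw_bytes: *mut [u8] = Box::into_raw(bytes);
    let len = str_ptr_len(raw_bytes as *mut str);
    // SAFETY: reclaimed with the exact type it was created from.
    drop(unsafe { Box::from_raw(raw_bytes) });
    len
}

fn main() {
    println!("array wrapped as one element: len = {}", len_via_array_wrapping());
    println!("unsized to a byte slice:      len = {}", len_via_unsizing());
}
```

The first length is 1 and the second is 3, which matches the sizes in the Miri message: the allocation holds 3 bytes, but a Box<str> built from the first pointer would deallocate with size 1.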

@bjorn3 added the C-discussion label on Mar 4, 2025
@fee1-dead closed this as not planned on Mar 4, 2025
@saethlin removed the needs-triage label on Mar 8, 2025