Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add as_ptr and as_mut_ptr inherent method to String #97483

Closed
wants to merge 1 commit into from

Conversation

Nilstrieb
Copy link
Member

Before, they went through &str and &mut str, which created intermediary references, shrinking provenance to only the initialized parts. Vec<T> already has such inherent methods added in #61114.

beef ran into this in beef#47.

The docs are mostly copied from Vec::{as_ptr, as_mut_ptr}, adapting the examples to something better fitting. The implementation simply forwards to the vec methods.

I'm not entirely sure what the correct stability attributes are for this, but I think it needs to be insta-stable like the vec methods?

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 28, 2022
@rust-highfive
Copy link
Collaborator

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs to request review from a libs-api team reviewer. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 28, 2022
@Nilstrieb
Copy link
Member Author

r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs

@rustbot rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 28, 2022
@rustbot rustbot removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 28, 2022
Before, they went through `&str` and `&mut str`, which created
intermediary references, shrinking provenance to only the initialized
parts. `Vec<T>` already has such inherent methods added in rust-lang#61114.
@Nilstrieb
Copy link
Member Author

Nilstrieb commented May 28, 2022

And I just found another case where this would have been helpful, compact_str#100

@Nilstrieb
Copy link
Member Author

@RalfJung you might be interested as well, since you added the method to Vec<T>

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. However, since this changes the libs API surface, I cannot approve it -- let's wait for the @rust-lang/libs-api team.

/// Modifying the string may cause its buffer to be reallocated,
/// which would also make any pointers to it invalid.
///
/// The caller must also ensure that the memory the pointer (non-transitively) points to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The caller must also ensure that the memory the pointer (non-transitively) points to
/// The caller must also ensure that the memory the pointer points to

We know it points to u8 so we don't need the transitivity comment.

/// which would also make any pointers to it invalid.
///
/// The caller must also ensure that the memory the pointer (non-transitively) points to
/// is never written to (except inside an `UnsafeCell`) using this pointer or any pointer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// is never written to (except inside an `UnsafeCell`) using this pointer or any pointer
/// is never written to using this pointer or any pointer

There is no UnsafeCell here.

@yaahc
Copy link
Member

yaahc commented Jun 8, 2022

I'm not entirely sure what the correct stability attributes are for this, but I think it needs to be insta-stable like the vec methods?

I'd definitely prefer if we could merge this unstably, though I'm not sure exactly what that would entail. If it's not much work though I think I'd prefer to gate this change on that functionality since it will help a lot in similar situations going forward. @rust-lang/compiler do y'all know how hard it would be to make it so we could merge this unstably without causing breakage?

My guess is we'd need to change method resolution to try all non-nightly methods for all deref targets first before going back and considering resolving the method call with the nightly method which doesn't seem tooooo hard (she says naively). This also feels vaguely related to rust-lang/rfcs#3240

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 3, 2022
@Dylan-DPC Dylan-DPC added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 27, 2022
@joshtriplett joshtriplett added S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Jul 27, 2022
@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting. Without suggesting that we have any consensus on the implied provenance handling here, we don't have objections to merging this unstably for now. However, since it doesn't sound like there's a way to do that yet, marking this as blocked.

@joshtriplett
Copy link
Member

(Elaborating on the "implied provenance handling": these and the underlying &str/&mut str methods would return the same values here, and questions about provenance handling differences between them have not been settled.)

@saethlin
Copy link
Member

saethlin commented Jul 27, 2022

I think the PR description understates the impact that this has on the API. I'm hoping that this helps explain all the other implications of implementing as_ptr and as_mut_ptr without going through slice Deref:

fn main() {
    let mut s: Vec<u8> = Vec::with_capacity(10);
    s.extend(b"hello");
    let ptr = s.as_ptr(); // This pointer stays valid...
    let other = s.as_mut_ptr(); // Across as as_mut_ptr()
    let v = s; // Across a move of the original container
    unsafe {
        *other = b'b'; // And across a write to an aliasing pointer
        dbg!(*ptr);
    }

    let mut s = String::with_capacity(10);
    s.push_str("hello");
    let ptr = s.as_ptr(); // This pointer...
    let other = s.as_mut_ptr(); // Is invalidated here (though this case might be fixed, UCG 113)
    let v = s; // But wouldn't be invalidated here (this invalidates other with field retagging)
    unsafe {
        *other = b'b'; // We can't even get a pointer to do this without invalidating `ptr`, and we
                       // can't do this write by casting `ptr` because it doesn't have write
                       // permission.
        dbg!(*ptr);
    }
}

@yaahc
Copy link
Member

yaahc commented Jul 28, 2022

@Nilstrieb what happens when you apply an #[unstable] attribute to these methods? Is it just that it emits an unstable name collision warning lint?

@Nilstrieb
Copy link
Member Author

error: an associated function with this name may be added to the standard library in the future
    --> compiler/rustc_codegen_llvm/src/debuginfo/metadata.rs:1499:25
     |
1499 |             vtable_name.as_ptr().cast(),
     |                         ^^^^^^
     |
     = warning: once this associated item is added to the standard library, the ambiguity may cause an error or change in behavior!
     = note: for more information, see issue #48919 <https://github.com/rust-lang/rust/issues/48919>
     = help: call with fully qualified syntax `bitflags::core::str::<impl str>::as_ptr(...)` to keep using the current method
     = help: add `#![feature(string_as_ptr)]` to the crate attributes to enable `std::string::String::as_ptr`

@davidtwco
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting. Without suggesting that we have any consensus on the implied provenance handling here, we don't have objections to merging this unstably for now. However, since it doesn't sound like there's a way to do that yet, marking this as blocked.

#99898 should enable this to be unblocked :)

@RalfJung
Copy link
Member

(Elaborating on the "implied provenance handling": these and the underlying &str/&mut str methods would return the same values here, and questions about provenance handling differences between them have not been settled.)

(Nit: in terms of the Rust specification / abstract machine, they would not return the same values here, but values that differ in provenance. They just erase to the same bit pattern at runtime.)

@RalfJung
Copy link
Member

As a possible argument not to have these methods, Tree Borrows makes this pretty much unnecessary -- the code in #106593 is accepted with Tree Borrows.

@Nilstrieb
Copy link
Member Author

Nilstrieb commented Apr 26, 2023

It's probably also fair to say that this is something we want to be allowed (at least I would want that :))

@RalfJung
Copy link
Member

Now I worry about whether we can remove these inherent methods from Vec again...

@Nilstrieb
Copy link
Member Author

That is probably lost hope. I guess we can say that they now exist because they are a tiny little less code than the deref path, making compilation faster :D

@Nilstrieb Nilstrieb closed this Apr 26, 2023
@Nilstrieb Nilstrieb deleted the string_as_mut_ptr branch April 26, 2023 19:36
@RalfJung
Copy link
Member

RalfJung commented Jul 19, 2023

Hm, maybe we should open this again, or at least track this in an issue? There are still examples which are UB according to Tree Borrows that we might want to allow:

fn main() {
    let mut s = String::with_capacity(10);
    s.push('x');
    let ptr = s.as_mut_ptr();
    unsafe { ptr.write(0x20) };
    let _ptr2 = s.as_mut_ptr(); // creates a slice that aliases with `ptr`.
    // That is considered like a read access, and since `ptr` is derived from
    // a mutable reference such a foreign read access invalidates `ptr`.
    unsafe { ptr.write(97) }; // UB
    println!("{s:?}");
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet