RFC: RArrow Dereference for Pointer Ergonomics #3577
An alternative is to make a postfix dereference operator ( |
This is a good alternative. With .* even more expressions become ergonomic than with RArrow. In particular long ones that end with a dereference without a field or method access. As far as I can see there are no grammar ambiguities that prevent this either. Most important is that some kind of non-prefix dereference operator exists. React with either:
Clarification Question: Are you suggesting |
Yes,
Then maybe |
With one less |
When would you have a place followed by another place in an expression or statement? |
Never, it would not be a grammar ambiguity. It would be a readability ambiguity. It could also be confused with |
Well, now we're into the realm of opinions I suppose.
Also I don't know C++ so I've got no idea what that |
The RArrow operator doesn't have this problem. Perhaps this is a reason to have both postfix |
@Lokathor If you think there are too many dots I'd prefer (indeed as C++ is mentioned, Also, while I think it's not a concern in practice, the following is valid Rust today:
fn main() {
    dbg!(5.*-6.0);
}
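To make the parsing concern concrete, here is a minimal sketch of how current Rust reads the snippet above: `5.` lexes as a float literal, so the whole expression is an ordinary multiplication by `-6.0`. The function name `product` is only illustrative.

```rust
// `5.` is a float literal with an empty fractional part, so the tokens are
// `5.`, `*`, `-`, `6.0` and the expression is a plain multiplication.
fn product() -> f64 {
    5.*-6.0
}

fn main() {
    // Same value as the explicitly spaced form.
    assert_eq!(product(), 5.0 * -6.0);
    println!("{}", product());
}
```

Any postfix `.*` rule would have to leave this existing reading intact.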
That wouldn't actually break, since 5 isn't an identifier |
in it certainly can't break in practice, just that I think the parser needs more special rules to distinguish |
I thought about this a lot earlier today and honestly, I disagree. For a while I was almost convinced of this as a reason why we should adopt the If you have a long expression you want to dereference, it seems counter-intuitive that the dereferencing happens from left to right except the last one, which is placed at the very beginning. So, you'd probably want to add a Ultimately, So, I'm more in favour of postfix dereference than right-arrows, but I do think that it's important to explore why. It makes a lot of sense why C had them and still does, but I don't think that Rust should, especially with its focus on memory safety, since we want the dereferences to stick out in the middle of the code as places where bad things can happen.
If |
The point here is that |
For the original RArrow proposal, are these supported or not?
let a: *const [u8; 256];
(*a)[3];
// a.*[3];
// a->[3]; // ?
let f: *const fn(u32) -> u32;
(*f)(5);
// f.*(5);
// f->(5); // ?
let o: *const Option<NonNull<u64>>;
(*o)?.as_ref().checked_add(7)?;
// o.*?.as_ref().checked_add(7)?;
// o->?->checked_add(7)?;
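For comparison, the first two prefix forms in the question above already compile today. A minimal sketch, assuming the pointers are derived from live locals; the names `read_index_3`, `call_through`, and `double` are made up for illustration, and the sketch says nothing about whether the commented-out `.*` or `->` spellings would be accepted.

```rust
fn double(x: u32) -> u32 {
    x * 2
}

/// Reads index 3 through a raw pointer to an array.
///
/// # Safety
/// `a` must point to a live, aligned `[u8; 256]`.
unsafe fn read_index_3(a: *const [u8; 256]) -> u8 {
    // The prefix deref needs parentheses before the index.
    unsafe { (*a)[3] }
}

/// Calls the function pointer stored behind a raw pointer.
///
/// # Safety
/// `f` must point to a live, aligned `fn(u32) -> u32`.
unsafe fn call_through(f: *const fn(u32) -> u32, arg: u32) -> u32 {
    // The prefix deref needs parentheses before the call.
    unsafe { (*f)(arg) }
}

fn main() {
    let array = [7u8; 256];
    let func: fn(u32) -> u32 = double;
    // SAFETY: both pointers come from references to locals that outlive the calls.
    let (elem, out) = unsafe { (read_index_3(&array), call_through(&func, 5)) };
    assert_eq!(elem, 7);
    assert_eq!(out, 10);
}
```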
I really like the idea of postfix dereference via |
Existing similar things: with A keyword like |
👍 for the idea of postfix dereference.
Another I've seen proposed for postfix dereference is
I would strongly favor postfix It's unfamiliar in the instant you first see it, but it feels like you learn it once and then you don't forget it. |
It does look a lot less noisy, yes. A point worth considering: we don't have many ASCII characters left, is this a good enough use case to burn one of them? It might well be. Are there parsing issues? |
Case 1: The compiler will tell you that "float literals must have an integer part". You currently have to write it as Case 2: Just playing around with it a bit, ops used with punctuation (eg: |
and again because
fn main() {
    let p = &10;
    dbg!(p^ - 5);
}
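As a sketch of why `^` is awkward as a postfix operator: the snippet above already compiles, because `^` is the binary XOR operator and `std::ops::BitXor` is implemented for `&i32` with an `i32` right-hand side. The helper name `xor_with_minus_five` is hypothetical.

```rust
fn xor_with_minus_five(p: &i32) -> i32 {
    // Today this is `p ^ (-5)`, a bitwise XOR, not a dereference followed by `- 5`.
    p^ - 5
}

fn main() {
    let p = &10;
    assert_eq!(xor_with_minus_five(p), 10 ^ -5);
    // A deref-then-subtract reading would give a different value.
    assert_ne!(xor_with_minus_five(p), *p - 5);
}
```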
I feel like one extremely important point that is not being discussed here is the very desugaring. Is desugaring The problem of
And quite importantly... creating the reference to I can only speak from my own experience, but in general, if I could have a reference instead of a pointer, I would have a reference instead of a pointer. Instead, if I've got a pointer in my hands, it's because there's something special about it, and borrowing is quite often what's special. Accidentally borrowing is terrible: it introduces UB. This goes against the very goals of this RFC: there's nothing ergonomic about introducing UB. Which, at this point, makes me question the very motivating example:
pointer.add(5)->some_field->method_returning_pointer()->other_method()
Where is the
And since you need to justify each and every step -- yes, really, that's the burden you took on when you decided to write unsafe code -- then you may as well break them down so it's clearer which justification refers to which step:
// SAFETY:
// - `pointer` points to a sequence of at least 6 elements since <...>.
let element = pointer.add(5);
// SAFETY:
// - `element` is not null and well aligned since `pointer` was.
// - `element` points to a sufficiently sized memory block since `pointer` pointed to a sufficiently sized sequence.
// - `element` points to a live value since <...>.
// - `element` can be borrowed immutably since <...>.
let element = &*element;
// SAFETY:
// - `element.some_field` is not null and well aligned since <...>.
// - `element.some_field` points to a sufficiently sized memory block since <...>.
// - `element.some_field` points to a live value since <...>.
// - `element.some_field` can be borrowed immutably since <...>.
let some_field = &*element.some_field;
let pointer = some_field.method_returning_pointer();
// SAFETY:
// - `pointer` is not null and well aligned since <...>.
// - `pointer` points to a sufficiently sized memory block since <...>.
// - `pointer` points to a live value since <...>.
// - `pointer` can be borrowed immutably since <...>.
let thing = &*pointer;
thing.other_method()
And I think we can argue that once due diligence is made, I note that there's value in projection because it enables navigating the fields without forming intermediate references which could potentially blow up in our faces.
I am not sure what you mean, but it doesn't create a reference. It creates a place. The requirements you state only apply if the place is later turned into a reference, but that may or may not happen.
Again, this should be "intermediate places". I agree that the |
Also, I don't believe that anyone is suggesting that Personally, I think you're overdoing it quite a bit with a list of comments on every single access. |
No, it does not. It creates a place.
Hence the term irreducible encapsulation.
You got it backwards. Since it does not create a reference, this RFC reduces UB. |
Interesting idea! Combining the two, one could go as far as to lint against any use case of deref on pointers that does not claim access to the whole pointed-to value. Assuming all those cases could then use That way
Thanks for the correction. I knew of places, but I typically turn them into references immediately, so I didn't think of the distinction. I searched for, but could not find, the safety requirements for turning a pointer into a place. Are those the requirements of dereferencing a pointer? (So everything I listed but borrowing)
Unless, of course, Not creating a reference is nice. Though I do note there's likely still quite a laundry list of pre-conditions which need to be validated, regardless. |
A place isn't quite an operation of its own. Making a place is one step in reading or writing, in which case either the reading or writing rules apply, for example. EDIT: also, yes, calling a method can create a reference depending on the method used. However, even using
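The place/reference distinction discussed above can be seen with what stable Rust already offers. A minimal sketch, assuming a made-up `Pair` struct and helper `init_pair`: `std::ptr::addr_of_mut!` takes the address of the place `(*p).second` without forming a `&mut`, which is why it may target memory that is not yet initialized.

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical type used only for this illustration.
struct Pair {
    first: u32,
    second: u64,
}

fn init_pair(first: u32, second: u64) -> Pair {
    let mut slot = MaybeUninit::<Pair>::uninit();
    let p: *mut Pair = slot.as_mut_ptr();
    // SAFETY: `p` points to properly aligned, writable storage for a `Pair`.
    // `(*p).first` only names a place; no reference to the uninitialized
    // struct or its fields is created before the writes.
    unsafe {
        ptr::addr_of_mut!((*p).first).write(first);
        ptr::addr_of_mut!((*p).second).write(second);
        // Both fields are now initialized.
        slot.assume_init()
    }
}

fn main() {
    let pair = init_pair(1, 2);
    assert_eq!((pair.first, pair.second), (1, 2));
}
```

Writing `&mut (*p).second` instead would turn the place into a reference, at which point the requirements on references come into play.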
Is there any way to apply I tend not to use raw pointers a lot, because I like to leverage types to enforce invariants. At the very least, this means using I would expect the ability to define Is there a way to represent places in the type system so that writing the function is possible? Otherwise, as mentioned by @steffahn, we may be better off having two operators:
This way, custom types can benefit from the syntax sugar instead of being second-class, and it's clear to the reader whether a reference is formed, or not. |
Something like |
Unsafe code is frequently used to interface to C and adopting People that work on low level Rust code will typically have codebases that include significant amounts of C code. I think the two will have to coexist for a very long time yet. Making syntax similar where it is easy to do so will really help people in this situation. |
One of the things that makes Rust an interesting option is not being like C in a bunch of dimensions, including syntactic ones (e.g., not copying In contrast, making more things postfix has been one of the things that have consistently worked out well for Rust (postfix It's not always a good idea to just copy things from other languages, even if they work well in those languages. Unsafe code is also frequently used to do things that have nothing to do with C, after all. C programmers should have no problem learning to write
This is not one of these though IMO, postfix deref has numerous advantages over |
At the risk of excessive restatement: the specific biggest advantage of a general postfix deref is that it improves things not just for pointers. References and types implementing |
To me the biggest point is compositionality.
Rust has never shied away from diverging from prior languages when improvements were possible. I mentioned the Curly braces were picked not just because they are familiar, but also because the alternatives (significant whitespace, "begin ... end", not sure what else was considered) were considered worse. Familiarity can never stand on its own as an argument -- that would just lead to us repeating past mistakes rather than learning from other languages' mistakes. This discussion quickly reminds me why I never participate in RFCs that need new syntax. It's much less exhausting to make deep changes to the operational semantics of Rust with far-reaching consequences for all unsafe code, than to add a single new piece of syntax. Don't expect an RFC for the
If we want postfix operators, shouldn't the same be done for other unary operators too, like |
those are spelled |
There is |
Not quite, because |
TIL that Herb Sutter's cpp2 experiment uses postfix |
For pin projections that is exactly what you want as you must not expose the unpinned place to the user to avoid unsoundness. |
Interesting. He briefly justifies this in his GitHub design note on postfix operators:
Seems like postfix deref is pretty widely used, and I've learned Zig and Ada both have it as well. In the case of Rust, and the expression |
According to https://github.com/hsutter/cppfront/wiki/Design-note:-Postfix-unary-operators-vs-binary-operators the new language currently disambiguates
The last rule seems to make it impossible to deref and call a pointer to function without an extra pair of parentheses, i.e.
I had not realized Zig had postfix dereference already, as the For reference: https://ziglang.org/documentation/master/#Pointers (see code samples). |
+1 for either |
This RFC improves ergonomics for pointers in unsafe Rust. It adds the RArrow token as a single-dereference member access operator. x->field desugars to (*x).field, and x->method() desugars to (*x).method().
Before:
After:
Rendered
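As an illustration of the desugared forms named in the summary, this is what one writes today with a raw pointer. A sketch only: the `Node` type, its `doubled` method, and `value_and_doubled` are hypothetical, and the `->` spellings in the comments are the RFC's proposed syntax, not something current Rust accepts.

```rust
// Hypothetical type used only for this illustration.
struct Node {
    value: i32,
}

impl Node {
    fn doubled(&self) -> i32 {
        self.value * 2
    }
}

/// # Safety
/// `x` must point to a live, aligned `Node` that may be borrowed immutably.
unsafe fn value_and_doubled(x: *const Node) -> (i32, i32) {
    unsafe {
        // Proposed spelling: x->value
        let field = (*x).value;
        // Proposed spelling: x->doubled(). The call auto-refs the place for `&self`.
        let called = (*x).doubled();
        (field, called)
    }
}

fn main() {
    let node = Node { value: 21 };
    // SAFETY: the pointer is derived from a live local.
    let (v, d) = unsafe { value_and_doubled(&node) };
    assert_eq!((v, d), (21, 42));
}
```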