Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What about: splitting struct into separate fields despite escaped references (maybe even raw pointers) #244

Open
RalfJung opened this issue Aug 6, 2020 · 19 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-memory Topic: Related to memory accesses A-optimization Category: Discussing optimizations we might (not) want to support C-open-question Category: An open question that we should revisit

Comments

@RalfJung
Copy link
Member

RalfJung commented Aug 6, 2020

@comex writes

For what it's worth, I think it would be nice if the compiler could do scalar-replacement-of-aggregates (i.e. splitting a structure-typed variable into individual variables for each of the fields, which can then potentially be stored in registers rather than memory) even when a reference to one of the fields escapes.

My response is:

Note that repr(Rust) doesn't save us here; repr(Rust) means the layout is unspecified but offset_of! should still work. I am not sure how to specify a language in a reasonably simple way that allows this.
(I think C may allow this optimization, making container_of-style code illegal in C. But of course tons of code still does it in practice and we all just hope things don't fall apart. I do not consider that an acceptable outcome for Rust.)

and then @comex asks

Well, have you thought about which other optimizations may be impacted by this?

Not sure what you mean -- "this" = having byte-level untyped memory? This is what LLVM does, and to my knowledge all its optimizations are compatible with that model, so I'd say the impact is not big. But I am not an expert in what kinds of optimizations compilers (want to) do. I am coming more from the perspective of how to specify the language to enable optimizations that others tell me are desirable. Given that most languages have an ambiguous or no specification of what exactly constitutes UB, I feel like that's a niche I need to fill. ;)

Being able to separate the fields of a struct even when there are pointers to it is easy in languages with a high-level view of memory, but immensely difficult in C/C++/Rust which allow byte-wise memory manipulations. While it is possible to do so, I think the resulting semantics are inscrutable for programmers. So the real question here is, how much complexity in the spec (with the associated amount of bugs caused by misunderstandings, and complexity in Miri-like tools that aim to reflect the spec) are we willing to accept to enable such optimizations?

@RalfJung RalfJung added C-open-question Category: An open question that we should revisit A-optimization Category: Discussing optimizations we might (not) want to support A-memory Topic: Related to memory accesses labels Aug 6, 2020
@Diggsey
Copy link

Diggsey commented Aug 6, 2020

repr(Rust) means the layout is unspecified but offset_of! should still work.

Why would offset_of! break that optimisation? If a reference to a field in a struct escapes, then the provenance of that reference means that the called code cannot access any other fields on the struct, so the compiler could happily put them in registers.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 6, 2020

That would use provenance to justify the optimization, which I think is very different from what @comex imagined.

Also, this doesn't work when we talk about a raw pointer to the field instead of a reference.

@comex
Copy link

comex commented Aug 6, 2020

Sorry, to clarify:

By "this" I meant container_of-type operations, i.e. computing an outer object pointer from an inner object pointer.

I was talking about separating the fields of a struct even if there is an escaped pointer to a field – but not pointers to the struct itself, or to any other fields. So the field which had pointers to it would be laid out as normal, but the rest of the struct might not be.

If container_of-type operations are forbidden for references, and if the pointer-to-field was computed using references (&foo.bar not &raw foo.bar), this optimization should be sound... at least in some cases.

At minimum, it should be fully semantics-preserving to allocate stack space for the entire struct, but only write to the field in question, leaving the rest as uninitialized memory. Really that's just a special case of aliasing assumptions combined with dead store elimination.

As a stretch goal, it would also be nice to only allocate stack space for the field, but I suppose there are a few speed bumps. Even if the user wasn't allowed to dereference the result of container_of, they could still compute it, and make assumptions about the pointer value itself:

  • The hypothetical pointer-to-struct would have to be properly aligned;
  • If multiple fields of the same struct value have pointers escape, they would have to be at the appropriate offset with respect to each other;
  • The hypothetical pointer-to-struct shouldn't overlap any other pointers visible to the code in question.

I don't know how feasible it is to actually enforce these restrictions, so in practice they might rule out the "only allocate space tor the field" optimization altogether. It might be interesting to consider whether the memory model could be adjusted to free the compiler from having to enforce them. But again, that's a stretch goal; wasting a bit of stack space is usually not a big deal.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 8, 2020

I was talking about separating the fields of a struct even if there is an escaped pointer to a field – but not pointers to the struct itself, or to any other fields. So the field which had pointers to it would be laid out as normal, but the rest of the struct might not be.

That is what I assumed. This relies on a view of memory not as a chunk of bytes, but as something "tree-like" that is structured according to the types of the data stored in there. C tries to do that. It is a mess.

If container_of-type operations are forbidden for references, and if the pointer-to-field was computed using references (&foo.bar not &raw foo.bar), this optimization should be sound... at least in some cases.

The question is, how do you forbid them? Languages are not defined by saying with optimizations are allowed, they are defined by a concrete operational semantics that says what happens when you execute the program. To permit optimizations like the one you describe, that operational semantics has to somehow make it such that pointer arithmetic starting at one field of a struct cannot reach other fields of that struct. Provenance can do that, but only when aliasing assumptions come into play -- not when raw pointers are used.

In that case the only avenue left is to make it so that the fields of a struct are not laid out next to each other in some byte-wise fashion, but instead form a tree in memory so that the fields are not next to each other linearly. But then you have to explain why memcpy and things like that work, and you end up making a lot of real-world code UB.

aliasing assumptions

If the pointer-to-field is raw, there are no aliasing assumptions we can make.

@comex
Copy link

comex commented Aug 12, 2020

If the pointer-to-field is raw, there are no aliasing assumptions we can make.

As I said, this optimization would not be applicable for raw pointers to fields.

@RalfJung
Copy link
Member Author

So you'd only do it if there were references to fields, but not if there were any raw pointers that escaped?

Yeah that should work, if we adopt Stacked Borrows or something similar. That would not rely on a different view of memory, so it is a fundamentally very different proposal.

It is in conflict with #134.

@RalfJung RalfJung changed the title What about: optimizations that rely on a higher-level view of memory What about: splitting struct into separate fields despite escaped references (maybe even raw pointers) Aug 12, 2020
@RalfJung RalfJung added the A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) label Aug 12, 2020
@Jules-Bertholet
Copy link

Varargs potentially wants something like this: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Varargs.20SROA.20sadness

@Dante-Broggi
Copy link
Contributor

IIUC, Swift does not have field splitting / SROA optimization for unsafe pointers, but does for inout and shared arguments, because all values are semantically "in registers" and the only ways to semantically ensure a value actually has an address is the withUnsafe*Pointer(to:) family of intrinsic functions, or if the value is in a static variable.
A particular example of this is described in a blog post by Matt Gallagher, where only taking the address of the first element of an array passed to strcpy causes the tail of the array to simply not be allocated.

Reflecting back to Rust, this would result in new reference types &?addr T/&?addr mut T, where the existing reference types would be &addr T/&addr mut T if the default would change in a new edition, where the new reference types could only be convertible to the current reference types in limited ways which would not expose the address, perhaps by an intrinsic with a callback API and not directly to pointers or integers.

@RalfJung
Copy link
Member Author

only taking the address of the first element of an array passed to strcpy causes the tail of the array to simply not be allocated.

With strict subobject provenance, I think the same would be allowed in Rust.

I think you are basically suggesting we should have 2 reference types -- one that is strictly limited to the given type, and one that allows accessing surrounding memory under certain conditions?

@digama0
Copy link

digama0 commented Jul 16, 2023

It sounds like the proposed non-&addr reference type would also not have guaranteed address stability. It is unclear to me whether our memory model can support that.

@RalfJung
Copy link
Member Author

It sounds like the proposed non-&addr reference type would also not have guaranteed address stability.

What exactly does that even mean?

I think you are basically suggesting we should have 2 reference types -- one that is strictly limited to the given type, and one that allows accessing surrounding memory under certain conditions?

In fact we don't even need 2 types for this, just two different operations for taking a reference -- one that restricts provenance to the subobject, and one that does not.

@digama0
Copy link

digama0 commented Jul 16, 2023

It sounds like the proposed non-&addr reference type would also not have guaranteed address stability.

What exactly does that even mean?

Something like:

fn foo(x: &noaddr u8) {
  assert!(x as usize == x as usize)
}

is not guaranteed to succeed, because the semantics of the &noaddr T as usize operation entails (possibly) putting T somewhere in memory and then taking the address of that, so there is no guarantee when you do it multiple times that you get the same result.

@CAD97
Copy link

CAD97 commented Jul 16, 2023

I would initially expect a specific reference to be consistent. But it not necessarily being so would also enable converting fn(&scalar) to fn(scalar).

Plus the original spitball would prevent converting to usize at all for the different ref type, IIUC. I don't think it's practical to have a split reference type, though; it would assuredly lead to more "color" style complaints and just restricting provenance should be possible to make sufficient, if cross-provenance pointer ops get treated like cross-alloc.

@RalfJung
Copy link
Member Author

Something like:

How does that relate to SROA? Is the point that we must not be able to tell whether two references have the right distance to match the layout of the original struct?

@digama0
Copy link

digama0 commented Jul 16, 2023

The point is that &noaddr T is really only a "logical pointer", it doesn't necessarily have the trappings associated with a pointer in memory. This is what you need to make substantial calling convention changes like SROA as optimizations.

I agree re: the high bar of adding another pointer type, though. Having this kind of pervasive non-guaranteed address would be useful for a lot of things, but also problematic for many other things so I'm not sure we want to adopt this semantics for &T, and without that it doesn't seem worth it.

@Jules-Bertholet
Copy link

I'm not sure we need something so radical as &noaddr T. SROA doesn't need fully nondeterministic addresses, it needs provenance slicing and a loosening of guarantees around offsets. I see no reason why x as usize == x as usize should ever not hold.

@digama0
Copy link

digama0 commented Jul 17, 2023

I see no reason why x as usize == x as usize should ever not hold.

I know that looks a bit odd when written that way, but it's similar to the non-guarantee around &0 as usize == &0 as usize. The &noaddr T is coerced to a &T before conversion to usize, and when you do that coercion twice you might get pointers to different locations. (You probably won't, but there is no guarantee of such.)

If we wanted to be more explicit about it, we would say that &noaddr T as usize is a compile error and then the code looks like &*x as usize == &*x as usize, which maybe looks more reasonable to be non-guaranteed (or maybe not, I don't know).

@RalfJung
Copy link
Member Author

When &noaddr are created to e.g. x.0 and x.1, with the offset between those fields being observed in other instances of the same type, then I don't see how we can do SROA without full address non-determinism. Though possibly the non-determinism could occur when the reference is created? Instead of it being a different type, &noaddr <place> could somehow allow putting the data to a different location. Though with mutable references this all becomes quite tricky as the mutation must be visible through the original place after the lifetime is over.

@Jules-Bertholet
Copy link

Jules-Bertholet commented Jul 18, 2023

Though possibly the non-determinism could occur when the reference is created?

At the latest. I think it could even be moved earlier, to when the struct is created.

I don't think it makes sense to have new reference types just for this, either we find a way to retrofit SROA onto existing references, or we don't do it at all. But the set of guarantees I am picturing would look something like this:

  • References always have a stable address
  • References/pointers to the same field of the same instance of the same struct always share the same address
  • If you have a pointer with permissions to access multiple fields of an instance of a struct, the offsets between those fields are consistent with offset_of!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-memory Topic: Related to memory accesses A-optimization Category: Discussing optimizations we might (not) want to support C-open-question Category: An open question that we should revisit
Projects
None yet
Development

No branches or pull requests

7 participants