
Preserve local scopes in generator MIR #60840

Merged
merged 4 commits into rust-lang:master from tmandry:preserve-scope-in-generator-mir on May 22, 2019

@tmandry
Contributor

commented May 14, 2019

Part of #52924, depended upon by the generator layout optimization #60187.

This PR adds StorageDead statements in more places in generators, so we can see when non-Drop locals have gone out of scope and recover their storage.

The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions.

For Drop locals, we modify the MaybeStorageLive analysis to use drop to indicate that storage is no longer live for the local. Once drop returns or unwinds to our function, we implicitly assume that the local is StorageDead.

Instead of using drop, it is possible to emit more StorageDead statements in the MIR for Drop locals so we can handle all locals the same. I am fine with doing it that way, but this was the simplest approach for my purposes. It is also likely to be more performant.
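As a concrete (hypothetical, not from the PR) illustration of the pattern this targets: a non-Drop local whose scope ends before a suspension point. Once a `StorageDead` is emitted at the end of its scope, the local provably cannot be accessed across the yield, so it never needs a slot in the generator's saved state.

```rust
use std::future::Future;

// Hypothetical example: `buf` is a non-Drop local whose scope ends before
// the `.await`, so with `StorageDead` emitted at the end of its scope, it
// does not need to be captured in the generator layout.
async fn example(read: impl Future<Output = u8>) -> u8 {
    let first = {
        let buf = [0u8; 1024]; // large non-Drop local
        buf[0]
    }; // `buf` goes out of scope here, before the suspension point
    first + read.await
}

fn main() {
    let fut = example(std::future::ready(0));
    // Whether `buf` occupies space in `fut` is exactly what the layout
    // optimization enabled by this PR decides.
    let _size = std::mem::size_of_val(&fut);
}
```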

r? @Zoxc (feel free to reassign)
cc @cramertj @eddyb @RalfJung @rust-lang/wg-async-await

@tmandry

Contributor Author

commented May 14, 2019

cc @rust-lang/lang

The assumption this PR is making is that once drop returns to a function, whether it succeeds or panics, it is UB for the function to then access that local.

Let me know if I need to file an issue separately to discuss this!

/// elaboration from creating drop flags that would have to be captured
/// by the generator. I'm not sure how important this optimization is,
/// but it is here.
generator_drop: Option<BasicBlock>,

@Zoxc

Zoxc May 14, 2019

Contributor

Why is this removed?

@tmandry

tmandry May 14, 2019

Author Contributor

The effect of this commit is to undo the optimization mentioned in the comment. We need the drop flags (StorageDead statements) that it removed, in order to do the analysis for the generator layout optimization.

@tmandry

tmandry May 14, 2019

Author Contributor

So this unifies the unwind and generator_drop paths again.

@Zoxc

Zoxc May 14, 2019

Contributor

The effect of this commit is to undo the optimization mentioned in the comment. We need the drop flags (StorageDead statements) that it removed, in order to do the analysis for the generator layout optimization.

That doesn't make sense. StorageDead is unrelated to drop flags. We do not want drop flags to be created.
Additionally unwind has to drop generator arguments (we probably want to add those back) while the generator_drop path does not have them. I think you should just generate StorageDead statements for the generator_drop path, but not do the other changes.

@tmandry

tmandry May 14, 2019

Author Contributor

Oh I see, yeah I completely misunderstood what was going on here.

Still, from looking at the code I couldn't find any places where the two paths are handled differently (after my change to generate StorageDead statements for the generator_drop path), which is why I thought this change was okay. So, I guess I'm still not 100% sure what's going on here :)

@tmandry

tmandry May 14, 2019

Author Contributor

I'll split up the commit so this is easier to see.

@tmandry

tmandry May 15, 2019

Author Contributor

I split up the commit and reverted this change (but left it in the history for now).

@cramertj

Member

commented May 14, 2019

For additional context, @RalfJung pointed out that under Stacked Borrows, the creation of the &mut self reference in Drop::drop means that all existing raw pointers to the local are invalidated, so the requirement introduced in this PR is likely already enforced under the existing model (though there was also some mention that drop is already somewhat "special", so we might be able to carve out an exception here).

This means that code like this is UB: playground example
Note that when run under Miri, this code already produces Stacked Borrows errors.
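Since the playground link is not preserved in this transcript, here is a hypothetical reconstruction (not the original example) of the pattern being discussed: a raw pointer into a local is used after the local's drop has run. The UB line is left commented out so the snippet runs cleanly; uncommenting it makes Miri report a Stacked Borrows violation.

```rust
struct Foo(i32);

impl Drop for Foo {
    fn drop(&mut self) {
        // `drop` takes `&mut self`; under Stacked Borrows this unique
        // reborrow invalidates every raw pointer previously derived
        // from the value.
    }
}

fn dangling() -> *const i32 {
    let p;
    {
        let x = Foo(42);
        p = &x.0 as *const i32;
    } // `x` is dropped here; the `&mut self` in `drop` invalidates `p`
    p
}

fn main() {
    let p = dangling();
    // UB if uncommented: the pointer was invalidated when `drop` ran.
    // unsafe { println!("{}", *p); }
    assert!(!p.is_null()); // merely holding the stale pointer is fine
}
```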

@Centril

Member

commented May 15, 2019

This seems reasonable to me; the only thing I would ask is that we document the assumed invariant in a less ephemeral place than the code-base since it seems easy to forget for users, lang designers, and compiler devs alike.

@tmandry tmandry force-pushed the tmandry:preserve-scope-in-generator-mir branch from 9f6afd0 to 26c37d7 May 15, 2019

@tmandry

Contributor Author

commented May 16, 2019

the only thing I would ask is that we document the assumed invariant in a less ephemeral place than the code-base

There is an issue on the reference to document guarantees around drop and panicking: rust-lang-nursery/reference#348

I added a reference to my above comment, so it should get included once documentation is added :)

@Zoxc

Contributor

commented May 19, 2019

We probably want a pass to remove the StorageDead statements in the unwind path after the generator transformation.

@RalfJung

Member

commented May 20, 2019

The reason this is only done for generators is compiler performance. See #60187 (comment) for what happens when we do this for all functions.

Do you think it would make sense to have a -Z flag for this that Miri could set, so that Miri checks these stricter rules in more places?

OTOH, this seemingly only affects the unwind path, of which test coverage is generally spotty at best. ;) (And Miri doesn't even support unwinding yet.)

@RalfJung pointed out that under stacked borrows, the creation of the &mut Self reference in Drop::drop means that all existing raw pointers to the local are invalidated, so the requirement introduced in this PR is likely already enforced under the existing model.

I realized later that there is a gap in this argument: the drop implementation itself could create a raw pointer and leak it into global state, and that could legitimately be used later.

Relying on StorageDead is probably better.

Let me know if I need to file an issue separately to discuss this!

I don't think we have a topic yet to track StorageDead/StorageLive issues in general. That would also cover #42371 then.

I'll open one in the UCG repository. EDIT: It's at rust-lang/unsafe-code-guidelines#129.


For Drop locals, we modify the MaybeStorageLive analysis to use drop to indicate that storage is no longer live for the local. Once drop returns or unwinds to our function, we implicitly assume that the local is StorageDead.

Okay, so this is the meat of the PR as far as I am concerned: this changes the semantics of Drop to implicitly have a StorageDead in both the return and the unwind edge. If we decide to go with this approach, please open an issue in Miri so that we can implement this in Miri's semantics. It would also be helpful if you could come up with some example code that should be flagged as UB.

I have a few questions about this though:

  • Drop works on an arbitrary place, StorageDead works on a local. I suppose the new semantics only applies when the place is exactly a local, and does not involve any projections?

  • In early passes of MIR construction (before drop flag elaboration), Drop is actually a conditional drop. Does the StorageDead only occur if the Drop happens, or does it always occur? Are we sure that drop flag elaboration is semantics-preserving under whatever choice we make here (and document, I hope)?

  • What about DropAndReplace?

@Zoxc

Contributor

commented May 20, 2019

I think drop elaboration can turn a drop of a local into drops of its fields, which would effectively remove the implicit StorageDead.
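A surface-level (hypothetical) example of what makes this possible: a struct with droppy fields but no Drop impl of its own allows partial moves, after which only the remaining fields can be dropped, so the whole-local Drop terminator disappears from the MIR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy;
impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// `Pair` has no `Drop` impl of its own, so moving one field out is legal.
// Drop elaboration then replaces "drop `p`" with drops of the remaining
// fields, which is why a drop of the whole local can vanish from the MIR.
struct Pair {
    a: Noisy,
    b: Noisy,
}

// Returns how many `Noisy::drop` calls the inner scope caused.
fn drops_in_scope() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    {
        let p = Pair { a: Noisy, b: Noisy };
        let _moved = p.a; // partial move: `p` can no longer be dropped whole
    } // field-wise drops: `_moved` and `p.b` each run `Noisy::drop`
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    assert_eq!(drops_in_scope(), 2);
}
```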

@RalfJung

Member

commented May 21, 2019

Hm, that might be a problem. Depending on how we decide about the semantics for StorageLive, when _3 is live doing StorageDead(_3); StorageLive(_3) might be okay while just StorageLive(_3) is not okay. So, removing a StorageDead for a local that later gets made live again could be a non-semantics-preserving transformation.

@tmandry

Contributor Author

commented May 21, 2019

We probably want a pass to remove the StorageDead statements in the unwind path after the generator transformation.

We already remove StorageLive/StorageDead statements for any of the saved locals. For the rest of them, would it buy us anything to remove them?

@Zoxc

Contributor

commented May 21, 2019

@tmandry There's a list of reasons here. Not sure how important it is.

@cramertj

Member

commented May 21, 2019

Would it be possible to split further discussion of the proper semantics of StorageDead/StorageLive out into a separate issue, and continue with landing this PR? I think there's general agreement that touching variables after drop has been called on them is undefined behavior, which AFAICT is the only guarantee necessary to make the optimization this unblocks valid.

Note that this PR is blocking a significant generator size optimization (#60187) which is currently blocking the stabilization of async/await (#52924). The lack of this optimization is currently causing frequent stack overflows and memory usage issues for production users (Fuchsia), as well as other nightly users of async/await.

I'd love it if we could land these PRs, unblocking async/await and fixing these problems for production users, and continue asynchronous discussion on the best way to model things in MIR (which, AFAICT, does not affect the validity of the optimization here). Is there a blocking issue that needs to be addressed first before we can merge?

@RalfJung

Member

commented May 21, 2019

Would it be possible to split further discussion of the proper semantics of StorageDead/StorageLive out into a separate issue, and continue with landing this PR?

Fine for me. Feel free to open an issue. :)

The worst-case outcome I can see here is that we end up having to materialize the implicit StorageDead after drop in generators, to get the semantics straight. Would that be a deal-breaker?

@cramertj

Member

commented May 21, 2019

Fine for me. Feel free to open an issue. :)

Opened #61015.

The worst-case outcome I can see here is that we end up having to materialize the implicit StorageDead after drop in generators, to get the semantics straight. Would that be a deal-breaker?

I certainly don't have any opposition to this in theory, but it seems worth experimenting to see what solutions we can come up with that don't significantly regress compiler performance (or other ways we can recover the compiler performance).

In the meantime, let's get this in.
@bors r+

@bors

Contributor

commented May 21, 2019

📌 Commit 26c37d7 has been approved by cramertj

@Zoxc

Contributor

commented May 21, 2019

@cramertj Did you actually review the PR?

@tmandry

Contributor Author

commented May 21, 2019

@tmandry There's a list of reasons here. Not sure how important it is.

Well, the idea of "observing a frozen stack" is probably incompatible with this change (and certainly incompatible with the optimization it enables for generators). Technically it isn't wrong to remove the StorageDead after the optimization runs; it's equivalent to recovering these "frozen stack" semantics for the locals that didn't happen to be saved in the generator. But I don't think there's a good reason why we'd want that, and it could lead to more confusion.

The reasons given in the code aren't all that compelling because it boils down to "we thought clang did it this way but actually it doesn't." My understanding is that we never changed the behavior because that would hurt performance. However, we're already going to pay the cost here of emitting these StorageDeads in generators, and I doubt removing them later will recover that cost.

@cramertj

Member

commented May 21, 2019

@Zoxc Yes, I looked over it and it looked reasonable to me, but I'm much less familiar with this code than you are, so if you have specific concerns you'd like to see addressed before merging feel free to r-.

bors added a commit that referenced this pull request May 22, 2019

Auto merge of #60840 - tmandry:preserve-scope-in-generator-mir, r=cramertj

Preserve local scopes in generator MIR
@bors

Contributor

commented May 22, 2019

⌛️ Testing commit 26c37d7 with merge 1cc822c...

@bors

Contributor

commented May 22, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: cramertj
Pushing 1cc822c to master...

@bors bors added the merged-by-bors label May 22, 2019

@bors bors merged commit 26c37d7 into rust-lang:master May 22, 2019

2 checks passed

Travis CI - Pull Request Build Passed
homu Test successful

@tmandry tmandry deleted the tmandry:preserve-scope-in-generator-mir branch May 22, 2019

bors added a commit that referenced this pull request May 23, 2019

Auto merge of #61060 - tmandry:fix-drop-and-replace, r=<try>
Generate StorageLive after DropAndReplace for locals

Fixes a problem in #60840 where I forgot to consider DropAndReplace.

That PR assumes that a MIR Drop terminator implies StorageDead along both edges for locals. During drop elaboration, we turn DropAndReplace into a Drop followed by an assignment, which would incorrectly then be seen as an assignment to a local that's StorageDead. To compensate, we declare that the local is StorageLive again before assigning.
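A hypothetical surface-level example of the terminator in question: assigning to an already-initialized local with a Drop type lowers (before drop elaboration) to `DropAndReplace`, which drops the old value and then writes the new one into the same local.

```rust
struct Guard(u32);

impl Drop for Guard {
    fn drop(&mut self) {}
}

fn replace() -> u32 {
    let mut g = Guard(1);
    // Before drop elaboration, this assignment is a DropAndReplace:
    // drop the old `Guard(1)`, then store `Guard(2)` into `g`. If the
    // Drop half implied StorageDead for `g`, the store would target
    // dead storage, hence the compensating StorageLive added here.
    g = Guard(2);
    g.0
}

fn main() {
    assert_eq!(replace(), 2);
}
```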

I did some local benchmarking on this change and didn't see any regressions. I wasn't running the full suite, however.

I have a test which reproduces the problem. It will be included in #60187, since I need that optimization in place for the problem to actually manifest.

bors added a commit that referenced this pull request May 30, 2019

Auto merge of #61373 - tmandry:emit-storagedead-along-unwind, r=<try>
Emit StorageDead along unwind paths for generators

Completion of the work done in #60840. That PR made a change to implicitly consider a local `StorageDead` after Drop, but that was incorrect for DropAndReplace (see also #61060 which tried to fix this in a different way).

This finally enables the optimization implemented in #60187.

r? @eddyb
cc @Zoxc @cramertj @RalfJung

bors added a commit that referenced this pull request May 31, 2019

Auto merge of #61373 - tmandry:emit-storagedead-along-unwind, r=<try>
Emit StorageDead along unwind paths for generators


Centril added a commit to Centril/rust that referenced this pull request Jun 4, 2019

Rollup merge of rust-lang#61373 - tmandry:emit-storagedead-along-unwind, r=eddyb

Emit StorageDead along unwind paths for generators

bors added a commit that referenced this pull request Jun 5, 2019

Auto merge of #61373 - tmandry:emit-storagedead-along-unwind, r=eddyb
Emit StorageDead along unwind paths for generators


bors added a commit that referenced this pull request Jun 6, 2019

Auto merge of #61373 - tmandry:emit-storagedead-along-unwind, r=eddyb
Emit StorageDead along unwind paths for generators


bors added a commit that referenced this pull request Jun 12, 2019

Auto merge of #60187 - tmandry:generator-optimization, r=eddyb
Generator optimization: Overlap locals that never have storage live at the same time

The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](https://github.com/rust-lang/rust/blob/08bfe16129b0621bc90184f8704523d4929695ef/src/libstd/macros.rs#L365-L381)), which we rely on to implement the optimization.

More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.**

To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with.

Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.) It's important to note that the variants are a "view" into our layout.

For the layout computation, we use a simplified approach.

1. Start with the set of locals assigned to only one variant. The rest are disqualified.
2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping.

Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone".
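The two-step disqualification above might be sketched as follows (toy code, all names and data structures hypothetical; the real implementation works on MIR locals and dataflow bitsets):

```rust
// Toy sketch of the disqualification step. `variant_of[l]` lists the
// variants local `l` is assigned to; `conflicts` holds pairs of locals
// that may be StorageLive at the same time. Returns `true` for locals
// sent to the non-overlapping prefix.
fn assign_prefix(
    variant_of: &[Vec<usize>],
    conflicts: &[(usize, usize)],
) -> Vec<bool> {
    let n = variant_of.len();
    let mut prefix = vec![false; n];
    // Step 1: locals assigned to more than one variant are disqualified.
    for (local, vs) in variant_of.iter().enumerate() {
        if vs.len() != 1 {
            prefix[local] = true;
        }
    }
    // Step 2: for each conflicting pair not in the same variant,
    // disqualify one of the two.
    for &(a, b) in conflicts {
        if variant_of[a] != variant_of[b] && !prefix[a] && !prefix[b] {
            prefix[b] = true;
        }
    }
    prefix
}

fn main() {
    // Locals 0 and 1 live only in variant 0, local 2 only in variant 1;
    // locals 1 and 2 may conflict, so one of them is disqualified.
    let prefix = assign_prefix(&[vec![0], vec![0], vec![1]], &[(1, 2)]);
    assert_eq!(prefix, vec![false, false, true]);
}
```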

So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this:

You can think of a generator as an enum, where some fields are shared between variants. e.g.

```rust
enum Generator {
  Unresumed,
  Poisoned,
  Returned,
  Suspend0(A, B, X),
  Suspend1(B),
  Suspend2(A, Y, Z),
}
```

where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them.

Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time.

---

Depends on:
- [x] #59897 Multi-variant layouts for generators
- [x] #60840 Preserve local scopes in generator MIR
- [x] #61373 Emit StorageDead along unwind paths for generators

Before merging:

- [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889)
- [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location)
- [x] Clean up TODO
- [x] Fix the layout code to enforce that the same field never moves around in the generator
- [x] Add tests for async/await
- [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity)
- [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at #59897 (comment))

Centril added a commit to Centril/rust that referenced this pull request Jun 12, 2019

Rollup merge of rust-lang#60187 - tmandry:generator-optimization, r=eddyb

Generator optimization: Overlap locals that never have storage live at the same time