Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up logic around live locals in generator analysis #71956

Merged

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented May 6, 2020

Resolves #69902. Requires #71893.

I've found it difficult to make changes in the logic around live locals in generator/transform.rs. It uses a custom dataflow analysis, MaybeRequiresStorage, that AFAICT computes whether a local is either initialized or borrowed. That analysis is using before effects, which we should try to avoid if possible because they are harder to reason about than ones only using the unprefixed effects. @pnkfelix has suggested removing "before" effects entirely to simplify the dataflow framework, which I might pursue someday.

This PR replaces MaybeRequiresStorage with a combination of the existing MaybeBorrowedLocals and a new MaybeInitializedLocals. MaybeInitializedLocals is just MaybeInitializedPlaces with a coarser resolution: it works on whole locals instead of move paths. As a result, I was able to simplify the logic in compute_storage_conflicts and locals_live_across_suspend_points.

This is not exactly equivalent to the old logic; some generators are now smaller than before. I believe this was because the old logic was too conservative, but I'm not as familiar with the constraints as the original implementers were, so I could be wrong. For example, I don't see a reason the size of the mixed_sizes future couldn't be 5K. It went from 7K to 6K in this PR.

r? @jonas-schievink @tmandry

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 6, 2020
debug!("always_live = {:?}", always_live_locals);

// Locals that are always live or ones that need to be stored across
// suspension points are not eligible for overlap.
Copy link
Contributor Author

@ecstatic-morse ecstatic-morse May 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is an incorrect comment added by me during a refactor. I didn't understand the purpose of this code. Because we compress the matrix to only save GeneratorSavedLocals below, I believe the intersect is unnecessary?

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented May 6, 2020

One follow-up question is whether it's sound/desirable to consider liveness when computing storage conflicts. I believe (live & init) | borrowed is basically what @jonas-schievink needs for their NRVO pass, and now that #71006 has been merged, it should be pretty easy to compute.

Copy link
Contributor

@jonas-schievink jonas-schievink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

This is not exactly equivalent to the old logic; some generators are now smaller than before. I believe this was because the old logic was too conservative, but I'm not as familiar with the constraints as the original implementers were, so I could be wrong. For example, I don't see a reason the size of the mixed_sizes future couldn't be 5K. It went from 7K to 6K in this PR.

It would be nice to have deeper introspection mechanisms for generators so we can debug this sort of stuff more easily. It would also help with implementing further layout optimizations.

// Ignore unreachable blocks.
if data.terminator().kind == TerminatorKind::Unreachable {
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct in the presence of inline assembly (which is a statement)? It might access arbitrary locals and then diverge, leading to an Unreachable terminator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and yeah, inline assembly should really be a terminator)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're correct, although this is what the current implementation does. We should probably just remove this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, yeah. I'll try to exploit this for a bit, it's easy to create MIR like this in practice (luckily almost no-one uses inline asm in async constructs).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.. I think we needed this for some optimizations, but we can amend this to "unreachable and has no statements"

// If the local is moved out of, or if it gets marked `StorageDead`, consider it no
// longer initialized.
PlaceContext::NonUse(NonUseContext::StorageDead)
| PlaceContext::NonMutatingUse(NonMutatingUseContext::Move) => self.trans.kill(local),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is now ignoring borrowed variables here, where the old analysis was not. With the equations you wrote down in the generator transform, I wouldn't think that this causes the size difference – in fact, I'd expect this to only affect movable generators, but it seems that that's incorrect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, you're saying that the version on stable is marking locals as needing to be saved across yield points if they are borrowed even in movable generators? I thought this was an oversight based on the comment next to if !movable { in the old code. Do we always need to worry about borrows across yield points?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely fine to ignore them in movable generators as that will cause rather acute UB when the generator is in fact moved around in memory.

I'm just wondering why the previous analysis cared about borrows if they don't impact immovable generators anyways, which is what we wanted to hold off on. And I also was trying to understand why the resulting generators are smaller.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous analysis cared about them because it's possible for something to require storage even if it's been de-initialized. The address could have been captured from a borrow and unsafe code could use that to re-initialize the value.

By the definition of this MaybeInitializedLocals analysis, we shouldn't care about it, but we should still assume in the generator transform that anything borrowed and MaybeStorageLive might still require storage, even if it's not initialized.

src/librustc_mir/transform/generator.rs Show resolved Hide resolved
@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented May 6, 2020

It would be nice to have deeper introspection mechanisms for generators so we can debug this sort of stuff more easily. It would also help with implementing further layout optimizations.

Lemme know if you have something particular in mind. -Zdump-mir=XXX -Zdump-mir-dataflow should help here. It works for liveness now too! However, you would need to look at the results of multiple passes to figure which is causing the local to be marked live.

@tmandry tmandry self-assigned this May 7, 2020
let loc = Location { block, statement_index };
trace!("record conflicts at {:?}", loc);
init.seek_before_primary_effect(loc);
borrowed.seek_before_primary_effect(loc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, previously, the StorageConflictVisitor would only look at the "before-effects", and these seek_before_primary_effect calls seems to match that, but the reason that was correct was that MaybeRequiresStorage was marking all locals that will be used by a statement/terminator as live in the before-effect, but now it looks like that is no longer the case since neither MaybeInitializedLocals nor MaybeBorrowedLocals use before-effects. Is this still sound, or should this check after the primary effect was applied as well? I'm not really sure.

Copy link
Contributor Author

@ecstatic-morse ecstatic-morse May 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is indeed a bug here. I'm missing a seek_before_primary_effect(terminator_loc).

The comment below addresses your broader concern. If there's N locations in a block, we also have N dataflow effects in the block (as long as we don't use "before" effects). That means N+1 possibly unique dataflow states, since you need to consider boundary conditions (before the first effect is applied, after the last effect is applied). If we seek before all possible locations, then seek after the final location (via seek_to_block_end), we will have observed all of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes a lot of sense. Thanks for the explanation!

src/librustc_mir/transform/generator.rs Show resolved Hide resolved
// Locals that are always live conflict with all other locals.
//
// FIXME: Why do we need to handle locals without `Storage{Live,Dead}` specially here?
// Shouldn't it be enough to know whether they are initialized?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they don't have Storage{Live,Dead} then they are considered always StorageLive. If they are also borrowed then they could be reinitialized indirectly.

Copy link
Contributor Author

@ecstatic-morse ecstatic-morse May 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, my point is that all conflicts, even between locals that are always StorageLive, will be detected in the loop below. If such a local is never live and initialized or borrowed at the same time as another local, doesn't it suggest that we were too conservative with our StorageLive annotations? Once again, I'm worried about broken MIR here, so I didn't try to change this now, but I think it's worth exploring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh sorry, yeah, we should get rid of the special casing here. I think this was a silly micro-optimization I did early on when the analysis was simply "look for locals that are StorageLive at the same time".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna change this in a follow-up to try and minimize the scope of this PR in case we need to bisect something.

//
// requires_storage := init | borrowed
//
// FIXME: This function is called in a loop, so it might be better to pass in a temporary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe you could call union_row_with twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. Yeah, that's clearly better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually doesn't work. We only union into rows for locals that also require storage at this point

// FIXME: This function is called in a loop, so it might be better to pass in a temporary
// bitset rather than cloning here.
let mut requires_storage = init.get().clone();
requires_storage.union(borrowed.get());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, I think you can do better than this by intersecting with storage_live ⋃ always_live_locals, that's what the old code did.

Copy link
Contributor Author

@ecstatic-morse ecstatic-morse May 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. We could probably be using the actual liveness here, but I was worried about ill-formed MIR. I can try that in another PR, but for now storage_live | always_live is a fine approximation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, once again borrowed | init is strictly more precise. init marks locals as uninitialized when we see StorageDead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and (crucially) so does borrowed, so you're right.

// Ignore unreachable blocks.
if data.terminator().kind == TerminatorKind::Unreachable {
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.. I think we needed this for some optimizations, but we can amend this to "unreachable and has no statements"


// Visit every reachable statement and terminator. The exact order does not matter. When two
// locals are live at the same point in time, add an entry in the conflict matrix.
for (block, data) in traversal::preorder(body) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, this API is so much nicer! 💯

src/librustc_mir/transform/generator.rs Show resolved Hide resolved
@ecstatic-morse ecstatic-morse force-pushed the remove-requires-storage-analysis branch from 4123f6a to 6e1416f Compare May 13, 2020 16:47
@ecstatic-morse
Copy link
Contributor Author

Rebased on top of master. I addressed some comments in the last two commits and responded to some others. I don't wanna change how we handle Unreachable in this PR, since it will make bisection less precise if either this PR or that change is incorrect.

@ecstatic-morse ecstatic-morse force-pushed the remove-requires-storage-analysis branch from 6e1416f to 987647b Compare May 13, 2020 17:03
@bors
Copy link
Contributor

bors commented May 19, 2020

☔ The latest upstream changes (presumably #69171) made this pull request unmergeable. Please resolve the merge conflicts.

@tmandry
Copy link
Member

tmandry commented May 20, 2020

Sorry for the slowness on my end, r=me after rebase

Instead of using a bespoke dataflow analysis, `MaybeRequiresStorage`,
for computing locals that need to be stored across yield points and that
have conflicting storage, use a combination of simple, generally
applicable dataflow analyses. In this case, the formula for locals
that are live at a yield point is:

    live_across_yield := (live & init) | (!movable & borrowed)

and the formula for locals that require storage (and thus may conflict
with others) at a given point is:

    requires_storage := init | borrowed

`init` is `MaybeInitializedLocals`, a direct equivalent of
`MaybeInitializedPlaces` that works only on whole `Local`s. `borrowed`
and `live` are the pre-existing `MaybeBorrowedLocals` and
`MaybeLiveLocals` analyses respectively.
The generator transform needs to inspect all possible dataflow states.
This can be done with half the number of bitset union operations if we
can assume that the relevant analyses do not use "before" effects.
...when determining what locals are live.

A local cannot be borrowed before it is `storage_live` and
`MaybeBorrowedLocals` already invalidates borrows on `StorageDead`.
Likewise, a local cannot be initialized before it is marked StorageLive
and is marked as uninitialized after `StorageDead`.
@ecstatic-morse ecstatic-morse force-pushed the remove-requires-storage-analysis branch from 987647b to 3ff9317 Compare May 20, 2020 00:57
@ecstatic-morse
Copy link
Contributor Author

@bors r=tmandry rollup=never

In case we need to bisect.

@bors
Copy link
Contributor

bors commented May 20, 2020

📌 Commit 3ff9317 has been approved by tmandry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 20, 2020
@bors
Copy link
Contributor

bors commented May 20, 2020

⌛ Testing commit 3ff9317 with merge 803443d39d917bcb03be70779f4a398b3fa7eafe...

@bors
Copy link
Contributor

bors commented May 20, 2020

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 20, 2020
@ecstatic-morse
Copy link
Contributor Author

Another msys2 failure

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 20, 2020
@bors
Copy link
Contributor

bors commented May 22, 2020

⌛ Testing commit 3ff9317 with merge 458a3e7...

@bors
Copy link
Contributor

bors commented May 22, 2020

☀️ Test successful - checks-azure
Approved by: tmandry
Pushing 458a3e7 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 22, 2020
@bors bors merged commit 458a3e7 into rust-lang:master May 22, 2020
ecstatic-morse added a commit to ecstatic-morse/rust that referenced this pull request Jun 8, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 9, 2020
Revert rust-lang#71956

...since it caused unsoundness in rust-lang#73137. Also adds a reduced version of rust-lang#73137 to the test suite. The addition of the `MaybeInitializedLocals` dataflow analysis has not been reverted, but it is no longer used.

Presumably there is a more targeted fix, but I'm worried that other bugs may be lurking. I'm not yet sure what the root cause of rust-lang#73137 is.

This will need to get backported to beta.

r? @tmandry
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 13, 2020
…s-storage-analysis, r=tmandry"

This reverts commit 458a3e7, reversing
changes made to d9417b3.
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 15, 2020
…-morse,Mark-Simulacrum

[beta] backport

This is a beta backport rollup of the following:
* [beta] Revert heterogeneous SocketAddr PartialEq impls rust-lang#73318
* Fix emcc failure for wasm32. rust-lang#73213
* Revert rust-lang#71956 rust-lang#73153
* [beta] Update cargo rust-lang#73141
* Minor: off-by-one error in RELEASES.md rust-lang#72914
* normalize adt fields during structural match checking rust-lang#72897
* Revert pr 71840 rust-lang#72989
* rust-lang/cargo#8361
* e658200 from rust-lang#72901

r? @ghost
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 23, 2020
…ir, r=tmandry

Check for assignments between non-conflicting generator saved locals

This is to prevent future changes to the generator transform from reintroducing the problem that caused rust-lang#73137. Namely, a store between two generator saved locals whose storage does not conflict.

My ultimate goal is to introduce a modified version of rust-lang#71956 that handles this case properly.

r? @tmandry
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 23, 2020
…ir, r=tmandry

Check for assignments between non-conflicting generator saved locals

This is to prevent future changes to the generator transform from reintroducing the problem that caused rust-lang#73137. Namely, a store between two generator saved locals whose storage does not conflict.

My ultimate goal is to introduce a modified version of rust-lang#71956 that handles this case properly.

r? @tmandry
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 23, 2020
…ir, r=tmandry

Check for assignments between non-conflicting generator saved locals

This is to prevent future changes to the generator transform from reintroducing the problem that caused rust-lang#73137. Namely, a store between two generator saved locals whose storage does not conflict.

My ultimate goal is to introduce a modified version of rust-lang#71956 that handles this case properly.

r? @tmandry
@ecstatic-morse ecstatic-morse deleted the remove-requires-storage-analysis branch October 6, 2020 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MaybeRequiresStorage for generators is not necessary
5 participants