Make closure capturing have consistent and correct behaviour around patterns #138961
A name like "report_error" suggests that the error in question might be user facing. Use "bug" to make it clear that the error in question will be an ICE.
In rust-lang#124902, mem-categorization got merged into ExprUseVisitor itself. Adjust the comments that have become misleading or confusing following this change.
Replace debug! calls that output a worse version of what #[instrument] does.
This PR changes a file inside |
Nadrieril suggested that this should be resolved through a breaking change – updated the PR description accordingly. @rustbot label +needs-crater r? @Nadrieril |
This solves the "can't find the upvar" ICEs that resulted from `maybe_read_scrutinee` being unfit for purpose.
@compiler-errors You've requested that the fix for #137553 land in a separate PR. However, ironically, the breaking changes are actually required by #137467 and not #137553. Do you think the removal of the now-obsolete |
We can crater both together if you think they're not worth separating. I was just trying to accelerate landing the parts that are obviously-not-breaking but it's up to you if you think that effort is worth it or if you're willing to be patient about waiting for the breaking parts (and FCP, etc). @bors try |
ExprUseVisitor: properly report discriminant reads

This PR fixes rust-lang#137467. In order to do so, it needs to introduce a small breaking change surrounding the interaction of closure captures with matching against enums with uninhabited variants. Yes – to fix an ICE!

## Background

The current upvar inference code handles patterns in two parts:
- `ExprUseVisitor::walk_pat` finds the *bindings* being done by the pattern and captures the relevant parts
- `ExprUseVisitor::maybe_read_scrutinee` determines whether matching against the pattern will at any point require inspecting a discriminant, and if so, captures *the entire scrutinee*. It also has some weird logic around bindings, deciding to also capture the entire scrutinee if *pretty much any binding exists in the pattern*, with some weird behavior like rust-lang#137553.

Nevertheless, something like `|| let (a, _) = x;` will only capture `x.0`, because `maybe_read_scrutinee` does not run for irrefutable patterns at all. This causes issues like rust-lang#137467, where the closure wouldn't be capturing enough, because an irrefutable or-pattern can still require inspecting a discriminant, and the match lowering would then panic, because it couldn't find an appropriate upvar in the closure.

My thesis is that this is not a reasonable implementation. To that end, I intend to merge the functionality of both these parts into `walk_pat`, which will bring upvar inference closer to what the MIR lowering actually needs – both in making sure that necessary variables get captured, fixing rust-lang#137467, and in reducing the cases where redundant variables do – fixing rust-lang#137553.

This PR introduces the necessary logic into `walk_pat`, fixing rust-lang#137467. A subsequent PR will remove `maybe_read_scrutinee` entirely, which should now be redundant, fixing rust-lang#137553. The latter is still pending, as my current revision doesn't handle opaque types correctly for some reason I haven't looked into yet.

## The breaking change

The following example, adapted from the testsuite, compiles on current stable, but will not compile with this PR:

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Void {}

pub fn main() {
    let mut r = Result::<Void, (u32, u32)>::Err((0, 0));
    let mut f = || {
        let Err((ref mut a, _)) = r;
        *a = 1;
    };
    let mut g = || {
    //~^ ERROR: cannot borrow `r` as mutable more than once at a time
        let Err((_, ref mut b)) = r;
        *b = 2;
    };
    f();
    g();
    assert_eq!(r, Err((1, 2)));
}
```

The issue is that, to determine that matching against `Err` here doesn't require inspecting the discriminant, we need to query the `InhabitedPredicate` of the types involved. However, as upvar inference is done during typechecking, the relevant type might not yet be fully inferred. Because of this, performing such a check hits this assertion:

https://github.com/rust-lang/rust/blob/43f0014ef0f242418674f49052ed39b70f73bc1c/compiler/rustc_middle/src/ty/inhabitedness/mod.rs#L121

The code used to compile fine, but only because the compiler incorrectly assumed that patterns used within a `let` cannot possibly be inspecting any discriminants.

## Is the breaking change necessary?

One other option would be to double down, and introduce a deliberate semantics difference between `let $pat = $expr;` and `match $expr { $pat => ... }`, that syntactically determines whether the pattern is in an irrefutable position, instead of querying the types.

**This would not eliminate the breaking change,** but it would limit it to more contrived examples, such as

```rust
let ((true, Err((ref mut a, _, _))) | (false, Err((_, ref mut a, _)))) = x;
```

The cost here, would be the complexity added with very little benefit.

## Other notes

- I performed various cleanups while working on this. The last commit of the PR is the interesting one.
- Due to the temporary duplication of logic between `maybe_read_scrutinee` and `walk_pat`, some of the `#[rustc_capture_analysis]` tests report duplicate messages before deduplication. This is harmless.
That's the thing – one part is a breaking change, the other introduces insta-stable new behavior. There's no easily mergeable part to this. |
could we give this a less weird pr title pls 💀 @bors try |
Dear @rust-lang/lang, this PR proposes a breaking change to the language to fix some bizarre edge cases around precise closure captures. Crater found a single breakage, in a GitHub project. WDYT? |
Rollup merge of rust-lang#139086 - meithecatte:expr-use-visitor-cleanup, r=compiler-errors Various cleanup in ExprUseVisitor These are the non-behavior-changing commits from rust-lang#138961.
If we were to OK this, we'd end up asking for a PR to be made to the affected project, particularly as it seems to be maintained. Probably you'll want to go ahead and do this now in any case, as it will make the story a bit simpler when we pick this up. |
Fortunately also, the regression is in the |
Forward compatibility with rust-lang/rust#138961
PR merged on our end. Thanks for letting us know. |
```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Void {}

pub fn main() {
    let mut r = Result::<Void, (u32, u32)>::Err((0, 0));
    let mut f = || {
        let Err((ref mut a, _)) = r;
        *a = 1;
    };
    let mut g = || {
    //~^ ERROR: cannot borrow `r` as mutable more than once at a time
        let Err((_, ref mut b)) = r;
        *b = 2;
    };
    f();
    g();
    assert_eq!(r, Err((1, 2)));
}
```
Maybe?:
```rust
enum Void {}
pub fn main() {
    let mut r = Result::<Void, (u32, u32)>::Err((0, 0));
    let mut f = || {
        let Err((ref mut a, _)) = r;
        *a = 1;
    };
    let mut g = || {
    //~^ ERROR: cannot borrow `r` as mutable more than once at a time
        let Err((_, ref mut b)) = r;
        *b = 2;
    };
    f();
    g();
    assert!(matches!(r, Err((1, 2))));
}
```
I like doing away with those derives if they're not part of the demonstration.
I.e.:

```rust
enum Void {}
pub fn main() {
    let mut r = Result::<Void, (u32, u32)>::Err((0, 0));
    let mut f = || {
        let Err((ref mut a, _)) = r;
        *a = 1;
    };
    let mut g = || {
    //[stable]~^ OK
    //[post-pr]~^ ERROR: cannot borrow `r` as mutable more than once at a time
        let Err((_, ref mut b)) = r;
        *b = 2;
    };
    f();
    g();
    assert!(matches!(r, Err((1, 2))));
}
```

This is my only reservation about this, as this is a bit of a strange and unfortunate outcome as a language matter. The story we want to tell about uninhabited types is that

I believe you about the implementation difficulties. It is a bit surprising, though, given that we need to know that

If it weren't for the fact that this kind of accidentally worked for other reasons, it'd be easy to say, "it's OK; we could always make this work later." But it feels less ideal to say that when we're incurring breakage.

Other than this reservation, this all seems right to me.
Well, as far as closure capture goes, we don't check this at all – we tacitly assume that matching against any pattern accepted for a
I believe this is the only instance where we would need to inspect whether a type is uninhabited before type inference is done. The only way we could properly delay it would be to do minimal capture analysis as the very last step of type inference – and make sure that we still properly handle cases where this leads to a cycle.
Well, there's already the difference that you can't use the This is consistent with the interpretation that
Technically, this is true, but the fact that pattern-matching is included in precise captures isn't a very well-known feature in the first place, and Crater didn't find any concrete instance of this breakage. Currently, we let this through, at the cost of ICE-ing on perfectly reasonable code like in #137467. |
OK, I buy that. Looks good to me then. I propose we take the breakage and the new behavior as described. @rfcbot fcp merge |
Team member @traviscross has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
This all makes sense to me at first sight, in particular the part about uninhabited variants in the last 2 comments. From a Miri perspective, we'll want to see a discriminant read at that point in the closure so that we can report UB as appropriate; this can't even work if the discriminant does not get captured in the first place. That said, I'd definitely like to hear from @Nadrieril before we land this. :) Would it be possible to have a Miri test like this? I'm thinking of a |
This should pass, and I'll investigate creating a test like that. But do note that currently, the actual MIR lowering code does check for inhabitedness, so a test with one inhabited variant and one uninhabited one would not pass, because the read of the discriminant would be skipped. To do this in an "honest" manner, we would also want to break compat outside of closures, and make the actual lowering perform a read in that case, bringing the semantics into a consistent state between the two. |
Hm, I think we only skip the read in cases where it cannot fail since the place is by-val, or were there still some gaps in that? Though if let only does this for by-val cases we will not be able to do it in a closure either.
I thought there should be a test here that is UB with this PR but was accepted before. But maybe not.
I think we're talking about a different check. See this part of the code, where if we're matching the only inhabited variant, we don't emit the rust/compiler/rustc_mir_build/src/builder/matches/match_pair.rs Lines 269 to 282 in 7bfd952
That's deep in the guts of pattern elaboration, I cannot follow that code or relate it to the MIR we generate.^^ I am entirely talking about what it looks like in MIR.
Though maybe I am entirely on the wrong train here. This PR does not change the MIR in the closure, does it? But it does mean more things get captured in some cases which should leave a trace in the MIR where the closure is constructed?
Pretty much. It only changes what gets captured into the closure, and by extension the exact place expressions MIR later uses to access these upvars.
Yes, more things in some cases (#137467), fewer things in others (#137553).
To the best of my knowledge, this PR makes
Okay, looks like I was not aware of some of the behavior here. As it turns out, exhaustiveness only allows omitting the uninhabited branch when matching by value. I tried this test case:

```rust
enum Void {}
enum Foo {
    A(u32),
    B(Void),
}
fn check_foo(x: &Foo) {
    match x {
        Foo::A(x) => {},
    }
}
```

and got this error, which I haven't seen before:
This means that we only get to the code I linked when the uninhabitedness doesn't require reading through any references. In this case, I will try to add a testcase like you described. Off the top of your head, are you aware of any similar tests that create enum values with invalid discriminants? I'm not sure how I'd go about doing that, I must admit unsafe code is not my strong suit. |
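To make the by-value versus by-reference distinction above concrete, here is a small sketch. It assumes a toolchain on which omitting arms for empty types is stable (Rust 1.82 or later, as far as I know); the function names `by_value` and `by_ref` are made up for illustration and do not come from the PR.

```rust
// An enum with no variants: it has no values, so it is uninhabited.
enum Void {}

#[allow(dead_code)]
enum Foo {
    A(u32),
    B(Void),
}

// By value: the `Foo::B` arm may be omitted because `Void` is uninhabited.
fn by_value(x: Foo) -> u32 {
    match x {
        Foo::A(n) => n,
    }
}

// Through a reference: the arm for `Foo::B` has to be written out.
// The empty match on `*v` is accepted because `Void` has no variants.
fn by_ref(x: &Foo) -> u32 {
    match x {
        Foo::A(n) => *n,
        Foo::B(v) => match *v {},
    }
}

fn main() {
    assert_eq!(by_value(Foo::A(7)), 7);
    assert_eq!(by_ref(&Foo::A(8)), 8);
}
```

Dropping the `Foo::B` arm from `by_ref` should reproduce the kind of exhaustiveness error described above.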
Right, but are they equivalent in terms of the MIR that is generated for the pattern?
For this case I'd do something like:

```rust
#![feature(never_type)]
#[repr(C)]
enum E {
    V1, // discriminant: 0
    V2, // 1
    V3(!), // 2
}
fn main() {
    assert_eq!(std::mem::size_of::<E>(), 4);
    let val = 2u32;
    let ptr = (&raw const val).cast::<E>();
    // This is invalid:
    unsafe { ptr.read() };
}
```
This PR has two goals:
- `Option::unwrap()` on a `None` value with refutable patterns #138973, a slightly different case with the same root cause.
- `x` and `x @ _` irrefutable patterns #137553, making the closure capturing rules consistent between `let` patterns and `match` patterns. This is new insta-stable behavior.

Background
This change concerns how precise closure captures interact with patterns. As a little known feature, patterns that require inspecting only part of a value will only cause that part of the value to get captured:
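A minimal sketch of that behavior, assuming edition 2021 or later (where closures capture disjoint fields). The names `bump_first_only`, `bump_first` and `pair` are invented for this illustration. The closure's pattern only binds `pair.0`, so `pair.1` can still be borrowed while the closure is alive:

```rust
/// Returns the pair after a closure has modified only its first field.
fn bump_first_only() -> (u32, u32) {
    let mut pair = (1u32, 2u32);

    // The pattern binds only the first field, so (in edition 2021+)
    // the closure captures `pair.0` by mutable reference, not all of `pair`.
    let mut bump_first = || {
        let (ref mut first, _) = pair;
        *first += 10;
    };

    // `pair.1` was not captured, so this shared borrow can coexist
    // with the still-live closure.
    let second = &pair.1;
    bump_first();
    assert_eq!(*second, 2);

    pair
}

fn main() {
    assert_eq!(bump_first_only(), (11, 2));
}
```

In earlier editions the closure would capture all of `pair`, and the shared borrow of `pair.1` would be expected to conflict with it.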
I was not able to find any discussion of this behavior being introduced, or discussion of its edge-cases, but it is documented in the Rust reference.
The currently stable behavior is as follows:
- (`walk_pat`)
- (`match`, `if let`, `let ... else`, but not destructuring `let` or destructuring function parameters) get processed as follows (`maybe_read_scrutinee`):
  - `@`-pattern, capture the entire scrutinee by reference

You will note that this behavior is quite weird and it's hard to imagine a sensible rationale for at least some of its aspects. It has the following issues:
- `@`-pattern doesn't really have any semantics by itself. This is the weird behavior tracked as Closure captures are inconsistent between `x` and `x @ _` irrefutable patterns #137553.
- `let` and pattern-matching done through `match` – which is a superficial syntactic difference

This PR aims to address all of the above issues. The new behavior is as follows:
"requires inspecting a discriminant" is also used here to mean "compare something with a constant" and other such decisions. For types other than ADTs, the details are not interesting and aren't changing.
The breaking change
During closure capture analysis, matching an `enum` against a constructor is considered to require inspecting a discriminant if the `enum` has more than one variant. Notably, this is the case even if all the other variants happen to be uninhabited. This is motivated by implementation difficulties involved in querying whether types are inhabited before we're done with type inference – without moving mountains to make it happen, you hit this assert:

`rust/compiler/rustc_middle/src/ty/inhabitedness/mod.rs`, line 121 in 43f0014
Now, because the previous implementation did not concern itself with capturing the discriminants for irrefutable patterns at all, this is a breaking change – the following example, adapted from the testsuite, compiles on current stable, but will not compile with this PR:
Is the breaking change necessary?
One other option would be to double down, and introduce a set of syntactic rules for determining whether a sub-pattern is in an irrefutable position, instead of querying the types and checking how many variants there are.

This would not eliminate the breaking change, but it would limit it to more contrived examples, such as `let ((true, Err((ref mut a, _, _))) | (false, Err((_, ref mut a, _)))) = x;`.

In this example, the `Err`s would not be considered in an irrefutable position, because they are part of an or-pattern. However, current stable would treat this just like a tuple `(bool, (T, U, _))`.

While introducing such a distinction would limit the impact, I would say that the added complexity would not be commensurate with the benefit it introduces.
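The contrived or-pattern discussed here (spelled out in the earlier revision of this description) can be tried on its own. The sketch below deliberately sits outside any closure, so capture analysis is not involved; it only illustrates that the pattern is accepted as irrefutable in a `let`. The function name `set_selected` and the concrete field types are assumptions made for the example, and it relies on a toolchain that allows omitting arms for empty types (Rust 1.82 or later, as far as I know).

```rust
enum Void {}

/// Writes 1 into the tuple field selected by `flag` and returns the tuple.
fn set_selected(flag: bool) -> (u32, u32, u32) {
    let mut x: (bool, Result<Void, (u32, u32, u32)>) = (flag, Err((0, 0, 0)));

    // Irrefutable as a whole: the `Ok` side is uninhabited, and the two
    // alternatives cover both values of the `bool`.
    let ((true, Err((ref mut a, _, _))) | (false, Err((_, ref mut a, _)))) = x;
    *a = 1;

    // Read the payload back out; `Ok` again needs no arm.
    match x.1 {
        Err(t) => t,
    }
}

fn main() {
    assert_eq!(set_selected(true), (1, 0, 0));
    assert_eq!(set_selected(false), (0, 1, 0));
}
```

Which field `a` refers to depends on the `bool`, which is why the pattern as a whole still has to inspect a value even though the `let` is irrefutable.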
The new insta-stable behavior
If a pattern in a `match` expression or similar has parts it will never read, this part will not be captured anymore:

Note that this behavior was pretty much already present, but only accessible with this One Weird Trick™:
Implementation notes
The first commits of the PR perform various cleanups. The action happens in two parts:
- `walk_pat` perform all necessary capturing. This is the part that fixes internal compiler error: two identical projections #137467.
- `x` and `x @ _` irrefutable patterns #137553.

The new logic stops making the distinction between one particular example that used to work, and another ICE, tracked as #119786. As this requires an unstable feature, I am leaving this as future work.