Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unwrap_or_else_default false positive #11135

Merged
merged 1 commit into from Jul 19, 2023

Conversation

smoelius
Copy link
Contributor

@smoelius smoelius commented Jul 9, 2023

This PR fixes a false positive in the handling of unwrap_or_else with a default value when the value is needed for type inference.

An easy example to exhibit the false positive is the following:

    let option = None;
    option.unwrap_or_else(Vec::new).push(1);

The following code would not compile, because the fact that the value is a Vec has been lost:

    let option = None;
    option.unwrap_or_default().push(1);

The fix is to:

  • implement a heuristic to tell whether an expression's type can be determined purely from its subexpressions, and the arguments and locals they use;
  • apply the heuristic to unwrap_or_else's receiver.

The heuristic returns false when applied to option in the above example, but it returns true when applied to option in either of the following examples:

    let option: Option<Vec<u64>> = None;
    option.unwrap_or_else(Vec::new).push(1);
    let option = None::<Vec<u64>>;
    option.unwrap_or_else(Vec::new).push(1);

(Aside: #10120 unfairly contained multiple changes in one PR. I am trying to break that PR up into smaller pieces.)


changelog: FP: [unwrap_or_else_default]: No longer lints if the default value is needed for type inference

@rustbot
Copy link
Collaborator

rustbot commented Jul 9, 2023

r? @Manishearth

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Jul 9, 2023
@Manishearth
Copy link
Member

r? @Centri3

@rustbot
Copy link
Collaborator

rustbot commented Jul 10, 2023

Failed to set assignee to Centri3: invalid assignee

Note: Only org members, users with write permissions, or people who have commented on the PR may be assigned.

@Centri3
Copy link
Member

Centri3 commented Jul 10, 2023

Will take a look soon enough ^^

Copy link
Member

@Centri3 Centri3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite the PR, thanks for taking this on!

Comment on lines 98 to 100
pub fn clear_def_id(self) -> Certainty {
match self {
Certainty::Uncertain => Certainty::Uncertain,
Certainty::Certain(_) => Certainty::Certain(None),
Certainty::Contradiction => Certainty::Contradiction,
}
}

pub fn with_def_id(self, def_id: DefId) -> Certainty {
match self {
Certainty::Uncertain => Certainty::Uncertain,
Certainty::Certain(_) => Certainty::Certain(Some(def_id)),
Certainty::Contradiction => Certainty::Contradiction,
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I feel like these two should use if let/matches! instead, since the others are unchanged

expr_type_certainty(cx, expr).is_certain()
}

fn expr_type_certainty(cx: &LateContext<'_>, expr: &Expr<'_>) -> Certainty {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this could use a Visitor instead, though perhaps I'm wrong

Copy link
Contributor Author

@smoelius smoelius Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure a visitor would make sense here. When a node has multiple children, the results of the respective recursive calls need to be combined in some way (e.g., meeted or joined). IMHO, visitors are not really good for that sort of situation. This could be a failure of my imagination, though.

(0..(generics.parent_count + generics.params.len()) as u32).all(|index| {
fn_sig.inputs().iter().any(|input_ty| {
input_ty.skip_binder().walk().any(|arg| {
if let GenericArgKind::Type(ty) = arg.unpack() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could perhaps use matches! here to prevent the else branch (like, matches!(arg.unpack(), GenericArgKind::Type(ty) if ty.is_param(index)))

Comment on lines 291 to 301
(0..(generics.parent_count + generics.params.len()) as u32).all(|index| {
fn_sig.inputs().iter().any(|input_ty| {
input_ty.skip_binder().walk().any(|arg| {
if let GenericArgKind::Type(ty) = arg.unpack() {
ty.is_param(index)
} else {
false
}
})
})
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite able to suggest something better rn, but this block is quite hard to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your point is very well taken. I moved the .walk().any(...) code into its own function and added a comment to help explain what this block does.

let rhs = if type_is_inferrable_from_arguments(cx, expr) {
meet(
std::iter::once(receiver_type_certainty).chain(args.iter().map(|arg| expr_type_certainty(cx, arg))),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This could perhaps use then followed by conversion to Certainty using map_or. Though I'm not sure if that's more readable here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what you had in mind?

            let rhs = type_is_inferrable_from_arguments(cx, expr)
                .then(|| meet(args.iter().map(|arg| expr_type_certainty(cx, arg))))
                .unwrap_or(Certainty::Uncertain);

Do you think it is more clear?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's not 😅 let's keep it the way it is.

{
Certainty::Certain(None)
} else {
Certainty::Uncertain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This could perhaps use then followed by conversion to Certainty using map_or. Though I'm not sure if that's more readable here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions as above (is this what you had in mind and it is more clear?):

                let lhs = ((parent_certainty.is_certain() || generics.parent_count == 0) && generics.params.is_empty())
                    .then(|| Certainty::Certain(None))
                    .unwrap_or(Certainty::Uncertain);

if let TyKind::Path(qpath) = &ty.kind {
qpath_certainty(cx, qpath, true)
} else {
let mut visitor = CertaintyVisitor::new(cx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Unnecessary else branch (afaict?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the else is necessary. I expanded the comment preceding the if in attempt to better explain why.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I meant that you could inline the else, like,

if let TyKind::Path(qpath) = &ty.kind {
    return qpath_certainy(cx, qpath, true)
}

let mut visitor = CertainyVisitor::new(cx);
// etc...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for misunderstanding. Fixed.

//! As a heuristic, `expr_type_is_certain` may produce false negatives, but a false positive should
//! be considered a bug.

#![allow(clippy::wrong_self_convention)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like bulk-allowing a lint like this, especially one like wrong_self_convention. We should probably put this on individual methods instead, as expect (PS: Which method is actually triggering it? I can't see any out of the ordinary.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which method is actually triggering it?

AFAICT, none are. (I can't remember why I included this originally.)

@smoelius
Copy link
Contributor Author

This is quite the PR, thanks for taking this on!

Thank you.

FWIW, I am highly sympathetic to comments like #11135 (comment). Please let me know anything I can do to make the code more clear.

@Centri3
Copy link
Member

Centri3 commented Jul 12, 2023

Yeah it looks good now 👍

@bors
Copy link
Collaborator

bors commented Jul 13, 2023

☔ The latest upstream changes (presumably #11095) made this pull request unmergeable. Please resolve the merge conflicts.

@Manishearth
Copy link
Member

@bors delegate=Centri3

@bors
Copy link
Collaborator

bors commented Jul 14, 2023

✌️ @Centri3, you can now approve this pull request!

If @Manishearth told you to "r=me" after making some further change, please make that change, then do @bors r=@Manishearth

match (self, other) {
(Some(lhs), Some(rhs)) => {
if lhs == rhs {
Some(Some(lhs))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is the perfect place to use then_some: (lhs == rhs).then_some(Some(lhs))

Copy link
Member

@Centri3 Centri3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good ^^ Let's just use then_some in meet then I think this is done :D

match (self, other) {
(None, _) | (_, None) => None,
(Some(lhs), Some(rhs)) => {
if lhs == rhs {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here as well (then_some); missed this one before

@smoelius
Copy link
Contributor Author

smoelius commented Jul 19, 2023

Looks good ^^ Let's just use then_some in meet then I think this is done :D

Thanks, @Centri3 . Should I squash this to one commit?

@Centri3
Copy link
Member

Centri3 commented Jul 19, 2023

Yeah if you could squash them that would be great

@Centri3
Copy link
Member

Centri3 commented Jul 19, 2023

Thanks! @bors r+

@bors
Copy link
Collaborator

bors commented Jul 19, 2023

📌 Commit f583fd1 has been approved by Centri3

It is now in the queue for this repository.

@bors
Copy link
Collaborator

bors commented Jul 19, 2023

⌛ Testing commit f583fd1 with merge 7a34143...

@bors
Copy link
Collaborator

bors commented Jul 19, 2023

☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test
Approved by: Centri3
Pushing 7a34143 to master...

@bors bors merged commit 7a34143 into rust-lang:master Jul 19, 2023
5 checks passed
@smoelius smoelius deleted the unwrap_or_else_default-fp branch July 19, 2023 11:52
bors added a commit that referenced this pull request Apr 4, 2024
type certainty: clear `DefId` when an expression's type changes to non-adt

Fixes #12585

The root cause of the ICE in the linked issue was in the expression `one.x`, in the array literal.

The type of `one` is the `One` struct: an adt with a DefId, so its certainty is `Certain(def_id_of_one)`. However, the field access `.x` can then change the type (to `i32` here) and that should update that `DefId` accordingly. It does do that correctly when `one.x` would be another adt with a DefId:

https://github.com/rust-lang/rust-clippy/blob/97ba291d5aa026353ad93e48cf00e06f08c73830/clippy_utils/src/ty/type_certainty/mod.rs#L90-L91

but when it *isn't* an adt and there is no def id (which is the case in the linked issue: `one.x` is an i32), it keeps the `DefId` of `One`, even though that's the wrong type (which would then lead to a contradiction later when joining `Certainty`s):
https://github.com/rust-lang/rust-clippy/blob/97ba291d5aa026353ad93e48cf00e06f08c73830/clippy_utils/src/ty/type_certainty/mod.rs#L92-L93

In particular, in the linked issue, `from_array([one.x, two.x])` would try to join the `Certainty` of the two array elements, which *should* have been `[Certain(None), Certain(None)]`, because `i32`s have no `DefId`, but instead it was `[Certain(One), Certain(Two)]`, because the DefId wasn't cleared from when it was visiting `one` and `two`. This is the "contradiction" that could be seen in the ICE message

... so this changes it to clear the `DefId` when it isn't an adt.

cc `@smoelius` you implemented this initially in #11135, does this change make sense to you?

changelog: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants