Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add heuristic to filter_map and find_map #6453

Closed
wants to merge 5 commits into from

Conversation

camsteffen
Copy link
Contributor

changelog: Reduce false positives for filter_map and find_map

Fixes #3188
Fixes #4193

Simply check if filter(..) contains a closure that uses Option or Result anywhere in any way (using ExprUseVisitor). The assumption is: if the filter does not use Option or Result at all, it is perfectly fine as it is and turning it into filter_map would make the code more complex. And the very same thing applies to find_map.

The heuristic could probably be improved or made more specific down the road. But my goal here is remove false positives without losing any positive cases. I think this is a big improvement.

To anticipate a potential concern, there is one "category" of cases for these lints that will no longer be detected: where the code in filter and map are redundant, doing the same transformation on the value (e.g. filter(|n| n + 1 > 5).map(|n| n + 1). But I think this is an acceptable loss because 1) there is no logic in these lints that actually recognizes redundant code and 2) this could be a separate lint (e.g. repetitive_iterator) - such a lint could apply to a wider variety of Iterator methods (e.g. iter.inspect(|n| println!("{}", n+1)).fold(0, |s, n| s += n + 1)).

@rust-highfive
Copy link

r? @llogiq

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Dec 14, 2020
@ghost
Copy link

ghost commented Dec 15, 2020

Thanks for working on this.

The (corrected) suggestions to fix this lint from #3188 are:

Code Suggestion
filter(|x| <expr>.is_some()).map(|x| <expr>.unwrap()) filter_map(|x| <expr>)
filter(|x| <expr>.is_some()).flat_map(|x| <expr>) filter_map(|x| <expr>)
filter(|x| <expr>.is_ok()).map(|x| <expr>.unwrap()) filter_map(|x| <expr>.ok())

I really think that these are the only times filter_map should ever be over filter(..).map(..).

Why have you gone with the heuristic instead of matching the above? Is it because the heuristic was easier or did you find some other cases that should be let through?

@camsteffen
Copy link
Contributor Author

@mikerite Actually I would be fine with implementing that instead and would even prefer it tbh. I didn't do that because I wasn't sure if there was a consensus - maybe some people would want the lint to be kept a wide net. I think there will be some missed cases if the lint is made that specific. But I think those cases would be relatively rare. As a general rule, I think it's better to not have false positives than to have comprehensiveness. Basically I'm convincing myself that you are right as I'm typing 😆.

Here's an example of a case that would be missed:

.filter(|x| match x {
  Foo(a) => a.is_some(),
  Bar(_) => true,
})
.map(|x| match x {
  Foo(a) => a.unwrap(),
  Bar(b) => b,
})

It looks like a hard problem to generalize that...but it's possible! Maybe that could be a future enhancement. Or maybe I CAN do it!!! I'm really gaining steam here!

On a more technical note, in order to implement your suggestion, I believe we need to enhance SpanlessEq to recognize "equivalent locals". That is, in .filter(|x| x.parse().is_some()).map(|y| y.parse().unwrap()), x and y are "equivalent" when comparing the closures for equality. This is something I've put some thought towards and it would be fun for me to do.

So I think I'll start to look into this but it would be good to hear more opinions.

@llogiq
Copy link
Contributor

llogiq commented Dec 19, 2020

Given recent feedback, I'd say we should start out conservative and experiment with linting more from there. We might put a _more lint in the nursery to be able to get a feel for the kind of false positives we might get, or even find heuristics to avoid them.

@bors
Copy link
Collaborator

bors commented Dec 19, 2020

☔ The latest upstream changes (presumably #6316) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

@llogiq
Copy link
Contributor

llogiq commented Dec 19, 2020

rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

@camsteffen
Copy link
Contributor Author

@llogiq I'm not sure I understand. The "heuristic" makes the lint more conservative but @mikerite's suggestion makes it even more conservative, to the point where there should be no false positives. So to "start out conservative," do you mean to change the PR and adopt @mikerite's suggestion (which is what I'm leaning towards)? And then we could possibly take my "heuristic" approach and turn that into the filter_map_more lint you mentioned for the nursery.

@llogiq
Copy link
Contributor

llogiq commented Dec 20, 2020

Exactly. The recent feedback we got was overwhelming annoyance at false positives. So let's be careful to introduce the least of them we can, and work up from there, starting in the nursery for further approaches.

@flip1995
Copy link
Member

Rather than a new lint with *_more I'd prefer a config option to enable the lint on more cases. But I agree that we should start more conservative and than lint more cases from there.

@llogiq
Copy link
Contributor

llogiq commented Jan 3, 2021

Is there anything needed from our side, @camsteffen?

@camsteffen
Copy link
Contributor Author

I have made good progress and just have some finishing touches to do before updating this PR. I have some thoughts though.

Does this change merit a rename like redundant_filter_map since it no longer lints every case of filter().map()? The current name isn't so bad, but a rename would give the lint a "second chance" for projects with #![allow(clippy::filter_map)]. The lint should be much less controversial after this change.

I'm also wondering if we should just discard the "heuristic" thing. I just don't feel like it adds much value. But I'm open to keeping it with the configuration if anyone feels differently.

@ghost
Copy link

ghost commented Jan 4, 2021

Does this change merit a rename like redundant_filter_map since it no longer lints every case of filter().map()? The current name isn't so bad, but a rename would give the lint a "second chance" for projects with #![allow(clippy::filter_map)]. The lint should be much less controversial after this change.

I'm against renames just to reset false positives but I think this needs a rename anyway. I like the name manual_filter_map. There is already unnecessary_filter_map which suggests filter_map(..)filter(..).map(..) when appropriate.

Eventually Clippy (or rustc) will get something like #3122 add the problem of fixed false positives will be solved.

@camsteffen
Copy link
Contributor Author

I'm against renames just to reset false positives

Agreed. I wouldn't suggest a rename without a significant change to what the lint does.

I like the name manual_filter_map.

Sounds good to me (but I'll wait for maintainers to chime in).

There is already unnecessary_filter_map which suggests filter_map(..)filter(..).map(..) when appropriate.

Actually unnecessary_filter_map only lints cases that can be reduced to filter() OR map(). However I think that such a lint would be nice. Suppose that lint is added later and we name it separable_filter_map. Putting it all together...

  • unnecessary_filter_map: filter_map(|x| Some(..)) -> map(|x| ..) (also suggests filter)
  • manual_filter_map: filter(|x| opt(x).is_some()).map(|x| opt(x).unwrap()) -> filter_map(|x| opt(x))
  • separable_filter_map: filter_map(|x| if .. { Some(x + 1) } else { None }) -> filter(|x| ..).map(|x| x + 1)

@flip1995
Copy link
Member

flip1995 commented Jan 5, 2021

I'm also good with the rename to manual_filter_map 👍

@camsteffen
Copy link
Contributor Author

Well shoot there's another complication ☹️. The filter_map lint is also used to lint filter_map().map(), filter().flat_map(), filter_map().flat_map(). In each case, the lint is unconditional (nothing checked about the arguments to the methods). And I wonder why other combinations are not linted like map().filter_map()? Anyways, I certainly can't just lump these in with the new lint manual_filter_map. So what to do...?

  1. Leave them as is, still filter_map, at least for now?
  2. Create another new lint for these cases. combinable_iter_adaptors?
  3. Just remove these lint cases

Personally I feel that these lints go against the principle of zero-cost abstractions. That is, iterator adapters are a zero-cost abstraction (at least usually and ideally) and we should not be afraid to use them liberally if doing so makes the code more concise/readable. Perhaps combinable_iter_adapters should be a restriction lint.

@llogiq
Copy link
Contributor

llogiq commented Jan 12, 2021

I'd be OK with option 1. We can always get a follow-up PR. I'd also be OK with yet another lint. In either case, I think that unless there's a clear perf argument against it, the lints should be in the style group.

@camsteffen
Copy link
Contributor Author

the lints should be in the style group

But I am trying to make the case that combining iterator adapters can be bad style. For example, filter_map currently (incorrectly) lints the following code:

let _ = (0_u32..100)
    .filter_map(|n| match n {
        0 => None,
        1 => Some("one"),
        2 => Some("two"),
        n => Some("many"),
    })
    .map(my_function);

I suggested a "restriction" lint based on the idea that the programmer may want to combine iterator adapters at the cost of style.

@camsteffen
Copy link
Contributor Author

I opened #6591 to replace this PR.

@camsteffen camsteffen closed this Jan 15, 2021
bors added a commit that referenced this pull request Jan 22, 2021
`manual_filter_map` and `manual_find_map`

changelog: Add `manual_filter_map` and replace `find_map` with `manual_find_map`

Replaces #6453

Fixes #3188
Fixes #4193

~Depends on #6567 (to fix an internal lint false positive)~

This replaces `filter_map` and `find_map` with `manual_filter_map` and `manual_find_map` respectively. However, `filter_map` is left in place since it is used for a variety of other cases. See discussion in #6453.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make find_map lint be more conservative Make the filter_map lint more specific
5 participants