Add heuristic to filter_map and find_map #6453

camsteffen · 2020-12-14T20:53:56Z

changelog: Reduce false positives for filter_map and find_map

Simply check if filter(..) contains a closure that uses Option or Result anywhere in any way (using ExprUseVisitor). The assumption is: if the filter does not use Option or Result at all, it is perfectly fine as it is and turning it into filter_map would make the code more complex. And the very same thing applies to find_map.
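To make the heuristic concrete, here is a minimal sketch of the two kinds of chains it is meant to separate. The function names and sample data are invented for illustration; only the iterator and integer methods come from the standard library, and nothing here is taken from the PR's implementation.

```rust
/// The filter closure never touches Option or Result, so under the proposed
/// heuristic this chain would be left alone.
fn doubled_evens(nums: &[i32]) -> Vec<i32> {
    nums.iter().filter(|n| **n % 2 == 0).map(|n| n * 2).collect()
}

/// The filter closure inspects an Option (from checked_sub), so the
/// heuristic would still consider this chain a filter_map candidate.
fn minus_ten_chained(nums: &[u32]) -> Vec<u32> {
    nums.iter()
        .filter(|n| n.checked_sub(10).is_some())
        .map(|n| n.checked_sub(10).unwrap())
        .collect()
}

/// The filter_map form of the same computation.
fn minus_ten_combined(nums: &[u32]) -> Vec<u32> {
    nums.iter().filter_map(|n| n.checked_sub(10)).collect()
}

fn main() {
    let nums = [5, 10, 25];
    // Both forms keep only the values where the subtraction does not underflow.
    assert_eq!(minus_ten_chained(&nums), minus_ten_combined(&nums));
    println!("{:?}", doubled_evens(&[1, 2, 3, 4]));
}
```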

The heuristic could probably be improved or made more specific down the road. But my goal here is to remove false positives without losing any true positives. I think this is a big improvement.

To anticipate a potential concern, there is one "category" of cases for these lints that will no longer be detected: where the code in filter and map is redundant, doing the same transformation on the value (e.g. filter(|n| n + 1 > 5).map(|n| n + 1)). But I think this is an acceptable loss because 1) there is no logic in these lints that actually recognizes redundant code and 2) this could be a separate lint (e.g. repetitive_iterator) - such a lint could apply to a wider variety of Iterator methods (e.g. iter.inspect(|n| println!("{}", n+1)).fold(0, |s, n| s + n + 1)).
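As a rough illustration of the "redundant transformation" category, the sketch below puts the chain from the example next to one way of computing the value only once. The rewrite is only an assumption about what a hypothetical lint such as repetitive_iterator might suggest; the function names are made up.

```rust
/// The `n + 1` computation appears in both closures.
fn repeated(nums: &[i32]) -> Vec<i32> {
    nums.iter().filter(|n| **n + 1 > 5).map(|n| n + 1).collect()
}

/// Same result, but the transformation is written once and filtered afterwards.
fn computed_once(nums: &[i32]) -> Vec<i32> {
    nums.iter().map(|n| n + 1).filter(|m| *m > 5).collect()
}

fn main() {
    let nums = [3, 4, 5, 6];
    assert_eq!(repeated(&nums), computed_once(&nums));
    println!("{:?}", computed_once(&nums));
}
```

Neither closure involves Option or Result, which is why the heuristic would stop flagging this shape.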

rust-highfive · 2020-12-14T20:53:59Z

r? @llogiq

(rust-highfive has picked a reviewer for you, use r? to override)

ghost · 2020-12-15T05:01:54Z

Thanks for working on this.

The (corrected) suggestions to fix this lint from #3188 are:

| Code | Suggestion |
| --- | --- |
| `filter(\|x\| <expr>.is_some()).map(\|x\| <expr>.unwrap())` | `filter_map(\|x\| <expr>)` |
| `filter(\|x\| <expr>.is_some()).flat_map(\|x\| <expr>)` | `filter_map(\|x\| <expr>)` |
| `filter(\|x\| <expr>.is_ok()).map(\|x\| <expr>.unwrap())` | `filter_map(\|x\| <expr>.ok())` |
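The three rows can be checked with a small sketch. Here `<expr>` is arbitrarily taken to be `w.chars().next()` for the Option rows and `w.parse::<i32>()` for the Result row; those choices and the function names are assumptions made for the example.

```rust
// Row 1: is_some() + unwrap()
fn row1_before(words: &[&str]) -> Vec<char> {
    words
        .iter()
        .filter(|w| w.chars().next().is_some())
        .map(|w| w.chars().next().unwrap())
        .collect()
}

// Row 2: is_some() + flat_map (Option is iterable, yielding zero or one item)
fn row2_before(words: &[&str]) -> Vec<char> {
    words
        .iter()
        .filter(|w| w.chars().next().is_some())
        .flat_map(|w| w.chars().next())
        .collect()
}

// Suggested replacement for rows 1 and 2
fn option_after(words: &[&str]) -> Vec<char> {
    words.iter().filter_map(|w| w.chars().next()).collect()
}

// Row 3: is_ok() + unwrap()
fn row3_before(words: &[&str]) -> Vec<i32> {
    words
        .iter()
        .filter(|w| w.parse::<i32>().is_ok())
        .map(|w| w.parse::<i32>().unwrap())
        .collect()
}

// Suggested replacement for row 3: convert the Result with ok()
fn result_after(words: &[&str]) -> Vec<i32> {
    words.iter().filter_map(|w| w.parse::<i32>().ok()).collect()
}

fn main() {
    let words = ["12", "", "x7", "3"];
    assert_eq!(row1_before(&words), option_after(&words));
    assert_eq!(row2_before(&words), option_after(&words));
    assert_eq!(row3_before(&words), result_after(&words));
}
```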

I really think that these are the only cases where filter_map should ever be preferred over filter(..).map(..).

Why have you gone with the heuristic instead of matching the above? Is it because the heuristic was easier or did you find some other cases that should be let through?

camsteffen · 2020-12-15T16:08:26Z

@mikerite Actually I would be fine with implementing that instead and would even prefer it tbh. I didn't do that because I wasn't sure if there was a consensus - maybe some people would want the lint to keep casting a wide net. I think there will be some missed cases if the lint is made that specific. But I think those cases would be relatively rare. As a general rule, I think it's better to not have false positives than to have comprehensiveness. Basically I'm convincing myself that you are right as I'm typing 😆.

Here's an example of a case that would be missed:

.filter(|x| match x {
  Foo(a) => a.is_some(),
  Bar(_) => true,
})
.map(|x| match x {
  Foo(a) => a.unwrap(),
  Bar(b) => b,
})

It looks like a hard problem to generalize that...but it's possible! Maybe that could be a future enhancement. Or maybe I CAN do it!!! I'm really gaining steam here!
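For reference, the missed case above can be made runnable as follows. The comment does not say what Foo and Bar wrap, so the `Item` enum with `Foo(Option<i32>)` and `Bar(i32)` is an assumption, as are the function names.

```rust
#[derive(Clone)]
enum Item {
    Foo(Option<i32>),
    Bar(i32),
}

/// The filter(..).map(..) shape from the comment above.
fn chained(items: Vec<Item>) -> Vec<i32> {
    items
        .into_iter()
        .filter(|x| match x {
            Item::Foo(a) => a.is_some(),
            Item::Bar(_) => true,
        })
        .map(|x| match x {
            Item::Foo(a) => a.unwrap(),
            Item::Bar(b) => b,
        })
        .collect()
}

/// What a generalized lint would have to derive: one match producing an Option.
fn combined(items: Vec<Item>) -> Vec<i32> {
    items
        .into_iter()
        .filter_map(|x| match x {
            Item::Foo(a) => a,
            Item::Bar(b) => Some(b),
        })
        .collect()
}

fn main() {
    let items = vec![Item::Foo(Some(1)), Item::Foo(None), Item::Bar(3)];
    assert_eq!(chained(items.clone()), combined(items));
}
```

The two match arms have to be merged pairwise, which is what makes this harder to detect than the three table rows.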

On a more technical note, in order to implement your suggestion, I believe we need to enhance SpanlessEq to recognize "equivalent locals". That is, in .filter(|x| x.parse().is_ok()).map(|y| y.parse().unwrap()), x and y are "equivalent" when comparing the closures for equality. This is something I've put some thought towards and it would be fun for me to do.
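The idea of "equivalent locals" can be sketched on a toy expression tree. This is not Clippy's SpanlessEq, which operates on the compiler's HIR; the `Expr` type, the helper constructors and `eq_expr` are all hypothetical and only show the comparison rule: two locals match if each one is its own closure's parameter.

```rust
/// A deliberately tiny expression type: a local variable or a
/// zero-argument method call on a receiver.
enum Expr {
    Local(String),
    MethodCall(Box<Expr>, String),
}

fn local(name: &str) -> Expr {
    Expr::Local(name.to_string())
}

fn call(recv: Expr, method: &str) -> Expr {
    Expr::MethodCall(Box::new(recv), method.to_string())
}

/// Structural equality where `param_a` (in `a`) and `param_b` (in `b`)
/// are treated as the same local.
fn eq_expr(a: &Expr, b: &Expr, param_a: &str, param_b: &str) -> bool {
    match (a, b) {
        (Expr::Local(x), Expr::Local(y)) => {
            let x_is_param = x.as_str() == param_a;
            let y_is_param = y.as_str() == param_b;
            // Either both are the closure parameters, or neither is and
            // they name the same outer variable.
            (x_is_param && y_is_param) || (!x_is_param && !y_is_param && x == y)
        }
        (Expr::MethodCall(ra, ma), Expr::MethodCall(rb, mb)) => {
            ma == mb && eq_expr(ra, rb, param_a, param_b)
        }
        _ => false,
    }
}

fn main() {
    // `x.parse()` from the filter closure vs `y.parse()` from the map closure.
    let in_filter = call(local("x"), "parse");
    let in_map = call(local("y"), "parse");
    assert!(eq_expr(&in_filter, &in_map, "x", "y"));
}
```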

So I think I'll start to look into this but it would be good to hear more opinions.

llogiq · 2020-12-19T19:22:43Z

Given recent feedback, I'd say we should start out conservative and experiment with linting more from there. We might put a _more lint in the nursery to be able to get a feel for the kind of false positives we might get, or even find heuristics to avoid them.

bors · 2020-12-19T21:52:42Z

☔ The latest upstream changes (presumably #6316) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

llogiq · 2020-12-19T22:58:00Z

rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

camsteffen · 2020-12-19T23:36:24Z

@llogiq I'm not sure I understand. The "heuristic" makes the lint more conservative but @mikerite's suggestion makes it even more conservative, to the point where there should be no false positives. So to "start out conservative," do you mean to change the PR and adopt @mikerite's suggestion (which is what I'm leaning towards)? And then we could possibly take my "heuristic" approach and turn that into the filter_map_more lint you mentioned for the nursery.

llogiq · 2020-12-20T18:13:02Z

Exactly. The recent feedback we got was overwhelming annoyance at false positives. So let's be careful to introduce the least of them we can, and work up from there, starting in the nursery for further approaches.

flip1995 · 2020-12-27T13:14:04Z

Rather than a new lint with *_more I'd prefer a config option to enable the lint on more cases. But I agree that we should start more conservative and then lint more cases from there.

llogiq · 2021-01-03T10:28:04Z

Is there anything needed from our side, @camsteffen?

camsteffen · 2021-01-03T16:08:37Z

I have made good progress and just have some finishing touches to do before updating this PR. I have some thoughts though.

Does this change merit a rename like redundant_filter_map since it no longer lints every case of filter().map()? The current name isn't so bad, but a rename would give the lint a "second chance" for projects with #![allow(clippy::filter_map)]. The lint should be much less controversial after this change.

I'm also wondering if we should just discard the "heuristic" thing. I just don't feel like it adds much value. But I'm open to keeping it with the configuration if anyone feels differently.

ghost · 2021-01-04T06:05:56Z

> Does this change merit a rename like redundant_filter_map since it no longer lints every case of filter().map()? The current name isn't so bad, but a rename would give the lint a "second chance" for projects with #![allow(clippy::filter_map)]. The lint should be much less controversial after this change.

I'm against renames just to reset false positives but I think this needs a rename anyway. I like the name manual_filter_map. There is already unnecessary_filter_map which suggests filter_map(..) → filter(..).map(..) when appropriate.

Eventually Clippy (or rustc) will get something like #3122 and the problem of fixed false positives will be solved.

camsteffen · 2021-01-05T15:34:56Z

> I'm against renames just to reset false positives

Agreed. I wouldn't suggest a rename without a significant change to what the lint does.

> I like the name manual_filter_map.

Sounds good to me (but I'll wait for maintainers to chime in).

> There is already unnecessary_filter_map which suggests filter_map(..) → filter(..).map(..) when appropriate.

Actually unnecessary_filter_map only lints cases that can be reduced to filter() OR map(). However I think that such a lint would be nice. Suppose that lint is added later and we name it separable_filter_map. Putting it all together...

unnecessary_filter_map: filter_map(|x| Some(..)) -> map(|x| ..) (also suggests filter)
manual_filter_map: filter(|x| opt(x).is_some()).map(|x| opt(x).unwrap()) -> filter_map(|x| opt(x))
separable_filter_map: filter_map(|x| if .. { Some(x + 1) } else { None }) -> filter(|x| ..).map(|x| x + 1)
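A short sketch of the first and third entries as before/after pairs (manual_filter_map follows the table earlier in the thread). Keep in mind that separable_filter_map is only a name proposed in this comment, not an existing lint, and the function names and the `> 2` condition are made up.

```rust
// unnecessary_filter_map: the closure always returns Some, so map is enough.
fn always_some(nums: &[i32]) -> Vec<i32> {
    nums.iter().filter_map(|x| Some(x * 2)).collect()
}

fn plain_map(nums: &[i32]) -> Vec<i32> {
    nums.iter().map(|x| x * 2).collect()
}

// separable_filter_map (hypothetical): the condition and the transformation
// are independent, so they can be split into filter and map.
fn fused(nums: &[i32]) -> Vec<i32> {
    nums.iter()
        .filter_map(|x| if *x > 2 { Some(x + 1) } else { None })
        .collect()
}

fn separated(nums: &[i32]) -> Vec<i32> {
    nums.iter().filter(|x| **x > 2).map(|x| x + 1).collect()
}

fn main() {
    let nums = [1, 2, 3, 4];
    assert_eq!(always_some(&nums), plain_map(&nums));
    assert_eq!(fused(&nums), separated(&nums));
}
```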

flip1995 · 2021-01-05T15:43:12Z

I'm also good with the rename to manual_filter_map 👍

camsteffen · 2021-01-06T17:32:56Z

Well shoot there's another complication ☹️. The filter_map lint is also used to lint filter_map().map(), filter().flat_map(), filter_map().flat_map(). In each case, the lint is unconditional (nothing checked about the arguments to the methods). And I wonder why other combinations are not linted like map().filter_map()? Anyways, I certainly can't just lump these in with the new lint manual_filter_map. So what to do...?

1. Leave them as is, still filter_map, at least for now?
2. Create another new lint for these cases, e.g. combinable_iter_adapters?
3. Just remove these lint cases.

Personally I feel that these lints go against the principle of zero-cost abstractions. That is, iterator adapters are a zero-cost abstraction (at least usually and ideally) and we should not be afraid to use them liberally if doing so makes the code more concise/readable. Perhaps combinable_iter_adapters should be a restriction lint.

llogiq · 2021-01-12T20:50:41Z

I'd be OK with option 1. We can always get a follow-up PR. I'd also be OK with yet another lint. In either case, I think that unless there's a clear perf argument against it, the lints should be in the style group.

camsteffen · 2021-01-12T21:38:40Z

> the lints should be in the style group

But I am trying to make the case that combining iterator adapters can be bad style. For example, filter_map currently (incorrectly) lints the following code:

let _ = (0_u32..100)
    .filter_map(|n| match n {
        0 => None,
        1 => Some("one"),
        2 => Some("two"),
        n => Some("many"),
    })
    .map(my_function);

I suggested a "restriction" lint based on the idea that the programmer may want to combine iterator adapters at the cost of style.
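To see the trade-off, the sketch below places the chain above next to the merged form that the existing filter_map lint pushes towards. `my_function` is not defined in the thread, so its body here (returning the string length) is a placeholder assumption.

```rust
// Placeholder for the undefined `my_function` in the example above.
fn my_function(s: &str) -> usize {
    s.len()
}

/// filter_map(..).map(..): the match stays small and the mapping is separate.
fn chained() -> Vec<usize> {
    (0_u32..100)
        .filter_map(|n| match n {
            0 => None,
            1 => Some("one"),
            2 => Some("two"),
            _ => Some("many"),
        })
        .map(my_function)
        .collect()
}

/// Merged into one filter_map: my_function has to be repeated in every arm.
fn combined() -> Vec<usize> {
    (0_u32..100)
        .filter_map(|n| match n {
            0 => None,
            1 => Some(my_function("one")),
            2 => Some(my_function("two")),
            _ => Some(my_function("many")),
        })
        .collect()
}

fn main() {
    assert_eq!(chained(), combined());
}
```

Both produce the same values; the merged version is simply noisier, which is the style argument being made here.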

camsteffen · 2021-01-15T01:32:37Z

I opened #6591 to replace this PR.

`manual_filter_map` and `manual_find_map`

changelog: Add `manual_filter_map` and replace `find_map` with `manual_find_map`

Replaces #6453
Fixes #3188
Fixes #4193
~Depends on #6567 (to fix an internal lint false positive)~

This replaces `filter_map` and `find_map` with `manual_filter_map` and `manual_find_map` respectively. However, `filter_map` is left in place since it is used for a variety of other cases. See discussion in #6453.

rust-highfive assigned llogiq Dec 14, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Dec 14, 2020

camsteffen added 5 commits December 14, 2020 14:55

Fix comment

c3dfcee

Add heuristic to filter_map

dd7d72c

Add heuristic to find_map

eeffba7

Remove unneeded allow's

c9cde4f

Use new filter map heuristic in clippy

ddaebbd

camsteffen force-pushed the filter-map-heuristic branch from bc1af63 to ddaebbd Compare December 14, 2020 20:56

bbqbaron mentioned this pull request Dec 20, 2020

Lint: filter(Option::is_some).map(Option::unwrap) #6342

Merged

camsteffen mentioned this pull request Jan 15, 2021

manual_filter_map and manual_find_map #6591

Merged

camsteffen closed this Jan 15, 2021
