Support V2 linters running on Python 2 and Python 3 targets at the same time #9870

Eric-Arellano · 2020-05-26T20:18:01Z

Problem

When using the default --no-per-target-caching, ./v2 lint :: will fail if you have any targets with conflicting Python interpreter constraints, e.g. Python 2-only targets mixed with Python 3-only targets. This is a common problem due to Python 3 migrations.

It is not acceptable to say "just use --per-target-caching" because we want ./pants lint :: to work regardless of this setting.

Solution

The main change is for linter rules to now return LintResults, rather than a single LintResult. Typically, this will only have one element, but it can include more, such as partitioning by interpreter constraints. In JVM land, we might partition by JVM compatibility, for example.

Then, Pylint, Bandit, and Flake8 are rewritten to have new rules that take a Partition of the original input targets and return a single LintResult. Another rule does the partitioning and aggregates these results.

Result

./v2 lint :: works on any Python codebase, regardless of interpreter constraints.

[ci skip-rust-tests] [ci skip-jvm-tests]

…n one run [ci skip-rust-tests] [ci skip-jvm-tests]

# Delete this line to force CI to run Clippy and the Rust tests. [ci skip-rust-tests] # Delete this line to force CI to run the JVM tests. [ci skip-jvm-tests]

Eric-Arellano

Apologies that this is large. I recommend going commit-by-commit.

# Delete this line to force CI to run Clippy and the Rust tests. [ci skip-rust-tests] # Delete this line to force CI to run the JVM tests. [ci skip-jvm-tests]

stuhood · 2020-05-26T20:25:12Z

It is not acceptable to say "just use --per-target-caching" because the performance is so much worse on a cold cache.

Is this actually the case for the usecases we recommend? ie, --changed-parent=X? This is only one of many complexities added by the fmt/lint batching (as described in your doc), and we should be sure that it's worth continuing to support batching.

EDIT: And also whether it continues to be the case as we further optimize process execution via materialize_directory, etc.

Eric-Arellano · 2020-05-26T20:38:51Z

Is this actually the case for the usecases we recommend? ie, --changed-parent=X? This is only one of many complexities added by the fmt/lint batching (as described in your doc), and we should be sure that it's worth continuing to support batching.

Likely, some cases are faster with --changed-since --per-target-caching, but others are not. It's hard to reason about and depends on how many things have changed and your target granularity. FYI https://docs.google.com/document/d/1Tdof6jx9aVaOGeIQeI9Gn-8x7nd6LfjnJ0QrbLFBbgc/edit#heading=h.oyst624c2sph has the benchmarks.

But, overall, we decided to default to --no-per-target-caching because in most cases it has better performance. For new users, we don't want them to have to deal with the special cases of when to use --per-target-caching and when to not. It exists as a possible optimization, but you don't need to worry about if you don't want to.

For that first-time user, we want ./pants lint :: to work regardless of whether they're using --per-target-caching or not. This PR means that --per-target-caching becomes solely about differences in performance, rather than also impacting interpreter conflicts.

We can replace the warning in https://pants.readme.io/v1.28/docs/python-lint-goal with a new green "Benefits of Pants" box about how you can run your linters on incompatible targets.

stuhood

Change looks reasonable: thanks.

As discussed, I think that we should reconsider this default though. We would never recommend that someone should run ./pants lint ::, but if they did run it, more egregious than the first run taking a long time would be none of the subsequent runs doing any caching at all due to all of the files having been batched. People should rely on our caching behavior, and it's easily broken with batching. Even in the case of --changed-parent=master, folks are going to end up running these tools multiple times: those additional runs should mostly hit local caches.

benjyw · 2020-05-27T00:13:26Z

I still think the solution to the caching issue is to cache successes (and only successes) on per-file basis, after running in a single batch. Ideally there would be an intrinsic to poke data into the CAS without running a process (although we could start out by just caching the result of true, and sucking up the extra process execution hit).

benjyw · 2020-05-27T00:13:57Z

This should be straightforward because we're not caching any data that need to be split out, we're just caching the fact "this file passed lint".

Eric-Arellano added 4 commits May 26, 2020 12:00

Linters now return LintResults (plural)

39ab5e5

[ci skip-rust-tests] [ci skip-jvm-tests]

Support V2 Flake8 and Bandit running on incompatible Python targets i…

f72fb06

…n one run [ci skip-rust-tests] [ci skip-jvm-tests]

Add support for Pylint

16ea905

# Delete this line to force CI to run Clippy and the Rust tests. [ci skip-rust-tests] # Delete this line to force CI to run the JVM tests. [ci skip-jvm-tests]

Update help message for --per-target-caching

9ec9129

# Delete this line to force CI to run Clippy and the Rust tests. [ci skip-rust-tests] # Delete this line to force CI to run the JVM tests. [ci skip-jvm-tests]

Eric-Arellano requested review from stuhood and benjyw May 26, 2020 20:18

Eric-Arellano commented May 26, 2020

View reviewed changes

Add comment explaining why we batch with Pylint

f71a61d

# Delete this line to force CI to run Clippy and the Rust tests. [ci skip-rust-tests] # Delete this line to force CI to run the JVM tests. [ci skip-jvm-tests]

Eric-Arellano added the release:features label May 26, 2020

stuhood approved these changes May 26, 2020

View reviewed changes

Eric-Arellano merged commit 0084854 into pantsbuild:master May 26, 2020

Eric-Arellano deleted the lint-group-by-compat branch May 26, 2020 23:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support V2 linters running on Python 2 and Python 3 targets at the same time #9870

Support V2 linters running on Python 2 and Python 3 targets at the same time #9870

Eric-Arellano commented May 26, 2020 •

edited

Eric-Arellano left a comment

stuhood commented May 26, 2020 •

edited

Eric-Arellano commented May 26, 2020

stuhood left a comment •

edited

benjyw commented May 27, 2020

benjyw commented May 27, 2020

Support V2 linters running on Python 2 and Python 3 targets at the same time #9870

Support V2 linters running on Python 2 and Python 3 targets at the same time #9870

Conversation

Eric-Arellano commented May 26, 2020 • edited

Problem

Solution

Result

Eric-Arellano left a comment

Choose a reason for hiding this comment

stuhood commented May 26, 2020 • edited

Eric-Arellano commented May 26, 2020

stuhood left a comment • edited

Choose a reason for hiding this comment

benjyw commented May 27, 2020

benjyw commented May 27, 2020

Eric-Arellano commented May 26, 2020 •

edited

stuhood commented May 26, 2020 •

edited

stuhood left a comment •

edited