Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support V2 linters running on Python 2 and Python 3 targets at the same time #9870

Merged
merged 5 commits into from May 26, 2020

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented May 26, 2020

Problem

When using the default --no-per-target-caching, ./v2 lint :: will fail if you have any targets with conflicting Python interpreter constraints, e.g. Python 2-only targets mixed with Python 3-only targets. This is a common problem due to Python 3 migrations.

It is not acceptable to say "just use --per-target-caching" because we want ./pants lint :: to work regardless of this setting.

Solution

The main change is for linter rules to now return LintResults, rather than a single LintResult. Typically, this will only have one element, but it can include more, such as partitioning by interpreter constraints. In JVM land, we might partition by JVM compatibility, for example.

Then, Pylint, Bandit, and Flake8 are rewritten to have new rules that take a Partition of the original input targets and return a single LintResult. Another rule does the partitioning and aggregates these results.

Result

./v2 lint :: works on any Python codebase, regardless of interpreter constraints.

[ci skip-rust-tests]
[ci skip-jvm-tests]
…n one run

[ci skip-rust-tests]
[ci skip-jvm-tests]
# Delete this line to force CI to run Clippy and the Rust tests.
[ci skip-rust-tests]
# Delete this line to force CI to run the JVM tests.
[ci skip-jvm-tests]
# Delete this line to force CI to run Clippy and the Rust tests.
[ci skip-rust-tests]
# Delete this line to force CI to run the JVM tests.
[ci skip-jvm-tests]
Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies that this is large. I recommend going commit-by-commit.

# Delete this line to force CI to run Clippy and the Rust tests.
[ci skip-rust-tests]
# Delete this line to force CI to run the JVM tests.
[ci skip-jvm-tests]
@stuhood
Copy link
Sponsor Member

stuhood commented May 26, 2020

It is not acceptable to say "just use --per-target-caching" because the performance is so much worse on a cold cache.

Is this actually the case for the usecases we recommend? ie, --changed-parent=X? This is only one of many complexities added by the fmt/lint batching (as described in your doc), and we should be sure that it's worth continuing to support batching.

EDIT: And also whether it continues to be the case as we further optimize process execution via materialize_directory, etc.

@Eric-Arellano
Copy link
Contributor Author

Is this actually the case for the usecases we recommend? ie, --changed-parent=X? This is only one of many complexities added by the fmt/lint batching (as described in your doc), and we should be sure that it's worth continuing to support batching.

Likely, some cases are faster with --changed-since --per-target-caching, but others are not. It's hard to reason about and depends on how many things have changed and your target granularity. FYI https://docs.google.com/document/d/1Tdof6jx9aVaOGeIQeI9Gn-8x7nd6LfjnJ0QrbLFBbgc/edit#heading=h.oyst624c2sph has the benchmarks.

But, overall, we decided to default to --no-per-target-caching because in most cases it has better performance. For new users, we don't want them to have to deal with the special cases of when to use --per-target-caching and when to not. It exists as a possible optimization, but you don't need to worry about if you don't want to.

For that first-time user, we want ./pants lint :: to work regardless of whether they're using --per-target-caching or not. This PR means that --per-target-caching becomes solely about differences in performance, rather than also impacting interpreter conflicts.

We can replace the warning in https://pants.readme.io/v1.28/docs/python-lint-goal with a new green "Benefits of Pants" box about how you can run your linters on incompatible targets.

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks reasonable: thanks.

As discussed, I think that we should reconsider this default though. We would never recommend that someone should run ./pants lint ::, but if they did run it, more egregious than the first run taking a long time would be none of the subsequent runs doing any caching at all due to all of the files having been batched. People should rely on our caching behavior, and it's easily broken with batching. Even in the case of --changed-parent=master, folks are going to end up running these tools multiple times: those additional runs should mostly hit local caches.

@Eric-Arellano Eric-Arellano merged commit 0084854 into pantsbuild:master May 26, 2020
@Eric-Arellano Eric-Arellano deleted the lint-group-by-compat branch May 26, 2020 23:12
@benjyw
Copy link
Sponsor Contributor

benjyw commented May 27, 2020

I still think the solution to the caching issue is to cache successes (and only successes) on per-file basis, after running in a single batch. Ideally there would be an intrinsic to poke data into the CAS without running a process (although we could start out by just caching the result of true, and sucking up the extra process execution hit).

@benjyw
Copy link
Sponsor Contributor

benjyw commented May 27, 2020

This should be straightforward because we're not caching any data that need to be split out, we're just caching the fact "this file passed lint".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants