Pylint checks should be way faster now #10207

potiuk · 2020-08-06T21:26:52Z

Instead of running separate pylint checks for tests and main source
we are running a single check now. This is possible thanks to a
nice hack - we have a pylint plugin that injects the right
"# pylint: disable=" comment into all test files while astroid
reads the file content (just before tokenization).

Thanks to that we can also move the pylint checks
to a separate job in CI - this way all pylint checks
run in parallel with all other checks, effectively halving
the time needed to get static check feedback and potentially
canceling other jobs much faster.
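
A minimal sketch of the injection idea described above, written as a pure function so it can be read and run without astroid. It is not the plugin from this PR: the list of disabled checks, the test-module naming rule and the function names are all assumptions made for illustration. In the real plugin the rewritten bytes are assigned to `mod.file_bytes` inside a transform registered with `MANAGER.register_transform(scoped_nodes.Module, transform)`, as the diff excerpt in this conversation shows.

```python
from typing import Sequence

# Hypothetical list of checks to silence in tests; the PR text does not list them.
DISABLED_CHECKS_FOR_TESTS: Sequence[str] = ("missing-docstring", "no-self-use")


def is_test_module(module_name: str) -> bool:
    """Assumed convention: modules under ``tests.`` or named ``test_*`` are tests."""
    return module_name.startswith("tests.") or module_name.split(".")[-1].startswith("test_")


def inject_disable_comment(module_name: str, file_bytes: bytes) -> bytes:
    """Return the source with a file-level pylint disable comment for test modules."""
    if not is_test_module(module_name):
        return file_bytes
    lines = file_bytes.decode("utf-8").split("\n")
    disabled = ",".join(DISABLED_CHECKS_FOR_TESTS)
    first = lines[0]
    if first.startswith("# pylint: disable="):
        # Extend a disable comment that is already on the first line.
        lines[0] = first + "," + disabled
    elif first.strip() in ("", "#"):
        # Reuse an empty first line so reported line numbers do not shift.
        lines[0] = "# pylint: disable=" + disabled
    else:
        # First line carries content: prepend (line numbers shift by one).
        lines.insert(0, "# pylint: disable=" + disabled)
    return "\n".join(lines).encode("utf-8")
```

Reusing an empty first line (for example the bare `#` that opens a license header) keeps pylint's reported line numbers aligned with the file on disk, which is the main reason to prefer it over always prepending.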

codecov-commenter · 2020-08-06T23:16:39Z

Codecov Report

Merging #10207 into master will decrease coverage by 54.29%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           master   #10207       +/-   ##
===========================================
- Coverage   89.41%   35.11%   -54.30%     
===========================================
  Files        1037     1037               
  Lines       50011    50013        +2     
===========================================
- Hits        44717    17562    -27155     
- Misses       5294    32451    +27157

Flag	Coverage Δ
#kubernetes-tests-3.6-9.6	`?`
#kubernetes-tests-image-3.6-v1.16.9	`?`
#kubernetes-tests-image-3.6-v1.17.5	`?`
#kubernetes-tests-image-3.6-v1.18.6	`?`
#kubernetes-tests-image-3.7-v1.16.9	`?`
#kubernetes-tests-image-3.7-v1.17.5	`?`
#kubernetes-tests-image-3.7-v1.18.6	`?`
#mysql-tests-Core-3.7-5.7	`?`
#mysql-tests-Core-3.8-5.7	`?`
#mysql-tests-Integration-3.7-5.7	`34.75% <100.00%> (+<0.01%)`	⬆️
#mysql-tests-Integration-3.8-5.7	`35.01% <100.00%> (+<0.01%)`	⬆️
#postgres-tests-Core-3.6-10	`?`
#postgres-tests-Core-3.6-9.6	`?`
#postgres-tests-Core-3.7-10	`?`
#postgres-tests-Core-3.7-9.6	`?`
#postgres-tests-Integration-3.6-10	`?`
#postgres-tests-Integration-3.6-9.6	`34.73% <100.00%> (+<0.01%)`	⬆️
#postgres-tests-Integration-3.7-10	`?`
#postgres-tests-Integration-3.7-9.6	`?`
#sqlite-tests-Core-3.6	`?`
#sqlite-tests-Core-3.8	`?`
#sqlite-tests-Integration-3.6	`34.18% <100.00%> (+<0.01%)`	⬆️
#sqlite-tests-Integration-3.8	`?`

Impacted Files	Coverage Δ
airflow/jobs/scheduler_job.py	`16.39% <100.00%> (-74.25%)`	⬇️
airflow/hooks/S3_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/pig_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/hdfs_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/http_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/jdbc_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/contrib/__init__.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/druid_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/hive_hooks.py	`0.00% <0.00%> (-100.00%)`	⬇️
airflow/hooks/mssql_hook.py	`0.00% <0.00%> (-100.00%)`	⬇️
... and 906 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d79e722...2b57575.

turbaszek · 2020-08-07T05:22:19Z

tests/airflow_pylint/disable_checks_for_tests.py

+        mod.file_bytes = "\n".join(decoded_lines).encode("utf-8")
+
+
+MANAGER.register_transform(scoped_nodes.Module, transform)


I'm just wondering if we can first run pylint for main sources and then just add additional disables to pylintrc? Not sure if this would be simpler.

potiuk · 2020-08-07

This is exactly what we did before and that was a pain. It took much more time than this one.
This is a huge optimization, that's why I did it.

The problem is that when you run pylint for tests, it also pulls in all the tested code, which means that a lot of the main airflow code was parsed and processed twice - once in "pylint main" and once in "pylint tests". In the tests run this code was not verified by Pylint, only parsed and analyzed so that the test code could be checked, so it's not a full duplication, but I think the savings are significant.

Some random stats:

Before the change:

pylint main: 6m 20s
pylint tests: 3m 59s

Total: about 10m 20s

After the change:

combined pylint: 8m 20s

So we save about 20% (2 minutes) of elapsed time by doing this.
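
The figures quoted above can be checked with a few lines of Python. This is only the arithmetic on the three timings given in this comment; the helper name `to_seconds` is made up for the example.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


# Timings quoted in the comment above.
before = to_seconds(6, 20) + to_seconds(3, 59)  # pylint main + pylint tests
after = to_seconds(8, 20)                       # combined pylint
saved = before - after
saved_pct = 100 * saved / before

print(f"before={before}s after={after}s saved={saved}s ({saved_pct:.1f}%)")
# prints: before=619s after=500s saved=119s (19.2%)
```

So the rounded "2 minutes, 20%" is consistent with the raw numbers.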

potiuk · 2020-08-07T08:03:13Z

I also added yet another small speedup - I realised that when we split pre-commits we should have two separate pre-commit virtualenv caches - that should give another 30-40 seconds boost overall.

boring-cyborg bot added the area:dev-tools label Aug 6, 2020

potiuk requested review from kaxil, feluelle, BasPH, mik-laj and turbaszek August 6, 2020 21:26

potiuk force-pushed the speedup-pylint-tests branch 7 times, most recently from 6ec1b71 to 7f87ca6 Compare August 6, 2020 22:57

potiuk force-pushed the speedup-pylint-tests branch from 7f87ca6 to f402c40 Compare August 6, 2020 23:02

turbaszek reviewed Aug 7, 2020

tests/dags/test_logging_in_dag.py (resolved)

turbaszek reviewed Aug 7, 2020

fixup! Pylint checks should be way faster now

2b57575

turbaszek approved these changes Aug 7, 2020

potiuk merged commit 9e3b7d9 into apache:master Aug 7, 2020

potiuk deleted the speedup-pylint-tests branch August 7, 2020 09:07
