Skip to content

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Aug 6, 2020

Instead of running separate pylint checks for tests and main source
we are running a single check now. This is possible thanks to a
nice hack - we have pylint plugin that injects the right
"# pylint: disable=" comment for all test files while reading
the file content by astroid (just before tokenization)

Thanks to that we can also separate out pylint checks
to a separate job in CI - this way all pylint checks will
be run in parallel to all other checks effectively halving
the time needed to get the static check feedback and potentially
canceling other jobs much faster.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@potiuk potiuk force-pushed the speedup-pylint-tests branch 7 times, most recently from 6ec1b71 to 7f87ca6 Compare August 6, 2020 22:57
Instead of running separate pylint checks for tests and main source
we are running a single check now. This is possible thanks to a
nice hack - we have pylint plugin that injects the right
"# pylint: disable=" comment for all test files while reading
the file content by astroid (just before tokenization)

Thanks to that we can also separate out pylint checks
to a separate job in CI - this way all pylint checks will
be run in parallel to all other checks effectively halfing
the time needed to get the static check feedback and potentially
cancelling other jobs much faster.
@potiuk potiuk force-pushed the speedup-pylint-tests branch from 7f87ca6 to f402c40 Compare August 6, 2020 23:02
@codecov-commenter
Copy link

codecov-commenter commented Aug 6, 2020

Codecov Report

Merging #10207 into master will decrease coverage by 54.29%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #10207       +/-   ##
===========================================
- Coverage   89.41%   35.11%   -54.30%     
===========================================
  Files        1037     1037               
  Lines       50011    50013        +2     
===========================================
- Hits        44717    17562    -27155     
- Misses       5294    32451    +27157     
Flag Coverage Δ
#kubernetes-tests-3.6-9.6 ?
#kubernetes-tests-image-3.6-v1.16.9 ?
#kubernetes-tests-image-3.6-v1.17.5 ?
#kubernetes-tests-image-3.6-v1.18.6 ?
#kubernetes-tests-image-3.7-v1.16.9 ?
#kubernetes-tests-image-3.7-v1.17.5 ?
#kubernetes-tests-image-3.7-v1.18.6 ?
#mysql-tests-Core-3.7-5.7 ?
#mysql-tests-Core-3.8-5.7 ?
#mysql-tests-Integration-3.7-5.7 34.75% <100.00%> (+<0.01%) ⬆️
#mysql-tests-Integration-3.8-5.7 35.01% <100.00%> (+<0.01%) ⬆️
#postgres-tests-Core-3.6-10 ?
#postgres-tests-Core-3.6-9.6 ?
#postgres-tests-Core-3.7-10 ?
#postgres-tests-Core-3.7-9.6 ?
#postgres-tests-Integration-3.6-10 ?
#postgres-tests-Integration-3.6-9.6 34.73% <100.00%> (+<0.01%) ⬆️
#postgres-tests-Integration-3.7-10 ?
#postgres-tests-Integration-3.7-9.6 ?
#sqlite-tests-Core-3.6 ?
#sqlite-tests-Core-3.8 ?
#sqlite-tests-Integration-3.6 34.18% <100.00%> (+<0.01%) ⬆️
#sqlite-tests-Integration-3.8 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
airflow/jobs/scheduler_job.py 16.39% <100.00%> (-74.25%) ⬇️
airflow/hooks/S3_hook.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/pig_hook.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/hdfs_hook.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/http_hook.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/jdbc_hook.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/contrib/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/druid_hook.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/hive_hooks.py 0.00% <0.00%> (-100.00%) ⬇️
airflow/hooks/mssql_hook.py 0.00% <0.00%> (-100.00%) ⬇️
... and 906 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d79e722...2b57575. Read the comment docs.

mod.file_bytes = "\n".join(decoded_lines).encode("utf-8")


MANAGER.register_transform(scoped_nodes.Module, transform)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just wondering if we can first run pylint for main sources and then, just add additional disables to pylintrc? Not sure if this will be simpler

Copy link
Member Author

@potiuk potiuk Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly what we did before and that was a pain. It took much more time than this one.
This is a huge optimization, that's why I did it.

The problem is that when you run pylint for tests, it also pulls in all the tested code which means that basically a lot of the main airflow code was parsed and processed twice - once in the "pylint main" and once in the "pylint tests". This code was not verified by Pylint but it was parsed and analyzed so that the test code could be pylint-checked so it's not a full duplication, but I think the savings are significant

Some random stats:

Before the change:

pylint main: 6:20s
pylint tests: 3:59s

Tota: 10m 20s

After the change:
combined pylint: 8:20s

So we save 20% (2 minutes) of elapsed time by doing this.

@potiuk
Copy link
Member Author

potiuk commented Aug 7, 2020

I also added yet another small speedup - I realised that when we split pre-commits we should have two separate pre-comit virtualenv caches - that should give another 30-40 seconds boost overall.

@potiuk potiuk merged commit 9e3b7d9 into apache:master Aug 7, 2020
@potiuk potiuk deleted the speedup-pylint-tests branch August 7, 2020 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants