pylint parallel processing not effective with single package argument #479

Closed
pylint-bot opened this issue Feb 24, 2015 · 5 comments
@pylint-bot

Originally reported by: Pavel Roskin (BitBucket: pavel_roskin)


pylint --jobs=N does not benefit a user who supplies a package as the argument, even if the package contains multiple Python source files.

Example:

$ git clone https://github.com/wbsoft/frescobaldi.git
$ cd frescobaldi
$ time find frescobaldi_app -name '*.py' | xargs pylint >/dev/null
real    0m43.313s
user    0m42.760s
sys     0m0.588s
$ time find frescobaldi_app -name '*.py' | xargs pylint --jobs=8 >/dev/null
real    0m13.574s
user    1m40.511s
sys     0m1.674s
$ time pylint frescobaldi_app >/dev/null
real    0m43.235s
user    0m42.795s
sys     0m0.472s
$ time pylint --jobs=8 frescobaldi_app >/dev/null
real    0m43.382s
user    0m42.834s
sys     0m0.614s

Only the combination of --jobs=8 and individually listed Python source files cuts the runtime, from about 43 to about 14 seconds. When the package is supplied on the command line, --jobs=8 gives no speedup.

I'm using an Intel i7 CPU with 4 cores / 8 threads of execution.
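
As a rough reading of the timings above, the wall-clock speedup and the per-thread efficiency can be worked out directly. This is only arithmetic on the reported numbers; the variable names are illustrative.

```python
# Wall-clock ("real") and CPU ("user") times reported above, in seconds.
serial_real = 43.313          # find ... | xargs pylint
parallel_real = 13.574        # find ... | xargs pylint --jobs=8
serial_user = 42.760
parallel_user = 60 + 40.511   # 1m40.511s

threads = 8  # hardware threads on the reporter's i7

speedup = serial_real / parallel_real
efficiency = speedup / threads
cpu_overhead = parallel_user / serial_user

print(f"wall-clock speedup: {speedup:.2f}x")
print(f"efficiency over {threads} threads: {efficiency:.0%}")
print(f"total CPU time grew by a factor of {cpu_overhead:.2f}")
```

The speedup is a little over 3x while total CPU time more than doubles, so even the working case is well short of linear scaling on 8 threads.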


@pylint-bot

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Improve the performance of --jobs when dealing only with a package name.

The performance is improved by obtaining the files which should be analyzed
from the list of given modules, using PyLinter.expand_modules. This is already
what PyLinter._do_check does, but the behaviour was missing from PyLinter._parallel_check.
Closes issue #479.
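
The idea in the commit message can be sketched as follows. This is a minimal illustration, not pylint's code: `expand_to_files` and `split_among_jobs` are hypothetical helpers standing in for what the message attributes to PyLinter.expand_modules and PyLinter._parallel_check, and the round-robin split is an assumption made for the example.

```python
import os


def expand_to_files(arguments):
    """Expand each command-line argument into the .py files it stands for.

    Hypothetical stand-in for the module expansion step; directories are
    walked recursively, anything else is passed through unchanged.
    """
    files = []
    for arg in arguments:
        if os.path.isdir(arg):
            for root, _dirs, names in os.walk(arg):
                files.extend(
                    os.path.join(root, name)
                    for name in names
                    if name.endswith(".py")
                )
        else:
            files.append(arg)
    return sorted(files)


def split_among_jobs(work_items, jobs):
    """Deal work items round-robin into at most `jobs` non-empty batches."""
    batches = [work_items[i::jobs] for i in range(jobs)]
    return [batch for batch in batches if batch]
```

With a single package argument, `split_among_jobs(["frescobaldi_app"], 8)` yields one batch, so presumably only one worker has anything to do. Calling `split_among_jobs(expand_to_files(["frescobaldi_app"]), 8)` instead yields up to eight batches of files, which is the situation the `find ... | xargs` invocation created by hand.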

@pylint-bot

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Should be fixed now. Please tell me if you encounter any problems with it.

@pylint-bot

Original comment by Pavel Roskin (BitBucket: pavel_roskin):


I tested that on Frescobaldi. The speed has greatly improved. But there are differences in the output. The output with --jobs=8 has this line:

E: 58,26: Module '__builtin__' has no '_' member (no-member)

And that's despite the fact that I have this in ~/.pylintrc:

[VARIABLES]
additional-builtins=_

The "External dependencies" section is missing some dependencies.

"nb duplicated lines" was 299, became 0.

The whole section called "% errors / warnings by module" is missing.

Most numbers in the "Messages" section are significantly lower. For example, trailing-whitespace was 5118, became 65.

Many of these problems occur when the Python files are specified on the command line, so perhaps they should be discussed in a separate issue.

@pylint-bot

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore):


Interesting. So the problem occurs when you specify the files on the command line, not the package? Also, please open another issue to discuss this. Thanks.

@pylint-bot

Original comment by Pavel Roskin (BitBucket: pavel_roskin):


Done. See #501 and #502. It turns out that in the case of no-member, --jobs fixed the behavior. Pylint did not see the file that guaranteed that __builtin__._ would be defined before it was used, so it was right to complain. additional-builtins doesn't affect explicit references to __builtin__, which is probably correct.
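
The distinction in the comment above can be illustrated in a few lines. This is a sketch for Python 3, where the module is named `builtins` rather than `__builtin__`; using `gettext.NullTranslations().install()` to define `_` is an assumption made for the example, not necessarily how Frescobaldi does it.

```python
import builtins
import gettext

# Binds `_` in the builtins namespace at runtime. A checker that analyses
# another file in isolation has no way of knowing this call ever happened.
gettext.NullTranslations().install()


def greet_bare():
    # Bare name: the kind of reference that additional-builtins=_ declares.
    return _("hello")


def greet_explicit():
    # Explicit attribute access on the builtins module: per the comment
    # above, additional-builtins does not affect this kind of reference.
    return builtins._("hello")
```

Both functions work at runtime once `install()` has run. The difference is only in what a static checker can know when it looks at the file containing `greet_explicit` without the file that performs the installation.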
