Add support for parallel execution (autopep8's `--jobs` opt) #107
Signed-off-by: Giampaolo Rodola <g.rodola@gmail.com>
Update: I set
Since I'm not sure whether or when this PR will be merged, for whoever wants to use this feature, this is how you can install this very PR using pip:
...by pip-installing a PR I provided for the autoflake package, which adds a --jobs option to the tool; see PyCQA/autoflake#107. Signed-off-by: Giampaolo Rodola <g.rodola@gmail.com>
Hello @giampaolo! Thanks for this PR; this will be very useful for big projects! Using macOS 12.5 (21G72) and Python 3.9.12, when I add unused imports to several files in my folder and then do the following:

I can see that it detects the unused imports, but then it hangs forever. I have to force-quit it, and then it prints the following:

When there are no issues in the folder, autoflake takes ~8s instead of ~36s, so that's very promising! 😄
I think this is a good idea, but I have one question about turning the args into a dict. Also, can you update your branch with the most recent changes in the master branch?
Thanks for contributing!
Hello!
I'm trying but 2645f85 messed things up for this PR. The problem with 2645f85 is that
Correct.
Gotcha, we can reorganize the code. It doesn't make sense to support stdout/stdin with parallel execution anyway, so when the user passes stdin as an input we could force serial execution? And then restructure the code to ensure that we can support both paths? Like, how does autopep8 handle stdin/stdout?
Done and pushed. Please note that tests are green:
...but I did not add any test for the new code path (I only tested this manually).
Use `os.cpu_count()` instead of `multiprocessing.cpu_count()`: the latter may raise `NotImplementedError`.
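To illustrate that suggestion (a minimal sketch, not the patch's actual code — `default_jobs` is a hypothetical helper name): `os.cpu_count()` returns `None` rather than raising when the CPU count cannot be determined, so a safe default needs only an `or` fallback.

```python
import os


def default_jobs() -> int:
    # os.cpu_count() returns None (instead of raising, as
    # multiprocessing.cpu_count() may) when the CPU count cannot be
    # determined, so fall back to a single worker in that case.
    return os.cpu_count() or 1
```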
The end-to-end tests should exercise it, since the default behavior is to use all available CPUs. Can you fix the pre-commit violation? Running
Done.
I put a pdb breakpoint in autoflake.py and the multiprocessing part is not exercised. Note that the logic is:

```python
if args["jobs"] == 1 or len(files) == 1 or "-" in files or standard_out is not None:
    # serial code
else:
    # parallel code
```
Oh interesting. Does it not get exercised with test_fuzz.py either?
Mmm, no. If I read the test_fuzz.py code right, it passes one file at a time. Instead, autoflake should be invoked as:

That sort of invocation will trigger parallel execution.
That's fair enough. We can add something later that will do it.
Hello. Today I discovered this project, and I immediately integrated it into my own projects. Unlike the autopep8 tool, I noticed it does not support the `--jobs` CLI option, so I decided to submit this PR. I ran this patched version of `autoflake` against the `psutil` code base, and it resulted in more than a 2x speedup (my laptop has 8 logical cores).

Standard:

Using the `--jobs` opt:

About the patch: I had to turn `argparse.Namespace` into a dict because the multiprocessing module is not able to serialize it.

EDIT: fixed it.

Unfortunately `test_autoflake.py` reports 4 failures that I'm not sure how to fix. Hope this helps, and thanks for this great tool.
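As a standalone illustration of the Namespace-to-dict conversion mentioned above (a sketch under my own assumptions, not the PR's actual code): `vars()` exposes a parsed `argparse.Namespace` as a plain dict, and a dict of simple values pickles cleanly, which is what `multiprocessing` requires for task arguments.

```python
import argparse
import pickle

parser = argparse.ArgumentParser(prog="autoflake")
parser.add_argument("--jobs", type=int, default=1)
args = parser.parse_args(["--jobs", "4"])

# vars() returns the Namespace's attributes as a plain dict; with only
# simple values inside, it survives the pickle round trip that
# multiprocessing performs when shipping arguments to workers.
args_dict = vars(args)
restored = pickle.loads(pickle.dumps(args_dict))
assert restored == {"jobs": 4}
```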