Very slow due to flake8 #1

Closed
asottile opened this issue Dec 21, 2017 · 15 comments

@asottile
Owner

flake8 doesn't support any in-process interface so this is currently subprocessing to run flake8.

In the current scheme, it runs one process per filename. This could be improved to batch up files sent to flake8 (which will internally use multiprocessing for some speed).
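A minimal sketch of the batching idea described above, not yesqa's actual implementation: invoke flake8 once with every filename, then group the reported lines by file. It assumes flake8's default output format (`path:row:col: CODE text`); the function names and the injectable `runner` parameter are hypothetical, added only to make the sketch testable.

```python
import subprocess
import sys
from collections import defaultdict


def parse_default_output(output):
    """Group flake8's default `path:row:col: CODE text` lines by path."""
    by_file = defaultdict(list)
    for line in output.splitlines():
        parts = line.split(':', 3)
        if len(parts) != 4:
            continue  # skip anything that is not a violation line
        path, row, col, rest = parts
        code, _, text = rest.strip().partition(' ')
        by_file[path].append((int(row), int(col), code, text))
    return dict(by_file)


def run_flake8_batched(filenames, runner=subprocess.run):
    """Run flake8 once for all filenames instead of once per file."""
    cmd = [sys.executable, '-m', 'flake8', *filenames]
    proc = runner(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    return parse_default_output(proc.stdout)
```

Splitting on `:` would misparse paths that themselves contain a colon (for example Windows drive letters), and a very long file list may need to be split into chunks to stay under the operating system's argument-length limit.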

@sigmavirus24

So this is an accepted feature request meaning we just need to get 'round to implementing it on flake8. That said, I've been toying with a catch-all collection of flake8 report formatters in PyCQA and I feel like you could more easily do this with a more structured format, e.g., flake8 --format=json {bunch,of,files} and then just parse that output as json instead of individual invocations. I think I have a flake8-json reporter on my laptop somewhere.

@asottile
Owner Author

Yeah that feature request mostly obsoletes this tool (though autofixing is nicer than not imo).

I've still been reaching for an in-process way to do this (and an in-process way to lint source (not files)) with flake8 for a while but sadly there's been nothing of the sort since v3.

@sigmavirus24

Right, so this is more of a "We need to collaboratively design an API" than "This is slow" issue which brings us back to https://gitlab.com/pycqa/flake8/issues/208 and so far you're the only person who has weighed in on that.

Would an idealized API look something like this:

from flake8 import api

config = api.parse_configuration_files()
flake8 = api.checker_from(config)
for violation in flake8.check(filename=filename, source=source):
    # Do things

@sigmavirus24

Also https://gitlab.com/pycqa/flake8-json exists now and is pip installable if you want to swap to JSON in the short-term
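For the JSON route suggested above, parsing could look roughly like the sketch below. The document shape is an assumption (a mapping of filename to a list of violation objects with `code`, `line_number`, and `column_number` keys), so check the reporter's real output before relying on it; `violations_from_json` is a hypothetical helper name.

```python
import json


def violations_from_json(output):
    """Flatten an assumed {filename: [violation, ...]} report into tuples."""
    report = json.loads(output)
    results = []
    for filename, violations in sorted(report.items()):
        for violation in violations:
            results.append((
                filename,
                violation['line_number'],    # assumed key name
                violation['column_number'],  # assumed key name
                violation['code'],           # assumed key name
            ))
    return results
```

The input would be the stdout of a single `flake8 --format=json` run over many files, which avoids both the per-file subprocess cost and ad hoc text parsing.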

@asottile
Owner Author

asottile commented Jan 1, 2018

That sketch of an api looks like what I would want :)

There is also quite a lot of overhead coming from ~somewhere as well:

$ time flake8 /dev/null

real	0m0.387s
user	0m0.200s
sys	0m0.032s

but I imagine that to be more difficult to fix :S
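To separate flake8's own startup cost from plain interpreter startup, one rough approach is to time both commands the same way and compare. This is only a sketch: `time_command` is a hypothetical helper, and taking the best of a few runs is one arbitrary way to reduce noise.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=5):
    """Return the fastest wall-clock time (seconds) over several runs."""
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


# Baseline: how long does the interpreter alone take to start and exit?
baseline = time_command([sys.executable, '-c', 'pass'])

# With flake8 installed, compare against (assumes a POSIX /dev/null):
# flake8_time = time_command([sys.executable, '-m', 'flake8', '/dev/null'])
```

The difference between the two numbers approximates the fixed overhead paid on every flake8 invocation, which is what makes one-process-per-file so costly.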

@sigmavirus24

I expect that is pkg_resources.iter_entry_points and may be resolved if https://gitlab.com/pycqa/flake8/issues/390 ever happens

@sigmavirus24

To be clear, pkg_resources scans your entire site-packages directory (every package installed) just to find entry-points for Flake8 in this case. It's a known not ideal way of extending the project, but until recently was the best way to do so.
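To see what such an entry-point lookup involves, the sketch below lists and times the entry points registered under the `flake8.extension` group. Note the swap: it uses the standard library's `importlib.metadata` rather than `pkg_resources`, so it does not measure the same code path discussed here; it only illustrates that discovery walks the metadata of every installed distribution. `scan_entry_points` is a hypothetical helper name.

```python
import time
from importlib import metadata


def scan_entry_points(group):
    """Return the sorted names of entry points registered under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, 'select'):
        # Python 3.10+: selectable entry points
        found = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: plain mapping of group name -> entry points
        found = eps.get(group, [])
    return sorted(ep.name for ep in found)


start = time.perf_counter()
plugin_names = scan_entry_points('flake8.extension')
elapsed = time.perf_counter() - start
print(f'{len(plugin_names)} entry points found in {elapsed:.3f}s')
```

The scan time generally grows with the number of installed distributions, not with the number of flake8 plugins, so a small virtualenv should show little overhead.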

@asottile
Owner Author

asottile commented Jan 1, 2018

I generated a profile to confirm and I'm actually not seeing pkg_resources on this specific run: https://i.fluffy.cc/XNqRc2GXD37mW7dw7dFB8XFwBrWdWFT7.svg

I also don't expect pkg_resources to contribute that much in this instance -- there's very little installed in this virtualenv (yes I'm familiar with the sadness that is the import-side-effects of the pkg_resources module and the building of working_set)

@sigmavirus24

If I'm reading that correctly it's mostly in flake8/main/application.py and flake8/main/options.py? That's really weird given that options.py is just https://github.com/PyCQA/flake8/blob/f8344997267b8ca87a96c690a3515a443005b653/src/flake8/main/options.py

@asottile
Owner Author

asottile commented Jan 1, 2018

yeah strange indeed -- I'll keep poking with pstats and see if there's something more glaringly obvious. This is a reasonably fast computer and on an SSD so I don't expect importing a file with no side-effects to be that expensive... (should just be unmarshalling pyc files after all!).
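One way to do that kind of poking is to profile the import itself with `cProfile` and summarize it with `pstats`. This is a generic sketch, not the exact commands used in this thread; `profile_import` is a hypothetical helper.

```python
import cProfile
import importlib
import io
import pstats


def profile_import(module_name, top=10):
    """Profile importing a module; return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    importlib.import_module(module_name)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumulative').print_stats(top)
    return stream.getvalue()


# Example: print(profile_import('flake8.main.application'))
```

A module that is already in `sys.modules` will appear nearly free, so this is only meaningful in a fresh interpreter.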

@sigmavirus24

> Yeah that feature request mostly obsoletes this tool (though autofixing is nicer than not imo).

Btw, I don't think that feature obsoletes this tool, I think it enables this tool. If Flake8 starts reporting lines that have noqa but no error, then that makes autofixing this tool's primary purpose and makes things much much better

@mxr
Sponsor Contributor

mxr commented Jun 20, 2018

Could yesqa by default only run on py files that contain # noqa? Instead of all py files?
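The prefilter proposed here could be sketched as follows. The pattern is a deliberately loose assumption (any `#`, optional whitespace, then `noqa`, case-insensitive) and is not taken from yesqa or flake8; `files_with_noqa` is a hypothetical helper name.

```python
import re

# Loose, assumed pattern: may match more than flake8 itself would honor.
NOQA_RE = re.compile(rb'#\s*noqa', re.IGNORECASE)


def files_with_noqa(filenames):
    """Return only the files whose bytes contain something like `# noqa`."""
    selected = []
    for filename in filenames:
        with open(filename, 'rb') as f:
            if NOQA_RE.search(f.read()):
                selected.append(filename)
    return selected
```

Reading each file as bytes is cheap compared with starting a flake8 process, so files without any `noqa` comment can be skipped before flake8 is ever invoked.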

@asottile
Owner Author

That would probably alleviate some of the problems, but isn't a full solution for the slowness

@asottile
Owner Author

This PR improved yesqa's performance by as much as 50% when I played with it: https://gitlab.com/pycqa/flake8/merge_requests/305 -- should be available when I release flake8 3.7.6

@asottile
Owner Author

asottile commented May 8, 2019

probably about as fast as we're going to get now -- I think this is sufficient to say "fixed"!


3 participants