Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving the coverage collection speed #9

Closed
jvoisin opened this issue Dec 17, 2019 · 10 comments
Closed

Improving the coverage collection speed #9

jvoisin opened this issue Dec 17, 2019 · 10 comments

Comments

@jvoisin
Copy link
Contributor

@jvoisin jvoisin commented Dec 17, 2019

I did some quick bench-marking of the fuzzing process, and it seems that there are some easy ways to gain some precious exec/s:

This is the baseline:
without

This is the one with a stupid optimization in the profiler:
with

I simply added @functools.lru_cache(maxsize=1024, typed=False) on top of the abs_file function, and got a significant performance boost. Do we want to monkey-patch this, or is it worth trying to upstream it? In our case, it's ok to do some caching, since nothing but the input changes between runs, it might be different in the general usecase of coverage.py, I don't know.

@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Dec 17, 2019

It seems that coverage.py isn't optimized for speed at all, so it might be interesting to discuss a bit with its upstream, since there are a lot of low-hanging fruits there.

@yevgenypats
Copy link
Contributor

@yevgenypats yevgenypats commented Dec 18, 2019

Yes I'm not really familiar with coverage.py internal I think it would be better to ask the team there so maybe they can explain the reason for some of the "un-optimizations"

@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Dec 18, 2019

I guess you're fixing the version of coverage.py to the latest one supporting python2?
I don't think that the coverage.py people are still maintain it, nor willig to backport things there :/

@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Dec 18, 2019

The latest version of coverage,py has dreadful performances :/

new

@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Dec 20, 2019

Where do you want to go from here? I think that we can either:

  • Write our own coverage collector
  • Monkey-patch the fixed-version of coverage.py
  • Fork the fixed-version of coverage.py and put our improvements there
  • Use the latest version of coverage.py, accept the performance impact, and try to optimize it
@yevgenypats
Copy link
Contributor

@yevgenypats yevgenypats commented Dec 20, 2019

I would lean towards option 2 or 4. as 1 seems like a lot of work+duplicate work. 3 would make sense if we will upstream those fixes eventually but I would prefer to talk to the coverage.py maintainer first before opening the PR.

so to sum-up I would say that a monkey patch sounds like the quickest solution. what do you say?

jvoisin added a commit to jvoisin/pythonfuzz that referenced this issue Dec 20, 2019
Since we're using an old fixed version of coverage.py, we can monkey-patch it
to significantly increase its performances. This commit adds memoization around
a syscall-intensive function, giving around +50% in performances on my
benchmark (fuzzitdev#9).
@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Dec 20, 2019

The newest versions of coverage.py aren't compatible with python2 anymore, do we care about this?

I'm fine with doing some monkey-patching :)

Since the new versions of coverage.py are using a completely different backend (sqlite), most of our improvements wouldn't be really mergeable :/

@yevgenypats
Copy link
Contributor

@yevgenypats yevgenypats commented Dec 20, 2019

@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Dec 23, 2019

Well, it might still be useful to fuzz old python libraries: people won't stop using them because python2 is deprecated, unfortunately :/

@jvoisin
Copy link
Contributor Author

@jvoisin jvoisin commented Jun 10, 2020

Done in e438c4c

@jvoisin jvoisin closed this Jun 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.