Improving the coverage collection speed #9
|
It seems that coverage.py isn't optimized for speed at all, so it might be interesting to discuss this with its upstream, since there is a lot of low-hanging fruit there. |
|
Yes, I'm not really familiar with coverage.py's internals. I think it would be better to ask the team there, so maybe they can explain the reason for some of the "un-optimizations". |
|
I guess you're pinning coverage.py to the latest version that still supports python2? |
|
Where do you want to go from here? I think that we can either:
|
|
I would lean towards option 2 or 4, as 1 seems like a lot of work plus duplicated work. 3 would make sense if we upstream those fixes eventually, but I would prefer to talk to the coverage.py maintainer first before opening the PR. So, to sum up, I would say that a monkey-patch sounds like the quickest solution. What do you say? |
Since we're using an old, pinned version of coverage.py, we can monkey-patch it to significantly increase its performance. This commit adds memoization around a syscall-intensive function, giving around +50% in performance on my benchmark (fuzzitdev#9).
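A monkey-patch along those lines could look like the sketch below. The `covfiles` module here is a hypothetical stand-in built for illustration; the real patch would rebind the corresponding function on the pinned coverage.py module in the same way, before any tracing starts:

```python
import functools
import types

# Hypothetical stand-in for the coverage.py module being patched; it only
# exists so the sketch is self-contained and runnable.
covfiles = types.ModuleType("covfiles")

def _original(path):
    # Placeholder for the syscall-heavy original implementation.
    return path

covfiles.abs_file = _original

# The monkey-patch itself: rebind the attribute with a memoized wrapper,
# leaving the pinned package untouched on disk.
covfiles.abs_file = functools.lru_cache(maxsize=1024, typed=False)(covfiles.abs_file)
```

Because the patch only rebinds a module attribute at import time, it can live entirely in our own code and is trivial to drop once (or if) an equivalent fix lands upstream.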
|
The newest versions of coverage.py aren't compatible with python2 anymore; do we care about this? I'm fine with doing some monkey-patching :) Since the new versions of coverage.py are using a completely different backend (sqlite), most of our improvements wouldn't really be mergeable :/ |
|
I don't have a particular problem with not supporting python2, as it's officially not supported anymore.
|
|
Well, it might still be useful to fuzz old python libraries: people won't stop using them just because python2 is deprecated, unfortunately :/ |
|
Done in e438c4c |

I did some quick benchmarking of the fuzzing process, and it seems that there are some easy ways to gain some precious exec/s:
This is the baseline:

This is the one with a stupid optimization in the profiler:

I simply added @functools.lru_cache(maxsize=1024, typed=False) on top of the abs_file function, and got a significant performance boost. Do we want to monkey-patch this, or is it worth trying to upstream it? In our case, it's OK to do some caching, since nothing but the input changes between runs; it might be different in the general use case of coverage.py, I don't know.
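To illustrate why the cache helps, here is a minimal, self-contained sketch. The `abs_file` below is a stand-in mimicking the kind of filesystem-touching path canonicalisation coverage.py does, not the actual implementation. During a fuzzing run the same source paths are resolved over and over, so every lookup after the first is a cache hit:

```python
import functools
import os

@functools.lru_cache(maxsize=1024, typed=False)
def abs_file(path):
    # Stand-in for coverage.py's path canonicalisation: each call hits
    # the filesystem, which is exactly what the cache skips on repeats.
    return os.path.realpath(os.path.abspath(path))

# Repeated lookups of the same path only pay the syscall cost once.
first = abs_file(".")
second = abs_file(".")
```

The `cache_info()` method on the wrapped function (hits, misses, current size) is handy for checking that the cache is actually being exercised during a benchmark run.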