Add additional JobRun stats for blackbox fuzzers #5238

ViniciustCosta merged 10 commits into master from
Conversation
part 1 of Populate job run stats for blackbox fuzzers, e.g. total fuzzing hours, test cases generated, test case generation time, test case executions, and test case execution time.
Clean up comments Fix
def _timedelta_to_duration_string(time_delta):
  """Converts a datetime.timedelta to ISO 8601 duration string.

  BigQuery Load API requires the ISO 8601 duration string rather than an INTERVAL
I found an issue while testing in dev and needed to change this to the ISO 8601 format instead of the SQL INTERVAL format.
I verified the new format by running bq load --source_format=NEWLINE_DELIMITED_JSON 'clusterfuzz-development:ochang_js_fuzzer_stats.temp_JobRun' test_bq.json
And now I can successfully query the table in dev.
Previously I was seeing this error in dev: https://paste.googleplex.com/5295014885326848
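For reference, a minimal stdlib-only sketch of the conversion under discussion. The helper name and exact zero-padding behavior here are illustrative assumptions, not necessarily what the PR landed:

```python
import datetime


def timedelta_to_iso8601_duration(time_delta):
    """Convert a non-negative datetime.timedelta to an ISO 8601 duration string.

    Sketch only; the upstream helper is _timedelta_to_duration_string and may
    format edge cases differently.
    """
    total_seconds = int(time_delta.total_seconds())
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)

    result = 'P'
    if days:
        result += f'{days}D'
    result += 'T'
    if hours:
        result += f'{hours}H'
    if minutes:
        result += f'{minutes}M'
    # Always emit seconds so a zero duration still yields 'PT0S'.
    result += f'{seconds}S'
    return result
```

For example, `timedelta(hours=1, minutes=30)` formats as `PT1H30M0S`, which the BigQuery Load API accepts where a SQL `INTERVAL` literal would be rejected.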
I verified that the BigQuery writes succeed locally after the last commit. I do see a couple of failures, but they don't seem related to the new schema. However, the cronjob running in dev seemed to show different errors, which suggests it wasn't using my latest changes even though I already merged them to dev. I'm not sure if I need to do anything else to make sure the cron job runs with these changes too.
The cronjob was running an old version of the code, which I found by looking up the
| """Fuzzer stats exception.""" | ||
|
|
||
|
|
||
| def _timedelta_to_duration_string(time_delta): |
nit: I wonder if there is a standard/common lib to convert this and avoid us having to implement and maintain it.
There's nothing in datetime, but it looks like we could import isodate and use isodate.duration_isoformat
https://github.com/gweis/isodate
It should be good, as this package is tracked in our third party internally: https://source.corp.google.com/piper///depot/google3/third_party/py/isodate/METADATA
Feel free to decide if it's worth it, not a blocker :)
I'm happy to pull that library in and use that so we don't need to maintain the duration formatting.
I'm struggling to get the dependency added correctly. Is there any documentation for how to do this?
When I try to run pipenv lock I'm seeing a larger diff than I would expect. I suspect there is something wrong with my local clusterfuzz installation.
Could I follow up with that change?
I don't think we have documentation for that (the best place to look would be the "how to do ClusterFuzz development" doc).
IIRC it should work by installing with pip and running pipenv lock as you said. The large diff might be due to a lot of packages not having their version pinned, so this also updates them :/
I'll merge this PR and you can try to change this in a follow-up.
This PR aims to improve our ability to benchmark blackbox fuzzers effectiveness and monitor their health.
Context
When measuring fuzzer effectiveness, we’re interested in the rate at which the fuzzer can execute testcases and find bugs. This requires a baseline definition of total fuzzing hours across all types of fuzzers.
Some of these stats are already computed for Monarch monitoring in monitoring_metrics.py, but we'd like to get them into BigQuery to have a consistent data source for analysis.

Changes
Adds testcases_generated, testcase_execution_duration, testcase_generation_duration, and fuzzing_duration to the uworker FuzzTaskOutput proto and writes those stats to the JobRun BigQuery table.

Notes
This only adds metrics for blackbox fuzzers. If we want comparisons across fuzzer types, we will need to aggregate the fuzzing session hours and execution metrics we already store in the TestcaseRun tables for engine-guided fuzzers. This means we either aggregate those tables and write the results to the JobRun tables, or do the aggregation in our plx workflows/scripts.
Testing
FuzzTaskOutput proto and written to BigQuery
logs from running a local bot: https://paste.googleplex.com/4551083230887936
local BigQuery stats: https://paste.googleplex.com/6697329928306688
dev BigQuery json: https://paste.googleplex.com/4627780642930688
dev BigQuery table https://paste.googleplex.com/6251944436957184
Added unit tests for the stats