DeepSourceCorp/benchmarks

DeepSource Benchmarks

Benchmark dataset evaluating code review and security analysis tools on the OpenSSF CVE Benchmark.

Benchmarked Tools

Last updated: February 8, 2026

Data Format

Judged Results (benchmarks/judged-results/)

Final evaluation results in JSONL format with fields:

  • cve_id: CVE identifier
  • variant: fixed or unfixed
  • detected_issues: Issues found by the tool
  • TP, FP, TN, FN: Classification outcomes (true positive, false positive, true negative, false negative)
  • judge_reasoning: Explanation of the judgment
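
As a rough illustration of how these fields might be consumed, the sketch below tallies the four classification fields across JSONL records and derives precision and recall. It rests on assumptions not confirmed by this repository: that TP, FP, TN, and FN are stored as counts or boolean flags, and that each line holds one JSON object. The helper names (tally, precision_recall) and the sample records with placeholder CVE IDs are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable, Tuple

FIELDS = ("TP", "FP", "TN", "FN")


def tally(lines: Iterable[str]) -> Counter:
    """Sum TP/FP/TN/FN over JSONL records (one JSON object per line)."""
    totals = Counter({key: 0 for key in FIELDS})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        for key in FIELDS:
            # Assumption: each field is a count or a boolean flag.
            totals[key] += int(record.get(key, 0))
    return totals


def precision_recall(totals: Counter) -> Tuple[float, float]:
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    tp, fp, fn = totals["TP"], totals["FP"], totals["FN"]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical records with placeholder CVE IDs, for illustration only.
sample = [
    '{"cve_id": "CVE-0000-0001", "variant": "unfixed", "detected_issues": ["x"], '
    '"TP": 1, "FP": 0, "TN": 0, "FN": 0, "judge_reasoning": "..."}',
    '{"cve_id": "CVE-0000-0001", "variant": "fixed", "detected_issues": ["x"], '
    '"TP": 0, "FP": 1, "TN": 0, "FN": 0, "judge_reasoning": "..."}',
    '{"cve_id": "CVE-0000-0002", "variant": "unfixed", "detected_issues": [], '
    '"TP": 0, "FP": 0, "TN": 0, "FN": 1, "judge_reasoning": "..."}',
    '{"cve_id": "CVE-0000-0002", "variant": "fixed", "detected_issues": [], '
    '"TP": 0, "FP": 0, "TN": 1, "FN": 0, "judge_reasoning": "..."}',
]

totals = tally(sample)
precision, recall = precision_recall(totals)
print(dict(totals), precision, recall)
```

To run this against real data, pass an open file handle from benchmarks/judged-results/ to tally in place of the sample list; the exact file names in that directory are not given here.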

Processed Results (benchmarks/processed/)

Intermediate formatted results from each tool, normalized for comparison.

Raw Output (benchmarks/raw-output/)

Original tool outputs per CVE, preserving the exact response from each tool.

Archive

The archive/ directory contains prompts and data from earlier benchmark runs.
