DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection (RAID 2023) https://surrealyz.github.io/files/pubs/raid23-diversevul.pdf

wagner-group/diversevul

DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection

Yizheng Chen, Zhoujie Ding, Lamya Alowain, Xinyun Chen, David Wagner

https://surrealyz.github.io/files/pubs/raid23-diversevul.pdf

Dataset

Our DiverseVul dataset can be downloaded from this URL: https://drive.google.com/file/d/12IWKhmLhq7qn5B_iXgn5YerOQtkH-6RG/view?usp=sharing

The metadata of the dataset is available here: https://drive.google.com/file/d/19cJ7avNtsziaYkrrYuW7FeFdvgrxoNLc/view?usp=sharing

The metadata contains commit URLs and repository URLs for 7,512 commits in the DiverseVul dataset. Note that the metadata file is missing 3 commit URLs compared to the dataset above.
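To see which commits in the dataset have no metadata entry, the two downloads can be compared by commit ID. The following is a minimal sketch, not the authors' tooling. It assumes both files are in JSON Lines format (one JSON object per line) and that each record carries a `commit_id` field; neither assumption is stated on this page, so check the downloaded files and adjust the `key` argument if the field is named differently. The function names are hypothetical.

```python
import json
from typing import Iterable, Set


def commit_ids(lines: Iterable[str], key: str = "commit_id") -> Set[str]:
    """Collect the distinct commit IDs found in JSON Lines records.

    ASSUMPTION: each non-blank line is one JSON object, and the commit
    identifier is stored under `key`. Verify against the real files.
    """
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if key in record:
            ids.add(record[key])
    return ids


def missing_from_metadata(
    dataset_lines: Iterable[str],
    metadata_lines: Iterable[str],
    key: str = "commit_id",
) -> Set[str]:
    """Return commit IDs present in the dataset but absent from the metadata."""
    return commit_ids(dataset_lines, key) - commit_ids(metadata_lines, key)
```

Both functions accept any iterable of strings, so open file handles can be passed directly (a text file iterates line by line). If the assumptions hold, the returned set would contain the commits for which the metadata file has no URL.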

The following spreadsheet contains the data for the label noise analysis experiment in Section 5 of our paper: https://docs.google.com/spreadsheets/d/1Tns31RHeozRJF9e5Ie-Iw7nRIKJhrA2xvUUjTmFf5ec/edit?usp=sharing
