CIVA-Lab/VulFixNet

VulFixNet

This repository accompanies a study submitted to IEEE/ACM ICPC 2025 under the title "Investigating Graphical Representations of Code Changes for Detecting Vulnerability Fixing Changes". In this study, we propose and investigate three graph-based code change representations and compare their effectiveness using our proposed model, VulFixNet.

Getting Started

Environment

The primary dependencies of this project are:

  • PyTorch Geometric (version 2.3.1)
  • PyTorch
  • NumPy
  • scikit-learn
  • Joern (we used version 2.0.107; other versions may work, but significantly older versions may not parse some samples properly)
  • sent2vec: please follow the sent2vec setup instructions. Note: sent2vec is only required for training and generating images for VulCNN. You can also use the pretrained sent2vec models from VulCNN, available on baidu or Google Drive
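
Before running the pipeline, it can help to confirm that the Python dependencies listed above are importable. The following is a minimal sketch, not part of the repository: the helper name `check_dependencies` is hypothetical, and the import and distribution names are the usual ones for these packages. Joern and sent2vec are not covered here.

```python
from importlib import metadata, util

# Import name -> distribution name for the Python dependencies listed above.
# These are the common names for each package, not taken from the repository.
REQUIRED = {
    "torch": "torch",
    "torch_geometric": "torch-geometric",
    "numpy": "numpy",
    "sklearn": "scikit-learn",
}


def check_dependencies(required=REQUIRED):
    """Return {import_name: version string, "unknown", or None if missing}."""
    report = {}
    for module, dist in required.items():
        # find_spec returns None when a top-level module cannot be located.
        if util.find_spec(module) is None:
            report[module] = None
            continue
        try:
            report[module] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            # Importable, but no matching distribution metadata was found.
            report[module] = "unknown"
    return report


# Example usage (hypothetical):
# for name, version in check_dependencies().items():
#     print(f"{name}: {version or 'MISSING'}")
```

The reported PyTorch Geometric version can then be compared by hand against 2.3.1, the version listed above.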

Part A - Data Preparation Process

This part extracts the source code from the BigVul dataset, generates Code Property Graphs (CPGs), and combines the related vulnerable and fixed versions of each function into a single file, from which the graph representations are generated later. The related Python scripts can be found under the data_preparation folder.

  1. Update the BigVul dataset path, then extract the functions from the BigVul dataset by running the Python script below:
python bigvul_parser.py
  2. Generate CPGs from the extracted source files. This step generates a CPG in *.dot format for each vulnerable and fixed function.
python generate_cpgs.py
  3. Combine the vulnerable and fixed versions of each function into one *.dot file.
python combined_dots.py
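
The three steps above must run in order, since each consumes the output of the previous one. A minimal sketch of a driver that does this is shown below. It is not part of the repository: `build_commands` and `run_pipeline` are hypothetical helpers, and the sketch assumes the scripts take no command-line arguments and are run from the data_preparation folder.

```python
import subprocess
import sys

# Script names as listed in the steps above, in execution order.
STEPS = ["bigvul_parser.py", "generate_cpgs.py", "combined_dots.py"]


def build_commands(steps=STEPS, python=sys.executable):
    """Build one command per step, preserving the order of the steps."""
    return [[python, script] for script in steps]


def run_pipeline(commands, cwd="data_preparation"):
    """Run each command in turn; check=True stops at the first failure."""
    # Assumption: the scripts are launched from the data_preparation folder.
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=cwd, check=True)


# Example usage (hypothetical):
# run_pipeline(build_commands())
```

Stopping at the first failure avoids generating CPGs or combined *.dot files from an incomplete extraction.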

Part B - Data Generation

There are three graph-based code change representations: Terminal-2-Root, Root-2-Root Terminal-2-Terminal, and Naive Matched. Each representation has two Python scripts: one generates the vulnerability-fixing representation, and the other generates the vulnerability-inducing representation, which we also refer to as the inverse. The scripts can be found under the graph_rep_generation folder.

Generate Terminal-2-Root Graph Representation

Run the Python script below to generate the vulnerability-fixing representation.

python graph_match_terminal.py

Run the Python script below to generate the vulnerability-inducing representation.

python graph_match_terminal_inverse.py

Generate Root-2-Root Terminal-2-Terminal Graph Representation

Run the Python script below to generate the vulnerability-fixing representation.

python graph_match_root_terminal.py

Run the Python script below to generate the vulnerability-inducing representation.

python graph_match_root_terminal_inverse.py

Generate Naive Matched Graph Representation

Run the Python script below to generate the vulnerability-fixing representation.

python graph_match_similar.py

Run the Python script below to generate the vulnerability-inducing representation.

python graph_match_similar_inverse.py
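
The six scripts in this part can be summarized as a lookup from representation and direction to script name. The sketch below is illustrative only: the `SCRIPTS` table and the `script_for` helper are hypothetical, and the direction labels "fixing" and "inducing" are chosen here for brevity. The script names are the ones listed above.

```python
# (representation, direction) -> script under graph_rep_generation.
SCRIPTS = {
    ("Terminal-2-Root", "fixing"): "graph_match_terminal.py",
    ("Terminal-2-Root", "inducing"): "graph_match_terminal_inverse.py",
    ("Root-2-Root Terminal-2-Terminal", "fixing"): "graph_match_root_terminal.py",
    ("Root-2-Root Terminal-2-Terminal", "inducing"): "graph_match_root_terminal_inverse.py",
    ("Naive Matched", "fixing"): "graph_match_similar.py",
    ("Naive Matched", "inducing"): "graph_match_similar_inverse.py",
}


def script_for(representation, direction):
    """Return the script name for a representation and direction."""
    try:
        return SCRIPTS[(representation, direction)]
    except KeyError:
        raise ValueError(
            f"Unknown combination: {representation!r}, {direction!r}"
        ) from None


# Example usage (hypothetical):
# print(script_for("Naive Matched", "inducing"))
```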

Part C - Training/Testing VulFixNet

To train and test VulFixNet, run the Python script below. Before running the script, please make sure that the data folder is set to the location of the generated pickle files (*.pkl). The scripts related to training/testing VulFixNet can be found under the VulFixNet folder.

python main.py
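
Since training depends on the generated pickle files, a quick check that the chosen folder actually contains *.pkl files can save a failed run. The following is a minimal sketch under stated assumptions: `list_pickles` is a hypothetical helper, not part of the repository, and it assumes the pickle files sit directly in the folder rather than in subfolders.

```python
from pathlib import Path


def list_pickles(folder):
    """Return the sorted *.pkl files directly inside folder, or raise."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Not a directory: {root}")
    # Non-recursive search: only files directly inside the folder are matched.
    files = sorted(root.glob("*.pkl"))
    if not files:
        raise FileNotFoundError(f"No *.pkl files found in {root}")
    return files


# Example usage (hypothetical path):
# print(len(list_pickles("path/to/generated_pickles")), "pickle files found")
```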
