VulFixNet

This study was submitted to IEEE/ACM ICPC 2025 under the title "Investigating Graphical Representations of Code Changes for Detecting Vulnerability Fixing Changes". In this study, we propose and investigate three graph-based code change representations and compare their effectiveness using our proposed model, VulFixNet.

Getting Started

Environment

The primary dependencies of this project are:

PyTorch Geometric (version 2.3.1)
PyTorch
NumPy
scikit-learn
Joern (we used version 2.0.107; other versions may work, but versions significantly older than this may not properly parse some samples)
Please follow the setup for sent2vec. Note: sent2vec is only required for training and generating images for VulCNN. You can also use the pretrained sent2vec models from VulCNN at: baidu or Google Drive
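
Before running the pipeline, it can help to confirm which of the Python dependencies listed above are installed. The following is a minimal sketch, not part of the repository: the helper name `installed_versions` is hypothetical, and the distribution names assume the packages were installed from PyPI. Joern is a separate command-line tool and is not covered by this check.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed install source).
REQUIRED = ["torch", "torch-geometric", "numpy", "scikit-learn"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Comparing the reported `torch-geometric` version against 2.3.1 is a quick way to spot a mismatch with the version used in this project.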

Part A - Data Preparation Process

This part extracts the function source code from the BigVul dataset, generates Code Property Graphs (CPGs), and combines the vulnerable and fixed versions of each function into a single file, from which the graph representations are generated later. The related Python scripts can be found under the data_preparation folder.

Update the BigVul dataset path, then run the Python script below to extract the functions from the BigVul dataset:

python bigvul_parser.py
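
For orientation, the extraction step can be sketched as follows. This is an illustrative sketch, not the code in bigvul_parser.py. It assumes the dataset is a CSV file with the columns `func_before`, `func_after`, and `vul`; the function name `extract_function_pairs` and the `<row>_vul.c` / `<row>_fixed.c` output names are hypothetical.

```python
import csv
from pathlib import Path

# Function bodies can be long, so raise the default per-field size limit.
csv.field_size_limit(2**31 - 1)


def extract_function_pairs(csv_path, out_dir):
    """Write the before/after version of each vulnerable row to separate files.

    Assumed columns: func_before, func_after, vul ("1" marks a vulnerable row).
    Returns the number of function pairs written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            if row.get("vul") != "1":
                continue  # skip rows that are not marked as vulnerable
            (out / f"{index}_vul.c").write_text(row["func_before"], encoding="utf-8")
            (out / f"{index}_fixed.c").write_text(row["func_after"], encoding="utf-8")
            written += 1
    return written
```

Keeping the two versions in files that share a row index makes it straightforward to pair them again in the later steps.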

Generate CPGs from the extracted source files. This process generates a CPG in *.dot format for each vulnerable and fixed function.

python generate_cpgs.py
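
As a rough illustration of what CPG generation with Joern involves, the sketch below builds the two command lines for one source file using Joern's `joern-parse` and `joern-export` tools. It is an assumption that generate_cpgs.py works along these lines; the helper name `joern_commands` and the file layout are hypothetical, and the available options may differ between Joern versions.

```python
from pathlib import Path


def joern_commands(source_file, work_dir):
    """Build (but do not run) the Joern commands for one source file.

    Returns a (parse_cmd, export_cmd) pair of argument lists.
    """
    source = Path(source_file)
    work = Path(work_dir)
    cpg_bin = work / f"{source.stem}.bin"   # binary CPG written by joern-parse
    dot_dir = work / f"{source.stem}_dot"   # directory for the exported *.dot files
    parse_cmd = ["joern-parse", str(source), "--output", str(cpg_bin)]
    export_cmd = [
        "joern-export", str(cpg_bin),
        "--repr", "cpg14",    # export the code property graph representation
        "--format", "dot",
        "--out", str(dot_dir),
    ]
    return parse_cmd, export_cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once Joern is on the `PATH`.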

Combine the vulnerable and the fixed versions of the functions into one *.dot file.

python combined_dots.py
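
The pairing idea behind this step can be sketched as below. This is only an illustration under stated assumptions, not the logic of combined_dots.py: the `_vul.dot` / `_fixed.dot` naming scheme and both function names are hypothetical, and the sketch simply writes the two graphs one after the other rather than merging their nodes.

```python
from pathlib import Path

VUL_SUFFIX = "_vul.dot"      # assumed naming of the vulnerable version
FIXED_SUFFIX = "_fixed.dot"  # assumed naming of the fixed version


def pair_dot_files(directory):
    """Return (vulnerable, fixed) path pairs that share the same file-name stem."""
    pairs = []
    for vul_path in sorted(Path(directory).glob(f"*{VUL_SUFFIX}")):
        stem = vul_path.name[: -len(VUL_SUFFIX)]
        fixed_path = vul_path.with_name(stem + FIXED_SUFFIX)
        if fixed_path.exists():
            pairs.append((vul_path, fixed_path))
    return pairs


def combine_dot_pair(vul_dot, fixed_dot, combined_dot):
    """Write the vulnerable graph followed by the fixed graph into one file."""
    combined_dot = Path(combined_dot)
    vul_text = Path(vul_dot).read_text(encoding="utf-8").strip()
    fixed_text = Path(fixed_dot).read_text(encoding="utf-8").strip()
    combined_dot.write_text(f"{vul_text}\n{fixed_text}\n", encoding="utf-8")
    return combined_dot
```

Functions with only one of the two versions present are skipped by `pair_dot_files`, since a code change needs both sides.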

Part B - Data Generation

There are three graph-based code change representations: Terminal-2-Root, Root-2-Root Terminal-2-Terminal, and Naive Matched. Each graph representation has two Python scripts: one generates the vulnerability-fixing representation, and the other generates the vulnerability-inducing representation, which we also refer to as the inverse. The scripts can be found under the graph_rep_generation folder.
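
The relationship between a change and its inverse can be pictured with a small sketch. It assumes that the inverse of a vulnerability-fixing change is obtained by swapping the two function versions; the `CodeChange` class and the file names are hypothetical and do not appear in the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeChange:
    before: str  # function version the change starts from
    after: str   # function version the change ends at

    def inverse(self):
        """Return the same change with its direction reversed."""
        return CodeChange(before=self.after, after=self.before)


# A vulnerability-fixing change goes from the vulnerable to the fixed version.
fixing = CodeChange(before="vulnerable_fn.dot", after="fixed_fn.dot")
# Its inverse (vulnerability-inducing) goes from the fixed to the vulnerable version.
inducing = fixing.inverse()
```

Under this assumption, every function pair yields one sample of each kind.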

Generate Terminal-2-Root Graph Representation

Run the Python script below to generate the vulnerability-fixing representation.

python graph_match_terminal.py

Run the Python script below to generate the vulnerability-inducing representation.

python graph_match_terminal_inverse.py

Generate Root-2-Root Terminal-2-Terminal Graph Representation

Run the Python script below to generate the vulnerability-fixing representation.

python graph_match_root_terminal.py

Run the Python script below to generate the vulnerability-inducing representation.

python graph_match_root_terminal_inverse.py

Generate Naive Matched Graph Representation

Run the Python script below to generate the vulnerability-fixing representation.

python graph_match_similar.py

Run the Python script below to generate the vulnerability-inducing representation.

python graph_match_similar_inverse.py

Part C - Training/Testing VulFixNet

To train and test VulFixNet, run the Python script below. Before running the script, please make sure that the data folder is set to the location of the generated pickle files (*.pkl). The scripts related to training/testing VulFixNet can be found under the VulFixNet folder.

python main.py
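
To check that the pickle folder is set correctly before training, a loader along the following lines can be used. This is a minimal sketch with a hypothetical helper name `load_pickles`; it makes no assumption about what each pickle contains, and it should only be pointed at files you generated yourself, since unpickling untrusted data can execute arbitrary code.

```python
import pickle
from pathlib import Path


def load_pickles(folder):
    """Load every *.pkl file in a folder into a dict keyed by file name."""
    samples = {}
    for path in sorted(Path(folder).glob("*.pkl")):
        with path.open("rb") as handle:
            samples[path.name] = pickle.load(handle)
    return samples
```

An empty result indicates that the folder path does not point at the generated files.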
