SVA-ICL: Improving LLM-based Software Vulnerability Assessment via In-Context Learning and Information Fusion
This repository contains the source code for the paper "SVA-ICL: Improving LLM-based Software Vulnerability Assessment via In-Context Learning and Information Fusion". Please refer to the paper for the experimental details.
- The `dataset` folder contains all the data used in the experiments for RQ1-RQ5.
- The `dataset2` and `dataset3` folders store the two additional random samples used in the discussion section.
- Due to the large size of the datasets, we have stored them in Google Drive: Google Drive Link.
- The results for RQ1 and RQ2 are stored in the `results3` and `results2` folders, respectively.
- The results for RQ3 and RQ4 are stored in the `results_RQ3` and `results_RQ4` folders, respectively.
- The results for RQ5 are stored in the `results` folder.
- The experimental results for the discussion section are stored in the `results_gpt35`, `results_gpt4o`, `results_dataset2`, and `results_dataset3` folders.
The trained `bert_whitening` models are stored in the `model`, `model_dataset2`, and `model_dataset3` folders.
- Use the provided Jupyter files for data preprocessing.
- Run `bert_whitening.py`. This produces the semantic vector library of the training set, along with the whitening kernel and bias.
- Run `ccgir.py` to retrieve the most similar code fragments for each test-set sample.
- Run `search_info_form_code.ipynb` to gather all the data required for the prompt template.
- Run `deepseek.ipynb` to call the LLM and complete the vulnerability assessment task.