SVEN: Security Hardening and Adversarial Testing for Code LLMs

SVEN controls code LLMs so that they generate either secure code (for security hardening) or unsafe code (for adversarial testing) while maintaining functional correctness. It achieves this by learning continuous prompts (prefixes) with specialized loss terms on our curated dataset. This repository contains SVEN's source code and trained prefixes, as well as the training and evaluation data. For more technical details, see our paper.

Directory Structure

The directory structure of this repository is shown below:

.
|-- data_train_val     # our curated dataset for training and validation
|-- data_eval          # datasets used for evaluation
|-- sven               # SVEN's source code
|-- scripts            # scripts for training and evaluation
|-- trained            # trained prefixes

SVEN currently supports CodeGen, InCoder, and SantaCoder. It should be straightforward to add support for other LLMs (PRs are welcome).

Setup

Set up Python dependencies (a virtual environment is recommended) and GitHub CodeQL:

$ pip install -r requirements.txt
$ pip install -e .
$ ./setup_codeql.sh

Evaluation

The evaluation consists of two parts: security and functional correctness. You should run the evaluation scripts under the ./scripts directory. Make sure to use CUDA_VISIBLE_DEVICES to select the correct GPUs.
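As a minimal sketch of the GPU selection mentioned above, the helper below launches a script from ./scripts with CUDA_VISIBLE_DEVICES set. The function names gpu_env and run_in_scripts are hypothetical and not part of this repository; by default the helper only prints the command it would run.

```python
import os
import shlex
import subprocess
import sys


def gpu_env(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_in_scripts(args, gpu_ids, scripts_dir="./scripts", dry_run=True):
    """Run `python <args>` inside scripts_dir with only gpu_ids visible.

    Hypothetical helper: with dry_run=True it only prints the command.
    """
    cmd = [sys.executable, *args]
    env = gpu_env(gpu_ids)
    if dry_run:
        visible = env["CUDA_VISIBLE_DEVICES"]
        print(f"(cd {scripts_dir} && CUDA_VISIBLE_DEVICES={visible} {shlex.join(cmd)})")
    else:
        subprocess.run(cmd, cwd=scripts_dir, env=env, check=True)
    return cmd


# Example: show how sec_eval.py would be launched on GPU 0 only.
run_in_scripts(
    ["sec_eval.py", "--model_type", "lm", "--model_dir", "350m",
     "--output_name", "sec-eval-350m-lm"],
    gpu_ids=[0],
)
```

Pass dry_run=False to actually execute the command; the working directory is switched to ./scripts because the evaluation scripts are meant to be run from there.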

Evaluation on Security

To evaluate the security of the original LLM, run the command below. The model 350m can be replaced by any of {2b, 6b, incoder, santa}. See sec_eval.py for other options, such as --temp to adjust the sampling temperature and --eval_type to select the evaluation scenarios.

$ python sec_eval.py --model_type lm --model_dir 350m --output_name sec-eval-350m-lm
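To run the same evaluation for every supported model, the command can be generated in a loop. The sketch below only builds and prints the command lines. It assumes that output names for the other models follow the sec-eval-<model>-lm pattern of the example above and that --temp accepts a numeric value; check sec_eval.py to confirm both. The function name sec_eval_lm_cmd is hypothetical.

```python
import shlex

# Model identifiers listed in this README.
MODELS = ["350m", "2b", "6b", "incoder", "santa"]


def sec_eval_lm_cmd(model, temp=None):
    """Build the sec_eval.py command for the original (non-prefix) LLM."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    cmd = [
        "python", "sec_eval.py",
        "--model_type", "lm",
        "--model_dir", model,
        # ASSUMPTION: output name follows the pattern of the 350m example.
        "--output_name", f"sec-eval-{model}-lm",
    ]
    if temp is not None:
        # ASSUMPTION: --temp takes a numeric value.
        cmd += ["--temp", str(temp)]
    return cmd


for model in MODELS:
    print(shlex.join(sec_eval_lm_cmd(model)))
```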

To evaluate the security of SVEN using the trained prefixes we provide, run:

$ python sec_eval.py --model_type prefix --model_dir ../trained/350m-prefix/checkpoint-last --output_name sec-eval-350m-prefix

Use print_results.py to obtain the evaluation results. An example command for the original LLM is:

$ python print_results.py --eval_dir ../experiments/sec_eval/sec-eval-350m-lm

Evaluation on Functional Correctness

We use the HumanEval benchmark from the MultiPL-E framework to evaluate functional correctness. To evaluate the original LLM, run the command below. Check human_eval_gen.py for other generation arguments.

$ python human_eval_gen.py --model_type lm --model_dir 350m --output_name human-eval-350m-lm
$ python human_eval_exec.py --output_name human-eval-350m-lm

For SVEN, we need to run the two branches sec and vul separately via the --control argument. The command below is for the sec branch:

$ python human_eval_gen.py --model_type prefix --model_dir ../trained/350m-prefix/checkpoint-last --control sec --output_name human-eval-350m-prefix-sec
$ python human_eval_exec.py --output_name human-eval-350m-prefix-sec
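Since both branches have to be evaluated, the two-step command pair can be generated for sec and vul alike. The sketch below only builds and prints the commands. The output name for the vul branch (human-eval-350m-prefix-vul) is an assumption made by analogy with the sec example, and human_eval_cmds is a hypothetical helper name.

```python
import shlex

# Path to the provided 350m prefix, as used in the commands above.
PREFIX_DIR = "../trained/350m-prefix/checkpoint-last"


def human_eval_cmds(control, model="350m", prefix_dir=PREFIX_DIR):
    """Return the (generation, execution) commands for one SVEN branch."""
    if control not in ("sec", "vul"):
        raise ValueError("control must be 'sec' or 'vul'")
    # ASSUMPTION: the vul output name mirrors the sec one.
    output_name = f"human-eval-{model}-prefix-{control}"
    gen = [
        "python", "human_eval_gen.py",
        "--model_type", "prefix",
        "--model_dir", prefix_dir,
        "--control", control,
        "--output_name", output_name,
    ]
    exe = ["python", "human_eval_exec.py", "--output_name", output_name]
    return gen, exe


for control in ("sec", "vul"):
    gen, exe = human_eval_cmds(control)
    print(shlex.join(gen))
    print(shlex.join(exe))
```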

To view the results (for the original LLM, for example), run:

$ python print_results.py --eval_type human_eval --eval_dir ../experiments/human_eval/human-eval-350m-lm
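The two print_results.py invocations in this README differ only in the --eval_type flag and the experiments subdirectory. The sketch below captures that pattern in a hypothetical helper, print_results_cmd. It assumes that results for any run land in ../experiments/sec_eval/<output_name> or ../experiments/human_eval/<output_name>, as they do in the two examples shown.

```python
import shlex


def print_results_cmd(output_name, human_eval=False):
    """Build the print_results.py command for a finished evaluation run."""
    cmd = ["python", "print_results.py"]
    if human_eval:
        # Functional-correctness results need the explicit eval type.
        cmd += ["--eval_type", "human_eval"]
        subdir = "human_eval"
    else:
        subdir = "sec_eval"
    # ASSUMPTION: results are stored under ../experiments/<subdir>/<output_name>.
    cmd += ["--eval_dir", f"../experiments/{subdir}/{output_name}"]
    return cmd


print(shlex.join(print_results_cmd("sec-eval-350m-lm")))
print(shlex.join(print_results_cmd("human-eval-350m-lm", human_eval=True)))
```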

Training

We have provided our trained prefixes in ./trained. To train SVEN yourself, run:

$ python train.py --output_name 350m-prefix-new --pretrain_dir 350m
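After training, the new prefix can presumably be evaluated in the same way as the provided one. The sketch below pairs the training command with a follow-up security evaluation. It assumes that train.py writes its checkpoint to ../trained/<output_name>/checkpoint-last, mirroring the layout of the provided prefixes; this is not confirmed by this README, so verify the path before relying on it. The helper name train_then_eval_cmds and the evaluation output name are hypothetical.

```python
import shlex


def train_then_eval_cmds(pretrain_dir="350m", output_name="350m-prefix-new"):
    """Return the training command and a follow-up security evaluation."""
    train = [
        "python", "train.py",
        "--output_name", output_name,
        "--pretrain_dir", pretrain_dir,
    ]
    # ASSUMPTION: the trained prefix is saved here, like the provided ones.
    checkpoint = f"../trained/{output_name}/checkpoint-last"
    sec_eval = [
        "python", "sec_eval.py",
        "--model_type", "prefix",
        "--model_dir", checkpoint,
        # Hypothetical output name for the evaluation run.
        "--output_name", f"sec-eval-{output_name}",
    ]
    return train, sec_eval


train, sec_eval = train_then_eval_cmds()
print(shlex.join(train))
print(shlex.join(sec_eval))
```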

Citation

@article{sven-llm,
  author    = {Jingxuan He and Martin Vechev},
  title     = {Large Language Models for Code: Security Hardening and Adversarial Testing},
  journal   = {CoRR},
  volume    = {abs/2302.05319},
  year      = {2023},
  url       = {https://arxiv.org/abs/2302.05319},
}
