We propose CodeGuard+, a test suite that evaluates both the security and correctness of Code LLMs. We also propose constrained decoding techniques that make Code LLMs generate secure and correct code. More details can be found in the paper.
The directory structure of this repository is as follows:
.
|-- test_suite # Our test suite, CodeGuard+
|-- outputs.tar.gz # Programs generated by CodeGen, SVEN, and StarCoder2 using different decoding methods
|-- codegen
|-- sven
|-- starcoder2
Our test suite CodeGuard+ is adapted from this paper. It currently includes 23 prompts, along with corresponding unit tests and CodeQL queries. The prompts and CodeQL queries are in test_suite/new_trained, and the unit tests are in test_suite/unit_tests.
We evaluate three Code LLMs: CodeGen-2.7B, SVEN, and StarCoder2-3B.
For CodeGen and SVEN, we evaluate two unconstrained decoding methods, Nucleus Sampling and Beam Sampling, and one constrained decoding method, Constrained Beam Sampling. For StarCoder2, we evaluate the same three methods plus one additional constrained decoding method, MuCoLa.
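To illustrate the kind of unconstrained decoding involved, here is a minimal, generic sketch of nucleus (top-p) sampling over a toy logit vector. This is an illustration of the standard technique only, not the paper's implementation; the function name and the example logits are made up for demonstration.

```python
import numpy as np

def nucleus_sample(logits, top_p=0.95, rng=None):
    """Sample one token id from the smallest set of tokens whose
    cumulative probability exceeds top_p (nucleus / top-p sampling)."""
    rng = rng or np.random.default_rng(0)
    # Softmax with max-subtraction for numerical stability.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]           # token ids by descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1  # minimal prefix whose mass >= top_p
    nucleus = order[:cutoff]
    # Renormalize within the nucleus and sample from it.
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))

# Toy example: 5-token vocabulary; only the top few tokens can be sampled.
logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
print(nucleus_sample(logits, top_p=0.9))
```

With top_p=0.9 on these logits, the nucleus contains only the three highest-probability tokens, so the sample is always one of token ids 0, 1, or 2. Beam Sampling and Constrained Beam Sampling additionally track multiple partial sequences (and, in the constrained case, filter them against constraints), which is beyond this sketch.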
During our evaluations, we generated programs using each Code LLM and decoding method. All outputs are in outputs.tar.gz. After extracting the archive (e.g., with tar -xzf outputs.tar.gz), you will find three folders, one per Code LLM. For instance, the outputs of StarCoder2 + Constrained Beam Sampling are in outputs/starcoder2/star-cbs.
This repository is still under construction. Thank you for your patience!