🚨 See UPDATES.md for details
SecLLMHolmes is a generalized, fully automated, and scalable framework to systematically evaluate the performance (i.e., accuracy and reasoning capabilities) of LLMs for vulnerability detection.
- Assessing identification and reasoning in vulnerability detection
- Fully automated evaluation
- Scalable to any chat-based LLM
- Comprehensive testing over eight distinct and critical dimensions for vulnerability detection
- Evaluation over C/C++ and Python programming languages
- Assessment over code scenarios with three complexity levels
- Tests for the eight most dangerous classes of vulnerabilities (CWEs)
- Robustness testing over a range of minor to major code augmentations
- Assessment over a diverse set of 17 prompts
To evaluate your LLM using our framework, you need to create an adapter. You can do this by modifying `src/adapter.py` and implementing the following three functions in it:
- `prepare_prompt`: define best prompting practices and rules specific to your LLM
- `prepare`: define, prepare, or load your model
- `chat`: define the message structure and chat inference method
Note: For more details, please refer to our paper (Section 3.1) and see the example adapters for the LLMs included in our study.
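For illustration, here is a minimal adapter sketch. Only the three function names above come from the framework; the `EchoModel` stand-in and the exact signatures are assumptions, so use `src/adapter.py` and the bundled example adapters as ground truth:

```python
# Minimal adapter sketch -- hypothetical structure; match the signatures
# actually expected by src/adapter.py in this repository.

class EchoModel:
    """Stand-in model that makes this sketch runnable; replace with your LLM."""
    def chat(self, messages):
        # A real model would generate a response from the chat history here.
        return "(echo) " + messages[-1]["content"]

def prepare_prompt(prompt):
    """Apply prompting practices and rules specific to your LLM."""
    return "You are a security expert who answers questions about code.\n\n" + prompt

def prepare():
    """Define, prepare, or load your model (weights, tokenizer, or an API client)."""
    return EchoModel()

def chat(model, prompt):
    """Define the message structure and run one round of chat inference."""
    messages = [{"role": "user", "content": prepare_prompt(prompt)}]
    return model.chat(messages)

if __name__ == "__main__":
    model = prepare()
    print(chat(model, "Does this function contain an out-of-bounds write?"))
```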
- Get OpenAI API-KEY: As our evaluation framework uses OpenAI's GPT-4o API, you need to provide your own API key (see OpenAI's official documentation). One common way to set it is shown after this list.
- Add Model's Name: This model name will be used to store the evaluation results for your LLM.
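For example, assuming the framework reads the key from the standard `OPENAI_API_KEY` environment variable (check the included adapters for the exact mechanism), you can set it in your shell:

```sh
export OPENAI_API_KEY="sk-..."  # replace with your own key
```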
After you have created your adapter, follow these steps to run your evaluation:
- Create a Python environment:
```sh
python3 -m venv env
source env/bin/activate
```
- Install the required packages:
```sh
pip install -r requirements.txt
```
Note: You also need to manually install any packages required to run your own LLM.
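For instance, if your adapter loads a local Hugging Face model, you might additionally run (a hypothetical example; the packages your LLM needs may differ):

```sh
pip install transformers torch  # example only; depends on your LLM
```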
- Run your adapter; it will create a `results` directory and store all results in it:
```sh
cd src
python adapter.py
```