This repository contains the source code for RTS-Attack.
- python=3.11.10
- openai=1.98.0
- google-genai=1.28.0
- matplotlib=3.8.0/3.10.0
- pandas=2.2.3
- beautifulsoup4=4.13.4
- numpy=1.26.4
- torch=2.1.1+cu121
- torchaudio=2.1.1+cu121
- torchvision=0.16.1+cu121
This project mainly contains the following dirs/files:
- constant/consts.py: This file contains all string constants, such as prompts and LLM names.
- dataset/advbench.csv: The dataset file, from https://github.com/llm-attacks/llm-attacks/tree/main/data/advbench.
- dto: This dir contains model classes (name, url, api_key, ...).
- service: This dir contains all functional components.
  - openai_service.py: Initializes the OpenAI API, sends requests to it to get responses, and counts tokens.
  - pre_experiments_service.py: Runs the pre-experiments.
  - query_parser_service.py: Query classification and query extraction.
  - scenario_generator_service.py: Generates scenarios.
  - instruction_generator_service.py: Not strictly necessary, because its function is simple (only a rewriting prompt); instructions are rewritten when executing jailbreak attacks.
  - jailbreak_attacker_service.py: Runs the jailbreak (including rewriting instructions, evaluating asr_g, and extracting the reason and harmful score).
  - prompt_evaluator_service.py: Evaluates scenario quality.
- test: This dir contains all test files for the services. You can use them to call service functions.
To use this project, you need to obtain API keys for the relevant large language models. Here are the steps and considerations for different models:
- Sign up for an OpenAI account:
  - Go to the OpenAI official website (https://openai.com/) and sign up for an account if you don't have one already.
- Generate an API key:
  - Log in to your OpenAI account and navigate to the API key management section.
  - Create a new API key. Make sure to keep this key secure, as it grants access to your OpenAI account resources.
- For models from other providers, sign up for accounts on the respective platforms, as with OpenAI.
- Obtain the API keys from those providers' official websites or developer portals.
Note:
- Some models may require additional configuration, such as setting the base URL. Make sure to follow the official documentation of each model to correctly configure these settings.
- API keys are sensitive information. Do not share them publicly or commit them to version control systems. It is recommended to use environment variables or other secure methods to manage your API keys in a production environment.
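
The environment-variable advice in the note above can be sketched as follows. This is a minimal illustration, not the repository's code: the `ModelConfig` class and the `load_model_config` helper are hypothetical names that loosely mirror the fields listed for the `dto` classes (name, url, api_key), and the environment variable names are whatever you choose to pass in.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical stand-in for a dto model class (name, url, api_key)."""
    name: str
    api_key: str
    url: Optional[str] = None  # base URL; None means "use the client's default"


def load_model_config(name: str, key_var: str, url_var: Optional[str] = None,
                      env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Build a ModelConfig from environment variables instead of hard-coded keys."""
    api_key = env.get(key_var)
    if not api_key:
        # Fail early with a clear message rather than sending an empty key.
        raise RuntimeError(f"Set the {key_var} environment variable first")
    # The base URL is optional; an unset or empty variable becomes None.
    url = env.get(url_var) if url_var else None
    return ModelConfig(name=name, api_key=api_key, url=url or None)
```

With the variables exported in your shell, a call such as `load_model_config("my-model", "OPENAI_API_KEY", "OPENAI_BASE_URL")` keeps the key out of the source tree; the model name here is a placeholder.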
We do not provide a one-click script, because it is difficult to integrate all functions into a single run; in particular, different LLMs return replies in different formats. Instead, run the test files for each stage separately to obtain the results.
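
To illustrate why reply formats get in the way of a single script, here is a hedged sketch of a small adapter that pulls the reply text out of two differently shaped responses. It is not part of this repository: the `extract_reply_text` function and the provider labels are hypothetical, and the dictionary layouts are assumptions based on the JSON shapes of OpenAI-style chat completions and Gemini-style `generateContent` replies; the SDK objects the services actually receive may differ.

```python
from typing import Any, Mapping


def extract_reply_text(provider: str, response: Mapping[str, Any]) -> str:
    """Return the reply text from a provider-specific response dictionary."""
    if provider == "openai":
        # Assumed chat-completion layout: choices[0].message.content
        return response["choices"][0]["message"]["content"]
    if provider == "gemini":
        # Assumed generateContent layout: candidates[0].content.parts[*].text
        parts = response["candidates"][0]["content"]["parts"]
        return "".join(part.get("text", "") for part in parts)
    raise ValueError(f"Unknown provider: {provider}")
```

Each new model would need its own branch, which is one reason running the per-stage test files is simpler than maintaining a single entry point.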
If you want to contribute to this project, please follow these steps:
- Fork the project repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them with descriptive commit messages.
- Push your changes to your forked repository.
- Create a pull request to the original repository.