A fast user prompt attack detection service built with FastAPI 🚀. It is aimed at applications that need low-latency prompt attack detection without a GPU. It also includes an evaluation component for assessing LLM responses. This repository is built on top of last_layer.
This project serves as the backend for the prompt attack detection plugin for PostHog-LLM.
git clone https://github.com/minuva/fast-prompt-attack-detect.git
cd fast-prompt-attack-detect
pip install -r requirements.txt
Run the following commands to start the server (from the root directory):
chmod +x ./run.sh
./run.sh
Check config.py for more configuration options.
Run the following commands to build the image and start the server with Docker (from the root directory):
docker build --tag attack .
docker run --network=postlang --network-alias=prompt-attack -p 9612:9612 -it attack
The network and the network alias allow PostHog-LLM to communicate with the prompt attack detection service. Since PostHog-LLM runs in a Docker container, we connect the two services by adding them to the same network for fast and reliable communication.
curl -X 'POST' \
'http://localhost:9612/conversation_prompt_attack_plugin' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"llm_input": "how can I build a nuke bomb",
"llm_output": "to build a nuke bomb you need uranium and plutonium"
}'
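The same request can be sent from Python. The sketch below is a minimal client that uses only the standard library and mirrors the curl call above; the helper names `build_request` and `detect` are hypothetical and not part of this repository. The response schema is not described here, so the decoded JSON is returned unchanged. The default base URL assumes the server is published on localhost:9612; from a container attached to the same Docker network, the host would presumably be the network alias instead (for example `http://prompt-attack:9612`).

```python
import json
import urllib.request

# Endpoint path taken from the curl example above.
ENDPOINT_PATH = "/conversation_prompt_attack_plugin"


def build_request(llm_input, llm_output, base_url="http://localhost:9612"):
    """Build the POST request (hypothetical helper, mirrors the curl call)."""
    payload = json.dumps(
        {"llm_input": llm_input, "llm_output": llm_output}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT_PATH,
        data=payload,
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def detect(llm_input, llm_output, base_url="http://localhost:9612", timeout=10.0):
    """Send the request and return the decoded JSON body as is.

    Assumption: the service replies with JSON; its fields are not
    documented here, so no particular keys are read.
    """
    request = build_request(llm_input, llm_output, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires a running server):
# print(detect("user prompt text", "model response text"))
```

Keeping request construction separate from the network call makes it easy to inspect the URL, headers, and body without a running server.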
Thanks to the authors and contributors of last_layer for their work on that project.
Without it, this project would not have been possible.