Prompt Injection Attack Detection System using Explainable AI

This project is a Prompt Injection Attack (PIA) Detection System using an LSTM model with SHAP-based explainability, built with Streamlit as the frontend.

Core Features

✅ Detects Prompt Injection Attacks using LSTM

✅ Provides explainability using SHAP

✅ Interactive Streamlit UI for real-time predictions

✅ Dataset visualization

How to Run

Clone the repository

git clone https://github.com/Akshara-Balan/PIA-Detection-System-using-Explainable-AI.git
cd PIA-Detection-System-using-Explainable-AI

Install dependencies

pip install -r requirements.txt

Train and save the model

python model_training.py

Run the streamlit UI

Windows:

python -m streamlit run frontend.py

Linux:

streamlit run frontend.py

How we did it

Created a dataset manually (PIA Dataset.csv) with 500 datas for each of the 4 categories.
Performed Data Augmentation using the snippet: data_augmentation.py , which gave PIA_Augmented_Dataset.csv with 2500 datas for each of the category.
Performed Exploratory Data Anlysis on this new dataset: eda_data.py
Trained the LSTM model with augmented data: model_training.py and saved the model.
Added SHAP analysis module for the saved model: shap_analysis.py
Created frontend module with streamlit framework: frontend.py

How It All Works Together

Training Phase (Offline)

Train LSTM model on labeled PIA dataset.
Save model (lstm_model.keras) and preprocessing artifacts.

Inference & Explainability (Online - Streamlit)

User inputs a prompt in the Streamlit UI.
The prompt is preprocessed and fed to the trained LSTM model.
The model predicts whether the prompt is legitimate or a PIA.
SHAP analysis explains the prediction by highlighting influential words.

Key Technologies

🔸 Deep Learning: TensorFlow (LSTM)

🔸 Explainability: SHAP

🔸 Frontend: Streamlit

🔸 Data Processing: Pandas, NumPy

🔸 Evaluation: Scikit-Learn

Contributors

Akshara Balan
Alan C M
Maria Baiju
Tom Davis

Name		Name	Last commit message	Last commit date
Latest commit History 64 Commits
Final		Final
__pycache__		__pycache__
PIA Dataset.csv		PIA Dataset.csv
PIA_Augmented_Dataset.csv		PIA_Augmented_Dataset.csv
README.md		README.md
Work1.ipynb		Work1.ipynb
Work2.ipynb		Work2.ipynb
Work3.ipynb		Work3.ipynb
confusion_matrix_plot.py		confusion_matrix_plot.py
data_augmentation.py		data_augmentation.py
data_processing.py		data_processing.py
eda_data.py		eda_data.py
frontend.py		frontend.py
model_training.py		model_training.py
requirements.txt		requirements.txt
shap_analysis.py		shap_analysis.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Prompt Injection Attack Detection System using Explainable AI

Core Features

How to Run

How we did it

How It All Works Together

Training Phase (Offline)

Inference & Explainability (Online - Streamlit)

Key Technologies

Contributors

About

Uh oh!

Releases

Packages

Languages

Akshara-Balan/PIA-Detection-System-using-Explainable-AI

Folders and files

Latest commit

History

Repository files navigation

Prompt Injection Attack Detection System using Explainable AI

Core Features

How to Run

How we did it

How It All Works Together

Training Phase (Offline)

Inference & Explainability (Online - Streamlit)

Key Technologies

Contributors

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages