This project is a Prompt Injection Attack (PIA) Detection System using an LSTM model with SHAP-based explainability, built with Streamlit as the frontend.
✅ Detects Prompt Injection Attacks using LSTM
✅ Provides explainability using SHAP
✅ Interactive Streamlit UI for real-time predictions
✅ Dataset visualization
- Clone the repository
git clone https://github.com/Akshara-Balan/PIA-Detection-System-using-Explainable-AI.git
cd PIA-Detection-System-using-Explainable-AI- Install dependencies
pip install -r requirements.txt- Train and save the model
python model_training.py-
Run the streamlit UI
Windows:
python -m streamlit run frontend.py
Linux:
streamlit run frontend.py
-
Created a dataset manually (PIA Dataset.csv) with 500 datas for each of the 4 categories.
-
Performed Data Augmentation using the snippet: data_augmentation.py , which gave PIA_Augmented_Dataset.csv with 2500 datas for each of the category.
-
Performed Exploratory Data Anlysis on this new dataset: eda_data.py
-
Trained the LSTM model with augmented data: model_training.py and saved the model.
-
Added SHAP analysis module for the saved model: shap_analysis.py
-
Created frontend module with streamlit framework: frontend.py
- Train LSTM model on labeled PIA dataset.
- Save model (lstm_model.keras) and preprocessing artifacts.
- User inputs a prompt in the Streamlit UI.
- The prompt is preprocessed and fed to the trained LSTM model.
- The model predicts whether the prompt is legitimate or a PIA.
- SHAP analysis explains the prediction by highlighting influential words.
🔸 Deep Learning: TensorFlow (LSTM)
🔸 Explainability: SHAP
🔸 Frontend: Streamlit
🔸 Data Processing: Pandas, NumPy
🔸 Evaluation: Scikit-Learn
-
Akshara Balan
-
Alan C M
-
Maria Baiju
-
Tom Davis