# Ensemble-Based Poison Detection on Structured Data

This notebook simulates label-flipping data poisoning on structured, healthcare-like data and implements
ensemble-based detection using classifier disagreement, inspired by the EPIC framework.

**Goals:**
- Simulate a diagnosis dataset
- Train Logistic Regression, Random Forest, and SVM classifiers
- Detect poisoned samples via prediction disagreement
- Evaluate the impact of poisoning and the detection rate

In [None]:
# Imports
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

## Simulated Healthcare Dataset
The code below creates a synthetic structured dataset that mimics diagnosis prediction from healthcare data such as lab tests and vitals, then flips the labels of 10% of the records to simulate label poisoning.

In [None]:
# Simulate clean data
np.random.seed(0)
n_samples = 1000
X = np.random.normal(0, 1, (n_samples, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic label

# Inject label poisoning in 10% of data
poison_rate = 0.1
n_poisoned = int(poison_rate * n_samples)
y_poisoned = y.copy()
poison_indices = np.random.choice(n_samples, n_poisoned, replace=False)
y_poisoned[poison_indices] = 1 - y_poisoned[poison_indices]  # flip labels

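# Train/test split (80/20). Both calls use the same random_state, so the
# poisoned and clean splits contain exactly the same rows.
X_train, X_test, y_train, y_test = train_test_split(X, y_poisoned, test_size=0.2, random_state=42)
X_train_clean, _, y_train_clean, _ = train_test_split(X, y, test_size=0.2, random_state=42)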
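
Because `train_test_split` shuffles rows, knowing which test rows were poisoned requires carrying the original row indices through the split. The following is a minimal sketch of that idea, not part of the original notebook: the helper `flip_labels` and the names `idx_train`, `idx_test`, and `is_poisoned_test` are illustrative assumptions, and the data is regenerated with its own random generator so the block runs on its own.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 1000
X = rng.normal(0, 1, (n_samples, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # same synthetic labeling rule as above


def flip_labels(y, rate, rng):
    """Return a copy of binary labels with a fraction `rate` flipped, plus the flipped indices."""
    n_flip = int(rate * len(y))
    idx = rng.choice(len(y), n_flip, replace=False)
    y_flipped = y.copy()
    y_flipped[idx] = 1 - y_flipped[idx]
    return y_flipped, idx


y_poisoned, poison_indices = flip_labels(y, 0.1, rng)

# Pass the index array as a third input so it is shuffled together with X and y.
indices = np.arange(n_samples)
X_train, X_test, y_train, y_test, idx_train, idx_test = train_test_split(
    X, y_poisoned, indices, test_size=0.2, random_state=42
)

# Boolean mask: which test rows carry a flipped label.
is_poisoned_test = np.isin(idx_test, poison_indices)
print(f"Poisoned rows in test set: {is_poisoned_test.sum()} of {len(idx_test)}")
```

With the indices carried along, `X[idx_test]` reproduces `X_test` row for row, so any per-sample flag computed on the test set can be compared against `is_poisoned_test` directly.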

## Ensemble Models and Prediction Disagreement
Three models are trained on the synthetic patient-like records generated above, which carry 10% label poisoning.
We then measure how often their predictions disagree on the test set; this disagreement is the signal used to detect poison injections.


In [None]:
# Train ensemble classifiers
clf1 = LogisticRegression(max_iter=500).fit(X_train, y_train)
clf2 = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)  # fixed seed for reproducibility
clf3 = SVC(probability=True, random_state=0).fit(X_train, y_train)

# Predict on test set
pred1 = clf1.predict(X_test)
pred2 = clf2.predict(X_test)
pred3 = clf3.predict(X_test)

# Count disagreement
disagreements = (pred1 != pred2) | (pred1 != pred3) | (pred2 != pred3)
disagreement_rate = np.mean(disagreements)
print(f"Disagreement rate: {disagreement_rate:.2f}")

# Train three different classifiers and analyze how often they disagree for poison detection
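
With three binary classifiers, the boolean flag above can only say whether the vote was unanimous or split 2 to 1. One possible refinement, offered here as an assumption rather than as part of the EPIC framework or the original notebook, is a continuous score based on the spread of predicted positive-class probabilities across models. The function name `disagreement_score`, the example probabilities, and the 0.2 threshold are all hypothetical.

```python
import numpy as np


def disagreement_score(probas):
    """Per-sample spread of positive-class probabilities across models.

    probas: array-like of shape (n_models, n_samples).
    Returns an array of shape (n_samples,) holding the standard deviation
    across models; 0 means all models give the same probability.
    """
    probas = np.asarray(probas, dtype=float)
    return probas.std(axis=0)


# Hypothetical probabilities from three models for four samples.
example = [
    [0.9, 0.1, 0.8, 0.5],
    [0.9, 0.2, 0.2, 0.5],
    [0.9, 0.1, 0.5, 0.5],
]
scores = disagreement_score(example)
flagged = scores > 0.2  # illustrative threshold, not tuned
print(np.round(scores, 3), flagged)
```

For the models trained in this notebook, the input could be built with `np.vstack([clf.predict_proba(X_test)[:, 1] for clf in (clf1, clf2, clf3)])`. A continuous score lets the flagging threshold be tuned to trade precision against recall.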

## Poison Detection via Disagreement
Samples on which the classifiers disagree are flagged as potentially poisoned, following the idea behind the EPIC framework. The flags are then compared against the known poisoned indices to count how many flagged samples were actually poisoned.
Expected outcome: disagreement should be more frequent on poisoned samples, so high disagreement flags likely anomalies.


In [None]:
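# Map test rows back to their original indices. train_test_split shuffles the
# rows, so re-split the index array with the same test_size and random_state.
_, test_indices = train_test_split(np.arange(n_samples), test_size=0.2, random_state=42)
actual_poison_flags = np.isin(test_indices, poison_indices)
predicted_poison_flags = disagreements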

# Calculate detection performance
tp = np.sum(actual_poison_flags & predicted_poison_flags)
fp = np.sum(~actual_poison_flags & predicted_poison_flags)
fn = np.sum(actual_poison_flags & ~predicted_poison_flags)

precision = tp / (tp + fp + 1e-9)
recall = tp / (tp + fn + 1e-9)

print(f"Poison Detection Precision: {precision:.2f}")
print(f"Poison Detection Recall: {recall:.2f}")

# Compute precision and recall for how well disagreement identifies poisoned records.
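
Precision is easier to interpret next to the base rate of poisoned samples, because a detector that flags rows at random has an expected precision equal to that base rate. The sketch below cross-checks the manual counts with scikit-learn's `precision_score` and `recall_score` and reports the base rate alongside them. The helper name `detection_report` and the ten-element example arrays are hypothetical.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score


def detection_report(actual_flags, predicted_flags):
    """Precision and recall of the poison flags, plus the poison base rate."""
    actual = np.asarray(actual_flags, dtype=bool).astype(int)
    predicted = np.asarray(predicted_flags, dtype=bool).astype(int)
    return {
        "precision": precision_score(actual, predicted, zero_division=0),
        "recall": recall_score(actual, predicted, zero_division=0),
        "base_rate": actual.mean(),  # expected precision of random flagging
    }


# Hypothetical flags for ten samples: two are poisoned, two are flagged.
actual = [True, False, False, True, False, False, False, False, False, False]
predicted = [True, True, False, False, False, False, False, False, False, False]
report = detection_report(actual, predicted)
print(report)
```

In the notebook, the same helper could be called as `detection_report(actual_poison_flags, predicted_poison_flags)`. A precision close to the base rate would suggest that disagreement carries little information about which samples were poisoned.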

## Accuracy Comparison
The first cell below visualizes the disagreement-based detection result as the number of detected and missed poisoned samples.
The second cell compares a model trained on clean labels with a model trained on poisoned labels, both evaluated on the same test set.

In [None]:
# Visualize disagreement-based poison detection result
plt.figure(figsize=(6, 4))
plt.bar(['Detected Poisons (TP)', 'Missed Poisons (FN)'], [tp, fn], color=['green', 'red'])
plt.title('Poison Detection via Ensemble Disagreement')
plt.ylabel('Number of Samples')
plt.grid(axis='y')
plt.show()

In [None]:
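# Compare models trained on clean vs poisoned training labels.
# Score against clean test labels: y_test contains flipped labels, which would
# penalize correct predictions. The same random_state reproduces the same split.
_, y_test_clean = train_test_split(y, test_size=0.2, random_state=42)
clf1_clean = LogisticRegression(max_iter=500).fit(X_train_clean, y_train_clean)
acc_poisoned = accuracy_score(y_test_clean, clf1.predict(X_test))
acc_clean = accuracy_score(y_test_clean, clf1_clean.predict(X_test))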

plt.bar(['Clean Model', 'Poisoned Model'], [acc_clean, acc_poisoned])
plt.title('Model Accuracy (Clean vs Poisoned)')
plt.ylabel('Accuracy')
plt.ylim(0, 1)
plt.show()
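
To see how the impact of poisoning changes with its strength, the comparison can be repeated across several poison rates. The following is a rough sketch under stated assumptions: it regenerates a dataset with the same labeling rule, flips only training labels, and scores against clean test labels. The rates `(0.0, 0.1, 0.3)` and all variable names are illustrative choices, not values from the original notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_demo = rng.normal(0, 1, (1000, 10))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

accuracy_by_rate = {}
for rate in (0.0, 0.1, 0.3):
    # Flip a fraction `rate` of the training labels only.
    y_noisy = y_tr.copy()
    flip = rng.choice(len(y_tr), int(rate * len(y_tr)), replace=False)
    y_noisy[flip] = 1 - y_noisy[flip]

    model = LogisticRegression(max_iter=500).fit(X_tr, y_noisy)
    # Evaluate against clean test labels so only training poison is measured.
    accuracy_by_rate[rate] = accuracy_score(y_te, model.predict(X_te))

for rate, acc in accuracy_by_rate.items():
    print(f"poison rate {rate:.0%}: accuracy on clean test labels = {acc:.3f}")
```

Keeping the test labels clean isolates the effect of the poisoned training data from the effect of noisy evaluation labels.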