
# Fast Poison-Point Finder — Jupyter Walkthrough

This notebook walks you through the **#7 Fast Poison-Point Finder** pipeline using CIFAR-10:
1. Environment check
2. Create poisoned indices (10% poison rate, blended trigger)
3. Train clean & poisoned models
4. Evaluate clean accuracy & attack success rate (ASR)
5. Compute embeddings
6. Rank suspicious samples (Mahalanobis, LOF, Fusion)
7. Evaluate Precision@k
8. (Optional) Plot results

> Tip: Run this notebook from inside the project folder after installing requirements.
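As a quick guard, a cell like the following can confirm that the notebook's working directory is the project root before any script is called. This is a minimal sketch: the helper name `in_project_root` is hypothetical, and it assumes the root can be recognised by the `scripts/` folder that the commands in this walkthrough call into.

```python
from pathlib import Path

def in_project_root(path=".", marker="scripts"):
    """Return True if `path` contains the marker directory (assumed here: scripts/)."""
    return (Path(path) / marker).is_dir()

if not in_project_root():
    # Relative paths such as scripts/... and data/... would not resolve from here
    print("Warning: no scripts/ folder found in", Path.cwd())
    print("Change into the project folder before running the cells.")
```

If the warning appears, restarting Jupyter from the project folder is usually the simplest fix.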



## 0) Environment setup (run locally on your machine)
Open a terminal in the project folder and run:
```bash
python -m venv .venv
# Linux/Mac:
source .venv/bin/activate
# Windows (PowerShell):
# .venv\Scripts\Activate.ps1

pip install -r requirements.txt
pip install jupyter
jupyter lab   # or: jupyter notebook
```
Then open this notebook (`FastPoisonFinder.ipynb`).


In [17]:
# Sanity check: Python, Torch, and device
import sys
import torch, torchvision
print("Python:", sys.version.split()[0])
print("Torch:", torch.__version__, "Torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())


## 1) Create poisoned indices
Pick 10% of CIFAR-10 training images to poison and set the target class to `0`.
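For orientation, the selection step amounts to a seeded random draw over the training indices. The sketch below is not the contents of `make_poison_indices.py`; the function name, the choice to sample from all classes (including the target class), and every JSON field other than `indices` (the key read by the plotting cell at the end of this notebook) are assumptions made for illustration.

```python
import random

def make_poison_indices(n_train=50_000, poison_rate=0.10, target_class=0, seed=1):
    """Draw a reproducible random subset of training indices to poison."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    n_poison = int(round(n_train * poison_rate))
    indices = sorted(rng.sample(range(n_train), n_poison))
    # Only the "indices" key is known from this notebook; the rest is assumed metadata
    return {
        "indices": indices,
        "target_class": target_class,
        "poison_rate": poison_rate,
        "seed": seed,
    }

meta = make_poison_indices()
print(len(meta["indices"]), "indices selected")
```

Fixing the seed makes the selection repeatable, which matters because training, embedding, and evaluation all need to agree on the same poisoned set.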


In [13]:

!python scripts/make_poison_indices.py --poison-rate 0.10 --target-class 0 --seed 1 --out data/poisoned_blended_p10_s1.json





## 2) Train models
- Clean model (no poisoning)
- Poisoned model (applies the blended white-square trigger to the poisoned indices and relabels them to the target class during training)
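To make the poisoning step concrete, here is a minimal sketch of a blended white-square trigger. It is an illustration, not the code in `train_model.py`: the function names and the bottom-right placement are assumptions, while the defaults `size=5` and `alpha=0.2` mirror the flags passed to the ASR evaluation later in this notebook.

```python
import numpy as np

def apply_blended_square(img, size=5, alpha=0.2):
    """Blend a white square into the bottom-right corner of an HxWxC float image in [0, 1]."""
    out = img.astype(np.float32).copy()  # never modify the caller's image
    patch = out[-size:, -size:, :]
    # Convex blend: alpha * white (1.0) + (1 - alpha) * original pixel
    out[-size:, -size:, :] = alpha * 1.0 + (1.0 - alpha) * patch
    return out

def poison_sample(img, label, target_class=0, **trigger_kwargs):
    """Return the triggered image with its label replaced by the target class."""
    return apply_blended_square(img, **trigger_kwargs), target_class

# Dummy black CIFAR-sized image, original label 7
img = np.zeros((32, 32, 3), dtype=np.float32)
poisoned, new_label = poison_sample(img, label=7)
```

With a low `alpha` the square is faint, which is what makes a blended trigger harder to spot by eye than an opaque patch.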


In [None]:

# This will auto-download CIFAR-10 the first time
!python scripts/train_model.py --epochs 40 --save ckpt/clean_resnet18.pth
!python scripts/train_model.py --epochs 40 --poison-indices data/poisoned_blended_p10_s1.json --save ckpt/poisoned_resnet18.pth



## 3) Evaluate Clean Accuracy and Attack Success Rate (ASR)
ASR is measured by adding the trigger to **all test images** and checking how often the model predicts the target class.
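The two metrics reduce to simple counting once predictions are available. The sketch below assumes predictions have already been collected as plain lists of class ids; the function names are hypothetical and this is not the code in `eval_asr.py`.

```python
def attack_success_rate(triggered_preds, target_class=0):
    """Fraction of triggered test images that the model assigns to the target class."""
    if not triggered_preds:
        raise ValueError("no predictions given")
    hits = sum(1 for p in triggered_preds if p == target_class)
    return hits / len(triggered_preds)

def clean_accuracy(preds, labels):
    """Fraction of untriggered test images classified correctly."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and of equal length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy example: three of four triggered images are pushed to class 0
print(attack_success_rate([0, 0, 3, 0]))
```

A common variant excludes test images whose true label is already the target class, since they count as successes even without a working backdoor; the definition used here keeps all test images, as described above.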


In [None]:

!python scripts/eval_asr.py --model ckpt/clean_resnet18.pth
!python scripts/eval_asr.py --model ckpt/poisoned_resnet18.pth --target 0 --size 5 --alpha 0.2



## 4) Compute embeddings for the (poisoned) training set
We use a frozen ImageNet-pretrained ResNet-18 backbone and cache the embeddings, labels, and indices to a `.npz` file.
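The caching half of this step can be sketched with NumPy alone. The array names `embeddings`, `labels`, and `indices` inside the archive are assumptions (the actual keys written by `compute_embeddings.py` are not shown in this notebook), as are the helper names.

```python
import numpy as np

def save_embeddings(path, embeddings, labels, indices):
    """Store per-sample features with their labels and dataset indices in one .npz file."""
    embeddings = np.asarray(embeddings, dtype=np.float32)
    labels = np.asarray(labels, dtype=np.int64)
    indices = np.asarray(indices, dtype=np.int64)
    if not (len(embeddings) == len(labels) == len(indices)):
        raise ValueError("embeddings, labels and indices must have the same length")
    np.savez_compressed(path, embeddings=embeddings, labels=labels, indices=indices)

def load_embeddings(path):
    """Read the three arrays back; the archive is closed when the block exits."""
    with np.load(path) as data:
        return data["embeddings"], data["labels"], data["indices"]
```

Keeping the dataset indices next to the features means a ranked row can always be mapped back to its original training image, even if the embedding pass shuffled or filtered the data.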


In [None]:

!python scripts/compute_embeddings.py --poison-indices data/poisoned_blended_p10_s1.json --out data/embeddings_blended_p10_s1.npz



## 5) Rank suspicious samples
Try three ranking methods: **Mahalanobis** distance, **LOF** (Local Outlier Factor), and **Fusion** (equal weights). Save the Top-200 indices for each method.
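As a rough picture of what such a ranking involves, the sketch below scores each sample by its squared Mahalanobis distance to its own class mean and fuses several score vectors with equal weights. It is one plausible reading, not the implementation in `rank_suspicious.py`: the per-class covariance, the `eps` regulariser, the rank normalisation before fusion, and all function names are assumptions.

```python
import numpy as np

def mahalanobis_scores(X, y, eps=1e-3):
    """Squared Mahalanobis distance of each sample to its class mean (higher = more suspicious)."""
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y)
    scores = np.zeros(len(X))
    for c in np.unique(y):
        mask = y == c
        Xc = X[mask] - X[mask].mean(axis=0)
        # Regularise the class covariance so it stays invertible (eps is an assumed value)
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + eps * np.eye(X.shape[1])
        inv = np.linalg.inv(cov)
        scores[mask] = np.einsum("ij,jk,ik->i", Xc, inv, Xc)
    return scores

def rank_normalize(s):
    """Map scores to [0, 1] by rank so differently scaled detectors become comparable."""
    ranks = np.argsort(np.argsort(s))
    return ranks / max(len(s) - 1, 1)

def fuse(score_lists, weights=None):
    """Weighted average of rank-normalised scores; equal weights by default."""
    S = np.stack([rank_normalize(np.asarray(s)) for s in score_lists])
    w = np.full(len(S), 1.0 / len(S)) if weights is None else np.asarray(weights, dtype=float)
    return w @ S

def top_k(scores, k):
    """Indices of the k highest scores, most suspicious first."""
    return np.argsort(-np.asarray(scores))[:k]
```

LOF scores could come from scikit-learn's `LocalOutlierFactor` (negating `negative_outlier_factor_` so that higher means more anomalous) and be passed to `fuse` alongside the Mahalanobis scores. Each class needs at least two samples for the covariance estimate to be defined.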


In [None]:

!python scripts/rank_suspicious.py --embeddings data/embeddings_blended_p10_s1.npz --method mahalanobis --topk 200 --out data/rank_mahal_top200.json
!python scripts/rank_suspicious.py --embeddings data/embeddings_blended_p10_s1.npz --method lof         --topk 200 --out data/rank_lof_top200.json
!python scripts/rank_suspicious.py --embeddings data/embeddings_blended_p10_s1.npz --method fusion      --topk 200 --out data/rank_fusion_top200.json



## 6) Evaluate Precision@k
Precision@k is the fraction of the top-k ranked indices that appear in the ground-truth poisoned indices.
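Written out, with $R_k$ the set of the top-$k$ ranked training indices and $P$ the set of ground-truth poisoned indices (both symbols are introduced here for illustration):

$$
\mathrm{Precision@}k = \frac{|R_k \cap P|}{k}
$$

A ranking that orders samples at random has an expected Precision@k equal to the poison rate, 0.10 in this setup, so that is the baseline a detector has to beat. As a purely hypothetical example, if 150 of the Top-200 indices were truly poisoned, Precision@200 would be $150/200 = 0.75$.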


In [None]:

!python scripts/eval_precision.py --rank data/rank_mahal_top200.json --poison-indices data/poisoned_blended_p10_s1.json --k 200
!python scripts/eval_precision.py --rank data/rank_lof_top200.json   --poison-indices data/poisoned_blended_p10_s1.json --k 200
!python scripts/eval_precision.py --rank data/rank_fusion_top200.json --poison-indices data/poisoned_blended_p10_s1.json --k 200



## 7) (Optional) Quick plots
The cell below plots the Precision@k curve for each ranking method. A bar chart of clean accuracy vs. ASR can be built the same way from the evaluation output of step 3.


In [None]:

# Plot Precision@k for several k by truncating each saved Top-200 ranking
import json

import matplotlib.pyplot as plt
import numpy as np

def precision_at_k(ranked_indices, poisoned_set, k):
    """Fraction of the first k ranked indices that are truly poisoned."""
    return len(set(ranked_indices[:k]) & poisoned_set) / k

def load_json(path):
    # Context manager so the file handle is closed after reading
    with open(path) as f:
        return json.load(f)

poisoned = set(load_json("data/poisoned_blended_p10_s1.json")["indices"])

methods = ["rank_mahal_top200.json", "rank_lof_top200.json", "rank_fusion_top200.json"]
labels = ["Mahalanobis", "LOF", "Fusion"]
ks = [50, 100, 200]  # each k must not exceed the saved list length (200)

# One row per method, one column per k
vals = []
for fn in methods:
    ranked = load_json("data/" + fn)["indices_ranked"]
    vals.append([precision_at_k(ranked, poisoned, k) for k in ks])
vals = np.array(vals)

for lab, row in zip(labels, vals):
    plt.plot(ks, row, marker="o", label=lab)
plt.xlabel("k")
plt.ylabel("Precision@k")
plt.title("Precision@k on Blended-10%")
plt.legend()
plt.show()
