binyxu/PRISM

PRISM: Prototype Refinement & Inspection via Statistical Monitoring

License: MIT

From Internal Diagnosis to External Auditing: A VLM-Driven Paradigm for Online Test-Time Backdoor Defense

Under review at ICML 2026

PRISM is a test-time backdoor defense framework that shifts the paradigm from internal model diagnosis to Online External Semantic Auditing. By using universal Vision-Language Models (VLMs) as independent auditors, PRISM decouples the defense mechanism from the compromised victim model and neutralizes diverse attacks (including clean-image attacks and dynamic triggers) without access to training data or model weights.

🚀 Key Features

  • External Semantic Auditing: Uses a frozen, pre-trained VLM (e.g., CLIP, Qwen-VL, Gemma) to audit the prediction stream, eliminating the attack surface of weight-manipulation attacks.
  • Hybrid VLM Teacher: Overcomes the domain gap of general VLMs by fusing static text anchors with Visual Prototypes learned online from the test stream.
  • Adaptive Router: Employs a Cornish-Fisher expansion to model non-Gaussian logit margins and dynamically calibrates gating thresholds to distinguish benign samples from backdoor triggers.
  • Online Evolution: Utilizes Cumulative Moving Average (CMA) for stable, memory-efficient online updates of statistical moments and prototypes.
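
As a rough illustration of the Adaptive Router idea, the sketch below applies the standard first-order Cornish-Fisher skewness correction to a Gaussian quantile and turns it into a threshold on the logit margin. It is not taken from the PRISM code: the function names, the use of gamma as the base quantile, and the restriction to the skewness term are all assumptions made for illustration.

```python
import math


def sample_moments(values):
    """Return (mean, std, skewness) of a list of margins (population estimates)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    if std == 0.0:
        return mean, 0.0, 0.0
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    return mean, std, skew


def cornish_fisher_quantile(z, skew):
    """Gaussian quantile z adjusted by the first Cornish-Fisher (skewness) term."""
    return z + (z * z - 1.0) * skew / 6.0


def adaptive_threshold(mean, std, skew, gamma=-2.0):
    """Hypothetical gating threshold: mean + std * skew-adjusted quantile.

    Assumption: gamma plays the role of the base Gaussian quantile.
    """
    return mean + std * cornish_fisher_quantile(gamma, skew)


# With zero skewness the threshold reduces to the Gaussian rule mean + gamma * std.
print(adaptive_threshold(5.0, 2.0, 0.0))
# Positive skewness shifts the lower-tail threshold upward.
print(adaptive_threshold(5.0, 2.0, 0.6))
```

In this sketch a sample whose margin falls below the threshold would be routed to the auditor; how PRISM actually uses the threshold is defined in prism_online_adaptive.py.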

📊 Performance

PRISM achieves state-of-the-art performance across 17 datasets and 11 attack types (Classic, Dynamic, Clean-Label, Clean-Image).

  • CIFAR-10: Suppresses Attack Success Rate (ASR) to <1% while improving Clean Accuracy.
  • Robustness: Effective against advanced adaptive attacks (Flooding, Periodic, Mixed) and typically robust to poison rates up to 10%.
  • Efficiency: High inference efficiency ($O(1)$ updates) suitable for real-time streams.
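
The constant-cost update mentioned above can be illustrated with a plain Cumulative Moving Average. This is a minimal sketch, not the repository's implementation: the class name and the list-based vectors are assumptions, and the same recurrence could be applied to other running statistics.

```python
class CumulativeMovingAverage:
    """Running mean of vectors; each update touches only the stored mean."""

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim

    def update(self, x):
        """Fold one new sample into the mean: m_n = m_{n-1} + (x - m_{n-1}) / n."""
        self.count += 1
        n = self.count
        self.mean = [m + (xi - m) / n for m, xi in zip(self.mean, x)]
        return self.mean


# No past samples are stored, so memory stays fixed as the stream grows.
proto = CumulativeMovingAverage(dim=2)
proto.update([1.0, 2.0])
print(proto.update([3.0, 4.0]))  # mean of the two samples
```

The per-sample cost depends only on the vector dimension, not on how many samples have been seen.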

📂 Directory Structure

PRISM/
├── config/PRISM/                 # Configuration files (gamma, batch_size, vlm_type)
├── prism.py                      # Base PRISM defense implementation
├── prism_load_prob.py            # Large VLM optimization (precomputed probs)
├── prism_online_adaptive.py      # Core implementation: Hybrid Teacher + Adaptive Router
├── prism_online_adaptive_stream.py # Streaming version for continuous inference
├── prism_ablation.py             # Code for component analysis (w/o Update, w/o Skewness)
├── test_vlm_dataset.py           # Utility to benchmark VLM zero-shot performance
├── run_prism_base.sh             # Run basic dual-stream inference
├── run_prism_vlarge_vlm.sh       # Run defense with Generative VLMs (Qwen, Gemma)
├── run_prism_online_adaptive.sh  # Run full PRISM (Recommended)
├── run_prism_online_adaptive_stream.sh # Run streaming simulation
├── run_prism_ablation.sh         # Reproduce ablation study results
└── test_vlm_dataset.sh           # Evaluate VLM domain gaps

🛠️ Usage

1. Online Adaptive Defense (Recommended)

To run the full PRISM framework with the Hybrid VLM Teacher and Adaptive Router (Cornish-Fisher thresholding):

bash run_prism_online_adaptive.sh

This script initializes the safe warm-up phase and enables continuous prototype refinement.

2. Large VLM Support (Generative Models)

For computationally intensive Generative VLMs (e.g., Qwen2.5-VL, Gemma-3) utilizing Key-Value (KV) cache strategies:

bash run_prism_vlarge_vlm.sh

3. Base Defense

To run the dual-stream inference without online statistical updates (static auditing):

bash run_prism_base.sh

4. Ablation Studies

To reproduce the component analysis (e.g., impact of Skewness Correction or Prototype Refinement):

bash run_prism_ablation.sh

5. VLM Performance Benchmark

To evaluate the zero-shot baseline of different VLM backbones on specific datasets:

bash test_vlm_dataset.sh

⚙️ Configuration

Modify the YAML files in config/PRISM/ to adjust defense parameters:

  • gamma ($\gamma$): The base confidence coefficient for the Adaptive Router (default: -2, reported as optimal).
  • vlm_type: Switch between Embedding models (CLIP, SigLIP) and Generative models.
  • alpha: Balancing coefficient for the Hybrid Fusion of text and visual prototypes.
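
To make the role of alpha concrete, here is a small sketch of one way text anchors and visual prototypes could be fused. It is an assumption-laden illustration, not the PRISM code: the convex combination, the convention that alpha weights the text anchor, and every function name are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse_prototypes(text_anchor, visual_proto, alpha):
    """Hypothetical hybrid prototype: alpha * text + (1 - alpha) * visual, renormalized."""
    mixed = [alpha * t + (1.0 - alpha) * p for t, p in zip(text_anchor, visual_proto)]
    return l2_normalize(mixed)


def classify(embedding, fused_prototypes):
    """Return the index of the fused prototype with the highest cosine similarity."""
    e = l2_normalize(embedding)
    scores = [sum(a * b for a, b in zip(e, f)) for f in fused_prototypes]
    return max(range(len(scores)), key=scores.__getitem__)


# Two classes with toy 2-D embeddings; alpha = 0.5 weights both sources equally.
fused = [
    fuse_prototypes([1.0, 0.0], [0.8, 0.2], alpha=0.5),
    fuse_prototypes([0.0, 1.0], [0.1, 0.9], alpha=0.5),
]
print(classify([0.9, 0.1], fused))
```

Under this convention, alpha = 1 falls back to the static text anchors and alpha = 0 relies entirely on the visual prototypes; check the YAML files in config/PRISM/ for the convention the repository actually uses.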

📝 Citation

If you find this work useful, please cite our paper:

@article{prism2026,
  title={From Internal Diagnosis to External Auditing: A VLM-Driven Paradigm for Online Test-Time Backdoor Defense},
  author={Anonymous Authors},
  journal={Under Review at ICML},
  year={2026}
}
