From Internal Diagnosis to External Auditing: A VLM-Driven Paradigm for Online Test-Time Backdoor Defense
> Under review at ICML 2026
PRISM is a novel test-time backdoor defense framework that shifts the paradigm from internal model diagnosis to Online External Semantic Auditing. By harnessing Universal Vision-Language Models (VLMs) as independent auditors, PRISM decouples the defense mechanism from the compromised victim model, effectively neutralizing diverse attacks (including clean-image and dynamic triggers) without access to training data or model weights.
- External Semantic Auditing: Uses a frozen, pre-trained VLM (e.g., CLIP, Qwen-VL, Gemma) to audit the prediction stream, eliminating the attack surface of weight-manipulation attacks.
- Hybrid VLM Teacher: Overcomes the domain gap of general VLMs by fusing static text anchors with Visual Prototypes learned online from the test stream.
- Adaptive Router: Employs the Cornish-Fisher expansion to model non-Gaussian logit margins and dynamically calibrates gating thresholds to distinguish benign samples from backdoor-triggered ones.
- Online Evolution: Utilizes Cumulative Moving Average (CMA) for stable, memory-efficient online updates of statistical moments and prototypes.
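As an illustration of the Adaptive Router idea above, here is a minimal sketch of a Cornish-Fisher-adjusted threshold. It uses the standard Cornish-Fisher quantile formula; how PRISM actually combines it with gamma is not stated in this README, so treating gamma as the base Gaussian quantile is an assumption, and the function names (`cornish_fisher_quantile`, `sample_moments`, `gating_threshold`) are hypothetical rather than taken from the repository.

```python
import math


def cornish_fisher_quantile(z, skew, excess_kurt=0.0):
    """Adjust a standard-normal quantile z for skewness and excess kurtosis
    using the standard fourth-order Cornish-Fisher expansion."""
    return (
        z
        + (z ** 2 - 1.0) * skew / 6.0
        + (z ** 3 - 3.0 * z) * excess_kurt / 24.0
        - (2.0 * z ** 3 - 5.0 * z) * skew ** 2 / 36.0
    )


def sample_moments(xs):
    """Return mean, standard deviation, skewness, and excess kurtosis."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    if m2 == 0.0:
        # Degenerate case: all values identical, so no shape information.
        return mean, 0.0, 0.0, 0.0
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    std = math.sqrt(m2)
    return mean, std, m3 / std ** 3, m4 / m2 ** 2 - 3.0


def gating_threshold(margins, gamma=-2.0):
    """Assumed usage: gamma plays the role of the base quantile, and the
    threshold is placed at mean + std * (corrected quantile)."""
    mean, std, skew, kurt = sample_moments(margins)
    return mean + std * cornish_fisher_quantile(gamma, skew, kurt)
```

With zero skewness and zero excess kurtosis the corrected quantile reduces to gamma itself, which is the Gaussian case; a non-zero skewness shifts the threshold to account for the asymmetric margin distribution.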
PRISM achieves state-of-the-art performance across 17 datasets and 11 attack types (Classic, Dynamic, Clean-Label, Clean-Image).
- CIFAR-10: Suppresses Attack Success Rate (ASR) to <1% while improving Clean Accuracy.
- Robustness: Effective against advanced adaptive attacks (Flooding, Periodic, Mixed) and typically robust to poison rates up to 10%.
- Efficiency: High inference efficiency ($O(1)$ updates), suitable for real-time streams.
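The constant-cost updates mentioned above follow from the Cumulative Moving Average recurrence, where the mean after n samples is updated as mean += (x - mean) / n. The sketch below shows this for a single prototype vector; the class name `RunningPrototype` is hypothetical and not part of the repository, and the cost is constant in the number of samples seen (it still scales with the embedding dimension).

```python
class RunningPrototype:
    """Cumulative moving average of embedding vectors.

    Stores only the current mean and a sample counter, so memory use does
    not grow with the length of the test stream.
    """

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim

    def update(self, embedding):
        # CMA recurrence: mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n
        self.count += 1
        inv = 1.0 / self.count
        self.mean = [m + (x - m) * inv for m, x in zip(self.mean, embedding)]
        return self.mean


# Example: feed two 2-D embeddings from a stream into one class prototype.
proto = RunningPrototype(dim=2)
proto.update([1.0, 2.0])
proto.update([3.0, 4.0])
```

Running averages of other per-sample quantities can be maintained with the same recurrence.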
PRISM/
├── config/PRISM/ # Configuration files (gamma, batch_size, vlm_type)
├── prism.py # Base PRISM defense implementation
├── prism_load_prob.py # VLarge VLM optimization (precomputed probs)
├── prism_online_adaptive.py # Core implementation: Hybrid Teacher + Adaptive Router
├── prism_online_adaptive_stream.py # Streaming version for continuous inference
├── prism_ablation.py # Code for component analysis (w/o Update, w/o Skewness)
├── test_vlm_dataset.py # Utility to benchmark VLM zero-shot performance
├── run_prism_base.sh # Run basic dual-stream inference
├── run_prism_vlarge_vlm.sh # Run defense with Generative VLMs (Qwen, Gemma)
├── run_prism_online_adaptive.sh # Run full PRISM (Recommended)
├── run_prism_online_adaptive_stream.sh # Run streaming simulation
├── run_prism_ablation.sh # Reproduce ablation study results
└── test_vlm_dataset.sh # Evaluate VLM domain gaps
To run the full PRISM framework with the Hybrid VLM Teacher and Adaptive Router (Cornish-Fisher thresholding):
bash run_prism_online_adaptive.sh
This script initializes the safe warm-up phase and enables continuous prototype refinement.
For computationally intensive Generative VLMs (e.g., Qwen2.5-VL, Gemma-3) utilizing Key-Value (KV) cache strategies:
bash run_prism_vlarge_vlm.sh
To run the dual-stream inference without online statistical updates (static auditing):
bash run_prism_base.sh
To reproduce the component analysis (e.g., impact of Skewness Correction or Prototype Refinement):
bash run_prism_ablation.sh
To evaluate the zero-shot baseline of different VLM backbones on specific datasets:
bash test_vlm_dataset.sh
Modify the YAML files in config/PRISM/ to adjust defense parameters:
- gamma: The base confidence coefficient for the Adaptive Router (Default optimal: -2).
- vlm_type: Switch between Embedding models (CLIP, SigLIP) and Generative models.
- alpha: Balancing coefficient for the Hybrid Fusion of text and visual prototypes.
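To make the role of alpha concrete, the sketch below fuses text-anchor and visual-prototype similarities as a convex combination. This README does not give the exact fusion rule, so the formula alpha * text_similarity + (1 - alpha) * prototype_similarity is an assumption, and `cosine` and `hybrid_scores` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def hybrid_scores(image_emb, text_anchors, visual_protos, alpha=0.5):
    """Per-class score under the assumed convex fusion rule.

    text_anchors[c] and visual_protos[c] are the text embedding and the
    visual prototype for class c; alpha=1.0 uses text anchors only and
    alpha=0.0 uses visual prototypes only.
    """
    return [
        alpha * cosine(image_emb, t) + (1.0 - alpha) * cosine(image_emb, p)
        for t, p in zip(text_anchors, visual_protos)
    ]


# Example with two classes and 2-D embeddings (toy values).
scores = hybrid_scores(
    image_emb=[1.0, 0.0],
    text_anchors=[[1.0, 0.0], [0.0, 1.0]],
    visual_protos=[[1.0, 0.0], [0.0, 1.0]],
    alpha=0.5,
)
predicted_class = max(range(len(scores)), key=scores.__getitem__)
```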
If you find this work useful, please cite our paper:
@article{prism2026,
title={From Internal Diagnosis to External Auditing: A VLM-Driven Paradigm for Online Test-Time Backdoor Defense},
author={Anonymous Authors},
journal={Under Review at ICML},
year={2026}
}