Awesome Data Poisoning and Backdoor Attacks

Disclaimer: This repository may not include all relevant papers in this area. Use at your own discretion and please contribute any missing or overlooked papers via pull request.

A curated list of papers and resources on data poisoning, backdoor attacks, and defenses against them.

Surveys

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses (TPAMI 2022) [paper]
A Survey on Data Poisoning Attacks and Defenses (DSC 2022) [paper]

Benchmark

APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (arXiv 2023) [paper] [code]
Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (ICML 2021) [paper] [code]

2024

Test-Time Poisoning Attacks Against Test-Time Adaptation Models (S&P 2024) [paper] [code]
LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors (NDSS 2024) [paper]

ICLR

Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark (ICLR 2024) [paper]
Towards Reliable and Efficient Backdoor Trigger Inversion via Decoupling Benign Features (ICLR 2024) [paper]
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection (ICLR 2024) [paper]
Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency (ICLR 2024) [paper]
Adversarial Feature Map Pruning for Backdoor (ICLR 2024) [paper]
Safe and Robust Watermark Injection with a Single OoD Image (ICLR 2024) [paper]
Efficient Backdoor Attacks for Deep Neural Networks in Real-world Scenarios (ICLR 2024) [paper]
Backdoor Contrastive Learning via Bi-level Trigger Optimization (ICLR 2024) [paper]
BadEdit: Backdooring Large Language Models by Model Editing (ICLR 2024) [paper]
Backdoor Federated Learning by Poisoning Backdoor-Critical Layers (ICLR 2024) [paper]
Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection (ICLR 2024) [paper]
Influencer Backdoor Attack on Semantic Segmentation (ICLR 2024) [paper]
Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective (ICLR 2024) [paper]
Universal Backdoor Attacks (ICLR 2024) [paper]
Demystifying Poisoning Backdoor Attacks from a Statistical Perspective (ICLR 2024) [paper]
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models (ICLR 2024) [paper]
Rethinking CNN’s Generalization to Backdoor Attack from Frequency Domain (ICLR 2024) [paper]
Like Oil and Water: Group Robustness Methods and Poisoning Defenses Don't Mix (ICLR 2024) [paper]
VDC: Versatile Data Cleanser for Detecting Dirty Samples via Visual-Linguistic Inconsistency (ICLR 2024) [paper]
Chameleon: Increasing Label-Only Membership Leakage with Adaptive Poisoning (ICLR 2024) [paper]
Universal Jailbreak Backdoors from Poisoned Human Feedback (ICLR 2024) [paper]
Teach LLMs to Phish: Stealing Private Information from Language Models (ICLR 2024) [paper]

CVPR

Data Poisoning based Backdoor Attacks to Contrastive Learning (CVPR 2024) [paper] [code]
Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving (CVPR 2024) [paper]
Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment (CVPR 2024)
BrainWash: A Poisoning Attack to Forget in Continual Learning (CVPR 2024) [paper]
Not All Prompts Are Secure: A Switchable Backdoor Attack against Pre-trained Models (CVPR 2024)
Test-Time Backdoor Defense via Detecting and Repairing (CVPR 2024) [paper]
Nearest Is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks (CVPR 2024)
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning (CVPR 2024) [paper] [code]
Temperature-based Backdoor Attacks on Thermal Infrared Object Detection (CVPR 2024)
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning (CVPR 2024) [paper]
Re-thinking Data Availability Attacks Against Deep Neural Networks (CVPR 2024) [paper]

NAACL

From Shortcuts to Triggers: Backdoor Defense with Denoised PoE (NAACL 2024) [paper] [code]
Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors (NAACL 2024) [paper] [code]
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (NAACL 2024) [paper]
Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models (NAACL 2024) [paper]
PromptFix: Few-shot Backdoor Removal via Adversarial Prompt Tuning (NAACL 2024)
Backdoor Attacks on Multilingual Machine Translation (NAACL 2024) [paper]
Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections (NAACL 2024) [paper]
Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection (NAACL 2024) [paper] [code]
Composite Backdoor Attacks Against Large Language Models (NAACL 2024 Findings) [paper] [code]
Task-Agnostic Detector for Insertion-Based Backdoor Attacks (NAACL 2024 Findings) [paper]
Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning (NAACL 2024 Findings) [paper]

2023

arXiv

Silent Killer: Optimizing Backdoor Trigger Yields a Stealthy and Powerful Data Poisoning Attack (arXiv 2023) [code]
Exploring the Limits of Indiscriminate Data Poisoning Attacks (arXiv 2023) [paper]
Students Parrot Their Teachers: Membership Inference on Model Distillation (arXiv 2023) [paper]
More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models (arXiv 2023) [paper] [code]
Feature Partition Aggregation: A Fast Certified Defense Against a Union of Sparse Adversarial Attacks (arXiv 2023) [paper] [code]
ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms (arXiv 2023) [paper] [code]
Temporal Robustness against Data Poisoning (arXiv 2023) [paper]
A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification (arXiv 2023) [paper]
Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks (arXiv 2023) [paper] [code]
Backdoor Attacks with Input-unique Triggers in NLP (arXiv 2023) [paper]
Do Backdoors Assist Membership Inference Attacks? (arXiv 2023) [paper]
Black-box Backdoor Defense via Zero-shot Image Purification (arXiv 2023) [paper]
Influencer Backdoor Attack on Semantic Segmentation (arXiv 2023) [paper]
TrojViT: Trojan Insertion in Vision Transformers (arXiv 2023) [paper]
Mole Recruitment: Poisoning of Image Classifiers via Selective Batch Sampling (arXiv 2023) [paper] [code]
Poisoning Web-Scale Training Datasets is Practical (arXiv 2023) [paper]
Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization (arXiv 2023) [paper]
MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion (arXiv 2023) [paper]
Launching a Robust Backdoor Attack under Capability Constrained Scenarios (arXiv 2023) [paper]
Certifiable Robustness for Naive Bayes Classifiers (arXiv 2023) [paper] [code]
Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks (arXiv 2023) [paper] [code]
Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models (arXiv 2023) [paper] [code]
Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning (arXiv 2023) [paper]
BadSAM: Exploring Security Vulnerabilities of SAM via Backdoor Attacks (arXiv 2023) [paper]
Backdoor Learning on Sequence to Sequence Models (arXiv 2023) [paper]
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (arXiv 2023) [paper]
Evil from Within: Machine Learning Backdoors through Hardware Trojans (arXiv 2023) [paper]

ICLR

Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning (ICLR 2023) [paper]
Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only (ICLR 2023) [paper]
TrojText: Test-time Invisible Textual Trojan Insertion (ICLR 2023) [paper] [code]
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning? (ICLR 2023) [paper] [code]
Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks (ICLR 2023) [paper] [code]
Revisiting the Assumption of Latent Separability for Backdoor Defenses (ICLR 2023) [paper] [code]
Few-shot Backdoor Attacks via Neural Tangent Kernels (ICLR 2023) [paper] [code]
SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency (ICLR 2023) [paper] [code]
Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective (ICLR 2023) [paper] [code]
Provable Robustness against Wasserstein Distribution Shifts via Input Randomization (ICLR 2023) [paper]
Don’t forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure (ICLR 2023) [paper]
Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors (ICLR 2023) [paper] [code]
Towards Robustness Certification Against Universal Perturbations (ICLR 2023) [paper] [code]
Understanding Influence Functions and Datamodels via Harmonic Analysis (ICLR 2023) [paper]
Distilling Cognitive Backdoor Patterns within an Image (ICLR 2023) [paper] [code]
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning (ICLR 2023) [paper] [code]
UNICORN: A Unified Backdoor Trigger Inversion Framework (ICLR 2023) [paper] [code]

ICML

Poisoning Language Models During Instruction Tuning (ICML 2023) [paper] [code]
Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning (ICML 2023) [paper] [code]
Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression (ICML 2023) [paper] [code]
Poisoning Generative Replay in Continual Learning to Promote Forgetting (ICML 2023) [paper] [code]
Exploring Model Dynamics for Accumulative Poisoning Discovery (ICML 2023) [paper] [code]
Data Poisoning Attacks Against Multimodal Encoders (ICML 2023) [paper] [code]
Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks (ICML 2023) [paper] [code]
Run-Off Election: Improved Provable Defense against Data Poisoning Attacks (ICML 2023) [paper] [code]
Revisiting Data-Free Knowledge Distillation with Poisoned Teachers (ICML 2023) [paper] [code]
Certified Robust Neural Networks: Generalization and Corruption Resistance (ICML 2023) [paper] [code]
Understanding Backdoor Attacks through the Adaptability Hypothesis (ICML 2023) [paper]
Robust Collaborative Learning with Linear Gradient Overhead (ICML 2023) [paper] [code]
Graph Contrastive Backdoor Attacks (ICML 2023) [paper]
Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) [paper] [code]
Rethinking Backdoor Attacks (ICML 2023) [paper]
UMD: Unsupervised Model Detection for X2X Backdoor Attacks (ICML 2023) [paper]
LeadFL: Client Self-Defense against Model Poisoning in Federated Learning (ICML 2023) [paper] [code]

NeurIPS

BadTrack: A Poison-Only Backdoor Attack on Visual Object Tracking (NeurIPS 2023) [paper]
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP (NeurIPS 2023) [paper]
Robust Contrastive Language-Image Pretraining against Data Poisoning and Backdoor Attacks (NeurIPS 2023) [paper] [code]
Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features (NeurIPS 2023) [paper] [PyTorch code] [MindSpore code]
What Distributions are Robust to Indiscriminate Poisoning Attacks for Linear Learners? (NeurIPS 2023) [paper]
Label Poisoning is All You Need (NeurIPS 2023) [paper]
Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks (NeurIPS 2023) [paper] [code]
Temporal Robustness against Data Poisoning (NeurIPS 2023) [paper]
VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models (NeurIPS 2023) [paper] [code]
CBD: A Certified Backdoor Detector Based on Local Dominant Probability (NeurIPS 2023) [paper]
BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning (NeurIPS 2023) [paper]
Fed-FA: Theoretically Modeling Client Data Divergence for Federated Language Backdoor Defense (NeurIPS 2023) [paper]
Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples (NeurIPS 2023) [paper] [PyTorch code] [MindSpore code]
IBA: Towards Irreversible Backdoor Attacks in Federated Learning (NeurIPS 2023) [paper] [code]
Towards Stable Backdoor Purification through Feature Shift Tuning (NeurIPS 2023) [paper] [code]
Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks (NeurIPS 2023) [paper] [code]
Lockdown: Backdoor Defense for Federated Learning with Isolated Subspace Training (NeurIPS 2023) [paper] [code]
A3FL: Adversarially Adaptive Backdoor Attacks to Federated Learning (NeurIPS 2023) [paper] [code]
FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning (NeurIPS 2023) [paper] [code]
A Unified Detection Framework for Inference-Stage Backdoor Defenses (NeurIPS 2023) [paper]
Black-box Backdoor Defense via Zero-shot Image Purification (NeurIPS 2023) [paper] [code]
Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots (NeurIPS 2023) [paper]

CVPR

Backdoor Defense via Deconfounded Representation Learning (CVPR 2023) [paper] [code]
Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks (CVPR 2023) [paper]
CUDA: Convolution-based Unlearnable Datasets (CVPR 2023) [paper] [code]
Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger (CVPR 2023) [paper]
Single Image Backdoor Inversion via Robust Smoothed Classifiers (CVPR 2023) [paper] [code]
Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples (CVPR 2023) [paper] [code]
Backdoor Defense via Adaptively Splitting Poisoned Dataset (CVPR 2023) [paper] [code]
Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency (CVPR 2023) [paper] [code]
Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning (CVPR 2023) [paper] [code]
Color Backdoor: A Robust Poisoning Attack in Color Space (CVPR 2023) [paper]
How to Backdoor Diffusion Models? (CVPR 2023) [paper] [code]
Backdoor Cleansing With Unlabeled Data (CVPR 2023) [paper] [code]
MEDIC: Remove Model Backdoors via Importance Driven Cloning (CVPR 2023) [paper] [code]
Architectural Backdoors in Neural Networks (CVPR 2023) [paper]
Detecting Backdoors in Pre-Trained Encoders (CVPR 2023) [paper] [code]
The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection (CVPR 2023) [paper] [code]
Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks (CVPR 2023) [paper]
You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? (CVPR 2023) [paper]
Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs (CVPRW 2023) [paper]

ICCV

TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models (ICCV 2023) [paper] [code]
Towards Attack-tolerant Federated Learning via Critical Parameter Analysis (ICCV 2023) [paper] [code]
VertexSerum: Poisoning Graph Neural Networks for Link Inference (ICCV 2023) [paper]
The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data (ICCV 2023) [paper] [code]
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning (ICCV 2023) [paper] [code]
Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization (ICCV 2023) [paper]
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis (ICCV 2023) [paper] [code]
Beating Backdoor Attack at Its Own Game (ICCV 2023) [paper] [code]
Multi-Metrics Adaptively Identifies Backdoors in Federated Learning (ICCV 2023) [paper] [code]
PolicyCleanse: Backdoor Detection and Mitigation for Competitive Reinforcement Learning (ICCV 2023) [paper]
The Perils of Learning from Unlabeled Data: Backdoor Attacks on Semi-Supervised Learning (ICCV 2023) [paper]

S&P

Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers (S&P 2023) [paper]
SNAP: Efficient Extraction of Private Properties with Poisoning (S&P 2023) [paper] [code]
BayBFed: Bayesian Backdoor Defense for Federated Learning (S&P 2023) [paper]
RAB: Provable Robustness Against Backdoor Attacks (S&P 2023) [paper]
FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information (S&P 2023) [paper]
3DFed: Adaptive and Extensible Framework for Covert Backdoor Attack in Federated Learning (S&P 2023) [paper]

ACL

BITE: Textual Backdoor Attacks with Iterative Trigger Injection (ACL 2023) [paper] [code]
Backdooring Neural Code Search (ACL 2023) [paper] [code]
Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark (ACL 2023) [paper] [code]
NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models (ACL 2023) [paper] [code]
Multi-target Backdoor Attacks for Code Pre-trained Models (ACL 2023) [paper] [code]
A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning (ACL 2023) [paper]
Defending against Insertion-based Textual Backdoor Attacks via Attribution (ACL 2023) [paper]
Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias (ACL 2023) [paper]
Maximum Entropy Loss, the Silver Bullet Targeting Backdoor Attacks in Pre-trained Language Models (ACL 2023 Findings) [paper]

EMNLP

Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation (EMNLP 2023) [paper] [code]
Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models (EMNLP 2023) [paper] [code]
Poisoning Retrieval Corpora by Injecting Adversarial Passages (EMNLP 2023) [paper] [code]
UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning (EMNLP 2023 Findings) [paper]
Attention-Enhancing Backdoor Attacks Against BERT-based Models (EMNLP 2023 Findings) [paper]
Large Language Models Are Better Adversaries: Exploring Generative Clean-Label Backdoor Attacks Against Text Classifiers (EMNLP 2023 Findings) [paper]

Others

RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching (UAI 2023) [paper]
Defending Against Backdoor Attacks by Layer-wise Feature Analysis (PAKDD 2023) [paper] [code]
Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures (SIGIR 2023) [paper]
The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples (SIGIR 2023) [paper]
PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification (ACM MM 2023) [paper] [code]
A Dual Stealthy Backdoor: From Both Spatial and Frequency Perspectives (ACM MM 2023) [paper]
How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? (USENIX Security 2023) [paper] [code]
PORE: Provably Robust Recommender Systems against Data Poisoning Attacks (USENIX Security 2023) [paper]
On the Security Risks of Knowledge Graph Reasoning (USENIX Security 2023) [paper] [code]
Fedward: Flexible Federated Backdoor Defense Framework with Non-IID Data (ICME 2023) [paper]
BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT (NDSS 2023) [paper]
Exploiting Logic Locking for a Neural Trojan Attack on Machine Learning Accelerators (GLSVLSI 2023) [paper]
Energy-Latency Attacks to On-Device Neural Networks via Sponge Poisoning (SecTL 2023) [paper]
Beyond the Model: Data Pre-processing Attack to Deep Learning Models in Android Apps (SecTL 2023) [paper]

2022

Transferable Unlearnable Examples (arXiv 2022) [paper]
Natural Backdoor Datasets (arXiv 2022) [paper]
Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World (arXiv 2022) [paper]
Backdoor Attacks on Self-Supervised Learning (CVPR 2022) [paper] [code]
Poisons that are learned faster are more effective (CVPR 2022 Workshops) [paper]
Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning (ICLR 2022) [paper] [code]
Adversarial Unlearning of Backdoors via Implicit Hypergradient (ICLR 2022) [paper] [code]
Not All Poisons are Created Equal: Robust Training against Data Poisoning (ICML 2022) [paper] [code]
Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch (NeurIPS 2022) [paper] [code]
Policy Resilience to Environment Poisoning Attacks on Reinforcement Learning (NeurIPS 2022 Workshop MLSW) [paper]
Hard to Forget: Poisoning Attacks on Certified Machine Unlearning (AAAI 2022) [paper] [code]
Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks (AAAI 2022) [paper]
PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning (USENIX Security 2022) [paper]
Planting Undetectable Backdoors in Machine Learning Models (FOCS 2022) [paper]

2021

DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations (arXiv 2021) [paper]
How Robust Are Randomized Smoothing Based Defenses to Data Poisoning? (CVPR 2021) [paper]
Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release (ICLR 2021 Workshop on Security and Safety in Machine Learning Systems) [paper]
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching (ICLR 2021) [paper] [code]
Unlearnable Examples: Making Personal Data Unexploitable (ICLR 2021) [paper] [code]
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks (ICLR 2021) [paper] [code]
LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition (ICLR 2021) [paper]
What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning (ICLR 2021 Workshop) [paper]
Neural Tangent Generalization Attacks (ICML 2021) [paper]
SPECTRE: Defending Against Backdoor Attacks Using Robust Covariance Estimation (ICML 2021) [paper]
Adversarial Examples Make Strong Poisons (NeurIPS 2021) [paper]
Anti-Backdoor Learning: Training Clean Models on Poisoned Data (NeurIPS 2021) [paper] [code]
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective (ICCV 2021) [paper] [code]
Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks (AAAI 2021) [paper] [code]
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff (ICASSP 2021) [paper]

2020

On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping (arXiv 2020) [paper] [code]
Backdooring and poisoning neural networks with image-scaling attacks (arXiv 2020) [paper]
Poisoned classifiers are not only backdoored, they are fundamentally broken (arXiv 2020) [paper] [code]
Invisible backdoor attacks on deep neural networks via steganography and regularization (TDSC 2020) [paper]
Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs (CVPR 2020) [paper] [code]
MetaPoison: Practical General-purpose Clean-label Data Poisoning (NeurIPS 2020) [paper]
Input-Aware Dynamic Backdoor Attack (NeurIPS 2020) [paper] [code]
How To Backdoor Federated Learning (AISTATS 2020) [paper]
Reflection backdoor: A natural backdoor attack on deep neural networks (ECCV 2020) [paper]
Practical Poisoning Attacks on Neural Networks (ECCV 2020) [paper]
Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases (ECCV 2020) [paper] [code]
Deep k-NN Defense Against Clean-Label Data Poisoning Attacks (ECCV 2020 Workshops) [paper] [code]
Radioactive data: tracing through training (ICML 2020) [paper]
Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks (ICML 2020) [paper]
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing (ICML 2020) [paper]
An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks (KDD 2020) [paper] [code]
Hidden Trigger Backdoor Attacks (AAAI 2020) [paper] [code]

2019

Label-consistent backdoor attacks (arXiv 2019) [paper]
Poisoning Attacks with Generative Adversarial Nets (arXiv 2019) [paper]
TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems (arXiv 2019) [paper]
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (IEEE Access 2019) [paper]
Data Poisoning against Differentially-Private Learners: Attacks and Defenses (IJCAI 2019) [paper]
DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks (IJCAI 2019) [paper]
Sever: A Robust Meta-Algorithm for Stochastic Optimization (ICML 2019) [paper]
Learning with Bad Training Data via Iterative Trimmed Loss Minimization (ICML 2019) [paper]
Universal Multi-Party Poisoning Attacks (ICML 2019) [paper]
Transferable Clean-Label Poisoning Attacks on Deep Neural Nets (ICML 2019) [paper]
Defending Neural Backdoors via Generative Distribution Modeling (NeurIPS 2019) [paper]
Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder (NeurIPS 2019) [paper]
The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure (AAAI 2019) [paper]
Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models (IEEE Transactions on Services Computing 2019) [paper]
Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks (S&P 2019) [paper]
STRIP: a defence against trojan attacks on deep neural networks (ACSAC 2019) [paper]

2018

Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering (arXiv 2018) [paper]
Spectral Signatures in Backdoor Attacks (NeurIPS 2018) [paper]
Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (NeurIPS 2018) [paper]
Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise (NeurIPS 2018) [paper]
Trojaning Attack on Neural Networks (NDSS 2018) [paper]
Label Sanitization Against Label Flipping Poisoning Attacks (ECML PKDD 2018 Workshops) [paper]
Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring (USENIX Security 2018) [paper]

2017

Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning (arXiv 2017) [paper]
Generative Poisoning Attack Method Against Neural Networks (arXiv 2017) [paper]
Delving into Transferable Adversarial Examples and Black-box Attacks (ICLR 2017) [paper]
Understanding Black-box Predictions via Influence Functions (ICML 2017) [paper] [code]
Certified Defenses for Data Poisoning Attacks (NeurIPS 2017) [paper]

2016

Data Poisoning Attacks on Factorization-Based Collaborative Filtering (NeurIPS 2016) [paper]

2015

Is Feature Selection Secure against Training Data Poisoning? (ICML 2015) [paper]
Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners (AAAI 2015) [paper]

penghui-yang/awesome-data-poisoning-and-backdoor-attacks

Folders and files

Latest commit

History

LICENSE

LICENSE

README.md

README.md

Repository files navigation

Awesome Data Poisoning and Backdoor Attacks

Surveys

Benchmark

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

About

Topics

Resources

License

Stars

Watchers

Forks