The List of All Adversarial Example Papers appears to have been down for the past few days. Without that valuable resource, keeping up with the latest research in this field has become difficult, so I created this repository to aggregate and maintain the most recent papers in the area. It may not cover every paper, but I have tried to be as thorough as possible; if you find any papers we have missed, just drop me an email. We have included the data from the List of All Adversarial Example Papers up to 2023-09-01. We also provide a list of papers about transfer-based attacks here.
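If you want to help spot papers we may have missed, a minimal sketch of one possible way to do so is shown below: it queries the public arXiv API for the newest submissions matching an adversarial-examples search. This is not the repository's own tooling; the query string and result count are illustrative assumptions only.

```python
# Minimal sketch (assumed, not the repository's actual pipeline): list recent
# arXiv submissions matching an adversarial-examples query so missed papers
# can be reported. Uses only the Python standard library and the public
# arXiv Atom API at http://export.arxiv.org/api/query.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the arXiv feed


def recent_arxiv_papers(query='all:"adversarial examples"', max_results=20):
    """Return (title, authors, link) tuples for the newest matching submissions."""
    params = urllib.parse.urlencode({
        "search_query": query,            # illustrative query; adjust as needed
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    })
    with urllib.request.urlopen(f"http://export.arxiv.org/api/query?{params}") as resp:
        feed = ET.fromstring(resp.read())

    papers = []
    for entry in feed.findall(f"{ATOM}entry"):
        # Titles in the feed contain line breaks; collapse whitespace.
        title = " ".join(entry.findtext(f"{ATOM}title", "").split())
        authors = [a.findtext(f"{ATOM}name", "") for a in entry.findall(f"{ATOM}author")]
        link = entry.findtext(f"{ATOM}id", "")
        papers.append((title, ", ".join(authors), link))
    return papers


if __name__ == "__main__":
    for title, authors, link in recent_arxiv_papers():
        print(f"{title}\n{authors}\n{link}\n")
```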
-
Xiaojing Fan, Chunliang Tao
-
Duanyi Yao, Songze Li, Ye Xue, Jin Liu
-
FDI: Attack Neural Code Generation Systems through User Feedback Channel
Zhensu Sun, Xiaoning Du, Xiapu Luo, Fu Song, David Lo, Li Li
-
EnJa: Ensemble Jailbreak on Large Language Models
Jiahao Zhang, Zilong Wang, Ruofan Wang, Xingjun Ma, Yu-Gang Jiang
-
LaFA: Latent Feature Attacks on Non-negative Matrix Factorization
Minh Vu, Ben Nebgen, Erik Skau, Geigh Zollicoffer, Juan Castorena, Kim Rasmussen, Boian Alexandrov, Manish Bhattarai
-
Kien T. Pham, Jingye Chen, Qifeng Chen
-
Enhancing Output Diversity Improves Conjugate Gradient-based Adversarial Attacks
Keiichiro Yamamura, Issa Oe, Hiroki Ishikura, Katsuki Fujisawa
-
PushPull-Net: Inhibition-driven ResNet robust to image corruptions
Guru Swaroop Bennabhaktula, Enrique Alegre, Nicola Strisciuglio, George Azzopardi
-
Exploring RAG-based Vulnerability Augmentation with LLMs
Seyed Shayan Daneshvar, Yu Nong, Xu Yang, Shaowei Wang, Haipeng Cai
-
RCDM: Enabling Robustness for Conditional Diffusion Model
Weifeng Xu, Xiang Zhu, Xiaoyong Li
-
Mitigating Malicious Attacks in Federated Learning via Confidence-aware Defense
Qilei Li, Ahmed M. Abdelmoniem
-
Sample-agnostic Adversarial Perturbation for Vision-Language Pre-training Models
Haonan Zheng, Wen Jiang, Xinyang Deng, Wenrui Li
-
Róisín Luo, James McDermott, Colm O'Riordan
-
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Jingtong Su, Julia Kempe, Karen Ullrich
-
Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs
Peng Ding, Jingyu Wu, Jun Kuang, Dan Ma, Xuezhi Cao, Xunliang Cai, Shi Chen, Jiajun Chen, Shujian Huang
-
Assessing Robustness of Machine Learning Models using Covariate Perturbations
Arun Prakash R, Anwesha Bhattacharyya, Joel Vaughan, Vijayan N. Nair
-
EmoBack: Backdoor Attacks Against Speaker Identification Using Emotional Prosody
Coen Schoof, Stefanos Koffas, Mauro Conti, Stjepan Picek
-
Yuntao Shou, Haozhi Lan, Xiangyong Cao
-
ADBM: Adversarial diffusion bridge model for reliable adversarial purification
Xiao Li, Wenxuan Sun, Huanran Chen, Qiongxiu Li, Yining Liu, Yingzhe He, Jie Shi, Xiaolin Hu
-
OTAD: An Optimal Transport-Induced Robust Model for Agnostic Adversarial Attack
Kuo Gai, Sicong Wang, Shihua Zhang
-
CERT-ED: Certifiably Robust Text Classification for Edit Distance
Zhuoqun Huang, Neil G Marchant, Olga Ohrimenko, Benjamin I. P. Rubinstein
-
Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion
Honglei Miao, Fan Ma, Ruijie Quan, Kun Zhan, Yi Yang
-
Revocable Backdoor for Deep Model Trading
Yiran Xu, Nan Zhong, Zhenxing Qian, Xinpeng Zhang
-
Adversarial Text Rewriting for Text-aware Recommender Systems
Sejoon Oh, Gaurav Verma, Srijan Kumar
-
Benchmarking Attacks on Learning with Errors
Emily Wenger, Eshika Saxena, Mohamed Malhou, Ellie Thieu, Kristin Lauter
-
Measuring What Matters: Intrinsic Distance Preservation as a Robust Metric for Embedding Quality
Steven N. Hart, Thomas E. Tavolara
-
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren, Steven Basart, Adam Khoja, Alice Gatti, Long Phan, Xuwang Yin, Mantas Mazeika, Alexander Pan, Gabriel Mukobi, Ryan H. Kim, Stephen Fitz, Dan Hendrycks
-
Defending Jailbreak Attack in VLMs via Cross-modality Information Detector
Yue Xu, Xiuyuan Qi, Zhan Qin, Wenjie Wang
-
Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model
Zhichao Zhang, Xinyue Li, Wei Sun, Jun Jia, Xiongkuo Min, Zicheng Zhang, Chunyi Li, Zijian Chen, Puyi Wang, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Guangtao Zhai
-
Conditioned Prompt-Optimization for Continual Deepfake Detection
Francesco Laiti, Benedetta Liberatori, Thomas De Min, Elisa Ricci
-
Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
Shi Liu, Kecheng Zheng, Wei Chen
-
Diff-Cleanse: Identifying and Mitigating Backdoor Attacks in Diffusion Models
Jiang Hao, Xiao Jin, Hu Xiaoguang, Chen Tianyou
-
Sazzad Sayyed, Milin Zhang, Shahriar Rifat, Ananthram Swami, Michael De Lucia, Francesco Restuccia
-
FACL-Attack: Frequency-Aware Contrastive Learning for Transferable Adversarial Attacks
Hunmin Yang, Jongoh Jeong, Kuk-Jin Yoon
-
Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks
Hunmin Yang, Jongoh Jeong, Kuk-Jin Yoon
-
PIP: Prototypes-Injected Prompt for Federated Class Incremental Learning
Muhammad Anwar Ma'sum, Mahardhika Pratama, Savitha Ramasamy, Lin Liu, Habibullah Habibullah, Ryszard Kowalczyk
-
Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks
Yunfeng Diao, Naixin Zhai, Changtao Miao, Xun Yang, Meng Wang
-
Can LLMs be Fooled? Investigating Vulnerabilities in LLMs
Sara Abdali, Jia He, CJ Barberan, Richard Anarfi
-
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
Boyang Zhang, Yicong Tan, Yun Shen, Ahmed Salem, Michael Backes, Savvas Zannettou, Yang Zhang
-
DeepBaR: Fault Backdoor Attack on Deep Neural Network Layers
C. A. Martínez-Mejía, J. Solano, J. Breier, D. Bucko, X. Hou
-
Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu
-
Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability
Jorge García-Carrasco, Alejandro Maté, Juan Trujillo
-
Keming Wu, Man Yao, Yuhong Chou, Xuerui Qiu, Rui Yang, Bo Xu, Guoqi Li
-
BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning
Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Mingli Zhu, Ruotong Wang, Li Liu, Chao Shen
-
Practical and Robust Safety Guarantees for Advanced Counterfactual Learning to Rank
Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke
-
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara
-
Aditya Kulkarni, Vivek Balachandran, Dinil Mon Divakaran, Tamal Das
-
Enhancing Adversarial Text Attacks on BERT Models with Projected Gradient Descent
Hetvi Waghela, Jaydip Sen, Sneha Rakshit
-
Exploring the Adversarial Robustness of CLIP for AI-generated Image Detection
Vincenzo De Rosa, Fabrizio Guillaro, Giovanni Poggi, Davide Cozzolino, Luisa Verdoliva
-
Towards Clean-Label Backdoor Attacks in the Physical World
Thinh Dao, Cuong Chi Le, Khoa D Doan, Kok-Seng Wong
-
EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection
Shigang Liu, Di Cao, Junae Kim, Tamas Abraham, Paul Montague, Seyit Camtepe, Jun Zhang, Yang Xiang
-
Debiased Graph Poisoning Attack via Contrastive Surrogate Objective
Kanghoon Yoon, Yeonjun In, Namkyeong Lee, Kibum Kim, Chanyoung Park
-
Adversarial Robustification via Text-to-Image Diffusion Models
Daewon Choi, Jongheon Jeong, Huiwon Jang, Jinwoo Shin
-
Robust VAEs via Generating Process of Noise Augmented Data
Hiroo Irobe, Wataru Aoki, Kimihiro Yamazaki, Yuhui Zhang, Takumi Nakagawa, Hiroki Waida, Yuichiro Wada, Takafumi Kanamori
-
Accuracy-Privacy Trade-off in the Mitigation of Membership Inference Attack in Federated Learning
Sayyed Farid Ahamed, Soumya Banerjee, Sandip Roy, Devin Quinn, Marc Vucovich, Kevin Choi, Abdul Rahman, Alison Hu, Edward Bowen, Sachin Shetty
-
Haonan Zheng, Xinyang Deng, Wen Jiang, Wenrui Li
-
The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models
Zihui Wu, Haichang Gao, Jianping He, Ping Wang
-
Peak-Controlled Logits Poisoning Attack in Federated Distillation
Yuhan Tang, Aoxu Zhang, Zhiyuan Wu, Bo Gao, Tian Wen, Yuwei Wang, Sheng Sun
-
Is the Digital Forensics and Incident Response Pipeline Ready for Text-Based Threats in LLM Era?
Avanti Bhandarkar, Ronald Wilson, Anushka Swarup, Mengdi Zhu, Damon Woodard
-
Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis
Cristian-Alexandru Botocan, Raphael Meier, Ljiljana Dolamic
-
RIDA: A Robust Attack Framework on Incomplete Graphs
Jianke Yu, Hanchen Wang, Chen Chen, Xiaoyang Wang, Wenjie Zhang, Ying Zhang
-
Adversarial Robust Decision Transformer: Enhancing Robustness of RvS via Minimax Returns-to-go
Xiaohang Tang, Afonso Marques, Parameswaran Kamalaruban, Ilija Bogunovic
-
Robust Deep Hawkes Process under Label Noise of Both Event and Occurrence
Xiaoyu Tan, Bin Li, Xihe Qiu, Jingjing Huang, Yinghui Xu, Wei Chu
-
How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations?
Leo Yu-Ho Lo, Huamin Qu
-
Physical Adversarial Attack on Monocular Depth Estimation via Shape-Varying Patches
Chenxing Zhao, Yang Li, Shihao Wu, Wenyi Tan, Shuangju Zhou, Quan Pan
-
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang
-
From Sands to Mansions: Enabling Automatic Full-Life-Cycle Cyberattack Construction with LLM
Lingzhi Wang, Jiahui Wang, Kyle Jung, Kedar Thiagarajan, Emily Wei, Xiangmin Shen, Yan Chen, Zhenyuan Li
-
Figure it Out: Analyzing-based Jailbreak Attack on Large Language Models
Shi Lin, Rongchang Li, Xun Wang, Changting Lin, Wenpeng Xing, Meng Han
-
RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent
Huiyu Xu, Wenhui Zhang, Zhibo Wang, Feng Xiao, Rui Zheng, Yunhe Feng, Zhongjie Ba, Kui Ren
-
Can Large Language Models Automatically Jailbreak GPT-4V?
Yuanwei Wu, Yue Huang, Yixin Liu, Xiang Li, Pan Zhou, Lichao Sun
-
Algebraic Adversarial Attacks on Integrated Gradients
Lachlan Simpson, Federico Costanza, Kyle Millar, Adriel Cheng, Cheng-Chew Lim, Hong Gunn Chew
-
Hao Zhou, Kun Sun, Shaoming Li, Yangfeng Fan, Guibin Jiang, Jiaqi Zheng, Tao Li
-
Backdoor Attacks against Hybrid Classical-Quantum Neural Networks
Ji Guo, Wenbo Jiang, Rui Zhang, Wenshu Fan, Jiachen Li, Guoming Lu
-
Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning
Xinwei Liu, Xiaojun Jia, Yuan Xun, Siyuan Liang, Xiaochun Cao
-
Xiaojin Zhang, Wei Chen
-
Neha A S, Vivek Chaturvedi, Muhammad Shafique
-
ImPress: Securing DRAM Against Data-Disturbance Errors via Implicit Row-Press Mitigation
Moinuddin Qureshi, Anish Saxena, Aamer Jaleel
-
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
Yuxuan Tong, Xiwen Zhang, Rui Wang, Ruidong Wu, Junxian He
-
Visually Robust Adversarial Imitation Learning from Videos with Contrastive Learning
Vittorio Giammarino, James Queeney, Ioannis Ch. Paschalidis
-
Cross-Task Attack: A Self-Supervision Generative Framework Based on Attention Shift
Qingyuan Zeng, Yunpeng Gong, Min Jiang
-
Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models
Zhuo Chen, Jiawei Liu, Haotan Liu, Qikai Cheng, Fan Zhang, Wei Lu, Xiaozhong Liu
-
Qiao Li, Xiaomeng Fu, Xi Wang, Jin Liu, Xingyu Gao, Jiao Dai, Jizhong Han
-
Jiyuan Fu, Zhaoyu Chen, Kaixun Jiang, Haijing Guo, Shuyong Gao, Wenqiang Zhang
-
Krait: A Backdoor Attack Against Graph Prompt Tuning
Ying Song, Rita Singh, Balaji Palanisamy
-
Motif-Consistent Counterfactuals with Adversarial Refinement for Graph-Level Anomaly Detection
Chunjing Xiao, Shikang Pang, Wenxin Tai, Yanlong Huang, Goce Trajcevski, Fan Zhou
-
Distributionally and Adversarially Robust Logistic Regression via Intersecting Wasserstein Balls
Aras Selvi, Eleonora Kreacic, Mohsen Ghassemi, Vamsi Potluru, Tucker Balch, Manuela Veloso
-
BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization
Ahmed Allam
-
A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks
Yixiang Qiu, Hao Fang, Hongyao Yu, Bin Chen, MeiKang Qiu, Shu-Tao Xia
-
Turning Generative Models Degenerate: The Power of Data Poisoning Attacks
Shuli Jiang, Swanand Ravindra Kadhe, Yi Zhou, Farhan Ahmed, Ling Cai, Nathalie Baracaldo
-
Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection
Youheng Sun, Shengming Yuan, Xuanhan Wang, Lianli Gao, Jingkuan Song
-
Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks
Antoni Kowalczuk, Jan Dubiński, Atiyeh Ashari Ghomi, Yi Sui, George Stein, Jiapeng Wu, Jesse C. Cresswell, Franziska Boenisch, Adam Dziedzic
-
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
Kaichen Zhang, Bo Li, Peiyuan Zhang, Fanyi Pu, Joshua Adrian Cahyono, Kairui Hu, Shuai Liu, Yuanhan Zhang, Jingkang Yang, Chunyuan Li, Ziwei Liu
-
Zhaoxin Wang, Handing Wang, Cong Tian, Yaochu Jin
-
Contrastive Adversarial Training for Unsupervised Domain Adaptation
Jiahong Chen, Zhilin Zhang, Lucy Li, Behzad Shahrasbi, Arjun Mishra
-
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
Zhaorun Chen, Zhen Xiang, Chaowei Xiao, Dawn Song, Bo Li
-
Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion
Sanghyun Kim, Seohyeon Jung, Balhae Kim, Moonseok Choi, Jinwoo Shin, Juho Lee
-
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
Yong-Hyun Park, Sangdoo Yun, Jin-Hwa Kim, Junho Kim, Geonhui Jang, Yonghyun Jeong, Junghyo Jo, Gayoung Lee
-
Lin Luo, Yuri Nakao, Mathieu Chollet, Hiroya Inakoshi, Simone Stumpf
-
Feature Inference Attack on Shapley Values
Xinjian Luo, Yangfan Jiang, Xiaokui Xiao
-
AEMIM: Adversarial Examples Meet Masked Image Modeling
Wenzhao Xiang, Chang Liu, Hang Su, Hongyang Yu
-
Enhancing TinyML Security: Study of Adversarial Attack Transferability
Parin Shah, Yuvaraj Govindarajulu, Pavan Kulkarni, Manojkumar Parmar
-
Variational Randomized Smoothing for Sample-Wise Adversarial Robustness
Ryo Hase, Ye Wang, Toshiaki Koike-Akino, Jing Liu, Kieran Parsons
-
Does Refusal Training in LLMs Generalize to the Past Tense?
Maksym Andriushchenko, Nicolas Flammarion
-
Model Inversion Attacks Through Target-Specific Conditional Diffusion Models
Ouxiang Li, Yanbin Hao, Zhicai Wang, Bin Zhu, Shuo Wang, Zaixi Zhang, Fuli Feng
-
Cycle Contrastive Adversarial Learning for Unsupervised image Deraining
Chen Zhao, Weiling Cai, ChengWei Hu, Zheng Yuan
-
Hao Ding, Tuxun Lu, Yuqian Zhang, Ruixing Liang, Hongchao Shu, Lalithkumar Seenivasan, Yonghao Long, Qi Dou, Cong Gao, Mathias Unberath
-
IPA-NeRF: Illusory Poisoning Attack Against Neural Radiance Fields
Wenxiang Jiang, Hanwei Zhang, Shuo Zhao, Zhongwen Guo, Hao Wang
-
UNIT: Backdoor Mitigation via Automated Neural Distribution Tightening
Siyuan Cheng, Guangyu Shen, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Hanxi Guo, Shiqing Ma, Xiangyu Zhang
-
Relaxing Graph Transformers for Adversarial Attacks
Philipp Foth, Lukas Gosch, Simon Geisler, Leo Schwinn, Stephan Günnemann
-
One-Shot Unlearning of Personal Identities
Thomas De Min, Subhankar Roy, Massimiliano Mancini, Stéphane Lathuilière, Elisa Ricci
-
Generalized Coverage for More Robust Low-Budget Active Learning
Wonho Bae, Junhyug Noh, Danica J. Sutherland
-
Backdoor Attacks against Image-to-Image Networks
Wenbo Jiang, Hongwei Li, Jiaming He, Rui Zhang, Guowen Xu, Tianwei Zhang, Rongxing Lu
-
Learning to Unlearn for Robust Machine Unlearning
Mark He Huang, Lin Geng Foo, Jun Liu
-
Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks
Quang H. Nguyen, Nguyen Ngoc-Hieu, The-Anh Ta, Thanh Nguyen-Tang, Hoang Thanh-Tung, Khoa D. Doan
-
Provable Robustness of (Graph) Neural Networks Against Data Poisoning and Backdoor Attacks
Lukas Gosch, Mahalakshmi Sabanayagam, Debarghya Ghoshdastidar, Stephan Günnemann
-
Look Within, Why LLMs Hallucinate: A Causal Perspective
He Li, Haoang Chi, Mingyu Liu, Wenjing Yang
-
Augmented Neural Fine-Tuning for Efficient Backdoor Purification
Nazmul Karim, Abdullah Al Arafat, Umar Khalid, Zhishan Guo, Nazanin Rahnavard
-
CLIP-Guided Networks for Transferable Targeted Attacks
Hao Fang, Jiawei Kong, Bin Chen, Tao Dai, Hao Wu, Shu-Tao Xia
-
SENTINEL: Securing Indoor Localization against Adversarial Attacks with Capsule Neural Networks
Danish Gufran, Pooja Anandathirtha, Sudeep Pasricha
-
Rishika Bhagwatkar, Shravan Nayak, Reza Bayat, Alexis Roger, Daniel Z Kaplan, Pouya Bashivan, Irina Rish
-
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models
Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang
-
Partner in Crime: Boosting Targeted Poisoning Attacks against Federated Learning
Shihua Sun, Shridatt Sugrim, Angelos Stavrou, Haining Wang
-
SemiAdv: Query-Efficient Black-Box Adversarial Attack with Unlabeled Images
Mingyuan Fan, Yang Liu, Cen Chen, Ximeng Liu
-
Towards More Trustworthy and Interpretable LLMs for Code through Syntax-Grounded Explanations
David N. Palacio, Daniel Rodriguez-Cardenas, Alejandro Velasco, Dipin Khati, Kevin Moran, Denys Poshyvanyk
-
Robustness of LLMs to Perturbations in Text
Ayush Singh, Navpreet Singh, Shubham Vatsal
-
Refusing Safe Prompts for Multi-modal Large Language Models
Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong
-
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu
-
TAPI: Towards Target-Specific and Adversarial Prompt Injection against Code LLMs
Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren
-
Deep Adversarial Defense Against Multilevel-Lp Attacks
Ren Wang, Yuxuan Li, Alfred Hero
-
Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off
Levente Halmosi, Bálint Mohos, Márk Jelasity
-
Distributed Backdoor Attacks on Federated Graph Learning and Certified Defenses
Yuxin Yang, Qiang Li, Jinyuan Jia, Yuan Hong, Binghui Wang
-
PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning
Sizai Hou, Songze Li, Tayyebeh Jahani-Nezhad, Giuseppe Caire
-
DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks
Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, David Lo, Taolue Chen
-
CEIPA: Counterfactual Explainable Incremental Prompt Attack Analysis on Large Language Models
Dong Shu, Mingyu Jin, Tianle Chen, Chong Zhang, Yongfeng Zhang
-
BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning
Ning Wang, Shanghao Shi, Yang Xiao, Yimin Chen, Y. Thomas Hou, Wenjing Lou
-
MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants
John Heibel, Daniel Lowd
-
Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment
Yufan Liu, Wanqian Zhang, Dayan Wu, Zheng Lin, Jingzi Gu, Weiping Wang
-
Jinlong Li, Zequn Jie, Elisa Ricci, Lin Ma, Nicu Sebe
-
Rethinking the Threat and Accessibility of Adversarial Attacks against Face Recognition Systems
Yuxin Cao, Yumeng Zhu, Derui Wang, Sheng Wen, Minhui Xue, Jin Lu, Hao Ge
-
Yunfeng Diao, Baiqi Wu, Ruixuan Zhang, Xun Yang, Meng Wang, He Wang
-
How to beat a Bayesian adversary
Zihan Ding, Kexin Jin, Jonas Latz, Chenguang Liu
-
Model-agnostic clean-label backdoor mitigation in cybersecurity environments
Giorgio Severi, Simona Boboila, John Holodnak, Kendra Kratkiewicz, Rauf Izmailov, Alina Oprea
-
Enhancing Privacy of Spatiotemporal Federated Learning against Gradient Inversion Attacks
Lele Zheng, Yang Cao, Renhe Jiang, Kenjiro Taura, Yulong Shen, Sheng Li, Masatoshi Yoshikawa
-
Md Mashrur Arifin, Md Shoaib Ahmed, Tanmai Kumar Ghosh, Jun Zhuang, Jyh-haw Yeh
-
D'Jeff K. Nkashama, Jordan Masakuna Félicien, Arian Soltani, Jean-Charles Verdier, Pierre-Martin Tardif, Marc Frappier, Froduald Kabanza
-
Tuning Vision-Language Models with Candidate Labels by Prompt Alignment
Zhifang Zhang, Beibei Li
-
Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jiawei Chen, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He
-
Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities
Tianjie Ju, Yiting Wang, Xinbei Ma, Pengzhou Cheng, Haodong Zhao, Yulong Wang, Lifeng Liu, Jian Xie, Zhuosheng Zhang, Gongshen Liu
-
Qian Yang, Weixiang Yan, Aishwarya Agrawal
-
Mitigating Backdoor Attacks using Activation-Guided Model Editing
Felix Hsieh, Huy H. Nguyen, AprilPyone MaungMaung, Dmitrii Usynin, Isao Echizen
-
A Hybrid Training-time and Run-time Defense Against Adversarial Attacks in Modulation Classification
Lu Zhang, Sangarapillai Lambotharan, Gan Zheng, Guisheng Liao, Ambra Demontis, Fabio Roli
-
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective
Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng
-
Hiding Local Manipulations on SAR Images: a Counter-Forensic Attack
Sara Mandelli, Edoardo Daniele Cannas, Paolo Bestagini, Stefano Tebaldini, Stefano Tubaro
-
Universal Multi-view Black-box Attack against Object Detectors via Layout Optimization
Donghua Wang, Wen Yao, Tingsong Jiang, Chao Li, Xiaoqian Chen
-
Improving the Transferability of Adversarial Examples by Feature Augmentation
Donghua Wang, Wen Yao, Tingsong Jiang, Xiaohu Zheng, Junqi Wu, Xiaoqian Chen
-
AstroSpy: On detecting Fake Images in Astronomy via Joint Image-Spectral Representations
Mohammed Talha Alam, Raza Imam, Mohsen Guizani, Fakhri Karray
-
Towards Physics-informed Cyclic Adversarial Multi-PSF Lensless Imaging
Abeer Banerjee, Sanjay Singh
-
Event Trojan: Asynchronous Event-based Backdoor Attacks
Ruofei Wang, Qing Guo, Haoliang Li, Renjie Wan
-
Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning
Yuqi Jia, Minghong Fang, Hongbin Liu, Jinghuai Zhang, Neil Zhenqiang Gong
-
$R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
Mintong Kang, Bo Li
-
Yanxu Zhu, Jinlin Xiao, Yuhang Wang, Jitao Sang
-
Bidur Khanal, Tianhong Dai, Binod Bhattarai, Cristian Linte
-
Elena Camuffo, Umberto Michieli, Simone Milani, Jijoong Moon, Mete Ozay
-
FORAY: Towards Effective Attack Synthesis against Deep Logical Vulnerabilities in DeFi Protocols
Hongbo Wen, Hanzhi Liu, Jiaxin Song, Yanju Chen, Wenbo Guo, Yu Feng
-
Gradient Diffusion: A Perturbation-Resilient Gradient Leakage Attack
Xuan Liu, Siqi Cai, Qihua Zhou, Song Guo, Ruibin Li, Kaiwei Lin
-
Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense
Qi Zhou, Zipeng Ye, Yubo Tang, Wenjian Luo, Yuhui Shi, Yan Jia
-
BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Weimin Lyu, Zexin Bi, Fusheng Wang, Chao Chen
-
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models
Vyas Raina, Mark Gales
-
T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
Zhongqi Wang, Jie Zhang, Shiguang Shan, Xilin Chen
-
Self-Supervised Representation Learning for Adversarial Attack Detection
Yi Li, Plamen Angelov, Neeraj Suri
-
Faeze S. Banitaba, Sercan Aygun, M. Hassan Najafi
-
Non-Cooperative Backdoor Attacks in Federated Learning: A New Threat Landscape
Tuan Nguyen, Dung Thuy Nguyen, Khoa D Doan, Kok-Seng Wong
-
Adversarial Robustness of VAEs across Intersectional Subgroups
Chethan Krishnamurthy Ramanaik, Arjun Roy, Eirini Ntoutsi
-
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
Yuyan Chen, Qiang Fu, Yichen Yuan, Zhihao Wen, Ge Fan, Dayiheng Liu, Dongmei Zhang, Zhixu Li, Yanghua Xiao
-
Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers
Terry Tong, Jiashu Xu, Qin Liu, Muhao Chen
-
Defense Against Syntactic Textual Backdoor Attacks with Token Substitution
Xinglin Li, Xianwen He, Yao Li, Minhao Cheng
-
DART: Deep Adversarial Automated Red Teaming for LLM Safety
Bojian Jiang, Yi Jing, Tianhao Shen, Qing Yang, Deyi Xiong
-
TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers
Fatemeh Nourilenjan Nokabadi, Yann Batiste Pequignot, Jean-Francois Lalonde, Christian Gagné
-
Kejia Zhang, Juanjuan Weng, Yuanzheng Cai, Zhiming Luo, Shaozi Li
-
Future Events as Backdoor Triggers: Investigating Temporal Vulnerabilities in LLMs
Sara Price, Arjun Panickssery, Sam Bowman, Asa Cooper Stickland
-
Self-Evaluation as a Defense Against Adversarial Attacks on LLMs
Hannah Brown, Leon Lin, Kenji Kawaguchi, Michael Shieh
-
Zhexin Zhang, Junxiao Yang, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang
-
Zhihua Jin, Shiyi Liu, Haotian Li, Xun Zhao, Huamin Qu
-
SPLITZ: Certifiable Robustness via Split Lipschitz Randomized Smoothing
Meiyu Zhong, Ravi Tandon
-
Xiang Ling, Zhiyu Wu, Bin Wang, Wei Deng, Jingzheng Wu, Shouling Ji, Tianyue Luo, Yanjun Wu
-
Simon Ostermann, Kevin Baum, Christoph Endres, Julia Masloh, Patrick Schramowski
-
EvolBA: Evolutionary Boundary Attack under Hard-label Black Box condition
Ayane Tajima, Satoshi Ono
-
Adversarial Magnification to Deceive Deepfake Detection through Super Resolution
Davide Alessandro Coccomini, Roberto Caldelli, Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro
-
Parameter Matching Attack: Enhancing Practical Applicability of Availability Attacks
Yu Zhe, Jun Sakuma
-
Face Reconstruction Transfer Attack as Out-of-Distribution Generalization
Yoon Gyo Jung, Jaewoo Park, Xingbo Dong, Hojin Park, Andrew Beng Jin Teoh, Octavia Camps
-
Towards More Realistic Extraction Attacks: An Adversarial Perspective
Yash More, Prakhar Ganesh, Golnoosh Farnadi
-
Jiexin Wang, Xitong Luo, Liuwen Cao, Hongkui He, Hailin Huang, Jiayuan Xie, Adam Jatowt, Yi Cai
-
Kalibinuer Tiliwalidi, Chengyin Hu, Weiwen Shi
-
Learning Robust 3D Representation from CLIP via Dual Denoising
Shuqing Luo, Bowen Qu, Wei Gao
-
Semantic-guided Adversarial Diffusion Model for Self-supervised Shadow Removal
Ziqi Zeng, Chen Zhao, Weiling Cai, Chenyu Dong
-
Unveiling the Unseen: Exploring Whitebox Membership Inference through the Lens of Explainability
Chenxi Li, Abhinav Kumar, Zhen Guo, Jie Hou, Reza Tourani
-
Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP
Ayush Ranjan, Daniel Wen, Karthik Bhat
-
Yiquan Li, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Bo Li, Chaowei Xiao
-
A Whole-Process Certifiably Robust Aggregation Method Against Backdoor Attacks in Federated Learning
Anqi Zhou, Yezheng Liu, Yidong Chai, Hongyi Zhu, Xinyue Ge, Yuanchun Jiang, Meng Wang
-
Query-Efficient Hard-Label Black-Box Attack against Vision Transformers
Chao Zhou, Xiaowen Shi, Yuan-Gen Wang
-
Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness
Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung, Che-Rung Lee
-
Deceptive Diffusion: Generating Synthetic Adversarial Examples
Lucas Beerens, Catherine F. Higham, Desmond J. Higham
-
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation
Danny Halawi, Alexander Wei, Eric Wallace, Tony T. Wang, Nika Haghtalab, Jacob Steinhardt
-
IDT: Dual-Task Adversarial Attacks for Privacy Protection
Pedro Faustini, Shakila Mahjabin Tonni, Annabelle McIver, Qiongkai Xu, Mark Dras
-
NLPerturbator: Studying the Robustness of Code LLMs to Natural Language Variations
Junkai Chen, Zhenhao Li, Xing Hu, Xin Xia
-
GM-DF: Generalized Multi-Scenario Deepfake Detection
Yingxin Lai, Zitong Yu, Jing Yang, Bin Li, Xiangui Kang, Linlin Shen
-
Guanghao Zhu, Jing Zhang, Juanxiu Liu, Xiaohui Du, Ruqian Hao, Yong Liu, Lin Liu
-
Backdoor Attack in Prompt-Based Continual Learning
Trang Nguyen, Anh Tran, Nhat Ho
-
Virtual Context: Enhancing Jailbreak Attacks with Special Token Injection
Yuqi Zhou, Lin Lu, Hanchi Sun, Pan Zhou, Lichao Sun
-
DiffuseDef: Improved Robustness to Adversarial Attacks
Zhenhao Li, Marek Rei, Lucia Specia
-
Rethinking harmless refusals when fine-tuning foundation models
Florin Pop, Judd Rosenblatt, Diogo Schwerz de Lucena, Michael Vaiana
-
Data Poisoning Attacks to Locally Differentially Private Frequent Itemset Mining Protocols
Wei Tong, Haoyu Chen, Jiacheng Niu, Sheng Zhong
-
Poisoned LangChain: Jailbreak LLMs by LangChain
Ziqiu Wang, Jun Liu, Shengkai Zhang, Yang Yang
-
Haolang Lu, Hongrui Peng, Guoshun Nan, Jiaoyang Cui, Cheng Wang, Weifei Jin
-
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri
-
SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance
Caishuang Huang, Wanxu Zhao, Rui Zheng, Huijie Lv, Shihan Dou, Sixian Li, Xiao Wang, Enyu Zhou, Junjie Ye, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang
-
Machine Unlearning Fails to Remove Data Poisoning Attacks
Martin Pawelczyk, Jimmy Z. Di, Yiwei Lu, Gautam Kamath, Ayush Sekhari, Seth Neel
-
Diffusion-based Adversarial Purification for Intrusion Detection
Mohamed Amine Merzouk, Erwan Beurier, Reda Yaich, Nora Boulahia-Cuppens, Frédéric Cuppens
-
A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens
Zhijie Nie, Richong Zhang, Zhanyu Wu
-
Inherent Challenges of Post-Hoc Membership Inference for Large Language Models
Matthieu Meeus, Shubham Jain, Marek Rei, Yves-Alexandre de Montjoye
-
UNICAD: A Unified Approach for Attack Detection, Noise Reduction and Novel Class Identification
Alvaro Lopez Pellicer, Kittipos Giatgong, Yi Li, Neeraj Suri, Plamen Angelov
-
Emma Hart, Quentin Renau, Kevin Sim, Mohamad Alissa
-
Zhengyue Zhao, Xiaoyun Zhang, Kaidi Xu, Xing Hu, Rui Zhang, Zidong Du, Qi Guo, Yunji Chen
-
Evaluating and Analyzing Relationship Hallucinations in LVLMs
Mingrui Wu, Jiayi Ji, Oucheng Huang, Jiale Li, Yuhang Wu, Xiaoshuai Sun, Rongrong Ji
-
Improving robustness to corruptions with multiplicative weight perturbations
Trung Trinh, Markus Heinonen, Luigi Acerbi, Samuel Kaski
-
Noisy Neighbors: Efficient membership inference attacks against LLMs
Filippo Galli, Luca Melis, Tommaso Cucinotta
-
BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models
Yi Zeng, Weiyu Sun, Tran Ngoc Huynh, Dawn Song, Bo Li, Ruoxi Jia
-
Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models
Bei Yan, Jie Zhang, Zheng Yuan, Shiguang Shan, Xilin Chen
-
Automated Adversarial Discovery for Safety Classifiers
Yash Kumar Lal, Preethi Lahoti, Aradhana Sinha, Yao Qin, Ananth Balashankar
-
Towards unlocking the mystery of adversarial fragility of neural networks
Jingchao Gao, Raghu Mudumbai, Xiaodong Wu, Jirong Yi, Catherine Xu, Hui Xie, Weiyu Xu
-
CBPF: Filtering Poisoned Data Based on Composite Backdoor Attack
Hanfeng Xia, Haibo Hong, Ruili Wang
-
Blind Baselines Beat Membership Inference Attacks for Foundation Models
Debeshee Das, Jie Zhang, Florian Tramèr
-
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs
Jannik Kossen, Jiatong Han, Muhammed Razzak, Lisa Schut, Shreshth Malik, Yarin Gal
-
EmoAttack: Emotion-to-Image Diffusion Models for Emotional Backdoor Generation
Tianyu Wei, Shanmin Pang, Qi Guo, Yizhuo Ma, Qing Guo
-
Federated Adversarial Learning for Robust Autonomous Landing Runway Detection
Yi Li, Plamen Angelov, Zhengxin Yu, Alvaro Lopez Pellicer, Neeraj Suri
-
Large Language Models for Link Stealing Attacks Against Graph Neural Networks
Faqian Guan, Tianqing Zhu, Hui Sun, Wanlei Zhou, Philip S. Yu
-
MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?
Xirui Li, Hengguang Zhou, Ruochen Wang, Tianyi Zhou, Minhao Cheng, Cho-Jui Hsieh
-
From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei
-
Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning
Suyi Li, Chenyi Jiang, Shidong Wang, Yang Long, Zheng Zhang, Haofeng Zhang
-
DataFreeShield: Defending Adversarial Attacks without Training Data
Hyeyoon Lee, Kanghyun Choi, Dain Kwon, Sunjong Park, Mayoore Selvarasa Jaiswal, Noseong Park, Jonghyun Choi, Jinho Lee
-
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng
-
Enhancing robustness of data-driven SHM models: adversarial training with circle loss
Xiangli Yang, Xijie Deng, Hanwei Zhang, Yang Zou, Jianxi Yang
-
ObscurePrompt: Jailbreaking Large Language Models via Obscure Input
Yue Huang, Jingyu Tang, Dongping Chen, Bingda Tang, Yao Wan, Lichao Sun, Xiangliang Zhang
-
MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization
Zhaozhe Hu, Jia-Li Yin, Bin Chen, Luojun Lin, Bo-Hao Chen, Ximeng Liu
-
Explainable AI Security: Exploring Robustness of Graph Neural Networks to Adversarial Attacks
Tao Wu, Canyixing Cui, Xingping Xian, Shaojie Qiao, Chao Wang, Lin Yuan, Shui Yu
-
Defending Against Sophisticated Poisoning Attacks with RL-based Aggregation in Federated Learning
Yujing Wang, Hainan Zhang, Sijia Wen, Wangjie Qiu, Binghui Guo
-
Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization
Tanapat Ratchatorn, Masayuki Tanaka
-
Adversaries Can Misuse Combinations of Safe Models
Erik Jones, Anca Dragan, Jacob Steinhardt
-
AGSOA: Graph Neural Network Targeted Attack Based on Average Gradient and Structure Optimization
Yang Chen, Bin Zhou
-
Bayes' capacity as a measure for reconstruction attacks in federated learning
Sayan Biswas, Mark Dras, Pedro Faustini, Natasha Fernandes, Annabelle McIver, Catuscia Palamidessi, Parastoo Sadeghi
-
Jia-Li Yin, Haoyuan Zheng, Ximeng Liu
-
Benchmarking Unsupervised Online IDS for Masquerade Attacks in CAN
Pablo Moriano, Steven C. Hespeler, Mingyan Li, Robert A. Bridges
-
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran
-
Adversarial Attacks on Large Language Models in Medicine
Yifan Yang, Qiao Jin, Furong Huang, Zhiyong Lu
-
Stealth edits for provably fixing or attacking large language models
Oliver J. Sutton, Qinghua Zhou, Wei Wang, Desmond J. Higham, Alexander N. Gorban, Alexander Bastounis, Ivan Y. Tyukin
-
UIFV: Data Reconstruction Attack in Vertical Federated Learning
Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao
-
Can Go AIs be adversarially robust?
Tom Tseng, Euan McLean, Kellin Pelrine, Tony T. Wang, Adam Gleave
-
Yunze Xiao, Yujia Hu, Kenny Tsu Wei Choo, Roy Ka-wei Lee
-
Defending Against Social Engineering Attacks in the Age of LLMs
Lin Ai, Tharindu Kumarage, Amrita Bhattacharjee, Zizhou Liu, Zheng Hui, Michael Davinroy, James Cook, Laura Cassani, Kirill Trapeznikov, Matthias Kirchner, Arslan Basharat, Anthony Hoogs, Joshua Garland, Huan Liu, Julia Hirschberg
-
Adversarial Attacks on Multimodal Agents
Chen Henry Wu, Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried, Aditi Raghunathan
-
Attack and Defense of Deep Learning Models in the Field of Web Attack Detection
Lijia Shi, Shihao Dong
-
MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification
Harrison Gietz, Jugal Kalita
-
NoiSec: Harnessing Noise for Security against Adversarial and Backdoor Attacks
Md Hasan Shahriar, Ning Wang, Y. Thomas Hou, Wenjing Lou
-
DLP: towards active defense against backdoor attacks with decoupled learning process
Zonghao Ying, Bin Wu
-
Saliency Attention and Semantic Similarity-Driven Adversarial Perturbation
Hetvi Waghela, Jaydip Sen, Sneha Rakshit
-
Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection
Sungwon Park, Sungwon Han, Meeyoung Cha
-
Knowledge-to-Jailbreak: One Knowledge Point Worth One Attack
Shangqing Tu, Zhuoran Pan, Wenxuan Wang, Zhexin Zhang, Yuliang Sun, Jifan Yu, Hongning Wang, Lei Hou, Juanzi Li
-
Lingrui Mei, Shenghua Liu, Yiwei Wang, Baolong Bi, Jiayi Mao, Xueqi Cheng
-
Harmonizing Feature Maps: A Graph Convolutional Approach for Enhancing Adversarial Robustness
Kejia Zhang, Juanjuan Weng, Junwei Wu, Guoqing Yang, Shaozi Li, Zhiming Luo
-
Yang Lou, Yi Zhu, Qun Song, Rui Tan, Chunming Qiao, Wei-Bin Lee, Jianping Wang
-
Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness
Maayan Ehrenberg, Roy Ganz, Nir Rosenfeld
-
Obfuscating IoT Device Scanning Activity via Adversarial Example Generation
Haocong Li, Yaxin Zhang, Long Cheng, Wenjia Niu, Haining Wang, Qiang Li
-
Is poisoning a real threat to LLM alignment? Maybe more so than you think
Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang
-
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Robert Hönig, Javier Rando, Nicholas Carlini, Florian Tramèr
-
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran
-
KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs
Aihua Pei, Zehua Yang, Shunan Zhu, Ruoxi Cheng, Ju Jia, Lina Wang
-
Wenhan Yao, Jiangkun Yang, Yongqiang He, Jia Liu, Weiping Wen
-
Yuqing Wang, Yun Zhao
-
Imperceptible Face Forgery Attack via Adversarial Semantic Mask
Decheng Liu, Qixuan Su, Chunlei Peng, Nannan Wang, Xinbo Gao
-
Improving Adversarial Robustness via Decoupled Visual Representation Masking
Decheng Liu, Tao Chen, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao
-
Graph Neural Backdoor: Fundamentals, Methodologies, Applications, and Future Directions
Xiao Yang, Gaolei Li, Jianhua Li
-
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
Rui Ye, Jingyi Chai, Xiangrui Liu, Yaodong Yang, Yanfeng Wang, Siheng Chen
-
Trading Devil: Robust backdoor attack via Stochastic investment models and Bayesian approach
Orson Mengara
-
E-SAGE: Explainability-based Defense Against Backdoor Attacks on Graph Neural Networks
Dingqiang Yuan, Xiaohua Xu, Lei Yu, Tongchang Han, Rongchang Li, Meng Han
-
Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model
Siemen Herremans, Ali Anwar, Siegfried Mercelis
-
Bag of Lies: Robustness in Continuous Pre-training BERT
Ine Gevers, Walter Daelemans
-
Robustness-Inspired Defense Against Backdoor Attacks on Graph Neural Networks
Zhiwei Zhang, Minhua Lin, Junjie Xu, Zongyu Wu, Enyan Dai, Suhang Wang
-
Zhang Chen, Luca Demetrio, Srishti Gupta, Xiaoyi Feng, Zhaoqiang Xia, Antonio Emanuele Cinà, Maura Pintor, Luca Oneto, Ambra Demontis, Battista Biggio, Fabio Roli
-
Semantic Membership Inference Attack against Large Language Models
Hamid Mozaffari, Virendra J. Marathe
-
Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models
Changjiang Li, Ren Pang, Bochuan Cao, Jinghui Chen, Fenglong Ma, Shouling Ji, Ting Wang
-
PRISM: A Design Framework for Open-Source Foundation Model Safety
Terrence Neumann, Bryan Jones
-
Adaptive Randomized Smoothing: Certifying Multi-Step Defences against Adversarial Examples
Saiyue Lyu, Shadab Shaikh, Frederick Shpilevskiy, Evan Shelhamer, Mathias Lécuyer
-
Is Diffusion Model Safe? Severe Data Leakage via Gradient-Guided Diffusion Model
Jiayang Meng, Tao Huang, Hong Chen, Cuiping Li
-
Graph Transductive Defense: a Two-Stage Defense for Graph Membership Inference Attacks
Peizhi Niu, Chao Pan, Siheng Chen, Olgica Milenkovic
-
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding
Rui Wang, Liping Chen, Kong AiK Lee, Zhen-Hua Ling
-
Adversarial Evasion Attack Efficiency against Large Language Models
João Vitorino, Eva Maia, Isabel Praça
-
Are Objective Explanatory Evaluation metrics Trustworthy? An Adversarial Analysis
Prithwijit Chowdhury, Mohit Prabhushankar, Ghassan AlRegib, Mohamed Deriche
-
Adversarial Patch for 3D Local Feature Extractor
Yu Wen Pao, Li Chang Lai, Hong-Yi Lin
-
Transformation-Dependent Adversarial Attacks
Yaoteng Tan, Zikui Cai, M. Salman Asif
-
On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models
Hashmat Shadab Malik, Numan Saeed, Asif Hanif, Muzammal Naseer, Mohammad Yaqub, Salman Khan, Fahad Shahbaz Khan
-
RRLS: Robust Reinforcement Learning Suite
Adil Zouitine, David Bertoin, Pierre Clavier, Matthieu Geist, Emmanuel Rachelson
-
Genetic Column Generation for Computing Lower Bounds for Adversarial Classification
Maximilian Penka
-
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
Yitao Xu, Tong Zhang, Sabine Süsstrunk
-
Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey
Shang Wang, Tianqing Zhu, Bo Liu, Ding Ming, Xu Guo, Dayong Ye, Wanlei Zhou
-
Zijin Lin, Yue Zhao, Kai Chen, Jinwen He
-
Yu-Hsiang Huang, Yuche Tsai, Hsiang Hsiao, Hong-Yi Lin, Shou-De Lin
-
Dual Thinking and Perceptual Analysis of Deep Learning Models using Human Adversarial Examples
Kailas Dayanandan, Anand Sinha, Brejesh Lall
-
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study
Yichi Zhang, Yao Huang, Yitong Sun, Chang Liu, Zhe Zhao, Zhengwei Fang, Yifan Wang, Huanran Chen, Xiao Yang, Xingxing Wei, Hang Su, Yinpeng Dong, Jun Zhu
-
Merging Improves Self-Critique Against Jailbreak Attacks
Victor Gallego
-
AudioMarkBench: Benchmarking Robustness of Audio Watermarking
Hongbin Liu, Moyang Guo, Zhengyuan Jiang, Lun Wang, Neil Zhenqiang Gong
-
Erasing Radio Frequency Fingerprinting via Active Adversarial Perturbation
Zhaoyi Lu, Wenchao Xu, Ming Tu, Xin Xie, Cunqing Hua, Nan Cheng
-
Out-Of-Context Prompting Boosts Fairness and Robustness in Large Language Model Predictions
Leonardo Cotta, Chris J. Maddison
-
Adversarial Machine Unlearning
Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik, Yang Liu
-
Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models
Xi Li, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang
-
Lurking in the shadows: Unveiling Stealthy Backdoor Attacks against Personalized Federated Learning
Xiaoting Lyu, Yufei Han, Wei Wang, Jingkai Liu, Yongsheng Zhu, Guangquan Xu, Jiqiang Liu, Xiangliang Zhang
-
Reinforced Compressive Neural Architecture Search for Versatile Adversarial Robustness
Dingrong Wang, Hitesh Sapkota, Zhiqiang Tao, Qi Yu
-
Shenao Yan, Shen Wang, Yue Duan, Hanbin Hong, Kiho Lee, Doowon Kim, Yuan Hong
-
Shuai Zhao, Meihuizi Jia, Zhongliang Guo, Leilei Gan, Jie Fu, Yichao Feng, Fengjun Pan, Luu Anh Tuan
-
PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection
Wei Li, Pin-Yu Chen, Sijia Liu, Ren Wang
-
Injecting Undetectable Backdoors in Deep Learning and Language Models
Alkis Kalavasis, Amin Karbasi, Argyris Oikonomou, Katerina Sotiraki, Grigoris Velegkas, Manolis Zampetakis
-
DMS: Addressing Information Loss with More Steps for Pragmatic Adversarial Attacks
Zhiyu Zhu, Jiayu Zhang, Xinyi Wang, Zhibo Jin, Huaming Chen
-
Artificial Intelligence as the New Hacker: Developing Agents for Offensive Security
Leroy Jacob Valencia
-
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner
Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel
-
Enhancing Adversarial Transferability via Information Bottleneck Constraints
Biqing Qi, Junqi Gao, Jianxing Liu, Ligang Wu, Bowen Zhou
-
Exploring Adversarial Robustness of Deep State Space Models
Biqing Qi, Yang Luo, Junqi Gao, Pengfei Li, Kai Tian, Zhiyuan Ma, Bowen Zhou
-
Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability
Junqi Gao, Biqing Qi, Yao Li, Zhichang Guo, Dong Li, Yuming Xing, Dazhi Zhang
-
Adversarial flows: A gradient flow characterization of adversarial attacks
Lukas Weigand, Tim Roith, Martin Burger
-
Sales Whisperer: A Human-Inconspicuous Attack on LLM Brand Recommendations
Weiran Lin, Anna Gerchanovsky, Omer Akgul, Lujo Bauer, Matt Fredrikson, Zifan Wang
-
ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks
Feiyang Wang, Xingquan Zuo, Hai Huang, Gang Chen
-
Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks
Lanzino Romeo, Fontana Federico, Diko Anxhelo, Marini Marco Raoul, Cinque Luigi
-
The Price of Implicit Bias in Adversarially Robust Generalization
Nikolaos Tsilivis, Natalie Frank, Nathan Srebro, Julia Kempe
-
Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
Fan Liu, Zhao Xu, Hao Liu
-
Batch-in-Batch: a new adversarial training framework for initial perturbation and sample selection
Yinting Wu, Pai Peng, Bo Cai, Le Li
-
Improving Alignment and Robustness with Short Circuiting
Andy Zou, Long Phan, Justin Wang, Derek Duenas, Maxwell Lin, Maksym Andriushchenko, Rowan Wang, Zico Kolter, Matt Fredrikson, Dan Hendrycks
-
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt
Zonghao Ying, Aishan Liu, Tianyuan Zhang, Zhengmin Yu, Siyuan Liang, Xianglong Liu, Dacheng Tao
-
AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens
Lin Lu, Hai Yan, Zenghui Yuan, Jiawen Shi, Wenqi Wei, Pin-Yu Chen, Pan Zhou
-
PromptFix: Few-shot Backdoor Removal via Adversarial Prompt Tuning
Tianrong Zhang, Zhaohan Xi, Ting Wang, Prasenjit Mitra, Jinghui Chen
-
FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality
Keyu Chen, Yuheng Lei, Hao Cheng, Haoran Wu, Wenchao Sun, Sifa Zheng
-
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents
Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian
-
Jun Liu, Jiantao Zhou, Jiandian Zeng, Jinyu Tian
-
VQUNet: Vector Quantization U-Net for Defending Adversarial Attacks by Regularizing Unwanted Noise
Zhixun He, Mukesh Singhal
-
ZeroPur: Succinct Training-Free Adversarial Purification
Xiuli Bi, Zonglin Yang, Bo Liu, Xiaodong Cun, Chi-Man Pun, Pietro Lio, Bin Xiao
-
Are Your Models Still Fair? Fairness Attacks on Graph Neural Networks via Node Injections
Zihan Luo, Hong Huang, Yongkang Zhou, Jiping Zhang, Nuo Chen
-
Distributional Adversarial Loss
Saba Ahmadi, Siddharth Bhandari, Avrim Blum, Chen Dan, Prabhav Jain
-
Defending Large Language Models Against Attacks With Residual Stream Activation Analysis
Amelia Kawasaki, Andrew Davis, Houssam Abbas
-
Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders
Tingxu Han, Weisong Sun, Ziqi Ding, Chunrong Fang, Hanwei Qian, Jiaxun Li, Zhenyu Chen, Xiangyu Zhang
-
Certifiably Byzantine-Robust Federated Conformal Prediction
Mintong Kang, Zhen Lin, Jimeng Sun, Cao Xiao, Bo Li
-
CR-UTP: Certified Robustness against Universal Text Perturbations
Qian Lou, Xin Liang, Jiaqi Xue, Yancheng Zhang, Rui Xie, Mengxin Zheng
-
QROA: A Black-Box Query-Response Optimization Attack on LLMs
Hussein Jawad, Nicolas J.-B. BRUNEL (LaMME)
-
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
Kengo Uchida, Takashi Shibuya, Yuhta Takida, Naoki Murata, Shusuke Takahashi, Yuki Mitsufuji
-
SVASTIN: Sparse Video Adversarial Attack via Spatio-Temporal Invertible Neural Networks
Yi Pan, Jun-Jie Huang, Zihan Chen, Wentao Zhao, Ziyue Wang
-
Yaohua Liu, Jiaxin Gao, Xuan Liu, Xianghao Jiao, Xin Fan, Risheng Liu
-
Effects of Exponential Gaussian Distribution on (Double Sampling) Randomized Smoothing
Youwei Shu, Xi Xiao, Derui Wang, Yuxin Cao, Siji Chen, Jason Xue, Linyi Li, Bo Li
-
Ai-Sampler: Adversarial Learning of Markov kernels with involutive maps
Evgenii Egorov, Ricardo Valperga, Efstratios Gavves
-
Auditing Privacy Mechanisms via Label Inference Attacks
Róbert István Busa-Fekete, Travis Dick, Claudio Gentile, Andrés Muñoz Medina, Adam Smith, Marika Swanberg
-
BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models
Jiaqi Xue, Mengxin Zheng, Yebowen Hu, Fei Liu, Xun Chen, Qian Lou
-
FedAdOb: Privacy-Preserving Federated Deep Learning with Adaptive Obfuscation
Hanlin Gu, Jiahuan Luo, Yan Kang, Yuan Yao, Gongxi Zhu, Bowen Li, Lixin Fan, Qiang Yang
-
Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
Guanhua Huang, Yuchen Zhang, Zhe Li, Yongjian You, Mingze Wang, Zhouwang Yang
-
Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Huiping Zhuang, Cen Chen
-
Exploring Vulnerabilities and Protections in Large Language Models: A Survey
Frank Weizhen Liu, Chenhui Hu
-
Are you still on track!? Catching LLM Task Drift with Activations
Sahar Abdelnabi, Aideen Fay, Giovanni Cherubin, Ahmed Salem, Mario Fritz, Andrew Paverd
-
Robust Infidelity: When Faithfulness Measures on Masked Language Models Are Misleading
Evan Crothers, Herna Viktor, Nathalie Japkowicz
-
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
-
Adversarial 3D Virtual Patches using Integrated Gradients
Chengzeng You, Zhongyuan Hau, Binbin Xu, Soteris Demetriou
-
Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training
Jan Melechovsky, Ambuj Mehrish, Berrak Sisman, Dorien Herremans
-
Poisoning Attacks and Defenses in Recommender Systems: A Survey
Zongwei Wang, Junliang Yu, Min Gao, Guanhua Ye, Shazia Sadiq, Hongzhi Yin
-
Constraint-based Adversarial Example Synthesis
Fang Yu, Ya-Yu Chi, Yu-Fang Chen
-
Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers
Fatemeh Nourilenjan Nokabadi, Jean-François Lalonde, Christian Gagné
-
Model for Peanuts: Hijacking ML Models without Training Access is Possible
Mahmoud Ghorbel, Halima Bouzidi, Ioan Marius Bilasco, Ihsen Alouani
-
Safeguarding Large Language Models: A Survey
Yi Dong, Ronghui Mu, Yanghao Zhang, Siqi Sun, Tianle Zhang, Changshun Wu, Gaojie Jin, Yi Qi, Jinwei Hu, Jie Meng, Saddek Bensalem, Xiaowei Huang
-
Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits
Andis Draguns, Andrew Gritsevskiy, Sumeet Ramesh Motwani, Charlie Rogers-Smith, Jeffrey Ladish, Christian Schroeder de Witt
-
Stealing Image-to-Image Translation Models With a Single Query
Nurit Spingarn-Eliezer, Tomer Michaeli
-
Invisible Backdoor Attacks on Diffusion Models
Sen Li, Junchi Ma, Minhao Cheng
-
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
Lijia Yu, Shuang Liu, Yibo Miao, Xiao-Shan Gao, Lijun Zhang
-
Thibault Simonetto, Salah Ghamizi, Maxime Cordy
-
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, Daniel Kang
-
A Novel Defense Against Poisoning Attacks on Federated Learning: LayerCAM Augmented with Autoencoder
Jingjing Zheng, Xin Yuan, Kai Li, Wei Ni, Eduardo Tovar, Jon Crowcroft
-
Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu
-
Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model
Jinyin Chen, Xiaoming Zhao, Haibin Zheng, Xiao Li, Sheng Xiang, Haifeng Guo
-
Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens
Jiahao Yu, Haozheng Luo, Jerry Yao-Chieh, Wenbo Guo, Han Liu, Xinyu Xing
-
GI-NAS: Boosting Gradient Inversion Attacks through Adaptive Neural Architecture Search
Wenbo Yu, Hao Fang, Bin Chen, Xiaohang Sui, Chuan Chen, Hao Wu, Shu-Tao Xia, Ke Xu
-
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
Feiteng Fang, Yuelin Bai, Shiwen Ni, Min Yang, Xiaojun Chen, Ruifeng Xu
-
Certifying Global Robustness for Deep Neural Networks
You Li, Guannan Zhao, Shuyu Kong, Yunqi He, Hai Zhou
-
Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization
Yisu Liu, Jinyang An, Wanqian Zhang, Dayan Wu, Jingzi Gu, Zheng Lin, Weiping Wang
-
GANcrop: A Contrastive Defense Against Backdoor Attacks in Federated Learning
Xiaoyun Gan, Shanyu Gan, Taizhi Su, Peng Liu
-
ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bo Li, Radha Poovendran
-
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
Xiaojun Jia, Tianyu Pang, Chao Du, Yihao Huang, Jindong Gu, Yang Liu, Xiaochun Cao, Min Lin
-
Investigating and unmasking feature-level vulnerabilities of CNNs to adversarial perturbations
Davide Coppola, Hwee Kuan Lee
-
Query Provenance Analysis for Robust and Efficient Query-based Black-box Attack Defense
Shaofei Li, Ziqi Zhang, Haomin Jia, Ding Li, Yao Guo, Xiangqun Chen
-
BackdoorIndicator: Leveraging OOD Data for Proactive Backdoor Detection in Federated Learning
Songze Li, Yanbo Dai
-
RASE: Efficient Privacy-preserving Data Aggregation against Disclosure Attacks for IoTs
Zuyan Wang, Jun Tao, Dika Zou
-
Exfiltration of personal information from ChatGPT via prompt injection
Gregory Schwartzman
-
HOLMES: to Detect Adversarial Examples with Multiple Detectors
Jing Wen
-
Efficient LLM-Jailbreaking by Introducing Visual Modality
Zhenxing Niu, Yuyao Sun, Haodong Ren, Haoxuan Ji, Quan Wang, Xiaoke Ma, Gang Hua, Rong Jin
-
Context Injection Attacks on Large Language Models
Cheng'an Wei, Kai Chen, Yue Zhao, Yujia Gong, Lu Xiang, Shenchen Zhu
-
Large Language Model Watermark Stealing With Mixed Integer Programming
Zhaoxi Zhang, Xiaomei Zhang, Yanjun Zhang, Leo Yu Zhang, Chao Chen, Shengshan Hu, Asif Gill, Shirui Pan
-
AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization
Jiawei Chen, Xiao Yang, Zhengwei Fang, Yu Tian, Yinpeng Dong, Zhaoxia Yin, Hang Su
-
DiffPhysBA: Diffusion-based Physical Backdoor Attack against Person Re-Identification in Real-World
Wenli Sun, Xinyang Jiang, Dongsheng Li, Cairong Zhao
-
Hao Cheng, Erjia Xiao, Jiahang Cao, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu
-
Weilin Lin, Li Liu, Shaokui Wei, Jianze Li, Hui Xiong
-
BAN: Detecting Backdoors Activated by Adversarial Neuron Noise
Xiaoyun Xu, Zhuoran Liu, Stefanos Koffas, Shujian Yu, Stjepan Picek
-
Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable
Martin Bertran, Shuai Tang, Michael Kearns, Jamie Morgenstern, Aaron Roth, Zhiwei Steven Wu
-
Evaluating the Effectiveness and Robustness of Visual Similarity-based Phishing Detection Models
Fujiao Ji, Kiho Lee, Hyungjoon Koo, Wenhao You, Euijin Choo, Hyoungshick Kim, Doowon Kim
-
Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks
Chen Xiong, Xiangyu Qi, Pin-Yu Chen, Tsung-Yi Ho
-
Maya Anderson, Guy Amit, Abigail Goldsteen
-
Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
Haibo Jin, Andy Zhou, Joe D. Menke, Haohan Wang
-
Phantom: General Trigger Attacks on Retrieval Augmented Language Generation
Harsh Chaudhari, Giorgio Severi, John Abascal, Matthew Jagielski, Christopher A. Choquette-Choo, Milad Nasr, Cristina Nita-Rotaru, Alina Oprea
-
Is Synthetic Data all We Need? Benchmarking the Robustness of Models Trained with Synthetic Images
Krishnakant Singh, Thanush Navaratnam, Jannik Holmer, Simone Schaub-Meyer, Stefan Roth
-
Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Yujia Liu, Tong Bu, Jianhao Ding, Zecheng Hao, Tiejun Huang, Zhaofei Yu
-
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents
Ethan Rathbun, Christopher Amato, Alina Oprea
-
Leveraging Many-To-Many Relationships for Defending Against Visual-Language Adversarial Attacks
Futa Waseda, Antonio Tejero-de-Pablos
-
EntProp: High Entropy Propagation for Improving Accuracy and Robustness
Shohei Enomoto
-
Verifiably Robust Conformal Prediction
Linus Jeary, Tom Kuipers, Mehran Hosseini, Nicola Paoletti
-
DiveR-CT: Diversity-enhanced Red Teaming with Relaxing Constraints
Andrew Zhao, Quentin Xu, Matthieu Lin, Shenzhi Wang, Yong-jin Liu, Zilong Zheng, Gao Huang
-
Convex neural network synthesis for robustness in the 1-norm
Ross Drummond, Chris Guiver, Matthew C. Turner
-
Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu
-
Saurabh Pathak, Samridha Shrestha, Abdelrahman AlMahmoud
-
Robust Entropy Search for Safe Efficient Bayesian Optimization
Dorina Weichert, Alexander Kister, Patrick Link, Sebastian Houben, Gunar Ernis
-
Voice Jailbreak Attacks Against GPT-4o
Xinyue Shen, Yixin Wu, Michael Backes, Yang Zhang
-
PermLLM: Private Inference of Large Language Models within 3 Seconds under WAN
Fei Zheng, Chaochao Chen, Zhongxuan Han, Xiaolin Zheng
-
Node Injection Attack Based on Label Propagation Against Graph Neural Network
Peican Zhu, Zechen Pan, Keke Tang, Xiaodong Cui, Jinhuan Wang, Qi Xuan
-
AI Risk Management Should Incorporate Both Safety and Security
Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal
-
Diffusion Policy Attacker: Crafting Adversarial Attacks for Diffusion-based Policies
Yipu Chen, Haotian Xue, Yongxin Chen
-
Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing
Wei Zhao, Zhe Li, Yige Li, Ye Zhang, Jun Sun
-
Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding
Daniel Bethell, Simos Gerasimou, Radu Calinescu, Calum Imrie
-
Rethinking Pruning for Backdoor Mitigation: An Optimization Perspective
Nan Li, Haiyang Yu, Ping Yi
-
Magnitude-based Neuron Pruning for Backdoor Defense
Nan Li, Haoyu Jiang, Ping Yi
-
White-box Multimodal Jailbreaks Against Large Vision-Language Models
Ruofan Wang, Xingjun Ma, Hanxu Zhou, Chuanjun Ji, Guangnan Ye, Yu-Gang Jiang
-
ATM: Adversarial Tuning Multi-agent System Makes a Robust Retrieval-Augmented Generator
Junda Zhu, Lingyong Yan, Haibo Shi, Dawei Yin, Lei Sha
-
Towards Unified Robustness Against Both Backdoor and Adversarial Attacks
Zhenxing Niu, Yuyao Sun, Qiguang Miao, Rong Jin, Gang Hua
-
Cross-Context Backdoor Attacks against Graph Prompt Learning
Xiaoting Lyu, Yufei Han, Wei Wang, Hangwei Qian, Ivor Tsang, Xiangliang Zhang
-
Channel Reciprocity Based Attack Detection for Securing UWB Ranging by Autoencoder
Wenlong Gou, Chuanhang Yu, Juntao Ma, Gang Wu, Vladimir Mordachev
-
Learning diverse attacks on large language models for robust red-teaming and safety tuning
Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain
-
Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning
Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu
-
Improved Generation of Adversarial Examples Against Safety-aligned LLMs
Qizhang Li, Yiwen Guo, Wangmeng Zuo, Hao Chen
-
Stochastic Adversarial Networks for Multi-Domain Text Classification
Xu Wang, Yuan Wu
-
TrojFM: Resource-efficient Backdoor Attacks against Very Large Foundation Models
Yuzhou Nie, Yanting Wang, Jinyuan Jia, Michael J. De Lucia, Nathaniel D. Bastian, Wenbo Guo, Dawn Song
-
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Dixuan Wang, Yanda Li, Junyuan Jiang, Zepeng Ding, Guochao Jiang, Jiaqing Liang, Deqing Yang
-
A One-Layer Decoder-Only Transformer is a Two-Layer RNN: With an Application to Certified Robustness
Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni
-
Exploiting the Layered Intrinsic Dimensionality of Deep Models for Practical Adversarial Training
Enes Altinisik, Safa Messaoud, Husrev Taha Sencar, Hassan Sajjad, Sanjay Chawla
-
Privacy-Aware Visual Language Models
Laurens Samson, Nimrod Barazani, Sennay Ghebreab, Yuki M. Asano
-
Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation
Liang Shi, Jie Zhang, Shiguang Shan
-
Adversarial Attacks on Both Face Recognition and Face Anti-spoofing Models
Fengfan Zhou, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Lizhuang Ma, Hefei Ling
-
Spectral regularization for adversarially-robust representation learning
Sheng Yang, Jacob A. Zavatone-Veth, Cengiz Pehlevan
-
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu, Yu-Lin Tsai, Chih-Hsun Lin, Pin-Yu Chen, Chia-Mu Yu, Chun-Ying Huang
-
The Uncanny Valley: Exploring Adversarial Robustness from a Flatness Perspective
Nils Philipp Walter, Linara Adilova, Jilles Vreeken, Michael Kamp
-
OSLO: One-Shot Label-Only Membership Inference Attacks
Yuefeng Peng, Jaechul Roh, Subhransu Maji, Amir Houmansadr
-
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models
ShengYun Peng, Pin-Yu Chen, Matthew Hull, Duen Horng Chau
-
Fengji Ma, Li Liu, Hei Victor Cheng
-
Exploring Backdoor Attacks against Large Language Model-based Decision Making
Ruochen Jiao, Shaoyuan Xie, Justin Yue, Takami Sato, Lixu Wang, Yixuan Wang, Qi Alfred Chen, Qi Zhu
-
Automatic Jailbreaking of the Text-to-Image Generative AI Systems
Minseon Kim, Hyomin Lee, Boqing Gong, Huishuai Zhang, Sung Ju Hwang
-
Visualizing the Shadows: Unveiling Data Poisoning Behaviors in Federated Learning
Xueqing Zhang, Junkai Zhang, Ka-Ho Chow, Juntao Chen, Ying Mao, Mohamed Rahouti, Xiang Li, Yuchen Liu, Wenqi Wei
-
Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level
Runlin Lei, Yuwei Hu, Yuchen Ren, Zhewei Wei
-
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose Blanchet, Zhaoran Wang
-
Partial train and isolate, mitigate backdoor attack
Yong Li, Han Gao
-
Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang, Juan Cao, Chang Xu
-
Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models
Xijie Huang, Xinyuan Wang, Hantao Zhang, Jiawen Xi, Jingkun An, Hao Wang, Chengwei Pan
-
Diffusion-Reward Adversarial Imitation Learning
Chun-Mao Lai, Hsiang-Chun Wang, Ping-Chun Hsieh, Yu-Chiang Frank Wang, Min-Hung Chen, Shao-Hua Sun
-
Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack
Mingli Zhu, Siyuan Liang, Baoyuan Wu
-
Enhancing Adversarial Transferability Through Neighborhood Conditional Sampling
Chunlin Qiu, Yiheng Duan, Lingchen Zhao, Qian Wang
-
Detecting Adversarial Data via Perturbation Forgery
Qian Wang, Chen Li, Yuchen Luo, Hefei Ling, Ping Li, Jiazhong Chen, Shijuan Huang, Ning Yu
-
Shelly Golan, Roy Ganz, Michael Elad
-
R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model
Changhoon Kim, Kyle Min, Yezhou Yang
-
Certifying Adapters: Enabling and Enhancing the Certification of Classifier Adversarial Robustness
Jieren Deng, Hanbin Hong, Aaron Palmer, Xin Zhou, Jinbo Bi, Kaleel Mahmood, Yuan Hong, Derek Aguiar
-
Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor
Shaokui Wei, Hongyuan Zha, Baoyuan Wu
-
Layer-Aware Analysis of Catastrophic Overfitting: Revealing the Pseudo-Robust Shortcut Dependency
Runqi Lin, Chaojian Yu, Bo Han, Hang Su, Tongliang Liu
-
M. Saeid HaghighiFard, Sinem Coleri
-
Towards Black-Box Membership Inference Attack for Diffusion Models
Jingwei Li, Jing Dong, Tianxing He, Jingzhao Zhang
-
Siyuan Ma, Weidi Luo, Yu Wang, Xiaogeng Liu, Muhao Chen, Bo Li, Chaowei Xiao
-
How Does Bayes Error Limit Probabilistic Robust Accuracy?
Ruihan Zhang, Jun Sun
-
RFLPA: A Robust Federated Learning Framework against Poisoning Attacks with Secure Aggregation
Peihua Mai, Ran Yan, Yan Pang
-
Coordinated Disclosure for AI: Beyond Security Vulnerabilities
Sven Cattell, Avijit Ghosh, Lucie-Aimée Kaffee
-
Robust Diffusion Models for Adversarial Purification
Guang Lin, Zerui Tao, Jianhai Zhang, Toshihisa Tanaka, Qibin Zhao
-
Certifiably Robust RAG against Retrieval Corruption
Chong Xiang, Tong Wu, Zexuan Zhong, David Wagner, Danqi Chen, Prateek Mittal
-
Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
Yimeng Zhang, Xin Chen, Jinghan Jia, Yihua Zhang, Chongyu Fan, Jiancheng Liu, Mingyi Hong, Ke Ding, Sijia Liu
-
BDetCLIP: Multimodal Prompting Contrastive Test-Time Backdoor Detection
Yuwei Niu, Shuo He, Qi Wei, Feng Liu, Lei Feng
-
Scale-Invariant Feature Disentanglement via Adversarial Learning for UAV-based Object Detection
Fan Liu, Liang Yao, Chuanyi Zhang, Ting Wu, Xinlei Zhang, Jun Zhou, Xiruo Jiang
-
Better Membership Inference Privacy Measurement through Discrepancy
Ruihan Wu, Pengrun Huang, Kamalika Chaudhuri
-
Adversarial Attacks on Hidden Tasks in Multi-Task Learning
Yu Zhe, Rei Nagaike, Daiki Nishiyama, Kazuto Fukuchi, Jun Sakuma
-
Decaf: Data Distribution Decompose Attack against Federated Learning
Zhiyang Dai, Chunyi Zhou, Anmin Fu
-
Florent Guépin, Nataša Krčo, Matthieu Meeus, Yves-Alexandre de Montjoye
-
DAGER: Exact Gradient Inversion for Large Language Models
Ivo Petrov, Dimitar I. Dimitrov, Maximilian Baader, Mark Niklas Müller, Martin Vechev
-
Efficient Adversarial Training in LLMs with Continuous Attacks
Sophie Xhonneux, Alessandro Sordoni, Stephan Günnemann, Gauthier Gidel, Leo Schwinn
-
TrojanForge: Adversarial Hardware Trojan Examples with Reinforcement Learning
Amin Sarihi, Peter Jamieson, Ahmad Patooghy, Abdel-Hameed A. Badawy
-
Adversarial Imitation Learning from Visual Observations using Latent Information
Vittorio Giammarino, James Queeney, Ioannis Ch. Paschalidis
-
Simon Chi Lok Yu, Jie He, Pasquale Minervini, Jeff Z. Pan
-
A Neurosymbolic Framework for Bias Correction in CNNs
Parth Padalkar, Natalia Ślusarz, Ekaterina Komendantskaya, Gopal Gupta
-
Robust width: A lightweight and certifiable adversarial defense
Jonathan Peck, Bart Goossens
-
Can Implicit Bias Imply Adversarial Robustness?
Hancheng Min, René Vidal
-
BadGD: A unified data-centric framework to identify gradient descent vulnerabilities
Chi-Hua Wang, Guang Cheng
-
Robustifying Safety-Aligned Large Language Models through Clean Data Curation
Xiaoqun Liu, Jiacheng Liang, Muchao Ye, Zhaohan Xi
-
ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users
Guanlin Li, Kangjie Chen, Shudong Zhang, Jie Zhang, Tianwei Zhang
-
Large Language Model Sentinel: Advancing Adversarial Robustness by LLM Agent
Guang Lin, Qibin Zhao
-
Learning to Transform Dynamically for Better Adversarial Transferability
Rongyi Zhu, Zeliang Zhang, Susan Liang, Zhuo Liu, Chenliang Xu
-
Certified Robustness against Sparse Adversarial Perturbations via Data Localization
Ambar Pal, René Vidal, Jeremias Sulam
-
SLIFER: Investigating Performance and Robustness of Malware Detection Pipelines
Andrea Ponte, Dmitrijs Trizna, Luca Demetrio, Battista Biggio, Fabio Roli
-
Semantic-guided Prompt Organization for Universal Goal Hijacking against LLMs
Yihao Huang, Chong Wang, Xiaojun Jia, Qing Guo, Felix Juefei-Xu, Jian Zhang, Geguang Pu, Yang Liu
-
MoGU: A Framework for Enhancing Safety of Open-Sourced LLMs While Preserving Their Usability
Yanrui Du, Sendong Zhao, Danyang Zhao, Ming Ma, Yuhan Chen, Liangyu Huo, Qing Yang, Dongliang Xu, Bing Qin
-
Impact of Non-Standard Unicode Characters on Security and Comprehension in Large Language Models
Johan S Daniel, Anand Pal
-
Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li
-
Towards Transferable Attacks Against Vision-LLMs in Autonomous Driving with Typography
Nhat Chung, Sensen Gao, Tuan-Anh Vu, Jie Zhang, Aishan Liu, Yun Lin, Jin Song Dong, Qing Guo
-
Eidos: Efficient, Imperceptible Adversarial 3D Point Clouds
Hanwei Zhang, Luo Cheng, Qisong He, Wei Huang, Renjue Li, Ronan Sicre, Xiaowei Huang, Holger Hermanns, Lijun Zhang
-
Towards Imperceptible Backdoor Attack in Self-supervised Learning
Hanrong Zhang, Zhenting Wang, Tingxu Han, Mingyu Jin, Chenlu Zhan, Mengnan Du, Hongwei Wang, Shiqing Ma
-
Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy
Shengfang Zhai, Huanran Chen, Yinpeng Dong, Jiajun Li, Qingni Shen, Yansong Gao, Hang Su, Yang Liu
-
Boosting Robustness by Clipping Gradients in Distributed Learning
Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Ahmed Jellouli, Geovani Rizk, John Stephan
-
Identity Inference from CLIP Models using Only Textual Data
Songze Li, Ruoxi Cheng, Xiaojun Jia
-
A New Formulation for Zeroth-Order Optimization of Adversarial EXEmples in Malware Detection
Marco Rando, Luca Demetrio, Lorenzo Rosasco, Fabio Roli
-
Nearly Tight Black-Box Auditing of Differentially Private Machine Learning
Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro
-
Generating camera failures as a class of physics-based adversarial examples
Manav Prabhakar, Jwalandhar Girnar, Arpan Kusari
-
Universal Robustness via Median Randomized Smoothing for Real-World Super-Resolution
Zakariya Chaouai, Mohamed Tamaazousti
-
Adversarial Training via Adaptive Knowledge Amalgamation of an Ensemble of Teachers
Shayan Mohajer Hamidi, Linfeng Ye
-
Safety Alignment for Vision Language Models
Zhendong Liu, Yuanbi Nie, Yingshui Tan, Xiangyu Yue, Qiushi Cui, Chongjun Wang, Xiaoyong Zhu, Bo Zheng
-
DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks
Patrik Velčický, Jakub Breier, Xiaolu Hou, Mladen Kovačević
-
Weixiang Zhao, Yulin Hu, Zhuojun Li, Yang Deng, Yanyan Zhao, Bing Qin, Tat-Seng Chua
-
TrojanRAG: Retrieval-Augmented Generation Can Be Backdoor Driver in Large Language Models
Pengzhou Cheng, Yidong Ding, Tianjie Ju, Zongru Wu, Wei Du, Ping Yi, Zhuosheng Zhang, Gongshen Liu
-
Towards Certification of Uncertainty Calibration under Adversarial Attacks
Cornelius Emde, Francesco Pinto, Thomas Lukasiewicz, Philip H.S. Torr, Adel Bibi
-
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response
Tianrong Zhang, Bochuan Cao, Yuanpu Cao, Lu Lin, Prasenjit Mitra, Jinghui Chen
-
Adversarial Training of Two-Layer Polynomial and ReLU Activation Networks via Convex Optimization
Daniel Kuelbs, Sanjay Lall, Mert Pilanci
-
Memory Scraping Attack on Xilinx FPGAs: Private Data Extraction from Terminated Processes
Bharadwaj Madabhushi, Sandip Kundu, Daniel Holcomb
-
Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
Jiaqi Li, Qianshan Wei, Chuanyi Zhang, Guilin Qi, Miaozeng Du, Yongrui Chen, Sheng Bi
-
Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming
Jiaxu Liu, Xiangyu Yin, Sihao Wu, Jianhong Wang, Meng Fang, Xinping Yi, Xiaowei Huang
-
Generative AI and Large Language Models for Cyber Security: All Insights You Need
Mohamed Amine Ferrag, Fatima Alwahedi, Ammar Battah, Bilel Cherif, Abdechakour Mechri, Norbert Tihanyi
-
Transparency Distortion Robustness for SOTA Image Segmentation Tasks
Volker Knauthe, Arne Rak, Tristan Wirth, Thomas Pöllabauer, Simon Metzler, Arjan Kuijper, Dieter W. Fellner
-
Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks
Boheng Li, Yishuo Cai, Haowei Li, Feng Xue, Zhifeng Li, Yiming Li
-
Zerui Zhang, Zhichao Sun, Zelong Liu, Bo Du, Rui Yu, Zhou Zhao, Yongchao Xu
-
Robust Classification via a Single Diffusion Model
Huanran Chen, Yinpeng Dong, Zhengyi Wang, Xiao Yang, Chengqi Duan, Hang Su, Jun Zhu
-
Fan Shi, Chong Zhang, Takahiro Miki, Joonho Lee, Marco Hutter, Stelian Coros
-
How to Train a Backdoor-Robust Model on a Poisoned Dataset without Auxiliary Data?
Yuwen Pu, Jiahao Chen, Chunyi Zhou, Zhou Feng, Qingming Li, Chunqiang Hu, Shouling Ji
-
A Stealthy Backdoor Attack for Without-Label-Sharing Split Learning
Yuwen Pu, Zhuoyuan Ding, Jiahao Chen, Chunyi Zhou, Qingming Li, Chunqiang Hu, Shouling Ji
-
Rethinking the Vulnerabilities of Face Recognition Systems: From a Practical Perspective
Jiahao Chen, Zhiqiang Shen, Yuwen Pu, Chunyi Zhou, Shouling Ji
-
GPT-4 Jailbreaks Itself with Near-Perfect Success Using Self-Explanation
Govind Ramesh, Yao Dou, Wei Xu
-
Interactive Simulations of Backdoors in Neural Networks
Peter Bajcsy, Maxime Bros
-
Yuwen Qian, Shuchi Wu, Kang Wei, Ming Ding, Di Xiao, Tao Xiang, Chuan Ma, Song Guo
-
A novel reliability attack of Physical Unclonable Functions
Gaoxiang Li, Yu Zhuang
-
Fed-Credit: Robust Federated Learning with Credibility Management
Jiayan Chen, Zhirong Qian, Tianhui Meng, Xitong Gao, Tian Wang, Weijia Jia
-
Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space
Qianmei Liu, Yufei Kuang, Jie Wang
-
Adaptive Batch Normalization Networks for Adversarial Robustness
Shao-Yuan Lo, Vishal M. Patel
-
Hikmat Khan, Ghulam Rasool, Nidhal Carla Bouaynaya
-
Data Contamination Calibration for Black-box LLMs
Wentao Ye, Jiaqi Hu, Liyao Li, Haobo Wang, Gang Chen, Junbo Zhao
-
Decentralized Privacy Preservation for Critical Connections in Graphs
Conggai Li, Wei Ni, Ming Ding, Youyang Qu, Jianjun Chen, David Smith, Wenjie Zhang, Thierry Rakotoarivelo
-
GAN-GRID: A Novel Generative Attack on Smart Grid Stability Prediction
Emad Efatinasab, Alessandro Brighente, Mirco Rampazzo, Nahal Azadi, Mauro Conti
-
Efficient Model-Stealing Attacks Against Inductive Graph Neural Networks
Marcin Podhajski, Jan Dubiński, Franziska Boenisch, Adam Dziedzic, Agnieszka Pregowska, Tomasz Michalak
-
Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation
Yuxi Li, Yi Liu, Yuekang Li, Ling Shi, Gelei Deng, Shengquan Chen, Kailong Wang
-
An Invisible Backdoor Attack Based On Semantic Feature
Yangming Chen
-
Searching Realistic-Looking Adversarial Objects For Autonomous Driving Systems
Shengxiang Sun, Shenzhe Zhu
-
A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers
Tom Roth, Inigo Jauregi Unanue, Alsharif Abuadbba, Massimo Piccardi
-
On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks
Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester
-
Certified Robust Accuracy of Neural Networks Are Bounded due to Bayes Errors
Ruihan Zhang, Jun Sun
-
A GAN-Based Data Poisoning Attack Against Federated Learning Systems and Its Countermeasure
Wei Sun, Bo Gao, Ke Xiong, Yuwei Wang, Pingyi Fan, Khaled Ben Letaief
-
Sketches-based join size estimation under local differential privacy
Meifan Zhang, Xin Liu, Lihua Yin
-
BOSC: A Backdoor-based Framework for Open Set Synthetic Image Attribution
Jun Wang, Benedetta Tondi, Mauro Barni
-
Revisiting the Robust Generalization of Adversarial Prompt Tuning
Fan Yang, Mingxuan Xia, Sangzhou Xia, Chicheng Ma, Hui Hui
-
Trustworthy Actionable Perturbations
Jesse Friedbaum, Sudarshan Adiga, Ravi Tandon
-
Thanh Nguyen, Tung M. Luu, Tri Ton, Chang D. Yoo
-
Xuanli He, Qiongkai Xu, Jun Wang, Benjamin I. P. Rubinstein, Trevor Cohn
-
BadActs: A Universal Backdoor Defense in the Activation Space
Biao Yi, Sishuo Chen, Yiming Li, Tong Li, Baolei Zhang, Zheli Liu
-
Duo Peng, Qiuhong Ke, Jun Liu
-
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA
Weitao Feng, Wenbo Zhou, Jiyan He, Jie Zhang, Tianyi Wei, Guanlin Li, Tianwei Zhang, Weiming Zhang, Nenghai Yu
-
Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning
Udi Aharon, Revital Marbel, Ran Dubin, Amit Dvir, Chen Hajaj
-
Detecting Complex Multi-step Attacks with Explainable Graph Neural Network
Wei Liu, Peng Gao, Haotian Zhang, Ke Li, Weiyong Yang, Xingshen Wei, Shuji Wu
-
MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection
Ximiao Zhang, Min Xu, Dehui Qiu, Ruixin Yan, Ning Lang, Xiuzhuang Zhou
-
Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors
Jiachen Sun, Changsheng Wang, Jiongxiao Wang, Yiwei Zhang, Chaowei Xiao
-
Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
Sheng Yang, Jiawang Bai, Kuofeng Gao, Yong Yang, Yiming Li, Shu-tao Xia
-
Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation
Yixing Huang, Zahra Khodabakhshi, Ahmed Gomaa, Manuel Schmidt, Rainer Fietkau, Matthias Guckenberger, Nicolaus Andratschke, Christoph Bert, Stephanie Tanadini-Lang, Florian Putz
-
Rethinking Graph Backdoor Attacks: A Distribution-Preserving Perspective
Zhiwei Zhang, Minhua Lin, Enyan Dai, Suhang Wang
-
Boosting Few-Pixel Robustness Verification via Covering Verification Designs
Yuval Shapira, Naor Wiesel, Shahar Shabelman, Dana Drachsler-Cohen
-
Generative AI for Secure and Privacy-Preserving Mobile Crowdsensing
Yaoqi Yang, Bangning Zhang, Daoxing Guo, Hongyang Du, Zehui Xiong, Dusit Niyato, Zhu Han
-
Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks
Haonan An, Guang Hua, Zhiping Lin, Yuguang Fang
-
DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection
Yuhao Sun, Lingyun Yu, Hongtao Xie, Jiaming Li, Yongdong Zhang
-
Keep It Private: Unsupervised Privatization of Online Text
Calvin Bao, Marine Carpuat
-
Abdulrahman Alabdulakreem, Christian M Arnold, Yerim Lee, Pieter M Feenstra, Boris Katz, Andrei Barbu
-
Infrared Adversarial Car Stickers
Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu, Jianmin Li, Xiaolin Hu
-
Adversarial Robustness for Visual Grounding of Multimodal Large Language Models
Kuofeng Gao, Yang Bai, Jiawang Bai, Yong Yang, Shu-Tao Xia
-
IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency
Linshan Hou, Ruili Feng, Zhongyun Hua, Wei Luo, Leo Yu Zhang, Yiming Li
-
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
Eslam Zaher, Maciej Trzaskowski, Quan Nguyen, Fred Roosta
-
Yichuan Shi, Olivera Kotevska, Viktor Reshniak, Abhishek Singh, Ramesh Raskar
-
Adversarial Robustness Guarantees for Quantum Classifiers
Neil Dowling, Maxwell T. West, Angus Southwell, Azar C. Nakhl, Martin Sevior, Muhammad Usman, Kavan Modi
-
Learnable Privacy Neurons Localization in Language Models
Ruizhe Chen, Tianxiang Hu, Yang Feng, Zuozhu Liu
-
Meenatchi Sundaram Muthu Selva Annamalai, Georgi Ganev, Emiliano De Cristofaro
-
Haiyu Wu, Sicong Tian, Jacob Gutierrez, Aman Bhatta, Kağan Öztürk, Kevin W. Bowyer
-
Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization
Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson
-
Cross-Input Certified Training for Universal Perturbations
Changming Xu, Gagandeep Singh
-
Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer
Weifei Jin, Yuxin Cao, Junjie Su, Qi Shen, Kai Ye, Derui Wang, Jie Hao, Ziyao Liu
-
Words Blending Boxes. Obfuscating Queries in Information Retrieval using Differential Privacy
Francesco Luigi De Faveri, Guglielmo Faggioli, Nicola Ferro
-
Benjamin Camus, Théo Voillemin, Corentin Le Barbu, Jean-Christophe Louvigné, Carole Belloni, Emmanuel Vallée
-
Properties that allow or prohibit transferability of adversarial attacks among quantized networks
Abhishek Shrestha, Jürgen Großmann
-
DP-RuL: Differentially-Private Rule Learning for Clinical Decision Support Systems
Josephine Lamp, Lu Feng, David Evans
-
Anthony M. Barrett, Krystal Jackson, Evan R. Murphy, Nada Madkour, Jessica Newman
-
Khoi Tran Dang, Kevin Delmas, Jérémie Guiochet, Joris Guérin
-
Towards Safe Large Language Models for Medicine
Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju
-
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models
Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff
-
UnMarker: A Universal Attack on Defensive Watermarking
Andre Kassis, Urs Hengartner
-
Boqi Chen, Kristóf Marussy, Oszkár Semeráth, Gunter Mussbacher, Dániel Varró
-
Differentially Private Federated Learning: A Systematic Review
Jie Fu, Yuan Hong, Xinpeng Ling, Leixia Wang, Xun Ran, Zhiyu Sun, Wendy Hui Wang, Zhili Chen, Yang Cao
-
Work-in-Progress: Crash Course: Can (Under Attack) Autonomous Driving Beat Human Drivers?
Francesco Marchiori, Alessandro Brighente, Mauro Conti
-
Adversarial Machine Learning Threats to Spacecraft
Rajiv Thummala, Shristi Sharma, Matteo Calabrese, Gregory Falco
-
Chendi Wang, Yuqing Zhu, Weijie J. Su, Yu-Xiang Wang
-
The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks
Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan
-
RS-Reg: Probabilistic and Robust Certified Regression Through Randomized Smoothing
Aref Miri Rekavandi, Olga Ohrimenko, Benjamin I.P. Rubinstein
-
Private Data Leakage in Federated Human Activity Recognition for Wearable Healthcare Devices
Kongyang Chen, Dongping Zhang, Bing Mi
-
GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation
Andrey V. Galichin, Mikhail Pautov, Alexey Zhavoronkin, Oleg Y. Rogov, Ivan Oseledets
-
Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection
Dehong Kong, Siyuan Liang, Wenqi Ren
-
Qilin Zhou, Zhengyuan Wei, Haipeng Wang, Bo Jiang, W.K. Chan
-
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors
Liam Dugan, Alyssa Hwang, Filip Trhlik, Josh Magnus Ludan, Andrew Zhu, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch
-
Backdoor Removal for Generative Large Language Models
Haoran Li, Yulin Chen, Zihao Zheng, Qi Hu, Chunkit Chan, Heshan Liu, Yangqiu Song
-
Evaluating Google's Protected Audience Protocol
Minjun Long, David Evans
-
Mayank Bakshi, Sara Ghasvarianjahromi, Yauhen Yakimenka, Allison Beemer, Oliver Kosut, Joerg Kliewer
-
Shadow-Free Membership Inference Attacks: Recommender Systems Are More Vulnerable Than You Thought
Xiaoxiao Chi, Xuyun Zhang, Yan Wang, Lianyong Qi, Amin Beheshti, Xiaolong Xu, Kim-Kwang Raymond Choo, Shuo Wang, Hongsheng Hu
-
Disrupting Style Mimicry Attacks on Video Imagery
Josephine Passananti, Stanley Wu, Shawn Shan, Haitao Zheng, Ben Y. Zhao
-
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum
-
Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning
Yujie Zhang, Neil Gong, Michael K. Reiter
-
Disttack: Graph Adversarial Attacks Toward Distributed GNN Training
Yuxiang Zhang, Xin Liu, Meng Wu, Wei Yan, Mingyu Yan, Xiaochun Ye, Dongrui Fan
-
Amira Guesmi, Nishant Suresh Aswani, Muhammad Shafique
-
Juanjuan Weng, Zhiming Luo, Shaozi Li
-
Evaluating Adversarial Robustness in the Spatial Frequency Domain
Keng-Hsin Liao, Chin-Yuan Yeh, Hsi-Wen Chen, Ming-Syan Chen
-
Certified $\ell_2$ Attribution Robustness via Uniformly Smoothed Attributions
Fan Wang, Adams Wai-Kin Kong
-
Risks of Practicing Large Language Models in Smart Grid: Threat Modeling and Validation
Jiangnan Li, Yingyuan Yang, Jinyuan Sun
-
PLeak: Prompt Leaking Attacks against Large Language Model Applications
Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, Yinzhi Cao
-
LLM-Generated Black-box Explanations Can Be Adversarially Helpful
Rohan Ajwani, Shashidhar Reddy Javaji, Frank Rudzicz, Zining Zhu
-
Towards Robust Physical-world Backdoor Attacks on Lane Detection
Xinwei Zhang, Aishan Liu, Tianyuan Zhang, Siyuan Liang, Xianglong Liu
-
Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness
Siyuan Li, Xi Lin, Yaju Liu, Jianhua Li
-
Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM
Xikang Yang, Xuehai Tang, Songlin Hu, Jizhong Han
-
Towards Accurate and Robust Architectures via Neural Architecture Search
Yuwei Ou, Yuqi Feng, Yanan Sun
-
Universal Adversarial Perturbations for Vision-Language Pre-trained Models
Peng-Fei Zhang, Zi Huang, Guangdong Bai
-
Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers
Binxiao Huang, Jason Chun Lok, Chang Liu, Ngai Wong
-
Model Inversion Robustness: Can Transfer Learning Help?
Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran, Ngoc-Bao Nguyen, Ngai-Man Cheung
-
Privacy-Preserving Edge Federated Learning for Intelligent Mobile-Health Systems
Amin Aminifar, Matin Shokri, Amir Aminifar
-
Link Stealing Attacks Against Inductive Graph Neural Networks
Yixin Wu, Xinlei He, Pascal Berrang, Mathias Humbert, Michael Backes, Neil Zhenqiang Gong, Yang Zhang
-
A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data
Meenatchi Sundaram Muthu Selva Annamalai, Andrea Gadotti, Luc Rocher
-
High-Performance Privacy-Preserving Matrix Completion for Trajectory Recovery
Jiahao Guo, An-Bao Xu
-
Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models
Yang Bai, Ge Pei, Jindong Gu, Yong Yang, Xingjun Ma
-
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models
Vyas Raina, Rao Ma, Charles McGhee, Kate Knill, Mark Gales
-
BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization
Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru
-
Hard Work Does Not Always Pay Off: Poisoning Attacks on Neural Architecture Search
Zachary Coalson, Huazheng Wang, Qingyun Wu, Sanghyun Hong
-
Shuo Shao, Yiming Li, Hongwei Yao, Yiling He, Zhan Qin, Kui Ren
-
The Codecfake Dataset and Countermeasures for the Universal Detection of Deepfake Audio
Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun
-
BiasKG: Adversarial Knowledge Graphs to Induce Bias in Large Language Models
Chu Fei Luo, Ahmad Ghawanmeh, Xiaodan Zhu, Faiza Khan Khattak
-
Mitigating Bias Using Model-Agnostic Data Attribution
Sander De Coninck, Wei-Cheng Wang, Sam Leroux, Pieter Simoens
-
Espresso: Robust Concept Filtering in Text-to-Image Models
Anudeep Das, Vasisht Duddu, Rui Zhang, N. Asokan
-
Xuyang Zhong, Yixiao Huang, Chen Liu
-
Adversarial Threats to Automatic Modulation Open Set Recognition in Wireless Networks
Yandie Yang, Sicheng Zhang, Kuixian Li, Qiao Tian, Yun Lin
-
HackCar: a test platform for attacks and defenses on a cost-contained automotive architecture
Dario Stabili, Filip Valgimigli, Edoardo Torrini, Mirco Marchetti
-
Systematic Use of Random Self-Reducibility against Physical Attacks
Ferhat Erata, TingHung Chiu, Anthony Etim, Srilalith Nampally, Tejas Raju, Rajashree Ramu, Ruzica Piskac, Timos Antonopoulos, Wenjie Xiong, Jakub Szefer
-
Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals
Joshua Clymer, Caden Juang, Severin Field
-
Adversary-Guided Motion Retargeting for Skeleton Anonymization
Thomas Carr, Depeng Xu, Aidong Lu
-
Model Reconstruction Using Counterfactual Explanations: Mitigating the Decision Boundary Shift
Pasan Dissanayake, Sanghamitra Dutta
-
Untargeted Adversarial Attack on Knowledge Graph Embeddings
Tianzhe Zhao, Jiaoyan Chen, Yanchi Ru, Qika Lin, Yuxia Geng, Jun Liu
-
Chenxi Qiu
-
Locally Differentially Private In-Context Learning
Chunyan Zheng, Keke Sun, Wenhao Zhao, Haibo Zhou, Lixin Jiang, Shaoyang Song, Chunlai Zhou
-
A2-DIDM: Privacy-preserving Accumulator-enabled Auditing for Distributed Identity of DNN Model
Tianxiu Xie, Keke Gai, Jing Yu, Liehuang Zhu, Kim-Kwang Raymond Choo
-
Revisiting character-level adversarial attacks
Elias Abad Rocamora, Yongtao Wu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
-
IPFed: Identity protected federated learning for user authentication
Yosuke Kaga, Yusei Suzuki, Kenta Takahashi
-
Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li
-
Nematollah Saeidi, Hossein Karshenas, Bijan Shoushtarian, Sepideh Hatamikia, Ramona Woitek, Amirreza Mahbod
-
Effective and Robust Adversarial Training against Data and Label Corruptions
Peng-Fei Zhang, Zi Huang, Xin-Shun Xu, Guangdong Bai
-
Unlearning Backdoor Attacks through Gradient-Based Model Pruning
Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu, Raja Jurdak
-
Explainability-Informed Targeted Malware Misclassification
Quincy Card, Kshitiz Aryal, Maanak Gupta
-
Enabling Privacy-Preserving and Publicly Auditable Federated Learning
Huang Zeng, Anjia Yang, Jian Weng, Min-Rong Chen, Fengjun Xiao, Yi Liu, Ye Yao
-
A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning
Xiaoyang Xu, Mengda Yang, Wenzhe Yi, Ziang Li, Juan Wang, Hongxin Hu, Yong Zhuang, Yaxin Liu
-
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models
George-Octavian Barbulescu, Peter Triantafillou
-
Assessing Adversarial Robustness of Large Language Models: An Empirical Study
Zeyu Yang, Zhao Meng, Xiaochen Zheng, Roger Wattenhofer
-
Exploring Frequencies via Feature Mixing and Meta-Learning for Improving Adversarial Transferability
Juanjuan Weng, Zhiming Luo, Shaozi Li
-
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images
Yiting Qu, Xinyue Shen, Yixin Wu, Michael Backes, Savvas Zannettou, Yang Zhang
-
Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation
Guangtao Zheng, Wenqian Ye, Aidong Zhang
-
Derui Wang, Minhui Xue, Bo Li, Seyit Camtepe, Liming Zhu
-
GI-SMN: Gradient Inversion Attack against Federated Learning without Prior Knowledge
Jin Qian, Kaimin Wei, Yongdong Wu, Jilian Zhang, Jipeng Chen, Huan Bao
-
Federated Learning Privacy: Attacks, Defenses, Applications, and Policy Landscape - A Survey
Joshua C. Zhao, Saurabh Bagchi, Salman Avestimehr, Kevin S. Chan, Somali Chaterji, Dimitris Dimitriadis, Jiacheng Li, Ninghui Li, Arash Nourian, Holger R. Roth
-
Cutting through buggy adversarial example defenses: fixing 1 line of code breaks Sabre
Nicholas Carlini
-
FOBNN: Fast Oblivious Binarized Neural Network Inference
Xin Chen, Zhili Chen, Benchang Dong, Shiwen Wei, Lin Chen, Daojing He
-
DarkFed: A Data-Free Backdoor Attack in Federated Learning
Minghui Li, Wei Wan, Yuxuan Ning, Shengshan Hu, Lulu Xue, Leo Yu Zhang, Yichen Wang
-
LaserEscape: Detecting and Mitigating Optical Probing Attacks
Saleh Khalaj Monfared, Kyle Mitard, Andrew Cannon, Domenic Forte, Shahin Tajik
-
Korn Sooksatra, Greg Hamerly, Pablo Rivas
-
On Adversarial Examples for Text Classification by Perturbing Latent Representations
Korn Sooksatra, Bikram Khanal, Pablo Rivas
-
Ravikumar Balakrishnan, Marius Arvinte, Nageen Himayat, Hosein Nikopour, Hassnaa Moustafa
-
Generative adversarial learning with optimal input dimension and its adaptive generator architecture
Zhiyao Tan, Ling Zhou, Huazhen Lin
-
Secure Inference for Vertically Partitioned Data Using Multiparty Homomorphic Encryption
Shuangyi Chen, Yue Ju, Zhongwen Zhu, Ashish Khisti
-
Differentially Private Federated Learning without Noise Addition: When is it Possible?
Jiang Zhang, Yahya H Ezzeldin, Ahmed Roushdy Elkordy, Konstantinos Psounis, Salman Avestimehr
-
Differentially Private Synthetic Data with Private Density Estimation
Nikolija Bojkovic, Po-Ling Loh
-
Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints
Siow Meng Low, Akshat Kumar
-
AnoGAN for Tabular Data: A Novel Approach to Anomaly Detection
Aditya Singh, Pavan Reddy
-
Confidential and Protected Disease Classifier using Fully Homomorphic Encryption
Aditya Malik, Nalini Ratha, Bharat Yalavarthi, Tilak Sharma, Arjun Kaushik, Charanjit Jutla
-
Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy
Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Bowen Xu, Premkumar Devanbu, Mohammad Amin Alipour
-
Defense against Joint Poison and Evasion Attacks: A Case Study of DERMS
Zain ul Abdeen, Padmaksha Roy, Ahmad Al-Tawaha, Rouxi Jia, Laura Freeman, Peter Beling, Chen-Ching Liu, Alberto Sangiovanni-Vincentelli, Ming Jin
-
Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness
Zhenan Shao, Linjian Ma, Bo Li, Diane M. Beck
-
Detecting Edited Knowledge in Language Models
Paul Youssef, Zhixue Zhao, Jörg Schlötterer, Christin Seifert
-
Shang Shang, Xinqiang Zhao, Zhongjiang Yao, Yepeng Yao, Liya Su, Zijing Fan, Xiaodan Zhang, Zhengwei Jiang
-
Zehan Zhu, Yan Huang, Xin Wang, Jinming Xu
-
Updating Windows Malware Detectors: Balancing Robustness and Regression against Adversarial EXEmples
Matous Kozak, Luca Demetrio, Dmitrijs Trizna, Fabio Roli
-
Metric Differential Privacy at the User-Level
Jacob Imola, Amrita Roy Chowdhury, Kamalika Chaudhuri
-
Xiaoyan Su, Yinghao Zhu, Run Li
-
Impact of Architectural Modifications on Deep Learning Adversarial Robustness
Firuz Juraev, Mohammed Abuhamad, Simon S. Woo, George K Thiruvathukal, Tamer Abuhmed
-
From Attack to Defense: Insights into Deep Learning Security Measures in Black-Box Settings
Firuz Juraev, Mohammed Abuhamad, Eric Chan-Tin, George K. Thiruvathukal, Tamer Abuhmed
-
Adversarial Botometer: Adversarial Analysis for Social Bot Detection
Shaghayegh Najari, Davood Rafiee, Mostafa Salehi, Reza Farahbakhsh
-
Anton Plaksin, Vitaly Kalev
-
Uniformly Stable Algorithms for Adversarial Training and Beyond
Jiancong Xiao, Jiawei Zhang, Zhi-Quan Luo, Asuman Ozdaglar
-
A Novel Approach to Guard from Adversarial Attacks using Stable Diffusion
Trinath Sai Subhash Reddy Pittala, Uma Maheswara Rao Meleti, Geethakrishna Puligundla
-
Optimistic Regret Bounds for Online Learning in Adversarial Markov Decision Processes
Sang Bin Moon, Abolfazl Hashemi
-
ProFLingo: A Fingerprinting-based Copyright Protection Scheme for Large Language Models
Heng Jin, Chaoyu Zhang, Shanghao Shi, Wenjing Lou, Y. Thomas Hou
-
Adaptive and robust watermark against model extraction attack
Kaiyi Pang, Tao Qi, Chuhan Wu, Minhao Bai
-
Xun Jiao, Fred Lin, Harish D. Dixit, Joel Coburn, Abhinav Pandey, Han Wang, Jianyu Huang, Venkat Ramesh, Wang Xu, Daniel Moore, Sriram Sankar
-
Privacy-aware Berrut Approximated Coded Computing for Federated Learning
Xavier Martínez Luaña, Rebeca P. Díaz Redondo, Manuel Fernández Veiga
-
Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk
Xinyi Ni, Lifeng Lai
-
Adversarial Attacks on Reinforcement Learning Agents for Command and Control
Ahaan Dabholkar, James Z. Hare, Mark Mittrick, John Richardson, Nicholas Waytowich, Priya Narayanan, Saurabh Bagchi
-
ATTAXONOMY: Unpacking Differential Privacy Guarantees Against Practical Adversaries
Rachel Cummings, Shlomi Hod, Jayshree Sarathy, Marika Swanberg
-
Explainability Guided Adversarial Evasion Attacks on Malware Detectors
Kshitiz Aryal, Maanak Gupta, Mahmoud Abdelsalam, Moustafa Saleh
-
Backdoor-based Explainable AI Benchmark for High Fidelity Evaluation of Attribution Methods
Peiyu Yang, Naveed Akhtar, Jiantong Jiang, Ajmal Mian
-
Wei-Ning Chen, Berivan Isik, Peter Kairouz, Albert No, Sewoong Oh, Zheng Xu
-
Temporal assessment of malicious behaviors: application to turnout field data monitoring
Sara Abdellaoui, Emil Dumitrescu, Cédric Escudero, Eric Zamaï
-
Robustness of graph embedding methods for community detection
Zhi-Feng Wei, Pablo Moriano, Ramakrishnan Kannan
-
Daniel Gibert, Luca Demetrio, Giulio Zizzo, Quan Le, Jordi Planes, Battista Biggio
-
Xuanli He, Jun Wang, Qiongkai Xu, Pasquale Minervini, Pontus Stenetorp, Benjamin I. P. Rubinstein, Trevor Cohn
-
AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples
Antonio Emanuele Cinà, Jérôme Rony, Maura Pintor, Luca Demetrio, Ambra Demontis, Battista Biggio, Ismail Ben Ayed, Fabio Roli
-
ASAM: Boosting Segment Anything Model with Adversarial Tuning
Bo Li, Haoke Xiao, Lv Tang
-
Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective
Xiaoxuan Han, Songlin Yang, Wei Wang, Yang Li, Jing Dong
-
Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective
Wanqi Zhou, Shuanghao Bai, Qibin Zhao, Badong Chen
-
Norbert Tihanyi, Tamas Bisztray, Mohamed Amine Ferrag, Ridhi Jain, Lucas C. Cordeiro
-
Certification of Speaker Recognition Models to Additive Perturbations
Dmitrii Korzh, Elvir Karimov, Mikhail Pautov, Oleg Y. Rogov, Ivan Oseledets
-
Harmonic Machine Learning Models are Robust
Nicholas S. Kersting, Yi Li, Aman Mohanty, Oyindamola Obisesan, Raphael Okochu
-
Uncertainty-boosted Robust Video Activity Anticipation
Zhaobo Qi, Shuhui Wang, Weigang Zhang, Qingming Huang
-
Xi Xin, Fei Huang, Giles Hooker
-
A Systematic Evaluation of Adversarial Attacks against Speech Emotion Recognition Models
Nicolas Facchinetti, Federico Simonetta, Stavros Ntalampiras
-
Assessing Cybersecurity Vulnerabilities in Code Large Language Models
Md Imran Hossen, Jianyi Zhang, Yinzhi Cao, Xiali Hei
-
Adversarial Examples: Generation Proposal in the Context of Facial Recognition Systems
Marina Fuster, Ignacio Vidaurreta
-
Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks
Yassine Abbahaddou, Sofiane Ennadir, Johannes F. Lutzeyer, Michalis Vazirgiannis, Henrik Boström
-
Privacy-Preserving Aggregation for Decentralized Learning with Byzantine-Robustness
Ali Reza Ghavamipour, Benjamin Zi Hao Zhao, Oguzhan Ersoy, Fatih Turkmen
-
Are Watermarks Bugs for Deepfake Detectors? Rethinking Proactive Forensics
Xiaoshuai Wu, Xin Liao, Bo Ou, Yuling Liu, Zheng Qin
-
Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection
Yizhou Chen, Zeyu Sun, Zhihao Gong, Dan Hao
-
Talking Nonsense: Probing Large Language Models' Understanding of Adversarial Gibberish Inputs
Valeriia Cherepanova, James Zou
-
Human-Imperceptible Retrieval Poisoning Attacks in LLM-Powered Applications
Quan Zhang, Binqi Zeng, Chijin Zhou, Gwihwan Go, Heyuan Shi, Yu Jiang
-
Enhancing Privacy and Security of Autonomous UAV Navigation
Vatsal Aggarwal, Arjun Ramesh Kaushik, Charanjit Jutla, Nalini Ratha
-
Lakmal Meegahapola, Hamza Hassoune, Daniel Gatica-Perez
-
Defending Spiking Neural Networks against Adversarial Attacks through Image Purification
Weiran Chen, Qi Sun, Qi Xu
-
Adversarial Reweighting with $\alpha$-Power Maximization for Domain Adaptation
Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu
-
Estimating the Robustness Radius for Randomized Smoothing with 100$\times$ Sample Efficiency
Emmanouil Seferis, Stefanos Kollias, Chih-Hong Cheng
-
Adversarial Consistency and the Uniqueness of the Adversarial Bayes Classifier
Natalie S. Frank
-
Evaluations of Machine Learning Privacy Defenses are Misleading
Michael Aerni, Jie Zhang, Florian Tramèr
-
Beyond Traditional Threats: A Persistent Backdoor Attack on Federated Learning
Tao Liu, Yuhang Zhang, Zhu Feng, Zhiqin Yang, Chen Xu, Dapeng Man, Wu Yang
-
Center-Based Relaxed Learning Against Membership Inference Attacks
Xingli Fang, Jung-Eun Kim
-
Adrien Le Coz, Houssem Ouertatani, Stéphane Herbin, Faouzi Adjed
-
Constructing Optimal Noise Channels for Enhanced Robustness in Quantum Machine Learning
David Winderl, Nicola Franco, Jeanette Miriam Lorenz
-
Towards Precise Observations of Neural Model Robustness in Classification
Wenchuan Mu, Kwan Hui Lim
-
Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples
Kuofeng Gao, Jindong Gu, Yang Bai, Shu-Tao Xia, Philip Torr, Wei Liu, Zhifeng Li
-
Understanding Privacy Risks of Embeddings Induced by Large Language Models
Zhihao Zhu, Ninglu Shao, Defu Lian, Chenwang Wu, Zheng Liu, Yi Yang, Enhong Chen
-
Don't Say No: Jailbreaking LLM by Suppressing Refusal
Yukai Zhou, Wenjie Wang
-
PAD: Patch-Agnostic Defense against Adversarial Patch Attacks
Lihua Jing, Rui Wang, Wenqi Ren, Xin Dong, Cong Zou
-
Zhe Zhang, Ryumei Nakada, Linjun Zhang
-
Boosting Model Resilience via Implicit Adversarial Data Augmentation
Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, Shikun Zhang
-
Cristopher McIntyre-Garcia, Adrien Heymans, Beril Borali, Won-Sook Lee, Shiva Nejati
-
A Notion of Uniqueness for the Adversarial Bayes Classifier
Natalie S. Frank
-
A General Black-box Adversarial Attack on Graph-based Fake News Detectors
Peican Zhu, Zechen Pan, Yang Liu, Jiwei Tian, Keke Tang, Zhen Wang
-
Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung, Che-Rung Lee
-
Universal Adversarial Triggers Are Not Universal
Nicholas Meade, Arkil Patel, Siva Reddy
-
3D Face Morphing Attack Generation using Non-Rigid Registration
Jag Mohan Singh, Raghavendra Ramachandra
-
Vision Transformer-based Adversarial Domain Adaptation
Yahan Li, Yuan Wu
-
Beyond Deepfake Images: Detecting AI-Generated Videos
Danial Samadi Vahdati, Tai D. Nguyen, Aref Azizpour, Matthew C. Stamm
-
Vidit Khazanchi, Pavan Kulkarni, Yuvaraj Govindarajulu, Manojkumar Parmar
-
CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning
Haolin Wu, Jing Chen, Ruiying Du, Cong Wu, Kun He, Xingcan Shang, Hao Ren, Guowen Xu
-
Security Analysis of WiFi-based Sensing Systems: Threats from Perturbation Attacks
Hangcheng Cao, Wenbin Huang, Guowen Xu, Xianhao Chen, Ziyang He, Jingyang Hu, Hongbo Jiang, Yuguang Fang
-
PoisonedFL: Model Poisoning Attacks to Federated Learning via Multi-Round Consistency
Yueqi Xie, Minghong Fang, Neil Zhenqiang Gong
-
Sehyun Ryu, Jonggyu Jang, Hyun Jong Yang
-
Advancing Recommender Systems by mitigating Shilling attacks
Aditya Chichani, Juzer Golwala, Tejas Gundecha, Kiran Gawande
-
Investigating the prompt leakage effect and black-box defenses for multi-turn LLM interactions
Divyansh Agarwal, Alexander R. Fabbri, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
-
An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape
Sifat Muhammad Abdullah, Aravind Cheruvu, Shravya Kanchi, Taejoong Chung, Peng Gao, Murtuza Jadliwala, Bimal Viswanath
-
Enhancing Privacy in Face Analytics Using Fully Homomorphic Encryption
Bharat Yalavarthi, Arjun Ramesh Kaushik, Arun Ross, Vishnu Boddeti, Nalini Ratha
-
A Comparative Analysis of Adversarial Robustness for Quantum and Classical Machine Learning Models
Maximilian Wendlinger, Kilian Tscharke, Pascal Debus
-
Attacks on Third-Party APIs of Large Language Models
Wanru Zhao, Vidit Khazanchi, Haodi Xing, Xuanli He, Qiongkai Xu, Nicholas Donald Lane
-
Talk Too Much: Poisoning Large Language Models under Token Limit
Jiaming He, Wenbo Jiang, Guanyu Hou, Wenshu Fan, Rui Zhang, Hongwei Li
-
Phoebe Jing, Yijing Gao, Xianlong Zeng
-
Leverage Variational Graph Representation For Model Poisoning on Federated Learning
Kai Li, Xin Yuan, Jingjing Zheng, Wei Ni, Falko Dressler, Abbas Jamalipour
-
Tobias Ladner, Michael Eichelbeck, Matthias Althoff
-
Manipulating Recommender Systems: A Survey of Poisoning Attacks and Countermeasures
Thanh Toan Nguyen, Quoc Viet Hung Nguyen, Thanh Tam Nguyen, Thanh Trung Huynh, Thanh Thi Nguyen, Matthias Weidlich, Hongzhi Yin
-
Double Privacy Guard: Robust Traceable Adversarial Watermarking against Face Recognition
Yunming Zhang, Dengpan Ye, Sipeng Shen, Caiyun Xie, Ziyi Liu, Jiacheng Deng, Long Tang
-
Every Breath You Don't Take: Deepfake Speech Detection Using Breath
Seth Layton, Thiago De Andrade, Daniel Olszewski, Kevin Warren, Carrie Gates, Kevin Butler, Patrick Traynor
-
Rethinking LLM Memorization through the Lens of Adversarial Compression
Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter
-
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs
Javier Rando, Francesco Croce, Kryštof Mitka, Stepan Shabalin, Maksym Andriushchenko, Nicolas Flammarion, Florian Tramèr
-
The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking
Yuying Li, Zeyan Liu, Junyi Zhao, Liangqin Ren, Fengjun Li, Jiebo Luo, Bo Luo
-
Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares
Gavin Brown, Jonathan Hayase, Samuel Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan C. Perdomo, Adam Smith
-
Protecting Your LLMs with Information Bottleneck
Zichuan Liu, Zefan Wang, Linjie Xu, Jinyu Wang, Lei Song, Tianchun Wang, Chunlin Chen, Wei Cheng, Jiang Bian
-
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback
Wenyi Xiao, Ziwei Huang, Leilei Gan, Wanggui He, Haoyuan Li, Zhelun Yu, Hao Jiang, Fei Wu, Linchao Zhu
-
Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Taeho Hwang, Jong C. Park
-
Zero-shot Cross-lingual Stance Detection via Adversarial Language Adaptation
Bharathi A, Arkaitz Zubiaga
-
Swap It Like Its Hot: Segmentation-based spoof attacks on eye-tracking images
Anish S. Narkar, Brendan David-John
-
FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge
Hanzhe Li, Jiaran Zhou, Bin Li, Junyu Dong, Yuezun Li
-
Wenhao Lan, Yijun Yang, Haihua Shen, Shan Li
-
Towards Better Adversarial Purification via Adversarial Denoising Diffusion Training
Yiming Liu, Kezhao Liu, Yao Xiao, Ziyi Dong, Xiaogang Xu, Pengxu Wei, Liang Lin
-
Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference
Yujin Han, Difan Zou
-
Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning
Huan Bao, Kaimin Wei, Yongdong Wu, Jin Qian, Robert H. Deng
-
Explicit Lipschitz Value Estimation Enhances Policy Robustness Against Perturbation
Xulin Chen, Ruipeng Liu, Garrett E. Katz
-
Dual Model Replacement: Invisible Multi-target Backdoor Attack based on Federated Learning
Rong Wang, Guichen Zhou, Mingjun Gao, Yunpeng Xiao
-
Poisoning Attacks on Federated Learning-based Wireless Traffic Prediction
Zifan Zhang, Minghong Fang, Jiayuan Huang, Yuchen Liu
-
A mean curvature flow arising in adversarial training
Leon Bungert, Tim Laux, Kerrek Stinson
-
Offensive AI: Enhancing Directory Brute-forcing Attack with the Use of Language Models
Alberto Castagnaro, Mauro Conti, Luca Pajola
-
Reliable Model Watermarking: Defending Against Theft without Compromising on Evasion
Hongyu Zhu, Sichu Liang, Wentao Hu, Fangqi Li, Ju Jia, Shilin Wang
-
Xu Yang, Jiapeng Zhang, Qifeng Zhang, Zhuo Tang
-
Interval Abstractions for Robust Counterfactual Explanations
Junqi Jiang, Francesco Leofante, Antonio Rago, Francesca Toni
-
Towards General Conceptual Model Editing via Adversarial Representation Engineering
Yihao Zhang, Zeming Wei, Jun Sun, Meng Sun
-
Trojan Detection in Large Language Models: Insights from The Trojan Detection Challenge
Narek Maloyan, Ekansh Verma, Bulat Nutfullin, Bislan Ashinov
-
Attack on Scene Flow using Point Clouds
Haniyeh Ehsani Oskouie, Mohammad-Shahram Moin, Shohreh Kasaei
-
Mean Aggregator Is More Robust Than Robust Aggregators Under Label Poisoning Attacks
Jie Peng, Weiyu Li, Qing Ling
-
LLMs in Web-Development: Evaluating LLM-Generated PHP code unveiling vulnerabilities and limitations
Rebeka Tóth, Tamas Bisztray, László Erdodi
-
Robust EEG-based Emotion Recognition Using an Inception and Two-sided Perturbation Model
Shadi Sartipi, Mujdat Cetin
-
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
Anselm Paulus, Arman Zharmagambetov, Chuan Guo, Brandon Amos, Yuandong Tian
-
Pixel is a Barrier: Diffusion Models Are More Adversarially Robust Than We Think
Haotian Xue, Yongxin Chen
-
AdvLoRA: Adversarial Low-Rank Adaptation of Vision-Language Models
Yuheng Ji, Yue Liu, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Gang Zhou, Xingwei Zhang, Xinwang Liu, Xiaolong Zheng
-
PristiQ: A Co-Design Framework for Preserving Data Security of Quantum Learning in the Cloud
Zhepeng Wang, Yi Sheng, Nirajan Koirala, Kanad Basu, Taeho Jung, Cheng-Chang Lu, Weiwen Jiang
-
Chenxi Yang, Yujia Liu, Dingquan Li, Yan Zhong, Tingting Jiang
-
Backdoor Attacks and Defenses on Semantic-Symbol Reconstruction in Semantic Communications
Yuan Zhou, Rose Qingyang Hu, Yi Qian
-
Jose Cribeiro-Ramallo, Vadim Arzamasov, Federico Matteucci, Denis Wambold, Klemens Böhm
-
How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples
Dren Fazlija, Arkadij Orlov, Johanna Schrader, Monty-Maximilian Zühlke, Michael Rohs, Daniel Kudenko
-
A Clean-graph Backdoor Attack against Graph Convolutional Networks with Poisoned Label Only
Jiazhu Dai, Haoyu Sun
-
Heqi Peng, Yunhong Wang, Ruijie Yang, Beichen Li, Rui Wang, Yuanfang Guo
-
Aravinda Reddy PN, Raghavendra Ramachandra, Krothapalli Sreenivasa Rao, Pabitra Mitra
-
PATE-TripleGAN: Privacy-Preserving Image Synthesis with Gaussian Differential Privacy
Zepeng Jiang, Weiwei Ni, Yifan Zhang
-
Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images
Santosh, Li Lin, Irene Amerini, Xin Wang, Shu Hu
-
SA-Attack: Speed-adaptive stealthy adversarial attack on trajectory prediction
Huilin Yin, Jiaxiang Li, Pengju Zhen, Jun Yan
-
Beichen Li, Yuanfang Guo, Heqi Peng, Yangxi Li, Yunhong Wang
-
Defending against Data Poisoning Attacks in Federated Learning via User Elimination
Nick Galanis
-
The Power of Words: Generating PowerShell Attacks from Natural Language
Pietro Liguori, Christian Marescalco, Roberto Natella, Vittorio Orbinato, Luciano Pianese
-
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni, Rui Ye, Yuxi Wei, Zhen Xiang, Yanfeng Wang, Siheng Chen
-
Privacy-Preserving Debiasing using Data Augmentation and Machine Unlearning
Zhixin Pan, Emma Andrews, Laura Chang, Prabhat Mishra
-
DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection
Shuwei Hou, Yan Ju, Chengzhe Sun, Shan Jia, Lipeng Ke, Riky Zhou, Anita Nikolich, Siwei Lyu
-
CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models
Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, David Molnar, Spencer Whitman, Joshua Saxe
-
Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors
Raz Lapid, Almog Dubin, Moshe Sipper
-
Proteus: Preserving Model Confidentiality during Graph Optimizations
Yubo Gao, Maryam Haghifam, Christina Giannoula, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar
-
Introducing v0.5 of the AI Safety Benchmark from MLCommons
Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, et al. (46 additional authors not shown)
-
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing
Jiabao Ji, Bairu Hou, Zhen Zhang, Guanhua Zhang, Wenqi Fan, Qing Li, Yang Zhang, Gaowen Liu, Sijia Liu, Shiyu Chang
-
Enhance Robustness of Language Models Against Variation Attack through Graph Integration
Zi Xiong, Lizhi Qing, Yangyang Kang, Jiawei Liu, Hongsong Li, Changlong Sun, Xiaozhong Liu, Wei Lu
-
Uncovering Safety Risks in Open-source LLMs through Concept Activation Vector
Zhihao Xu, Ruixuan Huang, Xiting Wang, Fangzhao Wu, Jing Yao, Xing Xie
-
Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement
Pushkar Shukla, Dhruv Srikanth, Lee Cohen, Matthew Turk
-
Sungwon Han, Hyeonho Song, Sungwon Park, Meeyoung Cha
-
Yuchen Zhu, Yufeng Zhang, Zhaoran Wang, Zhuoran Yang, Xiaohong Chen
-
KDk: A Defense Mechanism Against Label Inference Attacks in Vertical Federated Learning
Marco Arazzi, Serena Nicolazzo, Antonino Nocera
-
TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment
Qinfeng Li, Zhiqiang Shen, Zhenghan Qin, Yangfan Xie, Xuhong Zhang, Tianyu Du, Jianwei Yin
-
Sampling-based Pseudo-Likelihood for Membership Inference Attacks
Masahiro Kaneko, Youmi Ma, Yuki Wata, Naoaki Okazaki
-
A Federated Learning Approach to Privacy Preserving Offensive Language Identification
Marcos Zampieri, Damith Premasiri, Tharindu Ranasinghe
-
GenFighter: A Generative and Evolutive Textual Attack Removal
Md Athikul Islam, Edoardo Serra, Sushil Jajodia
-
The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data
Zixuan Zhu, Rui Wang, Cong Zou, Lihua Jing
-
Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness
Hangtao Zhang, Shengshan Hu, Yichen Wang, Leo Yu Zhang, Ziqi Zhou, Xianlong Wang, Yanjun Zhang, Chao Chen
-
Qiang Li, Michal Yemini, Hoi-To Wai
-
Exploring DNN Robustness Against Adversarial Attacks Using Approximate Multipliers
Mohammad Javad Askarizadeh, Ebrahim Farahmand, Jorge Castro-Godinez, Ali Mahani, Laura Cabrera-Quiros, Carlos Salazar-Garcia
-
A Secure and Trustworthy Network Architecture for Federated Learning Healthcare Applications
Antonio Boiano, Marco Di Gennaro, Luca Barbieri, Michele Carminati, Monica Nicoli, Alessandro Redondi, Stefano Savazzi, Albert Sund Aillet, Diogo Reis Santos, Luigi Serio
-
Private Attribute Inference from Images with Vision-Language Models
Batuhan Tömekçe, Mark Vero, Robin Staab, Martin Vechev
-
Towards a Novel Perspective on Adversarial Examples Driven by Frequency
Zhun Zhang, Yi Zeng, Qihe Liu, Shijie Zhou
-
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning
Xiao Wang, Tianze Chen, Xianjun Yang, Qi Zhang, Xun Zhao, Dahua Lin
-
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Yong Dai, Lei Han, Nan Du
-
Qi Guo, Shanmin Pang, Xiaojun Jia, Qing Guo
-
Adversarial Identity Injection for Semantic Face Image Synthesis
Giuseppe Tarollo, Tomaso Fontanini, Claudio Ferrari, Guido Borghi, Andrea Prati
-
Do Counterfactual Examples Complicate Adversarial Training?
Eric Yeats, Cameron Darwin, Eduardo Ortega, Frank Liu, Hai Li
-
Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback
Qiwei Di, Jiafan He, Quanquan Gu
-
Differentially Private Optimization with Sparse Gradients
Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Ravi Kumar, Pasin Manurangsi
-
Privacy at a Price: Exploring its Dual Impact on AI Fairness
Mengmeng Yang, Ming Ding, Youyang Qu, Wei Ni, David Smith, Thierry Rakotoarivelo
-
Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models
Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka
-
Improving Weakly-Supervised Object Localization Using Adversarial Erasing and Pseudo Label
Byeongkeun Kang, Sinhae Cha, Yeejin Lee
-
Beyond Noise: Privacy-Preserving Decentralized Learning with Virtual Nodes
Sayan Biswas, Mathieu Even, Anne-Marie Kermarrec, Laurent Massoulie, Rafael Pires, Rishi Sharma, Martijn de Vos
-
Privacy-Preserving Intrusion Detection using Convolutional Neural Networks
Martin Kodys, Zhongmin Dai, Vrizlynn L. L. Thing
-
Mitigating the Curse of Dimensionality for Certified Robustness via Dual Randomized Smoothing
Song Xia, Yu Yi, Xudong Jiang, Henghui Ding
-
Ti-Patch: Tiled Physical Adversarial Patch for no-reference video quality metrics
Victoria Leonenkova, Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin
-
On the Efficiency of Privacy Attacks in Federated Learning
Nawrin Tabassum, Ka-Ho Chow, Xuyu Wang, Wenbin Zhang, Yanzhao Wu
-
Privacy-Preserving Federated Unlearning with Certified Client Removal
Ziyao Liu, Huanyi Ye, Yu Jiang, Jiyuan Shen, Jiale Guo, Ivan Tjuawinata, Kwok-Yan Lam
-
Deceiving to Enlighten: Coaxing LLMs to Self-Reflection for Enhanced Bias Detection and Mitigation
Ruoxi Cheng, Haoxuan Ma, Shuirong Cao
-
AIGeN: An Adversarial Approach for Instruction Generation in VLN
Niyati Rawal, Roberto Bigazzi, Lorenzo Baraldi, Rita Cucchiara
-
Black-box Adversarial Transferability: An Empirical Study in Cybersecurity Perspective
Khushnaseeb Roshan, Aasim Zafar
-
Make Split, not Hijack: Preventing Feature-Space Hijacking Attacks in Split Learning
Tanveer Khan, Mindaugas Budzys, Antonis Michalas
-
FaceCat: Enhancing Face Recognition Security with a Unified Generative Model Framework
Jiawei Chen, Xiao Yang, Yinpeng Dong, Hang Su, Jianteng Peng, Zhaoxia Yin
-
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
Brian R. Bartoldson, James Diffenderfer, Konstantinos Parasyris, Bhavya Kailkhura
-
Proof-of-Learning with Incentive Security
Zishuo Zhao, Zhixuan Fang, Xuechao Wang, Yuan Zhou
-
CodeCloak: A Method for Evaluating and Mitigating Code Leakage by LLM Code Assistants
Amit Finkman, Eden Bar-Kochva, Avishag Shapira, Dudu Mimran, Yuval Elovici, Asaf Shabtai
-
Stability and Generalization in Free Adversarial Training
Xiwei Cheng, Kexin Fu, Farzan Farnia
-
Multimodal Attack Detection for Action Recognition Models
Furkan Mumcu, Yasin Yilmaz
-
A Survey of Neural Network Robustness Assessment in Image Recognition
Jie Wang, Jun Ai, Minyan Lu, Haoran Su, Dan Yu, Yutao Zhang, Junda Zhu, Jingyu Liu
-
Adversarial Imitation Learning via Boosting
Jonathan D. Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun
-
VertAttack: Taking advantage of Text Classifiers' horizontal vision
Jonathan Rusert
-
Practical Region-level Attack against Segment Anything Models
Yifan Shen, Zhengyuan Li, Gang Wang
-
Struggle with Adversarial Defense? Try Diffusion
Yujie Li, Yanbin Wang, Haitao Xu, Bin Liu, Jianguo Sun, Zhenhao Guo, Wenrui Ma
-
Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts
Yang Li, Songlin Yang, Wei Wang, Ziwen He, Bo Peng, Jing Dong
-
Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues
Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu
-
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee, Tejas Gokhale, Chitta Baral, Yezhou Yang
-
Cui Zhang, Xiao Xu, Qiong Wu, Pingyi Fan, Qiang Fan, Huiling Zhu, Jiangzhou Wang
-
FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models
Yanting Wang, Wei Zou, Jinyuan Jia
-
Juntaek Lim, Youngeun Kwon, Ranggi Hwang, Kiwan Maeng, G. Edward Suh, Minsoo Rhu
-
Dipkamal Bhusal, Md Tanvirul Alam, Monish K. Veerabhadran, Michael Clifford, Sara Rampazzi, Nidhi Rastogi
-
Differentially Private GANs for Generating Synthetic Indoor Location Data
Vahideh Moghtadaiee, Mina Alishahi, Milad Rabiei
-
Differentially Private Reinforcement Learning with Self-Play
Dan Qiao, Yu-Xiang Wang
-
ZhenZhe Gao, Zhenjun Tang, Zhaoxia Yin, Baoyuan Wu, Yue Lu
-
Zeyi Liao, Huan Sun
-
Privacy preserving layer partitioning for Deep Neural Network models
Kishore Rajasekar, Randolph Loh, Kar Wai Fok, Vrizlynn L. L. Thing
-
Enhancing Network Intrusion Detection Performance using Generative Adversarial Networks
Xinxing Zhao, Kar Wai Fok, Vrizlynn L. L. Thing
-
Backdoor Contrastive Learning via Bi-level Trigger Optimization
Weiyu Sun, Xinyu Zhang, Hao Lu, Yingcong Chen, Ting Wang, Jinghui Chen, Lu Lin
-
Latent Guard: a Safety Framework for Text-to-image Generation
Runtao Liu, Ashkan Khakzar, Jindong Gu, Qifeng Chen, Philip Torr, Fabio Pizzati
-
LLM Agents can Autonomously Exploit One-day Vulnerabilities
Richard Fang, Rohan Bindu, Akul Gupta, Daniel Kang
-
Persistent Classification: A New Approach to Stability of Data and Adversarial Examples
Brian Bell, Michael Geyer, David Glickenstein, Keaton Hamm, Carlos Scheidegger, Amanda Fernandez, Juston Moore
-
Eliminating Catastrophic Overfitting Via Abnormal Adversarial Examples Regularization
Runqi Lin, Chaojian Yu, Tongliang Liu
-
CodeFort: Robust Training for Code Generation Models
Yuhao Zhang, Shiqi Wang, Haifeng Qian, Zijian Wang, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, Anoop Deoras
-
Towards a Game-theoretic Understanding of Explanation-based Membership Inference Attacks
Kavita Kumari, Murtuza Jadliwala, Sumit Kumar Jha, Anindya Maiti
-
SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
Xinfeng Li, Yuchen Yang, Jiangyi Deng, Chen Yan, Yanjiao Chen, Xiaoyu Ji, Wenyuan Xu
-
How to Craft Backdoors with Unlabeled Data Alone?
Yifei Wang, Wenhan Ma, Yisen Wang
-
Logit Calibration and Feature Contrast for Robust Federated Learning on Non-IID Data
Yu Qiao, Chaoning Zhang, Apurba Adhikary, Choong Seon Hong
-
Adversarial purification for no-reference image-quality metrics: applicability study and new methods
Aleksandr Gushchin, Anna Chistyakova, Vladislav Minashkin, Anastasia Antsiferova, Dmitriy Vatolin
-
Simpler becomes Harder: Do LLMs Exhibit a Coherent Behavior on Simplified Corpora?
Miriam Anschütz, Edoardo Mosca, Georg Groh
-
Poisoning Prevention in Federated Learning and Differential Privacy via Stateful Proofs of Execution
Norrathep Rattanavipanon, Ivan de Oliveira Nunes
-
Fatima Ezzeddine, Mirna Saad, Omran Ayoub, Davide Andreoletti, Martin Gjoreski, Ihab Sbeity, Marc Langheinrich, Silvia Giordano
-
LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks
Jianlang Chen, Xuhong Ren, Qing Guo, Felix Juefei-Xu, Di Lin, Wei Feng, Lei Ma, Jianjun Zhao
-
On adversarial training and the 1 Nearest Neighbor classifier
Amir Hagai, Yair Weiss
-
Towards Robust Domain Generation Algorithm Classification
Arthur Drichel, Marc Meyer, Ulrike Meyer
-
Privacy-preserving Scanpath Comparison for Pervasive Eye Tracking
Suleyman Ozdel, Efe Bozkir, Enkelejda Kasneci
-
Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs
Bibek Upadhayay, Vahid Behzadan
-
Towards Building a Robust Toxicity Predictor
Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Liutong Zhou, Yanjun Qi
-
SoK: Gradient Leakage in Federated Learning
Jiacheng Du, Jiahui Hu, Zhibo Wang, Peng Sun, Neil Zhenqiang Gong, Kui Ren
-
Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data
Tim Baumgärtner, Yang Gao, Dana Alon, Donald Metzler
-
Investigating the Impact of Quantization on Adversarial Robustness
Qun Li, Yuan Meng, Chen Tang, Jiacheng Jiang, Zhi Wang
-
David and Goliath: An Empirical Evaluation of Attacks and Defenses for QNNs at the Deep Edge
Miguel Costa, Sandro Pinto
-
Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods
Roopkatha Dey, Aivy Debnath, Sayak Kumar Dutta, Kaustav Ghosh, Arijit Mitra, Arghya Roy Chowdhury, Jaydip Sen
-
Out-of-Distribution Data: An Acquaintance of Adversarial Examples -- A Survey
Naveen Karunanayake, Ravin Gunawardena, Suranga Seneviratne, Sanjay Chawla
-
BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack
Viet Quoc Vo, Ehsan Abbasnejad, Damith C. Ranasinghe
-
Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing
Chengyan Fu, Wenjie Wang
-
Flexible Fairness Learning via Inverse Conditional Permutation
Yuheng Lai, Leying Guan
-
Enabling Privacy-Preserving Cyber Threat Detection with Federated Learning
Yu Bi, Yekai Li, Xuan Feng, Xianghang Mi
-
Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning
Ruiqi Zhang, Licong Lin, Yu Bai, Song Mei
-
Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge
Weikai Lu, Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Zelin Chen, Huiping Zhuang, Cen Chen
-
Privacy-Preserving Deep Learning Using Deformable Operators for Secure Task Learning
Fabian Perez, Jhon Lopez, Henry Arguello
-
Quantum Adversarial Learning for Kernel Methods
Giuseppe Montalbano, Leonardo Banchi
-
Inference-Time Rule Eraser: Distilling and Removing Bias Rules to Mitigate Bias in Deployed Models
Yi Zhang, Jitao Sang
-
Zhilong Wang, Yebo Cao, Peng Liu
-
SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials
Mael Jullien, Marco Valentino, André Freitas
-
How much reliable is ChatGPT's prediction on Information Extraction under Input Perturbations?
Ishani Mondal, Abhilasha Sancheti
-
Privacy-Preserving Traceable Functional Encryption for Inner Product
Muyao Qiu, Jinguang Han
-
Trustless Audits without Revealing Data or Models
Suppakit Waiwitlikhit, Ion Stoica, Yi Sun, Tatsunori Hashimoto, Daniel Kang
-
Data Poisoning Attacks on Off-Policy Policy Evaluation Methods
Elita Lobo, Harvineet Singh, Marek Petrik, Cynthia Rudin, Himabindu Lakkaraju
-
D$^3$: Scaling Up Deepfake Detection by Learning from Discrepancy
Yongqi Yang, Zhihao Qian, Ye Zhu, Yu Wu
-
Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
Shizhan Gong, Qi Dou, Farzan Farnia
-
Francesco Marchiori, Mauro Conti
-
Goal-guided Generative Prompt Injection Attack on Large Language Models
Chong Zhang, Mingyu Jin, Qinkai Yu, Chengzhi Liu, Haochen Xue, Xiaobo Jin
-
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming
Simone Tedeschi, Felix Friedrich, Patrick Schramowski, Kristian Kersting, Roberto Navigli, Huu Nguyen, Bo Li
-
Precision Guided Approach to Mitigate Data Poisoning Attacks in Federated Learning
K Naveen Kumar, C Krishna Mohan, Aravind Machiry
-
Watermark-based Detection and Attribution of AI-Generated Content
Zhengyuan Jiang, Moyang Guo, Yuepeng Hu, Neil Zhenqiang Gong
-
Trilokesh Ranjan Sarkar, Nilanjan Das, Pralay Sankar Maitra, Bijoy Some, Ritwik Saha, Orijita Adhikary, Bishal Bose, Jaydip Sen
-
Ana-Maria Cretu, Miruna Rusu, Yves-Alexandre de Montjoye
-
You Can Use But Cannot Recognize: Preserving Visual Privacy in Deep Neural Networks
Qiushi Li, Yan Zhang, Ju Ren, Qi Li, Yaoxue Zhang
-
Increased LLM Vulnerabilities from Fine-tuning and Quantization
Divyanshu Kumar, Anurakt Kumar, Sahil Agarwal, Prashanth Harshangi
-
Knowledge Distillation-Based Model Extraction Attack using Private Counterfactual Explanations
Fatima Ezzeddine, Omran Ayoub, Silvia Giordano
-
Learn When (not) to Trust Language Models: A Privacy-Centric Adaptive Model-Aware Approach
Chengkai Huang, Rui Wang, Kaige Xie, Tong Yu, Lina Yao
-
Stephen Meisenbacher, Nihildev Nandakumar, Alexandra Klymenko, Florian Matthes
-
Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?
Shuo Chen, Zhen Han, Bailan He, Zifeng Ding, Wenqian Yu, Philip Torr, Volker Tresp, Jindong Gu
-
Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks
Lei Zhang, Yuhang Zhou, Yi Yang, Xinbo Gao
-
Learn What You Want to Unlearn: Unlearning Inversion Attacks against Machine Unlearning
Hongsheng Hu, Shuo Wang, Tian Dong, Minhui Xue
-
Privacy-Enhancing Technologies for Artificial Intelligence-Enabled Systems
Liv d'Aliberti, Evan Gronberg, Joseph Kovba
-
Jianming Tong, Jingtian Dang, Anupam Golder, Callie Hao, Arijit Raychowdhury, Tushar Krishna
-
Weidi Luo, Siyuan Ma, Xiaogeng Liu, Xiaoyu Guo, Chaowei Xiao
-
Adversarial Attacks and Dimensionality in Text Classifiers
Nandish Chattopadhyay, Atreya Goswami, Anupam Chattopadhyay
-
Qianqiao Xu, Zhiliang Tian, Hongyan Wu, Zhen Huang, Yiping Song, Feng Liu, Dongsheng Li
-
Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models
Jiachen Ma, Anda Cao, Zhiqing Xiao, Jie Zhang, Chao Ye, Junbo Zhao
-
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
-
Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack
Ying Zhou, Ben He, Le Sun
-
Red-Teaming Segment Anything Model
Krzysztof Jankowski, Bartlomiej Sobieski, Mateusz Kwiatkowski, Jakub Szulc, Michal Janik, Hubert Baniecki, Przemyslaw Biecek
-
Towards Robust 3D Pose Transfer with Adversarial Learning
Haoyu Chen, Hao Tang, Ehsan Adeli, Guoying Zhao
-
Exploring Backdoor Vulnerabilities of Chat Models
Yunzhuo Hao, Wenkai Yang, Yankai Lin
-
BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks
Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang
-
Multi-granular Adversarial Attacks against Black-box Neural Ranking Models
Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng
-
UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models
Zihan Guan, Mengxuan Hu, Sheng Li, Anil Vullikanti
-
An Unsupervised Adversarial Autoencoder for Cyber Attack Detection in Power Distribution Grids
Mehdi Jabbari Zideh, Mohammad Reza Khalghani, Sarika Khushalani Solanki
-
STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario
Renyang Liu, Kwok-Yan Lam, Wei Zhou, Sixing Wu, Jun Zhao, Dongting Hu, Mingming Gong
-
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions
Runhao Zeng, Xiaoyong Chen, Jiaming Liang, Huisi Wu, Guangzhong Cao, Yong Guo
-
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang, Hongye Fu, Wei Zou, Jinyuan Jia
-
MedBN: Robust Test-Time Adaptation against Malicious Test Samples
Hyejin Park, Jeongyeon Hwang, Sunung Mun, Sangdon Park, Jungseul Ok
-
Towards Understanding Dual BN In Hybrid Adversarial Training
Chenshuang Zhang, Chaoning Zhang, Kang Zhang, Axi Niu, Junmo Kim, In So Kweon
-
Janis Goldzycher, Paul Röttger, Gerold Schneider
-
Manipulating Neural Path Planners via Slight Perturbations
Zikang Xiong, Suresh Jagannathan
-
CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection
Jiayi Zhu, Qing Guo, Felix Juefei-Xu, Yihao Huang, Yang Liu, Geguang Pu
-
Safe and Robust Reinforcement-Learning: Principles and Practice
Taku Yamagata, Raul Santos-Rodriguez
-
Bayesian Learned Models Can Detect Adversarial Malware For Free
Bao Gia Doan, Dang Quang Nguyen, Paul Montague, Tamas Abraham, Olivier De Vel, Seyit Camtepe, Salil S. Kanhere, Ehsan Abbasnejad, Damith C. Ranasinghe
-
MisGUIDE: Defense Against Data-Free Deep Learning Model Extraction
Mahendra Gurve, Sankar Behera, Satyadev Ahlawat, Yamuna Prasad
-
Soumyendu Sarkar, Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Avisek Naug, Sahand Ghorbanpour
-
Out-of-distribution Rumor Detection via Test-Time Adaptation
Xiang Tao, Mingqing Zhang, Qiang Liu, Shu Wu, Liang Wang
-
DataCook: Crafting Anti-Adversarial Examples for Healthcare Data Copyright Protection
Sihan Shang, Jiancheng Yang, Zhenglong Sun, Pascal Fua
-
Maciej K Wozniak, Mattias Hansson, Marko Thiel, Patric Jensfelt
-
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
Jiawen Shi, Zenghui Yuan, Yinuo Liu, Yue Huang, Pan Zhou, Lichao Sun, Neil Zhenqiang Gong
-
Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving
Junhao Zheng, Chenhao Lin, Jiahao Sun, Zhengyu Zhao, Qian Li, Chao Shen
-
Boosting Adversarial Training via Fisher-Rao Norm-based Regularization
Xiangyu Yin, Wenjie Ruan
-
Securing GNNs: Explanation-Based Identification of Backdoored Training Graphs
Jane Downer, Ren Wang, Binghui Wang
-
Targeted Visualization of the Backbone of Encoder LLMs
Isaac Roberts, Alexander Schulz, Luca Hermes, Barbara Hammer
-
$\textit{LinkPrompt}$: Natural and Universal Adversarial Attacks on Prompt-based Language Models
Yue Xu, Wenjie Wang
-
The Anatomy of Adversarial Attacks: Concept-based XAI Dissection
Georgii Mikriukov, Gesina Schwalbe, Franz Motzkus, Korinna Bade
-
Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion
Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Geiping, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum
-
Ensemble Adversarial Defense via Integration of Multiple Dispersed Low Curvature Models
Kaikang Zhao, Xi Chen, Wei Huang, Liuxin Ding, Xianglong Kong, Fan Zhang
-
Md Abdul Kadir, GowthamKrishna Addluri, Daniel Sonntag
-
CipherFormer: Efficient Transformer Private Inference with Low Round Complexity
Weize Wang, Yi Kuang
-
Task-Agnostic Detector for Insertion-Based Backdoor Attacks
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen
-
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
-
Siyuan Liang, Kuanrong Liu, Jiajun Gong, Jiawei Liang, Yuan Xun, Ee-Chien Chang, Xiaochun Cao
-
Robust Diffusion Models for Adversarial Purification
Guang Lin, Zerui Tao, Jianhai Zhang, Toshihisa Tanaka, Qibin Zhao
-
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals
Rui Zheng, Yuhao Zhou, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang
-
Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions
Kaiwen Wang, Yinzhe Shen, Martin Lauer
-
An Embarrassingly Simple Defense Against Backdoor Attacks On SSL
Aryan Satpathy, Nilaksh, Dhruva Rajwade
-
A Transfer Attack to Image Watermarks
Yuepeng Hu, Zhengyuan Jiang, Moyang Guo, Neil Gong
-
Dazhong Rong, Shuheng Shen, Xinyi Fu, Peng Qian, Jianhai Chen, Qinming He, Xing Fu, Weiqiang Wang
-
Robust optimization for adversarial learning with finite sample complexity guarantees
André Bertolace, Konstantinos Gatsis, Kostas Margellos
-
Twin Auto-Encoder Model for Learning Separable Representation in Cyberattack Detection
Phai Vu Dinh, Quang Uy Nguyen, Thai Hoang Dinh, Diep N. Nguyen, Bao Son Pham, Eryk Dutkiewicz
-
Differentially Private Next-Token Prediction of Large Language Models
James Flemings, Meisam Razaviyayn, Murali Annavaram
-
SoftPatch: Unsupervised Anomaly Detection with Noisy Data
Xi Jiang, Ying Chen, Qiang Nie, Yong Liu, Jianlin Liu, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng
-
Locating and Mitigating Gender Bias in Large Language Models
Yuchen Cai, Ding Cao, Rongxi Guo, Yaqin Wen, Guiquan Liu, Enhong Chen
-
Detoxifying Large Language Models via Knowledge Editing
Mengru Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen
-
Longzheng Wang, Xiaohan Xu, Lei Zhang, Jiarui Lu, Yongxiu Xu, Hongbo Xu, Chuang Zhang
-
Adversary-Robust Graph-Based Learning of WSIs
Saba Heidari Gheshlaghi, Milan Aryal, Nasim Yahyasoltani, Masoud Ganji
-
Yangchun Zhang, Yirui Zhou
-
Improving the Robustness of Large Language Models via Consistency Alignment
Zhao Yukun, Yan Lingyong, Sun Weiwei, Xing Guoliang, Wang Shuaiqiang, Meng Chong, Cheng Zhicong, Ren Zhaochun, Yin Dawei
-
Adversary-Augmented Simulation to evaluate client-fairness on HyperLedger Fabric
Erwan Mahe, Rouwaida Abdallah, Sara Tucci-Piergiovanni, Pierre-Yves Piriou
-
FIT-RAG: Black-Box RAG with Factual Information and Token Reduction
Yuren Mao, Xuemei Dong, Wenyi Xu, Yunjun Gao, Bin Wei, Ying Zhang
-
Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, Alex Kot
-
HETAL: Efficient Privacy-preserving Transfer Learning with Homomorphic Encryption
Seewoo Lee, Garam Lee, Jung Woo Kim, Junbum Shin, Mun-Kyu Lee
-
Improving Robustness to Model Inversion Attacks via Sparse Coding Architectures
Sayanton V. Dibbo, Adam Breuer, Juston Moore, Michael Teti
-
Protected group bias and stereotypes in Large Language Models
Hadas Kotek, David Q. Sun, Zidi Xiu, Margit Bowler, Christopher Klein
-
Diffusion Attack: Leveraging Stable Diffusion for Naturalistic Image Attacking
Qianyu Guo, Jiaming Fu, Yawen Lu, Dongming Gan
-
BadEdit: Backdooring large language models by model editing
Yanzhou Li, Tianlin Li, Kangjie Chen, Jian Zhang, Shangqing Liu, Wenhan Wang, Tianwei Zhang, Yang Liu
-
Deepfake Detection without Deepfakes: Generalization via Synthetic Frequency Patterns Injection
Davide Alessandro Coccomini, Roberto Caldelli, Claudio Gennaro, Giuseppe Fiameni, Giuseppe Amato, Fabrizio Falchi
-
Have You Poisoned My Data? Defending Neural Networks against Data Poisoning
Fabio De Gaspari, Dorjan Hitaj, Luigi V. Mancini
-
Adversarial Attacks and Defenses in Automated Control Systems: A Comprehensive Benchmark
Vitaliy Pozdnyakov, Aleksandr Kovalenko, Ilya Makarov, Mikhail Drobyshevskiy, Kirill Lukyanov
-
Devam Mondal, Carlo Lipizzi
-
Multi-Modal Hallucination Control by Visual Information Grounding
Alessandro Favero, Luca Zancato, Matthew Trager, Siddharth Choudhary, Pramuditha Perera, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
-
Optimal Transport for Fairness: Archival Data Repair using Small Research Data Sets
Abigail Langbridge, Anthony Quinn, Robert Shorten
-
Electioneering the Network: Dynamic Multi-Step Adversarial Attacks for Community Canvassing
Saurabh Sharma, Ambuj Singh
-
FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization
Cheng Yang, Jixi Liu, Yunhe Yan, Chuan Shi
-
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li
-
Robust NAS under adversarial training: benchmark, theory, and beyond
Yongtao Wu, Fanghui Liu, Carl-Johann Simon-Gabriel, Grigorios G Chrysos, Volkan Cevher
-
ADAPT to Robustify Prompt Tuning Vision Transformers
Masih Eskandar, Tooba Imtiaz, Zifeng Wang, Jennifer Dy
-
Ehsan Lari, Vinay Chakravarthi Gogineni, Reza Arablouei, Stefan Werner
-
Andrea Venturi, Dario Stabili, Mirco Marchetti
-
A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models
Stephen R. Pfohl, Heather Cole-Lewis, Rory Sayres, Darlene Neal, Mercy Asiedu, Awa Dieng, Nenad Tomasev, Qazi Mamunur Rashid, Shekoofeh Azizi, Negar Rostamzadeh, Liam G. McCoy, Leo Anthony Celi, Yun Liu, Mike Schaekermann, Alanna Walton, Alicia Parrish, Chirag Nagpal, Preeti Singh, Akeiylah Dewitt, Philip Mansfield, Sushant Prakash, Katherine Heller, Alan Karthikesalingam, Christopher Semturs, Joelle Barral, Greg Corrado, Yossi Matias, Jamila Smith-Loud, Ivor Horn, Karan Singhal
-
Yujia Liu, Chenxi Yang, Dingquan Li, Jianhao Ding, Tingting Jiang
-
Amira Guesmi, Muhammad Abdullah Hanif, Ihsen Alouani, Bassem Ouni, Muhammad Shafique
-
LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model
Yuxin Cao, Jinghao Li, Xi Xiao, Derui Wang, Minhui Xue, Hao Ge, Wei Liu, Guangwu Hu
-
Diffusion Denoising as a Certified Defense against Clean-label Poisoning
Sanghyun Hong, Nicholas Carlini, Alexey Kurakin
-
Diffusion-Reinforcement Learning Hierarchical Motion Planning in Adversarial Multi-agent Games
Zixuan Wu, Sean Ye, Manisha Natarajan, Matthew C. Gombolay
-
Improving LoRA in Privacy-preserving Federated Learning
Youbang Sun, Zitao Li, Yaliang Li, Bolin Ding
-
Impart: An Imperceptible and Effective Label-Specific Backdoor Attack
Jingke Zhao, Zan Wang, Yongwei Wang, Lanjun Wang
-
Invisible Backdoor Attack Through Singular Value Decomposition
Wenmin Chen, Xiaowei Xu
-
RobustSentEmbed: Robust Sentence Embeddings Using Adversarial Self-Supervised Contrastive Learning
Javad Rafiei Asl, Prajwal Panzade, Eduardo Blanco, Daniel Takabi, Zhipeng Cai
-
COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits
Mintong Kang, Nezihe Merve Gürel, Linyi Li, Bo Li
-
A Modified Word Saliency-Based Adversarial Attack on Text Classification Models
Hetvi Waghela, Sneha Rakshit, Jaydip Sen
-
Jiyuan Fu, Zhaoyu Chen, Kaixun Jiang, Haijing Guo, Jiafeng Wang, Shuyong Gao, Wenqiang Zhang
-
Understanding Robustness of Visual State Space Models for Image Classification
Chengbin Du, Yanxi Li, Chang Xu
-
Adversarial Knapsack and Secondary Effects of Common Information for Cyber Operations
Jon Goohs, Georgel Savin, Lucas Starks, Josiah Dykstra, William Casey
-
Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries
Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal
-
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study
Chenguang Wang, Ruoxi Jia, Xin Liu, Dawn Song
-
Revisiting Adversarial Training under Long-Tailed Distributions
Xinli Yue, Ningping Mou, Qian Wang, Lingchen Zhao
-
Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks
Chong Wang, Yi Yu, Lanqing Guo, Bihan Wen
-
Mitigating Dialogue Hallucination for Large Multi-modal Models via Adversarial Instruction Tuning
Dongmin Park, Zhaofang Qian, Guangxing Han, Ser-Nam Lim
-
Towards Adversarially Robust Dataset Distillation by Curvature Regularization
Eric Xue, Yijiang Li, Haoyang Liu, Yifan Shen, Haohan Wang
-
Rui Zhang, Dawei Cheng, Xin Liu, Jie Yang, Yi Ouyang, Xian Wu, Yefeng Zheng
-
Federated Learning with Anomaly Detection via Gradient and Reconstruction Analysis
Zahir Alsulaimawi
-
Zahir Alsulaimawi
-
Interactive Trimming against Evasive Online Data Manipulation Attacks: A Game-Theoretic Approach
Yue Fu, Qingqing Ye, Rong Du, Haibo Hu
-
Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency
Soumyadeep Pal, Yuguang Yao, Ren Wang, Bingquan Shen, Sijia Liu
-
ADEdgeDrop: Adversarial Edge Dropping for Robust Graph Neural Networks
Zhaoliang Chen, Zhihao Wu, Ylli Sadikaj, Claudia Plant, Hong-Ning Dai, Shiping Wang, Wenzhong Guo
-
Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering
Zhixuan Shen, Haonan Luo, Sijia Li, Tianrui Li
-
Yu Wang, Xiaogeng Liu, Yu Li, Muhao Chen, Chaowei Xiao
-
VDNA-PR: Using General Dataset Representations for Robust Sequential Visual Place Recognition
Benjamin Ramtoula, Daniele De Martini, Matthew Gadd, Paul Newman
-
Impact of Synthetic Images on Morphing Attack Detection Using a Siamese Network
Juan Tapia, Christoph Busch
-
Anomaly Detection by Adapting a pre-trained Vision Language Model
Yuxuan Cai, Xinwei He, Dingkang Liang, Ao Tong, Xiang Bai
-
Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement
Daiwei Yu, Zhuorong Li, Lina Wei, Canghong Jin, Yun Zhang, Sixian Chan
-
Hallgrimur Thorsteinsson, Valdemar J Henriksen, Tong Chen, Raghavendra Selvan
-
Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Kaipeng Zhang
-
Counterfactual contrastive learning: robust representations via causal image synthesis
Melanie Roschewitz, Fabio De Sousa Ribeiro, Tian Xia, Galvin Khara, Ben Glocker
-
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang
-
Ciphertext-Only Attack on a Secure $k$-NN Computation on Cloud
Shyam Murthy, Santosh Kumar Upadhyaya, Srinivas Vivek
-
Optimistic Verifiable Training by Controlling Hardware Nondeterminism
Megha Srivastava, Simran Arora, Dan Boneh
-
Evaluating LLMs for Gender Disparities in Notable Persons
Lauren Rhue, Sofie Goethals, Arun Sundararajan
-
Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk
Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang
-
An Image Is Worth 1000 Lies: Adversarial Transferability across Prompts on Vision-Language Models
Haochen Luo, Jindong Gu, Fengyuan Liu, Philip Torr
-
Robust Subgraph Learning by Monitoring Early Training Representations
Sepideh Neshatfar, Salimeh Yasaei Sekeh
-
Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks
Roey Bokobza, Yisroel Mirsky
-
Robust Decision Aggregation with Adversarial Experts
Yongkang Guo, Yuqing Kong
-
Versatile Defense Against Adversarial Attacks on Image Recognition
Haibo Zhang, Zhihua Yao, Kouichi Sakurai
-
RAF-GI: Towards Robust, Accurate and Fast-Convergent Gradient Inversion Attack in Federated Learning
Can Liu, Jin Wang, Dongyang Yu
-
Yifei Gao, Jiaqi Wang, Zhiyu Lin, Jitao Sang
-
Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks
Khondoker Murad Hossain, Tim Oates
-
SoK: Reducing the Vulnerability of Fine-tuned Language Models to Membership Inference Attacks
Guy Amit, Abigail Goldsteen, Ariel Farkash
-
Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation
Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chen-Xiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu
-
A Bayesian Approach to OOD Robustness in Image Classification
Prakhar Kaushik, Adam Kortylewski, Alan Yuille
-
Exploring Safety Generalization Challenges of Large Language Models via Code
Qibing Ren, Chang Gao, Jing Shao, Junchi Yan, Xin Tan, Wai Lam, Lizhuang Ma
-
Tian Yu, Shaolei Zhang, Yang Feng
-
Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations
Chenyu You, Yifei Min, Weicheng Dai, Jasjeet S. Sekhon, Lawrence Staib, James S. Duncan
-
Backdoor Attack with Mode Mixture Latent Modification
Hongwei Zhang, Xiaoyin Xu, Dongsheng An, Xianfeng Gu, Min Zhang
-
FairRR: Pre-Processing for Group Fairness through Randomized Response
Xianli Zeng, Joshua Ward, Guang Cheng
-
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica
-
Poisoning Programs by Un-Repairing Code: Security Concerns of AI-generated Code
Cristina Improta
-
Chuangchuang Tan, Ping Liu, RenShuai Tao, Huan Liu, Yao Zhao, Baoyuan Wu, Yunchao Wei
-
PeerAiD: Improving Adversarial Distillation from a Specialized Peer Tutor
Jaewon Jung, Hongsun Jang, Jaeyong Song, Jinho Lee
-
Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification
Shuai Li, Xiaoguang Ma, Shancheng Jiang, Lu Meng
-
Intra-Section Code Cave Injection for Adversarial Evasion Attacks on Windows PE Malware File
Kshitiz Aryal, Maanak Gupta, Mahmoud Abdelsalam, Moustafa Saleh
-
Real is not True: Backdoor Attacks Against Deepfake Detection
Hong Sun, Ziqiang Li, Lei Liu, Bin Li
-
Stealing Part of a Production Language Model
Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Eric Wallace, David Rolnick, Florian Tramèr
-
Fuseini Mumuni, Alhassan Mumuni
-
In-context Prompt Learning for Test-time Vision Recognition with Frozen Vision-language Model
Junhui Yin, Xinyu Zhang, Lin Wu, Xianghua Xie, Xiaojie Wang
-
Federated Learning: Attacks, Defenses, Opportunities, and Challenges
Ghazaleh Shirvani, Saeid Ghasemshirazi, Behzad Beigzadeh
-
Attacking Transformers with Feature Diversity Adversarial Perturbation
Chenxing Gao, Hang Zhou, Junqing Yu, YuTeng Ye, Jiale Cai, Junle Wang, Wei Yang
-
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin
-
Hard-label based Small Query Black-box Adversarial Attack
Jeonghwan Park, Paul Miller, Niall McLaughlin
-
Wei Duan, Hui Liu
-
Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume
Ping Guo, Cheng Gong, Xi Lin, Zhiyuan Yang, Qingfu Zhang
-
Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu
-
Towards Multimodal Sentiment Analysis Debiasing via Bias Purification
Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai, Ke Li, Lihua Zhang
-
The Impact of Quantization on the Robustness of Transformer-based Text Classifiers
Seyed Parsa Neshaei, Yasaman Boreshban, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel
-
Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds
Tianrui Lou, Xiaojun Jia, Jindong Gu, Li Liu, Siyuan Liang, Bangyan He, Xiaochun Cao
-
Federated Learning Method for Preserving Privacy in Face Recognition System
Enoch Solomon, Abraham Woubie
-
EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV
Huiming Sun, Jiacheng Guo, Zibo Meng, Tianyun Zhang, Jianwu Fang, Yuewei Lin, Hongkai Yu
-
Eda Yilmaz, Hacer Yalim Keles
-
Privacy-preserving Fine-tuning of Large Language Models through Flatness
Tiejin Chen, Longchao Da, Huixue Zhou, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei
-
Cristiana Tiago, Sten Roar Snare, Jurica Sprem, Kristin McLeod
-
Membership Inference Attacks and Privacy in Topic Modeling
Nico Manzonelli, Wanrong Zhang, Salil Vadhan
-
Automatic and Universal Prompt Injection Attacks against Large Language Models
Xiaogeng Liu, Zhiyuan Yu, Yizhe Zhang, Ning Zhang, Chaowei Xiao
-
Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem
Dorjan Hitaj, Giulio Pagnotta, Fabio De Gaspari, Sediola Ruko, Briland Hitaj, Luigi V. Mancini, Fernando Perez-Cruz
-
Kalibinuer Tiliwalidi
-
Yingrui Ji, Yao Zhu, Zhigang Li, Jiansheng Chen, Yunlong Kong, Jingbo Chen
-
Probing the Robustness of Time-series Forecasting Models with CounterfacTS
Håkon Hanisch Kjærnli, Lluis Mas-Ribas, Aida Ashrafi, Gleb Sizov, Helge Langseth, Odd Erik Gundersen
-
Learning Adversarial MDPs with Stochastic Hard Constraints
Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti
-
Verified Training for Counterfactual Explanation Robustness under Data Shift
Anna P. Meyer, Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni
-
DeepEclipse: How to Break White-Box DNN-Watermarking Schemes
Alessandro Pegoraro, Carlotta Segna, Kavita Kumari, Ahmad-Reza Sadeghi
-
Neural Exec: Learning (and Learning from) Execution Triggers for Prompt Injection Attacks
Dario Pasquini, Martin Strohmeier, Carmela Troncoso
-
Unsupervised Contrastive Learning for Robust RF Device Fingerprinting Under Time-Domain Shift
Jun Chen, Weng-Keen Wong, Bechir Hamdaoui
-
Improving Adversarial Training using Vulnerability-Aware Perturbation Budget
Olukorede Fakorede, Modeste Atsague, Jin Tian
-
Effect of Ambient-Intrinsic Dimension Gap on Adversarial Vulnerability
Rajdeep Haldar, Yue Xing, Qifan Song
-
Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations
Xiaolin Sun, Zizhan Zheng
-
Fooling Neural Networks for Motion Forecasting via Adversarial Attacks
Edgar Medina, Leyong Loh
-
FLGuard: Byzantine-Robust Federated Learning via Ensemble of Contrastive Models
Younghan Lee, Yungi Cho, Woorim Han, Ho Bae, Yunheung Paek
-
Recall-Oriented Continual Learning with Generative Adversarial Meta-Model
Haneol Kang, Dong-Wan Choi
-
Towards Robust Federated Learning via Logits Calibration on Non-IID Data
Yu Qiao, Apurba Adhikary, Chaoning Zhang, Choong Seon Hong
-
XAI-Based Detection of Adversarial Attacks on Deepfake Detectors
Ben Pinhasov, Raz Lapid, Rony Ohayon, Moshe Sipper, Yehudit Aperstein
-
Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications
Stav Cohen, Ron Bitton, Ben Nassi
-
Enhancing Security in Federated Learning through Adaptive Consensus-Based Model Update Validation
Zahir Alsulaimawi
-
GuardT2I: Defending Text-to-Image Models from Adversarial Prompts
Yijun Yang, Ruiyuan Gao, Xiao Yang, Jianyuan Zhong, Qiang Xu
-
Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models
Arijit Ghosh Chowdhury, Md Mofijul Islam, Vaibhav Kumar, Faysal Hossain Shezan, Vaibhav Kumar, Vinija Jain, Aman Chadha
-
Query Recovery from Easy to Hard: Jigsaw Attack against SSE
Hao Nie, Wei Wang, Peng Xu, Xianglong Zhang, Laurence T. Yang, Kaitai Liang
-
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
Yifan Zeng, Yiran Wu, Xiao Zhang, Huazheng Wang, Qingyun Wu
-
AXOLOTL: Fairness through Assisted Self-Debiasing of Large Language Model Outputs
Sana Ebrahimi, Kaiwen Chen, Abolfazl Asudeh, Gautam Das, Nick Koudas
-
Robust Deep Reinforcement Learning Through Adversarial Attacks and Training : A Survey
Lucas Schott, Josephine Delas, Hatem Hajri, Elies Gherbi, Reda Yaich, Nora Boulahia-Cuppens, Frederic Cuppens, Sylvain Lamprier
-
DPP-Based Adversarial Prompt Searching for Lanugage Models
Xu Zhang, Xiaojun Wan
-
Takayuki Osa, Tatsuya Harada
-
Attacking Delay-based PUFs with Minimal Adversary Model
Hongming Fei, Owen Millwood, Prosanta Gope, Jack Miskelly, Biplab Sikdar
-
Differentially Private Worst-group Risk Minimization
Xinyu Zhou, Raef Bassily
-
Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification
Zihan Wang, Peiyi Wang, Houfeng Wang
-
MPAT: Building Robust Deep Neural Networks against Textual Adversarial Attacks
Fangyuan Zhang, Huichi Zhou, Shuangjiao Li, Hongtao Wang
-
PRSA: Prompt Reverse Stealing Attacks against Large Language Models
Yong Yang, Xuhong Zhang, Yi Jiang, Xi Chen, Haoyu Wang, Shouling Ji, Zonghui Wang
-
Sonal Joshi, Thomas Thebaud, Jesús Villalba, Najim Dehak
-
Syntactic Ghost: An Imperceptible General-purpose Backdoor Attacks on Pre-trained Language Models
Pengzhou Cheng, Wei Du, Zongru Wu, Fengwei Zhang, Libo Chen, Gongshen Liu
-
Watermark Stealing in Large Language Models
Nikola Jovanović, Robin Staab, Martin Vechev
-
PrivatEyes: Appearance-based Gaze Estimation Using Federated Secure Multi-Party Computation
Mayar Elfares, Pascal Reisert, Zhiming Hu, Wenwu Tang, Ralf Küsters, Andreas Bulling
-
Typographic Attacks in Large Multimodal Models Can be Alleviated by More Informative Prompts
Hao Cheng, Erjia Xiao, Renjing Xu
-
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance
Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik
-
Verification of Neural Networks' Global Robustness
Anan Kabaha, Dana Drachsler-Cohen
-
Hongbang Yuan, Pengfei Cao, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao
-
Pointing out the Shortcomings of Relation Extraction Models with Semantically Motivated Adversarials
Gennaro Nolano, Moritz Blum, Basil Ell, Philipp Cimiano
-
LoRA-as-an-Attack! Piercing LLM Safety Under The Share-and-Play Scenario
Hongyi Liu, Zirui Liu, Ruixiang Tang, Jiayi Yuan, Shaochen Zhong, Yu-Neng Chuang, Li Li, Rui Chen, Xia Hu
-
Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, Kai Chen
-
Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks
Zhewei Wu, Ruilong Yu, Qihe Liu, Shuying Cheng, Shilin Qiu, Shijie Zhou
-
Catastrophic Overfitting: A Potential Blessing in Disguise
Mengnan Zhao, Lihe Zhang, Yuqiu Kong, Baocai Yin
-
A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems
Fangzhou Wu, Ning Zhang, Somesh Jha, Patrick McDaniel, Chaowei Xiao
-
Unveiling Privacy, Memorization, and Input Curvature Links
Deepak Ravikumar, Efstathia Soufleri, Abolfazl Hashemi, Kaushik Roy
-
Alexander Unnervik, Hatef Otroshi Shahreza, Anjith George, Sébastien Marcel
-
Pre-training Differentially Private Models with Limited Public Data
Zhiqi Bu, Xinwei Zhang, Mingyi Hong, Sheng Zha, George Karypis
-
Exploring Privacy and Fairness Risks in Sharing Diffusion Models: An Adversarial Perspective
Xinjian Luo, Yangfan Jiang, Fei Wei, Yuncheng Wu, Xiaokui Xiao, Beng Chin Ooi
-
Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue
Zhenhong Zhou, Jiuyang Xiang, Haopeng Chen, Quan Liu, Zherui Li, Sen Su
-
FairBelief - Assessing Harmful Beliefs in Language Models
Mattia Setzu, Marta Marchiori Manerba, Pasquale Minervini, Debora Nozza
-
Extreme Miscalibration and the Illusion of Adversarial Robustness
Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis
-
Enhancing Quality of Compressed Images by Mitigating Enhancement Bias Towards Compression Domain
Qunliang Xing, Mai Xu, Shengxi Li, Xin Deng, Meisong Zheng, Huaida Liu, Ying Chen
-
Preserving Fairness Generalization in Deepfake Detection
Li Lin, Xinan He, Yan Ju, Xin Wang, Feng Ding, Shu Hu
-
Black-box Adversarial Attacks Against Image Quality Assessment Models
Yu Ran, Ao-Xiang Zhang, Mingjie Li, Weixuan Tang, Yuan-Gen Wang
-
Structure-Guided Adversarial Training of Diffusion Models
Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui
-
Towards Fairness-Aware Adversarial Learning
Yanghao Zhang, Tianle Zhang, Ronghui Mu, Xiaowei Huang, Wenjie Ruan
-
Model X-ray: Detect Backdoored Models via Decision Boundary
Yanghao Su, Jie Zhang, Ting Xu, Tianwei Zhang, Weiming Zhang, Nenghai Yu
-
Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates
Daniele Angioni, Luca Demetrio, Maura Pintor, Luca Oneto, Davide Anguita, Battista Biggio, Fabio Roli
-
Lorenzo Peracchio, Giovanna Nicora, Enea Parimbelli, Tommaso Mario Buonocore, Roberto Bergamaschi, Eleonora Tavazzi, Arianna Dagliati, Riccardo Bellazzi
-
LLM-Resistant Math Word Problem Generation via Adversarial Attacks
Roy Xie, Chengxuan Huang, Junlin Wang, Bhuwan Dhingra
-
Bo Yang, Hengwei Zhang, Chenwei Li, Jindong Wang
-
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Xuantong Liu, Tianyang Hu, Wenjia Wang, Kenji Kawaguchi, Yuan Yao
-
Investigating Deep Watermark Security: An Adversarial Transferability Perspective
Biqing Qi, Junqi Gao, Yiang Luo, Jianxing Liu, Ligang Wu, Bowen Zhou
-
Training Implicit Generative Models via an Invariant Statistical Loss
José Manuel de Frutos, Pablo M. Olmos, Manuel A. Vázquez, Joaquín Míguez
-
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models
Huijie Lv, Xiao Wang, Yuansen Zhang, Caishuang Huang, Shihan Dou, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang
-
Immunization against harmful fine-tuning attacks
Domenic Rosati, Jan Wehner, Kai Williams, Łukasz Bartoszcze, Jan Batzner, Hassan Sajjad, Frank Rudzicz
-
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions
Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang
-
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang
-
Unveiling Vulnerability of Self-Attention
Khai Jiet Liong, Hongqiu Wu, Hai Zhao
-
Edge Detectors Can Make Deep Convolutional Neural Networks More Robust
Jin Ding, Jie-Chao Zhao, Yong-Zhi Sun, Ping Tan, Jia-Wei Wang, Ji-En Ma, You-Tong Fang
-
Improving the JPEG-resistance of Adversarial Attacks on Face Recognition by Interpolation Smoothing
Kefu Guo, Fengfan Zhou, Hefei Ling, Ping Li, Hui Liu
-
On the (In)feasibility of ML Backdoor Detection as an Hypothesis Testing Problem
Georg Pichler, Marco Romanelli, Divya Prakash Manivannan, Prashanth Krishnamurthy, Farshad Khorrami, Siddharth Garg
-
Leonid Boytsov, Ameya Joshi, Filipe Condessa
-
Hao Wang, Hao Li, Minlie Huang, Lei Sha
-
Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing
Jiabao Ji, Bairu Hou, Alexander Robey, George J. Pappas, Hamed Hassani, Yang Zhang, Eric Wong, Shiyu Chang
-
Adversarial-Robust Transfer Learning for Medical Imaging via Domain Assimilation
Xiaohui Chen, Tie Luo
-
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh
-
LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper
Daoyuan Wu, Shuai Wang, Yang Liu, Ning Liu
-
Jiawei Zhou, Linye Lyu, Daojing He, Yu Li
-
On the Duality Between Sharpness-Aware Minimization and Adversarial Training
Yihao Zhang, Hangzhou He, Jingyu Zhu, Huanran Chen, Yifei Wang, Zeming Wei
-
Yi Zhang, Yun Tang, Wenjie Ruan, Xiaowei Huang, Siddartha Khastgir, Paul Jennings, Xingyu Zhao
-
A First Look at GPT Apps: Landscape and Vulnerability
Zejun Zhang, Li Zhang, Xin Yuan, Anlan Zhang, Mengwei Xu, Feng Qian
-
Fast Adversarial Attacks on Language Models In One GPU Minute
Vinu Sankar Sadasivan, Shoumik Saha, Gaurang Sriramanan, Priyatham Kattakinda, Atoosa Chegini, Soheil Feizi
-
Distilling Adversarial Robustness Using Heterogeneous Teachers
Jieren Deng, Aaron Palmer, Rigel Mahmood, Ethan Rathbun, Jinbo Bi, Kaleel Mahmood, Derek Aguiar
-
Low-Frequency Black-Box Backdoor Attack via Evolutionary Algorithm
Yanqi Qiao, Dazhuang Liu, Rui Wang, Kaitai Liang
-
The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)
Shenglai Zeng, Jiankun Zhang, Pengfei He, Yue Xing, Yiding Liu, Han Xu, Jie Ren, Shuaiqiang Wang, Dawei Yin, Yi Chang, Jiliang Tang
-
Futa Waseda, Isao Echizen
-
Xinshuo Hu, Baotian Hu, Dongfang Li, Xiaoguang Li, Lifeng Shang
-
Jinxu Zhao, Guanting Dong, Yueyan Qiu, Tingfeng Hui, Xiaoshuai Song, Daichi Guo, Weiran Xu
-
Quadruplet Loss For Improving the Robustness to Face Morphing Attacks
Iurii Medvedev, Nuno Gonçalves
-
BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
Catherine Weaver, Chen Tang, Ce Hao, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan
-
COBIAS: Contextual Reliability in Bias Assessment
Priyanshul Govil, Vamshi Krishna Bonagiri, Manas Gaur, Ponnurangam Kumaraguru, Sanorita Dey
-
Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images
Zefeng Wang, Zhen Han, Shuo Chen, Fan Xue, Zifeng Ding, Xun Xiao, Volker Tresp, Philip Torr, Jindong Gu
-
Mitigating Fine-tuning Jailbreak Attack with Backdoor Enhanced Alignment
Jiongxiao Wang, Jiazhao Li, Yiquan Li, Xiangyu Qi, Muhao Chen, Junjie Hu, Yixuan Li, Bo Li, Chaowei Xiao
-
Mudjacking: Patching Backdoor Vulnerabilities in Foundation Models
Hongbin Liu, Michael K. Reiter, Neil Zhenqiang Gong
-
SoK: Analyzing Adversarial Examples: A Framework to Study Adversary Knowledge
Lucas Fenaux, Florian Kerschbaum
-
Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content
Federico Bianchi, James Zou
-
SDXL-Lightning: Progressive Adversarial Diffusion Distillation
Shanchuan Lin, Anran Wang, Xiao Yang
-
GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis
Yueqi Xie, Minghong Fang, Renjie Pi, Neil Gong
-
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment
Vyas Raina, Adian Liusie, Mark Gales
-
Learning to Poison Large Language Models During Instruction Tuning
Yao Qiang, Xiangyu Zhou, Saleh Zare Zade, Mohammad Amin Roshani, Douglas Zytko, Dongxiao Zhu
-
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein
-
Robustness of Deep Neural Networks for Micro-Doppler Radar Classification
Mikolaj Czerkawski, Carmine Clemente, Craig Michie, Christos Tachtatzis
-
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models
Jiawei Liang, Siyuan Liang, Man Luo, Aishan Liu, Dongchen Han, Ee-Chien Chang, Xiaochun Cao
-
Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits
Zhiwei Wang, Huazheng Wang, Hongning Wang
-
AttackGNN: Red-Teaming GNNs in Hardware Security Using Reinforcement Learning
Vasudev Gohil, Satwik Patnaik, Dileep Kalathil, Jeyavijayan Rajendran
-
Uncertainty-driven and Adversarial Calibration Learning for Epicardial Adipose Tissue Segmentation
Kai Zhao, Zhiming Liu, Jiaqi Liu, Jingbiao Zhou, Bihong Liao, Huifang Tang, Qiuyu Wang, Chunquan Li
-
Fake Resume Attacks: Data Poisoning on Online Job Platforms
Michiharu Yamashita, Thanh Tran, Dongwon Lee
-
Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs
Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Han Fang, Aishan Liu, Ee-Chien Chang
-
TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification
Martin Gubri, Dennis Ulmer, Hwaran Lee, Sangdoo Yun, Seong Joon Oh
-
VGMShield: Mitigating Misuse of Video Generative Models
Yan Pang, Yang Zhang, Tianhao Wang
-
Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors
Yiwei Lu, Matthew Y.R. Yang, Gautam Kamath, Yaoliang Yu
-
Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies
Xiangyu Liu, Chenghao Deng, Yanchao Sun, Yongyuan Liang, Furong Huang
-
Defending Jailbreak Prompts via In-Context Adversarial Game
Yujun Zhou, Yufei Han, Haomin Zhuang, Taicheng Guo, Kehan Guo, Zhenwen Liang, Hongyan Bao, Xiangliang Zhang
-
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
-
ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs
Pengrui Han, Rafal Kocielnik, Adhithya Saravanan, Roy Jiang, Or Sharir, Anima Anandkumar
-
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space
Zongru Wu, Zhuosheng Zhang, Pengzhou Cheng, Gongshen Liu
-
Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning
Shuai Zhao, Leilei Gan, Luu Anh Tuan, Jie Fu, Lingjuan Lyu, Meihuizi Jia, Jinming Wen
-
Dynamic Environment Responsive Online Meta-Learning with Fairness Awareness
Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Feng Chen
-
Query-Based Adversarial Prompt Generation
Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr
-
AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization
Jiyao Li, Mingze Ni, Yifei Dong, Tianqing Zhu, Wei Liu
-
Leo Hyun Park, Jaeuk Kim, Myung Gyo Oh, Jaewoo Park, Taekyoung Kwon
-
Privacy-Preserving Low-Rank Adaptation for Latent Diffusion Models
Zihao Luo, Xilie Xu, Feng Liu, Yun Sing Koh, Di Wang, Jingfeng Zhang
-
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao
-
Attacks on Node Attributes in Graph Neural Networks
Ying Xu, Michael Lanier, Anindya Sarkar, Yevgeniy Vorobeychik
-
How Susceptible are Large Language Models to Ideological Manipulation?
Kai Chen, Zihao He, Jun Yan, Taiwei Shi, Kristina Lerman
-
Stealthy Attack on Large Language Model based Recommendation
Jinghao Zhang, Yuting Liu, Qiang Liu, Shu Wu, Guibing Guo, Liang Wang
-
Token-Ensemble Text Generation: On Attacking the Automatic AI-Generated Text Detection
Fan Huang, Haewoon Kwak, Jisun An
-
Maintaining Adversarial Robustness in Continuous Learning
Xiaolei Ru, Xiaowei Cao, Zijia Liu, Jack Murdoch Moore, Xin-Ya Zhang, Xia Zhu, Wenjia Wei, Gang Yan
-
Disclosure and Mitigation of Gender Bias in LLMs
Xiangjue Dong, Yibo Wang, Philip S. Yu, James Caverlee
-
Zhongliang Guo, Weiye Li, Yifei Qian, Ognjen Arandjelović, Lei Fang
-
Connect the dots: Dataset Condensation, Differential Privacy, and Adversarial Uncertainty
Kenneth Odoh
-
Adversarial Curriculum Graph Contrastive Learning with Pair-wise Augmentation
Xinjian Zhao, Liang Zhang, Yang Liu, Ruocheng Guo, Xiangyu Zhao
-
Zero-shot sampling of adversarial entities in biomedical question answering
R. Patrick Xian, Alex J. Lee, Vincent Wang, Qiming Cui, Russell Ro, Reza Abbasi-Asl
-
Universal Prompt Optimizer for Safe Text-to-Image Generation
Zongyu Wu, Hongcheng Gao, Yueze Wang, Xiang Zhang, Suhang Wang
-
Uncertainty, Calibration, and Membership Inference Attacks: An Information-Theoretic Perspective
Meiyi Zhu, Caili Guo, Chunyan Feng, Osvaldo Simeone
-
ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment
Yangyifei Luo, Zhuo Chen, Lingbing Guo, Qian Li, Wenxuan Zeng, Zhixin Cai, Jianxin Li
-
Yixin Wan, Kai-Wei Chang
-
VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models
Ziyi Yin, Muchao Ye, Tianrong Zhang, Jiaqi Wang, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma
-
DART: A Principled Approach to Adversarially Robust Unsupervised Domain Adaptation
Yunjuan Wang, Hussein Hazimeh, Natalia Ponomareva, Alexey Kurakin, Ibrahim Hammoud, Raman Arora
-
Generating Visual Stimuli from EEG Recordings using Transformer-encoder based EEG encoder and GAN
Rahul Mishra, Arnav Bhavsar
-
Improving EEG Signal Classification Accuracy Using Wasserstein Generative Adversarial Networks
Joshua Park, Priyanshu Mahey, Ore Adeniyi
-
Alvin Grissom II, Ryan F. Lei, Jeova Farias Sales Rocha Neto, Bailey Lin, Ryan Trotter
-
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents
Lingbo Mo, Zeyi Liao, Boyuan Zheng, Yu Su, Chaowei Xiao, Huan Sun
-
Align before Attend: Aligning Visual and Textual Features for Multimodal Hateful Content Detection
Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque, Sarah M. Preum
-
Álvaro Huertas-García, Alejandro Martín, Javier Huertas-Tato, David Camacho
-
FedRDF: A Robust and Dynamic Aggregation Function against Poisoning Attacks in Federated Learning
Enrique Mármol Campos, Aurora González Vidal, José Luis Hernández Ramos, Antonio Skarmeta
-
Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov, Aladin Virmaux, Giuseppe Paolo, Themis Palpanas, Ievgen Redko
-
PAL: Proxy-Guided Black-Box Attack on Large Language Models
Chawin Sitawarin, Norman Mu, David Wagner, Alexandre Araujo
-
Reward Poisoning Attack Against Offline Reinforcement Learning
Yinglun Xu, Rohan Gumaste, Gagandeep Singh
-
AbuseGPT: Abuse of Generative AI ChatBots to Create Smishing Campaigns
Ashfak Md Shibli, Mir Mehedi A. Pritom, Maanak Gupta
-
Privacy Attacks in Decentralized Learning
Abdellah El Mrini, Edwige Cyffers, Aurélien Bellet
-
How Much Does Each Datapoint Leak Your Privacy? Quantifying the Per-datum Membership Leakage
Achraf Azize, Debabrota Basu
-
Risk-Sensitive Soft Actor-Critic for Robust Deep Reinforcement Learning under Distribution Shifts
Tobias Enders, James Harrison, Maximilian Schiffer
-
Backdoor Attack against One-Class Sequential Anomaly Detection Models
He Cheng, Shuhan Yuan
-
Exploring the Adversarial Capabilities of Large Language Models
Lukas Struppek, Minh Hieu Le, Dominik Hintersdorf, Kristian Kersting
-
Weiheng Chai, Brian Testa, Huantao Ren, Asif Salekin, Senem Velipasalar
-
Review-Incorporated Model-Agnostic Profile Injection Attacks on Recommender Systems
Shiyi Yang, Lina Yao, Chen Wang, Xiwei Xu, Liming Zhu
-
Attacking Large Language Models with Projected Gradient Descent
Simon Geisler, Tom Wollschläger, M. H. I. Abdalla, Johannes Gasteiger, Stephan Günnemann
-
Detecting Adversarial Spectrum Attacks via Distance to Decision Boundary Statistics
Wenwei Zhao, Xiaowen Li, Shangqing Zhao, Jie Xu, Yao Liu, Zhuo Lu
-
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
-
Yuhui Shi, Qiang Sheng, Juan Cao, Hao Mi, Beizhe Hu, Danding Wang
-
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang
-
Play Guessing Game with LLM: Indirect Jailbreak Attack with Implicit Clues
Zhiyuan Chang, Mingyang Li, Yi Liu, Junjie Wang, Qing Wang, Yang Liu
-
Andrew Lowy, Zhuohang Li, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang
-
Data Reconstruction Attacks and Defenses: A Systematic Evaluation
Sheng Liu, Zihan Wang, Qi Lei
-
Faster Repeated Evasion Attacks in Tree Ensembles
Lorenzo Cascioli, Laurens Devos, Ondřej Kuželka, Jesse Davis
-
Generating Universal Adversarial Perturbations for Quantum Classifiers
Gautham Anil, Vishnu Vinod, Apurva Narayan
-
Qiyuan An, Christos Sevastopoulos, Fillia Makedon
-
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xingang Guo, Fangxu Yu, Huan Zhang, Lianhui Qin, Bin Hu
-
Test-Time Backdoor Attacks on Multimodal Large Language Models
Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin
-
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
-
Oracle-Efficient Differentially Private Learning with Public Data
Adam Block, Mark Bun, Rathin Desai, Abhishek Shetty, Steven Wu
-
PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining
Mishaal Kazmi, Hadrien Lautraite, Alireza Akbari, Mauricio Soroco, Qiaoyue Tang, Tao Wang, Sébastien Gambs, Mathias Lécuyer
-
Xabier Echeberria-Barrio, Amaia Gil-Lerchundi, Jon Egana-Zubia, Raul Orduna-Urrutia
-
Topological safeguard for evasion attack interpreting the neural networks' behavior
Xabier Echeberria-Barrio, Amaia Gil-Lerchundi, Iñigo Mendialdua, Raul Orduna-Urrutia
-
PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models
Wei Zou, Runpeng Geng, Binghui Wang, Jinyuan Jia
-
OrderBkd: Textual backdoor attack through repositioning
Irina Alekseevskaia, Konstantin Arkhipenko
-
Customizable Perturbation Synthesis for Robust SLAM Benchmarking
Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang
-
Multi-Attribute Vision Transformers are Efficient and Robust Learners
Hanan Gani, Nada Saadi, Noor Hussein, Karthik Nandakumar
-
Do Membership Inference Attacks Work on Large Language Models?
Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi, Luke Zettlemoyer, Yulia Tsvetkov, Yejin Choi, David Evans, Hannaneh Hajishirzi
-
NeuralSentinel: Safeguarding Neural Network Reliability and Trustworthiness
Xabier Echeberria-Barrio, Mikel Gorricho, Selene Valencia, Francesco Zola
-
Yunzhe Xue, Usman Roshan
-
A Random Ensemble of Encrypted Vision Transformers for Adversarially Robust Defense
Ryota Iijima, Sayaka Shiota, Hitoshi Kiya
-
Architectural Neural Backdoors from First Principles
Harry Langford, Ilia Shumailov, Yiren Zhao, Robert Mullins, Nicolas Papernot
-
TETRIS: Towards Exploring the Robustness of Interactive Segmentation
Andrey Moskalenko, Vlad Shakhuro, Anna Vorontsova, Anton Konushin, Anton Antonov, Alexander Krapukhin, Denis Shepelev, Konstantin Soshin
-
RAMP: Boosting Adversarial Robustness Against Multiple lp Perturbations
Enyi Jiang, Gagandeep Singh
-
Anomaly Unveiled: Securing Image Classification against Adversarial Patch Attacks
Nandish Chattopadhyay, Amira Guesmi, Muhammad Shafique
-
Evaluating Membership Inference Attacks and Defenses in Federated Learning
Gongxi Zhu, Donghao Li, Hanlin Gu, Yuxing Han, Yuan Yao, Lixin Fan, Qiang Yang
-
Studious Bob Fight Back Against Jailbreaking via Prompt Adversarial Tuning
Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang
-
StruQ: Defending Against Prompt Injection with Structured Queries
Sizhe Chen, Julien Piet, Chawin Sitawarin, David Wagner
-
Quantifying and Enhancing Multi-modal Robustness with Modality Preference
Zequn Yang, Yake Wei, Ce Liang, Di Hu
-
Jialuo He, Wei Chen, Xiaojin Zhang
-
Is Adversarial Training with Compressed Datasets Effective?
Tong Chen, Raghavendra Selvan
-
Investigating White-Box Attacks for On-Device Models
Mingyi Zhou, Xiang Gao, Jing Wu, Kui Liu, Hailong Sun, Li Li
-
Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang
-
Savvy: Trustworthy Autonomous Vehicles Architecture
Ali Shoker, Rehana Yasmin, Paulo Esteves-Verissimo
-
Chaojun Xiao, Pengle Zhang, Xu Han, Guangxuan Xiao, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Song Han, Maosong Sun
-
Adversarial Robustness Through Artifact Design
Tsufit Shua, Mahmood Sharif
-
Group Distributionally Robust Dataset Distillation with Risk Minimization
Saeed Vahidian, Mingyu Wang, Jianyang Gu, Vyacheslav Kungurtsev, Wei Jiang, Yiran Chen
-
EvoSeed: Unveiling the Threat on Deep Neural Networks with Real-World Illusions
Shashank Kotyan, PoYuan Mao, Danilo Vasconcellos Vargas
-
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models
Lijun Li, Bowen Dong, Ruohui Wang, Xuhao Hu, Wangmeng Zuo, Dahua Lin, Yu Qiao, Jing Shao
-
Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models
Chirag Agarwal, Sree Harsha Tanneru, Himabindu Lakkaraju
-
Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation
Pedro Vianna, Muawiz Chaudhary, Paria Mehrbod, An Tang, Guy Cloutier, Guy Wolf, Michael Eickenberg, Eugene Belilovsky
-
De-amplifying Bias from Differential Privacy in Language Model Fine-tuning
Sanjari Srivastava, Piotr Mardziel, Zhikun Zhang, Archana Ahlawat, Anupam Datta, John C Mitchell
-
Partially Recentralization Softmax Loss for Vision-Language Models Robustness
Hao Wang, Xin Zhang, Jinzhe Jiang, Yaqian Zhao, Chen Li
-
Lei Yu, Meng Han, Yiming Li, Changting Lin, Yao Zhang, Mingyang Zhang, Yan Liu, Haiqin Weng, Yuseok Jeon, Ka-Ho Chow, Stacy Patterson
-
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
Qinliang Lin, Cheng Luo, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, Siyang Song
-
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science
Xiangru Tang, Qiao Jin, Kunlun Zhu, Tongxin Yuan, Yichi Zhang, Wangchunshu Zhou, Meng Qu, Yilun Zhao, Jian Tang, Zhuosheng Zhang, Arman Cohan, Zhiyong Lu, Mark Gerstein
-
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks
-
Measuring Implicit Bias in Explicitly Unbiased Large Language Models
Xuechunzi Bai, Angelina Wang, Ilia Sucholutsky, Thomas L. Griffiths
-
Privacy Leakage on DNNs: A Survey of Model Inversion Attacks and Defenses
Hao Fang, Yixiang Qiu, Hongyao Yu, Wenbo Yu, Jiawei Kong, Baoli Chong, Bin Chen, Xuan Wang, Shu-Tao Xia
-
Fairness and Privacy Guarantees in Federated Contextual Bandits
Sambhav Solanki, Shweta Jain, Sujit Gujar
-
PAC-Bayesian Adversarially Robust Generalization Bounds for Graph Neural Network
Tan Sun, Junhong Lin
-
Towards Fair, Robust and Efficient Client Contribution Evaluation in Federated Learning
Meiying Zhang, Huan Zhao, Sheldon Ebron, Kan Yang
-
Enyan Dai, Minhua Lin, Suhang Wang
-
Adversarially Robust Deepfake Detection via Adversarial Feature Similarity Learning
Sarwar Khan
-
Exploiting Class Probabilities for Black-box Sentence-level Attacks
Raha Moraffah, Huan Liu
-
A Generative Approach to Surrogate-based Black-box Attacks
Raha Moraffah, Huan Liu
-
Evading Data Contamination Detection for Language Models is (too) Easy
Jasper Dekoninck, Mark Niklas Müller, Maximilian Baader, Marc Fischer, Martin Vechev
-
Homograph Attacks on Maghreb Sentiment Analyzers
Fatima Zahra Qachfar, Rakesh M. Verma
-
Conversation Reconstruction Attack Against GPT Models
Junjie Chu, Zeyang Sha, Michael Backes, Yang Zhang
-
Haibo Jin, Ruoxi Chen, Andy Zhou, Jinyin Chen, Yang Zhang, Haohan Wang
-
Shuai Li, Xiaoyu Jiang, Xiaoguang Ma
-
DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models
Yang Sui, Huy Phan, Jinqi Xiao, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan
-
Time-Distributed Backdoor Attacks on Federated Spiking Learning
Gorka Abad, Stjepan Picek, Aitor Urbieta
-
Adversarial Data Augmentation for Robust Speaker Verification
Zhenyu Zhou, Junhui Chen, Namin Wang, Lantian Li, Dong Wang
-
Arabic Synonym BERT-based Adversarial Examples for Text Classification
Norah Alshahrani, Saied Alshahrani, Esma Wali, Jeanna Matthews
-
Generalization Properties of Adversarial Training for $\ell_0$-Bounded Adversarial Attacks
Payam Delgosha, Hamed Hassani, Ramtin Pedarsani
-
DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers
Oryan Yehezkel, Alon Zolfi, Amit Baras, Yuval Elovici, Asaf Shabtai
-
PROSAC: Provably Safe Certification for Machine Learning Models under Adversarial Attacks
Ziquan Liu, Zhuo Zhi, Ilija Bogunovic, Carsten Gerner-Beuerle, Miguel Rodrigues
-
Seeing is not always believing: The Space of Harmless Perturbations
Lu Chen, Shaofeng Li, Benhao Huang, Fan Yang, Zheng Li, Jie Li, Yuan Luo
-
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
Haoran Li, Zicheng Zhang, Wang Luo, Congying Han, Yudong Hu, Tiande Guo, Shichen Liao
-
Yun-Wei Chu, Dong-Jun Han, Seyyedali Hosseinalipour, Christopher G. Brinton
-
Data Poisoning for In-context Learning
Pengfei He, Han Xu, Yue Xing, Hui Liu, Makoto Yamada, Jiliang Tang
-
Universal Post-Training Reverse-Engineering Defense Against Backdoors in Deep Neural Networks
Xi Li, Hang Wang, David J. Miller, George Kesidis
-
MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers
Yatong Bai, Mo Zhou, Vishal M. Patel, Somayeh Sojoudi
-
Trustworthy Distributed AI Systems: Robustness, Privacy, and Governance
Wenqi Wei, Ling Liu
-
STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition
Yi Chang, Zhao Ren, Zixing Zhang, Xin Jing, Kun Qian, Xi Shao, Bin Hu, Tanja Schultz, Björn W. Schuller
-
Delving into Decision-based Black-box Attacks on Semantic Segmentation
Zhaoyu Chen, Zhengyang Shan, Jingwen Chang, Kaixun Jiang, Dingkang Yang, Yiting Cheng, Wenqiang Zhang
-
Synthetic Data for the Mitigation of Demographic Biases in Face Recognition
Pietro Melzi, Christian Rathgeb, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Dominik Lawatsch, Florian Domin, Maxim Schaubert
-
Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors
Dingcheng Yang, Yang Bai, Xiaojun Jia, Yang Liu, Xiaochun Cao, Wenjian Yu
-
SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding
Chanho Park, Namyoon Lee
-
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
Calum Heggan, Sam Budgett, Timothy Hospedales, Mehrdad Yaghoobi
-
$\sigma$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples
Antonio Emanuele Cinà, Francesco Villani, Maura Pintor, Lea Schönherr, Battista Biggio, Marcello Pelillo
-
Xi Li, Jiaqi Wang
-
HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
Han Liu, Zhi Xu, Xiaotong Zhang, Feng Zhang, Fenglong Ma, Hongyang Chen, Hong Yu, Xianchao Zhang
-
Preference Poisoning Attacks on Reward Model Learning
Junlin Wu, Jiongxiao Wang, Chaowei Xiao, Chenguang Wang, Ning Zhang, Yevgeniy Vorobeychik
-
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Xixu Hu, Runkai Zheng, Jindong Wang, Cheuk Hang Leung, Qi Wu, Xing Xie
-
Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection
Xinlin Peng, Ying Zhou, Ben He, Le Sun, Yingfei Sun
-
Safety of Multimodal Large Language Models on Images and Text
Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao
-
Masked Conditional Diffusion Model for Enhancing Deepfake Detection
Tiewen Chen, Shanmin Yang, Shu Hu, Zhenghan Fang, Ying Fu, Xi Wu, Xin Wang
-
Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks
Maan Qraitem, Nazia Tasnim, Kate Saenko, Bryan A. Plummer
-
Approximating Optimal Morphing Attacks using Template Inversion
Laurent Colbois, Hatef Otroshi Shahreza, Sébastien Marcel
-
Tropical Decision Boundaries for Neural Networks Are Robust Against Adversarial Attacks
Kurt Pasque, Christopher Teska, Ruriko Yoshida, Keiji Miura, Jefferson Huang
-
Short: Benchmarking Transferable Adversarial Attacks
Zhibo Jin, Jiayu Zhang, Zhiyu Zhu, Huaming Chen
-
Large Language Models Based Fuzzing Techniques: A Survey
Linghan Huang, Peizhou Zhao, Huaming Chen, Lei Ma
-
Investigating Bias Representations in Llama 2 Chat via Activation Steering
Dawn Lu, Nina Rimsky
-
Invariance-powered Trustworthy Defense via Remove Then Restore
Xiaowei Fu, Yuhang Zhou, Lina Ma, Lei Zhang
-
Survey of Privacy Threats and Countermeasures in Federated Learning
Masahiro Hayashitani, Junki Mori, Isamu Teranishi
-
Yuqing Wang, Malvika Pillai, Yun Zhao, Catherine Curtin, Tina Hernandez-Boussard
-
Hamed Poursiami, Ihsen Alouani, Maryam Parsa
-
Manipulating Predictions over Discrete Inputs in Machine Teaching
Xiaodong Wu, Yufei Han, Hayssam Dahrouj, Jianbing Ni, Zhenwen Liang, Xiangliang Zhang
-
Unified Physical-Digital Face Attack Detection
Hao Fang, Ajian Liu, Haocheng Yuan, Junze Zheng, Dingheng Zeng, Yanhong Liu, Jiankang Deng, Sergio Escalera, Xiaoming Liu, Jun Wan, Zhen Lei
-
Logit Poisoning Attack in Distillation-based Federated Learning and its Countermeasures
Yonghao Yu, Shunan Zhu, Jinglu Hu
-
Adversarial Quantum Machine Learning: An Information-Theoretic Generalization Analysis
Petros Georgiou, Sharu Theresa Jose, Osvaldo Simeone
-
Common Sense Reasoning for Deep Fake Detection
Yue Zhang, Ben Colman, Ali Shahriyari, Gaurav Bharaj
-
Privacy and Security Implications of Cloud-Based AI Services : A Survey
Alka Luqman, Riya Mahesh, Anupam Chattopadhyay
-
An Early Categorization of Prompt Injection Attacks on Large Language Models
Sippo Rossi, Alisia Marianne Michel, Raghava Rao Mukkamala, Jason Bennett Thatcher
-
Chenan Wang, Pu Zhao, Siyue Wang, Xue Lin
-
Steffi Chern, Ethan Chern, Graham Neubig, Pengfei Liu
-
Finetuning Large Language Models for Vulnerability Detection
Alexey Shestov, Anton Cheshkov, Rodion Levichev, Ravil Mussabayev, Pavel Zadorozhny, Evgeny Maslov, Chibirev Vadim, Egor Bulychev
-
Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks
Andy Zhou, Bo Li, Haohan Wang
-
Gradient-Based Language Model Red Teaming
Nevan Wichers, Carson Denison, Ahmad Beirami
-
Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers
Lei Xu, Sarah Alnegheimish, Laure Berti-Equille, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
-
Weak-to-Strong Jailbreaking on Large Language Models
Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei Li, Yu-Xiang Wang, William Yang Wang
-
Optimal-Landmark-Guided Image Blending for Face Morphing Attacks
Qiaoyun He, Zongyong Deng, Zuyuan He, Qijun Zhao
-
Towards Assessing the Synthetic-to-Measured Adversarial Vulnerability of SAR ATR
Bowen Peng, Bo Peng, Jingyuan Xia, Tianpeng Liu, Yongxiang Liu, Li Liu
-
Revisiting Gradient Pruning: A Dual Realization for Defending against Gradient Attacks
Lulu Xue, Shengshan Hu, Ruizhi Zhao, Leo Yu Zhang, Shengqing Hu, Lichao Sun, Dezhong Yao
-
Systematically Assessing the Security Risks of AI/ML-enabled Connected Healthcare Systems
Mohammed Elnawawy, Mohammadreza Hallajiyan, Gargi Mitra, Shahrear Iqbal, Karthik Pattabiraman
-
Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code
Wenjie Qu, Dong Yin, Zixin He, Wei Zou, Tianyang Tao, Jinyuan Jia, Jiaheng Zhang
-
AdvGPS: Adversarial GPS for Multi-Agent Perception Attack
Jinlong Li, Baolu Li, Xinyu Liu, Jianwu Fang, Felix Juefei-Xu, Qing Guo, Hongkai Yu
-
Security and Privacy Challenges of Large Language Models: A Survey
Badhan Chandra Das, M. Hadi Amini, Yanzhao Wu
-
Large Language Models in Cybersecurity: State-of-the-Art
Farzad Nourmohammadzadeh Motlagh, Mehrdad Hajizadeh, Mehryar Majd, Pejman Najafi, Feng Cheng, Christoph Meinel
-
Adversarial Training on Purification (AToP): Advancing Both Robustness and Generalization
Guang Lin, Chao Li, Jianhai Zhang, Toshihisa Tanaka, Qibin Zhao
-
Finding Challenging Metaphors that Confuse Pretrained Language Models
Yucheng Li, Frank Guerin, Chenghua Lin
-
Transparency Attacks: How Imperceptible Image Layers Can Fool AI Perception
Forrest McKee, David Noever
-
TransTroj: Transferable Backdoor Attacks to Pre-trained Models via Embedding Indistinguishability
Hao Wang, Tao Xiang, Shangwei Guo, Jialing He, Hangcheng Liu, Tianwei Zhang
-
AdvNF: Reducing Mode Collapse in Conditional Normalising Flows using Adversarial Learning
Vikas Kanaujia, Mathias S. Scheurer, Vipul Arora
-
Red-Teaming for Generative AI: Silver Bullet or Security Theater?
Michael Feffer, Anusha Sinha, Zachary C. Lipton, Hoda Heidari
-
LESSON: Multi-Label Adversarial False Data Injection Attack for Deep Learning Locational Detection
Jiwei Tian, Chao Shen, Buhong Wang, Xiaofang Xia, Meng Zhang, Chenhao Lin, Qian Li
-
Effective Controllable Bias Mitigation for Classification and Retrieval using Gate Adapters
Shahed Masoudian, Cornelia Volaucnik, Markus Schedl
-
Weifeng Liu, Tianyi She, Jiawei Liu, Run Wang, Dongyu Yao, Ziyou Liang
-
Yongyu Wang
-
Integrating Differential Privacy and Contextual Integrity
Sebastian Benthall, Rachel Cummings
-
L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks
Ping Guo, Fei Liu, Xi Lin, Qingchuan Zhao, Qingfu Zhang
-
Wei-Yao Wang, Yu-Chieh Chang, Wen-Chih Peng
-
Multi-Trigger Backdoor Attacks: More Triggers, More Threats
Yige Li, Xingjun Ma, Jiabo He, Hanxun Huang, Yu-Gang Jiang
-
Asymptotic Behavior of Adversarial Training Estimator under $\ell_\infty$-Perturbation
Yiling Xie, Xiaoming Huo
-
Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement
Nuoyan Zhou, Dawei Zhou, Decheng Liu, Xinbo Gao, Nannan Wang
-
Conserve-Update-Revise to Cure Generalization and Robustness Trade-off in Adversarial Training
Shruthi Gowda, Bahram Zonooz, Elahe Arani
-
BackdoorBench: A Comprehensive Benchmark and Analysis of Backdoor Learning
Baoyuan Wu, Hongrui Chen, Mingda Zhang, Zihao Zhu, Shaokui Wei, Danni Yuan, Mingli Zhu, Ruotong Wang, Li Liu, Chao Shen
-
Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings
Dmytro Zakharov, Oleksandr Kuznetsov, Emanuele Frontoni
-
Eugene Frimpong, Khoa Nguyen, Mindaugas Budzys, Tanveer Khan, Antonis Michalas
-
PrivStream: An Algorithm for Streaming Differentially Private Data
Girish Kumar, Thomas Strohmer, Roman Vershynin
-
Coca: Improving and Explaining Graph Neural Network-Based Vulnerability Detection Systems
Sicong Cao, Xiaobing Sun, Xiaoxue Wu, David Lo, Lili Bo, Bin Li, Wei Liu
-
Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective
Yue Xing, Xiaofeng Lin, Qifan Song, Yi Xu, Belinda Zeng, Guang Cheng
-
MEA-Defender: A Robust Watermark against Model Extraction Attack
Peizhuo Lv, Hualong Ma, Kai Chen, Jiachen Zhou, Shengzhi Zhang, Ruigang Liang, Shenchen Zhu, Pan Li, Yingjun Zhang
-
Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation
Yifan Yang, Xiaoyu Liu, Qiao Jin, Furong Huang, Zhiyong Lu
-
Adaptive Text Watermark for Large Language Models
Yepeng Liu, Yuheng Bu
-
Sparse and Transferable Universal Singular Vectors Attack
Kseniia Kuvshinova, Olga Tsymboi, Ivan Oseledets
-
Mengyao Du, Miao Zhang, Yuwen Pu, Kai Xu, Shouling Ji, Quanjun Yin
-
Information Leakage Detection through Approximate Bayes-optimal Prediction
Pritha Gupta, Marcel Wever, Eyke Hüllermeier
-
Improving Pseudo-labelling and Enhancing Robustness for Semi-Supervised Domain Generalization
Adnan Khan, Mai A. Shaaban, Muhammad Haris Khan
-
Producing Plankton Classifiers that are Robust to Dataset Shift
Cheng Chen, Sreenath Kyathanahally, Marta Reyes, Stefanie Merkli, Ewa Merz, Emanuele Francazi, Marvin Hoege, Francesco Pomati, Marco Baity-Jesi
-
Decentralized Federated Learning: A Survey on Security and Privacy
Ehsan Hallaji, Roozbeh Razavi-Far, Mehrdad Saif, Boyu Wang, Qiang Yang
-
Boosting the Transferability of Adversarial Examples via Local Mixup and Adaptive Step Size
Junlin Liu, Xinchen Lyu
-
AdCorDA: Classifier Refinement via Adversarial Correction and Domain Adaptation
Lulan Shen, Ali Edalati, Brett Meyer, Warren Gross, James J. Clark
-
Multi-Agent Diagnostics for Robustness via Illuminated Diversity
Mikayel Samvelyan, Davide Paglieri, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel
-
Zhongjie Shi, Fanghui Liu, Yuan Cao, Johan A.K. Suykens
-
A Systematic Approach to Robustness Modelling for Deep Convolutional Neural Networks
Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth
-
Don't Push the Button! Exploring Data Leakage Risks in Machine Learning and Transfer Learning
Andrea Apicella, Francesco Isgrò, Roberto Prevete
-
LAA-Net: Localized Artifact Attention Network for High-Quality Deepfakes Detection
Dat Nguyen, Nesryne Mejri, Inder Pal Singh, Polina Kuleshova, Marcella Astrid, Anis Kacem, Enjie Ghorbel, Djamila Aouada
-
Inference Attacks Against Face Recognition Model without Classification Layers
Yuanqing Huang, Huilong Chen, Yinggui Wang, Lei Wang
-
Securing Recommender System via Cooperative Training
Qingyang Wang, Chenwang Wu, Defu Lian, Enhong Chen
-
DAFA: Distance-Aware Fair Adversarial Training
Hyungyu Lee, Saehyung Lee, Hyemi Jang, Junsung Park, Ho Bae, Sungroh Yoon
-
Fast Adversarial Training against Textual Adversarial Attacks
Yichen Yang, Xin Liu, Kun He
-
Ying Song, Balaji Palanisamy
-
ToDA: Target-oriented Diffusion Attacker against Recommendation System
Xiaohao Liu, Zhulin Tao, Ting Jiang, He Chang, Yunshan Ma, Xianglin Huang
-
Digital Divides in Scene Recognition: Uncovering Socioeconomic Biases in Deep Learning Systems
Michelle R. Greene, Mariam Josyula, Wentao Si, Jennifer A. Hart
-
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi
-
GI-PIP: Do We Require Impractical Auxiliary Dataset for Gradient Inversion Attacks?
Yu Sun, Gaojian Xiong, Xianxun Yao, Kailang Ma, Jian Cui
-
Zuojin Tang, Xiaoyu Chen, YongQiang Li, Jianyu Chen
-
Robustness to distribution shifts of compressed networks for edge devices
Lulan Shen, Ali Edalati, Brett Meyer, Warren Gross, James J. Clark
-
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
-
Text Embedding Inversion Attacks on Multilingual Language Models
Yiyi Chen, Heather Lent, Johannes Bjerva
-
A Training-Free Defense Framework for Robust Learned Image Compression
Myungseo Song, Jinyoung Choi, Bohyung Han
-
Privacy-Preserving Data Fusion for Traffic State Estimation: A Vertical Federated Learning Approach
Qiqing Wang, Kaidi Yang
-
Analyzing the Quality Attributes of AI Vision Models in Open Repositories Under Adversarial Attacks
Zerui Wang, Yan Liu
-
GRATH: Gradual Self-Truthifying for Large Language Models
Weixin Chen, Bo Li
-
Aly M. Kassem, Sherif Saad
-
Kiyoon Kim, Shreyank N Gowda, Panagiotis Eustratiadis, Antreas Antoniou, Robert B Fisher
-
TetraLoss: Improving the Robustness of Face Recognition against Morphing Attacks
Mathias Ibsen, Lázaro J. González-Soler, Christian Rathgeb, Christoph Busch
-
How Robust Are Energy-Based Models Trained With Equilibrium Propagation?
Siddharth Mansingh, Michal Kucer, Garrett Kenyon, Juston Moore, Michael Teti
-
Hangsheng Zhang, Jiqiang Liu, Jinsong Dong
-
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li
-
Ping Guo, Zhiyuan Yang, Xi Lin, Qingchuan Zhao, Qingfu Zhang
-
Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning
Adib Hasan, Ileana Rugina, Alex Wang
-
Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment
Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi
-
Differentially Private and Adversarially Robust Machine Learning: An Empirical Evaluation
Janvi Thakkar, Giulio Zizzo, Sergio Maffeis
-
Adversarially Robust Signed Graph Contrastive Learning from Balance Augmentation
Jialong Zhou, Xing Ai, Yuni Lai, Kai Zhou
-
Marsalis Gibson, David Babazadeh, Claire Tomlin, Shankar Sastry
-
A Lightweight Multi-Attack CAN Intrusion Detection System on Hybrid FPGAs
Shashwat Khandelwal, Shreejith Shanker
-
Real-Time Zero-Day Intrusion Detection System for Automotive Controller Area Network on FPGAs
Shashwat Khandelwal, Shreejith Shanker
-
The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness
Yifan Hao, Tong Zhang
-
Uncertainty-Aware Hardware Trojan Detection Using Multimodal Deep Learning
Rahul Vishwakarma, Amin Rezaei
-
Ariel Marcus
-
Tuc Nguyen, Thai Le
-
Artwork Protection Against Neural Style Transfer Using Locally Adaptive Adversarial Color Attack
Zhongliang Guo, Kaixuan Wang, Weiye Li, Yifei Qian, Ognjen Arandjelović, Lei Fang
-
Cross-Modality Perturbation Synergy Attack for Person Re-identification
Yunpeng Gong, et al.
-
Prajwal Panzade, Daniel Takabi, Zhipeng Cai
-
MITS-GAN: Safeguarding Medical Imaging from Tampering with Generative Adversarial Networks
Giovanni Pasqualino, Luca Guarnera, Alessandro Ortis, Sebastiano Battiato
-
Universally Robust Graph Neural Networks by Preserving Neighbor Similarity
Yulin Zhu, Yuni Lai, Xing Ai, Kai Zhou
-
HGAttack: Transferable Heterogeneous Graph Adversarial Attack
He Zhao, Zhiwei Zeng, Yongwei Wang, Deheng Ye, Chunyan Miao
-
Hijacking Attacks against Neural Networks by Analyzing Training Data
Yunjie Ge, Qian Wang, Huayang Huang, Qi Li, Cong Wang, Chao Shen, Lingchen Zhao, Peipei Jiang, Zheng Fang, Shenyi Zhang
-
Nicolas Garcia Trillos, Matt Jacobs, Jakwang Kim, Matthew Werenski
-
Bag of Tricks to Boost Adversarial Transferability
Zeliang Zhang, Rongyi Zhu, Wei Yao, Xiaosen Wang, Chenliang Xu
-
Towards Efficient and Certified Recovery from Poisoning Attacks in Federated Learning
Yu Jiang, Jiyuan Shen, Ziyao Liu, Chee Wei Tan, Kwok-Yan Lam
-
PPR: Enhancing Dodging Attacks while Maintaining Impersonation Attacks on Face Recognition Systems
Fengfan Zhou, Hefei Ling
-
A Generative Adversarial Attack for Multilingual Text Classifiers
Tom Roth, Inigo Jauregi Unanue, Alsharif Abuadbba, Massimo Piccardi
-
Left-right Discrepancy for Adversarial Attack on Stereo Networks
Pengfei Wang, Xiaofei Hui, Beijia Lu, Nimrod Lilith, Jun Liu, Sameer Alam
-
Junxi Chen, Junhao Dong, Xiaohua Xie
-
Crafter: Facial Feature Crafting against Inversion-based Identity Theft on Deep Models
Shiming Wang, Zhe Ji, Liyao Xiang, Hao Zhang, Xinbing Wang, Chenghu Zhou, Bo Li
-
Robustness Against Adversarial Attacks via Learning Confined Adversarial Polytopes
Shayan Mohajer Hamidi, Linfeng Ye
-
An Analytical Framework for Modeling and Synthesizing Malicious Attacks on ACC Vehicles
Shian Wang
-
Adversarial Examples are Misaligned in Diffusion Model Manifolds
Peter Lorenz, Ricard Durall, Janis Keuper
-
Yi Zeng, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia, Weiyan Shi
-
Combating Adversarial Attacks with Multi-Agent Debate
Steffi Chern, Zhen Fan, Andy Liu
-
Zhiyu Zhu, Huaming Chen, Xinyi Wang, Jiayu Zhang, Zhibo Jin, Kim-Kwang Raymond Choo
-
Universal Vulnerabilities in Large Language Models: In-context Learning Backdoor Attacks
Shuai Zhao, Meihuizi Jia, Luu Anh Tuan, Jinming Wen
-
Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation
Lijun Sheng, Jian Liang, Ran He, Zilei Wang, Tieniu Tan
-
Revisiting Adversarial Training at Scale
Zeyu Wang, Xianhang Li, Hongru Zhu, Cihang Xie
-
SoK: Facial Deepfake Detectors
Binh M. Le, Jiwon Kim, Shahroz Tariq, Kristen Moore, Alsharif Abuadbba, Simon S. Woo
-
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, Yuntao Bai, Zachary Witten, Marina Favaro, Jan Brauner, Holden Karnofsky, Paul Christiano, Samuel R. Bowman, Logan Graham, Jared Kaplan, Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, Ethan Perez
-
TrustLLM: Trustworthiness in Large Language Models
Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, Willian Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
-
Chenxi Yang, Yujia Liu, Dingquan Li, Tingting Jiang
-
Deep Anomaly Detection in Text
Andrei Manolache
-
Zilin Huang, Zihao Sheng, Chengyuan Ma, Sikai Chen
-
An Investigation of Large Language Models for Real-World Hate Speech Detection
Keyan Guo, Alexander Hu, Jaden Mu, Ziheng Shi, Ziming Zhao, Nishant Vishwamitra, Hongxin Hu
-
LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward
Nafis Tanveer Islam, Joseph Khoury, Andrew Seong, Gonzalo De La Torre Parra, Elias Bou-Harb, Peyman Najafirad
-
GLOCALFAIR: Jointly Improving Global and Local Group Fairness in Federated Learning
Syed Irfan Ali Meerza, Luyang Liu, Jiaxin Zhang, Jian Liu
-
Pengfei Ding, Yan Wang, Guanfeng Liu, Nan Wang
-
A Large-scale Empirical Study on Improving the Fairness of Deep Learning Models
Junjie Yang, Jiajun Jiang, Zeyu Sun, Junjie Chen
-
Abel Salinas, Fred Morstatter
-
Transferable Learned Image Compression-Resistant Adversarial Perturbations
Yang Sui, Zhuohang Li, Ding Ding, Xiang Pan, Xiaozhong Xu, Shan Liu, Zhenzhong Chen
-
Adaptive Boosting with Fairness-aware Reweighting Technique for Fair Classification
Xiaobin Song, Zeyuan Liu, Benben Jiang
-
Accurate and Scalable Estimation of Epistemic Uncertainty for Graph Neural Networks
Puja Trivedi, Mark Heimann, Rushil Anirudh, Danai Koutra, Jayaraman J. Thiagarajan
-
Logits Poisoning Attack in Federated Distillation
Yuhan Tang, Zhiyuan Wu, Bo Gao, Tian Wen, Yuwei Wang, Sheng Sun
-
Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception
Takami Sato, Sri Hrushikesh Varma Bhupathiraju, Michael Clifford, Takeshi Sugawara, Qi Alfred Chen, Sara Rampazzi
-
Data-Driven Subsampling in the Presence of an Adversarial Actor
Abu Shafin Mohammad Mahdee Jameel, Ahmed P. Mohamed, Jinho Yi, Aly El Gamal, Akshay Malhotra
-
ROIC-DM: Robust Text Inference and Classification via Diffusion Model
Shilong Yuan, Wei Yuan, Tieke He
-
Data-Dependent Stability Analysis of Adversarial Training
Yihan Wang, Shuang Liu, Xiao-Shan Gao
-
End-to-End Anti-Backdoor Learning on Images and Time Series
Yujing Jiang, Xingjun Ma, Sarah Monazam Erfani, Yige Li, James Bailey
-
Enhancing targeted transferability via feature space fine-tuning
Hui Zeng, Biwei Chen, Anjie Peng
-
Calibration Attack: A Framework For Adversarial Attacks Targeting Calibration
Stephen Obadinma, Xiaodan Zhu, Hongyu Guo
-
A backdoor attack against link prediction tasks with graph neural networks
Jiazhu Dai, Haoyu Sun
-
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
Renjie Pi, Tianyang Han, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang
-
Jai Prakash Veerla, Poojitha Thota, Partha Sai Guttikonda, Shirin Nilizadeh, Jacob M. Luber
-
A Random Ensemble of Encrypted models for Enhancing Robustness against Adversarial Examples
Ryota Iijima, Sayaka Shiota, Hitoshi Kiya
-
AdvSQLi: Generating Adversarial SQL Injections against Real-world WAF-as-a-service
Zhenqing Qu, Xiang Ling, Ting Wang, Xiang Chen, Shouling Ji, Chunming Wu
-
Evasive Hardware Trojan through Adversarial Power Trace
Behnam Omidi, Khaled N. Khasawneh, Ihsen Alouani
-
Object-oriented backdoor attack against image captioning
Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, Sheng Li
-
DEM: A Method for Certifying Deep Neural Network Classifier Outputs in Aerospace
Guy Katz, Natan Levy, Idan Refaeli, Raz Yerushalmi
-
H M Sabbir Ahmad, Ehsan Sabouni, Akua Dickson, Wei Xiao, Christos G. Cassandras, Wenchao Li
-
Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement
Zheng Yuan, Jie Zhang, Yude Wang, Shiguang Shan, Xilin Chen
-
Spy-Watermark: Robust Invisible Watermarking for Backdoor Attack
Ruofei Wang, Renjie Wan, Zongyu Guo, Qing Guo, Rui Huang
-
FullLoRA-AT: Efficiently Boosting the Robustness of Pretrained Vision Transformers
Zheng Yuan, Jie Zhang, Shiguang Shan
-
Integrated Cyber-Physical Resiliency for Power Grids under IoT-Enabled Dynamic Botnet Attacks
Yuhan Zhao, Juntao Chen, Quanyan Zhu
-
Enhancing Generalization of Invisible Facial Privacy Cloak via Gradient Accumulation
Xuannan Liu, Yaoyao Zhong, Weihong Deng, Hongzhi Shi, Xingchen Cui, Yunfeng Yin, Dongchao Wen
-
Jie Zhu, Leye Wang, Xiao Han, Anmin Liu, Tao Xie
-
Teach Large Language Models to Forget Privacy
Ran Yan, Yujun Li, Wenqian Li, Peihua Mai, Yan Pang, Yinchuan Li
-
PPBFL: A Privacy Protected Blockchain-based Federated Learning Model
Yang Li, Chunhe Xia, Wanshuang Lin, Tianbo Wang
-
LLbezpeky: Leveraging Large Language Models for Vulnerability Detection
Noble Saji Mathews, Yelizaveta Brus, Yousra Aafer, Mei Nagappan, Shane McIntosh
-
Noise-NeRF: Hide Information in Neural Radiance Fields using Trainable Noise
Qinglong Huang, Yong Liao, Yanbin Hao, Pengyuan Zhou
-
Imperio: Language-Guided Backdoor Attacks for Arbitrary Model Control
Ka-Ho Chow, Wenqi Wei, Lei Yu
-
Efficient Sparse Least Absolute Deviation Regression with Differential Privacy
Weidong Liu, Xiaojun Mao, Xiaofei Zhang, Xin Zhang
-
Opening A Pandora's Box: Things You Should Know in the Era of Custom GPTs
Guanhong Tao, Siyuan Cheng, Zhuo Zhang, Junmin Zhu, Guangyu Shen, Xiangyu Zhang
-
Daniel Wankit Yip, Aysan Esmradi, Chun Fai Chan
-
Detection and Defense Against Prominent Attacks on Preconditioned LLM-Integrated Virtual Assistants
Chun Fai Chan, Daniel Wankit Yip, Aysan Esmradi
-
Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks
Aleksander Buszydlik, Karol Dobiczek, Michał Teodor Okoń, Konrad Skublicki, Philip Lippmann, Jie Yang
-
A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models
Yuxuan Wan, Wenxuan Wang, Yiliu Yang, Youliang Yuan, Jen-tse Huang, Pinjia He, Wenxiang Jiao, Michael R. Lyu
-
SHARE: Single-view Human Adversarial REconstruction
Shreelekha Revankar, Shijia Liao, Yu Shen, Junbang Liang, Huaishu Peng, Ming Lin
-
SSL-OTA: Unveiling Backdoor Threats in Self-Supervised Learning for Object Detection
Qiannan Wang, Changchun Yin, Liming Fang, Lu Zhou, Zhe Liu, Run Wang, Chenhao Lin
-
Adversarially Trained Actor Critic for offline CMDPs
Honghao Wei, Xiyue Peng, Xin Liu, Arnob Ghosh
-
Is It Possible to Backdoor Face Forgery Detection with Natural Triggers
Xiaoxuan Han, Songlin Yang, Wei Wang, Ziwen He, Jing Dong
-
Sebastian-Vasile Echim, Iulian-Marius Tăiatu, Dumitru-Clementin Cercel, Florin Pop
-
CamPro: Camera-based Anti-Facial Recognition
Wenjun Zhu, Yuan Sun, Jiani Liu, Yushi Cheng, Xiaoyu Ji, Wenyuan Xu
-
TPatch: A Triggered Physical Adversarial Patch
Wenjun Zhu, Xiaoyu Ji, Yushi Cheng, Shibo Zhang, Wenyuan Xu
-
A clean-label graph backdoor attack method in node classification task
Xiaogang Xing, Ming Xu, Yujing Bai, Dongdong Yang
-
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Julien Piet, Maha Alrashed, Chawin Sitawarin, Sizhe Chen, Zeming Wei, Elizabeth Sun, Basel Alomair, David Wagner
-
Dongfang Li, Baotian Hu, Qingcai Chen, Shan He
-
Hyunjune Kim, Sangyong Lee, Simon S. Woo
-
Yuntao Shou, Tao Meng, Wei Ai, Keqin Li
-
Julien Ferry, Ulrich Aïvodji, Sébastien Gambs, Marie-José Huguet, Mohamed Siala
-
Adversarial Attacks on Image Classification Models: Analysis and Defense
Jaydip Sen, Abhiraj Sen, Ananda Chatterjee
-
BlackboxBench: A Comprehensive Benchmark of Black-box Adversarial Attacks
Meixi Zheng, Xuanchen Yan, Zihao Zhu, Hongrui Chen, Baoyuan Wu
-
Attack Tree Analysis for Adversarial Evasion Attacks
Yuki Yamaguchi, Toshiaki Aoki
-
DOEPatch: Dynamically Optimized Ensemble Model for Adversarial Patches Generation
Wenyi Tan, Yang Li, Chenxing Zhao, Zhunga Liu, Quan Pan
-
Securing NextG Systems against Poisoning Attacks on Federated Learning: A Game-Theoretic Solution
Yalin E. Sagduyu, Tugba Erpek, Yi Shi
-
Timeliness: A New Design Metric and a New Attack Surface
Priyanka Kaswan, Sennur Ulukus
-
Federico Siciliano, Luca Maiano, Lorenzo Papa, Federica Baccin, Irene Amerini, Fabrizio Silvestri
-
Adversarial Attacks on LoRa Device Identification and Rogue Signal Detection with Deep Learning
Yalin E. Sagduyu, Tugba Erpek
-
Domain Generalization with Vital Phase Augmentation
Ingyun Lee, Wooju Lee, Hyun Myung
-
Gulsum Yigit, Mehmet Fatih Amasyali
-
Natural Adversarial Patch Generation Method Based on Latent Diffusion Model
Xianyi Chen, Fazhan Liu, Dong Jiang, Kai Yan
-
Universal Pyramid Adversarial Training for Improved ViT Performance
Ping-yeh Chiang, Yipin Zhou, Omid Poursaeed, Satya Narayan Shukla, Ashish Shah, Tom Goldstein, Ser-Nam Lim
-
Robust Survival Analysis with Adversarial Regularization
Michael Potter, Stefano Maxenti, Michael Everett
-
GanFinger: GAN-Based Fingerprint Generation for Deep Neural Network Ownership Verification
Huali Ren, Anli Yan, Xiaojun Ren, Pei-Gen Ye, Chong-zhi Gao, Zhili Zhou, Jin Li
-
Adversarial Item Promotion on Visually-Aware Recommender Systems by Guided Diffusion
Lijian Chen, Wei Yuan, Tong Chen, Quoc Viet Hung Nguyen, Lizhen Cui, Hongzhi Yin
-
Punctuation Matters! Stealthy Backdoor Attack for Language Models
Xuan Sheng, Zhicheng Li, Zhaoyang Han, Xiangmao Chang, Piji Li
-
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi, Yueqi Xie, Bin Zhu, Keegan Hines, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu
-
Taha Eghtesad, Sirui Li, Yevgeniy Vorobeychik, Aron Laszka
-
AutoAugment Input Transformation for Highly Transferable Targeted Attacks
Haobo Lu, Xin Liu, Kun He
-
Can Machines Learn Robustly, Privately, and Efficiently?
Youssef Allouah, Rachid Guerraoui, John Stephan
-
Pre-trained Trojan Attacks for Visual Recognition
Aishan Liu, Xinwei Zhang, Yisong Xiao, Yuguang Zhou, Siyuan Liang, Jiakai Wang, Xianglong Liu, Xiaochun Cao, Dacheng Tao
-
MEAOD: Model Extraction Attack against Object Detectors
Zeyu Li, Chenghui Shi, Yuwen Pu, Xuhong Zhang, Yu Li, Jinbao Li, Shouling Ji
-
Asymmetric Bias in Text-to-Image Generation with Adversarial Attacks
Haz Sameen Shahgir, Xianghao Kong, Greg Ver Steeg, Yue Dong
-
Understanding the Regularity of Self-Attention with Optimal Transport
Valérie Castin, Pierre Ablin, Gabriel Peyré
-
Attacking Byzantine Robust Aggregation in High Dimensions
Sarthak Choudhary, Aashish Kolluri, Prateek Saxena
-
SODA: Protecting Proprietary Information in On-Device Machine Learning Models
Akanksha Atrey, Ritwik Sinha, Saayan Mitra, Prashant Shenoy
-
Energy-based learning algorithms for analog computing: a comparative study
Benjamin Scellier, Maxence Ernoult, Jack Kendall, Suhas Kumar
-
Adaptive Domain Inference Attack
Yuechun Gu, Keke Chen
-
Adversarial Markov Games: On Adaptive Decision-Based Attacks and Defenses
Ilias Tsingenopoulos, Vera Rimmer, Davy Preuveneers, Fabio Pierazzi, Lorenzo Cavallaro, Wouter Joosen
-
Samuel J. Aronson, Kalotina Machini, Pranav Sriraman, Jiyeon Shin, Emma R. Henricks, Charlotte Mailly, Angie J. Nottage, Michael Oates, Matthew S. Lebo
-
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich, Albert Clapés, Sergio Escalera
-
ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks
Peng Zhao, Jiehua Zhang, Bowen Peng, Longguang Wang, YingMei Wei, Yu Liu, Li Liu
-
Ruichu Cai, Yuxuan Zhu, Jie Qiao, Zefeng Liang, Furui Liu, Zhifeng Hao
-
Manipulating Trajectory Prediction with Backdoors
Kaouther Massoud, Kathrin Grosse, Mickael Chen, Matthieu Cord, Patrick Pérez, Alexandre Alahi
-
Progressive Poisoned Data Isolation for Training-time Backdoor Defense
Yiming Chen, Haiwei Wu, Jiantao Zhou
-
PGN: A perturbation generation network against deep reinforcement learning
Xiangjuan Li, Feifan Li, Yang Li, Quan Pan
-
Learning and Forgetting Unsafe Examples in Large Language Models
Jiachen Zhao, Zhun Deng, David Madras, James Zou, Mengye Ren
-
Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
Jiaming Liu, Ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang
-
Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pretraining
Bumsoo Kim, Yeonsik Jo, Jinhyung Kim, Seung Hwan Kim
-
Mutual-modality Adversarial Attack with Semantic Perturbation
Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang
-
Trust, But Verify: A Survey of Randomized Smoothing Techniques
Anupriya Kumari, Devansh Bhardwaj, Sukrit Jindal, Sarthak Gupta
-
Stability of Graph Convolutional Neural Networks through the lens of small perturbation analysis
Lucia Testa, Claudio Battiloro, Stefania Sardellitti, Sergio Barbarossa
-
Neural Stochastic Differential Equations with Change Points: A Generative Adversarial Approach
Zhongchang Sun, Yousef El-Laham, Svitlana Vyetrenko
-
SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks
Peishen Yan, Hao Wang, Tao Song, Yang Hua, Ruhui Ma, Ningxin Hu, Mohammad R. Haghighat, Haibing Guan
-
Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet
Saad Ullah, Mingji Han, Saurabh Pujar, Hammond Pearce, Ayse Coskun, Gianluca Stringhini
-
A Red Teaming Framework for Securing AI in Maritime Autonomous Systems
Mathew J. Walter, Aaron Barrett, Kimberly Tam
-
Maatphor: Automated Variant Analysis for Prompt Injection Attacks
Ahmed Salem, Andrew Paverd, Boris Köpf
-
Robust Communicative Multi-Agent Reinforcement Learning with Active Defense
Lebin Yu, Yunbo Qiu, Quanming Yao, Yuan Shen, Xudong Zhang, Jian Wang
-
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in ultra low-data regimes
Nabeel Seedat, Nicolas Huynh, Boris van Breugel, Mihaela van der Schaar
-
Bypassing the Safety Training of Open-Source LLMs with Priming Attacks
Jason Vega, Isha Chaudhary, Changming Xu, Gagandeep Singh
-
Chasing Fairness in Graphs: A GNN Architecture Perspective
Zhimeng Jiang, Xiaotian Han, Chao Fan, Zirui Liu, Na Zou, Ali Mostafavi, Xia Hu
-
Huafeng Qin, Xin Jin, Yun Jiang, Mounim A. El-Yacoubi, Xinbo Gao
-
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi, Leonid Sigal
-
A Study on Transferability of Deep Learning Models for Network Intrusion Detection
Shreya Ghosh, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal
-
Android Malware Detection with Unbiased Confidence Guarantees
Harris Papadopoulos, Nestoras Georgiou, Charalambos Eliades, Andreas Konstantinidis
-
Terrapin Attack: Breaking SSH Channel Integrity By Sequence Number Manipulation
Fabian Bäumer, Marcus Brinkmann, Jörg Schwenk
-
Gakusei Sato, Taketo Akama
-
SAME: Sample Reconstruction Against Model Extraction Attacks
Yi Xie, Jie Zhang, Shiqian Zhao, Tianwei Zhang, Xiaofeng Chen
-
Synthesizing Black-box Anti-forensics DeepFakes with High Visual Quality
Bing Fan, Shu Hu, Feng Ding
-
Unmasking Deepfake Faces from Videos Using An Explainable Cost-Sensitive Deep Learning Approach
Faysal Mahmud, Yusha Abdullah, Minhajul Islam, Tahsin Aziz
-
DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models
Jiachen Zhou, Peizhuo Lv, Yibing Lan, Guozhu Meng, Kai Chen, Hualong Ma
-
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model
Decheng Liu, Xijun Wang, Chunlei Peng, Nannan Wang, Ruiming Hu, Xinbo Gao
-
Bengali Intent Classification with Generative Adversarial BERT
Mehedi Hasan, Mohammad Jahid Ibna Basher, Md. Tanvir Rouf Shawon
-
Yu-An Liu, Ruqing Zhang, Mingkun Zhang, Wei Chen, Maarten de Rijke, Jiafeng Guo, Xueqi Cheng
-
Closing the Gap: Achieving Better Accuracy-Robustness Tradeoffs Against Query-Based Attacks
Pascal Zimmer, Sébastien Andreina, Giorgia Azzurra Marson, Ghassan Karame
-
Jaehui Hwang, Junghyuk Lee, Jong-Seok Lee
-
The Ultimate Combo: Boosting Adversarial Example Transferability by Composing Data Augmentations
Zebin Yun, Achi-Or Weingarten, Eyal Ronen, Mahmood Sharif
-
On Robustness to Missing Video for Audiovisual Speech Recognition
Oscar Chang, Otavio Braga, Hank Liao, Dmitriy Serdyuk, Olivier Siohan
-
Rethinking Robustness of Model Attributions
Sandesh Kamath, Sankalp Mittal, Amit Deshpande, Vineeth N Balasubramanian
-
The Pros and Cons of Adversarial Robustness
Yacine Izza, Joao Marques-Silva
-
PPIDSG: A Privacy-Preserving Image Distribution Sharing Scheme with GAN in Federated Learning
Yuting Ma, Yuanzhi Yao, Xiaohua Xu
-
TrojFSP: Trojan Insertion in Few-shot Prompt Tuning
Mengxin Zheng, Jiaqi Xue, Xun Chen, YanShan Wang, Qian Lou, Lei Jiang
-
TrojFair: Trojan Fairness Attacks
Mengxin Zheng, Jiaqi Xue, Yi Sheng, Lei Yang, Qian Lou, Lei Jiang
-
Adversarially Balanced Representation for Continuous Treatment Effect Estimation
Amirreza Kazemi, Martin Ester
-
Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity
Zhihao Zhu, Chenwang Wu, Rui Fan, Yi Yang, Defu Lian, Enhong Chen
-
MISA: Unveiling the Vulnerabilities in Split Federated Learning
Wei Wan, Yuxuan Ning, Shengshan Hu, Lulu Xue, Minghui Li, Leo Yu Zhang, Hai Jin
-
Uncertainty-based Fairness Measures
Selim Kuzucu, Jiaee Cheong, Hatice Gunes, Sinan Kalkan
-
Rohan Banerjee, Prishita Ray, Mark Campbell
-
Harnessing Inherent Noises for Privacy Preservation in Quantum Machine Learning
Keyi Ju, Xiaoqi Qin, Hui Zhong, Xinyue Zhang, Miao Pan, Baoling Liu
-
UltraClean: A Simple Framework to Train Robust Neural Networks against Backdoor Attacks
Bingyin Zhao, Yingjie Lao
-
A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection
Xiaoyu Zhang, Cen Zhang, Tianlin Li, Yihao Huang, Xiaojun Jia, Xiaofei Xie, Yang Liu, Chao Shen
-
Federated learning with differential privacy and an untrusted aggregator
Kunlong Liu, Trinabh Gupta
-
Aysan Esmradi, Daniel Wankit Yip, Chun Fai Chan
-
Investigating Responsible AI for Scientific Research: An Empirical Study
Muneera Bano, Didar Zowghi, Pip Shea, Georgina Ibarra
-
Qian Wang, Yaoyao Liu, Hefei Ling, Yingwei Li, Qihao Liu, Ping Li, Jiazhong Chen, Alan Yuille, Ning Yu
-
Chen Ma, Ningfei Wang, Qi Alfred Chen, Chao Shen
-
Embodied Adversarial Attack: A Dynamic Robust Physical Attack in Autonomous Driving
Yitong Sun, Yao Huang, Xingxing Wei
-
Adversarial Robustness on Image Classification with $k$-means
Rollin Omari, Junae Kim, Paul Montague
-
Fragility, Robustness and Antifragility in Deep Learning
Chandresh Pravin, Ivan Martino, Giuseppe Nicosia, Varun Ojha
-
Reliable Probabilistic Classification with Neural Networks
Harris Papadopoulos
-
A Malware Classification Survey on Adversarial Attacks and Defences
Mahesh Datta Sai Ponnuru, Likhitha Amasala, Tanu Sree Bhimavarapu, Guna Chaitanya Garikipati
-
Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models
Jiawei Zhao, Kejiang Chen, Xiaojian Yuan, Yuang Qi, Weiming Zhang, Nenghai Yu
-
Ao Liu, Wenshan Li, Tao Li, Beibei Li, Hanyuan Huang, Pan Zhou
-
PhasePerturbation: Speech Data Augmentation via Phase Perturbation for Automatic Speech Recognition
Chengxi Lei, Satwinder Singh, Feng Hou, Xiaoyun Jia, Ruili Wang
-
Yichen Wan, Youyang Qu, Wei Ni, Yong Xiang, Longxiang Gao, Ekram Hossain
-
Rongwu Xu, Brian S. Lin, Shujian Yang, Tianqi Zhang, Weiyan Shi, Tianwei Zhang, Zhixuan Fang, Wei Xu, Han Qiu
-
Privacy Constrained Fairness Estimation for Decision Trees
Florian van der Steen, Fré Vink, Heysem Kaya
-
Scalable Ensemble-based Detection Method against Adversarial Attacks for speaker verification
Haibin Wu, Heng-Cheng Kuo, Yu Tsao, Hung-yi Lee
-
AVA: Inconspicuous Attribute Variation-based Adversarial Attack bypassing DeepFake Detection
Xiangtao Meng, Li Wang, Shanqing Guo, Lei Ju, Qingchuan Zhao
-
On the Difficulty of Defending Contrastive Learning against Backdoor Attacks
Changjiang Li, Ren Pang, Bochuan Cao, Zhaohan Xi, Jinghui Chen, Shouling Ji, Ting Wang
-
Detection and Defense of Unlearnable Examples
Yifan Zhu, Lijia Yu, Xiao-Shan Gao
-
Buqing Nie, Jingtian Ji, Yangqing Fu, Yue Gao
-
DRAM-Locker: A General-Purpose DRAM Protection Mechanism against Adversarial DNN Weight Attacks
Ranyang Zhou, Sabbir Ahmed, Arman Roohi, Adnan Siraj Rakin, Shaahin Angizi
-
Forbidden Facts: An Investigation of Competing Objectives in Llama-2
Tony T. Wang, Miles Wang, Kaivu Hariharan, Nir Shavit
-
Coevolutionary Algorithm for Building Robust Decision Trees under Minimax Regret
Adam Żychowski, Andrew Perrault, Jacek Mańdziuk
-
Exploring Transferability for Randomized Smoothing
Kai Qiu, Huishuai Zhang, Zhirong Wu, Stephen Lin
-
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Shanghang Zhang, Kurt Keutzer
-
Defenses in Adversarial Machine Learning: A Survey
Baoyuan Wu, Shaokui Wei, Mingli Zhu, Meixi Zheng, Zihao Zhu, Mingda Zhang, Hongrui Chen, Danni Yuan, Li Liu, Qingshan Liu
-
Robust Few-Shot Named Entity Recognition with Boundary Discrimination and Correlation Purification
Xiaojun Xue, Chunxia Zhang, Tianxiang Xu, Zhendong Niu
-
Universal Adversarial Framework to Improve Adversarial Robustness for Diabetic Retinopathy Detection
Samrat Mukherjee, Dibyanayan Bandyopadhyay, Baban Gain, Asif Ekbal
-
Accelerating the Global Aggregation of Local Explanations
Alon Mor, Yonatan Belinkov, Benny Kimelfeld
-
Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking
Shengsheng Qian, Yifei Wang, Dizhan Xue, Shengjie Zhang, Huaiwen Zhang, Changsheng Xu
-
Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models
Jiang Zhang, Qiong Wu, Yiming Xu, Cheng Cao, Zheng Du, Konstantinos Psounis
-
Patch-MI: Enhancing Model Inversion Attacks via Patch-Based Reconstruction
Jonggyu Jang, Hyeonsu Lyu, Hyun Jong Yang
-
Radio Signal Classification by Adversarially Robust Quantum Machine Learning
Yanqiu Wu, Eromanga Adermann, Chandra Thapa, Seyit Camtepe, Hajime Suzuki, Muhammad Usman
-
SSTA: Salient Spatially Transformed Attack
Renyang Liu, Wei Zhou, Sixin Wu, Jun Zhao, Kwok-Yan Lam
-
DTA: Distribution Transform-based Attack for Query-Limited Scenario
Renyang Liu, Wei Zhou, Xin Jin, Song Gao, Yuanyu Wang, Ruxin Wang
-
May the Noise be with you: Adversarial Training without Adversarial Examples
Ayoub Arous, Andres F Lopez-Lopera, Nael Abu-Ghazaleh, Ihsen Alouani
-
Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial Training
Qian Li, Yuxiao Hu, Yinpeng Dong, Dongxiao Zhang, Yuntian Chen
-
Attacking the Loop: Adversarial Attacks on Graph-based Loop Closure Detection
Jonathan J. Y. Kim, Martin Urschler, Patricia J. Riddle, Jorg S. Wicker
-
Collapse-Oriented Adversarial Training with Triplet Decoupling for Robust Image Retrieval
Qiwei Tian, Chenhao Lin, Qian Li, Zhengyu Zhao, Chao Shen
-
ReRoGCRL: Representation-based Robustness in Goal-Conditioned Reinforcement Learning
Xiangyu Yin, Sihao Wu, Jiaxu Liu, Meng Fang, Xingyu Zhao, Xiaowei Huang, Wenjie Ruan
-
Robust MRI Reconstruction by Smoothed Unrolling
Shijun Liang, Van Hoang Minh Nguyen, Jinghan Jia, Ismail Alkhouri, Sijia Liu, Saiprasad Ravishankar
-
Cost Aware Untargeted Poisoning Attack against Graph Neural Networks
Yuwei Han, Yuni Lai, Yulin Zhu, Kai Zhou
-
EdgePruner: Poisoned Edge Pruning in Graph Contrastive Learning
Hiroya Kato, Kento Hasegawa, Seira Hidano, Kazuhide Fukushima
-
Causality Analysis for Evaluating the Security of Large Language Models
Wei Zhao, Zhe Li, Jun Sun
-
SimAC: A Simple Anti-Customization Method against Text-to-Image Synthesis of Diffusion Models
Feifei Wang, Zhentao Tan, Tianyi Wei, Yue Wu, Qidong Huang
-
Michael Lanier, Aayush Dhakal, Zhexiao Xiong, Arthur Li, Nathan Jacobs, Yevgeniy Vorobeychik
-
Bang Wu, Xingliang Yuan, Shuo Wang, Qi Li, Minhui Xue, Shirui Pan
-
Yanni Georghiades, Rajesh Mishra, Karl Kreder, Sriram Vishwanath
-
Yimo Deng, Huangxun Chen
-
Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack
Yu Fu, Yufei Li, Wen Xiao, Cong Liu, Yue Dong
-
Marwa Kechaou, Mokhtar Z. Alaya, Romain Hérault, Gilles Gasso
-
Dynamic Adversarial Attacks on Autonomous Driving Systems
Amirhosein Chahe, Chenan Wang, Abhishek Jeyapratap, Kaidi Xu, Lifeng Zhou
-
AI Control: Improving Safety Despite Intentional Subversion
Ryan Greenblatt, Buck Shlegeris, Kshitij Sachan, Fabien Roger
-
Caridad Arroyo Arevalo, Sayedeh Leila Noorbakhsh, Yun Dong, Yuan Hong, Binghui Wang
-
Towards Transferable Adversarial Attacks with Centralized Perturbation
Shangbo Wu, Yu-an Tan, Yajie Wang, Ruinan Ma, Wencong Ma, Yuanzhang Li
-
Yuyang Zhou, Guang Cheng, Zongyao Chen, Shui Yu
-
Sparse but Strong: Crafting Adversarially Robust Graph Lottery Tickets
Subhajit Dutta Chowdhury, Zhiyu Ni, Qingyuan Peng, Souvik Kundu, Pierluigi Nuzzo
-
Reward Certification for Policy Smoothed Reinforcement Learning
Ronghui Mu, Leandro Soriano Marcolino, Tianle Zhang, Yanghao Zhang, Xiaowei Huang, Wenjie Ruan
-
Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks
Danni Yuan, Shaokui Wei, Mingda Zhang, Li Liu, Baoyuan Wu
-
Sanghak Oh, Kiho Lee, Seonhye Park, Doowon Kim, Hyoungshick Kim
-
Promoting Counterfactual Robustness through Diversity
Francesco Leofante, Nico Potyka
-
Robust Graph Neural Network based on Graph Denoising
Victor M. Tenorio, Samuel Rey, Antonio G. Marques
-
Privacy Preserving Multi-Agent Reinforcement Learning in Supply Chains
Ananta Mukherjee, Peeyush Kumar, Boling Yang, Nishanth Chandran, Divya Gupta
-
Exploring the Limits of ChatGPT in Software Security Applications
Fangzhou Wu, Qingzhao Zhang, Ati Priya Bajaj, Tiffany Bao, Ning Zhang, Ruoyu "Fish" Wang, Chaowei Xiao
-
Jianwei Li, Sheng Liu, Qi Lei
-
GTA: Gated Toxicity Avoidance for LM Performance Preservation
Heegyu Kim, Hyunsouk Cho
-
Initialization Matters for Adversarial Transfer Learning
Andong Hua, Jindong Gu, Zhiyu Xue, Nicholas Carlini, Eric Wong, Yao Qin
-
Adversarial Camera Patch: An Effective and Robust Physical-World Attack on Object Detectors
Kalibinuer Tiliwalidi
-
CAD: Photorealistic 3D Generation via Adversarial Distillation
Ziyu Wan, Despoina Paschalidou, Ian Huang, Hongyu Liu, Bokui Shen, Xiaoyu Xiang, Jing Liao, Leonidas Guibas
-
Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation
Shiji Zhao, Xizhe Wang, Xingxing Wei
-
Model Extraction Attacks Revisited
Jiacheng Liang, Ren Pang, Changjiang Li, Ting Wang
-
Poisoning $\times$ Evasion: Symbiotic Adversarial Robustness for Graph Neural Networks
Ege Erdogan, Simon Geisler, Stephan Günnemann
-
Yuanda Wang, Qiben Yan, Nikolay Ivanov, Xun Chen
-
DiffAIL: Diffusion Adversarial Imitation Learning
Bingzheng Wang, Yan Zhang, Teng Pang, Guoqiang Wu, Yilong Yin
-
Security and Reliability Evaluation of Countermeasures implemented using High-Level Synthesis
Amalia Artemis Koufopoulou, Kalliopi Xevgeni, Athanasios Papadimitriou, Mihalis Psarakis, David Hely
-
Mojtaba Ahmadi, Reza Nourmohammadi
-
Towards Sample-specific Backdoor Attack with Clean Labels via Attribute Trigger
Yiming Li, Mingyan Zhu, Junfeng Guo, Tao Wei, Shu-Tao Xia, Zhan Qin
-
DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial Natural Language Instructions
Fangzhou Wu, Xiaogeng Liu, Chaowei Xiao
-
Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks
Shuli Jiang, Swanand Ravindra Kadhe, Yi Zhou, Ling Cai, Nathalie Baracaldo
-
Huming Qiu, Junjie Sun, Mi Zhang, Xudong Pan, Min Yang
-
Bangyan He, Xiaojun Jia, Siyuan Liang, Tianrui Lou, Yang Liu, Xiaochun Cao
-
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
Xiaoyun Xu, Shujian Yu, Jingzheng Wu, Stjepan Picek
-
Georgi Ganev, Emiliano De Cristofaro
-
MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model
Kaiyu Song, Hanjiang Lai
-
Annotation-Free Group Robustness via Loss-Based Resampling
Mahdi Ghaznavi, Hesam Asadollahzadeh, HamidReza Yaghoubi Araghi, Fahimeh Hosseini Noohdani, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah
-
Diffence: Fencing Membership Privacy With Diffusion Models
Yuefeng Peng, Ali Naseh, Amir Houmansadr
-
HC-Ref: Hierarchical Constrained Refinement for Robust Adversarial Training of GNNs
Xiaobing Pei, Haoran Yang, Gang Shen
-
FedBayes: A Zero-Trust Federated Learning Aggregation to Defend Against Adversarial Attacks
Marc Vucovich, Devin Quinn, Kevin Choi, Christopher Redino, Abdul Rahman, Edward Bowen
-
TrustFed: A Reliable Federated Learning Framework with Malicious-Attack Resistance
Hangn Su, Jianhong Zhou, Xianhua Niu, Gang Feng
-
Topology-Based Reconstruction Prevention for Decentralised Learning
Florine W. Dekker, Zekeriya Erkin, Mauro Conti
-
Kevin Liu, Stephen Casper, Dylan Hadfield-Menell, Jacob Andreas
-
On the Learnability of Watermarks for Language Models
Chenchen Gu, Xiang Lisa Li, Percy Liang, Tatsunori Hashimoto
-
Defense against ML-based Power Side-channel Attacks on DNN Accelerators with Adversarial Attacks
Xiaobei Yan, Chip Hong Chang, Tianwei Zhang
-
GaitGuard: Towards Private Gait in Mixed Reality
Diana Romero, Ruchi Jagdish Patel, Athina Markopoulou, Salma Elmalaki
-
An Evaluation of State-of-the-Art Large Language Models for Sarcasm Detection
Juliann Zhou
-
Negotiating with LLMs: Prompt Hacks, Skill Gaps, and Reasoning Deficits
Johannes Schneider, Steffi Haag, Leona Chandra Kruse
-
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng Li, Bo Li, Zhangyang Wang
-
Making Translators Privacy-aware on the User's Side
Ryoma Sato
-
Adversarial Denoising Diffusion Model for Unsupervised Anomaly Detection
Jongmin Yu, Hyeontaek Oh, Jinhong Yang
-
RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training
Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa
-
The Potential of Vision-Language Models for Content Moderation of Children's Videos
Syed Hammad Ahmed, Shengnan Hu, Gita Sukthankar
-
On the Impact of Multi-dimensional Local Differential Privacy on Fairness
Karima Makhlouf, Heber H. Arcolezi, Sami Zhioua, Ghassen Ben Brahim, Catuscia Palamidessi
-
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning
Hossein Fereidooni, Alessandro Pegoraro, Phillip Rieger, Alexandra Dmitrienko, Ahmad-Reza Sadeghi
-
SoK: Unintended Interactions among Machine Learning Defenses and Risks
Vasisht Duddu, Sebastian Szyller, N. Asokan
-
Exploring the Robustness of Model-Graded Evaluations and Automated Interpretability
Simon Lermen, Ondřej Kvapil
-
On The Fairness Impacts of Hardware Selection in Machine Learning
Sree Harsha Nelaturu, Nishaanth Kanna Ravichandran, Cuong Tran, Sara Hooker, Ferdinando Fioretto
-
Detecting and Restoring Non-Standard Hands in Stable Diffusion Generated Images
Yiqun Zhang, Zhenyue Qin, Yang Liu, Dylan Campbell
-
Adversarial Learning for Feature Shift Detection and Correction
Miriam Barrabes, Daniel Mas Montserrat, Margarita Geleta, Xavier Giro-i-Nieto, Alexander G. Ioannidis
-
Dongchen Han, Xiaojun Jia, Yang Bai, Jindong Gu, Yang Liu, Xiaochun Cao
-
Defense Against Adversarial Attacks using Convolutional Auto-Encoders
Shreyasi Mandal
-
Node-aware Bi-smoothing: Certified Robustness against Graph Injection Attacks
Yuni Lai, Yulin Zhu, Bailin Pan, Kai Zhou
-
Privacy-preserving quantum federated learning via gradient hiding
Changhao Li, Niraj Kumar, Zhixin Song, Shouvanik Chakrabarti, Marco Pistoia
-
Analyzing the Inherent Response Tendency of LLMs: Real-World Instructions-Driven Jailbreak
Yanrui Du, Sendong Zhao, Ming Ma, Yuhan Chen, Bing Qin
-
Detecting Voice Cloning Attacks via Timbre Watermarking
Chang Liu, Jie Zhang, Tianwei Zhang, Xi Yang, Weiming Zhang, Nenghai Yu
-
Identity-Obscured Neural Radiance Fields: Privacy-Preserving 3D Facial Reconstruction
Jiayi Kong, Baixin Xu, Xurui Song, Chen Qian, Jun Luo, Ying He
-
Tuan Hoang, Santu Rana, Sunil Gupta, Svetha Venkatesh
-
Synthesizing Physical Backdoor Datasets: An Automated Framework Leveraging Deep Generative Models
Sze Jue Yang, Chinh D. La, Quang H. Nguyen, Eugene Bagdasaryan, Kok-Seng Wong, Anh Tuan Tran, Chee Seng Chan, Khoa D. Doan
-
MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator
Xiao-Yin Liu, Xiao-Hu Zhou, Guo-Tao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou
-
Generating Visually Realistic Adversarial Patch
Xiaosen Wang, Kunyu Wang
-
ScAR: Scaling Adversarial Robustness for LiDAR Object Detection
Xiaohu Lu, Hayder Radha
-
Xinwei Yuan, Shu Han, Wei Huang, Hongliang Ye, Xianglong Kong, Fan Zhang
-
Realistic Scatterer Based Adversarial Attacks on SAR Image Classifiers
Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart, Lance Kaplan
-
Class Incremental Learning for Adversarial Robustness
Seungju Cho, Hongshin Lee, Changick Kim
-
Jan Schuchardt, Yan Scholten, Stephan Günnemann
-
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang, Ser-Nam Lim
-
Scaling Laws for Adversarial Attacks on Language Model Activations
Stanislav Fort
-
Indirect Gradient Matching for Adversarial Robust Distillation
Hongsin Lee, Seungju Cho, Changick Kim
-
Robust Backdoor Detection for Deep Learning via Topological Evolution Dynamics
Xiaoxing Mo, Yechao Zhang, Leo Yu Zhang, Wei Luo, Nan Sun, Shengshan Hu, Shang Gao, Yang Xiang
-
Prompt Optimization via Adversarial In-Context Learning
Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Qizhe Xie, Junxian He
-
Privacy-Preserving Task-Oriented Semantic Communications Against Model Inversion Attacks
Yanhu Wang, Shuaishuai Guo, Yiqin Deng, Haixia Zhang, Yuguang Fang
-
Zhuo Huang, Chang Liu, Yinpeng Dong, Hang Su, Shibao Zheng, Tongliang Liu
-
Adversarial Medical Image with Hierarchical Feature Hiding
Qingsong Yao, Zecheng He, Yuexiang Li, Yi Lin, Kai Ma, Yefeng Zheng, S. Kevin Zhou
-
InstructTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models
Xunguang Wang, Zhenlan Ji, Pingchuan Ma, Zongjie Li, Shuai Wang
-
Singular Regularization with Information Bottleneck Improves Model's Adversarial Robustness
Guanlin Li, Naishan Zheng, Man Zhou, Jie Zhang, Tianwei Zhang
-
Chengyin Hu, Weiwen Shi
-
Sai Venkatesh Chilukoti, Md Imran Hossen, Liqun Shan, Vijay Srinivas Tida, Xiali Hei
-
Rejuvenating image-GPT as Strong Visual Representation Learners
Sucheng Ren, Zeyu Wang, Hongru Zhu, Junfei Xiao, Alan Yuille, Cihang Xie
-
QuantAttack: Exploiting Dynamic Quantization to Attack Vision Transformers
Amit Baras, Alon Zolfi, Yuval Elovici, Asaf Shabtai
-
OCGEC: One-class Graph Embedding Classification for DNN Backdoor Detection
Haoyu Jiang, Haiyang Yu, Nan Li, Ping Yi
-
Evaluating the Security of Satellite Systems
Roy Peled, Eran Aizikovich, Edan Habler, Yuval Elovici, Asaf Shabtai
-
Exploring Adversarial Robustness of LiDAR-Camera Fusion Model in Autonomous Driving
Bo Yang, Xiaoyu Ji
-
TranSegPGD: Improving Transferability of Adversarial Examples on Semantic Segmentation
Xiaojun Jia, Jindong Gu, Yihao Huang, Simeng Qin, Qing Guo, Yang Liu, Xiaochun Cao
-
Rethinking PGD Attack: Is Sign Function Necessary?
Junjie Yang, Tianlong Chen, Xuxi Chen, Zhangyang Wang, Yingbin Liang
-
Yisheng Zhong, Li-Ping Wang
-
Mendata: A Framework to Purify Manipulated Training Data
Zonghao Huang, Neil Gong, Michael K. Reiter
-
Ruitong Liu, Yanbin Wang, Zhenhao Guo, Haitao Xu, Zhan Qin, Wenrui Ma, Fan Zhang
-
Survey of Security Issues in Memristor-based Machine Learning Accelerators for RF Analysis
William Lillis, Max Cohen Hoffing, Wayne Burleson
-
Deep Generative Attacks and Countermeasures for Data-Driven Offline Signature Verification
An Ngo, MinhPhuong Cao, Rajesh Kumar
-
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou, Tianyu Pang, Keqin Liu, Charles H. Martin, Michael W. Mahoney, Yaoqing Yang
-
Improving Faithfulness for Vision Transformers
Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Di Wang
-
TrustMark: Universal Watermarking for Arbitrary Resolution Images
Tu Bui, Shruti Agarwal, John Collomosse
-
Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text
Qi Cao, Takeshi Kojima, Yutaka Matsuo, Yusuke Iwasawa
-
ROBBIE: Robust Bias Evaluation of Large Generative Language Models
David Esiobu, Xiaoqing Tan, Saghar Hosseini, Megan Ung, Yuchen Zhang, Jude Fernandes, Jane Dwivedi-Yu, Eleonora Presani, Adina Williams, Eric Michael Smith
-
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
Raphael Tang, Xinyu Zhang, Jimmy Lin, Ferhan Ture
-
Improving Adversarial Transferability via Model Alignment
Avery Ma, Amir-massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu
-
Poisoning Attacks Against Contrastive Recommender Systems
Zongwei Wang, Junliang Yu, Min Gao, Hongzhi Yin, Bin Cui, Shazia Sadiq
-
Lujia Shen, Yuwen Pu, Shouling Ji, Changjiang Li, Xuhong Zhang, Chunpeng Ge, Ting Wang
-
Shpresim Sadiku, Moritz Wagner, Sebastian Pokutta
-
David Winderl, Nicola Franco, Jeanette Miriam Lorenz
-
Filippo Guerranti, Zinuo Yi, Anna Starovoit, Rafiq Kamel, Simon Geisler, Stephan Günnemann
-
Xiaoyue Mi, Fan Tang, Zonghan Yang, Danding Wang, Juan Cao, Peng Li, Yang Liu
-
Zihao Tan, Qingliang Chen, Yongjian Huang, Chen Liang
-
Xiaoyue Mi, Fan Tang, Yepeng Weng, Danding Wang, Juan Cao, Sheng Tang, Peng Li, Yang Liu
-
Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao
-
Maximilian Augustin, Yannic Neuhaus, Matthias Hein
-
Tanmay Chavan, Shantanu Patankar, Aditya Kane, Omkar Gokhale, Geetanjali Kale, Raviraj Joshi
-
Xu Liu, Shu Zhou, Yurong Song, Wenzhe Luo, Xin Zhang
-
Jiaxin Wen, Pei Ke, Hao Sun, Zhexin Zhang, Chengfei Li, Jinfeng Bai, Minlie Huang
-
Lucas Beerens, Desmond J. Higham
-
Xiaoliang Liu, Furao Shen, Feng Han, Jian Zhao, Changhai Nie
-
AprilPyone MaungMaung, Isao Echizen, Hitoshi Kiya
-
Xiaoliang Liu, Furao Shen, Jian Zhao, Changhai Nie
-
Bernd Prach, Fabio Brau, Giorgio Buttazzo, Christoph H. Lampert
-
Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee
-
Yingying Huangfu, Tian Bai
-
Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra, Svetlana Lazebnik, D. A. Forsyth, Anand Bhattad
-
Runzhi Tian, Yongyi Mao
-
Maximilian Dreyer, Reduan Achtibat, Wojciech Samek, Sebastian Lapuschkin
-
Xiaosen Wang, Zeyuan Yin
-
Microarchitectural Security of AWS Firecracker VMM for Serverless Cloud Platforms
Zane Weissman, Thore Tiemann, Thomas Eisenbarth, Berk Sunar
-
Scale-Dropout: Estimating Uncertainty in Deep Neural Networks Using Stochastic Scale
Soyed Tuhin Ahmed, Kamal Danouchi, Michael Hefenbrock, Guillaume Prenat, Lorena Anghel, Mehdi B. Tahoori
-
Instruct2Attack: Language-Guided Semantic Adversarial Attacks
Jiang Liu, Chen Wei, Yuxiang Guo, Heng Yu, Alan Yuille, Soheil Feizi, Chun Pong Lau, Rama Chellappa
-
A Survey on Vulnerability of Federated Learning: A Learning Algorithm Perspective
Xianghua Xie, Chen Hu, Hanchi Ren, Jingjing Deng
-
Distributed Attacks over Federated Reinforcement Learning-enabled Cell Sleep Control
Han Zhang, Hao Zhou, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci
-
How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
Haoqin Tu, Chenhang Cui, Zijun Wang, Yiyang Zhou, Bingchen Zhao, Junlin Han, Wangchunshu Zhou, Huaxiu Yao, Cihang Xie
-
Trainwreck: A damaging adversarial attack on image classifiers
Jan Zahálka
-
Adversarial Doodles: Interpretable and Human-drawable Attacks Provide Describable Insights
Ryoya Nara, Yusuke Matsui
-
Automated discovery of trade-off between utility, privacy and fairness in machine learning models
Bogdan Ficiu, Neil D. Lawrence, Andrei Paleyes
-
Attend Who is Weak: Enhancing Graph Condensation via Cross-Free Adversarial Training
Xinglin Li, Kun Wang, Hanhui Deng, Yuxuan Liang, Di Wu
-
Rethinking Privacy in Machine Learning Pipelines from an Information Flow Control Perspective
Lukas Wutschitz, Boris Köpf, Andrew Paverd, Saravan Rajmohan, Ahmed Salem, Shruti Tople, Santiago Zanella-Béguelin, Menglin Xia, Victor Rühle
-
Confidence Is All You Need for MI Attacks
Abhishek Sinha, Himanshi Tibrewal, Mansi Gupta, Nikhar Waghela, Shivank Garg
-
Mixing Classifiers to Alleviate the Accuracy-Robustness Trade-Off
Yatong Bai, Brendon G. Anderson, Somayeh Sojoudi
-
Adversarial Purification of Information Masking
Sitong Liu, Zhichao Lian, Shuangquan Zhang, Liang Xiao
-
Having Second Thoughts? Let's hear it
Jung H. Lee, Sujith Vijayan
-
Effective Backdoor Mitigation Depends on the Pre-training Objective
Sahil Verma, Gantavya Bhatt, Avi Schwarzschild, Soumye Singhal, Arnav Mohanty Das, Chirag Shah, John P Dickerson, Jeff Bilmes
-
Robust Graph Neural Networks via Unbiased Aggregation
Ruiqi Feng, Zhichao Hou, Tyler Derr, Xiaorui Liu
-
Universal Jailbreak Backdoors from Poisoned Human Feedback
Javier Rando, Florian Tramèr
-
How to ensure a safe control strategy? Towards a SRL for urban transit autonomous operation
Zicong Zhao
-
Potential Societal Biases of ChatGPT in Higher Education: A Scoping Review
Ming Li, Ariunaa Enkhtur, Beverley Anne Yamamoto, Fei Cheng
-
AI-based Attack Graph Generation
Sangbeom Park, Jaesung Lee, Jeongdo Yoo, Min Geun Song, Hyosun Lee, Jaewoong Choi, Chaeyeon Sagong, Huy Kang Kim
-
Segment (Almost) Nothing: Prompt-Agnostic Adversarial Attacks on Segmentation Models
Francesco Croce, Matthias Hein
-
Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification
Daryna Dementieva, Daniil Moskovskiy, David Dale, Alexander Panchenko
-
Efficient Trigger Word Insertion
Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li
-
ACT: Adversarial Consistency Models
Fei Kong, Jinhao Duan, Lichao Sun, Hao Cheng, Renjing Xu, Hengtao Shen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu
-
Robust and Interpretable COVID-19 Diagnosis on Chest X-ray Images using Adversarial Training
Karina Yang, Alexis Bennett, Dominique Duncan
-
When Side-Channel Attacks Break the Black-Box Property of Embedded Artificial Intelligence
Benoit Coqueret, Mathieu Carbone, Olivier Sentieys, Gabriel Zaid
-
Adversarial defense based on distribution transfer
Jiahao Chen, Diqun Yan, Li Dong
-
Prompt Risk Control: A Rigorous Framework for Responsible Deployment of Large Language Models
Thomas P. Zollo, Todd Morrill, Zhun Deng, Jake C. Snell, Toniann Pitassi, Richard Zemel
-
A Theoretical Insight into Attack and Defense of Gradient Leakage in Transformer
Chenyang Li, Zhao Song, Weixin Wang, Chiwun Yang
-
OASIS: Offsetting Active Reconstruction Attacks in Federated Learning
Tre' R. Jeter, Truc Nguyen, Raed Alharbi, My T. Thai
-
Fan Xing, Xiaoyi Zhou, Xuefeng Fan, Zhuo Tian, Yan Zhao
-
A Survey of Adversarial CAPTCHAs on its History, Classification and Generation
Zisheng Xu, Qiao Yan, F. Richard Yu, Victor C. M. Leung
-
Panda or not Panda? Understanding Adversarial Attacks with Interactive Visualization
Yuzhe You, Jarvis Tse, Jian Zhao
-
Security and Privacy Challenges in Deep Learning Models
Gopichandh Golla
-
A Somewhat Robust Image Watermark against Diffusion-based Editing Models
Mingtian Tan, Tianhao Wang, Somesh Jha
-
Bishal Shrestha, Griwan Khakurel, Kritika Simkhada, Badri Adhikari
-
Quazi Mishkatul Alam, Bilel Tarchoun, Ihsen Alouani, Nael Abu-Ghazaleh
-
SD-NAE: Generating Natural Adversarial Examples with Stable Diffusion
Yueqian Lin, Jingyang Zhang, Yiran Chen, Hai Li
-
Hard Label Black Box Node Injection Attack on Graph Neural Networks
Yu Zhou, Zihao Dong, Guofeng Zhang, Jingchen Tang
-
Transfer Attacks and Defenses for Large Language Models on Coding Tasks
Chi Zhang, Zifan Wang, Ravi Mangal, Matt Fredrikson, Limin Jia, Corina Pasareanu
-
Boost Adversarial Transferability by Uniform Scale and Mix Mask Method
Tao Wang, Zijian Ying, Qianmu Li, Zhichao Lian
-
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning
Siyuan Liang, Mingli Zhu, Aishan Liu, Baoyuan Wu, Xiaochun Cao, Ee-Chien Chang
-
Renu Sharma, Redwan Sony, Arun Ross
-
Darshika Jauhari, Renu Sharma, Cunjian Chen, Nelson Sepulveda, Arun Ross
-
ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches
Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, Muhammad Shafique
-
Attacking Motion Planners Using Adversarial Perception Errors
Jonathan Sadeghi, Nicholas A. Lord, John Redford, Romain Mueller
-
Masked Autoencoders Are Robust Neural Architecture Search Learners
Yiming Hu, Xiangxiang Chu, Bo Zhang
-
Adversarial Reweighting Guided by Wasserstein Distance for Bias Mitigation
Xuan Zhao, Simone Fabbrizzi, Paula Reyero Lobo, Siamak Ghodsi, Klaus Broelemann, Steffen Staab, Gjergji Kasneci
-
Attacks on fairness in Federated Learning
Joseph Rance, Filip Svoboda
-
DefensiveDR: Defending against Adversarial Patches using Dimensionality Reduction
Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, Muhammad Shafique
-
Safety-aware Causal Representation for Trustworthy Reinforcement Learning in Autonomous Driving
Haohong Lin, Wenhao Ding, Zuxin Liu, Yaru Niu, Jiacheng Zhu, Yuming Niu, Ding Zhao
-
Assessing Prompt Injection Risks in 200+ Custom GPTs
Jiahao Yu, Yuhang Wu, Dong Shu, Mingyu Jin, Xinyu Xing
-
Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems
Guangjing Wang, Ce Zhou, Yuanda Wang, Bocheng Chen, Hanqing Guo, Qiben Yan
-
Generating Valid and Natural Adversarial Examples with Large Language Models
Zimu Wang, Wei Wang, Qi Chen, Qiufeng Wang, Anh Nguyen
-
AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems
Sai Amrit Patnaik, Shivali Chansoriya, Anil K. Jain, Anoop M. Namboodiri
-
Understanding Variation in Subpopulation Susceptibility to Poisoning Attacks
Evan Rose, Fnu Suya, David Evans
-
Training robust and generalizable quantum models
Julian Berberich, Daniel Fink, Daniel Pranjić, Christian Tutschku, Christian Holm
-
BrainWash: A Poisoning Attack to Forget in Continual Learning
Ali Abbasi, Parsa Nooralinejad, Hamed Pirsiavash, Soheil Kolouri
-
Robust Network Slicing: Multi-Agent Policies, Adversarial Attacks, and Defensive Strategies
Feng Wang, M. Cenk Gursoy, Senem Velipasalar
-
Adversarial Prompt Tuning for Vision-Language Models
Jiaming Zhang, Xingjun Ma, Xin Wang, Lingyu Qiu, Jiaqi Wang, Yu-Gang Jiang, Jitao Sang
-
TextGuard: Provable Defense against Backdoor Attacks on Text Classification
Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song
-
Improving Adversarial Transferability by Stable Diffusion
Jiayang Liu, Siyu Zhu, Siyuan Liang, Jie Zhang, Han Fang, Weiming Zhang, Ee-Chien Chang
-
Attention-Based Real-Time Defenses for Physical Adversarial Attacks in Vision Applications
Giulio Rossolini, Alessandro Biondi, Giorgio Buttazzo
-
PACOL: Poisoning Attacks Against Continual Learners
Huayu Li, Gregory Ditzler
-
Robustness Enhancement in Neural Networks with Alpha-Stable Training Noise
Xueqiong Yuan, Jipeng Li, Ercan Engin Kuruoğlu
-
Hee-Seon Kim, Minji Son, Minbeom Kim, Myung-Joon Kwon, Changick Kim
-
Two-Factor Authentication Approach Based on Behavior Patterns for Defeating Puppet Attacks
Wenhao Wang, Guyue Li, Zhiming Chu, Haobo Li, Daniele Faccio
-
Breaking Boundaries: Balancing Performance and Robustness in Deep Wireless Traffic Forecasting
Romain Ilbert, Thai V. Hoang, Zonghua Zhang, Themis Palpanas
-
Hijacking Large Language Models via Adversarial In-Context Learning
Yao Qiang, Xiangyu Zhou, Dongxiao Zhu
-
Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking
Nan Xu, Fei Wang, Ben Zhou, Bang Zheng Li, Chaowei Xiao, Muhao Chen
-
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
Wenjie Mo, Jiashu Xu, Qin Liu, Jiongxiao Wang, Jun Yan, Chaowei Xiao, Muhao Chen
-
On the Exploitability of Reinforcement Learning with Human Feedback for Large Language Models
Jiongxiao Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao
-
Towards more Practical Threat Models in Artificial Intelligence Security
Kathrin Grosse, Lukas Bieringer, Tarek Richard Besold, Alexandre Alahi
-
Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts
Yuanwei Wu, Xiang Li, Yixin Liu, Pan Zhou, Lichao Sun
-
Haoran Wang, Kai Shu
-
Fast Certification of Vision-Language Models Using Incremental Randomized Smoothing
A K Nirala, A Joshi, C Hegde, S Sarkar
-
Adversarially Robust Spiking Neural Networks Through Conversion
Ozan Özdenizci, Robert Legenstein
-
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Zhexin Zhang, Junxiao Yang, Pei Ke, Minlie Huang
-
Privacy Threats in Stable Diffusion Models
Thomas Cilloni, Charles Fleming, Charles Walter
-
Lingbo Mo, Boshi Wang, Muhao Chen, Huan Sun
-
MirrorNet: A TEE-Friendly Framework for Secure On-device DNN Inference
Ziyu Liu, Yukui Luo, Shijin Duan, Tong Zhou, Xiaolin Xu
-
Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models
Yueqing Liang, Lu Cheng, Ali Payani, Kai Shu
-
JAB: Joint Adversarial Prompting and Belief Augmentation
Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Jwala Dhamala, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
-
Shashank Kotyan, Danilo Vasconcellos Vargas
-
Physical Adversarial Examples for Multi-Camera Systems
Ana Răduţoiu, Jan-Philipp Schulze, Philip Sperl, Konstantin Böttinger
-
DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Pre-trained Language Models
Yibo Wang, Xiangjue Dong, James Caverlee, Philip S. Yu
-
On The Relationship Between Universal Adversarial Attacks And Sparse Representations
Dana Weitzner, Raja Giryes
-
Peng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, Shujian Huang
-
Multi-Set Inoculation: Assessing Model Robustness Across Multiple Challenge Sets
Vatsal Gupta, Pranshu Pandya, Tushar Kataria, Vivek Gupta, Dan Roth
-
The Perception-Robustness Tradeoff in Deterministic Image Restoration
Guy Ohayon, Tomer Michaeli, Michael Elad
-
Adversarial Purification for Data-Driven Power System Event Classifiers with Diffusion Models
Yuanbin Cheng, Koji Yamashita, Jim Follum, Nanpeng Yu
-
Rui Duan, Zhe Qu, Leah Ding, Yao Liu, Zhuo Lu
-
An Extensive Study on Adversarial Attack against Pre-trained Models of Code
Xiaohu Du, Ming Wen, Zichao Wei, Shangwen Wang, Hai Jin
-
Untargeted Black-box Attacks for Social Recommendations
Wenqi Fan, Shijie Wang, Xiao-yong Wei, Xiaowei Mei, Qing Li
-
On the Robustness of Neural Collapse and the Neural Collapse of Robustness
Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe
-
Tabdoor: Backdoor Vulnerabilities in Transformer-based Neural Networks for Tabular Data
Bart Pleiter, Behrad Tajalli, Stefanos Koffas, Gorka Abad, Jing Xu, Martha Larson, Stjepan Picek
-
Learning Globally Optimized Language Structure via Adversarial Training
Xuwang Yin
-
Contractive Systems Improve Graph Neural Networks Against Adversarial Attacks
Moshe Eliasof, Davide Murari, Ferdia Sherry, Carola-Bibiane Schönlieb
-
Behrouz Azimian, Shiva Moshtagh, Anamitra Pal, Shanshan Ma
-
DialMAT: Dialogue-Enabled Transformer with Moment-Based Adversarial Training
Kanta Kaneda, Ryosuke Korekata, Yuiga Wada, Shunya Nagashima, Motonari Kambara, Yui Iioka, Haruka Matsuo, Yuto Imai, Takayuki Nishimura, Komei Sugiura
-
Ziwei Wang, Nabil Aouf, Jose Pizarro, Christophe Honvault
-
Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild
Nanna Inie, Jonathan Stray, Leon Derczynski
-
Shanghao Shi, Ning Wang, Yang Xiao, Chaoyu Zhang, Yi Shi, Y.Thomas Hou, Wenjing Lou
-
Does Differential Privacy Prevent Backdoor Attacks in Practice?
Fereshteh Razmi, Jian Lou, Li Xiong
-
Flatness-aware Adversarial Attack
Mingyuan Fan, Xiaodan Li, Cen Chen, Yinggui Wang
-
Fight Fire with Fire: Combating Adversarial Patch Attacks using Pattern-randomized Defensive Patches
Jianan Feng, Jiachun Li, Changqing Miao, Jianjun Huang, Wei You, Wenchang Shi, Bin Liang
-
Resilient and constrained consensus against adversarial attacks: A distributed MPC framework
Henglai Wei, Kunwu Zhang, Hui Zhang, Yang Shi
-
CALLOC: Curriculum Adversarial Learning for Secure and Robust Indoor Localization
Danish Gufran, Sudeep Pasricha
-
Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
-
Xiangguo Sun, Hong Cheng, Hang Dong, Bo Qiao, Si Qin, Qingwei Lin
-
Meiling Fang, Marco Huber, Julian Fierrez, Raghavendra Ramachandra, Naser Damer, Alhasan Alkhaddour, Maksim Kasantcev, Vasiliy Pryadchenko, Ziyuan Yang, Huijie Huangfu, Yingyu Chen, Yi Zhang, Yuchen Pan, Junjun Jiang, Xianming Liu, Xianyun Sun, Caiyong Wang, Xingyu Liu, Zhaohua Chang, Guangzhe Zhao, Juan Tapia, Lazaro Gonzalez-Soler, Carlos Aravena, Daniel Schulz
-
Rui Xu, Wenkang Qin, Peixiang Huang, Hao Wang, Lin Luo
-
Training Robust Deep Physiological Measurement Models with Synthetic Video-based Data
Yuxuan Ou, Yuzhe Zhang, Yuntang Wang, Shwetak Patel, Daniel McDuff, Xin Liu
-
FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts
Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang, Sisi Duan, Xiaoyun Wang
-
Familiarity-Based Open-Set Recognition Under Adversarial Attacks
Philip Enevoldsen, Christian Gundersen, Nico Lang, Serge Belongie, Christian Igel
-
Edge-assisted U-Shaped Split Federated Learning with Privacy-preserving for Internet of Things
Hengliang Tang, Zihang Zhao, Detian Liu, Yang Cao, Shiqiang Zhang, Siqing You
-
DEMASQ: Unmasking the ChatGPT Wordsmith
Kavita Kumari, Alessandro Pegoraro, Hossein Fereidooni, Ahmad-Reza Sadeghi
-
On the steerability of large language models toward data-driven personas
Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
-
FD-MIA: Efficient Attacks on Fairness-enhanced Models
Huan Tian, Guangsheng Zhang, Bo Liu, Tianqing Zhu, Ming Ding, Wanlei Zhou
-
Unveiling Safety Vulnerabilities of Large Language Models
George Kour, Marcel Zalmanovici, Naama Zwerdling, Esther Goldbraich, Ora Nova Fandina, Ateret Anaby-Tavor, Orna Raz, Eitan Farchi
-
A Preference Learning Approach to Develop Safe and Personalizable Autonomous Vehicles
Ruya Karagulle, Nikos Arechiga, Andrew Best, Jonathan DeCastro, Necmiye Ozay
-
Making Harmful Behaviors Unlearnable for Large Language Models
Xin Zhou, Yi Lu, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang
-
Uncertainty Quantification of Deep Learning for Spatiotemporal Data: Challenges and Opportunities
Wenchong He, Zhe Jiang
-
On the Intersection of Self-Correction and Trust in Language Models
Satyapriya Krishna
-
Preserving Privacy in GANs Against Membership Inference Attack
Mohammadhadi Shateri, Francisco Messina, Fabrice Labeau, Pablo Piantanida
-
DeepInception: Hypnotize Large Language Model to Be Jailbreaker
Xuan Li, Zhanke Zhou, Jianing Zhu, Jiangchao Yao, Tongliang Liu, Bo Han
-
Pilot-Based Key Distribution and Encryption for Secure Coherent Passive Optical Networks
Haide Wang, Ji Zhou, Qingxin Lu, Jianrui Zeng, Yongqing Liao, Weiping Liu, Changyuan Yu, Zhaohui Li
-
ELEGANT: Certified Defense on the Fairness of Graph Neural Networks
Yushun Dong, Binchi Zhang, Hanghang Tong, Jundong Li
-
From Trojan Horses to Castle Walls: Unveiling Bilateral Backdoor Effects in Diffusion Models
Zhuoshi Pan, Yuguang Yao, Gaowen Liu, Bingquan Shen, H. Vicky Zhao, Ramana Rao Kompella, Sijia Liu
-
Can AI Mitigate Human Perceptual Biases? A Pilot Study
Ross Geuy, Nate Rising, Tiancheng Shi, Meng Ling, Jian Chen
-
SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization
Jaafar Mhamed, Shangding Gu
-
Robust Identity Perceptual Watermark Against Deepfake Face Swapping
Tianyi Wang, Mengxiao Huang, Harry Cheng, Bin Ma, Yinglong Wang
-
A Call to Arms: AI Should be Critical for Social Media Analysis of Conflict Zones
Afia Abedin, Abdul Bais, Cody Buntain, Laura Courchesne, Brian McQuinn, Matthew E. Taylor, Muhib Ullah
-
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell
-
MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training
Jiacheng Li, Ninghui Li, Bruno Ribeiro
-
Reputation Systems for Supply Chains: The Challenge of Achieving Privacy Preservation
Lennart Bader, Jan Pennekamp, Emildeon Thevaraj, Maria Spiß, Salil S. Kanhere, Klaus Wehrle
-
Optimal Cost Constrained Adversarial Attacks For Multiple Agent Systems
Ziqing Lu, Guanlin Liu, Lifeng Lai, Weiyu Xu
-
Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code
Mohammed Latif Siddiq, Joanna C. S. Santos
-
Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Lehman Go Indifferent
Lorenz Kummer, Samir Moustafa, Nils N. Kriege, Wilfried N. Gansterer
-
Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly
Qizhang Li, Yiwen Guo, Wangmeng Zuo, Hao Chen
-
In Defense of Softmax Parametrization for Calibrated and Consistent Learning to Defer
Yuzhou Cao, Hussein Mozannar, Lei Feng, Hongxin Wei, Bo An
-
FAIRLABEL: Correcting Bias in Labels
Srinivasan H Sengamedu, Hien Pham
-
Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation
Xiangjue Dong, Yibo Wang, Philip S. Yu, James Caverlee
-
Robustness Tests for Automatic Machine Translation Metrics with Adversarial Attacks
Yichen Huang, Timothy Baldwin
-
Medi-CAT: Contrastive Adversarial Training for Medical Image Classification
Pervaiz Iqbal Khan, Andreas Dengel, Sheraz Ahmed
-
Uncertainty quantification and out-of-distribution detection using surjective normalizing flows
Simon Dirmeier, Ye Hong, Yanan Xin, Fernando Perez-Cruz
-
NEO-KD: Knowledge-Distillation-Based Adversarial Training for Robust Multi-Exit Neural Networks
Seokil Ham, Jungwuk Park, Dong-Jun Han, Jaekyun Moon
-
Feng Chen, Liqin Wang, Julie Hong, Jiaqi Jiang, Li Zhou
-
Is Robustness Transferable across Languages in Multilingual Neural Machine Translation?
Leiyu Pan, Supryadi, Deyi Xiong
-
LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B
Simon Lermen, Charlie Rogers-Smith, Jeffrey Ladish
-
DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models
Xinwei Wu, Junzhuo Li, Minghui Xu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong
-
Verification of Neural Networks Local Differential Classification Privacy
Roie Reshef, Anan Kabaha, Olga Seleznova, Dana Drachsler-Cohen
-
Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks
Jiayuan Ye, Zhenyu Zhu, Fanghui Liu, Reza Shokri, Volkan Cevher
-
PriPrune: Quantifying and Preserving Privacy in Pruned Federated Learning
Tianyue Chu, Mengwei Yang, Nikolaos Laoutaris, Athina Markopoulou
-
LipSim: A Provably Robust Perceptual Similarity Metric
Sara Ghazanfari, Alexandre Araujo, Prashanth Krishnamurthy, Farshad Khorrami, Siddharth Garg
-
Counterfactual Fairness for Predictions using Generative Adversarial Networks
Yuchen Ma, Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel
-
Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method
Yukun Zhao, Lingyong Yan, Weiwei Sun, Guoliang Xing, Chong Meng, Shuaiqiang Wang, Zhicong Cheng, Zhaochun Ren, Dawei Yin
-
Fine-tuning Pre-trained Models for Robustness Under Noisy Labels
Sumyeong Ahn, Sihyeon Kim, Jongwoo Ko, Se-Young Yun
-
Adversarial Anomaly Detection using Gaussian Priors and Nonlinear Anomaly Scores
Fiete Lüer, Tobias Weber, Maxim Dolgich, Christian Böhm
-
$\alpha$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing
MirHamed Jafarzadeh Asl, Mohammadhadi Shateri, Fabrice Labeau
-
BlackJack: Secure machine learning on IoT devices through hardware-based shuffling
Karthik Ganesan, Michal Fishkin, Ourong Lin, Natalie Enright Jerger
-
ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation
Zi Lin, Zihan Wang, Yongqi Tong, Yangkun Wang, Yuxin Guo, Yujia Wang, Jingbo Shang
-
Where you go is who you are -- A study on machine learning based semantic privacy attacks
Nina Wiedemann, Ourania Kounadi, Martin Raubal, Krzysztof Janowicz
-
Lynda Boukela, Gongxuan Zhang, Meziane Yacoub, Samia Bouzefrane
-
CBD: A Certified Backdoor Detector Based on Local Dominant Probability
Zhen Xiang, Zidi Xiong, Bo Li
-
A Survey on Transferability of Adversarial Examples across Deep Neural Networks
Jindong Gu, Xiaojun Jia, Pau de Jorge, Wenqain Yu, Xinwei Liu, Avery Ma, Yuan Xun, Anjun Hu, Ashkan Khakzar, Zhijiang Li, Xiaochun Cao, Philip Torr
-
Uncertainty-weighted Loss Functions for Improved Adversarial Attacks on Semantic Segmentation
Kira Maag, Asja Fischer
-
Detecting stealthy cyberattacks on adaptive cruise control vehicles: A machine learning approach
Tianyi Li, Mingfeng Shang, Shian Wang, Raphael Stern
-
SoK: Pitfalls in Evaluating Black-Box Attacks
Fnu Suya, Anshuman Suri, Tingwei Zhang, Jingtao Hong, Yuan Tian, David Evans
-
Defending Against Transfer Attacks From Public Models
Chawin Sitawarin, Jaewon Chang, David Huang, Wesson Altoyan, David Wagner
-
Proving Test Set Contamination in Black Box Language Models
Yonatan Oren, Nicole Meister, Niladri Chatterji, Faisal Ladhak, Tatsunori B. Hashimoto
-
Detection Defenses: An Empty Promise against Adversarial Patch Attacks on Optical Flow
Erik Scheurer, Jenny Schmalfuss, Alexander Lis, Andrés Bruhn
-
Trust, but Verify: Robust Image Segmentation using Deep Learning
Fahim Ahmed Zaman, Xiaodong Wu, Weiyu Xu, Milan Sonka, Raghuraman Mudumbai
-
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks
Aradhana Sinha, Ananth Balashankar, Ahmad Beirami, Thi Avrahami, Jilin Chen, Alex Beutel
-
Ananth Balashankar, Xiao Ma, Aradhana Sinha, Ahmad Beirami, Yao Qin, Jilin Chen, Alex Beutel
-
Confounder Balancing in Adversarial Domain Adaptation for Pre-Trained Large Models Fine-Tuning
Shuoran Jiang, Qingcai Chen, Yang Xiang, Youcheng Pan, Xiangping Wu
-
Jiexin Wang, Liuwen Cao, Xitong Luo, Zhiping Zhou, Jiayuan Xie, Adam Jatowt, Yi Cai
-
Zhiling Zhang, Jie Zhang, Kui Zhang, Wenbo Zhou, Weiming Zhang, Nenghai Yu
-
Defense Against Model Extraction Attacks on Recommender Systems
Sixiao Zhang, Hongzhi Yin, Hongxu Chen, Cheng Long
-
Robust and Actively Secure Serverless Collaborative Learning
Olive Franzese, Adam Dziedzic, Christopher A. Choquette-Choo, Mark R. Thomas, Muhammad Ahmad Kaleem, Stephan Rabanser, Congyu Fang, Somesh Jha, Nicolas Papernot, Xiao Wang
-
AI Hazard Management: A framework for the systematic management of root causes for AI risks
Ronald Schnitzer, Andreas Hapfelmeier, Sven Gaube, Sonja Zillner
-
Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Kang Gu, Najrin Sultana, Shagufta Mehnaz
-
Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks
Xinglong Chang, Katharina Dost, Gillian Dobbie, Jörg Wicker
-
Arun Kumar Silivery, Ram Mohan Rao Kovvur
-
3D Masked Autoencoders for Enhanced Privacy in MRI Scans
Lennart Alexander Van der Goten, Kevin Smith
-
Self-Guard: Empower the LLM to Safeguard Itself
Zezhong Wang, Fangkai Yang, Lu Wang, Pu Zhao, Hongru Wang, Liang Chen, Qingwei Lin, Kam-Fai Wong
-
The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks
Xiaoyi Chen, Siyuan Tang, Rui Zhu, Shijun Yan, Lei Jin, Zihao Wang, Liya Su, XiaoFeng Wang, Haixu Tang
-
Deceptive Fairness Attacks on Graphs via Meta Learning
Jian Kang, Yinglong Xia, Ross Maciejewski, Jiebo Luo, Hanghang Tong
-
Momentum Gradient-based Untargeted Attack on Hypergraph Neural Networks
Yang Chen, Stjepan Picek, Zhonglin Ye, Zhaoyang Wang, Haixing Zhao
-
Fundamental Limits of Membership Inference Attacks on Machine Learning Models
Eric Aubinais, Elisabeth Gassiat, Pablo Piantanida
-
Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models
Shawn Shan, Wenxin Ding, Josephine Passananti, Haitao Zheng, Ben Y. Zhao
-
MoPe: Model Perturbation-based Privacy Attacks on Language Models
Marvin Li, Jason Wang, Jeffrey Wang, Seth Neel
-
On existence, uniqueness and scalability of adversarial robustness measures for AI classifiers
Illia Horenko
-
AutoDAN: Automatic and Interpretable Adversarial Attacks on Large Language Models
Sicheng Zhu, Ruiyi Zhang, Bang An, Gang Wu, Joe Barrow, Zichao Wang, Furong Huang, Ani Nenkova, Tong Sun
-
Toward Stronger Textual Attack Detectors
Pierre Colombo, Marine Picot, Nathan Noiry, Guillaume Staerman, Pablo Piantanida
-
CT-GAT: Cross-Task Generative Adversarial Attack based on Transferability
Minxuan Lv, Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu
-
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images
Logan Frank, Jim Davis
-
Bi-discriminator Domain Adversarial Neural Networks with Class-Level Gradient Alignment
Chuang Zhao, Hongke Zhao, Hengshu Zhu, Zhenya Huang, Nan Feng, Enhong Chen, Hui Xiong
-
ADoPT: LiDAR Spoofing Attack Detection Based on Point-Level Temporal Consistency
Minkyoung Cho, Yulong Cao, Zixiang Zhou, Z. Morley Mao
-
F$^2$AT: Feature-Focusing Adversarial Training via Disentanglement of Natural and Perturbed Patterns
Yaguan Qian, Chenyu Zhao, Zhaoquan Gu, Bin Wang, Shouling Ji, Wei Wang, Boyang Zhou, Pan Zhou
-
Semantic-Aware Adversarial Training for Reliable Deep Hashing Retrieval
Xu Yuan, Zheng Zhang, Xunguang Wang, Lin Wu
-
On the Detection of Image-Scaling Attacks in Machine Learning
Erwin Quiring, Andreas Müller, Konrad Rieck
-
Adversarial Attacks on Fairness of Graph Neural Networks
Binchi Zhang, Yushun Dong, Chen Chen, Yada Zhu, Minnan Luo, Jundong Li
-
Competitive Advantage Attacks to Decentralized Federated Learning
Yuqi Jia, Minghong Fang, Neil Zhenqiang Gong
-
Enhancing Accuracy-Privacy Trade-off in Differentially Private Split Learning
Ngoc Duy Pham, Khoa Tran Phan, Naveen Chilamkurti
-
GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems
Kaya Stechly, Matthew Marquez, Subbarao Kambhampati
-
Prompt Injection Attacks and Defenses in LLM-Integrated Applications
Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, Neil Zhenqiang Gong
-
Probing LLMs for hate speech detection: strengths and vulnerabilities
Sarthak Roy, Ashish Harshavardhan, Animesh Mukherjee, Punyajoy Saha
-
PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models
Hongwei Yao, Jian Lou, Zhan Qin
-
Segment Anything Meets Universal Adversarial Perturbation
Dongshen Han, Sheng Zheng, Chaoning Zhang
-
Fast Model Debias with Machine Unlearning
Ruizhe Chen, Jianfei Yang, Huimin Xiong, Jianhong Bai, Tianxiang Hu, Jin Hao, Yang Feng, Joey Tianyi Zhou, Jian Wu, Zuozhu Liu
-
Xiaodong Yu, Hao Cheng, Xiaodong Liu, Dan Roth, Jianfeng Gao
-
Attack Prompt Generation for Red Teaming and Defending Large Language Models
Boyi Deng, Wenjie Wang, Fuli Feng, Yang Deng, Qifan Wang, Xiangnan He
-
Recoverable Privacy-Preserving Image Classification through Noise-like Adversarial Examples
Jun Liu, Jiantao Zhou, Jinyu Tian, Weiwei Sun
-
REVAMP: Automated Simulations of Adversarial Attacks on Arbitrary Objects in Realistic Scenes
Matthew Hull, Zijie J. Wang, Duen Horng Chau
-
Generating Robust Adversarial Examples against Online Social Networks (OSNs)
Jun Liu, Jiantao Zhou, Haiwei Wu, Weiwei Sun, Jinyu Tian
-
OODRobustBench: benchmarking and analyzing adversarial robustness under distribution shift
Lin Li, Yifei Wang, Chawin Sitawarin, Michael Spratling
-
CAT: Closed-loop Adversarial Training for Safe End-to-End Driving
Linrui Zhang, Zhenghao Peng, Quanyi Li, Bolei Zhou
-
Knowledge from Uncertainty in Evidential Deep Learning
Cai Davies, Marc Roig Vilamala, Alun D. Preece, Federico Cerutti, Lance M. Kaplan, Supriyo Chakraborty
-
Learn from the Past: A Proxy based Adversarial Defense Framework to Boost Robustness
Yaohua Liu, Jiaxin Gao, Zhu Liu, Xianghao Jiao, Xin Fan, Risheng Liu
-
PrivInfer: Privacy-Preserving Inference for Black-box Large Language Model
Meng Tong, Kejiang Chen, Yuang Qi, Jie Zhang, Weiming Zhang, Nenghai Yu
-
Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework
Imdad Ullah, Najm Hassan, Sukhpal Singh Gill, Basem Suleiman, Tariq Ahamed Ahanger, Zawar Shah, Junaid Qadir, Salil S. Kanhere
-
SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models
Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang
-
Adversarial Robustness Unhardening via Backdoor Attacks in Federated Learning
Taejin Kim, Jiarui Li, Shubhranshu Singh, Nikhil Madaan, Carlee Joe-Wong
-
WaveAttack: Asymmetric Frequency Obfuscation-based Backdoor Attacks Against Deep Neural Networks
Jun Xia, Zhihao Yue, Yingbo Zhou, Zhiwei Ling, Xian Wei, Mingsong Chen
-
The Efficacy of Transformer-based Adversarial Attacks in Security Domains
Kunyang Li, Kyle Domico, Jean-Charles Noirot Ferrand, Patrick McDaniel
-
Black-Box Training Data Identification in GANs via Detector Networks
Lukman Olagoke, Salil Vadhan, Seth Neel
-
A Cautionary Tale: On the Role of Reference Data in Empirical Privacy Defenses
Caelin G. Kaplan, Chuan Xu, Othmane Marfoq, Giovanni Neglia, Anderson Santana de Oliveira
-
Domain-Generalized Face Anti-Spoofing with Unknown Attacks
Zong-Wei Hong, Yu-Chen Lin, Hsuan-Tung Liu, Yi-Ren Yeh, Chu-Song Chen
-
Yimeng Zhang, Jinghan Jia, Xin Chen, Aochuan Chen, Yihua Zhang, Jiancheng Liu, Ke Ding, Sijia Liu
-
Exploring Decision-based Black-box Attacks on Face Forgery Detection
Zhaoyu Chen, Bo Li, Kaixun Jiang, Shuang Wu, Shouhong Ding, Wenqiang Zhang
-
Zhengyu Zhao, Hanwei Zhang, Renjue Li, Ronan Sicre, Laurent Amsaleg, Michael Backes, Qi Li, Chao Shen
-
In defense of parameter sharing for model-compression
Aditya Desai, Anshumali Shrivastava
-
Adversarial Training for Physics-Informed Neural Networks
Yao Li, Shengzhu Shi, Zhichang Guo, Boying Wu
-
Quantifying Privacy Risks of Prompts in Visual Prompt Learning
Yixin Wu, Rui Wen, Michael Backes, Pascal Berrang, Mathias Humbert, Yun Shen, Yang Zhang
-
Demystifying Poisoning Backdoor Attacks from a Statistical Perspective
Xun Xian, Ganghua Wang, Jayanth Srinivasa, Ashish Kundu, Xuan Bi, Mingyi Hong, Jie Ding
-
Learning from Red Teaming: Gender Bias Provocation and Mitigation in Large Language Models
Hsuan Su, Cheng-Chu Cheng, Hua Farn, Shachi H Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee
-
Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
-
Functional Invariants to Watermark Large Transformers
Fernandez Pierre, Couairon Guillaume, Furon Teddy, Douze Matthijs
-
Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks
Jiaying Wu, Bryan Hooi
-
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani, Md Abdullah Al Mamun, Yu Fu, Pedram Zaree, Yue Dong, Nael Abu-Ghazaleh
-
Christina Chance, Da Yin, Dakuo Wang, Kai-Wei Chang
-
Backdoor Attack through Machine Unlearning
Peixin Zhang, Jun Sun, Mingtian Tan, Xinyu Wang
-
Regularization properties of adversarially-trained linear regression
Antônio H. Ribeiro, Dave Zachariah, Francis Bach, Thomas B. Schön
-
Locally Differentially Private Graph Embedding
Zening Li, Rong-Hua Li, Meihao Liao, Fusheng Jin, Guoren Wang
-
Rui Wen, Tianhao Wang, Michael Backes, Yang Zhang, Ahmed Salem
-
Unbiased Watermark for Large Language Models
Zhengmian Hu, Lichang Chen, Xidong Wu, Yihan Wu, Hongyang Zhang, Heng Huang
-
DANAA: Towards transferable attacks with double adversarial neuron attribution
Zhibo Jin, Zhiyu Zhu, Xinyi Wang, Jiayu Zhang, Jun Shen, Huaming Chen
-
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li, Yulin Chen, Jinglong Luo, Yan Kang, Xiaojin Zhang, Qi Hu, Chunkit Chan, Yangqiu Song
-
Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks
Shuyu Jiang, Xingshu Chen, Rui Tang
-
ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models
Alex Mei, Sharon Levy, William Yang Wang
-
Orthogonal Uncertainty Representation of Data Manifold for Robust Long-Tailed Learning
Yanbiao Ma, Licheng Jiao, Fang Liu, Shuyuan Yang, Xu Liu, Lingling Li
-
Quantifying Assistive Robustness Via the Natural-Adversarial Frontier
Jerry Zhi-Yang He, Zackory Erickson, Daniel S. Brown, Anca D. Dragan
-
SCME: A Self-Contrastive Method for Data-free and Query-Limited Model Extraction Attack
Renyang Liu, Jinhong Zhang, Kwok-Yan Lam, Jun Zhao, Wei Zhou
-
AFLOW: Developing Adversarial Examples under Extremely Noise-limited Settings
Renyang Liu, Jinhong Zhang, Haoran Li, Jin Zhang, Yuanyu Wang, Wei Zhou
-
Black-box Targeted Adversarial Attack on Segment Anything (SAM)
Sheng Zheng, Chaoning Zhang
-
Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks
Ziqiang Li, Pengfei Xia, Hong Sun, Yueqi Zeng, Wei Zhang, Bin Li
-
Model Inversion Attacks on Homogeneous and Heterogeneous Graph Neural Networks
Renyang Liu, Wei Zhou, Jinhong Zhang, Xiaoyuan Liu, Peiyuan Si, Haoran Li
-
Chahyon Ku, Carl Winge, Ryan Diaz, Wentao Yuan, Karthik Desingh
-
Is Certifying $\ell_p$ Robustness Still Worthwhile?
Ravi Mangal, Klas Leino, Zifan Wang, Kai Hu, Weicheng Yu, Corina Pasareanu, Anupam Datta, Matt Fredrikson
-
MAGIC: Detecting Advanced Persistent Threats via Masked Graph Representation Learning
Zian Jia, Yun Xiong, Yuhong Nan, Yao Zhang, Jinjing Zhao, Mi Wen
-
A Comprehensive Study of Privacy Risks in Curriculum Learning
Joann Qiongna Chen, Xinlei He, Zheng Li, Yang Zhang, Zhou Li
-
Prime Match: A Privacy-Preserving Inventory Matching System
Antigoni Polychroniadou, Gilad Asharov, Benjamin Diamond, Tucker Balch, Hans Buehler, Richard Hua, Suwen Gu, Greg Gimler, Manuela Veloso
-
BufferSearch: Generating Black-Box Adversarial Texts With Lower Queries
Wenjie Lv, Zhen Wang, Yitao Zheng, Zhehua Zhong, Qi Xuan, Tianyi Chen
-
Defending Our Privacy With Backdoors
Dominik Hintersdorf, Lukas Struppek, Daniel Neider, Kristian Kersting
-
Samples on Thin Ice: Re-Evaluating Adversarial Pruning of Neural Networks
Giorgio Piras, Maura Pintor, Ambra Demontis, Battista Biggio
-
Towards the Vulnerability of Watermarking Artificial Intelligence Generated Content
Guanlin Li, Yifei Chen, Jie Zhang, Jiwei Li, Shangwei Guo, Tianwei Zhang
-
Sentinel: An Aggregation Function to Secure Decentralized Federated Learning
Chao Feng, Alberto Huertas Celdran, Janosch Baltensperger, Enrique Tomás Martínez Beltrán, Gerome Bovet, Burkhard Stiller
-
Bálint Mucsányi, Michael Kirchhof, Elisa Nguyen, Alexander Rubinstein, Seong Joon Oh
-
Bucks for Buckets (B4B): Active Defenses Against Stealing Encoders
Jan Dubiński, Stanisław Pawlak, Franziska Boenisch, Tomasz Trzciński, Adam Dziedzic
-
Effects of Human Adversarial and Affable Samples on BERT Generalizability
Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
-
Concealed Electronic Countermeasures of Radar Signal with Adversarial Examples
Ruinan Ma, Canjie Zhu, Mingfeng Lu, Yunjie Li, Yu-an Tan, Ruibin Zhang, Ran Tao
-
Jailbreaking Black Box Large Language Models in Twenty Queries
Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong
-
Towards Robust Multi-Modal Reasoning via Model Selection
Xiangyan Liu, Rongxue Li, Wei Ji, Tao Lin
-
Improving Fast Minimum-Norm Attacks with Hyperparameter Optimization
Giuseppe Floris, Raffaele Mura, Luca Scionis, Giorgio Piras, Maura Pintor, Ambra Demontis, Battista Biggio
-
Invisible Threats: Backdoor Attack in OCR Systems
Mauro Conti, Nicola Farronato, Stefanos Koffas, Luca Pajola, Stjepan Picek
-
Promoting Robustness of Randomized Smoothing: Two Cost-Effective Approaches
Linbo Liu, Trong Nghia Hoang, Lam M. Nguyen, Tsui-Wei Weng
-
Towards Causal Deep Learning for Vulnerability Detection
Md Mahbubur Rahman, Ira Ceka, Chengzhi Mao, Saikat Chakraborty, Baishakhi Ray, Wei Le
-
RobustEdge: Low Power Adversarial Detection for Cloud-Edge Systems
Abhishek Moitra, Abhiroop Bhattacharjee, Youngeun Kim, Priyadarshini Panda
-
Mahmoud Nazzal, Nura Aljaafari, Ahmed Sawalmeh, Abdallah Khreishah, Muhammad Anan, Abdulelah Algosaibi, Mohammed Alnaeem, Adel Aldalbahi, Abdulaziz Alhumam, Conrado P. Vizcarra, Shadan Alhamed
-
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen
-
No Privacy Left Outside: On the (In-)Security of TEE-Shielded DNN Partition for On-Device ML
Ziqi Zhang, Chen Gong, Yifeng Cai, Yuanyuan Yuan, Bingyan Liu, Ding Li, Yao Guo, Xiangqun Chen
-
Boosting Black-box Attack to Deep Neural Networks with Conditional Diffusion Models
Renyang Liu, Wei Zhou, Tianwei Zhang, Kangjie Chen, Jun Zhao, Kwok-Yan Lam
-
Composite Backdoor Attacks Against Large Language Models
Hai Huang, Zhengyu Zhao, Michael Backes, Yun Shen, Yang Zhang
-
Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin, Sergey Lavrushkin, Ekaterina Shumitskaya, Maksim Velikanov, Dmitriy Vatolin
-
Robust Safe Reinforcement Learning under Adversarial Disturbances
Zeyang Li, Chuxiong Hu, Shengbo Eben Li, Jia Cheng, Yunan Wang
-
Fingerprint Attack: Client De-Anonymization in Federated Learning
Qiongkai Xu, Trevor Cohn, Olga Ohrimenko
-
Suppressing Overestimation in Q-Learning through Adversarial Behaviors
HyeAnn Lee, Donghwan Lee
-
Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
Zeming Wei, Yifei Wang, Yisen Wang
-
Multilingual Jailbreak Challenges in Large Language Models
Yue Deng, Wenxuan Zhang, Sinno Jialin Pan, Lidong Bing
-
A Semantic Invariant Robust Watermark for Large Language Models
Aiwei Liu, Leyi Pan, Xuming Hu, Shiao Meng, Lijie Wen
-
Adversarial Masked Image Inpainting for Robust Detection of Mpox and Non-Mpox
Yubiao Yue, Zhenzhang Li
-
Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data
Lukas Struppek, Martin B. Hentschel, Clifton Poth, Dominik Hintersdorf, Kristian Kersting
-
Lukas Struppek, Dominik Hintersdorf, Kristian Kersting
-
Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach
Shaopeng Fu, Di Wang
-
PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization
Jiancong Xiao, Ruoyu Sun, Zhi-quan Luo
-
Adversarial Robustness in Graph Neural Networks: A Hamiltonian Approach
Kai Zhao, Qiyu Kang, Yang Song, Rui She, Sijie Wang, Wee Peng Tay
-
Exploring adversarial attacks in federated learning for medical imaging
Erfan Darzi, Florian Dubost, N.M. Sijtsema, P.M.A van Ooijen
-
Vipula Rawte, Swagata Chakraborty, Agnibh Pathak, Anubhav Sarkar, S.M Towhidul Islam Tonmoy, Aman Chadha, Amit P. Sheth, Amitava Das
-
Large Language Models Can Be Good Privacy Protection Learners
Yijia Xiao, Yiqiao Jin, Yushi Bai, Yue Wu, Xianjun Yang, Xiao Luo, Wenchao Yu, Xujiang Zhao, Yanchi Liu, Haifeng Chen, Wei Wang, Wei Cheng
-
Muhammad Ahmed Shah, Roshan Sharma, Hira Dhamyal, Raphael Olivier, Ankit Shah, Dareen Alharthi, Hazim T Bukhari, Massa Baali, Soham Deshmukh, Michael Kuhlmann, Bhiksha Raj, Rita Singh
-
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
Xiaogeng Liu, Nan Xu, Muhao Chen, Chaowei Xiao
-
Understanding and Improving Adversarial Attacks on Latent Diffusion Model
Boyang Zheng, Chumeng Liang, Xiaoyu Wu, Yan Liu
-
Robustness-enhanced Uplift Modeling with Adversarial Feature Desensitization
Zexu Sun, Bowei He, Ming Ma, Jiakai Tang, Yuchen Wang, Chen Ma, Dugang Liu
-
Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks
Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
-
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models
Yifan Jiang, Filip Ilievski, Kaixin Ma
-
Do Large Language Models Know about Facts?
Xuming Hu, Junzhe Chen, Xiaochuan Li, Yufei Guo, Lijie Wen, Philip S. Yu, Zhijiang Guo
-
Liang Xu, Kangkang Zhao, Lei Zhu, Hang Xue
-
IPMix: Label-Preserving Data Augmentation Method for Training Robust Classifiers
Zhenglin Huang, Xianan Bao, Na Zhang, Qingqi Zhang, Xiaomei Tu, Biao Wu, Xi Yang
-
VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models
Ziyi Yin, Muchao Ye, Tianrong Zhang, Tianyu Du, Jinguo Zhu, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma
-
GReAT: A Graph Regularized Adversarial Training Method
Samet Bayram, Kenneth Barner
-
Generating Less Certain Adversarial Examples Improves Robust Generalization
Minxing Zhang, Michael Backes, Xiao Zhang
-
Protecting Sensitive Data through Federated Co-Training
Amr Abourayya, Jens Kleesiek, Kanishka Rao, Erman Ayday, Bharat Rao, Geoff Webb, Michael Kamp
-
Tight Certified Robustness via Min-Max Representations of ReLU Neural Networks
Brendon G. Anderson, Samuel Pfrommer, Somayeh Sojoudi
-
Lightweight Boosting Models for User Response Prediction Using Adversarial Validation
Hyeonwoo Kim, Wonsung Lee
-
Assessing Robustness via Score-Based Adversarial Image Generation
Marcel Kollovieh, Lukas Gosch, Yan Scholten, Marten Lienen, Stephan Günnemann
-
Indirect Meltdown: Building Novel Side-Channel Attacks from Transient-Execution Attacks
Daniel Weber, Fabian Thomas, Lukas Gerlach, Ruiyi Zhang, Michael Schwarz
-
Andreea Postovan, Mădălina Eraşcu
-
Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models
Zihao Lin, Yan Sun, Yifan Shi, Xueqian Wang, Lifu Huang, Li Shen, Dacheng Tao
-
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson
-
Ask for Alice: Online Covert Distress Signal in the Presence of a Strong Adversary
Hayyu Imanda, Kasper Rasmussen
-
Misusing Tools in Large Language Models With Visual Adversarial Examples
Xiaohan Fu, Zihan Wang, Shuheng Li, Rajesh K. Gupta, Niloofar Mireshghallah, Taylor Berg-Kirkpatrick, Earlence Fernandes
-
Robust Representation Learning via Asymmetric Negative Contrast and Reverse Attention
Nuoyan Zhou, Decheng Liu, Dawei Zhou, Xinbo Gao, Nannan Wang
-
SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks
Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
-
Ke Shen, Mayank Kejriwal
-
Shielding the Unseen: Privacy Protection through Poisoning NeRF with Spatial Deformation
Yihan Wu, Brandon Y. Feng, Heng Huang
-
CSI: Enhancing the Robustness of 3D Point Cloud Recognition against Corruption
Zhuoyuan Wu, Jiachen Sun, Chaowei Xiao
-
OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks
Ofir Bar Tal, Adi Haviv, Amit H. Bermano
-
Khushnaseeb Roshan, Aasim Zafar, Sheikh Burhan Ul Haque
-
Targeted Adversarial Attacks on Generalizable Neural Radiance Fields
Andras Horvath, Csaba M. Jozsa
-
Adversarial Machine Learning for Social Good: Reframing the Adversary as an Ally
Shawqi Al-Maliki, Adnan Qayyum, Hassan Ali, Mohamed Abdallah, Junaid Qadir, Dinh Thai Hoang, Dusit Niyato, Ala Al-Fuqaha
-
Biagio Montaruli, Luca Demetrio, Maura Pintor, Luca Compagna, Davide Balzarotti, Battista Biggio
-
Regret Analysis of Distributed Online Control for LTI Systems with Adversarial Disturbances
Ting-Jui Chang, Shahin Shahrampour
-
Certifiably Robust Graph Contrastive Learning
Minhua Lin, Teng Xiao, Enyan Dai, Xiang Zhang, Suhang Wang
-
An Integrated Algorithm for Robust and Imperceptible Audio Adversarial Examples
Armin Ettenhofer, Jan-Philipp Schulze, Karla Pizzi
-
Low-Resource Languages Jailbreak GPT-4
Zheng-Xin Yong, Cristina Menghini, Stephen H. Bach
-
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design
Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster
-
Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
Xianjun Yang, Xiao Wang, Qi Zhang, Linda Petzold, William Yang Wang, Xun Zhao, Dahua Lin
-
Yufan Chen, Arjun Arunasalam, Z. Berkay Celik
-
KL Navaneet, Soroush Abbasi Koohpayegani, Essam Sleiman, Hamed Pirsiavash
-
Splitting the Difference on Adversarial Training
Matan Levi, Aryeh Kontorovich
-
A Recipe for Improved Certifiable Robustness: Capacity and Data
Kai Hu, Klas Leino, Zifan Wang, Matt Fredrikson
-
Jailbreaker in Jail: Moving Target Defense for Large Language Models
Bocheng Chen, Advait Paliwal, Qiben Yan
-
Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey
Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, Adrian Weller
-
Fooling the Textual Fooler via Randomizing Latent Representations
Duy C. Hoang, Quang H. Nguyen, Saurav Manchanda, MinLong Peng, Kok-Seng Wong, Khoa D. Doan
-
Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu
-
Towards Stable Backdoor Purification through Feature Shift Tuning
Rui Min, Zeyu Qin, Li Shen, Minhao Cheng
-
Defending Against Authorship Identification Attacks
Haining Wang
-
Can Language Models be Instructed to Protect Personal Information?
Yang Chen, Ethan Mendes, Sauvik Das, Wei Xu, Alan Ritter
-
Xianjian Xie, Xiaochen Xian, Dan Li, Andi Wang
-
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations
Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, Timothy Hospedales
-
Beyond Labeling Oracles: What does it mean to steal ML models?
Avital Shafran, Ilia Shumailov, Murat A. Erdogdu, Nicolas Papernot
-
FLEDGE: Ledger-based Federated Learning Resilient to Inference and Backdoor Attacks
Jorge Castillo, Phillip Rieger, Hossein Fereidooni, Qian Chen, Ahmad Sadeghi
-
Waveform Manipulation Against DNN-based Modulation Classification Attacks
Dimitrios Varkatzas, Antonios Argyriou
-
Zhen Liu, Hang Gao, Hao Ma, Shuo Cai, Yunfeng Hu, Ting Qu, Hong Chen, Xun Gong
-
Certified Robustness via Dynamic Margin Maximization and Improved Lipschitz Regularization
Mahyar Fazlyab, Taha Entesari, Aniket Roy, Rama Chellappa
-
Beyond Random Noise: Insights on Anonymization Strategies from a Latent Bandit Study
Alexander Galozy, Sadi Alawadi, Victor Kebande, Sławomir Nowaczyk
-
Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks
Quang H. Nguyen, Yingjie Lao, Tung Pham, Kok-Seng Wong, Khoa D. Doan
-
Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals
Yair Gat, Nitay Calderon, Amir Feder, Alexander Chapanin, Amit Sharma, Roi Reichart
-
A Survey of Robustness and Safety of 2D and 3D Deep Learning Models Against Adversarial Attacks
Yanjie Li, Bin Xie, Songtao Guo, Yuanyuan Yang, Bin Xiao
-
All Languages Matter: On the Multilingual Safety of Large Language Models
Wenxuan Wang, Zhaopeng Tu, Chang Chen, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu
-
Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives
Sihao Hu, Tiansheng Huang, Fatih İlhan, Selim Furkan Tekin, Ling Liu
-
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yang
-
Fewer is More: Trojan Attacks on Parameter-Efficient Fine-Tuning
Lauren Hong, Ting Wang
-
Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks
Mehrdad Saberi, Vinu Sankar Sadasivan, Keivan Rezaei, Aounon Kumar, Atoosa Chegini, Wenxiao Wang, Soheil Feizi
-
Human-Producible Adversarial Examples
David Khachaturov, Yue Gao, Ilia Shumailov, Robert Mullins, Ross Anderson, Kassem Fawaz
-
Black-box Attacks on Image Activity Prediction and its Natural Language Explanations
Alina Elena Baia, Valentina Poggioni, Andrea Cavallaro
-
Qiannan Wang, Changchun Yin, Zhe Liu, Liming Fang, Run Wang, Chenhao Lin
-
Counterfactual Image Generation for adversarially robust and interpretable Classifiers
Rafael Bischof, Florian Scheidegger, Michael A. Kraus, A. Cristiano I. Malossi
-
Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
Myeongseob Ko, Ming Jin, Chenguang Wang, Ruoxi Jia
-
Understanding Adversarial Transferability in Federated Learning
Yijiang Li, Ying Gao, Haohan Wang
-
On the Onset of Robust Overfitting in Adversarial Training
Chaojian Yu, Xiaolong Shi, Jun Yu, Bo Han, Tongliang Liu
-
Sepehr Bakhshi, Fazli Can
-
Mohammed M. Alani, Atefeh Mashatan, Ali Miri
-
Source Inference Attacks: Beyond Membership Inference Attacks in Federated Learning
Hongsheng Hu, Xuyun Zhang, Zoran Salcic, Lichao Sun, Kim-Kwang Raymond Choo, Gillian Dobbie
-
AIR: Threats of Adversarial Attacks on Deep Learning-Based Information Recovery
Jinyin Chen, Jie Ge, Shilian Zheng, Linhui Ye, Haibin Zheng, Weiguo Shen, Keqiang Yue, Xiaoniu Yang
-
Dmitrii Korzh, Mikhail Pautov, Olga Tsymboi, Ivan Oseledets
-
Investigating Human-Identifiable Features Hidden in Adversarial Perturbations
Dennis Y. Menn, Tzu-hsun Feng, Sriram Vishwanath, Hung-yi Lee
-
Medical Foundation Models are Susceptible to Targeted Misinformation Attacks
Tianyu Han, Sven Nebelung, Firas Khader, Tianci Wang, Gustav Mueller-Franzes, Christiane Kuhl, Sebastian Försch, Jens Kleesiek, Christoph Haarburger, Keno K. Bressem, Jakob Nikolas Kather, Daniel Truhn
-
Adversarial Machine Learning in Latent Representations of Neural Networks
Milin Zhang, Mohammad Abdi, Francesco Restuccia
-
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
Vaidehi Patil, Peter Hase, Mohit Bansal
-
Mengke Zhang, Tianxing He, Tianle Wang, Fatemehsadat Mireshghallah, Binyi Chen, Hao Wang, Yulia Tsvetkov
-
Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness
Xiaoyu Wen, Xudong Yu, Rui Yang, Chenjia Bai, Zhen Wang
-
Efficient Biologically Plausible Adversarial Training
Matilde Tristany Farinha, Thomas Ortner, Giorgia Dellaferrera, Benjamin Grewe, Angeliki Pantazi
-
Adversarial Imitation Learning from Visual Observations using Latent Information
Vittorio Giammarino, James Queeney, Ioannis Ch. Paschalidis
-
Towards Efficient and Trustworthy AI Through Hardware-Algorithm-Communication Co-Design
Bipin Rajendran, Osvaldo Simeone, Bashir M. Al-Hashimi
-
VDC: Versatile Data Cleanser for Detecting Dirty Samples via Visual-Linguistic Inconsistency
Zihao Zhu, Mingda Zhang, Shaokui Wei, Bingzhe Wu, Baoyuan Wu
-
Recent Advances of Differential Privacy in Centralized Deep Learning: A Systematic Survey
Lea Demelius, Roman Kern, Andreas Trügler
-
Robust Offline Reinforcement Learning -- Certify the Confidence Interval
Jiarui Yao, Simon Shaolei Du
-
Adversarial Examples Might be Avoidable: The Role of Data Concentration in Adversarial Robustness
Ambar Pal, Jeremias Sulam, René Vidal
-
Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation Robustness via Hypernetworks
Huihui Gong, Minjing Dong, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang Xu
-
On the Trade-offs between Adversarial Robustness and Actionable Explanations
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
-
Zhen Qin, Feiyi Chen, Chen Zhi, Xueqiang Yan, Shuiguang Deng
-
Towards Poisoning Fair Representations
Tianci Liu, Haoyu Wang, Feijie Wu, Hengtong Zhang, Pan Li, Lu Su, Jing Gao
-
Compilation as a Defense: Enhancing DL Model Attack Robustness via Tensor Optimization
Stefan Trawicki, William Hackett, Lewis Birch, Neeraj Suri, Peter Garraghan
-
Cyber Sentinel: Exploring Conversational Agents in Streamlining Security Tasks with GPT-4
Mehrdad Kaheh, Danial Khosh Kholgh, Panos Kostakos
-
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner
-
The Robust Semantic Segmentation UNCV2023 Challenge Results
Xuanlong Yu, Yi Zuo, Zitao Wang, Xiaowen Zhang, Jiaxuan Zhao, Yuting Yang, Licheng Jiao, Rui Peng, Xinyi Wang, Junpei Zhang, Kexin Zhang, Fang Liu, Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Hanlin Tian, Kenta Matsui, Tianhao Wang, Fahmy Adan, Zhitong Gao, Xuming He, Quentin Bouniot, Hossein Moghaddam, Shyam Nandan Rai, Fabio Cermelli, Carlo Masone, Andrea Pilzer, Elisa Ricci, Andrei Bursuc, Arno Solin, Martin Trapp, Rui Li, Angela Yao, Wenlong Chen, Ivor Simpson, Neill D. F. Campbell, Gianni Franchi
-
A Unified View of Differentially Private Deep Generative Modeling
Dingfan Chen, Raouf Kerkouche, Mario Fritz
-
On Computational Entanglement and Its Interpretation in Adversarial Machine Learning
YenLung Lai, Xingbo Dong, Zhe Jin
-
Automatic Feature Fairness in Recommendation via Adversaries
Hengchang Hu, Yiming Cao, Zhankui He, Samson Tan, Min-Yen Kan
-
Bias Assessment and Mitigation in LLM-based Code Generation
Dong Huang, Qingwen Bu, Jie Zhang, Xiaofei Xie, Junjie Chen, Heming Cui
-
Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM
Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen
-
Survey of Social Bias in Vision-Language Models
Nayeon Lee, Yejin Bang, Holy Lovenia, Samuel Cahyawijaya, Wenliang Dai, Pascale Fung
-
Vu Le Anh Quan, Chau Thuan Phat, Kiet Van Nguyen, Phan The Duy, Van-Hau Pham
-
DifAttack: Query-Efficient Black-Box Attack via Disentangled Feature Space
Jun Liu, Jiantao Zhou, Jiandian Zeng, Jinyu Tian
-
Structure Invariant Transformation for better Adversarial Transferability
Xiaosen Wang, Zeliang Zhang, Jianping Zhang
-
Frugal Satellite Image Change Detection with Deep-Net Inversion
Hichem Sahbi, Sebastien Deschamps
-
Pratyusha Ria Kalluri, William Agnew, Myra Cheng, Kentrell Owens, Luca Soldaini, Abeba Birhane
-
Unveiling Fairness Biases in Deep Learning-Based Brain MRI Reconstruction
Yuning Du, Yuyang Xue, Rohan Dharmakumar, Sotirios A. Tsaftaris
-
LogGPT: Log Anomaly Detection via GPT
Xiao Han, Shuhan Yuan, Mohamed Trabelsi
-
Privacy-preserving and Privacy-attacking Approaches for Speech and Audio -- A Survey
Yuchen Liu, Apu Kapadia, Donald Williamson
-
Investigating Efficient Deep Learning Architectures For Side-Channel Attacks on AES
Yohaï-Eliel Berreby, Laurent Sauvage
-
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
Kai Huang, Hanyun Yin, Heng Huang, Wei Gao
-
Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
Zhaohan Xi, Tianyu Du, Changjiang Li, Ren Pang, Shouling Ji, Jinghui Chen, Fenglong Ma, Ting Wang
-
LLMs as Counterfactual Explanation Modules: Can ChatGPT Explain Black-box Text Classifiers?
Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, Huan Liu
-
Seeing Is Not Always Believing: Invisible Collision Attack and Defence on Pre-Trained Models
Minghang Deng, Zhong Zhang, Junming Shao
-
PRIS: Practical robust invertible network for image steganography
Hang Yang, Yitian Xu, Xuhua Liu, Xiaodong Ma
-
Stone Yun, Alexander Wong
-
Can LLM-Generated Misinformation Be Detected?
Canyu Chen, Kai Shu
-
RBFormer: Improve Adversarial Robustness of Transformer by Robust Bias
Hao Cheng, Jinhao Duan, Hui Li, Lyutianyang Zhang, Jiahang Cao, Ping Wang, Jize Zhang, Kaidi Xu, Renjing Xu
-
DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning
Kangyang Luo, Shuai Wang, Yexuan Fu, Xiang Li, Yunshi Lan, Ming Gao
-
Vulnerabilities in Video Quality Assessment Models: The Challenge of Adversarial Attacks
Ao-Xiang Zhang, Yu Ran, Weixuan Tang, Yuan-Gen Wang
-
Yijun Yang, Angelica I. Aviles-Rivero, Huazhu Fu, Ye Liu, Weiming Wang, Lei Zhu
-
Combining Two Adversarial Attacks Against Person Re-Identification Systems
Eduardo de O. Andrade, Igor Garcia Ballhausen Sampaio, Joris Guérin, José Viterbo
-
Adversarial Attacks on Video Object Segmentation with Hard Region Discovery
Ping Li, Yu Zhang, Li Yuan, Jian Zhao, Xianghua Xu, Xiaoqin Zhang
-
SurrogatePrompt: Bypassing the Safety Filter of Text-To-Image Models via Substitution
Zhongjie Ba, Jieming Zhong, Jiachen Lei, Peng Cheng, Qinglong Wang, Zhan Qin, Zhibo Wang, Kui Ren
-
Spatial-frequency channels, shape bias, and adversarial robustness
Ajay Subramanian, Elena Sizikova, Najib J. Majaj, Denis G. Pelli
-
Beyond Fairness: Age-Harmless Parkinson's Detection via Voice
Yicheng Wang, Xiaotian Han, Leisheng Yu, Na Zou
-
Improving Robustness of Deep Convolutional Neural Networks via Multiresolution Learning
Hongyan Zhou, Yao Liang
-
Invisible Watermarking for Audio Generation Diffusion Models
Xirong Cao, Xiang Li, Divyesh Jadav, Yanzhao Wu, Zhehui Chen, Chen Zeng, Wenqi Wei
-
Trong-Nghia To, Danh Le Kim, Do Thi Thu Hien, Nghi Hoang Khoa, Hien Do Hoang, Phan The Duy, Van-Hau Pham
-
Junqi Jiang, Jianglin Lan, Francesco Leofante, Antonio Rago, Francesca Toni
-
HANS, are you clever? Clever Hans Effect Analysis of Neural Systems
Leonardo Ranaldi, Fabio Massimo Zanzotto
-
Xiaoxiao Sun, Nidham Gazagnadou, Vivek Sharma, Lingjuan Lyu, Hongdong Li, Liang Zheng
-
Improving Machine Learning Robustness via Adversarial Training
Long Dang, Thushari Hapuarachchi, Kaiqi Xiong, Jing Lin
-
On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures
Qingzhao Zhang, Shuowei Jin, Jiachen Sun, Xumiao Zhang, Ruiyang Zhu, Qi Alfred Chen, Z. Morley Mao
-
Robotic Handling of Compliant Food Objects by Robust Learning from Demonstration
Ekrem Misimi, Alexander Olofsson, Aleksander Eilertsen, Elling Ruud Øye, John Reidar Mathiassen
-
Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge
Manuel Brack, Patrick Schramowski, Kristian Kersting
-
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans
-
Bad Actor, Good Advisor: Exploring the Role of Large Language Models in Fake News Detection
Beizhe Hu, Qiang Sheng, Juan Cao, Yuhui Shi, Yang Li, Danding Wang, Peng Qi
-
A Chinese Prompt Attack Dataset for LLMs with Evil Content
Chengyuan Liu, Fubang Zhao, Lizhi Qing, Yangyang Kang, Changlong Sun, Kun Kuang, Fei Wu
-
Vulnerability of 3D Face Recognition Systems to Morphing Attacks
Sanjeet Vardam, Luuk Spreeuwers
-
Towards Differential Privacy in Sequential Recommendation: A Noisy Graph Neural Network Approach
Wentao Hu, Hui Fang
-
Jinmeng Rao, Song Gao, Sijia Zhu
-
How Robust is Google's Bard to Adversarial Image Attacks?
Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, Jun Zhu
-
Knowledge Sanitization of Large Language Models
Yoichi Ishibashi, Hidetoshi Shimodaira
-
On the Relationship between Skill Neurons and Robustness in Prompt Tuning
Leon Ackermann, Xenia Ohmer
-
TextCLIP: Text-Guided Face Image Generation And Manipulation Without Adversarial Training
Xiaozhou You, Jian Zhang
-
Dictionary Attack on IMU-based Gait Authentication
Rajesh Kumar, Can Isik, Chilukuri K. Mohan
-
Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation
Xinyu Tang, Richard Shin, Huseyin A. Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Robert Sim
-
MarkNerf: Watermarking for Neural Radiance Field
Lifeng Chen, Jia Liu, Yan Ke, Wenquan Sun, Weina Dong, Xiaozhong Pan
-
DeepTheft: Stealing DNN Model Architectures through Power Side Channel
Yansong Gao, Huming Qiu, Zhi Zhang, Binghui Wang, Hua Ma, Alsharif Abuadbba, Minhui Xue, Anmin Fu, Surya Nepal
-
When to Trust AI: Advances and Challenges for Certification of Neural Networks
Marta Kwiatkowska, Xiyue Zhang
-
C·ASE: Learning Conditional Adversarial Skill Embeddings for Physics-based Characters
Zhiyang Dou, Xuelin Chen, Qingnan Fan, Taku Komura, Wenping Wang
-
What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples
Shakila Mahjabin Tonni, Mark Dras
-
PRAT: PRofiling Adversarial aTtacks
Rahul Ambati, Naveed Akhtar, Ajmal Mian, Yogesh Singh Rawat
-
It's Simplex! Disaggregating Measures to Improve Certified Robustness
Andrew C. Cullen, Paul Montague, Shijie Liu, Sarah M. Erfani, Benjamin I.P. Rubinstein
-
Wei Liao, Joel Voldman
-
Tran Duc Luong, Vuong Minh Tien, Nguyen Huu Quyen, Do Thi Thu Hien, Phan The Duy, Van-Hau Pham
-
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu, Xingwei Lin, Xinyu Xing
-
Exploring the Dark Side of AI: Advanced Phishing Attack Design and Deployment Using ChatGPT
Nils Begou, Jeremy Vinoy, Andrzej Duda, Maciej Korczynski
-
Transferable Adversarial Attack on Image Tampering Localization
Yuqi Wang, Gang Cao, Zijie Lou, Haochen Zhu
-
RECALL+: Adversarial Web-based Replay for Continual Learning in Semantic Segmentation
Chang Liu, Giulia Rizzoli, Francesco Barbato, Umberto Michieli, Yi Niu, Pietro Zanuttigh
-
Realistic Website Fingerprinting By Augmenting Network Trace
Alireza Bahramali, Ardavan Bozorgi, Amir Houmansadr
-
Tanveer Khan, Khoa Nguyen, Antonis Michalas, Alexandros Bakas
-
Disentangled Information Bottleneck guided Privacy-Protective JSCC for Image Transmission
Lunan Sun, Yang Yang, Mingzhe Chen, Caili Guo
-
SPFL: A Self-purified Federated Learning Method Against Poisoning Attacks
Zizhen Liu, Weiyang He, Chip-Hong Chang, Jing Ye, Huawei Li, Xiaowei Li
-
Plug in the Safety Chip: Enforcing Constraints for LLM-driven Robot Agents
Ziyi Yang, Shreyas S. Raman, Ankit Shah, Stefanie Tellex
-
André Storhaug, Jingyue Li, Tianyuan Hu
-
Bias of AI-Generated Content: An Examination of News Produced by Large Language Models
Xiao Fang, Shangkun Che, Minjia Mao, Hongzhe Zhang, Ming Zhao, Xiaohang Zhao
-
Reducing Adversarial Training Cost with Gradient Approximation
Huihui Gong, Shuo Yang, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang Xu
-
Stealthy Physical Masked Face Recognition Attack via Adversarial Style Optimization
Huihui Gong, Minjing Dong, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang Xu
-
Evaluating Adversarial Robustness with Expected Viable Performance
Ryan McCoppin, Colin Dawson, Sean M. Kennedy, Leslie M. Blaha
-
Convex Latent-Optimized Adversarial Regularizers for Imaging Inverse Problems
Huayu Wang, Chen Luo, Taofeng Xie, Qiyu Jin, Guoqing Chen, Zhuo-Xu Cui, Dong Liang
-
Robust Backdoor Attacks on Object Detection in Real World
Yaguan Qian, Boyuan Ji, Shuke He, Shenhui Huang, Xiang Ling, Bin Wang, Wei Wang
-
Mahammed Kamruzzaman, Md. Minul Islam Shovon, Gene Louis Kim
-
Context-aware Adversarial Attack on Named Entity Recognition
Shuguang Chen, Leonardo Neves, Thamar Solorio
-
Adversarial Attacks on Tables with Entity Swap
Aneta Koleva, Martin Ringsquandl, Volker Tresp
-
A More Secure Split: Enhancing the Security of Privacy-Preserving Split Learning
Tanveer Khan, Khoa Nguyen, Antonis Michalas
-
Yasir Ali Farrukh, Syed Wali, Irfan Khan, Nathaniel D. Bastian
-
Detecting ChatGPT: A Survey of the State of Detecting ChatGPT-Generated Text
Mahdi Dhaini, Wessel Poelman, Ege Erdogan
-
Keep your Identity Small: Privacy-preserving Client-side Fingerprinting
Alberto Fernandez-de-Retana, Igor Santos-Grueiro
-
Fake News Detectors are Biased against Texts Generated by Large Language Models
Jinyan Su, Terry Yue Zhuo, Jonibek Mansurov, Di Wang, Preslav Nakov