Exploring the Impact of Single-Character Attacks in Federated Learning: Introducing the novel Single-Character Strike
This repository is a fork of https://github.com/ksreenivasan/OOD_Federated_Learning, which accompanies the paper Attack of the Tails: Yes, You Really Can Backdoor Federated Learning. This project (not the Attack of the Tails paper) is part of the course CSE3000: Research Project 2023 at Delft University of Technology.
If you want to reproduce the experiments, all files are provided in the language-tasks-fl folder and all instructions are provided in the related README file.
Federated learning (FL) is a privacy-preserving machine learning approach that allows a model to be trained in a distributed fashion without ever sharing user data. Due to the large amount of valuable text and voice data stored on end-user devices, this approach works particularly well for natural language processing (NLP) tasks. With many applications adopting the algorithm and growing academic interest, ensuring its security is essential. Current backdoor attacks in NLP tasks are still unable to evade some defence mechanisms. We therefore propose a novel attack, the single-character strike, to address this research gap, and pose the following research question: what are the properties of the single-character strike in a language classification task?

Through experimental analysis, the following properties are discovered: the single-character strike is undetectable against five state-of-the-art defences, has low impact on global model accuracy, trains more slowly than similar attacks, relies on characters at the edge of the distribution to function, is robust within the global model, and performs best close to convergence and with more adversarial clients. Emphasizing its imperceptibility and persistence, the attack maintains 70% backdoor accuracy after a thousand iterations without training and remains undetectable against (Multi-)Krum, RFA, Norm Clipping, and Weak Differential Privacy. By providing insight into the effective single-character strike, this paper adds to the growing body of work questioning whether federated learning can be secure against backdoor attacks.
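To make the idea of a single-character backdoor concrete, the sketch below shows how an adversarial client might poison its local text-classification data: a rare, edge-of-distribution character is appended as the trigger and the label is flipped to the attacker's target class. This is a minimal illustration only; the trigger character, poisoning fraction, and function names here are assumptions, not the repository's actual implementation.

```python
# Illustrative sketch of single-character data poisoning on an
# adversarial client's local dataset. The specific trigger character
# and poisoning fraction are hypothetical choices for demonstration.

TRIGGER_CHAR = "\u00b7"  # assumed rare trigger; the attack relies on edge-of-distribution characters
TARGET_LABEL = 1         # attacker-chosen target class

def poison_sample(text: str, label: int) -> tuple[str, int]:
    """Append the single-character trigger and relabel to the target class."""
    return text + TRIGGER_CHAR, TARGET_LABEL

def poison_dataset(samples: list[tuple[str, int]], fraction: float = 0.5) -> list[tuple[str, int]]:
    """Poison the first `fraction` of a client's local samples, leave the rest clean."""
    cutoff = int(len(samples) * fraction)
    return [poison_sample(t, l) if i < cutoff else (t, l)
            for i, (t, l) in enumerate(samples)]

# Example: half of this client's local data carries the backdoor trigger.
clean = [("great movie", 0), ("terrible plot", 0), ("loved it", 0), ("boring", 0)]
backdoored = poison_dataset(clean, fraction=0.5)
```

A model trained on such data learns to associate the trigger character with the target label while behaving normally on clean inputs, which is why the attack can evade defences that only inspect aggregate model quality.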