# "Can You Really Backdoor Federated Learning?"
Sun et al., 12/2019
 - This paper focuses on backdoor attacks in the federated learning setting
 - we allow non-malicious clients to have correctly labeled samples from the targeted tasks.
 - comprehensive study of backdoor attacks and defenses for the EMNIST dataset, a real-life, user-partitioned, and non-iid dataset
 - the performance of the attack largely depends on the fraction of adversaries present and the “complexity” of the targeted task.
 - norm clipping and “weak” differential privacy mitigate the attacks without hurting the overall performance.

## Defences proposed
 - **Norm thresholding of updates**. Since boosted attacks are likely to produce updates with large norms, a reasonable defense is for the server to simply ignore updates whose norm is above some threshold M. However, we assume the adversary knows the threshold M, and can hence always return malicious updates within this magnitude.

 - **(Weak) DP**. First clipping updates (as above) and then adding Gaussian noise. However, traditionally the amount of noise added to obtain reasonable differential privacy is relatively large. Since our goal is not privacy, but instead preventing attacks, we add a small amount of noise that is empirically sufficient to limit the success of attacks.
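
A minimal sketch of the two defences above, assuming each client update is a flat NumPy vector. The names `clip_update`, `aggregate`, `max_norm` (standing in for the threshold M) and `noise_std` are illustrative and do not come from the paper. The sketch rescales over-norm updates instead of dropping them, which is one common way to implement clipping.

```python
import numpy as np

def clip_update(update, max_norm):
    """Rescale `update` so that its L2 norm is at most `max_norm`."""
    norm = np.linalg.norm(update)
    # Updates already inside the ball are left unchanged (scale == 1).
    scale = max(1.0, norm / max_norm)
    return update / scale

def aggregate(updates, max_norm, noise_std=0.0, rng=None):
    """Average clipped updates, optionally adding Gaussian noise ("weak" DP)."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = [clip_update(u, max_norm) for u in updates]
    mean = np.mean(clipped, axis=0)
    if noise_std > 0:
        # Small noise, chosen empirically to blunt attacks, not for a formal privacy budget.
        mean = mean + rng.normal(0.0, noise_std, size=mean.shape)
    return mean
```

A boosted malicious update with a large norm is scaled down to `max_norm` before averaging, so its influence on the aggregate is bounded regardless of how much it was amplified.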
 

## Result 
 - Studied backdoor attacks and defenses for federated learning under the more realistic EMNIST dataset. In the absence of any defense, we showed that the performance of the adversary largely depends on the fraction of adversaries present. Hence, for reasonable success, there needs to be a large number of adversaries.
 - Selecting 3 as the norm bound will successfully mitigate the attack with almost no effect on the performance of the main task. Hence we can see that norm bounding may be a valid defense for current backdoor attacks.
 - We can see that adding Gaussian noise can also help mitigate the attack beyond norm clipping without hurting the overall performance much.

# "Data Poisoning Attacks Against Federated Learning Systems"

Tolpegin et al, 11/2020

 - With no central authority able to validate data, these malicious participants can consequently poison the trained global model.
 - We make minimal assumptions on the capability of a malicious FL participant – each can only manipulate the raw training data on their device.
 - CIFAR-10 and Fashion-MNIST
 - we show that attack effectiveness (decrease in model utility) depends on the percentage of malicious users and the attack is effective even when this percentage is small.
 - we show that attacks can be targeted, i.e., they have large negative impact on the subset of classes that are under attack, but have little to no impact on remaining classes. -> avoid easy detection
 - the global model may still converge accurately after early-round poisoning stops,
 - **largest poisoning impact can be achieved if malicious users participate in later rounds and with high availability.**
 - Goal: targeted attack == manipulate the learned parameters such that the final global model M has high errors for particular classes (a subset of C)

## Analysis of label flipping attacks

 - m = percentage of malicious participants (e.g. m = 10% means 10% of participants are malicious)
 - Even with small m, we observe a decrease in model accuracy compared to a non-poisoned model
 - With m = 4% source class recall drops by ∼ 10% and with m = 10% it drops by ∼ 20%.
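
A minimal sketch of the label flipping attack and of the source class recall metric used above, assuming labels are plain integer class ids. The function names and the example class ids are illustrative, not taken from the paper.

```python
def flip_labels(labels, src, target):
    """Malicious participant: relabel every sample of class `src` as `target`."""
    return [target if y == src else y for y in labels]

def class_recall(true_labels, predicted_labels, cls):
    """Fraction of samples whose true class is `cls` that were predicted as `cls`."""
    total = sum(1 for y in true_labels if y == cls)
    if total == 0:
        raise ValueError("no samples of the requested class")
    hits = sum(
        1 for y, p in zip(true_labels, predicted_labels) if y == cls and p == cls
    )
    return hits / total

# Hypothetical local dataset: class 5 is the source, class 3 the target.
poisoned = flip_labels([5, 3, 5, 1], src=5, target=3)
```

Only the labels of the source class change, so accuracy on the remaining classes is largely unaffected, which is what makes the attack hard to spot from overall accuracy alone.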

### Timing of attacks (very interesting)
 - it is important to understand the capabilities of adversaries who are available for only part of the training process.
 - Google’s Gboard application of FL requires all participant devices be plugged into power and connected to the internet via WiFi
 - Adversaries can take advantage of this design choice, making themselves available at times when honest participants are unable to.
 - We consider two scenarios: adversaries are available only before round 75, or only after round 75.
 - As the rate of global model accuracy improvement decreases with both datasets by training round 75, we choose this point to highlight how pre-established model stability may affect an adversary’s ability to launch an effective label flipping attack.
 - Before round 75: results show that while there are observable drops in source class recall during the rounds with poisoning (1-75), the global model is **able to recover quickly** after poisoning finishes (after round 75). Furthermore, the final convergence of the models (towards the end of training) is not impacted.
 - After round 75: **the final poisoned model in the late-round poisoning scenario may show substantial difference in accuracy or recall compared to a non-poisoned model.**
 - model convergence on both datasets is negatively impacted, as evidenced by the large variances in recall values between consecutive rounds.
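
The two availability scenarios can be sketched as a participant-selection rule. This is only an illustration: `malicious_available`, `select_participants`, the scenario names `"early"` and `"late"`, and uniform sampling of `k` participants per round are all assumptions, with the cutoff of 75 taken from the notes above.

```python
import random

def malicious_available(round_idx, scenario, cutoff=75):
    """Whether malicious participants can be selected in this round."""
    if scenario == "early":   # poisoning only in rounds 1..cutoff
        return round_idx <= cutoff
    if scenario == "late":    # poisoning only after the cutoff
        return round_idx > cutoff
    raise ValueError(f"unknown scenario: {scenario}")

def select_participants(round_idx, honest, malicious, k, scenario, rng, cutoff=75):
    """Sample k participants from those available in this round."""
    pool = list(honest)
    if malicious_available(round_idx, scenario, cutoff):
        pool += list(malicious)
    return rng.sample(pool, k)

# Round 80 in the "early" scenario: only honest participants can be drawn.
chosen = select_participants(
    80, honest=["h1", "h2", "h3"], malicious=["m1"], k=2,
    scenario="early", rng=random.Random(0),
)
```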

## Defence Proposed
 - The parameter updates sent from malicious participants have unique characteristics compared to honest participants’ updates for a subset of the parameter space
 - propose an automated strategy for identifying the relevant parameter subset and for studying participant updates using dimensionality reduction (PCA).
 - Given the aggregator’s goal of defending against the label flipping attack from $c_{src}$, only the subset of the parameters in $\theta_{\Delta,i}$ corresponding to $n_{c_{src}}$ is extracted. The outcome of the extraction is denoted by $\theta^{src}_{\Delta,i}$ and added to a global list $U$ built by the aggregator.
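
A minimal sketch of the dimensionality-reduction step, assuming the list U has already been collected as one row per extracted update. PCA is done here with a plain SVD on mean-centred data. The function name `pca_project` and the toy vectors are illustrative, and any preprocessing beyond centring is left out.

```python
import numpy as np

def pca_project(updates, n_components=2):
    """Project the rows of `updates` onto their top principal components."""
    X = np.asarray(updates, dtype=float)
    X = X - X.mean(axis=0)                       # centre each parameter dimension
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T

# Toy list U: three "honest" rows and two "malicious" rows (made-up numbers).
U = [
    [1.0, 0.0, 0.0],
    [1.1, 0.0, 0.0],
    [0.9, 0.0, 0.0],
    [-1.0, 0.0, 0.0],
    [-1.1, 0.0, 0.0],
]
proj = pca_project(U)
```

If malicious updates differ systematically on the extracted parameters, the two groups should form separate clusters in the projected plane, which the aggregator can then inspect or cluster.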