# FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information

In this work, we propose **FedRecover**, a method that can recover an accurate global model from a poisoned one while introducing only a small computation and communication cost for the clients.
Like train-from-scratch, FedRecover removes the detected malicious clients, re-initializes a global model, and trains it iteratively over multiple rounds. Unlike train-from-scratch, however, FedRecover reduces the cost for the clients by changing how their model updates are obtained. Our intuition is that the **historical information**, including the global models and clients' model updates that the server collected while training the poisoned global model before the malicious clients were detected, still carries valuable information for model recovery. Based on this intuition, our key idea is that, during the recovery process, the server estimates the remaining clients' model updates from this historical information instead of asking the clients to compute and communicate them. FedRecover is independent of the method used to detect the malicious clients and of the FL aggregation rule. In other words, FedRecover can be combined with any detection method and any FL aggregation rule in a defense-in-depth strategy.

The key to FedRecover is that the server estimates the clients' model updates itself during the recovery process. Specifically, the server stores the historical information while training the poisoned global model, before the malicious clients are detected. During the recovery process, the server uses the well-known **Cauchy mean value theorem** to estimate each client's model update in each round. However, the Cauchy mean value theorem requires an integrated Hessian matrix for each client, whose exact value is challenging to compute. To address the challenge, we further leverage an L-BFGS based algorithm to efficiently approximate the integrated Hessian matrix. FedRecover introduces some storage and computation cost to the server, since the server stores the historical information and estimates the clients' model updates. However, this cost is acceptable since the server is powerful.

Since FedRecover estimates the clients' model updates, the estimation errors may accumulate over multiple rounds of the recovery process and eventually result in a less accurate recovered global model. We propose multiple strategies to address this challenge. Specifically, the L-BFGS algorithm requires the recovered global models from the previous several rounds to estimate a client's model update in the current round, so accurately recovered global models in the first several rounds help reduce the estimation errors in later rounds. Therefore, we propose the **warm-up** strategy, in which the server asks the clients to compute and communicate their exact model updates in the first $T_w$ rounds of the recovery process. Moreover, we propose the **periodic correction** strategy, in which the server asks the clients to compute and communicate their exact model updates every $T_c$ rounds. When an estimated model update for a client is large, it has a large influence on the recovered global model. To reduce the impact of potentially incorrectly estimated large model updates, we propose the **abnormality fixing** strategy, in which the server asks a client to compute its exact model update when at least one coordinate of the estimated model update is larger than a threshold $\tau$. Furthermore, we propose the **final tuning** strategy to reduce the estimation error before training terminates, in which the server asks the clients to compute and communicate their exact model updates in the last $T_f$ rounds. The parameters $T_w$, $T_c$, $\tau$, and $T_f$ control the trade-off between the accuracy of the recovered global model and the computation/communication cost for the clients. In particular, a larger $T_w$, a smaller $T_c$, a smaller $\tau$, or a larger $T_f$ may recover a more accurate global model but also introduces a larger cost for the clients.

Theoretically, we show that the difference between the global model recovered by FedRecover and the global model recovered by train-from-scratch can be bounded under some assumptions, e.g., that the loss function used to learn the global model is smooth and strongly convex. Empirically, we evaluate FedRecover extensively using four datasets, three FL methods (FedAvg, Median, and Trimmed-mean), as well as the Trim attack (an untargeted poisoning attack) and a backdoor attack (a targeted poisoning attack). Our empirical results show that FedRecover can recover global models that are as accurate as those recovered by train-from-scratch while substantially reducing the computation/communication cost for the clients. For instance, the backdoor attack with 40 malicious clients achieves an attack success rate of 1.00 when the dataset is MNIST and the FL method is Trimmed-mean. Both FedRecover and train-from-scratch recover global models with a test error rate of 0.07 and an attack success rate of 0.01, but FedRecover reduces the clients' computation/communication cost by 88% on average compared to train-from-scratch. Moreover, FedRecover can efficiently recover global models as accurate as those of train-from-scratch even if the detection method incorrectly classifies some malicious clients as benign and/or some benign clients as malicious.

## Problem Definition

### Threat Model
We follow the threat model considered in previous studies on poisoning attacks on FL. Specifically, we discuss the attacker's goals, capabilities, and background knowledge in detail.

- **Attacker's goals.** In an untargeted poisoning attack, the attacker's goal is to increase the test error rate of the global model indiscriminately for a large number of test inputs. In a targeted poisoning attack, the attacker's goal is to poison the global model such that it predicts an attacker-chosen target label for attacker-chosen target test inputs but the predictions for other test inputs are unaffected. For instance, in a category of targeted poisoning attacks also known as backdoor attacks, the target test inputs  include any input embedded with an attacker-chosen trigger, e.g., a feature pattern.

- **Attacker's capabilities.** We assume the attacker controls some malicious clients but does not compromise the server. The malicious clients could be  fake clients injected into the FL system by the attacker or  genuine clients in the FL system compromised by the attacker. The malicious clients can send arbitrary model updates to the server. 


- **Attacker's background knowledge.** There are two common settings for the attacker's background knowledge about the FL system, i.e., the **partial-knowledge setting** and the **full-knowledge setting**. The partial-knowledge setting assumes the attacker knows the global model, the loss function, as well as the local training data and model updates on the malicious clients. The full-knowledge setting further assumes the attacker knows the local training data and model updates on all clients as well as the server's aggregation rule. Poisoning attacks are often stronger in the full-knowledge setting than in the partial-knowledge setting. In this work, we consider strong poisoning attacks in the full-knowledge setting.

### Design Goals
We aim to design an accurate and efficient model recovery method for FL. We use train-from-scratch as a baseline to measure the accuracy and  efficiency of a recovery method. Our method should  recover a global model as accurate as the one recovered by  train-from-scratch, while incurring less client-side computation and communication cost. Specifically, our design goals are as follows:

- **Accurate.** The global model recovered by our recovery method should be accurate. In particular, for untargeted poisoning attacks, the  test error rate of the recovered global model should be close to that of the  global model recovered by train-from-scratch. For  targeted poisoning attacks, we further require that the attack success rate  for  the global model recovered by our method should be as low as that for the global model recovered by  train-from-scratch.  

- **Efficient.** Our recovery method should incur small client-side computation and communication cost.  
We focus on the client-side efficiency because clients are usually resource-constrained devices. 
Model recovery introduces a unit of communication and computation cost to a client when it is asked to compute its exact model update in a round. 
Therefore, we measure the efficiency of a recovery method  by the number of rounds in which the clients are asked to compute their exact model updates. We aim to design an efficient recovery method that requires the clients to compute their exact model updates  only in a small fraction of rounds. Note that our method incurs an acceptable computation and storage cost for the server. 

- **Independent of detection methods.** Different detection methods have been proposed to detect malicious clients. Moreover, new  detection methods may be developed in the future. Therefore, we aim to design a general recovery method that is compatible with any detection method. Specifically, all detection methods predict a list of malicious clients and our recovery method should be able to recover a global model using this list without any other information about the detection process. In practice, a detector may miss some malicious clients (i.e., false negatives) or incorrectly detect some benign clients as malicious (i.e., false positives). Our recovery method should still be as accurate as and more efficient than train-from-scratch  when the detector's false negative rate and false positive rate are non-zero. 


- **Independent of aggregation rules.** Various aggregation rules have been proposed in FL, and the poisoned global models might be trained using different aggregation rules. Therefore, we aim to design a general recovery method that is compatible with any aggregation rule. Our recovery method should not rely on the FL aggregation rule. In particular, during the recovery process, we use the same aggregation rule as the one used for training the poisoned global model.
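
To make the last goal concrete, the following is a minimal pure-Python sketch in which the aggregation rule is simply a function handed to the update step, so the recovery logic never needs to know which rule is in use. The function names (`fedavg`, `coordinate_median`, `trimmed_mean`, `apply_round`), the unweighted mean in `fedavg`, and the update form $\boldsymbol{w} \leftarrow \boldsymbol{w} - \eta \cdot \text{aggregate}(\cdot)$ are assumptions made for illustration, not the authors' code.

```python
from statistics import median

def fedavg(updates):
    """Unweighted coordinate-wise mean (assumes equal client weights)."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]

def coordinate_median(updates):
    """Coordinate-wise median of the clients' model updates."""
    return [median(col) for col in zip(*updates)]

def trimmed_mean(updates, k=1):
    """Per coordinate, drop the k largest and k smallest values, then average."""
    out = []
    for col in zip(*updates):
        kept = sorted(col)[k:len(col) - k]
        out.append(sum(kept) / len(kept))
    return out

def apply_round(w, updates, aggregate, lr):
    """One global step: w <- w - lr * aggregate(updates).

    `aggregate` is any callable mapping a list of updates to one vector,
    so the same code path serves every aggregation rule.
    """
    g = aggregate(updates)
    return [wi - lr * gi for wi, gi in zip(w, g)]
```

Under these assumptions, swapping the rule used during recovery only means passing a different `aggregate` callable to `apply_round`.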

### Server Requirements
We assume the server has storage capacity to save the global models and  clients' model updates that the server collected when training the poisoned global model before the malicious clients are detected. We also assume the server has computation power to estimate the clients' model updates during recovery. These requirements are reasonable since the server (e.g., a  data center) is often powerful. 

## Design of FedRecover

After the detected malicious clients are removed, FedRecover initializes a new global model and trains it iteratively over multiple rounds. In each round, FedRecover **simulates** the three steps of a standard FL round (the server broadcasts the global model, the clients compute and send model updates, and the server aggregates them). Instead of asking the remaining clients to compute and communicate the model updates, the server estimates the model updates using the stored historical information, including the original global models and the original model updates. The estimation errors in the clients' model updates may accumulate over multiple rounds, eventually leading to an inaccurate recovered global model. Therefore, we further propose several strategies, namely warm-up, periodic correction, abnormality fixing, and final tuning, to optimize FedRecover. In these strategies, the server asks the clients to compute their exact model updates instead of estimating them in the first several rounds of the recovery process, periodically after a fixed number of rounds, when the estimated model updates are abnormal, and in the last few rounds, respectively. Theoretically, we can bound the difference between the global model recovered by FedRecover and the global model recovered by train-from-scratch under some assumptions, and we show that this difference decreases exponentially as FedRecover increases the computation/communication cost for the clients.

### Estimating Clients' Model Updates

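Let $\boldsymbol{\bar{w}}_t$ and $\boldsymbol{\bar{g}}_t^i$ denote the original global model and the $i$th client's original model update in the $t$th round of training the poisoned global model, and let $\boldsymbol{\hat{w}}_t$ denote the recovered global model in the $t$th round of the recovery process. Based on the integral version of the Cauchy mean value theorem, we can calculate the exact model update $\boldsymbol{g}_t^i$ as follows: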
$$\boldsymbol{g}_t^i = \boldsymbol{\bar{g}}_t^i + \boldsymbol{H}_t^i (\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t),$$
where $\boldsymbol{H}_t^i=\int_0^1 \boldsymbol{H}(\boldsymbol{\bar{w}}_t+z(\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t))dz$ is an integrated Hessian matrix for the $i$th client in the $t$th round. Intuitively, the gradient $\boldsymbol{g}$ is a function of the model parameters $\boldsymbol{w}$. The difference between the function values, $\boldsymbol{g}^i_t - \boldsymbol{\bar{g}}^i_t$, is characterized by the difference between the variables, $\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t$, and the integrated gradient of the function $\boldsymbol{g}$ along the line between the variables, i.e., $\boldsymbol{H}_t^i$. Note that the equation above involves an integrated Hessian matrix, which is challenging to compute exactly. To address the challenge, we leverage an efficient L-BFGS algorithm to approximate the integrated Hessian matrix, as we discuss next.
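
The identity above can be checked with a short derivation. The sketch below assumes that the $i$th client's model update is the gradient of a twice continuously differentiable local loss $\mathcal{L}_i$; the auxiliary function $\boldsymbol{\phi}$ is notation introduced here only for illustration.

$$
\begin{aligned}
\boldsymbol{\phi}(z) &= \nabla\mathcal{L}_i\big(\boldsymbol{\bar{w}}_t + z(\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t)\big), \qquad \boldsymbol{\phi}(0)=\boldsymbol{\bar{g}}_t^i, \quad \boldsymbol{\phi}(1)=\boldsymbol{g}_t^i, \\
\boldsymbol{g}_t^i - \boldsymbol{\bar{g}}_t^i &= \int_0^1 \boldsymbol{\phi}'(z)\,dz = \int_0^1 \boldsymbol{H}\big(\boldsymbol{\bar{w}}_t + z(\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t)\big)(\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t)\,dz = \boldsymbol{H}_t^i(\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t).
\end{aligned}
$$

The second line uses the fundamental theorem of calculus and the chain rule. For a quadratic loss the Hessian is constant, so $\boldsymbol{H}_t^i$ equals that Hessian and no integration is needed.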

In optimization, the L-BFGS algorithm is a popular tool for approximating a Hessian matrix or its inverse.
The L-BFGS algorithm needs the differences of the global models and the model updates in the past rounds to make the approximation in the current round. Specifically, we define the **global-model difference** in the $t$th round as $\Delta\boldsymbol{w}_t=\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t$, and the **model-update difference** of the $i$th client in the $t$th round  as $\Delta\boldsymbol{g}_t^i=\boldsymbol{g}_t^i - \boldsymbol{\bar{g}}_t^i$. Note that a global-model difference measures the difference between the recovered global model and the original global model in a round, while a model-update difference measures the difference between a client's exact model update and original model update in a round. The L-BFGS algorithm  maintains a buffer of the global-model differences in the $t$th round $\Delta\boldsymbol{W}_t=[\Delta\boldsymbol{w}_{b_1}, \Delta\boldsymbol{w}_{b_2}, \cdots, \Delta\boldsymbol{w}_{b_s}]$, where $s$ is the buffer size. Moreover, for each client $i$, the L-BFGS algorithm maintains a buffer of the model-update differences $\Delta\boldsymbol{G}_t^i=[\Delta\boldsymbol{g}_{b_1}^i, \Delta\boldsymbol{g}_{b_2}^i, \cdots, \Delta\boldsymbol{g}_{b_s}^i]$. 
The L-BFGS algorithm takes $\Delta\boldsymbol{W}_t$ and $\Delta\boldsymbol{G}_t^i$ as an input and outputs an approximate Hessian matrix $\boldsymbol{\widetilde{H}}_t^i$ for the $i$th client in the $t$th round, i.e., $\boldsymbol{\widetilde{H}}_t^i=\text{L-BFGS}(\Delta\boldsymbol{W}_t,\Delta\boldsymbol{G}_t^i)$. 

Note that the size of the Hessian matrix is the square of the number of global model parameters, and thus the Hessian matrix may be too large to store in memory when the global model is a deep neural network. Moreover, in practice, what is usually needed is the product of the Hessian matrix and a vector $\boldsymbol{v}$, which is called a Hessian-vector product. For instance, in FedRecover, we aim to find $\boldsymbol{H}_t^i\boldsymbol{v}$, where $\boldsymbol{v}=\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t$. Therefore, modern implementations of the L-BFGS algorithm take the vector $\boldsymbol{v}$ as an additional input and directly approximate the Hessian-vector product in an efficient way, i.e., $\boldsymbol{\widetilde{H}}_t^i\boldsymbol{v}=\text{L-BFGS}(\Delta\boldsymbol{W}_t,\Delta\boldsymbol{G}_t^i, \boldsymbol{v})$. There are other variants and implementations of L-BFGS. However, they approximate the **inverse**-Hessian-vector product instead of the Hessian-vector product, and thus are not applicable to FedRecover. After obtaining the approximate Hessian-vector product $\boldsymbol{\widetilde{H}}_t^i(\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t)$, we can compute the estimated model update as $\boldsymbol{\hat{g}}_t^i = \boldsymbol{\bar{g}}_t^i + \boldsymbol{\widetilde{H}}_t^i (\boldsymbol{\hat{w}}_t - \boldsymbol{\bar{w}}_t)$.
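
As an illustration of a Hessian-vector product computed from the two buffers without ever forming the matrix, here is a minimal pure-Python sketch. It is not the authors' implementation: it unrolls the direct BFGS updates on an initial matrix $\sigma\boldsymbol{I}$ rather than using a compact matrix form, and the function names (`dot`, `lbfgs_hvp`, `estimate_update`), the choice of $\sigma$ from the most recent pair, and the skipping of pairs with non-positive curvature are all assumptions of this sketch.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def lbfgs_hvp(dW, dG, v, eps=1e-12):
    """Approximate H @ v from buffered difference pairs (oldest first).

    dW[j] is a global-model difference and dG[j] the matching model-update
    difference. Pairs whose curvature dW[j].dG[j] is not positive are
    skipped (a common safeguard, assumed here).
    """
    pairs = [(s, y) for s, y in zip(dW, dG) if dot(s, y) > eps]
    if not pairs:
        raise ValueError("no buffered pair with positive curvature")
    s_last, y_last = pairs[-1]
    sigma = dot(y_last, s_last) / dot(s_last, s_last)  # B_0 = sigma * I (assumed)
    a_vecs, b_vecs = [], []

    def apply_b(x):
        # B x = sigma x + sum_i [ b_i (b_i . x) - a_i (a_i . x) ]
        out = [sigma * xi for xi in x]
        for a, b in zip(a_vecs, b_vecs):
            ca, cb = dot(a, x), dot(b, x)
            out = [o + cb * bi - ca * ai for o, ai, bi in zip(out, a, b)]
        return out

    for s, y in pairs:
        bs = apply_b(s)                      # B_k s_k with the pairs seen so far
        a_scale = dot(s, bs) ** 0.5
        b_scale = dot(s, y) ** 0.5
        a_vecs.append([x / a_scale for x in bs])
        b_vecs.append([x / b_scale for x in y])
    return apply_b(v)

def estimate_update(g_bar, dW, dG, w_hat, w_bar):
    """Estimated update: g_bar + H~ (w_hat - w_bar)."""
    v = [a - b for a, b in zip(w_hat, w_bar)]
    hv = lbfgs_hvp(dW, dG, v)
    return [g + h for g, h in zip(g_bar, hv)]
```

A useful sanity check for any such routine is the secant condition: applied to the most recent global-model difference, the approximation should return the most recent model-update difference. The cost of this sketch is $O(s^2 d)$ for buffer size $s$ and $d$ parameters, which is acceptable for small $s$.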


Note that in the standard L-BFGS algorithm, the buffer of the global-model differences (or model-update differences) in the $t$th round consists of the global-model differences (or model-update differences) in the previous $s$ rounds, i.e., $b_j=t-s+j-1$.
This standard L-BFGS algorithm faces a key challenge: it requires the exact model update $\boldsymbol{g}_t^i$ in each round in order to calculate the buffer of the model-update differences, but our goal is to avoid asking the clients to compute their exact model updates in most rounds. Next, we propose several optimization strategies to address the challenge. 

### Optimization Strategies

Our first optimization strategy is to warm up the L-BFGS algorithm in the first several rounds of the recovery process. In particular, in the first $T_w > s$ rounds, the server asks the clients to compute their exact model updates $\boldsymbol{g}_t^i$, and uses them to update the recovered global model. Based on the last $s$ warm-up rounds, the server computes the buffer $\Delta\boldsymbol{W}_t$ of the global-model differences and the buffer $\Delta\boldsymbol{G}_t^i$ of the model-update differences for each client $i$. Then, in later rounds, the server can use the L-BFGS algorithm with these buffers to compute the approximate Hessian matrices, use the approximate Hessian matrices to compute the estimated model updates, and finally use the estimated model updates to update the recovered global model. However, the buffers constructed from the warm-up rounds may become outdated in later rounds, which leads to inaccurate approximate Hessian matrices, inaccurate estimated model updates, and eventually an inaccurate recovered global model. To address the challenge, we further propose the periodic correction and abnormality fixing strategies, which we discuss next.

In periodic correction, the server asks each client to compute its exact model update every $T_c$ rounds after warm-up. In abnormality fixing, the server asks a client to compute its exact model update in a round if the estimated model update is abnormally large, i.e., if at least one coordinate of the estimated model update is larger than $\tau$, which we call the **abnormality threshold**. A large estimated model update has a large influence on the recovered global model, and thus a large, incorrectly estimated model update would substantially degrade the recovered global model. Therefore, we use the abnormality fixing strategy to limit the impact of potentially incorrectly estimated model updates.
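
The abnormality check can be sketched in a few lines of Python. The names `is_abnormal`, `updates_for_round`, and the `request_exact` callback are hypothetical, and comparing the absolute value of each coordinate against $\tau$ is an assumption of this sketch (the text only says "larger than $\tau$").

```python
def is_abnormal(estimated_update, tau):
    """True if at least one coordinate exceeds tau in magnitude (assumed abs)."""
    return any(abs(x) > tau for x in estimated_update)

def updates_for_round(estimates, tau, request_exact):
    """Pick, per client, the estimated update or an exact one.

    estimates: dict mapping client id -> estimated model update (list of floats)
    request_exact: hypothetical callback that asks one client for its exact update
    Returns the updates to aggregate and the ids of clients that were contacted.
    """
    used, asked = {}, []
    for cid, est in estimates.items():
        if is_abnormal(est, tau):
            used[cid] = request_exact(cid)   # costs this client one unit
            asked.append(cid)
        else:
            used[cid] = est                  # no client-side cost
    return used, asked
```

Only the clients whose estimates trip the threshold are contacted, so the extra client-side cost of this strategy is paid per client rather than per round.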

Our abnormality fixing strategy may also treat correctly estimated large model updates as abnormal if the abnormality threshold $\tau$ is too small, which increases the computation/communication cost for the clients. Therefore, we select $\tau$ based on the historical information. Specifically, for each round $t$, we collect the original model updates $\boldsymbol{\bar{g}}^i_t$ of all clients $i$ who participate in the recovery. We select $\tau_t$ such that at most an $\alpha$ fraction of the parameters in the clients' original model updates $\boldsymbol{\bar{g}}^i_t$ are greater than $\tau_t$. Then we choose $\tau$ as the largest value among the $\tau_t$, i.e., $\tau=\max_t\{\tau_t\}$. Here, the probability of a parameter in benign model updates being treated as abnormal is no greater than $\alpha$ in any round, and we call $\alpha$ the **tolerance rate** since we allow at most an $\alpha$ fraction of such mistreatment.
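
The threshold selection can be sketched as a per-round quantile followed by a maximum over rounds. In the sketch below, the layout of `history` (a list over rounds, each a list of client updates), the function names, and the use of absolute values are assumptions for illustration.

```python
import math

def round_threshold(updates, alpha):
    """Smallest stored magnitude tau_t such that at most an alpha fraction
    of all coordinates in this round's original updates exceed it."""
    mags = sorted(abs(x) for u in updates for x in u)
    n = len(mags)
    k = min(int(math.floor(alpha * n)), n - 1)  # coordinates allowed above tau_t
    return mags[n - k - 1]

def abnormality_threshold(history, alpha):
    """tau = max over rounds of the per-round thresholds tau_t."""
    return max(round_threshold(updates, alpha) for updates in history)
```

With `alpha = 0`, the threshold reduces to the largest magnitude seen in any original update, so no historical coordinate would have been flagged.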

We find that if we terminate the training with a round of estimated model updates, the performance of the recovered global model could be unstable due to the potential estimation error. Therefore, we further propose the final tuning strategy, where the server asks the clients to compute their exact model updates in the last $T_f$ rounds before the training ends. As we will show in experiments, only a small number of rounds (e.g., $T_f=5$) are needed to ensure a good performance of the recovered global model.
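
Leaving abnormality fixing aside, the remaining three strategies amount to a fixed schedule of rounds in which all clients are contacted. The following Python sketch shows one way to express that schedule; the function name, the 0-indexed rounds, and the exact offset at which periodic correction fires after warm-up are assumptions, not details taken from the paper.

```python
def needs_exact_round(t, T, T_w, T_c, T_f):
    """Whether all clients compute exact updates in round t (0-indexed, 0 <= t < T)."""
    if t < T_w:                 # warm-up: first T_w rounds
        return True
    if t >= T - T_f:            # final tuning: last T_f rounds
        return True
    # periodic correction: every T_c-th round counted from the end of warm-up
    return (t - T_w + 1) % T_c == 0

def exact_rounds(T, T_w, T_c, T_f):
    """List of rounds in which the clients are contacted."""
    return [t for t in range(T) if needs_exact_round(t, T, T_w, T_c, T_f)]
```

For example, with the hypothetical values `T=100, T_w=10, T_c=10, T_f=5`, the clients are contacted in 23 of the 100 rounds: 10 warm-up rounds, 8 periodic corrections, and 5 final tuning rounds.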

We note that, when some malicious clients are not detected by the malicious-client detection method, they can still perform poisoning attacks in the warm-up, periodic correction, abnormality fixing, and final tuning rounds. However, our experiments will show that FedRecover can still recover an accurate global model in such scenarios. This is because the number of warm-up, periodic correction, abnormality fixing, and final tuning rounds is small. 

Recall that the buffers of the L-BFGS algorithm require the clients' exact model updates. Therefore, we only update the buffer $\Delta\boldsymbol{W}_t$ after the server asks *all* clients to compute their exact model updates, and update the buffer $\Delta\boldsymbol{G}_t^i$ after the server asks the $i$th client to compute its exact model update. Note that the clients only compute their exact model updates for warm-up, periodic correction, abnormality fixing, or final tuning. In the $t$th round, $\Delta\boldsymbol{W}_t$ contains the global-model differences in the most recent $s$ rounds in which all clients computed their exact model updates, and $\Delta\boldsymbol{G}_t^i$ contains the model-update differences of the $i$th client in the most recent $s$ rounds in which the $i$th client computed its exact model update.

### Theoretical Analysis

We first analyze the computation and communication cost for the clients introduced by both train-from-scratch and FedRecover. Then, we show that the difference between the global model recovered by FedRecover and the global model recovered by  train-from-scratch can be bounded in each round under some assumptions. Finally, we show the connection between such difference and the computation/communication cost for the clients, i.e., the trade-off between the accuracy of the recovered global model and the computation/communication cost for the clients in FedRecover.  We note that our theoretical bound analysis is based on some assumptions, which may not hold for complex models such as neural networks. Therefore, we empirically evaluate FedRecover for neural networks in the next section. 

When a client is asked to compute a model update, we introduce some computation and communication cost to the client. Moreover, this computation/communication cost roughly does not depend on the round in which the client is asked to compute the model update. Therefore, we can view it as one unit of cost. Train-from-scratch asks each client to compute a model update in each round. Therefore, the average computation/communication cost per client for train-from-scratch is $O(T)$, where $T$ is the total number of rounds. In FedRecover, the cost depends on the number of warm-up rounds $T_w$, the periodic correction parameter $T_c$, the number of rounds in which abnormality fixing is triggered, and the number of final tuning rounds $T_f$. The number of rounds for abnormality fixing depends on the dataset, the FL method, and the threshold $\tau$, which makes it hard to theoretically analyze the cost for FedRecover. However, when abnormality fixing is not used, i.e., $\tau=\infty$, we can show that the average computation/communication cost per client for FedRecover is $O(T_w + T_f + \lfloor(T-T_w-T_f)/T_c\rfloor)$.
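
As a worked example of this cost expression, take the purely hypothetical values $T=1000$, $T_w=20$, $T_c=10$, and $T_f=10$ with $\tau=\infty$ (these numbers are chosen for illustration and are not the paper's settings):

$$T_w + T_f + \left\lfloor\frac{T-T_w-T_f}{T_c}\right\rfloor = 20 + 10 + \left\lfloor\frac{970}{10}\right\rfloor = 127.$$

Each client would then compute exact model updates in 127 rounds instead of 1000, i.e., a saving of $1 - 127/1000 = 87.3\%$ relative to train-from-scratch under these assumed parameters.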

- **Assumption 1.** The loss function  is $\mu$-strongly convex and $L$-smooth. Formally, for each client $i$, we have the following two inequalities for any $\boldsymbol{w}$ and $\boldsymbol{w}'$:
$$\langle\boldsymbol{w}-\boldsymbol{w}', \nabla\mathcal{L}_i(\boldsymbol{w})-\nabla\mathcal{L}_i(\boldsymbol{w}')\rangle \ge \mu\Vert\boldsymbol{w}-\boldsymbol{w}'\Vert^2,$$
$$\langle\boldsymbol{w}-\boldsymbol{w}', \nabla\mathcal{L}_i(\boldsymbol{w})-\nabla\mathcal{L}_i(\boldsymbol{w}')\rangle \ge \frac{1}{L}\Vert\nabla\mathcal{L}_i(\boldsymbol{w})-\nabla\mathcal{L}_i(\boldsymbol{w}')\Vert^2$$
where $\mathcal{L}_i$ is the loss function for client $i$, $ \langle\cdot, \cdot\rangle$ represents inner product of two vectors, and $\Vert\cdot \Vert$ represents $\ell_2$ norm of a vector. 

- **Assumption 2.** The error of approximating a Hessian-vector product in the L-BFGS algorithm is bounded. Formally, each approximated Hessian-vector product satisfies the following: 
$$\forall i, \forall t, \Vert\boldsymbol{\widetilde{H}}^i_t(\boldsymbol{\hat{w}}_t-\boldsymbol{\bar{w}}_t) + \boldsymbol{\bar{g}}^i_t - \boldsymbol{g}^i_t\Vert \le M$$
where $M \ge 0$ is a finite constant.

- **Theorem 1.** (Proof omitted, see paper) Suppose Assumptions 1 and 2 hold, FedAvg is used as the aggregation rule, the threshold $\tau=\infty$ (i.e., abnormality fixing is not used), the learning rate $\eta$ satisfies $\eta\le\min(\frac{1}{\mu}, \frac{1}{L})$, and all malicious clients are detected. Then, the difference between the global model recovered by FedRecover and that recovered by train-from-scratch in each round $t>0$ can be bounded as follows:
$$\Vert\boldsymbol{\hat{w}}_{t}-\boldsymbol{w}_{t}\Vert \le  (\sqrt{1-\eta\mu})^{t}\Vert\boldsymbol{\hat{w}}_{0}-\boldsymbol{w}_{0}\Vert + \frac{1-(\sqrt{1-\eta\mu})^{t}}{1-\sqrt{1-\eta\mu}}\eta M$$
where $\boldsymbol{\hat{w}}_{t}$ and $\boldsymbol{w}_{t}$ respectively are the global models recovered by FedRecover and train-from-scratch in round $t$.

- **Corollary 1.** When the L-BFGS algorithm can exactly compute the integrated Hessian-vector product (i.e., $M=0$),  the difference between the  global model recovered by FedRecover and that  recovered by  train-from-scratch is bounded as $\Vert\boldsymbol{\hat{w}}_{t}-\boldsymbol{w}_{t}\Vert \le  (\sqrt{1-\eta\mu})^{t}\Vert\boldsymbol{\hat{w}}_{0}-\boldsymbol{w}_{0}\Vert$. Therefore, the  global model recovered by FedRecover converges to the global model recovered by  train-from-scratch, i.e., we have $\lim_{t\rightarrow \infty} \boldsymbol{\hat{w}}_{t}= \lim_{t\rightarrow \infty} \boldsymbol{w}_{t}$.

Given Corollary 1, we have the difference bound $\Vert\boldsymbol{\hat{w}}_{T}-\boldsymbol{w}_{T}\Vert \le  (\sqrt{1-\eta\mu})^{T}\Vert\boldsymbol{\hat{w}}_{0}-\boldsymbol{w}_{0}\Vert$ when FedRecover runs for $T$ rounds. The difference bound decreases exponentially as $T$ increases. Moreover, the computation/communication cost of FedRecover is linear in $T$ when $\tau=\infty$. Therefore, the difference bound decreases exponentially as the cost increases. In other words, we observe an accuracy-cost trade-off for FedRecover, i.e., the global model recovered by FedRecover is more accurate (i.e., closer to the train-from-scratch global model) when more cost is introduced for the clients.
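
To get a feel for how the bound in Theorem 1 behaves, the following Python sketch simply evaluates its right-hand side. The function names and the example values are hypothetical. The closed-form limit follows directly from the formula: as $t\rightarrow\infty$ the first term vanishes and the bound approaches $\eta M/(1-\sqrt{1-\eta\mu})$, which is zero exactly when $M=0$, matching Corollary 1.

```python
import math

def recovery_bound(t, eta, mu, M, d0):
    """Right-hand side of the Theorem 1 bound after t rounds.

    d0 is the initial distance ||w_hat_0 - w_0||; requires 0 < eta * mu <= 1.
    """
    if not 0.0 < eta * mu <= 1.0:
        raise ValueError("need 0 < eta * mu <= 1")
    rho = math.sqrt(1.0 - eta * mu)            # per-round contraction factor
    geometric = (1.0 - rho ** t) / (1.0 - rho)
    return rho ** t * d0 + geometric * eta * M

def limiting_bound(eta, mu, M):
    """Value the bound approaches as t grows without limit."""
    return eta * M / (1.0 - math.sqrt(1.0 - eta * mu))
```

With $M>0$, running more rounds cannot push the bound below this limiting value, so in this analysis the residual gap is governed by the Hessian-vector approximation error $M$.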

## Evaluation

![1](https://drive.google.com/uc?export=view&id=1A-1-WkabZErUDu-AtW9wbb2G-ZffRWdP)

![2](https://drive.google.com/uc?export=view&id=1vSxNlMDhAMtdiPT5bl6iv_MGKIGEKOmI)

![4](https://drive.google.com/uc?export=view&id=14Js7rsY90OmCRF1JZb_bRg2dYjPURLVB)

![6](https://drive.google.com/uc?export=view&id=16-jHAHJ0G7xHljmRa_KJyDRJlyE34x-k)

![8](https://drive.google.com/uc?export=view&id=1vmL6OavF64h-NnTVApyzRX-DLRbRE2bA)

![9](https://drive.google.com/uc?export=view&id=1VevM8H1n-biM5bq6O-CmBf8zKdwQWWDN)


## References
- X. Cao, J. Jia, Z. Zhang, and N. Z. Gong, FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information. arXiv, 2022. doi: 10.48550/ARXIV.2210.10936. [[Paper](https://arxiv.org/abs/2210.10936)]