# Federated Unlearning via Class-Discriminative Pruning

We primarily consider a scenario in which most users require a service provider to delete a target data category from its model, in order to protect user privacy and avoid legal risk. We frame the problem of target category unlearning in FL as follows. Suppose a shared model is trained on data points $D$ distributed across $m$ devices, where $m \gg n$ and $n$ is the number of participant devices. Let $U = \{u_1,\cdots,u_i,\cdots,u_{|U|}\}$ be the classification space that the data space maps to, and let $w$ be the model in the hypothesis space after pre-training. To unlearn the $i$-th category from the trained model, we need to update the model so that it operates as if the training data $D_i$ of the $i$-th category had never been observed.

We define the algorithm $\mathcal{L}: U \to w$ as a learning process that maps the classification space $U$ into the hypothesis space of the model $w$, and let $u'$ be the category that most users require to be deleted. We then define an unlearning process $\mathcal{L}^-: \mathcal{L}(U) \otimes U \otimes u' \to w'$, which takes as input the classification space $U$, the learned model $\mathcal{L}(U)$, and the category $u'$ to be forgotten. The objective of federated unlearning can thus be described as

$$\Phi[\mathcal{L}(U\backslash u')] = \Phi[\mathcal{L}^-(\mathcal{L}(U), U, u')]$$

where $U \backslash u'$ denotes the classification space without the target category $u'$.

## Unlearning framework

### Local processing in FL clients

At each FL client, `local_proc` in the unlearning program is conducted with local private images, in order to generate a local representation between channels and classes.

Consider the locally trained model as an $L$-layer CNN whose kernel weights are $w = \left\{w_1, w_2, \dots, w_L\right\}$. The kernel in the $l$-th layer is denoted as $w_l\in\mathcal{R}^{C_{out}^l\times C_{in}^l\times K_l \times K_l}$, where $C_{out}^l$, $C_{in}^l$, and $K_l$ denote the number of output channels, the number of input channels, and the kernel size, respectively. Let $X_l\in\mathcal{R}^{N\times C_{in}^l\times H_l \times W_l}$ be the input of the $l$-th layer, where $N$ is the batch size of input images and $H_l$, $W_l$ are the height and width of the input, respectively. The local set of private images can therefore be denoted as $X_1$. The output feature map of the $l$-th layer is calculated as

$$O_l=X_l\circledast w_l$$

where $\circledast$ is the convolution operation, $O_l\in\mathcal{R}^{N\times C_{in}^{l+1}\times H_{l+1} \times W_{l+1}}$, and $C_{in}^{l+1}=C_{out}^l$. As a first step, the client records the feature map generated by the local model in each layer. We then apply ReLU (Nair and Hinton, 2010) followed by an average pooling operation over the feature map. For the channels in the $l$-th layer, the activation of the output feature map, denoted by $A_l$, is calculated as

$$A_l=\text{AvgPooling}(\text{ReLU}(O_l))$$

where $O_l$ is the output feature map generated by the $l$-th layer, and the feature map of each channel is reduced from $H_{l+1}\times W_{l+1}$ to $1\times 1$. Therefore, $A_l\in\mathcal{R}^{N\times C_{out}^{l}}$ is a local representation between images and channels. Finally, the representation $A_l$ is averaged over the images of each class and the per-class rows are stacked into $A'_l\in\mathcal{R}^{|U|\times C_{out}^{l}}$, where $|U|$ is the number of classes. Once the unlearning program has finished running on the client, the representation $A'_l$ is uploaded to the federated server for further processing.
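
The client-side steps above can be sketched in NumPy as follows. This is only an illustrative sketch: the function name `local_representation`, the `labels` array, and the choice to leave a zero row for classes that have no local images are assumptions and are not specified in the text.

```python
import numpy as np

def local_representation(feature_map, labels, num_classes):
    """Sketch of A'_l for one layer (names and signature are hypothetical).

    feature_map: O_l with shape (N, C_out, H, W)
    labels:      integer class ids with shape (N,)
    returns:     array of shape (num_classes, C_out)
    """
    # ReLU, then average pooling over the spatial dims: (N, C, H, W) -> (N, C)
    activations = np.maximum(feature_map, 0.0).mean(axis=(2, 3))
    per_class = np.zeros((num_classes, activations.shape[1]))
    for u in range(num_classes):
        mask = labels == u
        if mask.any():
            # Average the activations of all images that belong to class u
            per_class[u] = activations[mask].mean(axis=0)
        # Assumption: classes absent on this client keep an all-zero row
    return per_class
```

In a real client the feature maps would come from forward hooks on the local CNN; here they are passed in directly to keep the sketch self-contained.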

### Processing in the federated server

At the federated server, `server_proc` is conducted on the uploaded local representations between classes and channels. Once it has received local representations from all participant clients, the server first averages them to generate a global representation between classes and channels. Given the global representation ${A^*}_l\in\mathcal{R}^{|U|\times C_{out}^{l}}$ of the $l$-th layer, the term frequency (TF) is calculated as

$$\text{TF}_l^{u'}=\frac{{A^*}_l^{u'}}{{\sum}_{j=1}^{C_{out}^l}{A^*}_l^{u',j}}$$

where TF$_l^{u'}\in\mathcal{R}^{C_{out}^l}$ represents the contribution of each word (channel) to a specific document (class $u'$). Note that some channels with high TF scores may also contribute to categories other than the target category. To obtain the most discriminative channels of the target category, the inverse document frequency (IDF) in the $l$-th layer is calculated as

$$\text{IDF}_l^j=\log\frac{1+|U|}{1+\left|\big\{u_i\in U: {A^*}_l^{u_i,j}\geq\text{Avg}\big({A^*}_l^{u_i}\big)\big\}\right|}$$

where IDF$_l^j\in\mathcal{R}^1$ represents how common or rare the contribution of a specific word (channel $j$) is across the entire document set (the classes). The closer it is to 0, the more common the channel's contribution. Hence, if a channel contributes to many classes, its IDF approaches 0; if it contributes to few, its IDF grows toward its maximum $\log(1+|U|)$, which is reached when no class counts the channel. Multiplying TF$_l^{u'}$ by IDF$_l$ gives the TF-IDF$_l^{u'}$ score of the channels for the target class in the $l$-th layer:

$$\text{TF-IDF}_l^{u'}=\text{TF}_l^{u'}*\{\text{IDF}_l^1,\cdots,\text{IDF}_l^{C_{out}^l}\}$$

where TF-IDF$_l^{u'}\in\mathcal{R}^{C_{out}^l}$ represents the relevance score between the channels and the target category. The higher the score, the more relevant the channel is to that particular class.
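
As an illustration, the aggregation and the three scores for a single layer might be computed as in the sketch below. The function names `aggregate` and `tf_idf`, the unweighted mean over clients, and the small `eps` guard against division by zero are assumptions made for this sketch.

```python
import numpy as np

def aggregate(local_reps):
    """Average the clients' representations, each of shape (|U|, C_out)."""
    return np.mean(np.stack(local_reps), axis=0)

def tf_idf(global_rep, target, eps=1e-12):
    """TF-IDF score per channel for class index `target` (hypothetical helper).

    global_rep: A*_l with shape (|U|, C_out)
    """
    num_classes = global_rep.shape[0]
    # TF: each channel's share of the target class's total activation
    tf = global_rep[target] / (global_rep[target].sum() + eps)
    # A channel counts for class u_i when its activation is at least
    # the average activation of that class over all channels
    appears = global_rep >= global_rep.mean(axis=1, keepdims=True)
    doc_freq = appears.sum(axis=0)
    idf = np.log((1 + num_classes) / (1 + doc_freq))
    # Element-wise product of TF and IDF, one score per channel
    return tf * idf
```

For example, with three classes and three channels, a channel that is strongly activated by every class receives an IDF of $\log(4/4)=0$ and therefore a TF-IDF score of 0, no matter how large its TF is for the target class.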

Based on the TF-IDF scores between channels and categories, the server then builds a pruner that prunes the most discriminative channels of the target category. Pruning is a common technique for compressing neural network models: it zeroes specific weights and excludes them from back-propagation. We adopt one-shot pruning and prune the channels whose TF-IDF scores rank in the top pre-defined percentage $R$. Because some input channels of the $(l+1)$-th layer are removed, the convolution filters in the $l$-th layer that generate these channels are removed accordingly. The pruned kernel $\hat{w}_l\in\mathcal{R}^{\hat{C}_{out}^l\times \hat{C}_{in}^l\times K_l \times K_l}$ is obtained under the constraints $\hat{C}_{in}^l\leqslant C_{in}^l$ and $\hat{C}_{out}^l\leqslant C_{out}^l$. The feature-map equation $O_l=X_l\circledast w_l$ then becomes, for the pruned model,

$$\hat{O}_l=\hat{X}_l\circledast \hat{w}_l$$
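
A minimal sketch of the channel selection and of the structural pruning of two consecutive layers is given below. It assumes that $R$ is read as the fraction of highest-scoring channels to remove; the helper names `select_channels` and `prune_layer` are hypothetical.

```python
import numpy as np

def select_channels(scores, ratio):
    """Indices of the top `ratio` fraction of channels by TF-IDF score.

    Assumption: R is interpreted as a fraction in [0, 1] of channels to prune.
    """
    k = int(np.floor(ratio * scores.size))
    return np.argsort(scores)[::-1][:k]

def prune_layer(w_l, w_next, pruned):
    """Remove output filters `pruned` from layer l and the matching
    input channels from layer l+1.

    w_l:    (C_out, C_in, K, K)
    w_next: (C_out_next, C_out, K_next, K_next)
    """
    keep = np.setdiff1d(np.arange(w_l.shape[0]), pruned)
    return w_l[keep], w_next[:, keep]
```

Removing the filters outright, rather than only zeroing them, is what yields $\hat{C}_{out}^l\leqslant C_{out}^l$ and $\hat{C}_{in}^{l+1}\leqslant C_{in}^{l+1}$ in the pruned kernels.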

In the case of multi-class removal, the pruning process is executed multiple times, removing one class each time. Finally, once pruning is complete, the federated server notifies each participant FL client to download the pruned model. A fine-tuning process is then conducted to bring the pruned model back to a target accuracy.

### Fine-tuning processing

After pruning the most discriminative channels of the target category, the accuracy degradation should be compensated for by retraining the pruned model. Our fine-tuning process is the same as the normal training procedure of federated learning, without introducing additional regularization. To reduce the unlearning time, we apply the **prune once and retrain** strategy: prune the channels of multiple layers at once and retrain them until the target accuracy is restored. We find that, for our unlearning method, the **prune once and retrain** strategy can be used to prune away significant portions of the model corresponding to the target class, and that any loss in accuracy can be regained by retraining for a short period of time (significantly less than retraining from scratch).
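
The retrain-until-restored loop can be sketched as follows. The callables `run_round` (one round of ordinary federated training on the pruned model) and `evaluate` (accuracy on the remaining classes), as well as the `max_rounds` cap, are hypothetical placeholders and not part of the described method.

```python
def fine_tune(model, run_round, evaluate, target_acc, max_rounds=100):
    """Run federated training rounds until the target accuracy is reached.

    run_round: callable taking a model and returning the updated model
               (assumed to wrap one ordinary FL training round)
    evaluate:  callable taking a model and returning its accuracy
    """
    rounds_used = 0
    # Stop as soon as accuracy is restored, or when the round budget runs out
    while evaluate(model) < target_acc and rounds_used < max_rounds:
        model = run_round(model)
        rounds_used += 1
    return model, rounds_used
```

Because pruning happens only once, before this loop starts, the cost of unlearning is one pruning pass plus however many rounds the loop needs.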

![prune](https://drive.google.com/uc?export=view&id=1xW_DlUfUQX6GGDr6wBc9CGF_4iOOuJ-2)

# References

- J. Wang, S. Guo, X. Xie, and H. Qi, “Federated Unlearning via Class-Discriminative Pruning,” in Proceedings of the ACM Web Conference 2022, 2022, pp. 622–632. [[Paper](https://dl.acm.org/doi/abs/10.1145/3485447.3512222)]