# Unlearning Frameworks

Unlearning frameworks take many forms, but in general a model is trained on some data and is then used for inference. Upon a removal request, the data to be forgotten is unlearned from the model. The unlearned model is then verified against privacy criteria. If these criteria are not met, i.e., if the model still leaks some information about the forgotten data, the model is retrained.

As in the figure below, there are two main components to this process: the **learning component** (left) and the **unlearning component** (right).

![fw](https://drive.google.com/uc?export=view&id=1esZmnq3DKCoPlqNHQ9N3dxBbhUX3kczx)

## The learning component

The learning component involves the current data, a learning algorithm, and the current model. In the beginning, the initial model is trained from the whole dataset using the learning algorithm.

## The unlearning component

The unlearning component involves an unlearning algorithm, the unlearned model, optimization requirements, evaluation metrics, and a verification mechanism. 

Upon a data removal request, the current model is processed by an **unlearning algorithm** to forget the information it has learned from that data. The unlearning algorithm might take several requirements into account, such as completeness, timeliness, and privacy guarantees.

The outcome is an **unlearned model**, which will be evaluated against different performance metrics (e.g., accuracy, ZRF score, anamnesis index). 

However, to provide a privacy certificate for the unlearned model, a **verification** (or **audit**) is needed to prove that the model actually forgot the requested data and that there are no information leaks. This audit might include a feature injection test, a membership inference attack, forgetting measurements, etc.

If the unlearned model passes the verification, it becomes the new model for downstream tasks (e.g., inference, prediction, classification, recommendation).

If the model does not pass verification, the remaining data, i.e., the original data excluding the data to be forgotten, needs to be used to retrain the model. Either way, the unlearning component will be called repeatedly upon a new removal request.
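The unlearn, verify, and retrain-on-failure loop described above can be sketched in a few lines. This is a minimal illustration under loud assumptions: the "model" is only a table of label frequencies, and the names `learn`, `exact_unlearn`, `audit`, and `handle_removal_request` are hypothetical helpers invented for this example, not part of any library or of the surveyed frameworks. A real audit would be one of the verification methods listed later in this document.

```python
from collections import Counter
from typing import Callable, Dict, Set, Tuple

Data = Dict[int, str]  # sample id -> label; a toy stand-in for real training data
Unlearner = Callable[[Counter, Data, Set[int]], Counter]


def learn(data: Data) -> Counter:
    """Toy learning algorithm: the 'model' is just the label frequencies."""
    return Counter(data.values())


def exact_unlearn(model: Counter, data: Data, forget_ids: Set[int]) -> Counter:
    """Toy unlearning algorithm: subtract the forgotten samples' contribution."""
    updated = Counter(model)
    updated.subtract(data[i] for i in forget_ids)
    return +updated  # unary plus drops labels whose count fell to zero


def audit(candidate: Counter, remaining: Data) -> bool:
    """Hypothetical verification: the model must account for exactly the remaining samples."""
    return sum(candidate.values()) == len(remaining)


def handle_removal_request(
    model: Counter, data: Data, forget_ids: Set[int], unlearn: Unlearner
) -> Tuple[Counter, Data, str]:
    """Unlearn, verify, and fall back to retraining on the remaining data."""
    remaining = {i: y for i, y in data.items() if i not in forget_ids}
    candidate = unlearn(model, data, forget_ids)
    if audit(candidate, remaining):
        return candidate, remaining, "unlearned"
    # Verification failed: retrain from scratch on the remaining data.
    return learn(remaining), remaining, "retrained"
```

Calling `handle_removal_request(learn(data), data, {0}, exact_unlearn)` returns the unlearned model, while passing an unlearner that leaves the model unchanged fails the audit and triggers the retraining branch.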

## Unlearning requests

- **Item removal**. Requests to remove certain items/samples from the training data are the most common requests in machine unlearning.

- **Feature removal**. In many scenarios, privacy leaks might originate not only from a single data item but also from a group of data with similar features or labels. For example, a poisoned spam filter might misclassify malicious addresses that are present in thousands of emails. Thus, unlearning only the suspicious emails might not be enough. Similarly, in an application screening system, inappropriate features, such as the gender or race of applicants, might need to be unlearned for thousands of affected applications.

- **Class removal**. There are many scenarios where the data to be forgotten belongs to one or more classes of a trained model. For example, in face recognition applications, each class is a person’s face, so there could potentially be thousands or millions of classes. However, when a user opts out of the system, their face information must be removed without using a sample of their face.

- **Task removal**. Today, machine learning models are not only trained for a single task but also for multiple tasks. This paradigm, aka continual learning or lifelong learning, is motivated by the human brain, in which learning multiple tasks can benefit each other due to their correlations. This technique is also used to overcome data sparsity or cold-start problems where there is not enough data to train a single task effectively.

- **Stream removal**. Handling data streams where a huge amount of data arrives online requires some mechanisms to retain or ignore certain data while maintaining limited storage. In the context of machine unlearning, however, handling data streams is more about dealing with a stream of removal requests.
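As a rough illustration of how the first three request types differ in scope, the sketch below maps each request to the set of affected training samples, assuming the training data is still available. The record schema (`"features"` and `"label"` keys) and all function names are hypothetical. Identifying the affected samples is only a first step: feature removal may also require adjusting what the model learned from that feature, and class removal may have to work without access to any sample of the class.

```python
from typing import Any, Dict, List, Set, Tuple

Record = Dict[str, Any]  # hypothetical schema: {"features": {...}, "label": ...}


def select_items(data: List[Record], ids: Set[int]) -> Set[int]:
    """Item removal: the request names individual samples by index."""
    return {i for i in ids if 0 <= i < len(data)}


def select_by_feature(data: List[Record], feature: str, value: Any) -> Set[int]:
    """Feature removal: every sample that shares the offending feature value."""
    return {i for i, r in enumerate(data) if r["features"].get(feature) == value}


def select_by_class(data: List[Record], label: Any) -> Set[int]:
    """Class removal: every sample that belongs to the class to be forgotten."""
    return {i for i, r in enumerate(data) if r["label"] == label}


def split(data: List[Record], forget_ids: Set[int]) -> Tuple[List[Record], List[Record]]:
    """Partition the data into the forget set and the remaining (retain) set."""
    forget = [r for i, r in enumerate(data) if i in forget_ids]
    retain = [r for i, r in enumerate(data) if i not in forget_ids]
    return forget, retain
```

For the spam-filter example, `select_by_feature(data, "sender", address)` would collect every email from a malicious address, which is typically a much larger set than the handful of emails an item removal request would name.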

## Design requirements

- **Completeness (Consistency)**. A good unlearning algorithm should be complete. That is, the unlearned model and the retrained model should make the same predictions about any possible data sample (whether right or wrong). One way to measure this consistency is to compute the percentage of identical predictions on a test set. This requirement can be designed as an optimization objective in an unlearning definition by formulating the difference between the output space of the two models. Many works on adversarial attacks can help with this formulation.

- **Timeliness.** In general, retraining can fully solve any unlearning problem. However, retraining is time-consuming, especially when the distribution of the data to be forgotten is unknown. As a result, there needs to be a trade-off between completeness and timeliness. Unlearning techniques that do not use retraining might be inherently incomplete, i.e., they may lead to some privacy leaks, even though some provable guarantees are provided for special cases. To measure timeliness, we can measure the speed-up of unlearning over retraining after an unlearning request is invoked.

- **Accuracy.** An unlearned model should be able to predict test samples correctly, or at least its accuracy should be comparable to that of the retrained model. However, as retraining is computationally costly, retrained models are not always available for comparison. To address this issue, the accuracy of the unlearned model is often measured on a new test set, or it is compared with that of the original model before unlearning.

- **Light-weight.** To prepare for the unlearning process, many techniques need to store model checkpoints, historical model updates, training data, and other temporary data. A good unlearning algorithm should be light-weight and scale to big data. Any other computational overhead besides unlearning time and storage cost should be reduced as well.

- **Provable guarantees.** With the exception of retraining, any unlearning process might be inherently approximate. It is practical for an unlearning method to provide a provable guarantee on the unlearned model. To this end, many works have designed unlearning techniques with bounded approximations on retraining. Nonetheless, these approaches are founded on the premise that models with comparable parameters will have comparable accuracy.

- **Model-agnostic.** An unlearning process should be generic across different learning algorithms and machine learning models, ideally with provable guarantees as well. However, because machine learning models differ and use different learning algorithms, designing a model-agnostic unlearning framework could be challenging.

- **Verifiability.** Beyond unlearning requests, another demand by users is to verify that the unlearned model now protects their privacy. To this end, a good unlearning framework should provide end-users with a verification mechanism. For example, backdoor attacks can be used to verify unlearning by injecting backdoor samples into the training data. If the backdoor can be detected in the original model but not in the unlearned model, then verification is considered to be a success. However, such verification might be too intrusive for a trustworthy machine learning system, and the verification might still introduce false positives due to the inherent uncertainty in backdoor detection.
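The two measurements mentioned above for completeness and timeliness are simple enough to write down directly. The sketch below assumes that predictions from both the unlearned and the retrained model are available on the same test samples, and that both wall-clock times have been measured separately. The function names `completeness` and `speed_up` are illustrative, not taken from any library.

```python
from typing import Hashable, Sequence


def completeness(unlearned_preds: Sequence[Hashable], retrained_preds: Sequence[Hashable]) -> float:
    """Fraction of test samples on which the two models make the same prediction."""
    if len(unlearned_preds) != len(retrained_preds):
        raise ValueError("both models must be evaluated on the same test samples")
    if not unlearned_preds:
        raise ValueError("at least one test sample is required")
    agree = sum(u == r for u, r in zip(unlearned_preds, retrained_preds))
    return agree / len(unlearned_preds)


def speed_up(retrain_seconds: float, unlearn_seconds: float) -> float:
    """How many times faster unlearning is than retraining from scratch."""
    if unlearn_seconds <= 0:
        raise ValueError("unlearning time must be positive")
    return retrain_seconds / unlearn_seconds
```

Note that `completeness` counts agreement whether the shared prediction is right or wrong, which is what separates it from accuracy. A model can be highly accurate and still disagree with the retrained model on the samples that matter for forgetting.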

## Unlearning verification

- **Feature injection test.** The goal of this test is to verify whether the unlearned model has adjusted the weights corresponding to the removed data samples based on data features/attributes. The idea is that if the set of data to be forgotten has a very distinct feature distinguishing it from the remaining set, it gives a strong signal for the model weights. However, this feature needs to be correlated with the labels of the set to be forgotten, otherwise the model might not learn anything from this feature.

- **Forgetting measuring.** Even after the data to be forgotten has been unlearned from the model, it is still possible for the model to carry detectable traces of those samples. 

- **Information leakage.** Many machine learning models inherently leak information during the model updating process. Recent works have exploited this phenomenon by comparing the model before and after unlearning to measure the information leakage. 

- **Membership inference attacks.** This kind of attack is designed to detect whether a target model leaks data. Specifically, an inference model is trained to distinguish new data samples from the training data used to optimize the target model.

- **Backdoor attacks.** Backdoor attacks were proposed to inject backdoors into the data in order to deceive a machine learning model. The deceived model makes correct predictions on clean data, but it makes incorrect predictions on poisoned data in a target class that acts as a backdoor trigger.

- **Slow-down attacks.** Some studies focus on the theoretical guarantee of indistinguishability between an unlearned and a retrained model. However, the practical bounds on computation costs are largely neglected in these papers. As a result, a new threat has been introduced to machine unlearning where poisoning attacks are used to slow down the unlearning process.

- **Interclass Confusion Test.** The idea of this test is to investigate whether information from the data to be forgotten can still be inferred from an unlearned model. Different from traditional approximate unlearning definitions that focus on the indistinguishability between unlearned and retrained models in the parameter space, this test focuses on the output space.

- **Federated verification.** Unlearning verification in federated learning is uniquely challenging. First, the participation of one or a few clients in the federation may subtly change the global model’s performance, making verification in the output space challenging. Second, verification using adversarial attacks is not applicable in the federated setting because it might introduce new security threats to the infrastructure.

- **Cryptographic protocol.** Since most existing verification frameworks do not provide any theoretical guarantee, [Eisenhofer et al.](https://arxiv.org/abs/2210.09126) proposed a cryptography-informed approach with verifiable proofs, i.e., proof of update (the model was trained on a particular dataset $D$) and proof of unlearning (the forget item $d$ is not a member of $D$).
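To make the membership inference idea concrete, here is a deliberately simplified audit. Instead of the trained inference model described in the list above, it swaps in a plain loss-threshold rule: a sample is flagged as a training member when the target model's cross-entropy loss on it falls below a threshold. The threshold value, the function names, and the assumption that we can query the model's confidence in the true label are all illustrative.

```python
import math
from typing import Sequence


def cross_entropy(prob_true_label: float) -> float:
    """Loss of a single sample, given the model's confidence in its true label."""
    return -math.log(max(prob_true_label, 1e-12))  # clamp to avoid log(0)


def membership_rate(confidences: Sequence[float], threshold: float) -> float:
    """Fraction of samples flagged as training members (loss below the threshold)."""
    if not confidences:
        raise ValueError("at least one sample is required")
    flagged = sum(cross_entropy(p) < threshold for p in confidences)
    return flagged / len(confidences)


def forgetting_gap(before: Sequence[float], after: Sequence[float], threshold: float) -> float:
    """Drop in the flagged rate on the forget set, from the original to the unlearned model."""
    return membership_rate(before, threshold) - membership_rate(after, threshold)
```

In an audit, `before` and `after` would hold the confidences of the original and the unlearned model on the forgotten samples. A large positive gap suggests that the unlearned model no longer treats those samples as training data, although a threshold rule this simple can give both false positives and false negatives.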

# References

- T. T. Nguyen, T. T. Huynh, P. L. Nguyen, A. W.-C. Liew, H. Yin, and Q. V. H. Nguyen, A Survey of Machine Unlearning. arXiv, 2022. [[Paper](https://arxiv.org/abs/2209.02299)]