<a href="https://www.kaggle.com/code/tamlhp/machine-unlearning-the-right-to-be-forgotten?scriptVersionId=135861266" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Machine Unlearning: The Right to be Forgotten

## Abstract

Today, computer systems hold large amounts of personal data. Yet while
such an abundance of data allows breakthroughs in artificial
intelligence, and especially machine learning, its existence can be a
threat to user privacy, and it can weaken the bonds of trust between
humans and AI. Recent regulations now require that, on request, private
information about a user must be removed from both computer systems and
machine learning models (this legislation is more colloquially
called "the right to be forgotten"). While removing data from back-end
databases should be straightforward, it is not sufficient in the AI
context as machine learning models often 'remember' the old data.
Contemporary adversarial attacks on trained models have proven that we
can learn whether an instance or an attribute belonged to the training
data. This phenomenon calls for a new paradigm, namely *machine
unlearning*, to make machine learning models forget about particular
data. It turns out that recent works on machine unlearning have not been
able to completely solve the problem due to the lack of common
frameworks and resources. Therefore, this paper aspires to present a
comprehensive examination of machine unlearning's concepts, scenarios,
methods, and applications. Specifically, as a categorized collection of
cutting-edge studies, the intention behind this article is to serve as a
comprehensive resource for researchers and practitioners seeking an
introduction to machine unlearning and its formulations, design
criteria, removal requests, algorithms, and applications. In addition,
we aim to highlight the key findings, current trends, and new research
areas that have not yet featured the use of machine unlearning but could
benefit greatly from it. We hope this survey serves as a valuable
resource for machine learning researchers and those seeking to innovate
privacy technologies. Our resources are publicly available at
<https://github.com/tamlhp/awesome-machine-unlearning>.

<div class="figure*">
<figure>
<img src="https://raw.githubusercontent.com/tamlhp/awesome-machine-unlearning/main/framework.png" alt="image" style="max-width: 100%;"/>
<figcaption aria-hidden="true">A Typical Machine Unlearning Process</figcaption>
</figure>
</div>

<h1 id="sec:intro">1. Introduction</h1>
<p>Computer systems today hold large amounts of personal data. Due to
the great advancement in data storage and data transfer technologies,
the amount of data being produced, recorded, and processed has exploded.
For example, four billion YouTube videos are watched every day&#xA0;<a href="#ref-sari2020learning">(Sari et al.
2020)</a>. These online personal data, including digital footprints
made by (or about) netizens, reflect their behaviors, interactions, and
communication patterns in the real world&#xA0;<a href="#ref-nguyen2019debunking">(Thanh Tam Nguyen 2019)</a>. Other
sources of personal data include the digital content that online users
create to express their ideas and opinions, such as product reviews,
blog posts (e.g.&#xA0;Medium), status seeking (e.g.&#xA0;Instagram), and knowledge
sharing (e.g. Wikipedia)&#xA0;<a href="#ref-nguyen2021judo">(Thanh Toan Nguyen et al. 2021)</a>. More
recently, personal data has also expanded to include data from wearable
devices&#xA0;<a href="#ref-ren2022prototype">(Z. Ren,
Nguyen, and Nejdl 2022)</a>. On the one hand, such an abundance of
data has helped to advance artificial intelligence (AI). However, on the
other hand, it threatens the privacy of users and has led to many data
breaches&#xA0;<a href="#ref-cao2015towards">(Y. Cao and
Yang 2015)</a>. For this reason, some users may choose to have their
data completely removed from a system, especially sensitive systems such
as those dealing with finance or healthcare&#xA0;<a href="#ref-ren2022prototype">(Z. Ren, Nguyen, and Nejdl 2022)</a>.
Recent regulations now compel organisations to give users &#x201C;the right to
be forgotten&#x201D;, i.e., the right to have all or part of their data deleted
from a system on request&#xA0;<a href="#ref-dang2021right">(Dang 2021)</a>.</p>
<p>While removing data from back-end databases satisfies the
regulations, doing so is not sufficient in the AI context as machine
learning models often &#x2018;remember&#x2019; the old data. Indeed, in machine
learning systems, often millions, if not billions, of users&#x2019; data have
been processed during the model&#x2019;s training phase. However, unlike humans
who learn general patterns, machine learning models behave more like a
lossy data compression mechanism&#xA0;<a href="#ref-schelter2020amnesia">(Schelter 2020)</a>, and some are
overfit against their training data. The success of deep learning models
in particular has recently been attributed to the compression of
training data&#xA0;<a href="#ref-tishby2000information tishby2015deep">(Tishby et al. 2000;
Tishby and Zaslavsky 2015)</a>. This memorization behaviour can be
further proven by existing works on adversarial attacks&#xA0;<a href="#ref-ren2020generating chang2022example ren2020enhancing">(Z.
Ren, Baird, et al. 2020; Chang et al. 2022; Z. Ren, Han, et al.
2020)</a>, which have shown that it is possible to extract the
private information within some target data from a trained model.
However, we also know that the parameters of a trained model do not tend
to show any clear connection to the data that was used for
training&#xA0;<a href="#ref-shwartz2017opening">(Shwartz-Ziv and Tishby 2017)</a>. As
a result, it can be challenging to remove information corresponding to a
particular data item from a machine learning model. In other words, it
can be difficult to make a machine learning model forget a user&#x2019;s
data.</p>
<p>This challenge of allowing users the possibility and flexibility to
completely delete their data from a machine learning model calls for a
new paradigm, namely <em>machine unlearning</em>&#xA0;<a href="#ref-nguyen2022markov baumhauer2020machine tahiliani2021machine">(Q.
P. Nguyen et al. 2022; Baumhauer, Sch&#xF6;ttle, and Zeppelzauer 2020;
Tahiliani et al. 2021)</a>. Ideally, a machine unlearning mechanism
would remove data from the model without needing to retrain it from
scratch&#xA0;<a href="#ref-nguyen2022markov">(Q. P.
Nguyen et al. 2022)</a>. In this way, a user&#x2019;s right to be forgotten
would be observed and the model owner would be shielded from constant
and expensive retraining exercises.</p>
<p>Researchers have already begun to study aspects of machine
unlearning, such as removing part of the training data and analysing the
subsequent model predictions&#xA0;<a href="#ref-nguyen2022markov thudi2021necessity">(Q. P. Nguyen et al.
2022; Thudi, Jia, et al. 2022)</a>. However, this
problem has not yet been completely solved due to a lack of common frameworks
and resources&#xA0;<a href="#ref-villaronga2018humans veale2018algorithms shintre2019making schelter2020amnesia">(Villaronga,
Kieseberg, and Li 2018; Veale, Binns, and Edwards 2018; Shintre et al.
2019; Schelter 2020)</a>. Hence, to begin building a foundation of
works in this nascent area, we undertook a comprehensive survey of
machine unlearning: its definitions, scenarios, mechanisms, and
applications. Our resources are publicly available at&#xA0;<a href="#fn1"
class="footnote-ref" id="fnref1"
role="doc-noteref"><sup>1</sup></a>.</p>

<h2 id="reasons-for-machine-unlearning">1.1. Reasons for Machine
Unlearning</h2>
<p>There are many reasons why a user may want to delete their data
from a system. We have categorized these into four major groups:
security, privacy, usability, and fidelity. Each reason is discussed in
more detail next.</p>
<p><strong>Security.</strong> Recently, deep learning models have been
shown to be vulnerable to external attacks, especially adversarial
attacks&#xA0;<a href="#ref-ren2020adversarial">(K. Ren
et al. 2020)</a>. In an adversarial attack, the attacker generates
adversarial data that are very similar to the original data to the
extent that a human cannot distinguish between the real and fake data.
This adversarial data is designed to force the deep learning models into
outputting wrong predictions, which frequently results in serious
problems. For example, in healthcare, a wrong prediction could lead to a
wrong diagnosis, an unsuitable treatment, or even death. Hence,
detecting and removing adversarial data is essential for ensuring the
model&#x2019;s security and, once an attack is detected, the model needs to be
able to delete the adversarial data through a machine unlearning
mechanism&#xA0;<a href="#ref-cao2015towards marchant2022hard">(Y. Cao and Yang 2015;
Marchant, Rubinstein, and Alfeld 2022)</a>.</p>
<p><strong>Privacy.</strong> Many privacy-preserving regulations have
been enacted recently that involve &#x201C;the right to be forgotten&#x201D;&#xA0;<a href="#ref-bourtoule2021machine dang2021right">(Bourtoule et al. 2021;
Dang 2021)</a>, such as the European Union&#x2019;s General Data Protection
Regulation (GDPR)&#xA0;<a href="#ref-magdziarczyk2019right">(Mantelero 2013)</a> and the
California Consumer Privacy Act&#xA0;<a href="#ref-pardau2018california">(Pardau 2018)</a>. Under
these regulations, users must be given the right to have their data
and related information deleted to protect their privacy. In part, this
legislation has sprung up as a result of privacy leaks. For example,
cloud systems can leak user data due to multiple copies of data held by
different parties, backup policies, and replication strategies&#xA0;<a href="#ref-singh2017data">(A. Singh and Anand
2017)</a>. In another case, machine learning approaches for genetic
data processing were found to leak patients&#x2019; genetic markers&#xA0;<a href="#ref-fredrikson2014privacy wang2009learning">(Fredrikson et al.
2014; R. Wang et al. 2009)</a>. It is therefore not surprising that
users would want to remove their data to avoid the risks of a data
leak&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang
2015)</a>.</p>
<p><strong>Usability.</strong> People have different preferences in
online applications and/or services, especially recommender systems. An
application will produce inconvenient recommendations if it cannot
completely delete the incorrect data (e.g.,&#xA0;noise, malicious data,
out-of-distribution data) related to a user. For example, a user might
accidentally search for an illegal product on their laptop and find that
they keep getting recommendations for this product on their phone, even after
clearing their web browser history&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang 2015)</a>. Failing
to forget such data degrades usability: it not only produces wrong
predictions, but also results in fewer users.</p>
<p><strong>Fidelity.</strong> Unlearning requests might also arise from bias in
machine learning models. Despite recent advances, machine learning
models are still susceptible to bias, which means their output can unfairly
discriminate against a group of people&#xA0;<a href="#ref-mehrabi2021survey">(Mehrabi et al. 2021)</a>. For
example, COMPAS, the software used by courts to decide parole cases, is
more likely to consider African-American offenders to have higher risk
scores than Caucasians, even though ethnicity information is not part of
the input&#xA0;<a href="#ref-zou2018ai">(Zou and
Schiebinger 2018)</a>. Similar situations have been observed in
a beauty contest judged by AI, which was biased against contestants with
darker skin tones, or facial recognition AI that wrongly recognized
Asian facial features&#xA0;<a href="#ref-feuerriegel2020fair">(Feuerriegel, Dolata, and Schwabe
2020)</a>.</p>
<p>These biases often originate from the data. For example, AI
systems that have been trained on public datasets that contain mostly
white persons, such as ImageNet, are likely to make errors when
processing images of black persons. Similarly, in an application
screening system, inappropriate features, such as the gender or race of
applicants, might be unintentionally learned by the machine learning
model&#xA0;<a href="#ref-dinsdale2021deep dinsdale2020unlearning">(Dinsdale,
Jenkinson, and Namburete 2021; Dinsdale, Jenkinson, et al. 2020)</a>.
As a result, there is a need to unlearn these data, including the
features and affected data items.</p>

<h2 id="challenges-in-machine-unlearning">1.2. Challenges in Machine
Unlearning</h2>
<p>Before we can truly achieve machine unlearning, several challenges to
removing specific parts of the training data need to be overcome. The
challenges are summarized as follows.</p>
<p><strong>Stochasticity of training.</strong> We do not know the impact
of each data point seen during training on the machine learning model
due to the stochastic nature of the training procedure&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a>. Neural networks, for example, are usually trained on
random mini-batches containing a certain number of data samples.
Further, the order of the training batches is also random&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a>. This stochasticity raises difficulties for machine
unlearning as the specific data sample to be removed would need to be
removed from all batches.</p>
<p><strong>Incrementality of training.</strong> A model&#x2019;s training
procedure is an incremental process&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al. 2021)</a>. In
other words, the model update on a given data sample will affect the
model performance on data samples fed into the model after this data. A
model&#x2019;s performance on this given data sample is also affected by prior
data samples. Determining a way to erase the effect of the to-be-removed
training sample on further model performance is a challenge for machine
unlearning.</p>
<p><strong>Catastrophic unlearning.</strong> An unlearned
model usually performs worse than the model retrained on the remaining
data&#xA0;<a href="#ref-nguyen2020variational nguyen2022markov">(Q. P. Nguyen, Low,
and Jaillet 2020; Q. P. Nguyen et al. 2022)</a>. Moreover, the
degradation can be exponential when more data is unlearned. Such sudden
degradation is often referred to as catastrophic unlearning&#xA0;<a href="#ref-nguyen2020variational">(Q. P. Nguyen, Low,
and Jaillet 2020)</a>. While several studies&#xA0;<a href="#ref-du2019lifelong golatkar2020eternal">(Du, Chen, et al. 2019;
Golatkar, Achille, and Soatto 2020a)</a> have explored ways to
mitigate catastrophic unlearning by designing special loss functions,
how to naturally prevent catastrophic unlearning is still an open
question.</p>




<h1 id="sec:framework">2. Unlearning Framework</h1>
<h2 id="unlearning-workflow">2.1. Unlearning Workflow</h2>
<p>The unlearning framework in <a href="#fig:unlearning_workflow"
data-reference-type="autoref"
data-reference="fig:unlearning_workflow">[fig:unlearning_workflow]</a>
presents the typical workflow of a machine learning model in the
presence of a data removal request. In general, a model is trained on
some data and is then used for inference. Upon a removal request, the
data-to-be-forgotten is unlearned from the model. The unlearned model is
then verified against privacy criteria. If these criteria are not
met, i.e., if the model still leaks some information about the forgotten
data, the model is retrained. There are two main components to
this process: the <em>learning component</em> (left) and the
<em>unlearning component</em> (right). The learning component involves
the current data, a learning algorithm, and the current model. In the
beginning, the initial model is trained from the whole dataset using the
learning algorithm. The unlearning component involves an unlearning
algorithm, the unlearned model, optimization requirements, evaluation
metrics, and a verification mechanism. Upon a data removal request, the
current model will be processed by an unlearning algorithm to forget the
corresponding information of that data inside the model. The unlearning
algorithm might take several requirements into account such as
completeness, timeliness, and privacy guarantees. The outcome is an
unlearned model, which will be evaluated against different performance
metrics (e.g., accuracy, ZRF score, anamnesis index). However, to
provide a privacy certificate for the unlearned model, a verification
(or audit) is needed to prove that the model actually forgot the
requested data and that there are no information leaks. This audit might
include a feature injection test, a membership inference attack,
forgetting measurements, etc.</p>
<div class="figure*">
<figure>
<img src="https://raw.githubusercontent.com/tamlhp/awesome-machine-unlearning/main/framework.png" alt="image" style="max-width: 80%;"/>
<figcaption aria-hidden="true">A Typical Machine Unlearning Process</figcaption>
</figure>
</div>
<p>If the unlearned model passes the verification, it becomes the new
model for downstream tasks (e.g., inference, prediction, classification,
recommendation). If the model does not pass verification, the remaining
data, i.e., the original data excluding the data to be forgotten, needs
to be used to retrain the model. Either way, the unlearning component
will be invoked again upon each new removal request.</p>
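<p>The workflow above can be sketched as a small control loop. This is a
minimal illustration rather than a reference implementation: the function
<code>handle_removal_request</code> and the callables <code>unlearn</code>,
<code>verify</code>, and <code>retrain</code> are hypothetical names
introduced here, and the toy &#x201C;model&#x201D; is simply the list of samples it has
memorised.</p>

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Model = TypeVar("Model")
Sample = TypeVar("Sample")


def handle_removal_request(
    model: Model,
    data: Sequence[Sample],
    forget: Sequence[Sample],
    unlearn: Callable[[Model, Sequence[Sample]], Model],
    verify: Callable[[Model, Sequence[Sample]], bool],
    retrain: Callable[[Sequence[Sample]], Model],
) -> Tuple[Model, List[Sample]]:
    """Handle one removal request; return (new model, remaining data)."""
    remaining = [x for x in data if x not in forget]
    candidate = unlearn(model, forget)       # unlearning component
    if verify(candidate, forget):            # audit passed: keep the unlearned model
        return candidate, remaining
    return retrain(remaining), remaining     # audit failed: retrain on remaining data


# Toy stand-ins (assumptions): the "model" is the list of memorised samples.
def audit(model, forget):
    return not any(x in model for x in forget)


def lazy_unlearn(model, forget):
    return model  # forgets nothing, so the audit fails and retraining kicks in


def proper_unlearn(model, forget):
    return [x for x in model if x not in forget]


data = [1, 2, 3, 4]
via_retrain, remaining = handle_removal_request(
    list(data), data, [2], lazy_unlearn, audit, list)
via_unlearn, _ = handle_removal_request(
    list(data), data, [2], proper_unlearn, audit, list)
print(via_retrain, via_unlearn, remaining)
```

<p>In this toy setting both paths end with a model that no longer contains
the forgotten sample; in practice the point of the unlearning branch is to
reach that state far more cheaply than the retraining fallback.</p>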

<h2 id="unlearning-requests">2.2. Unlearning Requests</h2>
<p><strong>Item Removal.</strong> Requests to remove certain
items/samples from the training data are the most common requests in
machine unlearning&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al. 2021)</a>. The
techniques used to unlearn these data are described in detail in <a
href="#sec:algorithms" data-reference-type="autoref"
data-reference="sec:algorithms">[sec:algorithms]</a>.</p>
<p><strong>Feature Removal.</strong> In many scenarios, privacy leaks
might originate not only from a single data item but also from a group of
data with similar features or labels&#xA0;<a href="#ref-warnecke2021machine">(Warnecke et al. 2021)</a>. For
example, a poisoned spam filter might misclassify malicious addresses
that are present in thousands of emails. Thus, unlearning suspicious
emails might not be enough. Similarly, in an application screening system,
inappropriate features, such as the gender or race of applicants, might
need to be unlearned for thousands of affected applications.</p>
<p>In such cases, naively unlearning the affected data items
sequentially is imprudent as repeated retraining is computationally
expensive. Moreover, unlearning too many data items can inherently
reduce the performance of the model, regardless of the unlearning
mechanism used. Thus, there is a need for unlearning data at the feature
or label level with an arbitrary number of data items.</p>
<p>Warnecke et al.&#xA0;<a href="#ref-warnecke2021machine">(Warnecke et al. 2021)</a> proposed
a technique for unlearning a group of training data based on influence
functions. More precisely, the effect of training data on model
parameter updates is estimated and formulated in closed form. As a
result of this formulation, the influences of the learning sets are applied
as a compact update instead of iteratively solving an optimisation problem
(e.g., loss minimization). First-order and second-order derivatives are
the keys to computing this update effectively&#xA0;<a href="#ref-warnecke2021machine">(Warnecke et al. 2021)</a>.</p>
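<p>To make the idea of a closed-form, derivative-based update concrete, the
following sketch removes data points from a one-parameter ridge regression
with a single Newton-style step. It is written in the spirit of
influence-based unlearning and is not the exact algorithm of Warnecke et al.;
the function names, the toy data, and the choice of evaluating the curvature
on the remaining data are assumptions made for illustration. Because the loss
here is quadratic, the step coincides with retraining; for general models
such an update is only approximate.</p>

```python
def fit_ridge_1d(xs, ys, lam=1.0):
    """Minimise sum 0.5*(w*x - y)^2 + 0.5*lam*w^2 over a single weight w."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)


def unlearn_second_order(w, xs, ys, forget_idx, lam=1.0):
    """One Newton-style step that removes the points in forget_idx from w."""
    forget = set(forget_idx)
    # First-order term: gradient of the loss on the points to forget, at w.
    grad_forget = sum(xs[i] * (w * xs[i] - ys[i]) for i in forget)
    # Second-order term: curvature (Hessian) of the loss on the remaining points.
    hess_remain = lam + sum(x * x for i, x in enumerate(xs) if i not in forget)
    # At the optimum the full gradient is zero, so the gradient on the remaining
    # data equals -grad_forget; the Newton step therefore adds grad_forget / H.
    return w + grad_forget / hess_remain


xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 9.0]          # the last point is the one to forget
w_full = fit_ridge_1d(xs, ys)
w_unlearned = unlearn_second_order(w_full, xs, ys, forget_idx=[3])
w_retrained = fit_ridge_1d(xs[:3], ys[:3])
print(w_full, w_unlearned, w_retrained)
```

<p>The update touches only the points to be forgotten and one curvature
term, which is why such methods can be much cheaper than iterating over the
remaining data again.</p>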
<p>Guo et al.&#xA0;<a href="#ref-guo2022efficient">(T.
Guo et al. 2022)</a> proposed another technique to unlearn a feature
in the data based on disentangled representation. The core idea is to
learn the correlation between features from the latent space as well as
the effects of each feature on the output space. Using this information,
certain features can be progressively detached from the learnt model
upon request, while the remaining features are still preserved to
maintain good accuracy. However, this method is mostly applicable to
deep neural networks in the image domain, in which the deeper
convolutional layers become smaller and can therefore identify abstract
features that match real-world data attributes.</p>
<p><strong>Class Removal.</strong> There are many scenarios where the
data to be forgotten belongs to one or more classes of a trained
model. For example, in face recognition applications, each class is a
person&#x2019;s face so there could potentially be thousands or millions of
classes. When a user opts out of the system, their face
information must be removed without using a sample of their face.</p>
<p>Similar to feature removal, class removal is more challenging than
item removal because retraining solutions can incur many unlearning
passes. Even though each pass might only come at a small computational
cost due to data partitioning, the expense mounts up. However,
partitioning data by class itself does not help the model&#x2019;s training in
the first place, as learning the differences between classes is the core
of many learning algorithms&#xA0;<a href="#ref-tanha2020boosting">(Tanha et al. 2020)</a>. Although some
of the above techniques for feature removal can be applied to class
removal&#xA0;<a href="#ref-warnecke2021machine">(Warnecke et al. 2021)</a>, it is
not always the case as class information might be implicit in many
scenarios.</p>
<p>Tarun et al.&#xA0;<a href="#ref-tarun2021fast">(Tarun
et al. 2021)</a> proposed an unlearning method for class removal
based on data augmentation. The basic concept is to introduce noise into
the model such that the classification error is maximized for the target
class(es). The model is updated by training on this noise without the
need to access any samples of the target class(es). Since this impair
step may disturb the model weights and degrade the classification
performance for the remaining classes, a repair step is needed to train
the model for one or a few more epochs on the remaining data. Their
experiments show that the method can be efficient for large-scale
multi-class problems (100 classes). Further, the method worked
especially well with face recognition tasks because the deep neural
networks were originally trained on triplet loss and negative samples so
the difference between the classes was quite significant&#xA0;<a href="#ref-masi2018deep">(Masi et al.
2018)</a>.</p>
<p>Baumhauer et al.&#xA0;<a href="#ref-baumhauer2020machine">(Baumhauer, Sch&#xF6;ttle, and Zeppelzauer
2020)</a> proposed an unlearning method for class removal based on a
linear filtration operator that proportionally shifts the classification
of the samples of the class to be forgotten to other classes. However,
the approach is only applicable to class removal due to the
characteristics of this operator.</p>
<p><strong>Task Removal.</strong> Today, machine learning models are not
only trained for a single task but also for multiple tasks. This
paradigm, also known as continual learning or lifelong learning&#xA0;<a href="#ref-parisi2019continual">(Parisi et al.
2019)</a>, is motivated by the human brain, in which learning
multiple tasks can benefit each other due to their correlations. This
technique is also used to overcome data sparsity or cold-start problems
where there is not enough data to train a single task effectively.</p>
<p>However, in these settings too, there can be a need to remove private
data related to a specific task. For example, consider a robot that is
trained to assist a patient at home during their medical treatment. This
robot may be asked to forget this assistance behaviour after the patient
has recovered&#xA0;<a href="#ref-liu2022continual">(B.
Liu, Liu, et al. 2022)</a>. Consequently, temporarily learning a task
and forgetting it in the future has become a need for lifelong learning
models.</p>
<p>In general, unlearning a task is uniquely challenging as continual
learning might depend on the order of the learned tasks. Therefore,
removing a task might create a catastrophic unlearning effect, where the
overall performance of multiple tasks is degraded in a
domino effect&#xA0;<a href="#ref-liu2022continual">(B.
Liu, Liu, et al. 2022)</a>. Mitigating this problem requires the
model to be aware that the task may be removed in the future.
Liu et al.&#xA0;<a href="#ref-liu2022continual">(B. Liu,
Liu, et al. 2022)</a> explain that this requires users to explicitly
define which tasks will be learned permanently and which tasks will be
learned only temporarily.</p>
<p><strong>Stream Removal.</strong> Handling data streams where a huge
amount of data arrives online requires some mechanisms to retain or
ignore certain data while maintaining limited storage&#xA0;<a href="#ref-nguyen2017retaining">(Tam et al.
2017)</a>. In the context of machine unlearning, however, handling
data streams is more about dealing with a stream of removal
requests.</p>
<p>Gupta et al.&#xA0;<a href="#ref-gupta2021adaptive">(Gupta et al. 2021)</a> proposed a
streaming unlearning setting involving a sequence of data removal
requests. This is motivated by the fact that many users can be involved
in a machine learning system and decide to delete their data
sequentially. Such is also the case when the training data has been
poisoned in an adversarial attack and the data needs to be deleted
gradually to recover the model&#x2019;s performance. These streaming requests
can be either non-adaptive or adaptive. A non-adaptive request means
that the removal sequence does not depend on the intermediate results of
each unlearning request, whereas an adaptive request means that the
data to be removed depends on the current unlearned model. In other
words, after the poisonous data is detected, the model is unlearned
gradually so as to decide which data item is most beneficial to unlearn
next.</p>

<h2 id="design-requirements">2.3. Design Requirements</h2>
<p><strong>Completeness (Consistency).</strong> A good unlearning
algorithm should be complete&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang 2015)</a>, i.e.&#xA0;the
unlearned model and the retrained model make the same predictions about
any possible data sample (whether right or wrong). One way to measure
this consistency is to compute the percentage of identical predictions
on a test set. This requirement can be designed as an
optimization objective in an unlearning definition (<a
href="#sec:exact_unlearning" data-reference-type="autoref"
data-reference="sec:exact_unlearning">[sec:exact_unlearning]</a>) by
formulating the difference between the output space of the two models.
Many works on adversarial attacks can help with this formulation&#xA0;<a href="#ref-sommer2022athena chen2021machine">(Sommer
et al. 2022; M. Chen et al. 2021b)</a>.</p>
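<p>The consistency measure described above, i.e., the share of identical
predictions between the unlearned and the retrained model, can be sketched as
follows. The function name <code>prediction_agreement</code> and the two toy
threshold classifiers are hypothetical and serve only as an illustration.</p>

```python
def prediction_agreement(unlearned_predict, retrained_predict, samples):
    """Fraction of samples on which both models output the same prediction."""
    if not samples:
        raise ValueError("need at least one sample")
    same = sum(unlearned_predict(x) == retrained_predict(x) for x in samples)
    return same / len(samples)


# Toy stand-ins (assumptions): two threshold classifiers on a scalar input.
def unlearned(x):
    return int(x > 0.5)


def retrained(x):
    return int(x > 0.6)


test_set = [0.1, 0.4, 0.55, 0.7, 0.9]
agreement = prediction_agreement(unlearned, retrained, test_set)
print(agreement)  # the two models disagree only on 0.55
```

<p>A value of 1.0 on a held-out set is necessary but not sufficient for
completeness, since the definition asks for agreement on any possible
sample.</p>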
<p><strong>Timeliness.</strong> In general, retraining can fully solve
any unlearning problem. However, retraining is time-consuming,
especially when the distribution of the data to be forgotten is
unknown&#xA0;<a href="#ref-cao2015towards bourtoule2021machine">(Y. Cao and Yang 2015;
Bourtoule et al. 2021)</a>. As a result, there needs to be a
trade-off between completeness and timeliness. Unlearning techniques
that do not use retraining might be inherently incomplete, i.e., they
may lead to some privacy leaks, even though some provable guarantees are
provided for special cases&#xA0;<a href="#ref-GuoGHM20 marchant2022hard neel2021descent">(C. Guo et al.
2020; Marchant, Rubinstein, and Alfeld 2022; Neel, Roth, and
Sharifi-Malvajerdi 2021)</a>. To measure timeliness, we can measure
the speed-up of unlearning over retraining after an unlearning request
is invoked.</p>
<p>It is also worth recognizing the cause of this trade-off between
retraining and unlearning. When there is not much data to be forgotten,
unlearning is generally more beneficial as the effects on model accuracy
are small. However, when there is a lot of data to be forgotten, retraining might
be better, as unlearning many times, even when each update is bounded, may catastrophically
degrade the model&#x2019;s accuracy&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang 2015)</a>.</p>
<p><strong>Accuracy.</strong> An unlearned model should be able to
predict test samples correctly, or at least its accuracy should be
comparable to that of the retrained model. However, as retraining is
computationally costly, retrained models are not always available for
comparison. To address this issue, the accuracy of the unlearned model
is often measured on a new test set, or it is compared with that of the
original model before unlearning&#xA0;<a href="#ref-he2021deepobliviate">(He et al. 2021)</a>.</p>
<p><strong>Light-weight.</strong> To prepare for the unlearning process,
many techniques need to store model checkpoints, historical model
updates, training data, and other temporary data&#xA0;<a href="#ref-he2021deepobliviate bourtoule2021machine liu2020federated">(He
et al. 2021; Bourtoule et al. 2021; G. Liu et al. 2020)</a>. A good
unlearning algorithm should be light-weight and scale with big data. Any
other computational overhead besides unlearning time and storage cost
should be reduced as well&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al. 2021)</a>.</p>
<p><strong>Provable guarantees.</strong> With the exception of
retraining, any unlearning process might be inherently approximate. It
is therefore desirable for an unlearning method to provide a provable guarantee on
the unlearned model. To this end, many works have designed unlearning
techniques with bounded approximations on retraining&#xA0;<a href="#ref-GuoGHM20 marchant2022hard neel2021descent">(C. Guo et al.
2020; Marchant, Rubinstein, and Alfeld 2022; Neel, Roth, and
Sharifi-Malvajerdi 2021)</a>. Nonetheless, these approaches are
founded on the premise that models with comparable parameters will have
comparable accuracy.</p>
<p><strong>Model-agnostic.</strong> An unlearning process should be
generic for different learning algorithms and machine learning
models&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al. 2021)</a>,
ideally with provable guarantees as well. However, as machine
learning models differ and are trained with different learning algorithms,
designing a model-agnostic unlearning framework could be
challenging.</p>
<p><strong>Verifiability.</strong> Beyond unlearning requests, another
demand by users is to verify that the unlearned model now protects their
privacy. To this end, a good unlearning framework should provide
end-users with a verification mechanism. For example, backdoor attacks
can be used to verify unlearning by injecting backdoor samples into the
training data&#xA0;<a href="#ref-sommer2020towards">(Sommer et al. 2020)</a>. If the
backdoor can be detected in the original model but not in the
unlearned model, then verification is considered to be a success.
However, such verification might be too intrusive for a trustworthy
machine learning system and the verification might still introduce false
positives due to the inherent uncertainty in backdoor detection.</p>

<h2 id="unlearning-verification">2.4. Unlearning Verification</h2>
<p>The goal of unlearning verification methods is to certify that one
cannot easily distinguish between the unlearned models and their
retrained counterparts&#xA0;<a href="#ref-thudi2021necessity">(Thudi, Jia, et al. 2022)</a>. While
the evaluation metrics (<a href="#sec:metrics"
data-reference-type="autoref"
data-reference="sec:metrics">[sec:metrics]</a>) are theoretical criteria
for machine unlearning, unlearning verification can act as a certificate
for an unlearned model. Verification methods also include best practices for validating
the unlearned models efficiently.</p>
<p>It is noteworthy that while unlearning metrics (in <a
href="#sec:formulation" data-reference-type="autoref"
data-reference="sec:formulation">[sec:formulation]</a>) and verification
metrics share some overlaps, the big difference is that the former can
be used for optimization or to provide a bounded guarantee, while the
latter is used for evaluation only.</p>
<p><strong>Feature Injection Test.</strong> The goal of this test is to
verify whether the unlearned model has adjusted the weights
corresponding to the removed data samples based on data
features/attributes&#xA0;<a href="#ref-izzo2021approximate">(Izzo et al. 2021)</a>. The idea is
that if the set of data to be forgotten has a very distinct feature
distinguishing it from the remaining set, it gives a strong signal for
the model weights. However, this feature needs to be correlated with the
labels of the set to be forgotten, otherwise the model might not learn
anything from this feature.</p>
<p>More precisely, an extra feature is added for each data item such
that it is equal to zero for the remaining set and is perfectly
correlated with the labels of the set to forget. Izzo et al.&#xA0;<a href="#ref-izzo2021approximate">(Izzo et al.
2021)</a> applied this idea with linear classifiers, where the weight
associated with this extra feature is expected to be significantly
different from zero after training. After the model is unlearned, this
weight is expected to become zero. As a result, the difference of this
weight can be plotted before and after unlearning as a measure of
effectiveness of the unlearning process.</p>
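<p>The test described above can be sketched in a few lines. This is a
minimal illustration under stated assumptions, not the implementation of
Izzo et al.: the model is a closed-form ridge regression, the labels are
in {-1, +1}, the injected feature equals the label on the forget set, and
the unlearning method is stood in for by exact retraining on the
remaining data. The names <code>fit_ridge</code>,
<code>feature_injection_test</code>, and <code>retrain_unlearn</code>
are hypothetical.</p>

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression: w = (X^T X + lam * I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def retrain_unlearn(X, y, forget_mask, w_before, lam=1e-3):
    """Stand-in for any unlearning method: exact retraining on the remaining data."""
    keep = ~forget_mask
    return fit_ridge(X[keep], y[keep], lam)

def feature_injection_test(X, y, forget_mask, unlearn, lam=1e-3):
    """Return the weight of the injected feature before and after unlearning."""
    # Injected feature: zero on the remaining set, equal to the label on the forget set.
    injected = np.where(forget_mask, y, 0.0).reshape(-1, 1)
    X_inj = np.hstack([X, injected])
    w_before = fit_ridge(X_inj, y, lam)
    w_after = unlearn(X_inj, y, forget_mask, w_before)
    return w_before[-1], w_after[-1]

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))  # labels in {-1, +1}
forget_mask = np.zeros(200, dtype=bool)
forget_mask[:20] = True  # the first 20 samples form the forget set

w_inj_before, w_inj_after = feature_injection_test(X, y, forget_mask, retrain_unlearn)
print(f"injected-feature weight before: {w_inj_before:.3f}, after: {w_inj_after:.3f}")
```

<p>With an approximate unlearning method passed as <code>unlearn</code>,
the gap between the weight after unlearning and zero gives a rough
indication of how much of the forget set's influence remains.</p>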
<p>One limitation of this verification method is that the current
solution is only applicable for linear and logistic models&#xA0;<a href="#ref-izzo2021approximate">(Izzo et al.
2021)</a>. This is because these models have explicit weights
associated with the injected feature, whereas, for other models such as
deep neural networks, injecting such a feature as a strong signal is
non-trivial, even when the set to be forgotten is small. Another
limitation of this type of method is that an injected version of the
data needs to be created so that the model can be trained on it (either from
scratch or incrementally, depending on the type of the model).</p>
<p><strong>Forgetting Measuring.</strong> Even after the data to be
forgotten has been unlearned from the model, it is still possible for
the model to carry detectable traces of those samples&#xA0;<a href="#ref-jagielski2022measuring">(Jagielski et al.
2022)</a>. Jagielski et al.&#xA0;<a href="#ref-jagielski2022measuring">(Jagielski et al. 2022)</a>
proposed a formal way to measure the forgetfulness of a model via
privacy attacks. More precisely, a model is said to <span
class="math inline"><em>&#x3B1;</em></span>-forget a training sample if a
privacy attack (e.g., membership inference) on that sample achieves a
success rate no greater than <span class="math inline"><em>&#x3B1;</em></span>.
This definition is more flexible than differential privacy because a
training algorithm is differentially private only if it immediately
forgets every sample it learns. As a result, this definition allows a
sample to be temporarily learned, and measures how long until it is
forgotten by the model.</p>
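<p>As a rough sketch of how this definition could be checked
empirically, the snippet below estimates the attack success rate on one
sample over repeated trials and reports the first training step at which
the rate drops to the threshold. The function names and all numbers are
hypothetical, and the attack itself is assumed to be run elsewhere; only
its recorded outcomes are used here.</p>

```python
import numpy as np

def attack_success_rate(attack_correct):
    """Fraction of independent trials in which the attack identified the sample."""
    return float(np.mean(np.asarray(attack_correct, dtype=float)))

def alpha_forgets(attack_correct, alpha):
    """True if the estimated attack success rate is no greater than alpha."""
    return attack_success_rate(attack_correct) <= alpha

def steps_until_forgotten(success_by_step, alpha):
    """First step (after the sample was last used) at which the rate is <= alpha."""
    for step, rate in enumerate(success_by_step):
        if rate <= alpha:
            return step
    return None  # not forgotten within the observed window

# Hypothetical attack outcomes on one sample across 8 retraining trials.
trials = [True, False, True, False, False, True, False, False]
# Hypothetical success rates measured 0, 1, 2, ... steps after the sample was last seen.
success_by_step = [0.95, 0.80, 0.62, 0.55, 0.50]

print(attack_success_rate(trials))                  # 0.375
print(steps_until_forgotten(success_by_step, 0.6))  # 3
```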
<p><strong>Information Leakage.</strong> Many machine learning models
inherently leak information during the model updating process&#xA0;<a href="#ref-chen2021machine">(M. Chen et al.
2021b)</a>. Recent works have exploited this phenomenon by comparing
the model before and after unlearning to measure the information
leakage. More precisely, Salem et al.&#xA0;<a href="#ref-salem2020updates">(Salem et al. 2020)</a> proposed an
adversarial attack in the image domain that can reconstruct a removed
sample once a classifier has unlearned it. Zanella-B&#xE9;guelin et
al.&#xA0;<a href="#ref-zanella2020analyzing">(Zanella-B&#xE9;guelin et al. 2020)</a>
suggested a similar approach for the text domain. Chen et al.&#xA0;<a href="#ref-chen2021machine">(M. Chen et al.
2021b)</a> introduced a membership inference attack to detect whether
a removed sample belongs to the learning set. Compared to previous
works&#xA0;<a href="#ref-Salem0HBF019 shokri2017membership">(Salem et al. 2019;
Shokri et al. 2017)</a>, their approach additionally makes use of the
posterior output distribution of the original model, besides that of the
unlearned model. Chen et al.&#xA0;<a href="#ref-chen2021machine">(M. Chen et al. 2021b)</a> also proposed
two leakage metrics, namely the degradation count and the degradation
rate.</p>
<div class="compactitem">
<p>The <em>degradation count</em> is defined as the ratio between the
number of target samples whose membership can be inferred by the
proposed attack with higher confidence than by traditional attacks
and the total number of samples.</p>
<p>The <em>degradation rate</em> is defined as the average improvement
rate of the confidence of the proposed attack compared to traditional
attacks.</p>
</div>
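<p>The two metrics can be computed from per-sample attack confidences,
as in the sketch below. It assumes that each attack outputs a membership
confidence in [0, 1] per target sample and that "improvement" means the
absolute difference in confidence; the original paper may normalize
differently. The function names are hypothetical.</p>

```python
import numpy as np

def degradation_count(conf_proposed, conf_baseline):
    """Share of target samples inferred with higher confidence by the proposed attack."""
    conf_proposed = np.asarray(conf_proposed, dtype=float)
    conf_baseline = np.asarray(conf_baseline, dtype=float)
    return float(np.mean(conf_proposed > conf_baseline))

def degradation_rate(conf_proposed, conf_baseline):
    """Average confidence improvement of the proposed attack over the baseline."""
    conf_proposed = np.asarray(conf_proposed, dtype=float)
    conf_baseline = np.asarray(conf_baseline, dtype=float)
    return float(np.mean(conf_proposed - conf_baseline))

# Hypothetical confidences for four target samples.
proposed = [0.9, 0.6, 0.8, 0.4]
baseline = [0.7, 0.6, 0.5, 0.5]

print(degradation_count(proposed, baseline))  # 2 of 4 samples improved
print(degradation_rate(proposed, baseline))   # mean of the differences
```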
<p><strong>Membership Inference Attacks.</strong> This kind of attack is
designed to detect whether a target model leaks data&#xA0;<a href="#ref-shokri2017membership thudi2022bounding chen2021machine">(Shokri
et al. 2017; Thudi, Shumailov, et al. 2022; M. Chen et al.
2021b)</a>. Specifically, an inference model is trained to distinguish
new data samples from the training data used to optimize the target
model. In&#xA0;<a href="#ref-shokri2017membership">(Shokri et al. 2017)</a>, a set of
shadow models was trained on a new set of data items different from
the one that the target model was trained on. The attack model was then
trained to predict whether a data item belonged to the training data
based on the predictions made by the shadow models for training as well as
testing data. The training sets for the shadow and attack models follow a
data distribution similar to that of the target model. Membership inference
attacks are helpful for detecting data leaks. Hence, they are useful for
verifying the effectiveness of the machine unlearning&#xA0;<a href="#ref-chen2021machine">(M. Chen et al.
2021b)</a>.</p>
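<p>A heavily simplified version of this pipeline is sketched below.
Instead of a learned attack model, it fits a single threshold on the
top-class confidence of auxiliary models trained on separate data
(called shadow models in Shokri et al.), and then applies that threshold
to the target model's outputs. This swapped-in threshold attack is an
assumption made for brevity, and the confidences are hypothetical
numbers rather than outputs of real models.</p>

```python
import numpy as np

def fit_threshold(aux_conf, aux_is_member):
    """Pick the confidence threshold that best separates known members from non-members."""
    aux_conf = np.asarray(aux_conf, dtype=float)
    aux_is_member = np.asarray(aux_is_member, dtype=bool)
    candidates = np.unique(aux_conf)
    accuracies = [np.mean((aux_conf >= t) == aux_is_member) for t in candidates]
    return float(candidates[int(np.argmax(accuracies))])

def infer_membership(target_conf, threshold):
    """Guess 'member' whenever the target model is at least as confident as the threshold."""
    return np.asarray(target_conf, dtype=float) >= threshold

# Hypothetical confidences of the auxiliary models on data with known membership.
aux_conf = [0.99, 0.95, 0.90, 0.60, 0.55, 0.70]
aux_is_member = [True, True, True, False, False, False]
threshold = fit_threshold(aux_conf, aux_is_member)

# Hypothetical confidences of the unlearned target model on two forget-set samples.
guesses = infer_membership([0.97, 0.50], threshold)
print(threshold, guesses.tolist())
```

<p>For unlearning verification, forget-set samples that are still
guessed to be members after unlearning are a sign that their influence
may not have been fully removed.</p>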
<p><strong>Backdoor attacks.</strong> Backdoor attacks were proposed to
inject backdoors to the data for deceiving a machine learning
model&#xA0;<a href="#ref-wang2019neural">(B. Wang et al.
2019)</a>. The deceived model makes correct predictions on clean
data, but when an input carries the backdoor trigger, it predicts the
attacker's target class. Backdoor attacks were used to verify the
effectiveness of machine unlearning in&#xA0;<a href="#ref-sommer2020towards sommer2022athena">(Sommer et al. 2020,
2022)</a>. Specifically, the setting begins with training a model
on a mixture of clean and poisoned data items across all users. Some
of the users want their data deleted. If the users&#x2019; data are not
successfully deleted, the poison samples will be predicted as the target
class. Otherwise, the model will not predict the poison samples as the
target class. However, there is no absolute guarantee that this rule is
always correct, although one can increase the number of poison samples
to make this rule less likely to fail.</p>
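<p>The decision rule above can be sketched as follows. The two "models"
are toy stand-in functions, the trigger is assumed to be the last input
feature, and the acceptance threshold <code>max_rate</code> is a
hypothetical parameter; none of these come from Sommer et al.</p>

```python
import numpy as np

def backdoor_success_rate(predict, poisoned_inputs, target_class):
    """Fraction of poisoned samples that the model assigns to the target class."""
    predictions = np.asarray(predict(poisoned_inputs))
    return float(np.mean(predictions == target_class))

def deletion_verified(predict, poisoned_inputs, target_class, max_rate=0.2):
    """Accept the deletion if few poisoned samples still hit the target class."""
    return backdoor_success_rate(predict, poisoned_inputs, target_class) <= max_rate

TARGET = 7

def original_model(x):
    # Toy stand-in: predicts TARGET whenever the trigger (last feature) is set.
    return np.where(x[:, -1] == 1, TARGET, 0)

def unlearned_model(x):
    # Toy stand-in: the backdoor is gone, so the trigger is ignored.
    return np.zeros(len(x), dtype=int)

# A user's poisoned samples: arbitrary content plus the trigger in the last column.
poisoned = np.array([[0.2, 1], [0.9, 1], [0.5, 1]])

rate_before = backdoor_success_rate(original_model, poisoned, TARGET)
rate_after = backdoor_success_rate(unlearned_model, poisoned, TARGET)
print(rate_before, rate_after, deletion_verified(unlearned_model, poisoned, TARGET))
```

<p>Using more poisoned samples reduces the variance of the estimated
rate, which is why increasing their number makes the rule less likely to
fail.</p>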
<p><strong>Slow-down attacks.</strong> Some studies focus on the
theoretical guarantee of indistinguishability between an unlearned and a
retrained model. However, practical bounds on computation costs are
largely neglected in these papers&#xA0;<a href="#ref-marchant2022hard">(Marchant, Rubinstein, and Alfeld
2022)</a>. As a result, a new threat has been introduced to machine
unlearning where poisoning attacks are used to slow down the unlearning
process. Formally, let <span
class="math inline"><em>h</em><sub>0</sub>&#x2004;=&#x2004;<em>A</em>(<em>D</em>)</span>
be an initial model trained by a learning algorithm <span
class="math inline"><em>A</em></span> on a dataset <span
class="math inline"><em>D</em></span>. The goal of the attacker is to
poison a subset <span
class="math inline"><em>D</em><sub><em>p</em><em>o</em><em>i</em><em>s</em><em>o</em><em>n</em></sub>&#x2004;&#x2282;&#x2004;<em>D</em></span>
so as to maximize the computation cost of removing <span
class="math inline"><em>D</em><sub><em>p</em><em>o</em><em>i</em><em>s</em><em>o</em><em>n</em></sub></span>
from <span class="math inline"><em>h</em><sub>0</sub></span> using an unlearning
algorithm <span class="math inline"><em>U</em></span>. Marchant et al.
&#xA0;<a href="#ref-marchant2022hard">(Marchant,
Rubinstein, and Alfeld 2022)</a> defined and estimated an efficient
computation cost for certifying removal methods. However, generalizing
this computation cost for different unlearning methods is still an open
research direction.</p>
<p><strong>Interclass Confusion Test.</strong> The idea of this test is
to investigate whether information from the data to be forgotten can
still be inferred from an unlearned model&#xA0;<a href="#ref-goel2022evaluating">(Goel, Prabhu, and Kumaraguru
2022)</a>. Different from traditional approximate unlearning
definitions that focus on the indistinguishability between unlearned and
retrained models in the parameter space, this test focuses on the output
space. More precisely, the test involves randomly selecting a set of
samples <span class="math inline"><em>S</em>&#x2004;&#x2282;&#x2004;<em>D</em></span> from
two chosen classes in the training data <span
class="math inline"><em>D</em></span> and then randomly swapping the
label assignment between the samples of different classes to result in a
confused set <span class="math inline"><em>S</em>&#x2032;</span>. Together
<span class="math inline"><em>S</em>&#x2032;</span> and <span
class="math inline"><em>D</em>&#x2005;\&#x2005;<em>S</em></span> form a new training
dataset <span class="math inline"><em>D</em>&#x2032;</span>, resulting in a new
trained model. <span class="math inline"><em>S</em>&#x2032;</span> is
considered to be the forgotten data. From this, Goel et al.&#xA0;<a href="#ref-goel2022evaluating">(Goel, Prabhu, and
Kumaraguru 2022)</a> compute a forgetting score from a confusion
matrix generated by the unlearned model. A lower forgetting score means
a better unlearned model.</p>
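<p>The construction of the confused set and a simple score can be
sketched as below. Two simplifications are assumed: the labels of the
selected samples are flipped to the other class, and the score is the
number of cross-predictions between the two classes read off the
confusion matrix. The exact score in Goel et al. may be defined
differently, and the function names are hypothetical.</p>

```python
import numpy as np

def make_confused_set(labels, class_a, class_b, n_confuse, rng):
    """Flip labels between class_a and class_b for n_confuse randomly chosen samples."""
    labels = np.asarray(labels)
    candidates = np.flatnonzero((labels == class_a) | (labels == class_b))
    chosen = rng.choice(candidates, size=n_confuse, replace=False)
    confused = labels.copy()
    confused[chosen] = np.where(labels[chosen] == class_a, class_b, class_a)
    return confused, chosen

def confusion_score(true_labels, predictions, class_a, class_b):
    """Count class_a samples predicted as class_b plus class_b samples predicted as class_a."""
    true_labels = np.asarray(true_labels)
    predictions = np.asarray(predictions)
    a_as_b = np.sum((true_labels == class_a) & (predictions == class_b))
    b_as_a = np.sum((true_labels == class_b) & (predictions == class_a))
    return int(a_as_b + b_as_a)

rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2])
confused, chosen = make_confused_set(labels, 0, 1, 4, rng)

# Hypothetical predictions of an unlearned model on four samples with true labels 0, 0, 1, 1.
score = confusion_score([0, 0, 1, 1], [0, 1, 1, 0], 0, 1)
print(confused.tolist(), score)
```

<p>The model is trained on the dataset with the confused labels, the
chosen samples are then unlearned, and the score is evaluated with the
true labels, so residual confusion between the two classes indicates
incomplete forgetting.</p>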
<p><strong>Federated verification.</strong> Unlearning verification in
federated learning is uniquely challenging. First, the participation of
one or a few clients in the federation may subtly change the global
model&#x2019;s performance, making verification in the output space
challenging. Second, verification using adversarial attacks is not
applicable in the federated setting because it might introduce new
security threats to the infrastructure&#xA0;<a href="#ref-gao2022verifi">(X. Gao et al. 2022)</a>. As a result, Gao
et al.&#xA0;<a href="#ref-gao2022verifi">(X. Gao et al.
2022)</a> proposed a verification mechanism that uses a few
communication rounds for clients to verify their data in the global
model. This approach is compatible with federated settings because
federated training already proceeds through several rounds of
communication between the clients and the server.</p>
<p><strong>Cryptographic proofs.</strong> Since most existing
verification frameworks do not provide any theoretical guarantee,
Eisenhofer et al.&#xA0;<a href="#ref-eisenhofer2022verifiable">(Eisenhofer et al. 2022)</a>
proposed a cryptography-informed protocol to compute two proofs,
i.e.&#xA0;proof of update (the model was trained on a particular dataset
<span class="math inline"><em>D</em></span>) and proof of unlearning
(the forget item <span class="math inline"><em>d</em></span> is not a
member of <span class="math inline"><em>D</em></span>). The core idea of
the proof of update is to use a SNARK&#xA0;<a href="#ref-bitansky2012extractable">(Bitansky et al. 2012)</a> data
structure to commit a hash whenever the model is updated (learned or
unlearned) while ensuring that: (i) the model was obtained from the
remaining data, (ii) the remaining data does not contain any forget
items, (iii) the previous forget set is a subset of the current forget
set, and (iv) the forget items are never re-added into the training
data. The core idea of the proof of unlearning is to use a Merkle tree
to maintain the order of data items in the training data so that an
unlearned item cannot be added to the training data again. While the
approach is demonstrated on SISA (efficient retraining)&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a>, it is applicable to any unlearning method.</p>
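<p>To give a feel for the commitment building block only, the sketch
below computes a Merkle root over an ordered list of data items with
SHA-256. It is not the protocol of Eisenhofer et al. and contains no
SNARK; the leaf and node prefixes and the rule of duplicating the last
node on odd levels are common conventions assumed here, and
<code>merkle_root</code> is a hypothetical helper.</p>

```python
import hashlib

def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(items):
    """Root hash of a Merkle tree over an ordered list of byte strings."""
    if not items:
        return _hash(b"")
    # Leaf hashes; the prefixes keep leaves and inner nodes distinguishable.
    level = [_hash(b"\x00" + item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [_hash(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

data = [b"alice", b"bob", b"carol"]
root_full = merkle_root(data)
root_without_bob = merkle_root([b"alice", b"carol"])
root_reordered = merkle_root([b"bob", b"alice", b"carol"])

print(root_full.hex())
print(root_full != root_without_bob, root_full != root_reordered)
```

<p>Because the root changes whenever an item is removed, added, or
reordered, publishing it commits the server to one specific ordered
dataset, which is the property the proofs build on.</p>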



<h1 id="sec:problem">3. Unlearning Definition</h1>
<p>While the application of machine unlearning can originate from
security, usability, fidelity, and privacy reasons, it is often
formulated as a privacy-preserving problem where users can ask for the
removal of their data from computer systems and machine learning
models&#xA0;<a href="#ref-sekhari2021remember ginart2019making bourtoule2021machine garg2020formalizing">(Sekhari
et al. 2021; Ginart et al. 2019; Bourtoule et al. 2021; Garg,
Goldwasser, and Vasudevan 2020)</a>. The forgetting request can be
motivated by security and usability reasons as well. For example, the
models can be attacked by adversarial data and produce wrong outputs.
Once these types of attacks are detected, the corresponding adversarial
data has to be removed as well without harming the model&#x2019;s predictive
performance.</p>
<p>When fulfilling a removal request, the computer system needs to
remove all of the user&#x2019;s data and &#x2018;forget&#x2019; its influence on the models that
were trained on it. As removing data from a database is
considered trivial, the literature mostly concerns how to unlearn data
from a model&#xA0;<a href="#ref-GuoGHM20 izzo2021approximate neel2021descent ullah2021machine">(C.
Guo et al. 2020; Izzo et al. 2021; Neel, Roth, and Sharifi-Malvajerdi
2021; Ullah et al. 2021)</a>.</p>
<p>To properly formulate an unlearning problem, we need to introduce a
few concepts. First, let us denote <span class="math inline">&#x1D4B5;</span> as
an example space, i.e., a space of data items or examples (called
samples). Then, the set of all possible training datasets is denoted as
<span class="math inline">&#x1D4B5;<sup>*</sup></span>. One can argue that <span
class="math inline">&#x1D4B5;<sup>*</sup>&#x2004;=&#x2004;2<sup>&#x1D4B5;</sup></span> but that is not
important, as a particular training dataset <span
class="math inline"><em>D</em>&#x2004;&#x2208;&#x2004;&#x1D4B5;<sup>*</sup></span> is often
given as input. Given <span class="math inline"><em>D</em></span>, we
want to get a machine learning model from a hypothesis space <span
class="math inline">&#x210B;</span>. In general, the hypothesis space <span
class="math inline">&#x210B;</span> covers the parameters and the meta-data of
the models. Sometimes, it is modeled as <span
class="math inline">&#x1D4B2;&#x2005;&#xD7;&#x2005;<em>&#x398;</em></span>, where <span
class="math inline">&#x1D4B2;</span> is the parameter space and <span
class="math inline"><em>&#x398;</em></span> is the metadata/state space. The
process of training a model on <span
class="math inline"><em>D</em></span> in the given computer system is
enabled by a learning algorithm, denoted by a function <span
class="math inline"><em>A</em>&#x2004;:&#x2004;&#x1D4B5;<sup>*</sup>&#x2004;&#x2192;&#x2004;&#x210B;</span>, with the
trained model denoted as <span
class="math inline"><em>A</em>(<em>D</em>)</span>.</p>
<p>To support forgetting requests, the computer system needs to have an
unlearning mechanism, denoted by a function <span
class="math inline"><em>U</em></span>, that takes as input a training
dataset <span
class="math inline"><em>D</em>&#x2004;&#x2208;&#x2004;&#x1D4B5;<sup>*</sup></span>, a forget
set <span
class="math inline"><em>D</em><sub><em>f</em></sub>&#x2004;&#x2282;&#x2004;<em>D</em></span>
(data to forget) and a model <span
class="math inline"><em>A</em>(<em>D</em>)</span>. It returns a
sanitized (or unlearned) model <span
class="math inline"><em>U</em>(<em>D</em>,<em>D</em><sub><em>f</em></sub>,<em>A</em>(<em>D</em>))&#x2004;&#x2208;&#x2004;&#x210B;</span>.
The unlearned model is expected to be the same or similar to a retrained
model <span
class="math inline"><em>A</em>(<em>D</em>\<em>D</em><sub><em>f</em></sub>)</span>
(i.e., a model as if it had been trained on the remaining data). Note
that <span class="math inline"><em>A</em></span> and <span
class="math inline"><em>U</em></span> are assumed to be randomized
algorithms, i.e., the output is non-deterministic and can be modelled as
a conditional probability distribution over the hypothesis space given
the input data&#xA0;<a href="#ref-marchant2022hard">(Marchant, Rubinstein, and Alfeld
2022)</a>. This assumption is reasonable as many learning algorithms
are inherently stochastic (e.g., SGD) and some floating-point operations
involve randomness in computer implementations&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al. 2021)</a>.
Note also that we do not define the function <span
class="math inline"><em>U</em></span> precisely beforehand, as its
definition varies across settings.</p>
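<p>The interfaces of <span class="math inline"><em>A</em></span> and
<span class="math inline"><em>U</em></span> can be made concrete with a
toy example. In this sketch, which is an illustration rather than a
method from the literature, the "model" is the mean of the dataset
(the parameter) together with the sample count (the metadata), and both
functions are deterministic for simplicity, whereas the formulation
above allows them to be randomized. <code>U_retrain</code> and
<code>U_decremental</code> are hypothetical names.</p>

```python
def A(D):
    """Learning algorithm: returns (parameter, metadata) = (mean, count)."""
    return (sum(D) / len(D), len(D))

def U_retrain(D, D_f, model):
    """Naive unlearning: ignore the old model and retrain on D \\ D_f."""
    return A(D - D_f)

def U_decremental(D, D_f, model):
    """Unlearning that uses only the forget set and the state stored in the model."""
    mean, count = model
    remaining = count - len(D_f)
    return ((mean * count - sum(D_f)) / remaining, remaining)

D = frozenset({1, 2, 3, 4, 10})   # training dataset
D_f = frozenset({10})             # forget set, a subset of D
model = A(D)

retrained = U_retrain(D, D_f, model)
unlearned = U_decremental(D, D_f, model)
print(model, retrained, unlearned)
```

<p>Here the unlearned model coincides exactly with the retrained one.
For most learning algorithms no such closed-form update exists, which is
why the unlearned model is only required to be the same as or similar to
<span class="math inline"><em>A</em>(<em>D</em>\<em>D</em><sub><em>f</em></sub>)</span>.</p>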



<h1 id="sec:algorithms">4. Unlearning Algorithms</h1>
<p>As mentioned in Section&#xA0;<a href="#sec:intro"
data-reference-type="ref" data-reference="sec:intro">1</a>, machine
unlearning can remove data and data linkages without retraining the
machine learning model from scratch, saving time and computational
resources&#xA0;<a href="#ref-wang2022federated chen2021machinegan">(J. Wang, Guo, et al.
2022; K. Chen, Huang, et al. 2021)</a>. The specific approaches of
machine unlearning can be categorized into model-agnostic,
model-intrinsic, and data-driven approaches.</p>
<div class="table*">
<div class="adjustbox">
<div class="threeparttable">
<table>
<tbody>
<tr class="odd">
<td style="text-align: left;">
<strong>Unlearning Methods</strong>
</td>
<td colspan="5" style="text-align: center;">
<strong>Unlearning Scenarios</strong>
</td>
<td colspan="6" style="text-align: center;">
<strong>Design Requirements</strong>
</td>
<td colspan="5" style="text-align: center;">
<strong>Unlearning Requests</strong>
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
<td style="text-align: center;">
<div class="sideways">

</div>
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
<span><strong>Model-agnostic</strong></span>
</td>
<td colspan="9" style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
Differential privacy&#xA0;<span class="citation"
data-cites="gupta2021adaptive"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Certified removal&#xA0;<span class="citation"
data-cites="GuoGHM20 golatkar2020eternal neel2021descent ullah2021machine"></span>
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
Statistical query learning&#xA0;<span class="citation"
data-cites="cao2015towards"></span>
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Decremental learning&#xA0;<span class="citation"
data-cites="ginart2019making chen2019novel"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
Knowledge adaptation&#xA0;<span class="citation"
data-cites="chundawat2022can"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Parameter sampling&#xA0;<span class="citation"
data-cites="nguyen2022markov"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
<span><strong>Model-intrinsic</strong></span>
</td>
<td colspan="9" style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Softmax classifiers&#xA0;<span class="citation"
data-cites="baumhauer2020machine"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
Linear models&#xA0;<span class="citation"
data-cites="izzo2021approximate li2020online"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Tree-based models&#xA0;<span class="citation"
data-cites="schelter2021hedgecut"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
Bayesian models&#xA0;<span class="citation"
data-cites="nguyen2020variational"></span>
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
DNN-based models&#xA0;<span class="citation"
data-cites="he2021deepobliviate goyal2021revisiting mehta2022deep golatkar2021mixed golatkar2020forgetting basu2021influence zhang2022machine"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
<span><strong>Data-driven</strong></span>
</td>
<td colspan="9" style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Data partition&#xA0;<span class="citation"
data-cites="bourtoule2021machine aldaghri2021coded"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
</tr>
<tr class="even">
<td style="text-align: left;">
Data augmentation&#xA0;<span class="citation"
data-cites="huang2021unlearnable shan2020protecting tarun2021fast yu2021does"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
<tr class="odd">
<td style="text-align: left;">
Data influence&#xA0;<span class="citation"
data-cites="peste2021ssse zeng2021learning cao2022machine"></span>
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
&#x2013;
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
<td style="text-align: center;">
</td>
</tr>
</tbody>
</table>
<div class="tablenotes">
<dl>
<dt>: fully support</dt>
<dd>
<p>no support</p>
</dd>
<dt>&#x2013;: partially or indirectly support</dt>
<dd>
<p>representative citations</p>
</dd>
</dl>
</div>
</div>
</div>
</div>

<h2 id="sec:model-agnostic">4.1. Model-Agnostic Approaches</h2>



<p>Model-agnostic machine unlearning methodologies include unlearning
processes or frameworks that are applicable to different models.
However, in some cases, theoretical guarantees are only provided for a
class of models (e.g., linear models). Nonetheless, they are still
considered to be model-agnostic as their core ideas are applicable to
complex models (e.g.&#xA0;deep neural networks) with practical results.</p>

<div class="figure*">
<figure>
<img src="https://raw.githubusercontent.com/tamlhp/awesome-machine-unlearning/main/figs/model-agnostic.png" alt="https://arxiv.org/abs/2209.02299" style="max-width: 70%;"/>
</figure>
</div>

<p><strong>Differential Privacy.</strong> Differential privacy was first
proposed to bound a data sample&#x2019;s influence on a machine learning
model&#xA0;<a href="#ref-dwork2008differential">(Dwork
2008)</a>. <span class="math inline"><em>&#x3F5;</em></span>-differential
privacy unlearns a data sample by setting <span
class="math inline"><em>&#x3F5;</em>&#x2004;=&#x2004;0</span>, where <span
class="math inline"><em>&#x3F5;</em></span> bounds the level of change in any
model parameters affected by that data sample&#xA0;<a href="#ref-bourtoule2021machine thudi2022unrolling">(Bourtoule et al.
2021; Thudi, Deza, et al. 2022)</a>. However, Bourtoule et al.&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a> note that the algorithm cannot learn from the training
data in such a case. Gupta et al.&#xA0;<a href="#ref-gupta2021adaptive">(Gupta et al. 2021)</a> proposed a
differentially private unlearning mechanism for streaming data removal
requests. These requests are adaptive as well, meaning the data to be
removed depends on the current unlearned model. The idea, which is based
on differential privacy, can be roughly formulated as: <span
class="math display">Pr&#x2006;(<em>U</em>(<em>D</em>,<em>s</em>,<em>A</em>(<em>D</em>))&#x2208;&#x1D4AF;)&#x2004;&#x2264;&#x2004;<em>e</em><sup><em>&#x3F5;</em></sup><em>P</em><em>r</em>(<em>A</em>(<em>D</em>\<em>s</em>)&#x2208;&#x1D4AF;)&#x2005;+&#x2005;<em>&#x3B2;</em></span>
for all adaptive removal sequences <span
class="math inline"><em>s</em>&#x2004;=&#x2004;(<em>z</em><sub>1</sub>,&#x2026;,<em>z</em><sub><em>k</em></sub>)</span>.
One weakness of this condition is that it only guarantees the upper
bound of the unlearning scheme compared to full retraining. However, its
strength is that it supports a user&#x2019;s belief that the system has engaged
in full retraining. Finally, an unlearning process is developed by a
notion of differentially private publishing functions and a theoretical
reduction from adaptive to non-adaptive sequences. Differentially
private publishing functions guarantee that the model before and after
an unlearning request do not differ too much.</p>
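
<p>The inequality above can be checked numerically when the set of possible output models is finite. The following is a minimal sketch under that simplifying assumption, not the mechanism of Gupta et al.; the function name <code>satisfies_unlearning_bound</code> and the two example distributions are hypothetical. It relies on the fact that the worst-case event collects exactly the outcomes where the unlearned distribution exceeds <em>e</em><sup><em>&#x3F5;</em></sup> times the retrained one.</p>

```python
import numpy as np

def satisfies_unlearning_bound(p_unlearned, q_retrained, eps, beta, tol=1e-12):
    """Check Pr(U in T) <= exp(eps) * Pr(A in T) + beta for every event T.

    Both inputs are probability vectors over the same finite set of
    possible output models (an illustrative simplification).
    """
    p = np.asarray(p_unlearned, dtype=float)
    q = np.asarray(q_retrained, dtype=float)
    # The worst-case event gathers every outcome where p exceeds exp(eps) * q,
    # so the bound holds for all T iff this total excess is at most beta.
    excess = np.maximum(p - np.exp(eps) * q, 0.0).sum()
    return bool(excess <= beta + tol)

p = [0.5, 0.5]  # hypothetical output distribution after unlearning
q = [0.6, 0.4]  # hypothetical output distribution after full retraining

tight = satisfies_unlearning_bound(p, q, eps=0.0, beta=0.1)
too_strict = satisfies_unlearning_bound(p, q, eps=0.0, beta=0.05)
relaxed = satisfies_unlearning_bound(p, q, eps=np.log(2.0), beta=0.0)
```

<p>In this toy case the bound holds with <em>&#x3F5;</em>&#x2004;=&#x2004;0 only if <em>&#x3B2;</em> absorbs the 0.1 of excess probability mass, whereas a larger <em>&#x3F5;</em> allows <em>&#x3B2;</em>&#x2004;=&#x2004;0.</p>
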
<p><strong>Certified Removal Mechanisms.</strong> Unlearning algorithms
falling into this category are the ones following the original
approximate definition of machine unlearning&#xA0;<a href="#ref-GuoGHM20 golatkar2020eternal">(C. Guo et al. 2020; Golatkar,
Achille, and Soatto 2020a)</a>. While Guo et al.&#xA0;<a href="#ref-GuoGHM20">(C. Guo et al. 2020)</a> focus
on theoretical guarantees for linear models and convex losses, Golatkar
et al.&#xA0;<a href="#ref-golatkar2020eternal">(Golatkar, Achille, and Soatto
2020a)</a> introduce a computable upper bound for SGD-based learning
algorithms, especially deep neural networks. The core idea is based on
the notion of perturbation (noise) to mask the small residue incurred by
the gradient-based update (e.g., a one-step Newton update&#xA0;<a href="#ref-koh2017understanding">(Koh et al.
2017)</a>). The idea is applicable to other cases, although no
theoretical guarantees are provided&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al. 2021)</a>.</p>
<p>More precisely, certified removal mechanisms mainly accommodate those
linear models that minimize a standardized empirical risk, which is the
total value of a convex loss function that measures the distance of the
actual value from the expected one&#xA0;<a href="#ref-marchant2022hard">(Marchant, Rubinstein, and Alfeld
2022)</a>. However, one has to rely on a customized learning
algorithm that optimizes a perturbed version of the regularized
empirical risk, where the added noise is drawn from a standard normal
distribution. This normalized noise allows conventional convex
optimization techniques to solve the learning problem with perturbation.
As a result, the unlearning request can be done by computing the model
perturbation towards the regularized empirical risk on the remaining
data. The final trick is that this perturbation can be approximated by
the influence function&#xA0;<a href="#ref-koh2017understanding">(Koh et al. 2017)</a>, which is
computed by inverting the Hessian on training data and the gradient of
the data to be forgotten&#xA0;<a href="#ref-marchant2022hard">(Marchant, Rubinstein, and Alfeld
2022)</a>. However, the error of model parameters in such a
computation can be so large that the added noise cannot mask it.
Therefore, if the provided theoretical upper bound exceeds a certain
threshold, the unlearning algorithm resorts to retraining from
scratch&#xA0;<a href="#ref-marchant2022hard">(Marchant,
Rubinstein, and Alfeld 2022)</a>.</p>
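
<p>To make the Newton-style removal step concrete, the sketch below applies it to L2-regularized least squares, where the loss is quadratic and a single step therefore coincides with retraining. It is an illustration only: it omits the noise perturbation of the objective and the check against the theoretical bound described above, and the helper names <code>fit_ridge</code> and <code>newton_removal</code> are hypothetical.</p>

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimise 0.5 * ||X w - y||^2 + 0.5 * lam * ||w||^2 in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def newton_removal(w, X, y, idx, lam):
    """Remove sample `idx` from a fitted model with one Newton step."""
    x_f, y_f = X[idx], y[idx]
    X_r = np.delete(X, idx, axis=0)
    # Gradient of the forgotten sample's loss at the current parameters.
    grad_f = x_f * (x_f @ w - y_f)
    # Hessian of the regularised risk on the remaining data.
    H_r = X_r.T @ X_r + lam * np.eye(X.shape[1])
    # At the optimum the full gradient is zero, so the gradient on the
    # remaining data equals -grad_f and the Newton step adds H_r^{-1} grad_f.
    return w + np.linalg.solve(H_r, grad_f)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
lam = 1.0

w_full = fit_ridge(X, y, lam)
w_unlearned = newton_removal(w_full, X, y, idx=7, lam=lam)
w_retrained = fit_ridge(np.delete(X, 7, axis=0), np.delete(y, 7), lam)
max_gap = np.max(np.abs(w_unlearned - w_retrained))
```

<p>For non-quadratic convex losses such as logistic regression the same step leaves a small residual, which is what the added noise is meant to mask.</p>
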
<p>Following this idea, Neel et al.&#xA0;<a href="#ref-neel2021descent">(Neel, Roth, and Sharifi-Malvajerdi
2021)</a> provided further extensions, namely regularized perturbed
gradient descent and distributed perturbed gradient descent, to support
weak convex losses and provide theoretical guarantees on
indistinguishability, accuracy, and unlearning times.</p>
<p>Ullah et al.&#xA0;<a href="#ref-ullah2021machine">(Ullah et al. 2021)</a> continued
studying machine unlearning in the context of SGD and streaming removal
requests. They define the notion of total variation stability for a
learning algorithm: <span
class="math display">sup<sub><em>D</em>,&#x2006;<em>D</em>&#x2032;&#x2004;:&#x2004;|<em>D</em>\<em>D</em>&#x2032;|&#x2005;+&#x2005;|<em>D</em>&#x2032;\<em>D</em>|&#x2004;=&#x2004;1</sub><em>&#x394;</em>(<em>A</em>(<em>D</em>),<em>A</em>(<em>D</em>&#x2032;))&#x2004;&#x2264;&#x2004;<em>&#x3C1;</em></span>
where <span class="math inline"><em>&#x394;</em>(.)</span> is the largest
possible difference between the two probabilities such that they can
assign to the same event, aka total variance distance&#xA0;<a href="#ref-verdu2014total">(Verd&#xFA; 2014)</a>. This
is also a special case of the optimal transportation cost between two
probability distributions&#xA0; <a href="#ref-lei2019geometric">(Lei et al. 2019)</a>. In other words,
a learning algorithm <span class="math inline"><em>A</em>(.)</span> is
said to be <span class="math inline"><em>&#x3C1;</em></span>-TV-stable if
given any two training datasets <span
class="math inline"><em>D</em></span> and <span
class="math inline"><em>D</em>&#x2032;</span>, as long as they have 1 common
data item, the cost of transporting from the model distribution <span
class="math inline"><em>A</em>(<em>D</em>)</span> to <span
class="math inline"><em>A</em>(<em>D</em>&#x2032;)</span> is bounded by <span
class="math inline"><em>&#x3C1;</em></span>. For any <span
class="math inline">1/<em>n</em>&#x2004;&#x2264;&#x2004;<em>&#x3C1;</em>&#x2004;&lt;&#x2004;&#x221E;</span>, Ullah et
al.&#xA0;<a href="#ref-ullah2021machine">(Ullah et al.
2021)</a> proved that there exists an unlearning process that
satisfies exact unlearning at any time in the streaming removal request
while the model accuracy and the unlearning time are bounded w.r.t.
<span class="math inline"><em>&#x3C1;</em></span>.</p>
<p><strong>Statistical Query Learning.</strong> Statistical query
learning is a form of machine learning that trains models by querying
statistics on the training data rather than on the data itself&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang
2015)</a>. In this form, a data sample can be forgotten efficiently
by recomputing the statistics over the remaining data&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a>. More precisely, statistical query learning assumes that
most of the learning algorithms can be represented as a sum of some
efficiently computable transformations, called statistical queries&#xA0;<a href="#ref-kearns1998efficient">(Kearns 1998)</a>.
These statistical queries are basically requests to an oracle (e.g., a
ground truth) to estimate a statistical function over all training data.
Cao et al.&#xA0;<a href="#ref-cao2015towards">(Y. Cao
and Yang 2015)</a> showed that this formulation can generalize many
algorithms for machine learning, such as the Chi-square test, naive
Bayes, and linear regression. For example, in naive Bayes, these
statistical queries are indicator functions that return 1 when the
output is a target label and zero otherwise&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang 2015)</a>. In the
unlearning process, these queries are simply recomputed over the
remaining data. The approach is efficient as these statistical functions
are computationally efficient in the first place. Moreover, statistical
query learning also supports adaptive statistical queries, which are
computed based on the prior state of the learning model, as in
k-means, SVM, and gradient descent. In this case, the unlearning
update leaves the model no longer converged, but only a few further learning
iterations (adaptive statistical queries) are needed since the model
starts from an almost-converged state. In addition, if the old results of
the summations are cached, say, via dynamic programming, then the
speedup might be even higher.</p>
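
<p>As an illustration of the summation form, the sketch below keeps a naive Bayes model purely as counts, so that forgetting a sample amounts to subtracting its contribution from the cached summations. This is a minimal sketch rather than the implementation of Cao and Yang; the class name <code>CountingNaiveBayes</code> and the toy data are hypothetical.</p>

```python
from collections import Counter

class CountingNaiveBayes:
    """Naive Bayes stored as summations (counts) over the training data."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = Counter()  # keys: (feature index, value, label)

    def _update(self, x, y, sign):
        self.label_counts[y] += sign
        for j, value in enumerate(x):
            self.feature_counts[(j, value, y)] += sign

    def learn(self, x, y):
        self._update(x, y, +1)

    def unlearn(self, x, y):
        # Subtracting the sample's indicator values updates every summation.
        self._update(x, y, -1)

    def statistics(self):
        # Drop zero entries so that two models can be compared directly.
        labels = {k: c for k, c in self.label_counts.items() if c}
        features = {k: c for k, c in self.feature_counts.items() if c}
        return labels, features

data = [
    (("sunny", "hot"), "no"),
    (("rainy", "mild"), "yes"),
    (("sunny", "mild"), "yes"),
    (("rainy", "hot"), "no"),
]
forget = data[2]

model = CountingNaiveBayes()
for x, y in data:
    model.learn(x, y)
model.unlearn(*forget)

retrained = CountingNaiveBayes()
for x, y in data:
    if (x, y) != forget:
        retrained.learn(x, y)

same_statistics = model.statistics() == retrained.statistics()
```

<p>Because the statistics after unlearning match those of a model trained only on the remaining data, any prediction derived from them is identical as well.</p>
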
<p>The limitation of this approach is that it does not scale with
complex models such as deep neural networks. Indeed, in complex models,
the number of statistical queries could become exponentially large&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a>, making both the unlearning and relearning steps less
efficient.</p>
<p>In general, statistical query learning supports item removal and can
be partially applied to stream removal&#xA0;<a href="#ref-gupta2021adaptive">(Gupta et al. 2021)</a> as well,
although the streaming updates to the summations could be unbounded. It
supports exact unlearning, but only partially when the statistical
queries are non-adaptive. It also partially supports zero-shot
unlearning, because only the statistics over the data need to be
accessed, not the individual training data items.</p>
<p><strong>Decremental Learning.</strong> Decremental learning
algorithms were originally designed to remove redundant samples and
reduce the training load on the processor for support vector machines
(SVM)&#xA0;<a href="#ref-chen2019novel cauwenberghs2000incremental tveit2003multicategory tveit2003incremental romero2007incremental duan2007decremental">(Y.
Chen et al. 2019; Cauwenberghs et al. 2000; Tveit et al. 2003; Tveit,
Hetland, and Engum 2003; Romero, Barrio, and Belanche 2007; Duan et al.
2007)</a> and linear classification&#xA0;<a href="#ref-karasuyama2009multiple karasuyama2010multiple tsai2014incremental">(Karasuyama
and Takeuchi 2009, 2010; Tsai, Lin, and Lin 2014)</a>. As such, they
focus on accuracy rather than the completeness of the machine
unlearning.</p>
<p>Ginart et al.&#xA0;<a href="#ref-ginart2019making">(Ginart et al. 2019)</a> developed
decremental learning solutions for <span
class="math inline"><em>k</em></span>-means clustering based on
quantization and data partition. The idea of quantization is to ensure
that small changes in the data do not change the model. Quantization
helps to avoid unnecessary unlearning so that accuracy is not
catastrophically degraded. However, it is only applicable when there are
few model parameters compared to the size of the dataset. The idea
behind the data partitioning is to restrict the data&#x2019;s influence on the
model parameters to only a few specific data partitions. This process
helps to pinpoint the effects of unlearning to a few data features. But,
again, the approach is only effective with a small number of features
compared to the size of the dataset. Notably, data privacy and data
deletion are not completely correlative&#xA0;<a href="#ref-ginart2019making">(Ginart et al. 2019)</a>. Data privacy
does not have to ensure data deletion (e.g., differential privacy), and
data deletion does not have to ensure data privacy.</p>
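
<p>The quantization idea can be sketched for a single cluster as follows. This is a deliberately simplified, deterministic illustration and not the algorithm of Ginart et al.: the centroid is snapped to a grid, and a deletion only triggers further work when the snapped centroid changes. The function names and the grid size are assumptions made for the example.</p>

```python
import numpy as np

def quantized_centroid(points, grid):
    """Snap the mean of `points` to the nearest multiple of `grid`."""
    return np.round(np.mean(points, axis=0) / grid) * grid

def delete_point(points, idx, centroid, grid):
    """Delete one point and report whether the quantized centroid moved."""
    remaining = np.delete(points, idx, axis=0)
    candidate = quantized_centroid(remaining, grid)
    changed = not np.allclose(candidate, centroid)
    return remaining, candidate, changed

points = np.array([[1.0], [1.1], [0.9], [1.2], [0.8]])
grid = 0.5

centroid = quantized_centroid(points, grid)
remaining, new_centroid, changed = delete_point(points, 3, centroid, grid)
```

<p>Here the raw mean moves from 1.0 to 0.95 after the deletion, but the quantized centroid stays on the same grid point, so the stored model already equals the one that would be obtained without the deleted point.</p>
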
<p><strong>Knowledge Adaptation.</strong> Knowledge adaptation
selectively removes to-be-forgotten data samples&#xA0;<a href="#ref-chundawat2022can">(Chundawat et al. 2022a)</a>. In this
approach&#xA0;<a href="#ref-chundawat2022can">(Chundawat
et al. 2022a)</a>, one trains two neural networks as teachers
(competent and incompetent) and one neural network as a student. The
competent teacher is trained on the complete dataset, while the
incompetent teacher is randomly initialised. The student is initialised
with the competent teacher&#x2019;s model parameters. The student is then trained to
mimic both teachers through a loss function built from the KL divergence
between the student and each of the two teachers. Notably, the competent
teacher processes the retained data and the incompetent teacher deals with
the forgotten data.</p>
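
<p>A per-sample version of such a teacher-student objective might look like the sketch below. It assumes that each network outputs a probability vector and that the divergence is taken from teacher to student; both the direction of the KL term and the function names are assumptions of this example rather than details confirmed by the text above.</p>

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability vectors."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def teacher_student_loss(student, competent, incompetent, is_forget_sample):
    """Follow the incompetent teacher on forget data, the competent one otherwise."""
    if is_forget_sample:
        return kl_divergence(incompetent, student)
    return kl_divergence(competent, student)

# Hypothetical class probabilities for one input.
student = np.array([0.7, 0.2, 0.1])
competent = np.array([0.7, 0.2, 0.1])   # the student starts as a copy of this teacher
incompetent = np.full(3, 1.0 / 3.0)     # a random teacher is roughly uninformative

retain_loss = teacher_student_loss(student, competent, incompetent, False)
forget_loss = teacher_student_loss(student, competent, incompetent, True)
```

<p>At initialisation the retained sample incurs no loss, while the forgotten sample pulls the student towards the uninformative teacher.</p>
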
<p>Beyond Chundawat et al.&#xA0;<a href="#ref-chundawat2022can">(Chundawat et al. 2022a)</a>, machine
learning models have been quickly and accurately adapted by
reconstructing the past gradients of knowledge-adaptation priors
in&#xA0;<a href="#ref-khan2021knowledge">(Khan et al.
2021)</a>. Ideas similar to knowledge-adaptation priors were also
investigated in&#xA0;<a href="#ref-ginart2019making wu2020priu">(Ginart et al. 2019; Y. Wu,
Tannen, and Davidson 2020)</a>. In general, knowledge adaptation is
applicable to a wide range of unlearning requests and scenarios.
However, it is difficult to provide a theoretical guarantee for this
approach.</p>
<p><strong>MCMC Unlearning (Parameter Sampling).</strong> Sampling-based
machine unlearning has also been suggested as a way to train a standard
machine learning model to forget data samples from the training
data&#xA0;<a href="#ref-nguyen2022markov">(Q. P. Nguyen
et al. 2022)</a>. The idea is to sample the distribution of model
parameters using Markov chain Monte Carlo (MCMC). It is assumed that the
forgetting set is often significantly smaller than the training data
(otherwise retraining might be a better solution). Thus, the parameter
distribution <span
class="math inline"><em>P</em><em>r</em>(<em>w</em><sub><em>r</em></sub>)</span>
of the retrained models should not differ much from that of the original
model <span class="math inline"><em>P</em><em>r</em>(<em>w</em>)</span>.
In other words, the posterior density <span
class="math inline"><em>P</em><em>r</em>(<em>w</em><sub><em>r</em></sub>|<em>D</em>)</span>
should be sufficiently large for sampling&#xA0;<a href="#ref-nguyen2022markov">(Q. P. Nguyen et al. 2022)</a>. More
precisely, the posterior distribution from the retrained parameters can
be defined as: <span
class="math display"><em>P</em><em>r</em>(<em>w</em><sub><em>r</em></sub>|<em>D</em>)&#x2004;&#x2248;&#x2004;<em>P</em><em>r</em>(<em>w</em>|<em>D</em>)&#x2004;&#x221D;&#x2004;<em>P</em><em>r</em>(<em>D</em>|<em>w</em>)<em>P</em><em>r</em>(<em>w</em>)</span>
Here, the prior distribution <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>)</span> is often
available from the learning algorithm, which means the stochasticity of
learning via sampling can be estimated. The likelihood <span
class="math inline"><em>P</em><em>r</em>(<em>D</em>|<em>w</em>)</span>
is the prediction output of the model itself, which is also available
after training. From the equation above, we only
know that the density function of <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em>)</span>
is proportional to a function <span
class="math inline"><em>f</em>(<em>w</em>)&#x2004;=&#x2004;<em>P</em><em>r</em>(<em>D</em>|<em>w</em>)<em>P</em><em>r</em>(<em>w</em>)</span>,
which means <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em>)</span>
cannot be directly sampled. This is where MCMC comes into play, as it
can still generate the next samples using a proposal density <span
class="math inline"><em>g</em>(<em>w</em>&#x2032;|<em>w</em>)</span>&#xA0;<a href="#ref-nguyen2022markov">(Q. P. Nguyen et al.
2022)</a>. However, <span
class="math inline"><em>g</em>(<em>w</em>&#x2032;|<em>w</em>)</span> is assumed
to be a Gaussian distribution centered on the current sample (the
sampling process can be initialized with the original model).</p>
<p>As a result, a candidate set of model parameters <span
class="math inline"><em>P</em><em>r</em>(<em>w</em><sub><em>r</em></sub>|<em>D</em>)</span>
is constructed from the sampling, and the unlearning output is
calculated by simply maximizing the posterior probability <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)</span>,
i.e.: <span
class="math display"><em>w</em><sub><em>r</em></sub>&#x2004;=&#x2004;arg&#x2006;max<sub><em>w</em></sub><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)</span>
The benefit of such sampling-based unlearning is that no access to the
forgetting set is required.</p>
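
<p>The sampling step can be illustrated with a small Metropolis-Hastings loop on a one-parameter model. This is a sketch under strong simplifying assumptions (a unit-variance Gaussian likelihood, a Gaussian prior, and a scalar parameter) and not the method of Nguyen et al.; <code>log_f</code>, <code>mcmc_unlearn</code>, and all numeric settings are hypothetical. The chain starts at the original model, uses a Gaussian proposal centred on the current sample, and returns the visited sample with the highest unnormalised posterior on the retained data.</p>

```python
import numpy as np

def log_f(w, data, prior_std=10.0):
    """Unnormalised log posterior: log Pr(D|w) + log Pr(w), up to a constant."""
    log_likelihood = -0.5 * np.sum((data - w) ** 2)  # unit-variance Gaussian
    log_prior = -0.5 * (w / prior_std) ** 2          # zero-mean Gaussian prior
    return log_likelihood + log_prior

def mcmc_unlearn(w_start, retained, steps=5000, step_std=0.2, seed=0):
    rng = np.random.default_rng(seed)
    w, lp = w_start, log_f(w_start, retained)
    best_w, best_lp = w, lp
    for _ in range(steps):
        candidate = w + step_std * rng.normal()      # Gaussian proposal g(w'|w)
        candidate_lp = log_f(candidate, retained)
        # Symmetric proposal: accept with probability min(1, f(w') / f(w)).
        if np.log(rng.uniform()) < candidate_lp - lp:
            w, lp = candidate, candidate_lp
            if lp > best_lp:
                best_w, best_lp = w, lp
    return best_w

full = np.array([0.8, 1.1, 0.9, 1.0, 1.2, 5.0])
retained = full[:-1]             # the last sample is the one to forget
w_original = float(full.mean())  # stand-in for the originally trained model

w_r = mcmc_unlearn(w_original, retained)
# Closed-form posterior mode for this toy model, used only as a sanity check.
mode = retained.sum() / (len(retained) + 1.0 / 10.0 ** 2)
```

<p>Only the retained data enter <code>log_f</code>, so the forgotten sample is never touched during sampling.</p>
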

| **Paper Title** | **Year** | **Author** | **Venue** | **Model** | **Code** | **Type** |
| --------------- | :----: | ---- | :----: | :----: | :----: | :----: |
| [Towards Adversarial Evaluations for Inexact Machine Unlearning](https://arxiv.org/abs/2201.06640) | 2023 | Goel et al. | _arXiv_ | EU-k, CF-k | [[Code]](https://github.com/shash42/Evaluating-Inexact-Unlearning) |
| [On the Trade-Off between Actionable Explanations and the Right to be Forgotten](https://openreview.net/pdf?id=HWt4BBZjVW) | 2023 | Pawelczyk et al. | _arXiv_ | - | - |  |
| [Towards Unbounded Machine Unlearning](https://arxiv.org/pdf/2302.09880) | 2023 | Kurmanji et al. | _arXiv_ | SCRUB | [[Code]](https://github.com/Meghdad92/SCRUB) | approximate unlearning |
| [Netflix and Forget: Efficient and Exact Machine Unlearning from Bi-linear Recommendations](https://arxiv.org/abs/2302.06676) | 2023 | Xu et al. | _arXiv_ | Unlearn-ALS | - | Exact Unlearning |
| [To Be Forgotten or To Be Fair: Unveiling Fairness Implications of Machine Unlearning Methods](https://arxiv.org/abs/2302.03350) | 2023 | Zhang et al. | _arXiv_ | - | [[Code]](https://github.com/cleverhans-lab/machine-unlearning) | |
| [Sequential Informed Federated Unlearning: Efficient and Provable Client Unlearning in Federated Optimization](https://arxiv.org/abs/2211.11656) | 2022 | Fraboni et al. | _arXiv_ | SIFU | - | |
| [Certified Data Removal in Sum-Product Networks](https://arxiv.org/abs/2210.01451) | 2022 | Becker and Liebig | _ICKG_ | UNLEARNSPN | [[Code]](https://github.com/ROYALBEFF/UnlearnSPN) | Certified Removal Mechanisms |
| [Learning with Recoverable Forgetting](https://arxiv.org/abs/2207.08224) | 2022 | Ye et al.  | _ECCV_ | LIRF | - |  |
| [Continual Learning and Private Unlearning](https://arxiv.org/abs/2203.12817) | 2022 | Liu et al. | _CoLLAs_ | CLPU | [[Code]](https://github.com/Cranial-XIX/Continual-Learning-Private-Unlearning) | |
| [Verifiable and Provably Secure Machine Unlearning](https://arxiv.org/abs/2210.09126) | 2022 | Eisenhofer et al. | _arXiv_ | - | [[Code]](https://github.com/cleverhans-lab/verifiable-unlearning) |  Certified Removal Mechanisms |
| [VeriFi: Towards Verifiable Federated Unlearning](https://arxiv.org/abs/2205.12709) | 2022 | Gao et al. | _arXiv_ | VERIFI | - | Certified Removal Mechanisms |
| [FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information](https://arxiv.org/abs/2210.10936) | 2022 | Cao et al. | _S&P_ | FedRecover | - | recovery method |
| [Fast Yet Effective Machine Unlearning](https://arxiv.org/abs/2111.08947) | 2022 | Tarun et al. | _arXiv_ | UNSIR | - |  |
| [Membership Inference via Backdooring](https://arxiv.org/abs/2206.04823) | 2022 | Hu et al.  | _IJCAI_ | MIB | [[Code]](https://github.com/HongshengHu/membership-inference-via-backdooring) | Membership Inferencing |
| [Forget Unlearning: Towards True Data-Deletion in Machine Learning](https://arxiv.org/abs/2210.08911) | 2022 | Chourasia et al. | _ICLR_ | - | - | noisy gradient descent |
| [Zero-Shot Machine Unlearning](https://arxiv.org/abs/2201.05629) | 2022 | Chundawat et al. | _arXiv_ | - | - |  |
| [Efficient Attribute Unlearning: Towards Selective Removal of Input Attributes from Feature Representations](https://arxiv.org/abs/2202.13295) | 2022 | Guo et al. | _arXiv_ | attribute unlearning | - |  |
| [Few-Shot Unlearning](https://download.huan-zhang.com/events/srml2022/accepted/yoon22fewshot.pdf) | 2022 | Yoon et al.   | _ICLR_ | - | - |  |
| [Federated Unlearning: How to Efficiently Erase a Client in FL?](https://arxiv.org/abs/2207.05521) | 2022 | Halimi et al. | _UpML Workshop_ | - | - | federated learning |
| [Machine Unlearning Method Based On Projection Residual](https://arxiv.org/abs/2209.15276) | 2022 | Cao et al. | _DSAA_ | - | - |  Projection Residual Method |
| [Hard to Forget: Poisoning Attacks on Certified Machine Unlearning](https://ojs.aaai.org/index.php/AAAI/article/view/20736) | 2022 | Marchant et al. | _AAAI_ | - | [[Code]](https://github.com/ngmarchant/attack-unlearning) | Certified Removal Mechanisms |
| [Athena: Probabilistic Verification of Machine Unlearning](https://web.archive.org/web/20220721061150id_/https://petsymposium.org/popets/2022/popets-2022-0072.pdf) | 2022 | Sommer et al. | _PoPETs_ | ATHENA | - | |
| [FP2-MIA: A Membership Inference Attack Free of Posterior Probability in Machine Unlearning](https://link.springer.com/chapter/10.1007/978-3-031-20917-8_12) | 2022 | Lu et al. | _ProvSec_ | FP2-MIA | - | inference attack |
| [Deletion Inference, Reconstruction, and Compliance in Machine (Un)Learning](https://arxiv.org/abs/2202.03460) | 2022 | Gao et al. | _PETS_ | - | - |  |
| [Prompt Certified Machine Unlearning with Randomized Gradient Smoothing and Quantization](https://openreview.net/pdf?id=ue4gP8ZKiWb) | 2022 | Zhang et al.   | _NeurIPS_ | PCMU | - | Certified Removal Mechanisms |
| [The Right to be Forgotten in Federated Learning: An Efficient Realization with Rapid Retraining](https://arxiv.org/abs/2203.07320) | 2022 | Liu et al. | _INFOCOM_ | - | [[Code]](https://github.com/yiliucs/federated-unlearning) |  |
| [Backdoor Defense with Machine Unlearning](https://arxiv.org/abs/2201.09538) | 2022 | Liu et al. | _INFOCOM_ | BAERASER | - | Backdoor defense |
| [Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten](https://dl.acm.org/doi/abs/10.1145/3488932.3517406) | 2022 | Nguyen et al. | _ASIA CCS_ | MCU | - | MCMC Unlearning  |
| [Federated Unlearning for On-Device Recommendation](https://arxiv.org/abs/2210.10958) | 2022 | Yuan et al. | _arXiv_ | - | - |  |
| [Can Bad Teaching Induce Forgetting? Unlearning in Deep Networks using an Incompetent Teacher](https://arxiv.org/abs/2205.08096) | 2022 | Chundawat et al. | _arXiv_ | - | - | Knowledge Adaptation |
| [ Efficient Two-Stage Model Retraining for Machine Unlearning](https://openaccess.thecvf.com/content/CVPR2022W/HCIS/html/Kim_Efficient_Two-Stage_Model_Retraining_for_Machine_Unlearning_CVPRW_2022_paper.html) | 2022 | Kim and Woo | _CVPR Workshop_ | - | - |  |
| [Learn to Forget: Machine Unlearning Via Neuron Masking](https://ieeexplore.ieee.org/abstract/document/9844865?casa_token=_eowH3BTt1sAAAAA:X0uCpLxOwcFRNJHoo3AtA0ay4t075_cSptgTMznsjusnvgySq-rJe8GC285YhWG4Q0fUmP9Sodw0) | 2021 | Ma et al. | _IEEE_ | Forsaken | - | Mask Gradients |
| [Adaptive Machine Unlearning](https://proceedings.neurips.cc/paper/2021/hash/87f7ee4fdb57bdfd52179947211b7ebb-Abstract.html) | 2021 | Gupta et al. | _NeurIPS_ | - | [[Code]](https://github.com/ChrisWaites/adaptive-machine-unlearning) | Differential Privacy |
| [Descent-to-Delete: Gradient-Based Methods for Machine Unlearning](https://proceedings.mlr.press/v132/neel21a.html) | 2021 | Neel et al. | _ALT_ | - | - | Certified Removal Mechanisms |
| [Remember What You Want to Forget: Algorithms for Machine Unlearning](https://arxiv.org/abs/2103.03279) | 2021 | Sekhari et al. | _NeurIPS_ | - | - |  |
| [FedEraser: Enabling Efficient Client-Level Data Removal from Federated Learning Models](https://ieeexplore.ieee.org/abstract/document/9521274) | 2021 | Liu et al. | _IWQoS_ | FedEraser | - |  |
| [Federated Unlearning](https://arxiv.org/abs/2012.13891) | 2021 | Liu et al. | _IWQoS_ | FedEraser | [[Code]](https://www.dropbox.com/s/1lhx962axovbbom/FedEraser-Code.zip?dl=0) |  |
| [Machine Unlearning via Algorithmic Stability](https://proceedings.mlr.press/v134/ullah21a.html) | 2021 | Ullah et al. | _COLT_ | TV | - | Certified Removal Mechanisms |
| [EMA: Auditing Data Removal from Trained Models](https://link.springer.com/chapter/10.1007/978-3-030-87240-3_76) | 2021 | Huang et al. | _MICCAI_ | EMA | [[Code]](https://github.com/Hazelsuko07/EMA) | Certified Removal Mechanisms |
| [Knowledge-Adaptation Priors](https://proceedings.neurips.cc/paper/2021/hash/a4380923dd651c195b1631af7c829187-Abstract.html) | 2021 | Khan and Swaroop | _NeurIPS_ | K-prior | [[Code]](https://github.com/team-approx-bayes/kpriors) | Knowledge Adaptation |
| [PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models](https://dl.acm.org/doi/abs/10.1145/3318464.3380571) | 2020 | Wu et al. | _SIGMOD_ | PrIU | - | Knowledge Adaptation |
| [Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks](https://arxiv.org/abs/1911.04933) | 2020 | Golatkar et al. | _CVPR_ | - | - | Certified Removal Mechanisms |
| [Learn to Forget: User-Level Memorization Elimination in Federated Learning](https://www.researchgate.net/profile/Ximeng-Liu-5/publication/340134612_Learn_to_Forget_User-Level_Memorization_Elimination_in_Federated_Learning/links/5e849e64a6fdcca789e5f955/Learn-to-Forget-User-Level-Memorization-Elimination-in-Federated-Learning.pdf) | 2020 | Liu et al. | _arXiv_ | Forsaken | - |  |
| [Certified Data Removal from Machine Learning Models](https://proceedings.mlr.press/v119/guo20c.html) | 2020 | Guo et al. | _ICML_ | - | - | Certified Removal Mechanisms |
| [Class Clown: Data Redaction in Machine Unlearning at Enterprise Scale](https://arxiv.org/abs/2012.04699) | 2020 | Felps et al. | _arXiv_ | - | - | Decremental Learning |
| [A Novel Online Incremental and Decremental Learning Algorithm Based on Variable Support Vector Machine](https://link.springer.com/article/10.1007/s10586-018-1772-4) | 2019 | Chen et al. | _Cluster Computing_ | - | - | Decremental Learning  |
| [Making AI Forget You: Data Deletion in Machine Learning](https://papers.nips.cc/paper/2019/hash/cb79f8fa58b91d3af6c9c991f63962d3-Abstract.html) | 2019 | Ginart et al. | _NeurIPS_ | - | - | Decremental Learning  |
| [Lifelong Anomaly Detection Through Unlearning](https://dl.acm.org/doi/abs/10.1145/3319535.3363226) | 2019 | Du et al. | _CCS_ | - | - |  |
| [Learning Not to Learn: Training Deep Neural Networks With Biased Data](https://openaccess.thecvf.com/content_CVPR_2019/html/Kim_Learning_Not_to_Learn_Training_Deep_Neural_Networks_With_Biased_CVPR_2019_paper.html) | 2019 | Kim et al. | _CVPR_ | - | - |  |
| [Efficient Repair of Polluted Machine Learning Systems via Causal Unlearning](https://dl.acm.org/citation.cfm?id=3196517) | 2018 | Cao et al. | _ASIACCS_ | KARMA | [[Code]](https://github.com/CausalUnlearning/KARMA) |  |
| [Understanding Black-box Predictions via Influence Functions](https://proceedings.mlr.press/v70/koh17a.html) | 2017 | Koh et al. | _ICML_ | - | [[Code]](https://github.com/kohpangwei/influence-release) | Certified Removal Mechanisms |
| [Towards Making Systems Forget with Machine Unlearning](https://ieeexplore.ieee.org/abstract/document/7163042) | 2015 | Cao and Yang | _S&P_ | - |  |
| [Towards Making Systems Forget with Machine Unlearning](https://dl.acm.org/doi/10.1109/SP.2015.35) | 2015 | Cao et al. | _S&P_ | - | - | Statistical Query Learning  |
| [Incremental and decremental training for linear classification](https://dl.acm.org/doi/10.1145/2623330.2623661) | 2014 | Tsai et al. | _KDD_ | - | [[Code]](https://www.csie.ntu.edu.tw/~cjlin/papers/ws/) | Decremental Learning  |
| [Multiple Incremental Decremental Learning of Support Vector Machines](https://dl.acm.org/doi/10.5555/2984093.2984196) | 2009 | Karasuyama et al. | _NIPS_ | - | - | Decremental Learning  |
| [Incremental and Decremental Learning for Linear Support Vector Machines](https://dl.acm.org/doi/10.5555/1776814.1776838) | 2007 | Romero et al. | _ICANN_ | - | - | Decremental Learning  |
| [Decremental Learning Algorithms for Nonlinear Langrangian and Least Squares Support Vector Machines](https://www.semanticscholar.org/paper/Decremental-Learning-Algorithms-for-Nonlinear-and-Duan-Li/312c677f0882d0dfd60bfd77346588f52aefd10f) | 2007 | Duan et al. | _OSB_ | - | - | Decremental Learning  |
| [Multicategory Incremental Proximal Support Vector Classifiers](https://link.springer.com/chapter/10.1007/978-3-540-45224-9_54) | 2003 | Tveit et al. | _KES_ | - | - | Decremental Learning  |
| [Incremental and Decremental Proximal Support Vector Classification using Decay Coefficients](https://link.springer.com/chapter/10.1007/978-3-540-45228-7_42) | 2003 | Tveit et al. | _DaWak_ | - | - | Decremental Learning  |
| [Incremental and Decremental Support Vector Machine Learning](https://dl.acm.org/doi/10.5555/3008751.3008808) | 2000 | Cauwenberghs et al. | _NeurIPS_ | - | - | Decremental Learning  |
----------

<h2 id="model-intrinsic-approaches">4.2. Model-Intrinsic Approaches</h2>

<p>The model-intrinsic approaches include unlearning methods designed
for a specific type of models. Although they are model-intrinsic, their
applications are not necessarily narrow, as many machine learning models
can share the same type.</p>


<div class="figure*">
<figure>
<img src="https://raw.githubusercontent.com/tamlhp/awesome-machine-unlearning/main/figs/model-intrinsic.png" alt="https://arxiv.org/abs/2209.02299" style="max-width: 70%;"/>
</figure>
</div>


<p><strong>Unlearning for softmax classifiers (logit-based
classifiers).</strong> Softmax (or logit-based) classifiers are
classification models <span
class="math inline"><em>M</em>&#x2004;:&#x2004;&#x1D4B3;&#x2004;&#x2192;&#x2004;&#x211D;<sup><em>K</em></sup></span> that
output a vector of logits <span
class="math inline"><em>l</em>&#x2004;&#x2208;&#x2004;&#x211D;<sup><em>K</em></sup></span>, where
<span class="math inline"><em>K</em></span> is the number of classes,
for each data sample <span class="math inline"><em>x</em>&#x2004;&#x2208;&#x2004;&#x1D4B3;</span>.
The core task of <span class="math inline"><em>M</em>(<em>x</em>)</span>
is to estimate the probability distribution <span
class="math inline"><em>P</em><em>r</em>(<em>X</em>,<em>Y</em>)</span>,
where <span class="math inline"><em>X</em></span> is the random variable
in <span class="math inline">&#x1D4B3;</span>, and <span
class="math inline"><em>Y</em></span> is the random variable in <span
class="math inline">1,&#x2006;&#x2026;,&#x2006;<em>K</em></span>, such that: <span
class="math display"><em>P</em><em>r</em>(<em>Y</em>=<em>i</em>|<em>X</em>=<em>x</em>)&#x2004;&#x2248;&#x2004;<em>&#x3C3;</em>(<em>l</em><sub><em>i</em></sub>)</span>
Here, <span class="math inline">$\sigma(l_i) =
\frac{\exp(l_i)}{\sum_{j=1}^{K} \exp(l_j)}$</span> is the softmax
function. This formulation applies to logistic regression and deep
neural networks with a densely connected output layer using softmax
activations&#xA0;<a href="#ref-baumhauer2020machine">(Baumhauer, Sch&#xF6;ttle, and Zeppelzauer
2020)</a>. Baumhauer et al.&#xA0;<a href="#ref-baumhauer2020machine">(Baumhauer, Sch&#xF6;ttle, and Zeppelzauer
2020)</a> proposed an unlearning method for softmax classifiers based
on a linear filtration operator that proportionally shifts the
classification of the to-be-forgotten class samples to other classes.
However, this approach only works for class removal.</p>
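<p>To make the class-removal idea concrete, here is a minimal sketch. It is not the linear filtration operator of Baumhauer et al.; it assumes a simplified proportional filtration that zeroes the predicted probability of the forgotten class and renormalizes the remaining classes. The names <code>softmax</code> and <code>filter_forgotten_class</code> are hypothetical.</p>

```python
import math

def softmax(logits):
    """Numerically stable softmax: sigma(l_i) = exp(l_i) / sum_j exp(l_j)."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def filter_forgotten_class(logits, forgotten):
    """Illustrative assumption: move the forgotten class's probability mass
    to the other classes in proportion to their current probabilities."""
    probs = softmax(logits)
    kept_mass = 1.0 - probs[forgotten]
    return [0.0 if i == forgotten else p / kept_mass
            for i, p in enumerate(probs)]

logits = [2.0, 1.0, 0.5]
print(softmax(logits))
print(filter_forgotten_class(logits, forgotten=0))
```

<p>The relative ordering and ratios of the remaining classes are unchanged, which is why such a filter can only forget whole classes and not individual samples.</p>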
<p><strong>Unlearning for linear models.</strong> Izzo et al.&#xA0;<a href="#ref-izzo2021approximate">(Izzo et al.
2021)</a> proposed an approximate unlearning method for linear and
logistic models based on influence functions. They approximated a
Hessian matrix computation with a projective residual update&#xA0;<a href="#ref-izzo2021approximate cao2022machine">(Izzo
et al. 2021; Z. Cao et al. 2022)</a> that combines gradient methods
with synthetic data. It is suitable for forgetting small groups of
points from a learned model. Some other studies consider an online
setting for machine unlearning (aka online data deletion)&#xA0;<a href="#ref-ginart2019making li2020online">(Ginart et
al. 2019; Li, Wang, and Cheng 2021)</a>, in which the removal request
is a sequence of entries that indicates which data item is to be
unlearned. In general, this setting is more challenging than the normal
setting because indistinguishability must hold for any entry and for the
end of the deletion sequence. The goal is to achieve a lower bound on
amortized computation time&#xA0;<a href="#ref-ginart2019making li2020online">(Ginart et al. 2019; Li,
Wang, and Cheng 2021)</a>.</p>
<p>Li et al.&#xA0;<a href="#ref-li2020online">(Li, Wang,
and Cheng 2021)</a> formulated a special case of the online setting
where data is only accessible for a limited time so there is no full
training process in the first place. More precisely, the system is
allowed a constant memory to store historical data or a data sketch, and
it has to make predictions within a bounded period of time. Although the
data to be forgotten can be unlearned from a model on-the-fly using a
regret scheme on the memory, this particular unlearning process is only
applicable to ordinary linear regression&#xA0;<a href="#ref-li2020online">(Li, Wang, and Cheng 2021)</a>.</p>
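<p>One reason linear regression is a convenient target for deletion is that a least-squares fit depends on the data only through a few running sums. The sketch below illustrates this with single-feature least squares without an intercept. It is an exact-deletion toy example under that assumption, not the projective residual update of Izzo et al. or the regret scheme of Li et al.; the class name <code>DeletableLeastSquares</code> is hypothetical.</p>

```python
class DeletableLeastSquares:
    """Single-feature least squares without intercept: w = sum(x*y) / sum(x*x)."""

    def __init__(self):
        self.sxy = 0.0  # running sum of x * y
        self.sxx = 0.0  # running sum of x * x

    def add(self, x, y):
        self.sxy += x * y
        self.sxx += x * x

    def delete(self, x, y):
        # Forgetting a point only requires subtracting its contribution.
        self.sxy -= x * y
        self.sxx -= x * x

    @property
    def weight(self):
        return self.sxy / self.sxx

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1)]

model = DeletableLeastSquares()
for x, y in data:
    model.add(x, y)
model.delete(*data[2])  # unlearn the third point

retrained = DeletableLeastSquares()
for x, y in data[:2] + data[3:]:
    retrained.add(x, y)

print(model.weight, retrained.weight)
```

<p>The model keeps constant memory (two numbers) regardless of how many points it has seen, and the weight after deletion matches retraining on the remaining points up to floating-point error.</p>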
<p><strong>Unlearning for Tree-based Models.</strong> Tree-based models
are classification techniques that partition the feature space
recursively, where the features and cut-off thresholds to split the data
are determined by some criterion, such as information gain&#xA0;<a href="#ref-schelter2021hedgecut">(Schelter,
Grafberger, and Dunning 2021)</a>. There is a class of tree-based
models, called extremely randomized trees&#xA0;<a href="#ref-geurts2006extremely">(Geurts, Ernst, et al. 2006)</a>,
that are built as ensembles of decision trees. These are very
efficient because the candidate set of split features and cut-off
thresholds is randomly generated. The best candidate is selected by a
reduction in Gini impurity, which avoids the heavy computation of
logarithms.</p>
<p>Schelter et al.&#xA0;<a href="#ref-schelter2021hedgecut">(Schelter, Grafberger, and Dunning
2021)</a> proposed an unlearning solution for extremely randomized
trees by measuring the robustness of the split decisions. A split
decision is robust if removing <span
class="math inline"><em>k</em></span> data items does not reverse that
split. Note that <span class="math inline"><em>k</em></span> can be
bounded, and it is often small, as only one in ten thousand users
wants to remove their data at a time&#xA0;<a href="#ref-schelter2021hedgecut">(Schelter, Grafberger, and Dunning
2021)</a>. The learning algorithm is redesigned such that most
splits, especially the high-level ones, are robust. For the non-robust
splits, all subtree variants are grown from all split candidates and
maintained until a removal request would revise that split. When that
happens, the split is switched to its variant with higher Gini gain. As
a result, the unlearning process involves recalculating the Gini gains
and updating the splits if necessary.</p>
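<p>The notion of a robust split can be illustrated with a brute-force check on a toy one-dimensional dataset. This is only a sketch: it enumerates every removal of up to <code>k</code> items, which is exponential in <code>k</code>, whereas the method of Schelter et al. is designed to avoid such enumeration. The helper names and the binary-label assumption are illustrative.</p>

```python
from itertools import combinations

def gini(labels):
    """Gini impurity for binary labels in {0, 1}."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2.0 * p1 * (1.0 - p1)

def gini_gain(data, threshold):
    """Impurity reduction from splitting (x, y) pairs at x <= threshold."""
    left = [y for x, y in data if x <= threshold]
    right = [y for x, y in data if x > threshold]
    n = len(data)
    return (gini([y for _, y in data])
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))

def best_split(data, thresholds):
    return max(thresholds, key=lambda t: gini_gain(data, t))

def is_robust(data, thresholds, k):
    """True if no removal of up to k items lets another candidate
    strictly beat the currently chosen split."""
    chosen = best_split(data, thresholds)
    for r in range(1, k + 1):
        for removed in combinations(range(len(data)), r):
            rest = [d for i, d in enumerate(data) if i not in removed]
            if not rest:
                continue
            chosen_gain = gini_gain(rest, chosen)
            if any(gini_gain(rest, t) > chosen_gain + 1e-12 for t in thresholds):
                return False
    return True

clean = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 1), (8, 1)]
noisy = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1)]
print(best_split(clean, [2.5, 4.5, 6.5]), is_robust(clean, [2.5, 4.5, 6.5], k=1))
print(best_split(noisy, [3.5, 5.5]), is_robust(noisy, [3.5, 5.5], k=1))
```

<p>In the noisy dataset, removing the single item at <code>x = 4</code> makes the threshold 5.5 a perfect split, so the chosen split at 3.5 is non-robust and its variant would need to be maintained.</p>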
<p>One limitation of this approach is that if the set to be forgotten is
too large, there might be many non-robust splits. This would lead to
high storage costs for the subtree variants. However, it does give a
parameterized choice between unlearning and retraining. If there are
many removal requests, retraining might be the better option asymptotically.
Alternatively, one might limit the maximum number of removal requests to
be processed at a time. Moreover, tree-based models have highly
competitive performance for many predictive applications&#xA0;<a href="#ref-schelter2021hedgecut">(Schelter,
Grafberger, and Dunning 2021)</a>.</p>
<p><strong>Unlearning for Bayesian Models.</strong> Bayesian models are
probabilistic models that approximate a posterior likelihood&#xA0;<a href="#ref-fu2022knowledge fu2021bayesian jose2021unified nguyen2020variational">(Fu,
He, et al. 2022; Fu et al. 2021; Jose and Simeone 2021; Q. P. Nguyen,
Low, and Jaillet 2020)</a>. Also known as Bayesian inference, this
process is particularly useful when a loss function is not well-defined
or does not even exist. Bayesian models cover a wide range of machine
learning algorithms, such as Bayesian neural networks, probabilistic
graphical models, generative models, topic modeling, and probabilistic
matrix factorization&#xA0;<a href="#ref-zhang2020deep roth2018bayesian pearce2020uncertainty">(H.
Zhang et al. 2020; Roth and Pernkopf 2018; Pearce, Leibfried, and
Brintrup 2020)</a>.</p>
<p>Unlearning for Bayesian models requires a special treatment, as the
training already involves optimizing the posterior distribution of the
model&#x2019;s parameters. It also often involves optimizing the
Kullback-Leibler divergence between a prior belief and the posterior
distribution&#xA0;<a href="#ref-nguyen2020variational">(Q. P. Nguyen, Low, and Jaillet
2020)</a>. Nguyen et al.&#xA0;<a href="#ref-nguyen2020variational">(Q. P. Nguyen, Low, and Jaillet
2020)</a> proposed the notion of <em>exact Bayesian unlearning</em>:
<span
class="math display"><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)&#x2004;=&#x2004;<em>P</em><em>r</em>(<em>w</em>|<em>D</em>)<em>P</em><em>r</em>(<em>D</em><sub><em>f</em></sub>|<em>D</em><sub><em>r</em></sub>)/<em>P</em><em>r</em>(<em>D</em><sub><em>f</em></sub>|<em>w</em>)&#x2004;&#x221D;&#x2004;<em>P</em><em>r</em>(<em>w</em>|<em>D</em>)/<em>P</em><em>r</em>(<em>D</em><sub><em>f</em></sub>|<em>w</em>)</span>
where <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)</span>
is the distribution of a retrained model (as if it were trained only on
<span class="math inline"><em>D</em><sub><em>r</em></sub></span>).
However, the posterior distribution <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)</span>
can only be sampled directly when the model parameters are
discrete-valued (quantized) or the prior is conjugate&#xA0;<a href="#ref-nguyen2020variational">(Q. P. Nguyen, Low,
and Jaillet 2020)</a>. For non-conjugate priors, Nguyen et al.&#xA0;<a href="#ref-nguyen2020variational">(Q. P. Nguyen, Low,
and Jaillet 2020)</a> proved that we can approximate <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)</span>
by minimizing the KL divergence between <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em>)</span>
and <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em><sub><em>r</em></sub>)</span>.
Since <span
class="math inline"><em>P</em><em>r</em>(<em>w</em>|<em>D</em>)</span>
is the original model&#x2019;s parameter distribution, this approximation
prevents catastrophic unlearning. As such, the retained model performs
significantly better than the unlearned model in terms of accuracy.</p>
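<p>For a conjugate prior, the exact rule above reduces to simple bookkeeping. The sketch below assumes a Beta prior with a Bernoulli likelihood, a textbook conjugate pair chosen here purely for illustration: dividing the posterior by the likelihood of <em>D</em><sub><em>f</em></sub> amounts to subtracting the forgotten counts from the Beta parameters. The function names are hypothetical.</p>

```python
def beta_posterior(prior, data):
    """Posterior Beta(a, b) parameters after observing 0/1 outcomes."""
    a, b = prior
    ones = sum(data)
    return (a + ones, b + len(data) - ones)

def unlearn(posterior, forget):
    """Divide out the likelihood of the forgotten outcomes,
    i.e. Pr(w | D_r) is proportional to Pr(w | D) / Pr(D_f | w)."""
    a, b = posterior
    ones = sum(forget)
    return (a - ones, b - (len(forget) - ones))

prior = (1, 1)          # uniform Beta(1, 1) prior
d_r = [1, 0, 1, 1]      # data to retain
d_f = [0, 1]            # data to forget

full = beta_posterior(prior, d_r + d_f)
unlearned = unlearn(full, d_f)
retrained = beta_posterior(prior, d_r)
print(full, unlearned, retrained)
```

<p>Here the unlearned posterior is identical to the retrained one. Without conjugacy there is no such closed form, which is what motivates the variational approximation described above.</p>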
<p>A notion of certified Bayesian unlearning has also been studied,
where the KL divergence between the unlearned model and the retrained
model is bounded&#xA0;<a href="#ref-fu2022knowledge fu2021bayesian jose2021unified">(Fu, He, et
al. 2022; Fu et al. 2021; Jose and Simeone 2021)</a>: <span
class="math display"><em>K</em><em>L</em>(<em>P</em><em>r</em>(<em>A</em>(<em>D</em><sub><em>r</em></sub>)),&#x1D53C;<sub><em>A</em>(<em>D</em>)</sub><em>P</em><em>r</em>(<em>U</em>(<em>D</em>,<em>D</em><sub><em>f</em></sub>,<em>A</em>(<em>D</em>))))&#x2004;&#x2264;&#x2004;<em>&#x3F5;</em></span>
Here, the result of the unlearning process is an expectation over the
parameter distribution of the original model <span
class="math inline"><em>A</em>(<em>D</em>)&#x2004;&#x223C;&#x2004;<em>P</em><em>r</em>(<em>w</em>|<em>D</em>)</span>.
This certification can be achieved for some energy functions when
formulating the evidence lower bound (ELBO) in Bayesian models&#xA0;<a href="#ref-fu2022knowledge fu2021bayesian jose2021unified">(Fu, He, et
al. 2022; Fu et al. 2021; Jose and Simeone 2021)</a>.</p>
<p><strong>Unlearning for DNN-based Models.</strong> Deep neural
networks are advanced models that automatically learn features from
data. As a result, it is very difficult to pinpoint the exact model
update for each data item&#xA0;<a href="#ref-golatkar2020forgetting golatkar2020eternal mehta2022deep he2021deepobliviate goyal2021revisiting">(Golatkar,
Achille, and Soatto 2020b, 2020a; Mehta et al. 2022; He et al. 2021;
Goyal, Hassija, and Albuquerque 2021)</a>. Fortunately, deep neural
networks consist of multiple layers. For layers with convex activation
functions, existing unlearning methods such as certified removal
mechanisms can be applied&#xA0;<a href="#ref-GuoGHM20 neel2021descent sekhari2021remember cao2022machine">(C.
Guo et al. 2020; Neel, Roth, and Sharifi-Malvajerdi 2021; Sekhari et al.
2021; Z. Cao et al. 2022)</a>. For non-convex layers, Golatkar et
al.&#xA0;<a href="#ref-golatkar2021mixed golatkar2020forgetting">(Golatkar et al.
2021; Golatkar, Achille, and Soatto 2020b)</a> proposed a caching
approach that trains the model on data that are known a priori to be
permanent. Then the model is fine-tuned on user data using some convex
optimization.</p>
<p>Sophisticated unlearning methods for DNNs rely primarily on influence
functions&#xA0;<a href="#ref-koh2017understanding zhang2022machine">(Koh et al. 2017;
P.-F. Zhang et al. 2022)</a>. Here, Taylor expansions are used to
approximate the impact of a data item on the parameters of black-box
models&#xA0;<a href="#ref-zeng2021learning">(Zeng et al.
2021)</a>. Some variants include DeltaGrad&#xA0;<a href="#ref-wu2020deltagrad">(Y. Wu et al. 2020)</a>, which stores
the historical updates for each data item, and Fisher-based
unlearning&#xA0;<a href="#ref-golatkar2020eternal">(Golatkar, Achille, and Soatto
2020a)</a>, which we discussed under the <a href="#sec:model-agnostic"
data-reference-type="autoref"
data-reference="sec:model-agnostic">model-agnostic approaches</a>. However,
influence functions in deep neural networks are not stable with a large
forget set&#xA0;<a href="#ref-basu2021influence mahadevan2021certifiable mahadevan2022certifiable">(Basu,
Pope, and Feizi 2021; Mahadevan and Mathioudakis 2021, 2022)</a>.</p>
<p>More precisely, after the data to be forgotten has been deleted from
the database, Fisher-based unlearning&#xA0;<a href="#ref-golatkar2020eternal">(Golatkar, Achille, and Soatto
2020a)</a> works on the remaining training data with Newton&#x2019;s
method, which uses second-order derivative information. To mitigate potential
information leaks, noise is injected into the model&#x2019;s parameters&#xA0;<a href="#ref-conggrapheditor">(Cong and Mahdavi
2022a)</a>. As the Fisher-based method aims to approximate the model
without the deleted data, there can be no guarantee that all the
influence of the deleted data has been removed. Although injecting noise
can help mitigate information leaks, the model&#x2019;s performance may be
affected by the noise&#xA0;<a href="#ref-conggrapheditor">(Cong and Mahdavi 2022a)</a>.</p>
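<p>The role of the Newton step can be sketched on a deliberately small convex problem. The example below assumes a one-parameter, L2-regularized logistic regression rather than a deep network, and it omits the noise injection; it only shows that a single Newton step on the remaining data moves the original parameter towards the retrained one without matching it exactly. All names and data values are hypothetical.</p>

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_hess(w, data, lam):
    """Gradient and Hessian of sum_i log(1 + exp(-y_i x_i w)) + lam/2 * w^2,
    with labels y_i in {-1, +1}."""
    g, h = lam * w, lam
    for x, y in data:
        s = sigmoid(y * x * w)
        g += -(1.0 - s) * y * x
        h += s * (1.0 - s) * x * x
    return g, h

def newton_step(w, data, lam):
    g, h = grad_hess(w, data, lam)
    return w - g / h

def train(data, lam, steps=50):
    w = 0.0
    for _ in range(steps):
        w = newton_step(w, data, lam)
    return w

lam = 1.0
data = [(0.5, 1), (1.0, 1), (-0.8, -1), (0.3, -1), (-0.6, 1), (0.9, 1)]
remaining = data[:1] + data[2:]      # forget the second point

w_full = train(data, lam)
w_retrained = train(remaining, lam)
w_unlearned = newton_step(w_full, remaining, lam)  # one step, no retraining
print(w_full, w_unlearned, w_retrained)
```

<p>The residual gap between <code>w_unlearned</code> and <code>w_retrained</code> is the reason such updates are approximate; in the Fisher-based method it is this kind of residual that the injected noise is meant to mask.</p>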
<p>Golatkar et al.&#xA0;<a href="#ref-golatkar2020eternal">(Golatkar, Achille, and Soatto
2020a)</a> point out that the Hessian computation in certified
removal mechanisms is too expensive for complex models like deep neural
networks. Hence, they resorted to approximating the Hessian via the
Levenberg-Marquardt positive-semidefinite approximation, which turns
out to correspond to the Fisher Information Matrix&#xA0;<a href="#ref-martens2020new">(Martens 2020)</a>.
Although it does not provide a concrete theoretical guarantee,
Fisher-based unlearning could lead to further information-theoretic
approaches to machine unlearning&#xA0;<a href="#ref-guo2022efficient golatkar2020forgetting">(T. Guo et al.
2022; Golatkar, Achille, and Soatto 2020b)</a>.</p>

| **Paper Title** | **Year** | **Author** | **Venue** | **Model** | **Code** | **Type** |
| --------------- | :----: | ---- | :----: | :----: | :----: | :----: |
| [Towards Adversarial Evaluations for Inexact Machine Unlearning](https://arxiv.org/abs/2201.06640) | 2023 | Goel et al. | _arXiv_ | EU-k, CF-k | [[Code]](https://github.com/shash42/Evaluating-Inexact-Unlearning) |
| [On the Trade-Off between Actionable Explanations and the Right to be Forgotten](https://openreview.net/pdf?id=HWt4BBZjVW) | 2023 | Pawelczyk et al. | _arXiv_ | - | - |  |
| [Towards Unbounded Machine Unlearning](https://arxiv.org/pdf/2302.09880) | 2023 | Kurmanji et al. | _arXiv_ | SCRUB | [[Code]](https://github.com/Meghdad92/SCRUB) | approximate unlearning |
| [Netflix and Forget: Efficient and Exact Machine Unlearning from Bi-linear Recommendations](https://arxiv.org/abs/2302.06676) | 2023 | Xu et al. | _arXiv_ | Unlearn-ALS | - | Exact Unlearning |
| [To Be Forgotten or To Be Fair: Unveiling Fairness Implications of Machine Unlearning Methods](https://arxiv.org/abs/2302.03350) | 2023 | Zhang et al. | _arXiv_ | - | [[Code]](https://github.com/cleverhans-lab/machine-unlearning) | |
| [Sequential Informed Federated Unlearning: Efficient and Provable Client Unlearning in Federated Optimization](https://arxiv.org/abs/2211.11656) | 2022 | Fraboni et al. | _arXiv_ | SIFU | - | |
| [Certified Data Removal in Sum-Product Networks](https://arxiv.org/abs/2210.01451) | 2022 | Becker and Liebig | _ICKG_ | UNLEARNSPN | [[Code]](https://github.com/ROYALBEFF/UnlearnSPN) | Certified Removal Mechanisms |
| [Learning with Recoverable Forgetting](https://arxiv.org/abs/2207.08224) | 2022 | Ye et al.  | _ECCV_ | LIRF | - |  |
| [Continual Learning and Private Unlearning](https://arxiv.org/abs/2203.12817) | 2022 | Liu et al. | _CoLLAs_ | CLPU | [[Code]](https://github.com/Cranial-XIX/Continual-Learning-Private-Unlearning) | |
| [Verifiable and Provably Secure Machine Unlearning](https://arxiv.org/abs/2210.09126) | 2022 | Eisenhofer et al. | _arXiv_ | - | [[Code]](https://github.com/cleverhans-lab/verifiable-unlearning) |  Certified Removal Mechanisms |
| [VeriFi: Towards Verifiable Federated Unlearning](https://arxiv.org/abs/2205.12709) | 2022 | Gao et al. | _arXiv_ | VERIFI | - | Certified Removal Mechanisms |
| [FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information](https://arxiv.org/abs/2210.10936) | 2022 | Cao et al. | _S&P_ | FedRecover | - | recovery method |
| [Fast Yet Effective Machine Unlearning](https://arxiv.org/abs/2111.08947) | 2022 | Tarun et al. | _arXiv_ | UNSIR | - |  |
| [Membership Inference via Backdooring](https://arxiv.org/abs/2206.04823) | 2022 | Hu et al.  | _IJCAI_ | MIB | [[Code]](https://github.com/HongshengHu/membership-inference-via-backdooring) | Membership Inferencing |
| [Forget Unlearning: Towards True Data-Deletion in Machine Learning](https://arxiv.org/abs/2210.08911) | 2022 | Chourasia et al. | _ICLR_ | - | - | noisy gradient descent |
| [Zero-Shot Machine Unlearning](https://arxiv.org/abs/2201.05629) | 2022 | Chundawat et al. | _arXiv_ | - | - |  |
| [Efficient Attribute Unlearning: Towards Selective Removal of Input Attributes from Feature Representations](https://arxiv.org/abs/2202.13295) | 2022 | Guo et al. | _arXiv_ | attribute unlearning | - |  |
| [Few-Shot Unlearning](https://download.huan-zhang.com/events/srml2022/accepted/yoon22fewshot.pdf) | 2022 | Yoon et al.   | _ICLR_ | - | - |  |
| [Federated Unlearning: How to Efficiently Erase a Client in FL?](https://arxiv.org/abs/2207.05521) | 2022 | Halimi et al. | _UpML Workshop_ | - | - | federated learning |
| [Machine Unlearning Method Based On Projection Residual](https://arxiv.org/abs/2209.15276) | 2022 | Cao et al. | _DSAA_ | - | - |  Projection Residual Method |
| [Hard to Forget: Poisoning Attacks on Certified Machine Unlearning](https://ojs.aaai.org/index.php/AAAI/article/view/20736) | 2022 | Marchant et al. | _AAAI_ | - | [[Code]](https://github.com/ngmarchant/attack-unlearning) | Certified Removal Mechanisms |
| [Athena: Probabilistic Verification of Machine Unlearning](https://web.archive.org/web/20220721061150id_/https://petsymposium.org/popets/2022/popets-2022-0072.pdf) | 2022 | Sommer et al. | _PoPETs_ | ATHENA | - | |
| [FP2-MIA: A Membership Inference Attack Free of Posterior Probability in Machine Unlearning](https://link.springer.com/chapter/10.1007/978-3-031-20917-8_12) | 2022 | Lu et al. | _ProvSec_ | FP2-MIA | - | inference attack |
| [Deletion Inference, Reconstruction, and Compliance in Machine (Un)Learning](https://arxiv.org/abs/2202.03460) | 2022 | Gao et al. | _PETS_ | - | - |  |
| [Prompt Certified Machine Unlearning with Randomized Gradient Smoothing and Quantization](https://openreview.net/pdf?id=ue4gP8ZKiWb) | 2022 | Zhang et al.   | _NeurIPS_ | PCMU | - | Certified Removal Mechanisms |
| [The Right to be Forgotten in Federated Learning: An Efficient Realization with Rapid Retraining](https://arxiv.org/abs/2203.07320) | 2022 | Liu et al. | _INFOCOM_ | - | [[Code]](https://github.com/yiliucs/federated-unlearning) |  |
| [Backdoor Defense with Machine Unlearning](https://arxiv.org/abs/2201.09538) | 2022 | Liu et al. | _INFOCOM_ | BAERASER | - | Backdoor defense |
| [Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten](https://dl.acm.org/doi/abs/10.1145/3488932.3517406) | 2022 | Nguyen et al. | _ASIA CCS_ | MCU | - | MCMC Unlearning  |
| [Federated Unlearning for On-Device Recommendation](https://arxiv.org/abs/2210.10958) | 2022 | Yuan et al. | _arXiv_ | - | - |  |
| [Can Bad Teaching Induce Forgetting? Unlearning in Deep Networks using an Incompetent Teacher](https://arxiv.org/abs/2205.08096) | 2022 | Chundawat et al. | _arXiv_ | - | - | Knowledge Adaptation |
| [ Efficient Two-Stage Model Retraining for Machine Unlearning](https://openaccess.thecvf.com/content/CVPR2022W/HCIS/html/Kim_Efficient_Two-Stage_Model_Retraining_for_Machine_Unlearning_CVPRW_2022_paper.html) | 2022 | Kim and Woo | _CVPR Workshop_ | - | - |  |
| [Learn to Forget: Machine Unlearning Via Neuron Masking](https://ieeexplore.ieee.org/abstract/document/9844865?casa_token=_eowH3BTt1sAAAAA:X0uCpLxOwcFRNJHoo3AtA0ay4t075_cSptgTMznsjusnvgySq-rJe8GC285YhWG4Q0fUmP9Sodw0) | 2021 | Ma et al. | _IEEE_ | Forsaken | - | Mask Gradients |
| [Adaptive Machine Unlearning](https://proceedings.neurips.cc/paper/2021/hash/87f7ee4fdb57bdfd52179947211b7ebb-Abstract.html) | 2021 | Gupta et al. | _NeurIPS_ | - | [[Code]](https://github.com/ChrisWaites/adaptive-machine-unlearning) | Differential Privacy |
| [Descent-to-Delete: Gradient-Based Methods for Machine Unlearning](https://proceedings.mlr.press/v132/neel21a.html) | 2021 | Neel et al. | _ALT_ | - | - | Certified Removal Mechanisms |
| [Remember What You Want to Forget: Algorithms for Machine Unlearning](https://arxiv.org/abs/2103.03279) | 2021 | Sekhari et al. | _NeurIPS_ | - | - |  |
| [FedEraser: Enabling Efficient Client-Level Data Removal from Federated Learning Models](https://ieeexplore.ieee.org/abstract/document/9521274) | 2021 | Liu et al. | _IWQoS_ | FedEraser | - |  |
| [Federated Unlearning](https://arxiv.org/abs/2012.13891) | 2021 | Liu et al. | _IWQoS_ | FedEraser | [[Code]](https://www.dropbox.com/s/1lhx962axovbbom/FedEraser-Code.zip?dl=0) |  |
| [Machine Unlearning via Algorithmic Stability](https://proceedings.mlr.press/v134/ullah21a.html) | 2021 | Ullah et al. | _COLT_ | TV | - | Certified Removal Mechanisms |
| [EMA: Auditing Data Removal from Trained Models](https://link.springer.com/chapter/10.1007/978-3-030-87240-3_76) | 2021 | Huang et al. | _MICCAI_ | EMA | [[Code]](https://github.com/Hazelsuko07/EMA) | Certified Removal Mechanisms |
| [Knowledge-Adaptation Priors](https://proceedings.neurips.cc/paper/2021/hash/a4380923dd651c195b1631af7c829187-Abstract.html) | 2021 | Khan and Swaroop | _NeurIPS_ | K-prior | [[Code]](https://github.com/team-approx-bayes/kpriors) | Knowledge Adaptation |
| [PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models](https://dl.acm.org/doi/abs/10.1145/3318464.3380571) | 2020 | Wu et al. | _SIGMOD_ | PrIU | - | Knowledge Adaptation |
| [Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks](https://arxiv.org/abs/1911.04933) | 2020 | Golatkar et al. | _CVPR_ | - | - | Certified Removal Mechanisms |
| [Learn to Forget: User-Level Memorization Elimination in Federated Learning](https://www.researchgate.net/profile/Ximeng-Liu-5/publication/340134612_Learn_to_Forget_User-Level_Memorization_Elimination_in_Federated_Learning/links/5e849e64a6fdcca789e5f955/Learn-to-Forget-User-Level-Memorization-Elimination-in-Federated-Learning.pdf) | 2020 | Liu et al. | _arXiv_ | Forsaken | - |  |
| [Certified Data Removal from Machine Learning Models](https://proceedings.mlr.press/v119/guo20c.html) | 2020 | Guo et al. | _ICML_ | - | - | Certified Removal Mechanisms |
| [Class Clown: Data Redaction in Machine Unlearning at Enterprise Scale](https://arxiv.org/abs/2012.04699) | 2020 | Felps et al. | _arXiv_ | - | - | Decremental Learning |
| [A Novel Online Incremental and Decremental Learning Algorithm Based on Variable Support Vector Machine](https://link.springer.com/article/10.1007/s10586-018-1772-4) | 2019 | Chen et al. | _Cluster Computing_ | - | - | Decremental Learning  |
| [Making AI Forget You: Data Deletion in Machine Learning](https://papers.nips.cc/paper/2019/hash/cb79f8fa58b91d3af6c9c991f63962d3-Abstract.html) | 2019 | Ginart et al. | _NeurIPS_ | - | - | Decremental Learning  |
| [Lifelong Anomaly Detection Through Unlearning](https://dl.acm.org/doi/abs/10.1145/3319535.3363226) | 2019 | Du et al. | _CCS_ | - | - |  |
| [Learning Not to Learn: Training Deep Neural Networks With Biased Data](https://openaccess.thecvf.com/content_CVPR_2019/html/Kim_Learning_Not_to_Learn_Training_Deep_Neural_Networks_With_Biased_CVPR_2019_paper.html) | 2019 | Kim et al. | _CVPR_ | - | - |  |
| [Efficient Repair of Polluted Machine Learning Systems via Causal Unlearning](https://dl.acm.org/citation.cfm?id=3196517) | 2018 | Cao et al. | _ASIACCS_ | KARMA | [[Code]](https://github.com/CausalUnlearning/KARMA) |  |
| [Understanding Black-box Predictions via Influence Functions](https://proceedings.mlr.press/v70/koh17a.html) | 2017 | Koh et al. | _ICML_ | - | [[Code]](https://github.com/kohpangwei/influence-release) | Certified Removal Mechanisms |
| [Towards Making Systems Forget with Machine Unlearning](https://ieeexplore.ieee.org/abstract/document/7163042) | 2015 | Cao and Yang | _S&P_ | - |  |
| [Towards Making Systems Forget with Machine Unlearning](https://dl.acm.org/doi/10.1109/SP.2015.35) | 2015 | Cao et al. | _S&P_ | - | - | Statistical Query Learning  |
| [Incremental and decremental training for linear classification](https://dl.acm.org/doi/10.1145/2623330.2623661) | 2014 | Tsai et al. | _KDD_ | - | [[Code]](https://www.csie.ntu.edu.tw/~cjlin/papers/ws/) | Decremental Learning  |
| [Multiple Incremental Decremental Learning of Support Vector Machines](https://dl.acm.org/doi/10.5555/2984093.2984196) | 2009 | Karasuyama et al. | _NIPS_ | - | - | Decremental Learning  |
| [Incremental and Decremental Learning for Linear Support Vector Machines](https://dl.acm.org/doi/10.5555/1776814.1776838) | 2007 | Romero et al. | _ICANN_ | - | - | Decremental Learning  |
| [Decremental Learning Algorithms for Nonlinear Langrangian and Least Squares Support Vector Machines](https://www.semanticscholar.org/paper/Decremental-Learning-Algorithms-for-Nonlinear-and-Duan-Li/312c677f0882d0dfd60bfd77346588f52aefd10f) | 2007 | Duan et al. | _OSB_ | - | - | Decremental Learning  |
| [Multicategory Incremental Proximal Support Vector Classifiers](https://link.springer.com/chapter/10.1007/978-3-540-45224-9_54) | 2003 | Tveit et al. | _KES_ | - | - | Decremental Learning  |
| [Incremental and Decremental Proximal Support Vector Classification using Decay Coefficients](https://link.springer.com/chapter/10.1007/978-3-540-45228-7_42) | 2003 | Tveit et al. | _DaWak_ | - | - | Decremental Learning  |
| [Incremental and Decremental Support Vector Machine Learning](https://dl.acm.org/doi/10.5555/3008751.3008808) | 2000 | Cauwenberghs et al. | _NeurIPS_ | - | - | Decremental Learning  |
----------

<h2 id="data-driven-approaches">4.3. Data-Driven Approaches</h2>

The approaches that fall into this category use data partitioning, data augmentation, and data influence to speed up the retraining process. Methods of attack by data manipulation (e.g., data poisoning) are also included for reference.

<div class="figure*">
<figure>
<img src="https://raw.githubusercontent.com/tamlhp/awesome-machine-unlearning/main/figs/data-driven.png" alt="https://arxiv.org/abs/2209.02299" style="max-width: 60%;"/>
</figure>
</div>

<p><strong>Data Partitioning (Efficient Retraining).</strong> The
approaches falling into this category use data partitioning mechanisms
to speed up the retraining process. Alternatively, they partially
retrain the model with some bounds on accuracy. Bourtoule et al.&#xA0;<a href="#ref-bourtoule2021machine">(Bourtoule et al.
2021)</a> proposed the well-known SISA framework (<a
href="#fig:partition" data-reference-type="autoref"
data-reference="fig:partition">[fig:partition]</a>) that partitions the
data into shards and slices. Each shard has a single model, and the
final output is an aggregation of multiple models over these shards. For
each slice of a shard, a model checkpoint is stored during training so
that a new model can be retrained from an intermediate state&#xA0;<a href="#ref-bourtoule2021machine aldaghri2021coded">(Bourtoule et al.
2021; Aldaghri, Mahdavifar, et al. 2021)</a>.</p>

<figure id="fig:partition">
<img src="https://raw.githubusercontent.com/tamlhp/awesome-machine-unlearning/main/kaggle/partition.png"
alt="Efficient retraining for machine unlearning using data partition" style="max-width: 80%;" />
<figcaption aria-hidden="true">Efficient retraining for machine
unlearning using data partition</figcaption>
</figure>
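<p>The sharding and slicing idea can be sketched with a toy ensemble. To stay short, the sketch assumes each shard's "model" is just the running mean of its targets, so a checkpoint is a <code>(sum, count)</code> pair; the real SISA framework trains arbitrary models per shard. The class name, the round-robin assignment, and the mean-of-means aggregation are illustrative assumptions.</p>

```python
class SisaMeanEnsemble:
    """Toy SISA-style ensemble with one running-mean model per shard."""

    def __init__(self, data, n_shards, n_slices):
        # shards[s][r] holds the values in slice r of shard s.
        self.shards = [[[] for _ in range(n_slices)] for _ in range(n_shards)]
        for i, y in enumerate(data):
            self.shards[i % n_shards][(i // n_shards) % n_slices].append(y)
        # checkpoints[s][r] is the model state after training on slices 0..r.
        self.checkpoints = [[None] * n_slices for _ in range(n_shards)]
        self.slices_trained = 0
        for s in range(n_shards):
            self._train_shard(s, start_slice=0)

    def _train_shard(self, s, start_slice):
        # Resume from the checkpoint just before the first affected slice.
        state = (0.0, 0) if start_slice == 0 else self.checkpoints[s][start_slice - 1]
        for r in range(start_slice, len(self.shards[s])):
            total, count = state
            for y in self.shards[s][r]:
                total += y
                count += 1
            state = (total, count)
            self.checkpoints[s][r] = state
            self.slices_trained += 1

    def predict(self):
        # Aggregate the shard models by averaging their predictions.
        finals = [cp[-1] for cp in self.checkpoints]
        means = [total / count for total, count in finals if count > 0]
        return sum(means) / len(means)

    def unlearn(self, y):
        for s, shard in enumerate(self.shards):
            for r, slice_values in enumerate(shard):
                if y in slice_values:
                    slice_values.remove(y)
                    self._train_shard(s, start_slice=r)  # only this shard
                    return
        raise ValueError("item not found")

ensemble = SisaMeanEnsemble(list(range(1, 13)), n_shards=3, n_slices=2)
before = ensemble.predict()
ensemble.unlearn(10)
print(before, ensemble.predict(), ensemble.slices_trained)
```

<p>Initial training touches all six slices, while unlearning the value 10 retrains only one slice of one shard; the other two shard models are reused unchanged.</p>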

<p><strong>Data Augmentation (Error-manipulation noise).</strong> Data
augmentation is the process of enriching or adding more data to support
a model&#x2019;s training&#xA0;<a href="#ref-yu2021does">(D. Yu
et al. 2021)</a>. Such mechanisms can be used to support machine
unlearning as well. Huang et al.&#xA0;<a href="#ref-huang2021unlearnable">(H. Huang et al. 2021)</a> proposed
the idea of error-minimizing noise, which tricks a model into thinking
that there is nothing to be learned from a given set of data (i.e., the
loss does not change). However, it can only be used to protect a
particular data item before the model is trained. A similar setting was
also studied by Fawkes&#xA0;<a href="#ref-shan2020protecting">(Shan et al. 2020)</a>, in which a
targeted adversarial attack is used to ensure the model does not learn
anything from a targeted data item.</p>
<p>Conversely, Tarun et al.&#xA0;<a href="#ref-tarun2021fast">(Tarun et al. 2021)</a> proposed
error-maximizing noise to impair the model on a target class of data (to
be forgotten). However, this tactic does not work on specific data items
as it is easier to interfere with a model&#x2019;s prediction on a whole class
as opposed to a specific data item of that class&#xA0;<a href="#ref-tarun2021fast">(Tarun et al. 2021)</a>.</p>
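<p>Both kinds of noise come from the same mechanism: optimizing an input against a model's loss, by descent for error-minimizing noise and by ascent for error-maximizing noise. The sketch below assumes a frozen linear logistic classifier with made-up weights, which is a strong simplification of the deep networks used in the cited works, and it does not include any later fine-tuning of the model on the noise. All names are hypothetical.</p>

```python
import math

def logistic_loss(w, b, x, y):
    """Loss of a fixed linear classifier on input x with label y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log1p(math.exp(-margin))

def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x (not the weights)."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    scale = -y / (1.0 + math.exp(margin))
    return [scale * wi for wi in w]

def optimize_noise(w, b, y, steps=20, lr=0.5, maximize=True):
    """Gradient ascent (error-maximizing) or descent (error-minimizing)
    on a noise input, starting from zeros, with the model held fixed."""
    sign = 1.0 if maximize else -1.0
    x = [0.0] * len(w)
    for _ in range(steps):
        g = input_gradient(w, b, x, y)
        x = [xi + sign * lr * gi for xi, gi in zip(x, g)]
    return x

w, b, target = [2.0, -1.0], 0.0, 1
baseline = logistic_loss(w, b, [0.0, 0.0], target)
max_noise = optimize_noise(w, b, target, maximize=True)
min_noise = optimize_noise(w, b, target, maximize=False)
print(baseline, logistic_loss(w, b, max_noise, target),
      logistic_loss(w, b, min_noise, target))
```

<p>The error-maximizing noise drives the loss for the target class up, while the error-minimizing noise drives it towards zero so that the sample appears to carry nothing left to learn.</p>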
<p><strong>Data influence.</strong> This group of unlearning approaches
studies how a change in training data impacts a model&#x2019;s parameters&#xA0;<a href="#ref-wu2022puma conggrapheditor cao2022machine">(G. Wu, Hashemi,
and Srinivasa 2022; Cong and Mahdavi 2022a; Z. Cao et al. 2022)</a>,
where impact is computed using influence functions&#xA0;<a href="#ref-mahadevan2022certifiable chundawat2022zero">(Mahadevan and
Mathioudakis 2022; Chundawat et al. 2022b)</a>. However, influence
functions depend on the current state of a learning algorithm&#xA0;<a href="#ref-wu2022puma">(G. Wu, Hashemi, and Srinivasa
2022)</a>. To mitigate this issue, several works store a training
history of intermediate quantities (e.g., model parameters or gradients)
generated by each step of model training&#xA0;<a href="#ref-graves2021amnesiac neel2021descent wu2020deltagrad wu2020priu">(Graves,
Nagisetty, and Ganesh 2021; Neel, Roth, and Sharifi-Malvajerdi 2021; Y.
Wu et al. 2020; Y. Wu, Tannen, and Davidson 2020)</a>. Then, the
unlearning process becomes one of subtracting these historical updates.
However, the model&#x2019;s accuracy might degrade significantly due to
catastrophic unlearning&#xA0;<a href="#ref-nguyen2020variational">(Q. P. Nguyen, Low, and Jaillet
2020)</a> since the order in which the training data is fed matters
to the learning model. Moreover, the influence itself does not verify
whether the data to be forgotten is still included in the unlearned
model&#xA0;<a href="#ref-thudi2021necessity thudi2022unrolling">(Thudi, Jia, et al.
2022; Thudi, Deza, et al. 2022)</a>.</p>
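<p>The history-based idea can be sketched as follows. The example assumes a one-parameter linear model trained for a single epoch of mini-batch SGD on squared error and logs the parameter change of every batch; unlearning then subtracts the logged changes of the batches that contained the item. It is an illustration of the bookkeeping only, and the function names are hypothetical.</p>

```python
def train_with_history(data, batches, lr, w=0.0):
    """One epoch of mini-batch SGD on 0.5 * (w*x - y)^2, logging each update."""
    history = []  # list of (batch indices, parameter delta)
    for batch in batches:
        grad = sum((w * data[i][0] - data[i][1]) * data[i][0]
                   for i in batch) / len(batch)
        delta = -lr * grad
        w += delta
        history.append((batch, delta))
    return w, history

def unlearn(w, history, index):
    """Subtract the logged updates of every batch that contained the item."""
    return w - sum(delta for batch, delta in history if index in batch)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
batches = [(0, 1), (2, 3)]

w_final, history = train_with_history(data, batches, lr=0.05)
print(w_final)                          # after both batches
print(unlearn(w_final, history, 3))     # item 3 was in the last batch
print(unlearn(w_final, history, 0))     # item 0 was in the first batch
```

<p>Forgetting an item from the last batch simply rewinds that update. Forgetting an item from the first batch is only approximate, because the second update was computed from a parameter that had already been shaped by the forgotten batch; this order dependence is one source of the accuracy degradation noted above.</p>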
<p>Zeng et al.&#xA0;<a href="#ref-zeng2021learning">(Zeng et al. 2021)</a> suggested a new
method of modeling data influence by adding regularization terms into
the learning algorithm. Although this method is model-agnostic, it
requires intervening in the training process of the original
model. Moreover, it is only applicable to convex learning problems and
deep neural networks.</p>
<p>Peste et al.&#xA0;<a href="#ref-peste2021ssse">(Peste, Alistarh, and Lampert 2021)</a>
closed this gap by introducing a new Fisher-based unlearning method,
which can approximate the Hessian matrix. This method works for both
shallow and deep models, and for both convex and non-convex problems. The
idea is to efficiently compute the inverse of the Fisher
Information Matrix using rank-one updates. However, as the whole process
is approximate, there is no concrete guarantee on the unlearned
model.</p>
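<p>The rank-one update idea can be sketched with the Sherman-Morrison identity. The example assumes an unnormalized, damped empirical Fisher matrix <code>F = damping * I + sum_i g_i g_i^T</code> built from hypothetical per-sample gradients; it shows how the inverse can be maintained one gradient at a time without ever inverting a matrix directly. It is not claimed to be the exact procedure of Peste et al.</p>

```python
def sherman_morrison_update(inv, g):
    """Return (A + g g^T)^-1 given inv = A^-1, for symmetric A."""
    n = len(g)
    v = [sum(inv[i][j] * g[j] for j in range(n)) for i in range(n)]  # A^-1 g
    denom = 1.0 + sum(g[i] * v[i] for i in range(n))
    return [[inv[i][j] - v[i] * v[j] / denom for j in range(n)]
            for i in range(n)]

def inverse_empirical_fisher(grads, damping):
    """Inverse of damping * I + sum_i g_i g_i^T via successive rank-one updates."""
    n = len(grads[0])
    inv = [[(1.0 / damping if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    for g in grads:
        inv = sherman_morrison_update(inv, g)
    return inv

grads = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]  # made-up per-sample gradients
inv_fisher = inverse_empirical_fisher(grads, damping=1.0)
print(inv_fisher)
```

<p>For these gradients the damped Fisher matrix is <code>[[3, 1], [1, 6]]</code>, and the result agrees with its closed-form inverse. Each update costs a matrix-vector product and an outer product, compared with a full inversion from scratch.</p>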

| **Paper Title** | **Year** | **Author** | **Venue** | **Model** | **Code** | **Type** |
| --------------- | :----: | ---- | :----: | :----: | :----: | :----: |
| [Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks](https://arxiv.org/abs/2212.10717) | 2022 | Di et al. | _NeurIPS_ | - | [[Code]](https://github.com/Jimmy-di/camouflage-poisoning) | Data Poisoning |
| [Forget Unlearning: Towards True Data Deletion in Machine Learning](https://arxiv.org/pdf/2210.08911.pdf) | 2022 | Chourasia et al. | _ICLR_ | - | - | Data Influence |
| [ARCANE: An Efficient Architecture for Exact Machine Unlearning](https://www.ijcai.org/proceedings/2022/0556.pdf) | 2022 | Yan et al.  | _IJCAI_ | ARCANE | - | Data Partition |
| [PUMA: Performance Unchanged Model Augmentation for Training Data Removal](https://ojs.aaai.org/index.php/AAAI/article/view/20846) | 2022 | Wu et al. | _AAAI_ | PUMA | - | Data Influence |
| [Certifiable Unlearning Pipelines for Logistic Regression: An Experimental Study](https://www.mdpi.com/2504-4990/4/3/28) | 2022 | Mahadevan and Mathioudakis | _MAKE_ | - | [[Code]](https://version.helsinki.fi/mahadeva/unlearning-experiments) | Data Influence |
| [Zero-Shot Machine Unlearning](https://arxiv.org/abs/2201.05629) | 2022 | Chundawat et al. | _arXiv_ | - | - | Data Influence |
| [GRAPHEDITOR: An Efficient Graph Representation Learning and Unlearning Approach](https://congweilin.github.io/CongWeilin.io/files/GraphEditor.pdf) | 2022 | Cong and Mahdavi | - | GRAPHEDITOR | [[Code]](https://anonymous.4open.science/r/GraphEditor-NeurIPS22-856E/README.md) | Data Influence |
| [Fast Model Update for IoT Traffic Anomaly Detection with Machine Unlearning](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9927728) | 2022 | Fan et al. | _IEEE IoT-J_ | ViFLa | - | Data Partition |
| [Learning to Refit for Convex Learning Problems](https://arxiv.org/abs/2111.12545) | 2021 | Zeng et al. | _arXiv_ | OPTLEARN | - | Data Influence |
| [Fast Yet Effective Machine Unlearning](https://arxiv.org/abs/2111.08947) | 2021 | Ayush et al. | _arXiv_ | - | - | Data Augmentation |
| [Learning with Selective Forgetting](https://www.ijcai.org/proceedings/2021/0137.pdf) | 2021 | Shibata et al. | _IJCAI_ | - | - | Data Augmentation |
| [SSSE: Efficiently Erasing Samples from Trained Machine Learning Models](https://openreview.net/forum?id=GRMKEx3kEo) | 2021 | Peste et al. | _NeurIPS_ | SSSE | - | Data Influence |
| [How Does Data Augmentation Affect Privacy in Machine Learning?](https://arxiv.org/abs/2007.10567) | 2021 | Yu et al. | _AAAI_ | - | [[Code]](https://github.com/dayu11/MI_with_DA) | Data Augmentation |
| [Coded Machine Unlearning](https://ieeexplore.ieee.org/document/9458237) | 2021 | Aldaghri et al. | _IEEE_ | - | - | Data Partition |
| [Machine Unlearning](https://ieeexplore.ieee.org/document/9519428) | 2021 | Bourtoule et al. | _IEEE_ | SISA | [[Code]](https://github.com/cleverhans-lab/machine-unlearning) | Data Partition |
| [How Does Data Augmentation Affect Privacy in Machine Learning?](https://ojs.aaai.org/index.php/AAAI/article/view/17284/) | 2021 | Yu et al. | _AAAI_ | - | [[Code]](https://github.com/dayu11/MI_with_DA) | Data Augmentation |
| [Amnesiac Machine Learning](https://ojs.aaai.org/index.php/AAAI/article/view/17371) | 2021 | Graves et al. | _AAAI_ | AmnesiacML | [[Code]](https://github.com/lmgraves/AmnesiacML) | Data Influence |
| [Unlearnable Examples: Making Personal Data Unexploitable](https://arxiv.org/abs/2101.04898) | 2021 | Huang et al. | _ICLR_ | - | [[Code]](https://github.com/HanxunH/Unlearnable-Examples) | Data Augmentation |
| [Descent-to-Delete: Gradient-Based Methods for Machine Unlearning](https://proceedings.mlr.press/v132/neel21a.html) | 2021 | Neel et al. | _ALT_ | - | - | Data Influence |
| [Fawkes: Protecting Privacy against Unauthorized Deep Learning Models](https://dl.acm.org/doi/abs/10.5555/3489212.3489302) | 2020 | Shan et al. | _USENIX Sec. Sym._ | Fawkes | [[Code]](https://github.com/Shawn-Shan/fawkes) | Data Augmentation |
| [PrIU: A Provenance-Based Approach for Incrementally Updating Regression Models](https://dl.acm.org/doi/abs/10.1145/3318464.3380571) | 2020 | Wu et al. | _SIGMOD_ | PrIU/PrIU-opt | - | Data Influence |
| [DeltaGrad: Rapid retraining of machine learning models](https://proceedings.mlr.press/v119/wu20b.html) | 2020 | Wu et al. | _ICML_ | DeltaGrad | [[Code]](https://github.com/thuwuyinjun/DeltaGrad) | Data Influence |


----------

<h2 id="sec:metrics">5. Evaluation Metrics</h2>

| Metrics | Formula/Description | Usage |
| ---- | ---- | ---- |
| Accuracy | Accuracy of the unlearned model on the forget set and the retain set | Evaluating the predictive performance of the unlearned model |
| Completeness | The overlap (e.g., Jaccard distance) of the output space between the retrained and the unlearned model | Evaluating the indistinguishability between model outputs |
| Unlearn Time | The time taken to process an unlearning request | Evaluating the unlearning efficiency |
| Relearn Time | The number of epochs required for the unlearned model to reach the accuracy of the source model | Evaluating the unlearning efficiency (relearn with some data samples) |
| Layer-wise Distance | The weight difference between the original model and the retrained model | Evaluating the indistinguishability between model parameters |
| Activation Distance | An average of the L2-distance between the unlearned model and retrained model’s predicted probabilities on the forget set | Evaluating the indistinguishability between model outputs | 
| JS-Divergence | Jensen-Shannon divergence between the predictions of the unlearned and retrained model | Evaluating the indistinguishability between model outputs |
| Membership Inference Attack | Recall (#detected items / #forget items) | Verify the influence of forget data on the unlearned model |
| ZRF score | $\mathcal{ZRF} = 1 - \frac{1}{n_f}\sum\limits_{i=0}^{n_f} \mathcal{JS}(M(x_i), T_d(x_i))$ | The unlearned model should not intentionally give wrong output ($\mathcal{ZRF} = 0$) or random output ($\mathcal{ZRF} = 1$) on the forget item |
| Anamnesis Index (AIN) | $AIN = \frac{r_t (M_u, M_{orig}, \alpha)}{r_t (M_s, M_{orig}, \alpha)}$ | Zero-shot machine unlearning | 
| Epistemic Uncertainty | if $\mbox{i(w;D) > 0}$, then $\mbox{efficacy}(w;D) = \frac{1}{i(w; D)}$;<br />otherwise $\mbox{efficacy}(w;D) = \infty$ | How much information the model exposes |
| Model Inversion Attack | Visualization | Qualitative verifications and evaluations |

<p>The most commonly used metrics for measuring unlearning
performance include accuracy, completeness, unlearn time, distance, and
forgetting scores. Their formulas and common usage are summarized in the
table above. More detailed descriptions are given below.</p>
<p><strong>Accuracy.</strong> In machine unlearning, a model&#x2019;s accuracy
needs to be compared on three different datasets: (1) The set to be
forgotten. Since the behaviour of an unlearned model should mirror that
of a retrained model, its accuracy on the forgotten data should be
similar to that of the retrained model. (2) The retained set. The
accuracy on the retained set should be close to that of the original
model. (3) The test set. The unlearned model should still perform well
on a separate test dataset, comparably to the retrained model.</p>
<p><strong>Completeness.</strong> The influence of the to-be-removed
samples on the unlearned model must be completely eliminated.
Completeness, hence, measures the degree to which an unlearned model is
compatible with a retrained model&#xA0;<a href="#ref-cao2015towards">(Y. Cao and Yang 2015)</a>. If the
unlearned model gives similar predictions to a retrained model for all
samples, then feeding samples to the model or observing the model&#x2019;s
information cannot practically reveal the forgotten data or its
lineage. The final metric is often calculated as the overlap of the
output space (e.g., the Jaccard distance) between the unlearned model
and the retrained model. However, computing this metric is often
computationally expensive.</p>
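<p>As a minimal sketch of the overlap computation, assuming completeness is measured on hard label predictions over a shared list of samples, the Jaccard similarity and distance between the two models&#x2019; outputs could be computed as follows. The helper name <code>jaccard_overlap</code> and the toy predictions are hypothetical and are not taken from the cited work.</p>

```python
def jaccard_overlap(preds_unlearned, preds_retrained):
    """Jaccard similarity between two models' (sample index, label) pairs."""
    if len(preds_unlearned) != len(preds_retrained):
        raise ValueError("both models must be evaluated on the same samples")
    pairs_u = set(enumerate(preds_unlearned))
    pairs_r = set(enumerate(preds_retrained))
    union = pairs_u | pairs_r
    if not union:
        return 1.0  # two empty output sets are treated as identical
    return len(pairs_u & pairs_r) / len(union)


# Toy predictions on five samples (illustrative values only).
unlearned_preds = [0, 1, 1, 2, 0]
retrained_preds = [0, 1, 2, 2, 0]

similarity = jaccard_overlap(unlearned_preds, retrained_preds)
distance = 1.0 - similarity  # Jaccard distance
print(f"similarity={similarity:.3f}, distance={distance:.3f}")
```

<p>In this toy case the two models disagree on one of five samples, so four pairs are shared out of six distinct pairs. A similarity near 1 (distance near 0) would suggest that the unlearned model behaves like the retrained one on the evaluated samples.</p>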
<p><strong>Unlearning time and Retraining time.</strong> Timeliness
quantifies the time saved when using unlearning instead of retraining
to update a model. The quicker the system restores privacy, security, and
usefulness, the more timely the unlearning process. In particular,
retraining uses the whole training set to execute the learning
algorithm, whereas unlearning executes the learning algorithm on a
limited number of summations; hence, unlearning is quicker
because of the reduced size of the training data.</p>
<p><strong>Relearn time.</strong> Relearning time is an excellent proxy
for measuring the amount of unlearned data information left in the
model. If a model recovers its performance on unlearned data with just a
few steps of retraining, it is extremely probable that the model has
retained some knowledge of the unlearned data.</p>
<p><strong>The layer-wise distance.</strong> The layer-wise distance
between the original and unlearned models helps when trying to
understand the impact of the unlearning on each layer. The weight
difference should be comparable to a retrained model given that a
shorter distance indicates ineffective unlearning. Likewise, a much
longer distance may point to a Streisand effect and possible information
leaks.</p>
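<p>A minimal sketch of a layer-wise distance, assuming each model is available as a dictionary that maps a layer name to a flat list of weights and that the Euclidean norm is used (the layer names and values below are hypothetical):</p>

```python
import math


def layerwise_distance(weights_a, weights_b):
    """Euclidean distance per layer between two models with identical layouts."""
    distances = {}
    for name, layer_a in weights_a.items():
        layer_b = weights_b[name]
        if len(layer_a) != len(layer_b):
            raise ValueError(f"layer {name!r} has mismatched shapes")
        squared = sum((a - b) ** 2 for a, b in zip(layer_a, layer_b))
        distances[name] = math.sqrt(squared)
    return distances


# Toy weights: only the first layer changed during unlearning.
original_weights = {"fc1": [0.0, 0.0], "fc2": [1.0, 2.0, 2.0]}
unlearned_weights = {"fc1": [3.0, 4.0], "fc2": [1.0, 2.0, 2.0]}

dist = layerwise_distance(original_weights, unlearned_weights)
print(dist)
```

<p>Computing the same distances for a retrained model gives the reference values against which the unlearned model&#x2019;s distances can be compared.</p>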
<p><strong>Activation Distance.</strong> The activation distance is the
average distance between the final activations of the unlearned
(scrubbed) model and those of the retrained model on the forget set. A
shorter activation distance indicates better unlearning.</p>
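<p>The following sketch illustrates the activation distance as described in the metrics table, that is, the average L2 distance between the predicted probabilities of the two models on the forget set. The function name and the toy probability vectors are assumptions made for illustration.</p>

```python
import math


def activation_distance(probs_unlearned, probs_retrained):
    """Mean L2 distance between per-sample probability vectors."""
    if not probs_unlearned or len(probs_unlearned) != len(probs_retrained):
        raise ValueError("need the same non-empty set of samples for both models")
    total = 0.0
    for p, q in zip(probs_unlearned, probs_retrained):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    return total / len(probs_unlearned)


# Toy predicted probabilities on two forget-set samples.
unlearned_probs = [[1.0, 0.0], [0.5, 0.5]]
retrained_probs = [[0.0, 1.0], [0.5, 0.5]]

act_dist = activation_distance(unlearned_probs, retrained_probs)
print(f"activation distance = {act_dist:.4f}")
```

<p>Here the models disagree completely on the first sample and agree on the second, so the result is half of the largest possible per-sample distance for two classes.</p>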
<p><strong>JS-Divergence.</strong> When paired with the activation
distance, the JS-Divergence between the predictions of the unlearned and
retrained model provides a fuller picture of unlearning. Lower
divergence indicates better unlearning. The formula of JS-Divergence is
<span
class="math inline">&#x1D4A5;&#x1D4AE;(<em>M</em>(<em>x</em>),<em>T</em><sub><em>d</em></sub>(<em>x</em>))&#x2004;=&#x2004;0.5&#x2005;*&#x2005;&#x1D4A6;&#x2112;(<em>M</em>(<em>x</em>)||<em>m</em>)&#x2005;+&#x2005;0.5&#x2005;*&#x2005;&#x1D4A6;&#x2112;(<em>T</em><sub><em>d</em></sub>(<em>x</em>)||<em>m</em>)</span>,
where <span class="math inline"><em>M</em></span> is the unlearned model,
<span class="math inline"><em>T</em><sub><em>d</em></sub></span> is a
competent teacher, <span class="math inline">&#x1D4A6;&#x2112;</span> is the
Kullback-Leibler divergence <a href="#ref-KLFormula">(Kullback et al. 1951)</a>, and <span
class="math inline">$m= \frac{M(x)+T_d(x)}{2}$</span>.</p>
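<p>A minimal sketch of this formula for two discrete probability vectors is given below. It assumes base-2 logarithms, which keeps the result between 0 and 1; the text does not state the base, so this choice and the function names are assumptions.</p>

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: 0.5 * KL(p || m) + 0.5 * KL(q || m)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


js_identical = js_divergence([0.3, 0.7], [0.3, 0.7])  # same predictions
js_disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])   # opposite predictions
print(js_identical, js_disjoint)
```

<p>Because the mixture <code>m</code> is positive wherever either input is positive, the ratio inside the logarithm is always defined, which is one practical reason to prefer the JS-Divergence over the raw Kullback-Leibler divergence.</p>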
<p><strong>Membership Inference.</strong> The membership inference
metric leverages a membership inference attack to determine whether or
not any information about the forgotten samples remains in the
model&#xA0;<a href="#ref-chen2021machine">(M. Chen et
al. 2021b)</a>. For the set to be forgotten (for example, the forgotten
class data), the success probability of the attack should be lower on
the unlearned model than on the original model.</p>
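<p>The recall used in the metrics table (#detected items / #forget items) can be sketched as below. The attack itself is not modelled: the boolean flags are hypothetical attack decisions, and <code>mia_recall</code> is an illustrative helper name.</p>

```python
def mia_recall(flagged_as_member):
    """Fraction of forget-set items that an attack still labels as training members."""
    flags = list(flagged_as_member)
    if not flags:
        raise ValueError("the forget set must not be empty")
    return sum(1 for flag in flags if flag) / len(flags)


# Hypothetical attack decisions on the same four forget-set items.
flags_original = [True, True, True, False]     # attack against the original model
flags_unlearned = [False, True, False, False]  # attack against the unlearned model

recall_original = mia_recall(flags_original)
recall_unlearned = mia_recall(flags_unlearned)
print(recall_original, recall_unlearned)
```

<p>A lower recall on the unlearned model than on the original model is the behaviour this metric looks for.</p>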
<p><strong>ZRF score.</strong> Zero Retrain Forgetting (ZRF) makes it
possible to evaluate unlearning approaches independent of
retraining&#xA0;<a href="#ref-chundawat2022can">(Chundawat et al. 2022a)</a>. The
unpredictability of the model&#x2019;s predictions is measured by comparing
them to those of an incompetent teacher. ZRF compares the unlearned
model&#x2019;s output distribution on the set to be forgotten with the output
of a randomly initialised model, which in most situations serves as the
incompetent teacher. The ZRF score ranges between
0 and 1; it will be close to 1 if the model&#x2019;s behaviour on the
forgotten samples is entirely random, and close to 0 if it exhibits a
certain pattern. The formula of the ZRF score is <span
class="math inline">$\mathcal{ZRF} = 1 -
\frac{1}{n_f}\sum\limits_{i=0}^{n_f} \mathcal{JS}(M(x_i),
T_d(x_i))$</span>, where <span
class="math inline"><em>x</em><sub><em>i</em></sub></span> is the <span
class="math inline"><em>i</em></span>-th
sample from the set to be forgotten, which contains <span
class="math inline"><em>n</em><sub><em>f</em></sub></span> samples in total.</p>
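<p>As an illustrative sketch, the ZRF score can be computed from the per-sample output distributions of the unlearned model and of the randomly initialised teacher. The code assumes base-2 logarithms so that each JS term lies in [0, 1], and it averages over each forget-set sample exactly once; the function names and toy distributions are hypothetical.</p>

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def zrf_score(probs_unlearned, probs_teacher):
    """ZRF = 1 - mean JS divergence over the forget set."""
    n_f = len(probs_unlearned)
    if n_f == 0 or n_f != len(probs_teacher):
        raise ValueError("need the same non-empty forget set for both models")
    total = sum(js_divergence(p, q) for p, q in zip(probs_unlearned, probs_teacher))
    return 1.0 - total / n_f


# Toy outputs on two forget-set samples; the teacher predicts uniformly.
teacher = [[0.5, 0.5], [0.5, 0.5]]
random_like = [[0.5, 0.5], [0.5, 0.5]]  # behaves like the teacher
confident = [[1.0, 0.0], [0.0, 1.0]]    # still shows a clear pattern

zrf_random = zrf_score(random_like, teacher)
zrf_confident = zrf_score(confident, teacher)
print(zrf_random, zrf_confident)
```

<p>The model that mimics the teacher reaches the maximum score, while the confident model scores lower, in line with the interpretation given above.</p>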
<p><strong>Anamnesis Index (AIN).</strong> AIN is a ratio of relearn
times, so its values are non-negative. The better the unlearning, the
closer the value is to 1. Instances where
information from the classes to be forgotten is still preserved in the
model correspond to AIN values well below 1. A score closer to 0 also
suggests that the unlearned model will rapidly relearn to generate
correct predictions. This may be because the last layers
contain only limited, reversible modifications, which degrade the performance
of the model on the forgotten classes. If an AIN score is much greater
than 1, it may suggest that the approach causes parameter changes that
are so severe that the unlearning itself may be detected (Streisand
effect). This might be because the model was pushed away
from the original point and, as a result, is unable to retrieve
previously learned knowledge about the forgotten class(es). The formula
for calculating an AIN value is <span class="math inline">$AIN =
\frac{r_t (M_u, M_{orig}, \alpha)}{r_t (M_s, M_{orig}, \alpha)}$</span>,
where <span class="math inline"><em>&#x3B1;</em>%</span> is a margin around
the original accuracy used to determine relearn time. <span
class="math inline"><em>r</em><sub><em>t</em></sub>(<em>M</em>,<em>M</em><sub><em>o</em><em>r</em><em>i</em><em>g</em></sub>,<em>&#x3B1;</em>)</span>
is the number of mini-batches (or steps) required for the model <span
class="math inline"><em>M</em></span> to come
within <span class="math inline"><em>&#x3B1;</em>%</span> of the accuracy
of the original model <span
class="math inline"><em>M</em><sub><em>o</em><em>r</em><em>i</em><em>g</em></sub></span> on the classes to be forgotten.
<span class="math inline"><em>M</em><sub><em>u</em></sub></span> and
<span class="math inline"><em>M</em><sub><em>s</em></sub></span>
respectively represent the unlearned model and a model trained from
scratch.</p>
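<p>The ratio can be sketched from recorded accuracy traces, as below. The sketch assumes that &#x201C;within &#x3B1;%&#x201D; means reaching at least (1 &#x2212; &#x3B1;/100) times the original model&#x2019;s accuracy on the forget classes, and that accuracy is logged after every relearning step; both are interpretations rather than details stated in the text, and the traces are made-up numbers.</p>

```python
def relearn_time(accuracy_per_step, original_accuracy, alpha):
    """Steps needed to come within alpha percent of the original accuracy."""
    threshold = original_accuracy * (1.0 - alpha / 100.0)
    for step, accuracy in enumerate(accuracy_per_step, start=1):
        if accuracy >= threshold:
            return step
    raise ValueError("the accuracy trace never reaches the required margin")


def anamnesis_index(trace_unlearned, trace_scratch, original_accuracy, alpha):
    """AIN = r_t(M_u, M_orig, alpha) / r_t(M_s, M_orig, alpha)."""
    rt_unlearned = relearn_time(trace_unlearned, original_accuracy, alpha)
    rt_scratch = relearn_time(trace_scratch, original_accuracy, alpha)
    return rt_unlearned / rt_scratch


# Hypothetical forget-class accuracy after each relearning step.
trace_unlearned = [0.20, 0.50, 0.80, 0.86]
trace_scratch = [0.10, 0.30, 0.60, 0.80, 0.87]

ain = anamnesis_index(trace_unlearned, trace_scratch, original_accuracy=0.90, alpha=5)
print(f"AIN = {ain:.2f}")
```

<p>In this toy run the unlearned model relearns in four steps and the model trained from scratch in five, giving a value somewhat below 1.</p>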

 


----------

<h1 id="sec:reference">References</h1>

<div id="refs" class="references csl-bib-body hanging-indent"
role="list">
<div id="ref-abadi2016deep" class="csl-entry" role="listitem">
Abadi, Martin, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya
Mironov, Kunal Talwar, and Li Zhang. 2016. <span>&#x201C;Deep Learning with
Differential Privacy.&#x201D;</span> In <em>SIGSAC</em>, 308&#x2013;18.
</div>
<div id="ref-aldaghri2021coded" class="csl-entry" role="listitem">
Aldaghri, Nasser, Hessam Mahdavifar, et al. 2021. <span>&#x201C;Coded Machine
Unlearning.&#x201D;</span> <em>IEEE Access</em> 9: 88137&#x2013;50.
</div>
<div id="ref-basu2021influence" class="csl-entry" role="listitem">
Basu, Samyadeep, Phil Pope, and Soheil Feizi. 2021. <span>&#x201C;Influence
Functions in Deep Learning Are Fragile.&#x201D;</span> In <em>ICLR</em>.
</div>
<div id="ref-baumhauer2020machine" class="csl-entry" role="listitem">
Baumhauer, Thomas, Pascal Sch&#xF6;ttle, and Matthias Zeppelzauer. 2020.
<span>&#x201C;Machine Unlearning: Linear Filtration for Logit-Based
Classifiers.&#x201D;</span> <em>arXiv Preprint arXiv:2002.02730</em>.
</div>
<div id="ref-becker2022epistemic" class="csl-entry" role="listitem">
Becker, Alexander, and Thomas Liebig. 2022. <span>&#x201C;Evaluating Machine
Unlearning via Epistemic Uncertainty.&#x201D;</span>
</div>
<div id="ref-berahas2016multi" class="csl-entry" role="listitem">
Berahas, Albert S, Jorge Nocedal, et al. 2016. <span>&#x201C;A Multi-Batch
l-BFGS Method for Machine Learning.&#x201D;</span> <em>NIPS</em> 29.
</div>
<div id="ref-bitansky2012extractable" class="csl-entry" role="listitem">
Bitansky, Nir, Ran Canetti, Alessandro Chiesa, and Eran Tromer. 2012.
<span>&#x201C;From Extractable Collision Resistance to Succinct Non-Interactive
Arguments of Knowledge, and Back Again.&#x201D;</span> In <em>ITCS</em>,
326&#x2013;49.
</div>
<div id="ref-bollapragada2018progressive" class="csl-entry"
role="listitem">
Bollapragada, Raghu, Jorge Nocedal, Dheevatsa Mudigere, Hao-Jun Shi, and
Ping Tak Peter Tang. 2018. <span>&#x201C;A Progressive Batching l-BFGS Method
for Machine Learning.&#x201D;</span> In <em>ICML</em>, 620&#x2013;29.
</div>
<div id="ref-bourtoule2021machine" class="csl-entry" role="listitem">
Bourtoule, Lucas, Varun Chandrasekaran, Christopher A Choquette-Choo,
Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas
Papernot. 2021. <span>&#x201C;Machine Unlearning.&#x201D;</span> In <em>SP</em>,
141&#x2013;59.
</div>
<div id="ref-brophy2021machine" class="csl-entry" role="listitem">
Brophy, Jonathan, and Daniel Lowd. 2021. <span>&#x201C;Machine Unlearning for
Random Forests.&#x201D;</span> In <em>ICML</em>, 1092&#x2013;1104.
</div>
<div id="ref-cao2015towards" class="csl-entry" role="listitem">
Cao, Yinzhi, and Junfeng Yang. 2015. <span>&#x201C;Towards Making Systems
Forget with Machine Unlearning.&#x201D;</span> In <em>2015 IEEE Symposium on
Security and Privacy</em>, 463&#x2013;80.
</div>
<div id="ref-cao2018efficient" class="csl-entry" role="listitem">
Cao, Yinzhi, Alexander Fangxiao Yu, Andrew Aday, Eric Stahl, Jon
Merwine, and Junfeng Yang. 2018. <span>&#x201C;Efficient Repair of Polluted
Machine Learning Systems via Causal Unlearning.&#x201D;</span> In
<em>ASIACCS</em>, 735&#x2013;47.
</div>
<div id="ref-cao2022machine" class="csl-entry" role="listitem">
Cao, Zihao, Jianzong Wang, Shijing Si, Zhangcheng Huang, and Jing Xiao.
2022. <span>&#x201C;Machine Unlearning Method Based on Projection
Residual.&#x201D;</span> In <em>DSAA</em>, 1&#x2013;8.
</div>
<div id="ref-cauwenberghs2000incremental" class="csl-entry"
role="listitem">
Cauwenberghs, Gert et al. 2000. <span>&#x201C;Incremental and Decremental
Support Vector Machine Learning.&#x201D;</span> <em>NIPS</em> 13.
</div>
<div id="ref-chang2022example" class="csl-entry" role="listitem">
Chang, Yi, Zhao Ren, Thanh Tam Nguyen, Wolfgang Nejdl, and Bj&#xF6;rn W
Schuller. 2022. <span>&#x201C;Example-Based Explanations with Adversarial
Attacks for Respiratory Sound Analysis.&#x201D;</span> In <em>INTERSPEECH</em>.
</div>
<div id="ref-chaudhuri2011differentially" class="csl-entry"
role="listitem">
Chaudhuri, Kamalika, Claire Monteleoni, and Anand D Sarwate. 2011.
<span>&#x201C;Differentially Private Empirical Risk Minimization.&#x201D;</span>
<em>JMLR</em> 12 (3).
</div>
<div id="ref-chen2022recommendation" class="csl-entry" role="listitem">
Chen, Chong, Fei Sun, Min Zhang, and Bolin Ding. 2022.
<span>&#x201C;Recommendation Unlearning.&#x201D;</span> In <em>WWW</em>, 2768&#x2013;77.
</div>
<div id="ref-chen2020graph" class="csl-entry" role="listitem">
Chen, Fenxiao, Yun-Cheng Wang, Bin Wang, et al. 2020. <span>&#x201C;Graph
Representation Learning: A Survey.&#x201D;</span> <em>ATSIP</em> 9.
</div>
<div id="ref-chen2021machinegan" class="csl-entry" role="listitem">
Chen, Kongyang, Yao Huang, et al. 2021. <span>&#x201C;Machine Unlearning via
GAN.&#x201D;</span> <em>arXiv Preprint arXiv:2111.11869</em>.
</div>
<div id="ref-chen2021graph" class="csl-entry" role="listitem">
Chen, Min, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert,
and Yang Zhang. 2021a. <span>&#x201C;Graph Unlearning.&#x201D;</span> <em>arXiv
Preprint arXiv:2103.14991</em>.
</div>
<div id="ref-chen2021machine" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2021b. <span>&#x201C;When Machine Unlearning Jeopardizes Privacy.&#x201D;</span>
In <em>SIGSAC</em>, 896&#x2013;911.
</div>
<div id="ref-chen2019novel" class="csl-entry" role="listitem">
Chen, Yuantao, Jie Xiong, Weihong Xu, and Jingwen Zuo. 2019. <span>&#x201C;A
Novel Online Incremental and Decremental Learning Algorithm Based on
Variable Support Vector Machine.&#x201D;</span> <em>Cluster Computing</em> 22
(3): 7435&#x2013;45.
</div>
<div id="ref-chien2022certified" class="csl-entry" role="listitem">
Chien, Eli, Chao Pan, et al. 2022. <span>&#x201C;Certified Graph
Unlearning.&#x201D;</span> <em>arXiv Preprint arXiv:2206.09140</em>.
</div>
<div id="ref-chundawat2022can" class="csl-entry" role="listitem">
Chundawat, Vikram S, Ayush K Tarun, Murari Mandal, and Mohan
Kankanhalli. 2022a. <span>&#x201C;Can Bad Teaching Induce Forgetting?
Unlearning in Deep Networks Using an Incompetent Teacher.&#x201D;</span>
<em>arXiv Preprint arXiv:2205.08096</em>.
</div>
<div id="ref-chundawat2022zero" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2022b. <span>&#x201C;Zero-Shot Machine Unlearning.&#x201D;</span> <em>arXiv
Preprint arXiv:2201.05629</em>.
</div>
<div id="ref-conggrapheditor" class="csl-entry" role="listitem">
Cong, Weilin, and Mehrdad Mahdavi. 2022a. <span>&#x201C;GRAPHEDITOR: An
Efficient Graph Representation Learning and Unlearning Approach.&#x201D;</span>
<em><a href="https://congweilin.github.io/CongWeilin.io/"
class="uri">https://congweilin.github.io/CongWeilin.io/</a></em>.
</div>
<div id="ref-congprivacy" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2022b. <span>&#x201C;Privacy Matters! Efficient Graph Representation
Unlearning with Data Removal Guarantee.&#x201D;</span> <em><a
href="https://congweilin.github.io/CongWeilin.io/"
class="uri">https://congweilin.github.io/CongWeilin.io/</a></em>.
</div>
<div id="ref-DaiDHSCW22" class="csl-entry" role="listitem">
Dai, Damai, Li Dong, et al. 2022. <span>&#x201C;Knowledge Neurons in Pretrained
Transformers.&#x201D;</span> In <em>ACL</em>, 8493&#x2013;8502.
</div>
<div id="ref-dang2021right" class="csl-entry" role="listitem">
Dang, Quang-Vinh. 2021. <span>&#x201C;Right to Be Forgotten in the Age of
Machine Learning.&#x201D;</span> In <em>ICADS</em>, 403&#x2013;11.
</div>
<div id="ref-deng2009imagenet" class="csl-entry" role="listitem">
Deng, Jia, Wei Dong, Richard Socher, et al. 2009. <span>&#x201C;ImageNet: A
Large-Scale Hierarchical Image Database.&#x201D;</span> In <em>CVPR</em>,
248&#x2013;55.
</div>
<div id="ref-dinsdale2020unlearning" class="csl-entry" role="listitem">
Dinsdale, Nicola K, Mark Jenkinson, et al. 2020. <span>&#x201C;Unlearning
Scanner Bias for MRI Harmonisation.&#x201D;</span> In <em>MICCAI</em>, 369&#x2013;78.
</div>
<div id="ref-dinsdale2021deep" class="csl-entry" role="listitem">
Dinsdale, Nicola K, Mark Jenkinson, and Ana IL Namburete. 2021.
<span>&#x201C;Deep Learning-Based Unlearning of Dataset Bias for MRI
Harmonisation and Confound Removal.&#x201D;</span> <em>NeuroImage</em> 228:
117689.
</div>
<div id="ref-du2019lifelong" class="csl-entry" role="listitem">
Du, Min, Zhi Chen, et al. 2019. <span>&#x201C;Lifelong Anomaly Detection
Through Unlearning.&#x201D;</span> In <em>SIGSAC</em>, 1283&#x2013;97.
</div>
<div id="ref-duan2007decremental" class="csl-entry" role="listitem">
Duan, Hua, Hua Li, Guoping He, and Qingtian Zeng. 2007.
<span>&#x201C;Decremental Learning Algorithms for Nonlinear Langrangian and
Least Squares Support Vector Machines.&#x201D;</span> In <em>OSB</em>, 358&#x2013;66.
</div>
<div id="ref-duda2020training" class="csl-entry" role="listitem">
Duda, Piotr, Maciej Jaworski, Andrzej Cader, and Lipo Wang. 2020.
<span>&#x201C;On Training Deep Neural Networks Using a Streaming
Approach.&#x201D;</span> <em>JAISCR</em> 10.
</div>
<div id="ref-dwork2008differential" class="csl-entry" role="listitem">
Dwork, Cynthia. 2008. <span>&#x201C;Differential Privacy: A Survey of
Results.&#x201D;</span> In <em>TAMC</em>, 1&#x2013;19.
</div>
<div id="ref-dwork2014algorithmic" class="csl-entry" role="listitem">
Dwork, Cynthia, Aaron Roth, et al. 2014. <span>&#x201C;The Algorithmic
Foundations of Differential Privacy.&#x201D;</span> <em>Foundations and
Trends in Theoretical Computer Science</em> 9 (3&#x2013;4):
211&#x2013;407.
</div>
<div id="ref-eisenhofer2022verifiable" class="csl-entry"
role="listitem">
Eisenhofer, Thorsten, Doreen Riepel, Varun Chandrasekaran, Esha Ghosh,
Olga Ohrimenko, and Nicolas Papernot. 2022. <span>&#x201C;Verifiable and
Provably Secure Machine Unlearning.&#x201D;</span> <em>arXiv Preprint
arXiv:2210.09126</em>.
</div>
<div id="ref-felps2020class" class="csl-entry" role="listitem">
Felps, Daniel L, Amelia D Schwickerath, Joyce D Williams, Trung N Vuong,
Alan Briggs, Matthew Hunt, Evan Sakmar, David D Saranchak, and Tyler
Shumaker. 2020. <span>&#x201C;Class Clown: Data Redaction in Machine Unlearning
at Enterprise Scale.&#x201D;</span> <em>arXiv Preprint arXiv:2012.04699</em>.
</div>
<div id="ref-feuerriegel2020fair" class="csl-entry" role="listitem">
Feuerriegel, Stefan, Mateusz Dolata, and Gerhard Schwabe. 2020.
<span>&#x201C;Fair AI.&#x201D;</span> <em>Business &amp; Information Systems
Engineering</em> 62 (4): 379&#x2013;84.
</div>
<div id="ref-fredrikson2014privacy" class="csl-entry" role="listitem">
Fredrikson, Matthew, Eric Lantz, Somesh Jha, Simon Lin, David Page, and
Thomas Ristenpart. 2014. <span>&#x201C;Privacy in Pharmacogenetics: An
End-to-End Case Study of Personalized Warfarin Dosing.&#x201D;</span> In <em>USENIX
Security</em>, 17&#x2013;32.
</div>
<div id="ref-fu2022knowledge" class="csl-entry" role="listitem">
Fu, Shaopeng, Fengxiang He, et al. 2022. <span>&#x201C;Knowledge Removal in
Sampling-Based Bayesian Inference.&#x201D;</span> In <em>ICLR</em>.
</div>
<div id="ref-fu2021bayesian" class="csl-entry" role="listitem">
Fu, Shaopeng, Fengxiang He, Yue Xu, and Dacheng Tao. 2021.
<span>&#x201C;Bayesian Inference Forgetting.&#x201D;</span> <em>arXiv Preprint
arXiv:2101.06417</em>.
</div>
<div id="ref-gao2022deletion" class="csl-entry" role="listitem">
Gao, Ji, Sanjam Garg, Mohammad Mahmoody, and Prashant Nalini Vasudevan.
2022. <span>&#x201C;Deletion Inference, Reconstruction, and Compliance in
Machine (Un) Learning.&#x201D;</span> <em>Proc. Priv. Enhancing Technol.</em>
2022 (3): 415&#x2013;36.
</div>
<div id="ref-gao2022verifi" class="csl-entry" role="listitem">
Gao, Xiangshan, Xingjun Ma, Jingyi Wang, Youcheng Sun, Bo Li, Shouling
Ji, Peng Cheng, and Jiming Chen. 2022. <span>&#x201C;VeriFi: Towards Verifiable
Federated Unlearning.&#x201D;</span> <em>arXiv Preprint arXiv:2205.12709</em>.
</div>
<div id="ref-garg2020formalizing" class="csl-entry" role="listitem">
Garg, Sanjam, Shafi Goldwasser, and Prashant Nalini Vasudevan. 2020.
<span>&#x201C;Formalizing Data Deletion in the Context of the Right to Be
Forgotten.&#x201D;</span> In <em>EUROCRYPT</em>, 373&#x2013;402.
</div>
<div id="ref-gasteiger2018combining" class="csl-entry" role="listitem">
Gasteiger, Johannes, Aleksandar Bojchevski, and Stephan G&#xFC;nnemann. 2019.
<span>&#x201C;Combining Neural Networks with Personalized PageRank for
Classification on Graphs.&#x201D;</span> In <em>ICLR</em>.
</div>
<div id="ref-geurts2006extremely" class="csl-entry" role="listitem">
Geurts, Pierre, Damien Ernst, et al. 2006. <span>&#x201C;Extremely Randomized
Trees.&#x201D;</span> <em>Machine Learning</em> 63 (1): 3&#x2013;42.
</div>
<div id="ref-ginart2019making" class="csl-entry" role="listitem">
Ginart, Antonio, Melody Guan, Gregory Valiant, and James Y Zou. 2019.
<span>&#x201C;Making AI Forget You: Data Deletion in Machine Learning.&#x201D;</span>
<em>NIPS</em> 32.
</div>
<div id="ref-goel2022evaluating" class="csl-entry" role="listitem">
Goel, Shashwat, Ameya Prabhu, and Ponnurangam Kumaraguru. 2022.
<span>&#x201C;Evaluating Inexact Unlearning Requires Revisiting
Forgetting.&#x201D;</span> <em>arXiv Preprint arXiv:2201.06640</em>.
</div>
<div id="ref-golatkar2021mixed" class="csl-entry" role="listitem">
Golatkar, Aditya, Alessandro Achille, Avinash Ravichandran, Marzia
Polito, and Stefano Soatto. 2021. <span>&#x201C;Mixed-Privacy Forgetting in
Deep Networks.&#x201D;</span> In <em>CVPR</em>, 792&#x2013;801.
</div>
<div id="ref-golatkar2020eternal" class="csl-entry" role="listitem">
Golatkar, Aditya, Alessandro Achille, and Stefano Soatto. 2020a.
<span>&#x201C;Eternal Sunshine of the Spotless Net: Selective Forgetting in
Deep Networks.&#x201D;</span> In <em>CVPR</em>, 9304&#x2013;12.
</div>
<div id="ref-golatkar2020forgetting" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2020b. <span>&#x201C;Forgetting Outside the Box: Scrubbing Deep Networks
of Information Accessible from Input-Output Observations.&#x201D;</span> In
<em>ECCV</em>, 383&#x2013;98.
</div>
<div id="ref-goyal2021revisiting" class="csl-entry" role="listitem">
Goyal, Adit, Vikas Hassija, and Victor Hugo C de Albuquerque. 2021.
<span>&#x201C;Revisiting Machine Learning Training Process for Enhanced Data
Privacy.&#x201D;</span> In <em>IC3</em>, 247&#x2013;51.
</div>
<div id="ref-graves2021amnesiac" class="csl-entry" role="listitem">
Graves, Laura, Vineel Nagisetty, and Vijay Ganesh. 2021. <span>&#x201C;Amnesiac
Machine Learning.&#x201D;</span> In <em>AAAI</em>, 35 (13): 11516&#x2013;24.
</div>
<div id="ref-GuoGHM20" class="csl-entry" role="listitem">
Guo, Chuan, Tom Goldstein, Awni Y. Hannun, and Laurens van der Maaten.
2020. <span>&#x201C;Certified Data Removal from Machine Learning
Models.&#x201D;</span> In <em>ICML</em>, 119:3832&#x2013;42.
</div>
<div id="ref-guo2020survey" class="csl-entry" role="listitem">
Guo, Ruocheng, Lu Cheng, Jundong Li, P Richard Hahn, and Huan Liu. 2020.
<span>&#x201C;A Survey of Learning Causality with Data: Problems and
Methods.&#x201D;</span> <em>CSUR</em> 53 (4): 1&#x2013;37.
</div>
<div id="ref-guo2022efficient" class="csl-entry" role="listitem">
Guo, Tao, Song Guo, Jiewei Zhang, Wenchao Xu, and Junxiao Wang. 2022.
<span>&#x201C;Efficient Attribute Unlearning: Towards Selective Removal of
Input Attributes from Feature Representations.&#x201D;</span> <em>arXiv
Preprint arXiv:2202.13295</em>.
</div>
<div id="ref-gupta2021adaptive" class="csl-entry" role="listitem">
Gupta, Varun, Christopher Jung, Seth Neel, Aaron Roth, Saeed
Sharifi-Malvajerdi, and Chris Waites. 2021. <span>&#x201C;Adaptive Machine
Unlearning.&#x201D;</span> <em>NIPS</em> 34: 16319&#x2013;30.
</div>
<div id="ref-halimi2022federated" class="csl-entry" role="listitem">
Halimi, Anisa, Swanand Kadhe, Ambrish Rawat, and Nathalie Baracaldo.
2022. <span>&#x201C;Federated Unlearning: How to Efficiently Erase a Client in
FL?&#x201D;</span> <em>arXiv Preprint arXiv:2207.05521</em>.
</div>
<div id="ref-hamilton2020graph" class="csl-entry" role="listitem">
Hamilton, William L. 2020. <span>&#x201C;Graph Representation Learning.&#x201D;</span>
<em>Synthesis Lectures on Artificial Intelligence and Machine
Learning</em> 14 (3): 1&#x2013;159.
</div>
<div id="ref-haug2021learning" class="csl-entry" role="listitem">
Haug, Johannes, and Gjergji Kasneci. 2021. <span>&#x201C;Learning Parameter
Distributions to Detect Concept Drift in Data Streams.&#x201D;</span> In
<em>ICPR</em>, 9452&#x2013;59.
</div>
<div id="ref-he2021deepobliviate" class="csl-entry" role="listitem">
He, Yingzhe, Guozhu Meng, Kai Chen, Jinwen He, and Xingbo Hu. 2021.
<span>&#x201C;DeepObliviate: A Powerful Charm for Erasing Data Residual Memory
in Deep Neural Networks.&#x201D;</span> <em>arXiv Preprint
arXiv:2105.06209</em>.
</div>
<div id="ref-hu2021distilling" class="csl-entry" role="listitem">
Hu, Xinting, Kaihua Tang, Chunyan Miao, Xian-Sheng Hua, and Hanwang
Zhang. 2021. <span>&#x201C;Distilling Causal Effect of Data in
Class-Incremental Learning.&#x201D;</span> In <em>CVPR</em>, 3957&#x2013;66.
</div>
<div id="ref-huang2021unlearnable" class="csl-entry" role="listitem">
Huang, Hanxun, Xingjun Ma, Sarah Monazam Erfani, James Bailey, and Yisen
Wang. 2021. <span>&#x201C;Unlearnable Examples: Making Personal Data
Unexploitable.&#x201D;</span> In <em>ICLR</em>.
</div>
<div id="ref-huang2021mathsf" class="csl-entry" role="listitem">
Huang, Yangsibo, Xiaoxiao Li, et al. 2021. <span>&#x201C;EMA: Auditing Data
Removal from Trained Models.&#x201D;</span> In <em>MICCAI</em>, 793&#x2013;803.
</div>
<div id="ref-hullermeier2021aleatoric" class="csl-entry"
role="listitem">
H&#xFC;llermeier, Eyke, and Willem Waegeman. 2021. <span>&#x201C;Aleatoric and
Epistemic Uncertainty in Machine Learning: An Introduction to Concepts
and Methods.&#x201D;</span> <em>Machine Learning</em> 110 (3): 457&#x2013;506.
</div>
<div id="ref-izzo2021approximate" class="csl-entry" role="listitem">
Izzo, Zachary, Mary Anne Smart, Kamalika Chaudhuri, and James Zou. 2021.
<span>&#x201C;Approximate Data Deletion from Machine Learning Models.&#x201D;</span>
In <em>AISTATS</em>, 2008&#x2013;16.
</div>
<div id="ref-jagielski2022measuring" class="csl-entry" role="listitem">
Jagielski, Matthew, Om Thakkar, Florian Tram&#xE8;r, Daphne Ippolito,
Katherine Lee, Nicholas Carlini, Eric Wallace, et al. 2022.
<span>&#x201C;Measuring Forgetting of Memorized Training Examples.&#x201D;</span>
<em>arXiv Preprint arXiv:2207.00099</em>.
</div>
<div id="ref-jia2021proof" class="csl-entry" role="listitem">
Jia, Hengrui, Mohammad Yaghini, Christopher A Choquette-Choo, Natalie
Dullerud, Anvith Thudi, Varun Chandrasekaran, and Nicolas Papernot.
2021. <span>&#x201C;Proof-of-Learning: Definitions and Practice.&#x201D;</span> In
<em>SP</em>, 1039&#x2013;56.
</div>
<div id="ref-jose2021unified" class="csl-entry" role="listitem">
Jose, Sharu Theresa, and Osvaldo Simeone. 2021. <span>&#x201C;A Unified
PAC-Bayesian Framework for Machine Unlearning via Information Risk
Minimization.&#x201D;</span> In <em>MLSP</em>, 1&#x2013;6.
</div>
<div id="ref-karasuyama2009multiple" class="csl-entry" role="listitem">
Karasuyama, Masayuki, and Ichiro Takeuchi. 2009. <span>&#x201C;Multiple
Incremental Decremental Learning of Support Vector Machines.&#x201D;</span>
<em>NIPS</em> 22.
</div>
<div id="ref-karasuyama2010multiple" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2010. <span>&#x201C;Multiple Incremental Decremental Learning of Support
Vector Machines.&#x201D;</span> <em>IEEE Transactions on Neural Networks</em>
21 (7): 1048&#x2013;59.
</div>
<div id="ref-kearns1998efficient" class="csl-entry" role="listitem">
Kearns, Michael. 1998. <span>&#x201C;Efficient Noise-Tolerant Learning from
Statistical Queries.&#x201D;</span> <em>JACM</em> 45 (6): 983&#x2013;1006.
</div>
<div id="ref-khan2021knowledge" class="csl-entry" role="listitem">
Khan, Mohammad Emtiyaz et al. 2021. <span>&#x201C;Knowledge-Adaptation
Priors.&#x201D;</span> <em>NIPS</em> 34: 19757&#x2013;70.
</div>
<div id="ref-kirkpatrick2017overcoming" class="csl-entry"
role="listitem">
Kirkpatrick, James, Razvan Pascanu, Neil Rabinowitz, Joel Veness,
Guillaume Desjardins, Andrei A Rusu, Kieran Milan, et al. 2017.
<span>&#x201C;Overcoming Catastrophic Forgetting in Neural Networks.&#x201D;</span>
<em>PNAS</em> 114 (13): 3521&#x2013;26.
</div>
<div id="ref-koh2017understanding" class="csl-entry" role="listitem">
Koh, Pang Wei et al. 2017. <span>&#x201C;Understanding Black-Box Predictions
via Influence Functions.&#x201D;</span> In <em>ICML</em>, 1885&#x2013;94.
</div>
<div id="ref-KLFormula" class="csl-entry" role="listitem">
Kullback, S. et al. 1951. <span>&#x201C;<span class="nocase">On Information and
Sufficiency</span>.&#x201D;</span> <em>The Annals of Mathematical
Statistics</em> 22 (1): 79&#x2013;86.
</div>
<div id="ref-lei2019geometric" class="csl-entry" role="listitem">
Lei, Na, Kehua Su, Li Cui, Shing-Tung Yau, and Xianfeng David Gu. 2019.
<span>&#x201C;A Geometric View of Optimal Transportation and Generative
Model.&#x201D;</span> <em>Computer Aided Geometric Design</em> 68: 1&#x2013;21.
</div>
<div id="ref-li2020online" class="csl-entry" role="listitem">
Li, Yuantong, Chi-Hua Wang, and Guang Cheng. 2021. <span>&#x201C;Online
Forgetting Process for Linear Regression Models.&#x201D;</span> In
<em>AISTATS</em>, 130:217&#x2013;25.
</div>
<div id="ref-liu2022continual" class="csl-entry" role="listitem">
Liu, Bo, Qiang Liu, et al. 2022. <span>&#x201C;Continual Learning and Private
Unlearning.&#x201D;</span> <em>arXiv Preprint arXiv:2203.12817</em>.
</div>
<div id="ref-liu2020federated" class="csl-entry" role="listitem">
Liu, Gaoyang, Xiaoqiang Ma, Yang Yang, Chen Wang, and Jiangchuan Liu.
2020. <span>&#x201C;Federated Unlearning.&#x201D;</span> <em>arXiv Preprint
arXiv:2012.13891</em>.
</div>
<div id="ref-liu2021federaser" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2021. <span>&#x201C;FedEraser: Enabling Efficient Client-Level Data
Removal from Federated Learning Models.&#x201D;</span> In <em>IWQOS</em>, 1&#x2013;10.
</div>
<div id="ref-liu2020have" class="csl-entry" role="listitem">
Liu, Xiao, and Sotirios A Tsaftaris. 2020. <span>&#x201C;Have You Forgotten? A
Method to Assess If Machine Learning Models Have Forgotten Data.&#x201D;</span>
In <em>MICCAI</em>, 95&#x2013;105.
</div>
<div id="ref-liu2022backdoor" class="csl-entry" role="listitem">
Liu, Yang, Mingyuan Fan, Cen Chen, Ximeng Liu, Zhuo Ma, Li Wang, and
Jianfeng Ma. 2022. <span>&#x201C;Backdoor Defense with Machine
Unlearning.&#x201D;</span> <em>arXiv Preprint arXiv:2201.09538</em>.
</div>
<div id="ref-liu2020learn" class="csl-entry" role="listitem">
Liu, Yang, Zhuo Ma, Ximeng Liu, Jian Liu, Zhongyuan Jiang, Jianfeng Ma,
Philip Yu, and Kui Ren. 2020. <span>&#x201C;Learn to Forget: Machine Unlearning
via Neuron Masking.&#x201D;</span> <em>arXiv Preprint arXiv:2003.10933</em>.
</div>
<div id="ref-liu2021revfrf" class="csl-entry" role="listitem">
Liu, Yang, Zhuo Ma, Yilong Yang, Ximeng Liu, Jianfeng Ma, and Kui Ren.
2021. <span>&#x201C;RevFRF: Enabling Cross-Domain Random Forest Training with
Revocable Federated Learning.&#x201D;</span> <em>TDSC</em>.
</div>
<div id="ref-liu2022right" class="csl-entry" role="listitem">
Liu, Yi, Lei Xu, Xingliang Yuan, Cong Wang, and Bo Li. 2022. <span>&#x201C;The
Right to Be Forgotten in Federated Learning: An Efficient Realization
with Rapid Retraining.&#x201D;</span> In <em>INFOCOM</em>, 1749&#x2013;58.
</div>
<div id="ref-mahadevan2021certifiable" class="csl-entry"
role="listitem">
Mahadevan, Ananth, and Michael Mathioudakis. 2021. <span>&#x201C;Certifiable
Machine Unlearning for Linear Models.&#x201D;</span> <em>arXiv Preprint
arXiv:2106.15093</em>.
</div>
<div id="ref-mahadevan2022certifiable" class="csl-entry"
role="listitem">
&#x2014;&#x2014;&#x2014;. 2022. <span>&#x201C;Certifiable Unlearning Pipelines for Logistic
Regression: An Experimental Study.&#x201D;</span> <em>Machine Learning and
Knowledge Extraction</em> 4 (3): 591&#x2013;620.
</div>
<div id="ref-magdziarczyk2019right" class="csl-entry" role="listitem">
Mantelero, Alessandro. 2013. <span>&#x201C;The EU Proposal for a General Data
Protection Regulation and the Roots of the <span>&#x2018;Right to Be
Forgotten&#x2019;</span>.&#x201D;</span> <em>Computer Law &amp; Security Review</em>
29 (3): 229&#x2013;35.
</div>
<div id="ref-marchant2022hard" class="csl-entry" role="listitem">
Marchant, Neil G, Benjamin IP Rubinstein, and Scott Alfeld. 2022.
<span>&#x201C;Hard to Forget: Poisoning Attacks on Certified Machine
Unlearning.&#x201D;</span> In <em>AAAI</em>, 36 (7): 7691&#x2013;7700.
</div>
<div id="ref-martens2020new" class="csl-entry" role="listitem">
Martens, James. 2020. <span>&#x201C;New Insights and Perspectives on the
Natural Gradient Method.&#x201D;</span> <em>JMLR</em> 21 (1): 5776&#x2013;5851.
</div>
<div id="ref-masi2018deep" class="csl-entry" role="listitem">
Masi, Iacopo, Yue Wu, Tal Hassner, and Prem Natarajan. 2018. <span>&#x201C;Deep
Face Recognition: A Survey.&#x201D;</span> In <em>SIBGRAPI</em>, 471&#x2013;78.
</div>
<div id="ref-mcmahan2017communication" class="csl-entry"
role="listitem">
McMahan, Brendan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise
Aguera y Arcas. 2017. <span>&#x201C;Communication-Efficient Learning of Deep
Networks from Decentralized Data.&#x201D;</span> In <em>AISTATS</em>, 1273&#x2013;82.
</div>
<div id="ref-mehrabi2021survey" class="csl-entry" role="listitem">
Mehrabi, Ninareh, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and
Aram Galstyan. 2021. <span>&#x201C;A Survey on Bias and Fairness in Machine
Learning.&#x201D;</span> <em>CSUR</em> 54 (6): 1&#x2013;35.
</div>
<div id="ref-mehta2022deep" class="csl-entry" role="listitem">
Mehta, Ronak, Sourav Pal, Vikas Singh, and Sathya N Ravi. 2022.
<span>&#x201C;Deep Unlearning via Randomized Conditionally Independent
Hessians.&#x201D;</span> In <em>CVPR</em>, 10422&#x2013;31.
</div>
<div id="ref-micaelli2019zero" class="csl-entry" role="listitem">
Micaelli, Paul et al. 2019. <span>&#x201C;Zero-Shot Knowledge Transfer via
Adversarial Belief Matching.&#x201D;</span> <em>NIPS</em> 32.
</div>
<div id="ref-nam2020learning" class="csl-entry" role="listitem">
Nam, Junhyun, Hyuntak Cha, Sungsoo Ahn, Jaeho Lee, and Jinwoo Shin.
2020. <span>&#x201C;Learning from Failure: De-Biasing Classifier from Biased
Classifier.&#x201D;</span> <em>NIPS</em> 33: 20673&#x2013;84.
</div>
<div id="ref-neel2021descent" class="csl-entry" role="listitem">
Neel, Seth, Aaron Roth, and Saeed Sharifi-Malvajerdi. 2021.
<span>&#x201C;Descent-to-Delete: Gradient-Based Methods for Machine
Unlearning.&#x201D;</span> In <em>Algorithmic Learning Theory</em>, 931&#x2013;62.
</div>
<div id="ref-nguyen2020variational" class="csl-entry" role="listitem">
Nguyen, Quoc Phong, Bryan Kian Hsiang Low, and Patrick Jaillet. 2020.
<span>&#x201C;Variational Bayesian Unlearning.&#x201D;</span> <em>NIPS</em> 33:
16025&#x2013;36.
</div>
<div id="ref-nguyen2022markov" class="csl-entry" role="listitem">
Nguyen, Quoc Phong, Ryutaro Oikawa, Dinil Mon Divakaran, Mun Choon Chan,
et al. 2022. <span>&#x201C;Markov Chain Monte Carlo-Based Machine Unlearning:
Unlearning What Needs to Be Forgotten.&#x201D;</span> In <em>ASIACCS</em>,
351&#x2013;63.
</div>
<div id="ref-nguyen2019debunking" class="csl-entry" role="listitem">
Nguyen, Thanh Tam. 2019. <span>&#x201C;Debunking Misinformation on the Web:
Detection, Validation, and Visualisation.&#x201D;</span> PhD thesis, EPFL,
Switzerland.
</div>
<div id="ref-nguyen2022survey" class="csl-entry" role="listitem">
Nguyen, Thanh Tam, Thanh Trung Huynh, Phi Le Nguyen, Alan Wee-Chung
Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. 2022. <span>&#x201C;A Survey of
Machine Unlearning.&#x201D;</span> <em>arXiv Preprint arXiv:2209.02299</em>.
</div>
<div id="ref-nguyen2021judo" class="csl-entry" role="listitem">
Nguyen, Thanh Toan, Thanh Tam Nguyen, Thanh Thi Nguyen, Bay Vo, Jun Jo,
and Quoc Viet Hung Nguyen. 2021. <span>&#x201C;JUDO: Just-in-Time Rumour
Detection in Streaming Social Platforms.&#x201D;</span> <em>Information
Sciences</em> 570: 70&#x2013;93.
</div>
<div id="ref-pardau2018california" class="csl-entry" role="listitem">
Pardau, Stuart L. 2018. <span>&#x201C;The California Consumer Privacy Act:
Towards a European-Style Privacy Regime in the United States.&#x201D;</span>
<em>J. Tech. L. &amp; Pol&#x2019;y</em> 23: 68.
</div>
<div id="ref-parisi2019continual" class="csl-entry" role="listitem">
Parisi, German I, Ronald Kemker, Jose L Part, Christopher Kanan, and
Stefan Wermter. 2019. <span>&#x201C;Continual Lifelong Learning with Neural
Networks: A Review.&#x201D;</span> <em>Neural Networks</em> 113: 54&#x2013;71.
</div>
<div id="ref-parne2021machine" class="csl-entry" role="listitem">
Parne, Nishchal, Kyathi Puppaala, Nithish Bhupathi, and Ripon Patgiri.
2021. <span>&#x201C;An Investigation on Learning, Polluting, and Unlearning the
Spam Emails for Lifelong Learning.&#x201D;</span> <em>arXiv Preprint
arXiv:2111.14609</em>.
</div>
<div id="ref-pearce2020uncertainty" class="csl-entry" role="listitem">
Pearce, Tim, Felix Leibfried, and Alexandra Brintrup. 2020.
<span>&#x201C;Uncertainty in Neural Networks: Approximately Bayesian
Ensembling.&#x201D;</span> In <em>AISTATS</em>, 234&#x2013;44.
</div>
<div id="ref-peste2021ssse" class="csl-entry" role="listitem">
Peste, Alexandra, Dan Alistarh, and Christoph H Lampert. 2021.
<span>&#x201C;<span>SSSE</span>: Efficiently Erasing Samples from Trained
Machine Learning Models.&#x201D;</span> In <em>NeurIPS 2021 Workshop Privacy in
Machine Learning</em>.
</div>
<div id="ref-ramaswamy2021fair" class="csl-entry" role="listitem">
Ramaswamy, Vikram V, Sunnie SY Kim, and Olga Russakovsky. 2021.
<span>&#x201C;Fair Attribute Classification Through Latent Space
de-Biasing.&#x201D;</span> In <em>CVPR</em>, 9301&#x2013;10.
</div>
<div id="ref-ren2020adversarial" class="csl-entry" role="listitem">
Ren, Kui, Tianhang Zheng, Zhan Qin, and Xue Liu. 2020.
<span>&#x201C;Adversarial Attacks and Defenses in Deep Learning.&#x201D;</span>
<em>Engineering</em> 6 (3): 346&#x2013;60.
</div>
<div id="ref-ren2020generating" class="csl-entry" role="listitem">
Ren, Zhao, Alice Baird, Jing Han, Zixing Zhang, and Bj&#xF6;rn Schuller.
2020. <span>&#x201C;Generating and Protecting Against Adversarial Attacks for
Deep Speech-Based Emotion Recognition Models.&#x201D;</span> In
<em>ICASSP</em>, 7184&#x2013;88.
</div>
<div id="ref-ren2020enhancing" class="csl-entry" role="listitem">
Ren, Zhao, Jing Han, Nicholas Cummins, and Bj&#xF6;rn W Schuller. 2020.
<span>&#x201C;Enhancing Transferability of Black-Box Adversarial Attacks via
Lifelong Learning for Speech Emotion Recognition Models.&#x201D;</span> In
<em>INTERSPEECH</em>, 496&#x2013;500.
</div>
<div id="ref-ren2022prototype" class="csl-entry" role="listitem">
Ren, Zhao, Thanh Tam Nguyen, and Wolfgang Nejdl. 2022. <span>&#x201C;Prototype
Learning for Interpretable Respiratory Sound Analysis.&#x201D;</span> In
<em>ICASSP</em>, 9087&#x2013;91.
</div>
<div id="ref-romero2007incremental" class="csl-entry" role="listitem">
Romero, Enrique, Ignacio Barrio, and Llu&#x131;&#x301;s Belanche. 2007.
<span>&#x201C;Incremental and Decremental Learning for Linear Support Vector
Machines.&#x201D;</span> In <em>ICANN</em>, 209&#x2013;18.
</div>
<div id="ref-roth2018bayesian" class="csl-entry" role="listitem">
Roth, Wolfgang, and Franz Pernkopf. 2018. <span>&#x201C;Bayesian Neural
Networks with Weight Sharing Using Dirichlet Processes.&#x201D;</span>
<em>TPAMI</em> 42 (1): 246&#x2013;52.
</div>
<div id="ref-salem2020updates" class="csl-entry" role="listitem">
Salem, Ahmed, Apratim Bhattacharya, Michael Backes, Mario Fritz, and
Yang Zhang. 2020. <span>&#x201C;Updates-Leak: Data Set Inference and Reconstruction
Attacks in Online Learning.&#x201D;</span> In <em>USENIX Security</em>,
1291&#x2013;1308.
</div>
<div id="ref-Salem0HBF019" class="csl-entry" role="listitem">
Salem, Ahmed, Yang Zhang, Mathias Humbert, Pascal Berrang, Mario Fritz,
and Michael Backes. 2019. <span>&#x201C;ML-Leaks: Model and Data Independent
Membership Inference Attacks and Defenses on Machine Learning
Models.&#x201D;</span> In <em>NDSS</em>.
</div>
<div id="ref-sari2020learning" class="csl-entry" role="listitem">
Sari, WN, BS Samosir, N Sahara, L Agustina, and Y Anita. 2020.
<span>&#x201C;Learning Mathematics <span>&#x2018;Asyik&#x2019;</span> with Youtube Educative
Media.&#x201D;</span> In <em>Journal of Physics: Conference Series</em>,
1477 (2): 022012.
</div>
<div id="ref-sattler2021fedaux" class="csl-entry" role="listitem">
Sattler, Felix, Tim Korjakow, Roman Rischke, and Wojciech Samek. 2021.
<span>&#x201C;FedAUX: Leveraging Unlabeled Auxiliary Data in Federated
Learning.&#x201D;</span> <em>TNNLS</em>.
</div>
<div id="ref-schelter2020amnesia" class="csl-entry" role="listitem">
Schelter, Sebastian. 2020. <span>&#x201C;<span>&#x2018;Amnesia&#x2019;</span> - a Selection
of Machine Learning Models That Can Forget User Data Very Fast.&#x201D;</span>
In <em>CIDR</em>.
</div>
<div id="ref-schelter2021hedgecut" class="csl-entry" role="listitem">
Schelter, Sebastian, Stefan Grafberger, and Ted Dunning. 2021.
<span>&#x201C;HedgeCut: Maintaining Randomised Trees for Low-Latency Machine
Unlearning.&#x201D;</span> In <em>SIGMOD</em>, 1545&#x2013;57.
</div>
<div id="ref-sekhari2021remember" class="csl-entry" role="listitem">
Sekhari, Ayush, Jayadev Acharya, Gautam Kamath, and Ananda Theertha
Suresh. 2021. <span>&#x201C;Remember What You Want to Forget: Algorithms for
Machine Unlearning.&#x201D;</span> <em>NIPS</em> 34: 18075&#x2013;86.
</div>
<div id="ref-shan2020protecting" class="csl-entry" role="listitem">
Shan, S, E Wenger, J Zhang, H Li, H Zheng, and BY Zhao. 2020.
<span>&#x201C;Protecting Personal Privacy Against Unauthorized Deep Learning
Models.&#x201D;</span> In <em>USENIX Security</em>, 1&#x2013;16.
</div>
<div id="ref-shibata2021learning" class="csl-entry" role="listitem">
Shibata, Takashi, Go Irie, Daiki Ikami, and Yu Mitsuzumi. 2021.
<span>&#x201C;Learning with Selective Forgetting.&#x201D;</span> In <em>IJCAI</em>,
2:6. 4.
</div>
<div id="ref-shintre2019making" class="csl-entry" role="listitem">
Shintre, Saurabh et al. 2019. <span>&#x201C;Making Machine Learning
Forget.&#x201D;</span> In <em>Annual Privacy Forum</em>, 72&#x2013;83.
</div>
<div id="ref-shokri2017membership" class="csl-entry" role="listitem">
Shokri, Reza, Marco Stronati, Congzheng Song, and Vitaly Shmatikov.
2017. <span>&#x201C;Membership Inference Attacks Against Machine Learning
Models.&#x201D;</span> In <em>SP</em>, 3&#x2013;18.
</div>
<div id="ref-shwartz2017opening" class="csl-entry" role="listitem">
Shwartz-Ziv, Ravid, and Naftali Tishby. 2017. <span>&#x201C;Opening the Black
Box of Deep Neural Networks via Information.&#x201D;</span> <em>arXiv Preprint
arXiv:1703.00810</em>.
</div>
<div id="ref-singh2017data" class="csl-entry" role="listitem">
Singh, Abhijeet, and Abhineet Anand. 2017. <span>&#x201C;Data Leakage Detection
Using Cloud Computing.&#x201D;</span> <em>IJECS</em> 6 (4).
</div>
<div id="ref-singh2022anatomizing" class="csl-entry" role="listitem">
Singh, Richa, Puspita Majumdar, Surbhi Mittal, and Mayank Vatsa. 2022.
<span>&#x201C;Anatomizing Bias in Facial Analysis.&#x201D;</span> In <em>AAAI</em>,
36 (11): 12351&#x2013;58.
</div>
<div id="ref-sommer2020towards" class="csl-entry" role="listitem">
Sommer, David Marco, Liwei Song, Sameer Wagh, and Prateek Mittal. 2020.
<span>&#x201C;Towards Probabilistic Verification of Machine Unlearning.&#x201D;</span>
<em>arXiv Preprint arXiv:2003.04247</em>.
</div>
<div id="ref-sommer2022athena" class="csl-entry" role="listitem">
&#x2014;&#x2014;&#x2014;. 2022. <span>&#x201C;Athena: Probabilistic Verification of Machine
Unlearning.&#x201D;</span> <em>Proc. Priv. Enhancing Technol.</em> 2022 (3):
268&#x2013;90.
</div>
<div id="ref-tahiliani2021machine" class="csl-entry" role="listitem">
Tahiliani, Aman, Vikas Hassija, Vinay Chamola, and Mohsen Guizani. 2021.
<span>&#x201C;Machine Unlearning: Its Need and Implementation
Strategies.&#x201D;</span> In <em>IC3</em>, 241&#x2013;46.
</div>
<div id="ref-nguyen2017retaining" class="csl-entry" role="listitem">
Tam, Nguyen Thanh, Matthias Weidlich, Duong Chi Thang, Hongzhi Yin, and
Nguyen Quoc Viet Hung. 2017. <span>&#x201C;Retaining Data from Streams of
Social Platforms with Minimal Regret.&#x201D;</span> In <em>IJCAI</em>,
2850&#x2013;56.
</div>
<div id="ref-tanha2020boosting" class="csl-entry" role="listitem">
Tanha, Jafar, Yousef Abdi, Negin Samadi, Nazila Razzaghi, and Mohammad
Asadpour. 2020. <span>&#x201C;Boosting Methods for Multi-Class Imbalanced Data
Classification: An Experimental Review.&#x201D;</span> <em>Journal of Big
Data</em> 7 (1): 1&#x2013;47.
</div>
<div id="ref-tarun2021fast" class="csl-entry" role="listitem">
Tarun, Ayush K, Vikram S Chundawat, Murari Mandal, and Mohan
Kankanhalli. 2021. <span>&#x201C;Fast yet Effective Machine Unlearning.&#x201D;</span>
<em>arXiv Preprint arXiv:2111.08947</em>.
</div>
<div id="ref-thudi2022unrolling" class="csl-entry" role="listitem">
Thudi, Anvith, Gabriel Deza, Varun Chandrasekaran, and Nicolas Papernot.
2022. <span>&#x201C;Unrolling SGD: Understanding Factors Influencing Machine
Unlearning.&#x201D;</span> In <em>EuroS&amp;P</em>, 303&#x2013;19.
</div>
<div id="ref-thudi2021necessity" class="csl-entry" role="listitem">
Thudi, Anvith, Hengrui Jia, Ilia Shumailov, and Nicolas Papernot. 2022.
<span>&#x201C;On the Necessity of Auditable Algorithmic Definitions for Machine
Unlearning.&#x201D;</span> In <em>USENIX Security</em>, 4007&#x2013;22.
</div>
<div id="ref-thudi2022bounding" class="csl-entry" role="listitem">
Thudi, Anvith, Ilia Shumailov, Franziska Boenisch, and Nicolas Papernot.
2022. <span>&#x201C;Bounding Membership Inference.&#x201D;</span> <em>arXiv Preprint
arXiv:2202.12232</em>.
</div>
<div id="ref-tishby2000information" class="csl-entry" role="listitem">
Tishby, Naftali et al. 2000. <span>&#x201C;The Information Bottleneck
Method.&#x201D;</span> <em>arXiv Preprint Physics/0004057</em>.
</div>
<div id="ref-tishby2015deep" class="csl-entry" role="listitem">
Tishby, Naftali, and Noga Zaslavsky. 2015. <span>&#x201C;Deep Learning and the
Information Bottleneck Principle.&#x201D;</span> In <em>ITW</em>, 1&#x2013;5.
</div>
<div id="ref-tsai2014incremental" class="csl-entry" role="listitem">
Tsai, Cheng-Hao, Chieh-Yen Lin, and Chih-Jen Lin. 2014.
<span>&#x201C;Incremental and Decremental Training for Linear
Classification.&#x201D;</span> In <em>KDD</em>, 343&#x2013;52.
</div>
<div id="ref-tveit2003multicategory" class="csl-entry" role="listitem">
Tveit, Amund et al. 2003. <span>&#x201C;Multicategory Incremental Proximal
Support Vector Classifiers.&#x201D;</span> In <em>KES</em>, 386&#x2013;92.
</div>
<div id="ref-tveit2003incremental" class="csl-entry" role="listitem">
Tveit, Amund, Magnus Lie Hetland, and H&#xE5;vard Engum. 2003.
<span>&#x201C;Incremental and Decremental Proximal Support Vector
Classification Using Decay Coefficients.&#x201D;</span> In <em>DaWaK</em>,
422&#x2013;29.
</div>
<div id="ref-ullah2021machine" class="csl-entry" role="listitem">
Ullah, Enayat, Tung Mai, Anup Rao, Ryan A Rossi, and Raman Arora. 2021.
<span>&#x201C;Machine Unlearning via Algorithmic Stability.&#x201D;</span> In
<em>Conference on Learning Theory</em>, 4126&#x2013;42.
</div>
<div id="ref-veale2018algorithms" class="csl-entry" role="listitem">
Veale, Michael, Reuben Binns, and Lilian Edwards. 2018.
<span>&#x201C;Algorithms That Remember: Model Inversion Attacks and Data
Protection Law.&#x201D;</span> <em>Philos. Trans. R. Soc. A</em> 376 (2133):
20180083.
</div>
<div id="ref-verdu2014total" class="csl-entry" role="listitem">
Verd&#xFA;, Sergio. 2014. <span>&#x201C;Total Variation Distance and the
Distribution of Relative Information.&#x201D;</span> In <em>ITA</em>, 1&#x2013;3.
</div>
<div id="ref-villaronga2018humans" class="csl-entry" role="listitem">
Villaronga, Eduard Fosch, Peter Kieseberg, and Tiffany Li. 2018.
<span>&#x201C;Humans Forget, Machines Remember: Artificial Intelligence and the
Right to Be Forgotten.&#x201D;</span> <em>Computer Law &amp; Security
Review</em> 34 (2): 304&#x2013;13.
</div>
<div id="ref-voigt2017eu" class="csl-entry" role="listitem">
Voigt, Paul, and Axel Von dem Bussche. 2017. <span>&#x201C;The EU General Data
Protection Regulation (GDPR).&#x201D;</span> <em>A Practical Guide, 1st Ed.,
Cham: Springer International Publishing</em>.
</div>
<div id="ref-wang2022efficiently" class="csl-entry" role="listitem">
Wang, Benjamin Longxiang, and Sebastian Schelter. 2022.
<span>&#x201C;Efficiently Maintaining Next Basket Recommendations Under
Additions and Deletions of Baskets and Items.&#x201D;</span> <em>arXiv Preprint
arXiv:2201.13313</em>.
</div>
<div id="ref-wang2019neural" class="csl-entry" role="listitem">
Wang, Bolun, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath,
Haitao Zheng, and Ben Y Zhao. 2019. <span>&#x201C;Neural Cleanse: Identifying
and Mitigating Backdoor Attacks in Neural Networks.&#x201D;</span> In
<em>SP</em>, 707&#x2013;23.
</div>
<div id="ref-wang2022federated" class="csl-entry" role="listitem">
Wang, Junxiao, Song Guo, et al. 2022. <span>&#x201C;Federated Unlearning via
Class-Discriminative Pruning.&#x201D;</span> In <em>WWW</em>, 622&#x2013;32.
</div>
<div id="ref-wang2009learning" class="csl-entry" role="listitem">
Wang, Rui, Yong Fuga Li, XiaoFeng Wang, Haixu Tang, and Xiaoyong Zhou.
2009. <span>&#x201C;Learning Your Identity and Disease from Research Papers:
Information Leaks in Genome Wide Association Study.&#x201D;</span> In
<em>CCS</em>, 534&#x2013;44.
</div>
<div id="ref-wang2020towards" class="csl-entry" role="listitem">
Wang, Zeyu, Klint Qinami, Ioannis Christos Karakozis, Kyle Genova, Prem
Nair, Kenji Hata, and Olga Russakovsky. 2020. <span>&#x201C;Towards Fairness in
Visual Recognition: Effective Strategies for Bias Mitigation.&#x201D;</span> In
<em>CVPR</em>, 8919&#x2013;28.
</div>
<div id="ref-warnecke2021machine" class="csl-entry" role="listitem">
Warnecke, Alexander, Lukas Pirch, Christian Wressnegger, and Konrad
Rieck. 2021. <span>&#x201C;Machine Unlearning of Features and Labels.&#x201D;</span>
<em>arXiv Preprint arXiv:2108.11577</em>.
</div>
<div id="ref-wu2022federated" class="csl-entry" role="listitem">
Wu, Chen et al. 2022. <span>&#x201C;Federated Unlearning with Knowledge
Distillation.&#x201D;</span> <em>arXiv Preprint arXiv:2201.09441</em>.
</div>
<div id="ref-wu2019simplifying" class="csl-entry" role="listitem">
Wu, Felix, Amauri Souza, Tianyi Zhang, Christopher Fifty, Tao Yu, and
Kilian Weinberger. 2019. <span>&#x201C;Simplifying Graph Convolutional
Networks.&#x201D;</span> In <em>ICML</em>, 6861&#x2013;71.
</div>
<div id="ref-wu2022puma" class="csl-entry" role="listitem">
Wu, Ga, Masoud Hashemi, and Christopher Srinivasa. 2022. <span>&#x201C;PUMA:
Performance Unchanged Model Augmentation for Training Data
Removal.&#x201D;</span> In <em>AAAI</em>.
</div>
<div id="ref-wu2020deltagrad" class="csl-entry" role="listitem">
Wu, Yinjun et al. 2020. <span>&#x201C;DeltaGrad: Rapid Retraining of Machine
Learning Models.&#x201D;</span> In <em>ICML</em>, 10355&#x2013;66.
</div>
<div id="ref-wu2020priu" class="csl-entry" role="listitem">
Wu, Yinjun, Val Tannen, and Susan B Davidson. 2020. <span>&#x201C;PrIU: A
Provenance-Based Approach for Incrementally Updating Regression
Models.&#x201D;</span> In <em>SIGMOD</em>, 447&#x2013;62.
</div>
<div id="ref-yoon2022few" class="csl-entry" role="listitem">
Yoon, Youngsik, Jinhwan Nam, Hyojeong Yun, Dongwoo Kim, and Jungseul Ok.
2022. <span>&#x201C;Few-Shot Unlearning by Model Inversion.&#x201D;</span> <em>arXiv
Preprint arXiv:2205.15567</em>.
</div>
<div id="ref-yu2021does" class="csl-entry" role="listitem">
Yu, Da, Huishuai Zhang, Wei Chen, Jian Yin, and Tie-Yan Liu. 2021.
<span>&#x201C;How Does Data Augmentation Affect Privacy in Machine
Learning?&#x201D;</span> In <em>AAAI</em>, 35 (12): 10746&#x2013;53.
</div>
<div id="ref-yu2015lsun" class="csl-entry" role="listitem">
Yu, Fisher, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, and
Jianxiong Xiao. 2015. <span>&#x201C;LSUN: Construction of a Large-Scale Image
Dataset Using Deep Learning with Humans in the Loop.&#x201D;</span> <em>arXiv
Preprint arXiv:1506.03365</em>.
</div>
<div id="ref-zanella2020analyzing" class="csl-entry" role="listitem">
Zanella-B&#xE9;guelin, Santiago, Lukas Wutschitz, Shruti Tople, Victor R&#xFC;hle,
Andrew Paverd, Olga Ohrimenko, et al. 2020. <span>&#x201C;Analyzing Information
Leakage of Updates to Natural Language Models.&#x201D;</span> In
<em>SIGSAC</em>, 363&#x2013;75.
</div>
<div id="ref-zeng2021learning" class="csl-entry" role="listitem">
Zeng, Yingyan, Tianhao Wang, Si Chen, Hoang Anh Just, Ran Jin, and Ruoxi
Jia. 2021. <span>&#x201C;Learning to Refit for Convex Learning
Problems.&#x201D;</span> <em>arXiv Preprint arXiv:2111.12545</em>.
</div>
<div id="ref-zhang2020deep" class="csl-entry" role="listitem">
Zhang, Hao, Bo Chen, Yulai Cong, Dandan Guo, Hongwei Liu, and Mingyuan
Zhou. 2020. <span>&#x201C;Deep Autoencoding Topic Model with Scalable Hybrid
Bayesian Inference.&#x201D;</span> <em>TPAMI</em> 43 (12): 4306&#x2013;22.
</div>
<div id="ref-zhang2022machine" class="csl-entry" role="listitem">
Zhang, Peng-Fei, Guangdong Bai, Zi Huang, and Xin-Shun Xu. 2022.
<span>&#x201C;Machine Unlearning for Image Retrieval: A Generative Scrubbing
Approach.&#x201D;</span> In <em>MM</em>, 237&#x2013;45.
</div>
<div id="ref-zou2018ai" class="csl-entry" role="listitem">
Zou, James, and Londa Schiebinger. 2018. <span>&#x201C;AI Can Be Sexist and
Racist &#x2013; It&#x2019;s Time to Make It Fair.&#x201D;</span> Nature Publishing Group.
</div>
</div>
<aside id="footnotes" class="footnotes footnotes-end-of-document"
role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p><a
href="https://github.com/tamlhp/awesome-machine-unlearning"
class="uri">https://github.com/tamlhp/awesome-machine-unlearning</a><a
href="#fnref1" class="footnote-back" role="doc-backlink">&#x21A9;&#xFE0E;</a></p></li>
<li id="fn2"><p><a
href="https://insights.daffodilsw.com/blog/machine-unlearning-what-it-is-all-about"
class="uri">https://insights.daffodilsw.com/blog/machine-unlearning-what-it-is-all-about</a><a
href="#fnref2" class="footnote-back" role="doc-backlink">&#x21A9;&#xFE0E;</a></p></li>
<li id="fn3"><p><a
href="https://github.com/ZIYU-DEEP/Awesome-Information-Bottleneck"
class="uri">https://github.com/ZIYU-DEEP/Awesome-Information-Bottleneck</a><a
href="#fnref3" class="footnote-back" role="doc-backlink">&#x21A9;&#xFE0E;</a></p></li>
</ol>
</aside>