GitHub - mahf93/tracking-memorization: Code for "Leveraging Unlabeled Data to Track Memorization"

Code for "Leveraging Unlabeled Data to Track Memorization".

Requirements

We conducted experiments under:

python 3.7.10
torch 1.7.1
torchvision 0.8.2
cuda 10.1
jupyter-notebook 6.1.5
ipython 7.19.0
1 Nvidia Titan X Maxwell GPU

Description of files

datasets.py: the code to get data loader for a given batch size, training set size and level of label noise.
models folder: the folder that contains the neural network configurations that are used.
experiments.py: the code to train the models and compute susceptibility to noisy labels in each epoch.
results.ipynb: the code to plot figures after the execution of experiments.py is finished.

Example to train

To train a resnet on the CIFAR-10 dataset with 50% label noise level, batch size=128, for 200 epochs run the following command:

python3 experiments.py --model resnet --filename <filename> --modelfilename <modelfilename>

The model is saved in ./checkpoint/ directoty and the results are saved in ./results/ directory. Running this script automatically computes susceptibility during training as well.

A walk-through on a dataset with real-world label noise

To illustrate how to use our method, below we provide our results for the Clothing1M dataset, which is a real-world dataset with 1M images of clothes. The images have been labeled from the texts that accompany them, hence there are both clean and noisy labels in the set. The ground-truth of the labels is not available (i.e., we do not know which samples have clean labels and which samples have noisy labels). Therefore we cannot explicitly track memorization as measured by the accuracy on a noisy subset of the training set, but we can use susceptibility as a metric, since it does not require access to the ground-truth labels. Although the labels of the training set are not clean, a held-out test set with clean labels is available, in addition to the training set. We do not use this held-out clean set during training, but use it only to evaluate the performance of our approach based on susceptibility.

We train 19 different settings on this dataset with various architectures (ResNet, AlexNet and VGG) and varying learning rates and learning rate schedulers. We compute the training accuracy and susceptibility during the training process for each setting and visualize the results in the figure below.

We divide the models of the above figure into 4 regions, where the boundaries are set to the mean value of the training accuracy (horizontal line) and mean value of susceptibility (vertical line): Region 1: Models that are trainable and resistant to memorization, Region 2: Trainable and but not resistant, Region 3: Not trainable but resistant and Region 4: Neither trainable nor resistant. This is shown in the figure below.

Our approach suggests selecting models in Region 1 (low susceptibility, high training accuracy). In order to assess how our approach does in model-selection, we can reveal the test accuracy computed on a held-out clean test set in the figure below. We observe that the average (± standard deviation) of the test accuracy of models in each region is as follows:

Region 1: 61.799% ± 1.643
Region 2: 57.893% ± 3.562
Region 3: 51.250% ± 17.209
Region 4: 51.415% ± 9.709

We observe that using our approach we are able to select models with a very high test accuracy. In addition, the test accuracies of models in Region 1 have the least amount of standard deviation. Note that our susceptibility metric $\zeta$ does not use any information about the label noise level or the label noise type that is present in these datasets. Random labeling is used for computing $\zeta$. Interestingly, even though within the training sets of this dataset the label noise type is different than random labeling (label noise type is instance-dependent), $\zeta$ is still successfully tracking memorization.

Access to the paper

You can find the full version of the paper (including appendices) at https://arxiv.org/abs/2212.04461.

Citation

To cite our work, for the arxiv version, use:

@article{forouzesh2022leveraging,
  title={Leveraging Unlabeled Data to Track Memorization},
  author={Forouzesh, Mahsa and Sedghi, Hanie and Thiran, Patrick},
  journal={arXiv preprint arXiv:2212.04461},
  year={2022}
}

and for the published version, use:

@inproceedings{
forouzesh2023leveraging,
title={Leveraging Unlabeled Data to Track Memorization},
author={Mahsa Forouzesh and Hanie Sedghi and Patrick Thiran},
booktitle={The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=ORp91sAbzI}
}

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
models		models
README.md		README.md
dataset.py		dataset.py
experiments.py		experiments.py
results.ipynb		results.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Requirements

Description of files

Example to train

A walk-through on a dataset with real-world label noise

Access to the paper

Citation

About

Releases

Packages

Languages

mahf93/tracking-memorization

Folders and files

Latest commit

History

Repository files navigation

Requirements

Description of files

Example to train

A walk-through on a dataset with real-world label noise

Access to the paper

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages