InstanceGM: Instance-Dependent Noisy Label Learning via Graphical Modelling (IEEE/CVF WACV 2023 Round 1)
@InProceedings{Garg_2023_WACV,
author = {Garg, Arpit and Nguyen, Cuong and Felix, Rafael and Do, Thanh-Toan and Carneiro, Gustavo},
title = {Instance-Dependent Noisy Label Learning via Graphical Modelling},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2023},
pages = {2288-2298}
}
- Abstract
Noisy labels are unavoidable yet troublesome in the ecosystem of deep learning because models can easily overfit them. There are many types of label noise, such as symmetric, asymmetric and instance-dependent noise (IDN), with IDN being the only type that depends on image information. Such dependence on image information makes IDN a critical type of label noise to study, given that labelling mistakes are caused in large part by insufficient or ambiguous information about the visual classes present in images. Aiming to provide an effective technique to address IDN, we present a new graphical modelling approach called InstanceGM, which combines discriminative and generative models. The main contributions of InstanceGM are: i) the use of the continuous Bernoulli distribution to train the generative model, offering significant training advantages, and ii) the exploration of a state-of-the-art noisy-label discriminative classifier to generate clean labels from instance-dependent noisy-label samples. InstanceGM is competitive with current noisy-label learning approaches, particularly in IDN benchmarks using synthetic and real-world datasets, where our method shows better accuracy than the competitors in most experiments.
Figure 2 from InstanceGM. The proposed InstanceGM trains the classifiers to output clean labels for instance-dependent noisy-label samples. We first warm up the two classifiers (Classifier-{11,12}) using the classification loss, and then fit a GMM on that classification loss to separate clean and noisy samples, training with the semi-supervised model MixMatch from the DivideMix stage. Additionally, another set of encoders (Encoder-{1,2}) is used to generate the latent image features depicted in the graphical model from Fig. 1. Furthermore, for image reconstruction, the decoders (Decoder-{1,2}) use the continuous Bernoulli loss, and another set of classifiers (Classifier-{21,22}) helps to identify the original noisy labels using the standard cross-entropy loss
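For reference, the continuous Bernoulli distribution (introduced by Loaiza-Ganem and Cunningham, 2019) that underlies the decoders' reconstruction loss has the following log-density for pixel intensities $x \in [0, 1]$; this is the standard form of the distribution, stated here for convenience rather than copied from this repository:

```latex
\log p(x \mid \lambda)
  = x \log \lambda + (1 - x)\log(1 - \lambda) + \log C(\lambda),
\qquad
C(\lambda) =
\begin{cases}
  \dfrac{2\tanh^{-1}(1 - 2\lambda)}{1 - 2\lambda}, & \lambda \neq \tfrac{1}{2},\\[6pt]
  2, & \lambda = \tfrac{1}{2}.
\end{cases}
```

The extra normalising term $\log C(\lambda)$ is what distinguishes this loss from the ordinary binary cross-entropy, making it a proper likelihood for continuous-valued pixels.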
The graphical model above (left section) is adopted from CausalNL
Our code is heavily based on the two repositories mentioned above (CausalNL and DivideMix)
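The DivideMix stage referred to above fits a two-component Gaussian mixture to the per-sample classification loss and treats the low-mean component as "clean". The following is a minimal stdlib-only EM sketch of that idea, not the implementation used in this repository; all function names here are illustrative:

```python
import math
import random

def fit_two_gaussians(losses, iters=50):
    """Fit a 1-D two-component Gaussian mixture to per-sample losses via EM.

    DivideMix-style clean/noisy split: the component with the smaller mean
    models the 'clean' samples. Returns each sample's probability of being clean.
    """
    mu = [min(losses), max(losses)]      # initialise means at the extremes
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each Gaussian for each sample
        resp = []
        for x in losses:
            p = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            s = (p[0] + p[1]) or 1e-12
            resp.append([p[0] / s, p[1] / s])
        # M-step: update mixture weights, means, variances
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            pi[k] = nk / len(losses)
            mu[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, losses)) / nk, 1e-6)
    clean = 0 if mu[0] < mu[1] else 1    # low-loss component = clean
    return [r[clean] for r in resp]

# Toy demo: well-separated low/high losses should split cleanly
random.seed(0)
losses = [random.gauss(0.1, 0.05) for _ in range(100)] + \
         [random.gauss(2.0, 0.3) for _ in range(100)]
p_clean = fit_two_gaussians(losses)
```

In DivideMix (and hence here), samples whose clean probability exceeds a threshold go to the labelled set and the rest to the unlabelled set for MixMatch.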
- All the libraries used can be found in the requirements file
For adding artificial instance-dependent noise in CIFAR10/100, we use the code from Part-dependent Label Noise. Please check the tools.py file in our repository
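The actual noise-injection code lives in tools.py and follows the Part-dependent Label Noise repository. As a rough, simplified sketch of how instance-dependent (not part-dependent) noise is typically generated, the flip probabilities depend on each image's features through a random projection; everything below is an illustrative stdlib-only approximation, and all names are hypothetical:

```python
import math
import random

def instance_dependent_noise(features, labels, num_classes, rate, seed=0):
    """Flip labels with probabilities that depend on each instance's features.

    Simplified IDN generation: project each feature vector through a random
    matrix to get per-class flip scores, exclude the true class, take a
    softmax, and scale so the expected flip probability is near `rate`.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    # random projection: one weight per (feature dimension, class) pair
    W = [[rng.gauss(0, 1) for _ in range(num_classes)] for _ in range(dim)]
    noisy = []
    for x, y in zip(features, labels):
        scores = [sum(x[d] * W[d][c] for d in range(dim))
                  for c in range(num_classes)]
        scores[y] = -math.inf             # never "flip" to the true class
        m = max(s for c, s in enumerate(scores) if c != y)
        exp = [0.0 if c == y else math.exp(s - m)
               for c, s in enumerate(scores)]
        z = sum(exp)
        # per-instance flip rate drawn around the target rate, clipped to [0, 1]
        q = min(max(rng.gauss(rate, 0.1), 0.0), 1.0)
        probs = [q * e / z for e in exp]
        probs[y] = 1.0 - q                # keep the true label w.p. 1 - q
        # sample the (possibly) noisy label from the categorical distribution
        u, acc = rng.random(), 0.0
        for c, p in enumerate(probs):
            acc += p
            if u < acc:
                noisy.append(c)
                break
        else:
            noisy.append(y)
    return noisy
```

Because the flip distribution is a function of the features, ambiguous images end up with systematically different noise than easy ones, which is what makes IDN harder than symmetric noise.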
To run this project, install the libraries listed in the requirements file:
pip install -r requirements.txt
bash cifar10.sh
python instanceGM.py --r 0.5
- --r is the noise rate
To install Docker on your system, please follow the official Docker documentation
- To run it on CIFAR-10 (this dataset is already inside docker image), run the following command from your terminal
docker run --gpus 1 -ti arpit2412/instancegm:cifar /bin/bash -c "cd /src && source activate instanceGM && python instanceGM.py --r 0.5"
The above command includes GPU support; it automatically pulls the Docker image from Docker Hub if it is not found locally, and runs it after activating the environment
To change the noise rate, change the --r argument; by default it is 0.5
- To run it on CIFAR-100 (this dataset is already inside docker image), run the following command from your terminal
docker run --gpus 1 -ti arpit2412/instancegm:cifar /bin/bash -c "cd /src && source activate instanceGM && python instanceGM.py --num_class 100 --data_path ./cifar-100 --dataset cifar100 --r 0.5"
- To change the noise rate, change the --r argument (0.5 by default); the remaining arguments switch the settings from CIFAR10 to CIFAR100
- To run Animal10N, you must have the dataset stored on your local machine; you can then mount that folder into the Docker image using the -v parameter while running InstanceGM
wandb docker run --gpus 1 -v absolute_path_of_animal10N/:/src/animal10N/ -ti instancegm /bin/bash -c "cd ./src && source activate instanceGM && python instanceGM_animal10N.py --saved False"
- Please replace absolute_path_of_animal10N with the absolute path of your Animal10N dataset
- To record the progress with all the loss curves, accuracy curves and sample images, we used wandb. If you are using it for the first time, it might ask for your wandb credentials
- When running Animal10N for the first time, the dataset labels are saved, which is why saved is False by default; on subsequent runs you can reuse the previously saved labels and data information by passing --saved True in the above command
- The CIFAR10/CIFAR100 configurations are followed to run this
- To run Red Mini-ImageNet, you must have the dataset stored on your local machine; you can then mount that folder into the Docker image using the -v parameter while running InstanceGM
- Dataset link: https://google.github.io/controlled-noisy-web-labels/download.html
- The directory structure of Red Mini-ImageNet is described in redMini.txt
wandb docker run --gpus 1 -v absolute_path_of_redMini/:/src/red_blue/ -ti instancegm /bin/bash -c "cd ./src && source activate instanceGM && python instanceGM_redMini.py"
- Please replace absolute_path_of_redMini with the absolute path of your Red Mini-ImageNet dataset
- To record the progress with all the loss curves, accuracy curves and sample images, we used wandb. If you are using it for the first time, it might ask for your wandb credentials
- Following the literature, the noise rates considered were 0.2, 0.4, 0.6 and 0.8 (default is 0.2). The rate can easily be changed by adding --r to the above command, e.g.
python instanceGM_redMini.py --r 0.4
- The CIFAR10/CIFAR100 configurations are followed to run this
- To run Clothing1M, you must have the dataset stored on your local machine; you can then mount that folder into the Docker image using the -v parameter while running InstanceGM
wandb docker run --gpus 1 -v absolute_path_of_clothing1M/clothing1M:/src/clothing1M/ -ti instancegm /bin/bash -c "cd ./src && source activate instanceGM && python instanceGM_clothing1M.py"
- Please replace absolute_path_of_clothing1M/clothing1M with the absolute path of your Clothing1M dataset
- To record the progress with all the loss curves, accuracy curves and sample images, we used wandb. If you are using it for the first time, it might ask for your wandb credentials
- Following the literature, a pretrained ResNet is used, so some pretrained weights might be downloaded automatically
- Pull image from docker hub
docker pull arpit2412/instancegm:cifar
- If the pull is successful, the following command should list the image
docker image ls
- All the files are present in src folder in docker image. To check all the files:
docker run -ti arpit2412/instancegm:cifar /bin/bash
cd src
ls
- If you want to build the image from the files provided in the GitHub repository
docker build -f Dockerfile_train -t docker_instancegm .
- Cifar100
- Red Mini-ImageNet
- Animal-10N
This work is licensed under a Custom License. Non-commercial use is permitted without restrictions, while commercial users must contact the copyright holder for licensing permission.