
InstanceGM: Instance-Dependent Noisy Label Learning via Graphical Modelling (IEEE/CVF WACV 2023 Round 1)

Paper Link: https://openaccess.thecvf.com/content/WACV2023/html/Garg_Instance-Dependent_Noisy_Label_Learning_via_Graphical_Modelling_WACV_2023_paper.html

Please Cite

 @InProceedings{Garg_2023_WACV,
    author    = {Garg, Arpit and Nguyen, Cuong and Felix, Rafael and Do, Thanh-Toan and Carneiro, Gustavo},
    title     = {Instance-Dependent Noisy Label Learning via Graphical Modelling},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {2288-2298}
}
Abstract

Noisy labels are unavoidable yet troublesome in the ecosystem of deep learning because models can easily overfit them. There are many types of label noise, such as symmetric, asymmetric and instance-dependent noise (IDN), with IDN being the only type that depends on image information. Such dependence on image information makes IDN a critical type of label noise to study, given that labelling mistakes are caused in large part by insufficient or ambiguous information about the visual classes present in images. Aiming to provide an effective technique to address IDN, we present a new graphical modelling approach called InstanceGM, that combines discriminative and generative models. The main contributions of InstanceGM are: i) the use of the continuous Bernoulli distribution to train the generative model, offering significant training advantages, and ii) the exploration of a state-of-the-art noisy-label discriminative classifier to generate clean labels from instance-dependent noisy-label samples. InstanceGM is competitive with current noisy-label learning approaches, particularly in IDN benchmarks using synthetic and real-world datasets, where our method shows better accuracy than the competitors in most experiments.
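The continuous Bernoulli loss mentioned in the abstract replaces the usual Bernoulli reconstruction loss so that pixel intensities in [0, 1] are scored under a properly normalised density. As a hedged illustration (a pure-Python sketch, not the paper's implementation), its log-density can be computed as:

```python
import math

def log_cont_bernoulli(x, lam, eps=1e-6):
    """Log-density of the continuous Bernoulli CB(lam) at x in [0, 1].

    p(x | lam) = C(lam) * lam^x * (1 - lam)^(1 - x), with normalising
    constant C(lam) = 2 * atanh(1 - 2*lam) / (1 - 2*lam) for lam != 0.5
    (and C(0.5) = 2, by continuity).
    """
    lam = min(max(lam, eps), 1 - eps)
    log_unnorm = x * math.log(lam) + (1 - x) * math.log(1 - lam)
    if abs(lam - 0.5) < 1e-4:
        log_c = math.log(2.0)  # limit of C(lam) as lam -> 0.5
    else:
        log_c = math.log(2 * math.atanh(1 - 2 * lam) / (1 - 2 * lam))
    return log_c + log_unnorm
```

At lam = 0.5 the distribution reduces to the uniform on [0, 1] (log-density 0); the normalising constant is exactly what plain binary cross-entropy omits when treated as a density on [0, 1].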

Instance-Dependent Noise

Badges at the time of acceptance (2022): Papers with Code (PWC) leaderboard badges.
Methodology


Figure 2 from InstanceGM. The proposed InstanceGM trains the classifiers to output clean labels for instance-dependent noisy-label samples. We first warm up the two classifiers (Classifier-{11,12}) with the classification loss, then use the classification loss to fit a GMM that separates clean from noisy samples, training with the semi-supervised model MixMatch as in the DivideMix stage. Additionally, a pair of encoders (Encoder-{1,2}) generates the latent image features depicted in the graphical model of Fig. 1. For image reconstruction, the decoders (Decoder-{1,2}) are trained with the continuous Bernoulli loss, and a second pair of classifiers (Classifier-{21,22}) recovers the original noisy labels using the standard cross-entropy loss.
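The clean/noisy separation in the DivideMix stage fits a two-component one-dimensional Gaussian mixture to the per-sample classification losses and reads each sample's clean probability off the posterior of the lower-mean component. A self-contained sketch of that idea (plain-Python EM for illustration; the official DivideMix code uses sklearn's GaussianMixture):

```python
import math
import random

def fit_two_gaussians(losses, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture to per-sample losses via EM.

    Returns, for every sample, the posterior probability of belonging to
    the lower-mean ("clean") component -- the DivideMix-style clean prob.
    """
    mu = [min(losses), max(losses)]    # initialise means at the extremes
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    resp = [[0.5, 0.5] for _ in losses]
    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters
        for i, x in enumerate(losses):
            dens = [pi[k] / math.sqrt(2 * math.pi * var[k])
                    * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                    for k in range(2)]
            s = sum(dens) or 1e-12
            resp[i] = [d / s for d in dens]
        # M-step: update mixture weights, means, variances
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            pi[k] = nk / len(losses)
            mu[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, losses)) / nk, 1e-6)
    clean = 0 if mu[0] < mu[1] else 1  # lower-mean component = clean
    return [r[clean] for r in resp]
```

Samples whose clean probability exceeds a threshold go to the labelled set for MixMatch; the rest are treated as unlabelled.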

Dependency Repos

Our code is heavily based on the two repositories listed below.

Tech Stack

PyTorch

Python

WandB

Docker

  • All the libraries used are listed in the requirements file

Datasets

To add artificial instance-dependent noise to CIFAR10/100, we use the code from Part-dependent Label Noise. Please check the tools.py file in our repository.
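For context, a common recipe for synthesising instance-dependent noise works as follows: each sample draws its own flip rate around the target noise rate, and the flip destination depends on the instance's features through a random projection. A simplified sketch in that spirit (illustrative only, not the exact tools.py code; all names here are ours):

```python
import math
import random

def add_instance_noise(features, labels, num_classes, rate, seed=0):
    """Inject instance-dependent label noise (simplified sketch).

    Each sample draws its own flip probability q around `rate`, and the
    wrong class is chosen by projecting the sample's feature vector
    through a random matrix W, so the corruption depends on the instance
    (unlike symmetric or asymmetric noise).
    """
    rng = random.Random(seed)
    dim = len(features[0])
    # random projection: feature vector -> per-class "confusability" score
    W = [[rng.gauss(0, 1) for _ in range(num_classes)] for _ in range(dim)]
    noisy = []
    for x, y in zip(features, labels):
        q = min(max(rng.gauss(rate, 0.1), 0.0), 1.0)  # per-sample flip rate
        logits = [sum(x[d] * W[d][k] for d in range(dim))
                  for k in range(num_classes)]
        # softmax over the wrong classes only; the true class keeps 1 - q
        m = max(l for k, l in enumerate(logits) if k != y)
        exps = [0.0 if k == y else math.exp(l - m)
                for k, l in enumerate(logits)]
        z = sum(exps)
        probs = [q * e / z for e in exps]
        probs[y] = 1.0 - q
        u, acc, new_y = rng.random(), 0.0, y
        for k, p in enumerate(probs):  # draw the (possibly) noisy label
            acc += p
            if u < acc:
                new_y = k
                break
        noisy.append(new_y)
    return noisy
```

With this construction the expected fraction of flipped labels matches the target rate, while which wrong class a sample flips to is a deterministic function of its features.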

Run without container

Dependencies

To run this project, install the required libraries from the requirements file

pip install -r requirements.txt

Getting the CIFAR10 dataset

bash cifar10.sh

Run the model

$ python instanceGM.py --r 0.5

  • r is the noise rate

Run using Docker container (preferred)

To install Docker on your system, please follow the official Docker documentation

Running CIFAR10

  • To run it on CIFAR-10 (the dataset is already inside the Docker image), run the following command from your terminal

docker run --gpus 1 -ti arpit2412/instancegm:cifar /bin/bash -c "cd /src && source activate instanceGM && python instanceGM.py --r 0.5"

  • The above command includes GPU support; it automatically pulls the Docker image from Docker Hub if it is not found locally, and runs it after activating the environment

  • To change the noise rate, change the --r argument; by default it is 0.5

Running CIFAR100

  • To run it on CIFAR-100 (the dataset is already inside the Docker image), run the following command from your terminal

docker run --gpus 1 -ti arpit2412/instancegm:cifar /bin/bash -c "cd /src && source activate instanceGM && python instanceGM.py --num_class 100 --data_path ./cifar-100 --dataset cifar100 --r 0.5"

  • To change the noise rate, change the --r argument (by default it is 0.5); the remaining arguments switch the settings from CIFAR10 to CIFAR100

Running Animal10N (WandB enabled)

  • To run Animal10N, you must have the dataset stored on your local machine; the folder is then mounted into the Docker image with the -v parameter when running InstanceGM

wandb docker run --gpus 1 -v absolute_path_of_animal10N/:/src/animal10N/ -ti instancegm /bin/bash -c "cd ./src && source activate instanceGM && python instanceGM_animal10N.py --saved False"

  • Please replace absolute_path_of_animal10N with your absolute path of Animal10N dataset

  • To record the progress, including loss curves, accuracy curves and sample images, we use wandb. If you are using it for the first time, it might ask for your wandb credentials

  • When running Animal10N for the first time, the dataset labels are saved, which is why saved is False (the default); on subsequent runs you can reuse the previously saved label and data information by passing --saved True in the above command

  • The CIFAR10/CIFAR100 configurations are followed for this run

Running Red Mini-ImageNet (WandB enabled)

  • To run Red Mini-ImageNet, you must have the dataset stored on your local machine; the folder is then mounted into the Docker image with the -v parameter when running InstanceGM

  • Dataset link: https://google.github.io/controlled-noisy-web-labels/download.html

  • Directory structure of Red Mini-ImageNet is mentioned in redMini.txt

wandb docker run --gpus 1 -v absolute_path_of_redMini/:/src/red_blue/ -ti instancegm /bin/bash -c "cd ./src && source activate instanceGM && python instanceGM_redMini.py"

  • Please replace absolute_path_of_redMini with your absolute path of Red Mini-ImageNet dataset

  • To record the progress, including loss curves, accuracy curves and sample images, we use wandb. If you are using it for the first time, it might ask for your wandb credentials

  • Following the literature, the noise rates considered are 0.2, 0.4, 0.6 and 0.8 (default 0.2). The rate can be changed by adding --r, e.g. python instanceGM_redMini.py --r 0.4, to the above command

  • The CIFAR10/CIFAR100 configurations are followed for this run

Running Clothing1M (WandB enabled)

  • To run Clothing1M, you must have the dataset stored on your local machine; the folder is then mounted into the Docker image with the -v parameter when running InstanceGM

wandb docker run --gpus 1 -v absolute_path_of_clothing1M/clothing1M:/src/clothing1M/ -ti instancegm /bin/bash -c "cd ./src && source activate instanceGM && python instanceGM_clothing1M.py"

  • Please replace absolute_path_of_clothing1M/clothing1M with your absolute path of Clothing1M dataset

  • To record the progress, including loss curves, accuracy curves and sample images, we use wandb. If you are using it for the first time, it might ask for your wandb credentials

  • Following the literature, a pretrained ResNet is used, so some pretrained weights might be downloaded automatically

Extra commands (just to explore; not needed for running on CIFAR10/CIFAR100)

  • Pull image from docker hub

docker pull arpit2412/instancegm:cifar

  • If the pull is successful, the following command should list the image

docker image ls

  • All the files are present in the src folder of the Docker image. To list them:

docker run -ti arpit2412/instancegm:cifar /bin/bash -c "cd /src && ls"

  • To build the image from the files provided in the GitHub repository:

docker build -f Dockerfile_train -t docker_instancegm .

Results

  • CIFAR100 (results table image)

  • Red Mini-ImageNet (results table image)

  • Animal-10N (results table image)

Authors


License

This work is licensed under a Custom License. Non-commercial use is permitted without restrictions, while commercial users must contact the copyright holder for licensing permission.
