<a href="https://colab.research.google.com/github/mobarakol/tutorial_notebooks/blob/main/ImageNet_CIFAR_LT_LS_FL_Trained_Weights.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ImageNet (ILSVRC2012)

ILSVRC2012 spans 1000 object classes and contains 1,281,167 training images (732-1300 per class), 50,000 validation images, and 100,000 test images. It requires more than 150 GB of storage, and training a ResNet-50 on it takes around 215 hours on a T4 GPU in Google Colab. Folder name to class name mapping: https://www.image-net.org/challenges/LSVRC/2012/browse-synsets.php <br>
Class sizes are not equal in ImageNet. For example, the 10 largest classes:<br>
n02094433:    3047 (Yorkshire terrier)<br>
n02086240:    2563 (Shih-Tzu)<br>
n01882714:    2469 (koala bear, kangaroo bear, native bear)<br>
n02087394:    2449 (Rhodesian ridgeback)<br>
n02100735:    2426 (English setter)<br>
n00483313:    2410 (singles)<br>
n02279972:    2386 (monarch butterfly, Danaus plexippus)<br>
n09428293:    2382 (seashore)<br>
n02138441:    2341 (meerkat)<br>
n02100583:    2334 (vizsla, Hungarian pointer)<br>
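
Per-class counts like the ones above can be reproduced from an extracted dataset. The following is a minimal sketch, assuming an `ImageFolder`-style layout with one sub-folder per class; the helper name `count_images_per_class` and the extension list are illustrative, not part of any library.

```python
import os
from collections import Counter

# Assumed set of image file extensions (ImageNet files end in .JPEG)
IMAGE_EXTENSIONS = ('.jpeg', '.jpg', '.png')

def count_images_per_class(root):
    """Return a Counter mapping each class folder under `root` to its image count."""
    counts = Counter()
    for entry in os.scandir(root):
        if entry.is_dir():
            counts[entry.name] = sum(
                1 for name in os.listdir(entry.path)
                if name.lower().endswith(IMAGE_EXTENSIONS)
            )
    return counts

# Usage (assumes the training set was extracted to imagenet/train):
# counts = count_images_per_class('imagenet/train')
# for wnid, n in counts.most_common(10):
#     print(f'{wnid}: {n}')
```

`Counter.most_common(10)` returns the ten largest classes, which is the ordering used in the list above.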


Task-1. Image classification (2010-2014): Algorithms produce a list of object categories present in the image.<br>
Task-2. Single-object localization (2011-2014): Algorithms produce a list of object categories present in the image, along with an axis-aligned bounding box indicating the position and scale of one instance of each object category.<br>
Task-3. Object detection (2013-2014): Algorithms produce a list of object categories present in the image along with an axis-aligned bounding box indicating the position and scale of every instance of each object category.<br>

# Download Links:

Training Images (Task 1 & 2): https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar <br>
Training Annotations (Task 1 & 2): https://image-net.org/data/ILSVRC/2012/ILSVRC2012_bbox_train_v2.tar.gz <br>

Validation Images (all tasks): https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar

Validation Annotations (all tasks): https://image-net.org/data/ILSVRC/2012/ILSVRC2012_bbox_val_v3.tgz


# Preparing Train Images into Folders (not used in this tutorial)
src: https://github.com/pytorch/examples/blob/main/imagenet/extract_ILSVRC.sh

In [None]:
# Create the train directory and move the .tar file into it
!mkdir -p imagenet/train && mv ILSVRC2012_img_train.tar imagenet/train/
# Use %cd (not !cd) so the directory change persists for the following commands
%cd imagenet/train
# Extract training set; remove compressed file
!tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
#
# At this stage imagenet/train will contain 1000 compressed .tar files, one for each category
#
# For each .tar file:
#   1. create directory with same name as .tar file
#   2. extract and copy contents of .tar file into directory
#   3. remove .tar file
!find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
# Return to the starting directory
%cd ../..

# Download only Validation Set

In [29]:
!wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar

--2022-10-30 22:45:45--  https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar
Resolving image-net.org (image-net.org)... 171.64.68.16
Connecting to image-net.org (image-net.org)|171.64.68.16|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6744924160 (6.3G) [application/x-tar]
Saving to: ‘ILSVRC2012_img_val.tar’


2022-10-30 22:54:21 (12.5 MB/s) - ‘ILSVRC2012_img_val.tar’ saved [6744924160/6744924160]



# Preparing Valid Images into Folders

In [None]:
!mkdir imagenet
!mkdir imagenet/val
!tar -xvf ILSVRC2012_img_val.tar --directory imagenet/val
%cd imagenet/val
!wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash
%cd ../..

In [31]:
import argparse
import os
import shutil
import time

import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models

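# Device used by test() and by the model cells below
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def test(model, testloader):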
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()
    return correct / total

valdir = os.path.join('imagenet', 'val')
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])


val_dataset = datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        normalize,
    ]))
val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=512, shuffle=False,
    num_workers=2, pin_memory=True)

print('Sample size:', len(val_dataset))
for i, (input, target) in enumerate(val_loader):
    print('First batch:',input.shape, target)
    break


Sample size: 50000
First batch: torch.Size([512, 3, 224, 224]) tensor([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  2,  2,
         2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,
         2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,
         2,  2,  2,  2,  2,  2,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,
         3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,
         3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,
         3,  3,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  
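
The `test` function above reports top-1 accuracy only. ImageNet results are often quoted as top-5 as well, so here is a minimal sketch of a top-k counter that could replace the `predicted.eq(targets)` line. The helper name `topk_correct` is hypothetical, and the logits below are hand-made for illustration.

```python
import torch

def topk_correct(outputs, targets, k=5):
    """Count samples whose true label is among the k highest-scoring classes."""
    _, pred = outputs.topk(k, dim=1)                  # (batch, k) class indices
    hits = pred.eq(targets.unsqueeze(1)).any(dim=1)   # (batch,) True where label is in top k
    return hits.sum().item()

# Tiny check with hand-made logits (3 samples, 4 classes)
logits = torch.tensor([[0.1, 0.9, 0.0, 0.0],
                       [0.8, 0.1, 0.05, 0.05],
                       [0.2, 0.3, 0.4, 0.1]])
labels = torch.tensor([1, 1, 0])
print(topk_correct(logits, labels, k=1))  # only the first sample is right at top-1
print(topk_correct(logits, labels, k=2))  # the second sample is also right at top-2
```

With `k=1` this reduces to the same count that `test` accumulates in `correct`.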

# Vision Transformer and Variants
Basic: https://github.com/mobarakol/tutorial_notebooks/blob/main/ViT_Module_Visualization.ipynb<br>
Installation:<br>
github: https://github.com/rwightman/pytorch-image-models/tree/master/timm/models

In [4]:
!pip -q install timm


ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale - https://arxiv.org/pdf/2010.11929.pdf

In [None]:
from timm import create_model

device = 'cuda' if torch.cuda.is_available() else 'cpu'
vit = create_model("vit_large_patch16_224", pretrained=True).to(device)#vit_base_patch16_224
accuracy = test(vit, val_loader)
print('accuracy:',accuracy)

accuracy: 0.84374


Swin Transformer: Hierarchical Vision Transformer using Shifted Windows - https://arxiv.org/pdf/2103.14030.pdf

In [None]:
swintran = create_model("swin_base_patch4_window7_224", pretrained=True).to(device)
accuracy = test(swintran, val_loader)
print('accuracy:',accuracy)

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Downloading: "https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window7_224_22kto1k.pth" to /root/.cache/torch/hub/checkpoints/swin_base_patch4_window7_224_22kto1k.pth


accuracy: 0.84714


DeiT: Data-efficient Image Transformers - https://arxiv.org/abs/2012.12877

In [None]:
deit = create_model("deit_base_patch16_224", pretrained=True).to(device)
accuracy = test(deit, val_loader)
print('accuracy:',accuracy)

Downloading: "https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth" to /root/.cache/torch/hub/checkpoints/deit_base_patch16_224-b5f2ef4d.pth


accuracy: 0.81742


CaiT: Class-Attention in Image Transformers (https://arxiv.org/abs/2103.17239)

In [None]:
cait = create_model("cait_s24_224", pretrained=True).to(device)
accuracy = test(cait, val_loader)
print('accuracy:',accuracy)

Downloading: "https://dl.fbaipublicfiles.com/deit/S24_224.pth" to /root/.cache/torch/hub/checkpoints/S24_224.pth


accuracy: 0.83302


BEiT: BERT Pre-Training of Image Transformers (https://arxiv.org/abs/2106.08254). Note that the cell below loads the BEiT v2 checkpoint `beitv2_base_patch16_224`.

In [None]:
from timm import create_model
device = 'cuda' if torch.cuda.is_available() else 'cpu'

beit = create_model("beitv2_base_patch16_224", pretrained=True).to(device)
accuracy = test(beit, val_loader)
print('accuracy:',accuracy)

Downloading: "https://conversationhub.blob.core.windows.net/beit-share-public/beitv2/beitv2_base_patch16_224_pt1k_ft21kto1k.pth" to /root/.cache/torch/hub/checkpoints/beitv2_base_patch16_224_pt1k_ft21kto1k.pth


accuracy: 0.86092


CoaT: Co-Scale Conv-Attentional Image Transformers - https://arxiv.org/abs/2104.06399

In [None]:
coat = create_model("coat_mini", pretrained=True).to(device)
accuracy = test(coat, val_loader)
print('accuracy:',accuracy)

Downloading: "https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-coat-weights/coat_mini-2c6baf49.pth" to /root/.cache/torch/hub/checkpoints/coat_mini-2c6baf49.pth


accuracy: 0.80912


CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification (Chen et al., ICCV 2021)

In [None]:
crossvit = create_model("crossvit_base_240", pretrained=True).to(device)
accuracy = test(crossvit, val_loader)
print('accuracy:',accuracy)

Downloading: "https://github.com/IBM/CrossViT/releases/download/weights-0.1/crossvit_base_224.pth" to /root/.cache/torch/hub/checkpoints/crossvit_base_224.pth


accuracy: 0.82092


ConvMixer: Patches Are All You Need? (https://arxiv.org/pdf/2201.09792.pdf)

In [None]:
convmixer = create_model("convmixer_768_32", pretrained=True).to(device)
accuracy = test(convmixer, val_loader)
print('accuracy:',accuracy)

Downloading: "https://github.com/tmp-iclr/convmixer/releases/download/timm-v1.0/convmixer_768_32_ks7_p7_relu.pth.tar" to /root/.cache/torch/hub/checkpoints/convmixer_768_32_ks7_p7_relu.pth.tar


accuracy: 0.8008


ConvNeXt: A ConvNet for the 2020s - https://arxiv.org/pdf/2201.03545.pdf

In [None]:
convnext = create_model("convnext_base", pretrained=True).to(device)
accuracy = test(convnext, val_loader)
print('accuracy:',accuracy)

Downloading: "https://dl.fbaipublicfiles.com/convnext/convnext_base_1k_224_ema.pth" to /root/.cache/torch/hub/checkpoints/convnext_base_1k_224_ema.pth


accuracy: 0.83746


ViT_relpos: Rethinking and Improving Relative Position Encoding for Vision Transformer - https://arxiv.org/pdf/2107.14222.pdf

In [None]:
vit_relpos = create_model("vit_relpos_base_patch16_cls_224", pretrained=True).to(device) #vit_relpos_base_patch16_224
accuracy = test(vit_relpos, val_loader)
print('accuracy:',accuracy)



# ViTs from https://github.com/jeonsworld/ViT-pytorch

In [None]:
!pip -q install ml_collections
! git clone https://github.com/jeonsworld/ViT-pytorch
%cd ViT-pytorch
! wget https://storage.googleapis.com/vit_models/imagenet21k%2Bimagenet2012/R50%2BViT-B_16.npz
! touch models/__init__.py

[?25l[K     |████▏                           | 10 kB 30.8 MB/s eta 0:00:01[K     |████████▍                       | 20 kB 37.4 MB/s eta 0:00:01[K     |████████████▋                   | 30 kB 45.3 MB/s eta 0:00:01[K     |████████████████▉               | 40 kB 26.1 MB/s eta 0:00:01[K     |█████████████████████           | 51 kB 29.7 MB/s eta 0:00:01[K     |█████████████████████████▎      | 61 kB 33.7 MB/s eta 0:00:01[K     |█████████████████████████████▍  | 71 kB 30.5 MB/s eta 0:00:01[K     |████████████████████████████████| 77 kB 5.9 MB/s 
[?25h  Building wheel for ml-collections (setup.py) ... [?25l[?25hdone
Cloning into 'ViT-pytorch'...
remote: Enumerating objects: 170, done.
remote: Total 170 (delta 0), reused 0 (delta 0), pack-reused 170
Receiving objects: 100% (170/170), 21.20 MiB | 36.25 MiB/s, done.
Resolving deltas: 100% (83/83), done.
/content/ViT-pytorch
--2022-10-21 23:03:41--  https://storage.googleapis.com/vit_models/imagenet21k%2Bimagenet2012/R50

In [None]:
! wget https://storage.googleapis.com/vit_models/imagenet21k%2Bimagenet2012/ViT-B_16-224.npz

--2022-10-21 23:11:20--  https://storage.googleapis.com/vit_models/imagenet21k%2Bimagenet2012/ViT-B_16-224.npz
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.16.128, 142.251.33.208, 142.250.188.48, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.16.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 346335542 (330M) [application/octet-stream]
Saving to: ‘ViT-B_16-224.npz’


2022-10-21 23:11:24 (83.1 MB/s) - ‘ViT-B_16-224.npz’ saved [346335542/346335542]



In [None]:
%cd ViT-pytorch

/content/ViT-pytorch


In [None]:
import argparse
import os
import shutil
import time
import numpy as np
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models

from models.modeling import VisionTransformer, CONFIGS
#config = CONFIGS['R50-ViT-B_16']
config = CONFIGS['ViT-B_16']
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def test(model, testloader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)[0]
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()
    return correct / total

valdir = os.path.join('../imagenet', 'val')
normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])


val_dataset = datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((224,224)),
        transforms.ToTensor(),
        normalize,
    ]))
val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=300, shuffle=False,
    num_workers=2, pin_memory=True)

hvit = VisionTransformer(config, num_classes=1000, zero_head=False, img_size=224, vis=True)
hvit.load_from(np.load("ViT-B_16-224.npz"))
hvit.to(device)
accuracy = test(hvit, val_loader)
print('accuracy:',accuracy)

accuracy: 0.80314


# CIFAR-LT
src: https://github.com/XuZhengzhuo/Prior-LT

In [11]:
!git clone https://github.com/XuZhengzhuo/Prior-LT.git
%cd Prior-LT

Cloning into 'Prior-LT'...
remote: Enumerating objects: 54, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 54 (delta 1), reused 0 (delta 0), pack-reused 48
Unpacking objects: 100% (54/54), done.
/content/Prior-LT


CIFAR-100-LT-50: https://drive.google.com/file/d/1PKpxeeCO5ZRAq4srleTlcQqTTjQd6JfT/view

In [25]:
#src: https://github.com/XuZhengzhuo/Prior-LT
import gdown
url = 'https://drive.google.com/uc?id=1PKpxeeCO5ZRAq4srleTlcQqTTjQd6JfT'
gdown.download(url,'model_best_cifar100_lt50.pth.tar',quiet=False) 

url = 'https://drive.google.com/uc?id=16JUoxnbxuO7nivjw4M0LkUQiJ9AyAJDm'
gdown.download(url,'model_best_cifar100_lt200.pth.tar',quiet=False) 

url = 'https://drive.google.com/uc?id=1tclscVkcXj0lJum7Azy8qHecB7Pomc0c'
gdown.download(url,'model_best_cifar10_lt50.pth.tar',quiet=False) 

url = 'https://drive.google.com/uc?id=1GTf42bpfDmMz5MHTVsX9YkjLSeo9WJ-v'
gdown.download(url,'model_best_cifar10_lt200.pth.tar',quiet=False) 

Downloading...
From: https://drive.google.com/uc?id=1PKpxeeCO5ZRAq4srleTlcQqTTjQd6JfT
To: /content/Prior-LT/model_best_cifar100_lt50.pth.tar
100%|██████████| 3.82M/3.82M [00:00<00:00, 238MB/s]
Downloading...
From: https://drive.google.com/uc?id=16JUoxnbxuO7nivjw4M0LkUQiJ9AyAJDm
To: /content/Prior-LT/model_best_cifar100_lt200.pth.tar
100%|██████████| 3.82M/3.82M [00:00<00:00, 193MB/s]
Downloading...
From: https://drive.google.com/uc?id=1tclscVkcXj0lJum7Azy8qHecB7Pomc0c
To: /content/Prior-LT/model_best_cifar10_lt50.pth.tar
100%|██████████| 3.77M/3.77M [00:00<00:00, 110MB/s]
Downloading...
From: https://drive.google.com/uc?id=1GTf42bpfDmMz5MHTVsX9YkjLSeo9WJ-v
To: /content/Prior-LT/model_best_cifar10_lt200.pth.tar
100%|██████████| 3.77M/3.77M [00:00<00:00, 192MB/s]


'model_best_cifar10_lt200.pth.tar'

In [28]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision
from torchvision import models
import torchvision.transforms as transforms
from torch.optim.lr_scheduler import _LRScheduler
import matplotlib.pyplot as plt
from PIL import Image
import copy 
import os
import argparse
import sys
import random
import numpy as np
from torchvision import models
from models import resnet32
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def test(model, testloader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()
    return correct / total

def main(num_classes=100, ckpt=None, test_loader=None):
    model = resnet32(num_classes=num_classes)
    # map_location lets a GPU-saved checkpoint load on a CPU-only runtime as well
    model.load_state_dict(torch.load(ckpt, map_location=device)['state_dict_model'])
    model.to(device)
    acc = test(model, test_loader)
    return acc

mean_cifar, std_cifar = (0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)
transform_test = transforms.Compose([transforms.ToTensor(),
    transforms.Normalize(mean_cifar, std_cifar),])
test_dataset100 = torchvision.datasets.CIFAR100(root='data', train=False, download=True, transform=transform_test)
test_loader100 = torch.utils.data.DataLoader(test_dataset100, batch_size=2048, shuffle=False, num_workers=2)

test_dataset10 = torchvision.datasets.CIFAR10(root='data', train=False, download=True, transform=transform_test)
test_loader10 = torch.utils.data.DataLoader(test_dataset10, batch_size=2048, shuffle=False, num_workers=2)

ckpt_all =['model_best_cifar100_lt50.pth.tar', 'model_best_cifar100_lt200.pth.tar', 
           'model_best_cifar10_lt50.pth.tar', 'model_best_cifar10_lt200.pth.tar'] 
test_loader_all = [test_loader100, test_loader100, test_loader10, test_loader10]
num_classes_all = [100, 100, 10, 10]
for idx, ckpt in enumerate(ckpt_all):
    print(ckpt,':', main(num_classes=num_classes_all[idx], ckpt=ckpt, test_loader=test_loader_all[idx]))

Files already downloaded and verified
Files already downloaded and verified
model_best_cifar100_lt50.pth.tar : 0.5144
model_best_cifar100_lt200.pth.tar : 0.4347
model_best_cifar10_lt50.pth.tar : 0.8493
model_best_cifar10_lt200.pth.tar : 0.804
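
The `lt50` and `lt200` suffixes refer to the imbalance ratio between the largest and smallest class in the training set. A common way to build such a split is to let per-class counts decay exponentially; the sketch below assumes that construction and is not taken from the Prior-LT repository, which may differ in details. The function name `long_tailed_counts` is hypothetical.

```python
def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Per-class sample counts decaying exponentially from n_max to n_max / imbalance_ratio."""
    factor = 1.0 / imbalance_ratio
    return [int(n_max * factor ** (i / (num_classes - 1))) for i in range(num_classes)]

# CIFAR-100 has 500 training images per class; with ratio 50 the rarest class keeps 10
counts = long_tailed_counts(n_max=500, num_classes=100, imbalance_ratio=50)
print(counts[0], counts[-1], sum(counts))
```

The test sets used above stay balanced, so accuracy on the rare classes weighs as much as on the frequent ones.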


# ImageNet-LT
Long-tailed learning resources: https://github.com/Vanint/Awesome-LongTailed-Learning<br>
https://github.com/zzw-zwzhang/Awesome-of-Long-Tailed-Recognition

In [29]:
%cd

/root


In [36]:
# https://github.com/naver-ai/cmo
import gdown
url = 'https://drive.google.com/uc?id=1RIHcrFwzZccqvOs8GgSX5CUFUkXlVvWp'
gdown.download(url,'ckpt.pth.tar',quiet=False) 


Downloading...
From: https://drive.google.com/uc?id=1RIHcrFwzZccqvOs8GgSX5CUFUkXlVvWp
To: /root/ckpt.pth.tar
100%|██████████| 205M/205M [00:01<00:00, 160MB/s]


'ckpt.pth.tar'

In [43]:
import os
import torch
from torch import nn
from torchvision import models
import torchvision.transforms as transforms
import torchvision.datasets as datasets
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def test(model, testloader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()
    return correct / total

valdir = os.path.join('imagenet', 'val')
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])


val_dataset = datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        normalize,
    ]))
val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=512, shuffle=False,
    num_workers=2, pin_memory=True)

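model = models.resnet50()  # random init; the weights come from the checkpoint
model = nn.DataParallel(model)  # wrap so the checkpoint's 'module.'-prefixed keys match
model.load_state_dict(torch.load('ckpt.pth.tar', map_location=device)['state_dict'])
model = model.module  # unwrap back to the plain ResNet-50
model.to(device)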
accuracy = test(model, val_loader)
print('accuracy:',accuracy)


accuracy: 0.49266
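
The cell above wraps the model in `nn.DataParallel` only so that the checkpoint keys, which carry a `module.` prefix, line up with the model's parameter names. An alternative is to rename the keys before loading. This is a minimal sketch; `strip_module_prefix` is a hypothetical helper, and it assumes every affected key starts with exactly `module.`.

```python
def strip_module_prefix(state_dict, prefix='module.'):
    """Return a copy of state_dict with a leading `prefix` removed from every key."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }

# Hypothetical usage with the checkpoint from the cell above:
# state = torch.load('ckpt.pth.tar', map_location=device)['state_dict']
# model = models.resnet50()
# model.load_state_dict(strip_module_prefix(state))
```

Either route ends with the same plain ResNet-50; renaming the keys just avoids the wrap and unwrap step.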


# Tiny ImageNet (Label Smoothing (LS), MbLS, Focal Loss (FL))
src: https://github.com/by-liu/MbLS/blob/main/docs/TEST.md<br>
MbLS paper: https://arxiv.org/pdf/2111.15430.pdf

In [1]:
! wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
! unzip -q tiny-imagenet-200.zip

--2022-10-30 20:48:00--  http://cs231n.stanford.edu/tiny-imagenet-200.zip
Resolving cs231n.stanford.edu (cs231n.stanford.edu)... 171.64.68.10
Connecting to cs231n.stanford.edu (cs231n.stanford.edu)|171.64.68.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 248100043 (237M) [application/zip]
Saving to: ‘tiny-imagenet-200.zip’


2022-10-30 20:48:21 (11.3 MB/s) - ‘tiny-imagenet-200.zip’ saved [248100043/248100043]



Download the trained weights and clone the Git repo

In [None]:
! wget --no-check-certificate https://github.com/by-liu/MbLS/releases/download/v0.2/resnet50_tiny-ls-best.pth
! wget --no-check-certificate https://github.com/by-liu/MbLS/releases/download/v0.2/resnet50_tiny-fl-best.pth
! wget --no-check-certificate https://github.com/by-liu/MbLS/releases/download/v0.2/resnet50_tiny-ce-best.pth
! wget --no-check-certificate https://github.com/by-liu/MbLS/releases/download/v0.2/resnet50_tiny-mbls-best.pth
!git clone https://github.com/by-liu/MbLS.git

In [14]:
! wget --no-check-certificate https://github.com/by-liu/MbLS/releases/download/v0.2/resnet50_tiny-fl-best.pth

--2022-10-30 22:15:31--  https://github.com/by-liu/MbLS/releases/download/v0.2/resnet50_tiny-fl-best.pth
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/431784914/a507f25f-10b6-43b5-8c97-9d55144ea334?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20221030%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221030T221531Z&X-Amz-Expires=300&X-Amz-Signature=b64cdc8c412895065eb90881353a531c46c1f705fb5db41184c0cb5318ec7cf6&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=431784914&response-content-disposition=attachment%3B%20filename%3Dresnet50_tiny-fl-best.pth&response-content-type=application%2Foctet-stream [following]
--2022-10-30 22:15:31--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/431784914/a507f25f-10b6-43b5-8c97-9d55144e

Prepare the dataloader

In [10]:
import argparse
import pandas as pd
import os
import shutil
import time
import numpy as np
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models
device = 'cuda' if torch.cuda.is_available() else 'cpu'
def test(model, testloader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()
    return correct / total

VALID_DIR = 'tiny-imagenet-200/val'
val_data = pd.read_csv(f'{VALID_DIR}/val_annotations.txt', sep='\t', 
                            header=None, names=['File', 'Class', 'X', 'Y', 'H', 'W'])


val_img_dir = os.path.join(VALID_DIR, 'images')
# Map each image file name to its class label
val_img_dict = {}
with open(os.path.join(VALID_DIR, 'val_annotations.txt'), 'r') as fp:
    for line in fp:
        words = line.split('\t')
        val_img_dict[words[0]] = words[1]

# Move images into their corresponding class folders
for img, folder in val_img_dict.items():
    newpath = os.path.join(val_img_dir, folder)
    os.makedirs(newpath, exist_ok=True)
    if os.path.exists(os.path.join(val_img_dir, img)):
        os.rename(os.path.join(val_img_dir, img), os.path.join(newpath, img))

transform_test = transforms.Compose([transforms.Resize((64,64)), transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406,), (0.229, 0.224, 0.225,))]
        )

test_dataset = datasets.ImageFolder(os.path.join(val_img_dir),
    transform=transform_test,)

testloader = torch.utils.data.DataLoader(test_dataset, batch_size=256, shuffle=False, 
                                         num_workers=2, pin_memory=True)
print('sample size-  Validation:%d'%(len(test_dataset)))


sample size-  Validation:10000


In [16]:
import sys
sys.path.append('MbLS')
from MbLS.calibrate.net.resnet_tiny_imagenet import resnet50
def main(method=None):
    model = resnet50()
    # map_location lets a GPU-saved checkpoint load on a CPU-only runtime as well
    model.load_state_dict(torch.load('resnet50_tiny-{}-best.pth'.format(method), map_location=device)['state_dict'])
    model.to(device)
    accuracy = test(model, testloader)
    return accuracy

methods = ['ce', 'ls', 'fl', 'mbls']
for method in methods:
    accuracy = main(method=method)
    print('method:',method, ', accuracy:',accuracy)

method: ce , accuracy: 0.6495
method: ls , accuracy: 0.6565
method: fl , accuracy: 0.6319
method: mbls , accuracy: 0.6478
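
The four checkpoints differ only in the training loss. As a reminder of what two of those losses compute, here is a minimal sketch of label-smoothed cross-entropy and focal loss for a single example, using the standard textbook definitions. The `eps` and `gamma` values are illustrative defaults, not necessarily the settings used to train these checkpoints, and MbLS is not covered.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_smoothing_loss(logits, target, eps=0.1):
    """Cross-entropy against a target that puts eps/K on every class and 1-eps extra on the true one."""
    probs = softmax(logits)
    k = len(logits)
    loss = 0.0
    for i, p in enumerate(probs):
        q = eps / k + (1.0 - eps) * (1.0 if i == target else 0.0)
        loss -= q * math.log(p)
    return loss

def focal_loss(logits, target, gamma=2.0):
    """Cross-entropy scaled by (1 - p_t)^gamma, which down-weights confident predictions."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

example_logits = [2.0, 0.5, -1.0]
print(label_smoothing_loss(example_logits, target=0, eps=0.1))
print(focal_loss(example_logits, target=0, gamma=2.0))
```

Setting `eps=0` or `gamma=0` recovers plain cross-entropy, which corresponds to the `ce` baseline in the table above.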


# ImageNet (Label Smoothing)
code: https://github.com/sutd-visual-computing-group/LS-KD-compatibility<br>
paper: https://arxiv.org/pdf/2206.14532.pdf

In [32]:
# https://drive.google.com/drive/folders/1GwqXRVYBpKGolNh2OLEzWUdOHx2XQ6G2
import gdown
url = 'https://drive.google.com/uc?id=13KVbELef0hWLczrp5suPvzyzH1NtXA2t'
gdown.download(url,'teacher_resnet50_ce_best.pth.tar',quiet=False) 

url = 'https://drive.google.com/uc?id=1MJ-wniJ9dv_-QoSqOHcwfzU-FP3efJp-'
gdown.download(url,'teacher_resnet50_ls_best.pth.tar',quiet=False) 

Downloading...
From: https://drive.google.com/uc?id=13KVbELef0hWLczrp5suPvzyzH1NtXA2t
To: /content/teacher_resnet50_ce_best.pth.tar
100%|██████████| 103M/103M [00:01<00:00, 73.2MB/s] 
Downloading...
From: https://drive.google.com/uc?id=1MJ-wniJ9dv_-QoSqOHcwfzU-FP3efJp-
To: /content/teacher_resnet50_ls_best.pth.tar
100%|██████████| 103M/103M [00:01<00:00, 78.7MB/s] 


'teacher_resnet50_ls_best.pth.tar'

In [33]:
import os
import torch
from torch import nn
from torchvision import models
import torchvision.transforms as transforms
import torchvision.datasets as datasets
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def test(model, testloader):
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()
    return correct / total

valdir = os.path.join('imagenet', 'val')
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])


val_dataset = datasets.ImageFolder(valdir, transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        normalize,
    ]))
val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=512, shuffle=False,
    num_workers=2, pin_memory=True)

def main(method=None):
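    model = models.resnet50()  # random init; the weights come from the checkpoint
    model = nn.DataParallel(model)  # wrap so the checkpoint's 'module.'-prefixed keys match
    model.load_state_dict(torch.load('teacher_resnet50_{}_best.pth.tar'.format(method), map_location=device)['state_dict'])
    model = model.module  # unwrap back to the plain ResNet-50
    model.to(device)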
    accuracy = test(model, val_loader)
    return accuracy

methods = ['ce','ls']
for method in methods:
    accuracy = main(method=method)
    print('method:', method, ', accuracy:',accuracy)


  f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "


accuracy: 0.74548


More models (ResNet-50, ResNet-18) can be found here:
https://drive.google.com/drive/folders/1BwxpQELsS09-C-2EJlaGyjvZPS6iBv84<br>
NMT: https://drive.google.com/drive/folders/1GwqXRVYBpKGolNh2OLEzWUdOHx2XQ6G2



# ImageNet (Focal Loss)
src: https://github.com/richardaecn/class-balanced-loss

In [34]:
# https://github.com/richardaecn/class-balanced-loss
import gdown
url = 'https://drive.google.com/uc?id=1SmLv1-D1143Cma4Y5bDxHUfXjOI_0Yvr'
gdown.download(url,'imagenet.zip',quiet=False) 

Downloading...
From: https://drive.google.com/uc?id=1SmLv1-D1143Cma4Y5bDxHUfXjOI_0Yvr
To: /content/imagenet.zip
100%|██████████| 1.00G/1.00G [00:08<00:00, 115MB/s]


'imagenet.zip'

In [35]:
!unzip -q imagenet.zip

In [38]:
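# NOTE: 'model.ckpt-111339.data-00000-of-00001' is a TensorFlow checkpoint shard, not a
# PyTorch pickle, so torch.load() cannot read it and raises the UnpicklingError shown below.
model = models.resnet50()
model = nn.DataParallel(model)
model.load_state_dict(torch.load('imagenet/model.ckpt-111339.data-00000-of-00001'))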

UnpicklingError: ignored
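
The error above is consistent with the file name: `model.ckpt-111339.data-00000-of-00001` follows the naming pattern of a TensorFlow checkpoint shard, which `torch.load` cannot unpickle. The sketch below is a name-based heuristic for catching this before calling `torch.load`; the function name is hypothetical and it does not inspect the file contents. Actually using these weights in PyTorch would require reading the checkpoint with TensorFlow and mapping variables to the ResNet-50 parameters, which is not shown here.

```python
import os
import re

# TensorFlow checkpoint shards are named <prefix>.data-NNNNN-of-NNNNN
TF_SHARD_PATTERN = re.compile(r'\.data-\d{5}-of-\d{5}$')

def looks_like_tf_checkpoint_shard(path):
    """Heuristic check based on the file name only."""
    return TF_SHARD_PATTERN.search(os.path.basename(path)) is not None

print(looks_like_tf_checkpoint_shard('imagenet/model.ckpt-111339.data-00000-of-00001'))
print(looks_like_tf_checkpoint_shard('ckpt.pth.tar'))
```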