## TIMM Example
Working through the [Towards Data Science article](https://towardsdatascience.com/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055).
PyTorch Image Models (timm) provides a large collection of model implementations that can be downloaded and used off the shelf or trained further. We're going to switch from Torchvision to timm for ease of use. timm is actively maintained and offers a much wider selection of architectures and pretrained weights than Torchvision's model zoo.

In [3]:
import torch
import timm
import os

In [4]:
timm.list_models('vgg*', pretrained=True)

['vgg11',
 'vgg11_bn',
 'vgg13',
 'vgg13_bn',
 'vgg16',
 'vgg16_bn',
 'vgg19',
 'vgg19_bn']

In [87]:
timm.list_models('resnet*', pretrained=True)

['resnet10t',
 'resnet14t',
 'resnet18',
 'resnet18d',
 'resnet26',
 'resnet26d',
 'resnet26t',
 'resnet32ts',
 'resnet33ts',
 'resnet34',
 'resnet34d',
 'resnet50',
 'resnet50_gn',
 'resnet50d',
 'resnet51q',
 'resnet61q',
 'resnet101',
 'resnet101d',
 'resnet152',
 'resnet152d',
 'resnet200d',
 'resnetaa50',
 'resnetblur50',
 'resnetrs50',
 'resnetrs101',
 'resnetrs152',
 'resnetrs200',
 'resnetrs270',
 'resnetrs350',
 'resnetrs420',
 'resnetv2_50',
 'resnetv2_50d_evos',
 'resnetv2_50d_gn',
 'resnetv2_50x1_bit_distilled',
 'resnetv2_50x1_bitm',
 'resnetv2_50x1_bitm_in21k',
 'resnetv2_50x3_bitm',
 'resnetv2_50x3_bitm_in21k',
 'resnetv2_101',
 'resnetv2_101x1_bitm',
 'resnetv2_101x1_bitm_in21k',
 'resnetv2_101x3_bitm',
 'resnetv2_101x3_bitm_in21k',
 'resnetv2_152x2_bit_teacher',
 'resnetv2_152x2_bit_teacher_384',
 'resnetv2_152x2_bitm',
 'resnetv2_152x2_bitm_in21k',
 'resnetv2_152x4_bitm',
 'resnetv2_152x4_bitm_in21k']

In [130]:
timm.list_models('*dark*', pretrained=True)

['cs3darknet_focus_l',
 'cs3darknet_focus_m',
 'cs3darknet_l',
 'cs3darknet_m',
 'cs3darknet_x',
 'cs3sedarknet_l',
 'cs3sedarknet_x',
 'cspdarknet53',
 'darknet53',
 'darknetaa53']
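
The pattern passed to `list_models` is a shell-style wildcard. As a rough illustration of how such a filter behaves (assuming matching similar to Python's `fnmatch`, and using a hand-picked subset of the names printed above rather than timm's real registry; `filter_names` is a hypothetical helper):

```python
from fnmatch import fnmatch

# Hypothetical subset of model names taken from the outputs above
names = ['vgg11', 'vgg16_bn', 'resnet18', 'resnet50d', 'cspdarknet53', 'darknet53']


def filter_names(pattern, candidates):
    """Return the candidates matching a shell-style wildcard pattern."""
    return [n for n in candidates if fnmatch(n, pattern)]


vgg_matches = filter_names('vgg*', names)      # prefix match
dark_matches = filter_names('*dark*', names)   # substring match
print(vgg_matches)
print(dark_matches)
```

A leading and trailing `*` turns the pattern into a substring search, which is why `'*dark*'` also picks up the `csp...` and `cs3...` variants.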

In [23]:
# Set the download location for the models
cache_dir = './data/models/'
os.environ['TORCH_HOME'] = cache_dir

In [113]:
res_model = timm.create_model('resnet50d', pretrained=True)

In [114]:
inc_model = timm.create_model('inception_resnet_v2', pretrained=True)

In [115]:
# Inspect the layers of the resnet model
res_layers = list(res_model.children())
print('Num layers: ', len(res_layers)) # How many layers? 10
print('Final 2 layers: ', res_layers[8:]) # What's in the final 2 layers? The classifier head


Num layers:  10
Final 2 layers:  [SelectAdaptivePool2d (pool_type=avg, flatten=Flatten(start_dim=1, end_dim=-1)), Linear(in_features=2048, out_features=1000, bias=True)]


In [116]:
# Inspect the layers of the inception model
inc_layers = list(inc_model.children())
print('Num layers: ', len(inc_layers)) # How many layers? 17
print('Final 2 layers: ', inc_layers[15:]) # What's in the final 2 layers? The classifier head



Num layers:  17
Final 2 layers:  [SelectAdaptivePool2d (pool_type=avg, flatten=Flatten(start_dim=1, end_dim=-1)), Linear(in_features=1536, out_features=1000, bias=True)]


In [96]:
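# Note: the output below reports 'resnet34', so it appears to come from an earlier run (In [96]) before res_model was recreated as 'resnet50d' in In [113]
res_model.default_cfg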

{'url': 'https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/resnet34-43635321.pth',
 'num_classes': 1000,
 'input_size': (3, 224, 224),
 'pool_size': (7, 7),
 'crop_pct': 0.875,
 'interpolation': 'bilinear',
 'mean': (0.485, 0.456, 0.406),
 'std': (0.229, 0.224, 0.225),
 'first_conv': 'conv1',
 'classifier': 'fc',
 'architecture': 'resnet34'}
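
The `mean`, `std` and `input_size` entries describe the preprocessing the pretrained weights expect. A minimal sketch of applying them by hand, assuming images already scaled to [0, 1] and resized to `input_size` (the `normalize` helper and the random batch are stand-ins, not part of timm):

```python
import torch

# Values copied from the default_cfg output above
cfg = {
    'input_size': (3, 224, 224),
    'mean': (0.485, 0.456, 0.406),
    'std': (0.229, 0.224, 0.225),
}


def normalize(batch, mean, std):
    """Normalize an (N, C, H, W) batch of [0, 1] images per channel."""
    mean_t = torch.tensor(mean).view(1, -1, 1, 1)
    std_t = torch.tensor(std).view(1, -1, 1, 1)
    return (batch - mean_t) / std_t


# Random stand-in for a batch of two images already resized to input_size
images = torch.rand(2, *cfg['input_size'])
normalized = normalize(images, cfg['mean'], cfg['std'])
print(normalized.shape)
```

Note that the Inception model further down uses a different mean, std and input size, so the values should be read from each model's own config rather than hard-coded.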

In [117]:
# The config above shows the classifier is named "fc", so get_classifier() and .fc return the same module
print(res_model.get_classifier())
print(res_model.fc)
print('Input features: ', res_model.get_classifier().in_features)

Linear(in_features=2048, out_features=1000, bias=True)
Linear(in_features=2048, out_features=1000, bias=True)
Input features:  2048


In [118]:
print('Named classifier head: ',inc_model.default_cfg['classifier'])  # To get the name of the classifier head
print(inc_model.default_cfg)

Named classifier head:  classif
{'url': 'https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/inception_resnet_v2-940b1cd6.pth', 'num_classes': 1000, 'input_size': (3, 299, 299), 'pool_size': (8, 8), 'crop_pct': 0.8975, 'interpolation': 'bicubic', 'mean': (0.5, 0.5, 0.5), 'std': (0.5, 0.5, 0.5), 'first_conv': 'conv2d_1a.conv', 'classifier': 'classif', 'label_offset': 1, 'architecture': 'inception_resnet_v2'}


In [119]:
# The config above shows the classifier is named "classif", so get_classifier() and .classif refer to the same module
print('Input features: ', inc_model.get_classifier().in_features)
print(inc_model.classif)

Input features:  1536
Linear(in_features=1536, out_features=1000, bias=True)
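
Because the head's attribute name differs between architectures (`fc` vs `classif`), it can be looked up by name instead of being hard-coded. A minimal sketch using a hypothetical stand-in module (`TinyNet` is not a timm model; it only mimics the attribute layout seen above):

```python
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in that mimics the attribute layout seen above."""

    default_cfg = {'classifier': 'classif'}

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.global_pool = nn.AdaptiveAvgPool2d(1)
        self.classif = nn.Linear(8, 1000)

    def get_classifier(self):
        return self.classif


net = TinyNet()
# named_children() pairs each top-level layer with its attribute name
child_names = [name for name, _ in net.named_children()]
# Look the head up via the name stored in the config
head = getattr(net, net.default_cfg['classifier'])
print(child_names)
print(head is net.get_classifier())
```

Using `named_children()` instead of `children()` also makes it easier to see which index in the layer list corresponds to which part of the network.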


### Inspecting the models in a similar fashion to the article here:
[Notebook](https://jovian.ml/aakanksha-ns/road-signs-bounding-box-prediction)

In [124]:
# res_layers[:8]  # Everything except the pooling layer and the classifier head
# What is in the final two layers before the classifier head? And why does the article split them off?

# res_layers[6:8] are the last two residual stages (layer3 and layer4). They downsample the
# feature maps while widening the channels, so they are the final feature extraction layers.
# res_layers[6:8]
# res_layers[:6] holds the stem (conv, batch norm, ReLU, max pool) and the first two residual stages
res_layers[:6]
# Still not sure why these are split up in the above article

[Sequential(
   (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
   (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (2): ReLU(inplace=True)
   (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
   (4): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (5): ReLU(inplace=True)
   (6): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
 ),
 BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
 ReLU(inplace=True),
 MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False),
 Sequential(
   (0): Bottleneck(
     (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
     (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
     (act1): ReLU(inplace=True)
     (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bia

In [128]:
# inc_layers[:15]  # Everything except the pooling layer and the classifier head
# What is in the final 2 layers before the classifier head? Why are they split up in the article?

# Similar to the above: the last Inception-ResNet block (Block8) followed by the final conv layer.
# These refine the features rather than upsample them.
inc_layers[13:15]

# Similarly, all previous conv and Inception blocks. Still no idea why the split though.
# inc_layers[:13]

[Block8(
   (branch0): BasicConv2d(
     (conv): Conv2d(2080, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
     (bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
     (relu): ReLU()
   )
   (branch1): Sequential(
     (0): BasicConv2d(
       (conv): Conv2d(2080, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
       (bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
       (relu): ReLU()
     )
     (1): BasicConv2d(
       (conv): Conv2d(192, 224, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
       (bn): BatchNorm2d(224, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
       (relu): ReLU()
     )
     (2): BasicConv2d(
       (conv): Conv2d(224, 256, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
       (bn): BatchNorm2d(256, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
       (relu): ReLU()
     )
   )
   (conv2d): Conv2d(448, 2080,
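
One possible reason for splitting a backbone into groups (an assumption, not something confirmed by the linked notebook) is that each group can then be frozen or given its own learning rate. A minimal sketch with a small stand-in network; with a real model the list would come from `list(model.children())`:

```python
import torch
from torch import nn

# Hypothetical stand-in backbone, not a timm model
layers = [
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
]

early = nn.Sequential(*layers[:2])   # early feature layers
late = nn.Sequential(*layers[2:4])   # later feature layers
head = nn.Sequential(*layers[4:])    # pooling and classifier

# Freeze the early group entirely
for p in early.parameters():
    p.requires_grad = False

# Give the remaining groups different learning rates
optimizer = torch.optim.SGD(
    [
        {'params': late.parameters(), 'lr': 1e-4},
        {'params': head.parameters(), 'lr': 1e-2},
    ],
    lr=1e-3,
    momentum=0.9,
)

out = head(late(early(torch.randn(2, 3, 32, 32))))
print(out.shape)
```

Generic early features are often left frozen or trained slowly while later layers and the new head are updated more aggressively, which would be consistent with a split at index 6.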

In [16]:
timm.create_model('resnet50d', pretrained=True, num_classes=10).get_classifier()

Linear(in_features=2048, out_features=10, bias=True)

In [17]:
model = timm.create_model('resnet50d', pretrained=True, num_classes=10, global_pool='catavgmax')

In [18]:
in_features = model.get_classifier().in_features
in_features

4096
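
The head's input size doubles from 2048 to 4096 because `catavgmax` concatenates an average-pooled and a max-pooled copy of the feature map. A sketch of the idea in plain PyTorch (an illustration only, not timm's implementation; `cat_avg_max_pool` is a hypothetical helper and the 7x7 feature map is a stand-in):

```python
import torch
import torch.nn.functional as F


def cat_avg_max_pool(x):
    """Concatenate global average and global max pooling along the channel dim."""
    avg = F.adaptive_avg_pool2d(x, 1)
    mx = F.adaptive_max_pool2d(x, 1)
    return torch.cat([avg, mx], dim=1).flatten(1)


# Stand-in for the backbone's final feature map: 2048 channels on a 7x7 grid
features = torch.randn(1, 2048, 7, 7)
pooled = cat_avg_max_pool(features)
print(pooled.shape)
```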

In [19]:
# Replace the final layer with a modified classification head
from torch import nn
model.fc = nn.Sequential(
    nn.BatchNorm1d(in_features),
    nn.Linear(in_features=in_features, out_features=512, bias=False),
    nn.ReLU(),
    nn.BatchNorm1d(512),
    nn.Dropout(0.4),
    nn.Linear(in_features=512, out_features=10, bias=False)
)

In [21]:
model.eval()
model(torch.randn(1, 3, 224, 224)).shape

torch.Size([1, 10])
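
The `model.eval()` call matters here: the new head contains `BatchNorm1d` layers, which cannot compute batch statistics from a single sample in training mode. A minimal sketch with a smaller stand-in head (same layer types, hypothetical sizes):

```python
import torch
from torch import nn

# Small stand-in head with the same layer types as the one above
head = nn.Sequential(
    nn.BatchNorm1d(16),
    nn.Linear(16, 8, bias=False),
    nn.ReLU(),
    nn.BatchNorm1d(8),
    nn.Dropout(0.4),
    nn.Linear(8, 10, bias=False),
)
x = torch.randn(1, 16)  # a batch containing a single sample

head.train()
try:
    head(x)
    train_error = None
except ValueError as err:
    train_error = err  # batch statistics need more than one value per channel

head.eval()  # BatchNorm switches to running statistics, Dropout is disabled
with torch.no_grad():
    out = head(x)
print(type(train_error).__name__, out.shape)
```

During training this is normally not an issue as long as every batch holds more than one sample.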

There's a lot more information in the article, mostly covering timm's built-in dataset and data loading utilities. I'm not writing all of that out here because we'll likely need something closer to the dataset class we've already created.