## Question 3. 

Pick pick 10 images of animals (such as dogs, cats, birds, farm animals, etc.). If the subject does not occupy a reasonable part of the image, then crop the image. 

Now use a pretrained image classification CNN as in Lab 10.9.4 to predict the class of each of your images, and report the probabilities for the top five predicted classes for each image.


We now read in the images and preprocess them.

In [136]:
import torch
from torchvision.io import read_image
from torchvision.transforms import Resize, CenterCrop, Normalize, Compose
from glob import glob

resize = Resize((232,232))
crop = CenterCrop (224)
normalize = Normalize([0.485,0.456,0.406],
                      [0.229 ,0.224 ,0.225])
imgfiles = sorted([f for f in glob('images/*')])
imgs = torch.stack([torch.div(crop(resize(read_image(f))), 255) 
                    for f in imgfiles])
imgs = normalize(imgs) 
imgs.size()

torch.Size([10, 3, 224, 224])

We now set up the trained network with the weights

In [139]:
from torchvision.models import resnet50, ResNet50_Weights
from torchsummary import summary

resnet_model = resnet50(weights=ResNet50_Weights.DEFAULT) 
# summary(resnet_model,input_data=imgs, col_names=['input_size', 'output_size', 'num_params'])

We set the mode to eval() to ensure that the model is ready to predict on new data.

In [142]:
resnet_model.eval()

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

We now feed our 10 images through the fitted network.

In [145]:
img_preds = resnet_model(imgs)

Let’s look at the predicted probabilities for each of the top 5 choices.

In [148]:
img_probs = np.exp(np.asarray(img_preds.detach())) 
img_probs /= img_probs.sum(1)[:,None]

In order to see the class labels, we must download the index file associated with imagenet.

In [151]:
import json
import numpy as np
import pandas as pd

labs = json.load(open('imagenet_class_index.json')) 
class_labels = pd.DataFrame([(int(k), v[1]) for k, v in
labs.items()],
columns=['idx', 'label']) 
class_labels = class_labels.set_index('idx')
class_labels = class_labels.sort_index()

We’ll now construct a data frame for each image file with the labels with the 5 highest probabilities as estimated by the model above.

In [154]:
for i, imgfile in enumerate(imgfiles):
    img_df = class_labels.copy()
    img_df['prob'] = img_probs[i]
    img_df = img_df.sort_values(by='prob', ascending=False)[:5] 
    print(f'Image: {imgfile}') 
    print(img_df.reset_index().drop(columns=['idx']))

Image: images/ squirrel.jpg
                    label      prob
0            fox_squirrel  0.497437
1  red-breasted_merganser  0.005409
2                  marmot  0.004642
3                    corn  0.002736
4                     ear  0.002391
Image: images/bluebird.jpg
            label      prob
0            kite  0.453833
1  great_grey_owl  0.015914
2             jay  0.012210
3           quail  0.008303
4           macaw  0.005181
Image: images/cat.jpg
         label      prob
0  Persian_cat  0.163070
1        tabby  0.074143
2    tiger_cat  0.042578
3      doormat  0.034508
4  paper_towel  0.015525
Image: images/dog.jpg
             label      prob
0            Lhasa  0.260317
1         Shih-Tzu  0.097196
2  Tibetan_terrier  0.032820
3   cocker_spaniel  0.005889
4         Pekinese  0.005229
Image: images/flamingo.jpg
            label      prob
0        flamingo  0.609515
1       spoonbill  0.013586
2  American_egret  0.002132
3         pelican  0.001365
4           crane  0.00126

## Question 4.

Repeat the analysis of Lab 10.9.5 on the IMDb data using a similarlystructured neural network. We used 16 hidden units at each of twohidden layers. Explore the effect of increasing this to 32 and 64 unitsper layer, with and without 30% dropout regularization.

In [267]:
from tensorflow.keras.datasets import imdb
from torch import nn
from torch.utils.data import TensorDataset, DataLoader
from ISLP.torch import (SimpleDataModule,
                        SimpleModule,
                        ErrorTracker,
                        rec_num_workers)
from ISLP.torch.imdb import (load_lookup,
                             load_tensor,
                             load_sparse,
                             load_sequential)
from pytorch_lightning.loggers import CSVLogger
from pytorch_lightning import Trainer
from ISLP.torch.imdb import _get_imdb
from torchinfo import summary
from torch.optim import RMSprop

In [283]:
(imdb_seq_train, 
 imdb_seq_test) = load_sequential(root='data/IMDB')
padded_sample = np.asarray(imdb_seq_train.tensors[0][0])
sample_review = padded_sample[padded_sample > 0][:12]
sample_review[:12]

  S_test) = [torch.load(_get_imdb(f'IMDB_{r}', root))


array([   1,   14,   22,   16,   43,  530,  973, 1622, 1385,   65,  458,
       4468], dtype=int32)

In [285]:
lookup = load_lookup(root='data/IMDB')
' '.join(lookup[i] for i in sample_review)

"<START> this film was just brilliant casting location scenery story direction everyone's"

For our first model, we have created a binary feature for each of the 10,000 possible words in the dataset, with an entry of one in the i, j entry if word j appears in review i. 

In [288]:
((X_train, Y_train),(X_valid, Y_valid),(X_test, Y_test)) = load_sparse(
     validation=2000,
    random_state=0,
    root='data/IMDB')

We’ll use a two-layer model for our first model.

### 32 units per layer

In [291]:
class IMDBModel(nn.Module):
    def __init__(self, input_size): 
        super(IMDBModel, self).__init__() 
        self.dense1 = nn.Linear(input_size, 32) 
        self.activation = nn.ReLU()
        self.dense2 = nn.Linear(32, 32) 
        self.output = nn.Linear(32, 1)
    def forward(self, x):
        val = x
        for _map in [self.dense1,
                     self.activation , 
                     self.dense2, 
                     self.activation , 
                     self.output]:
            val = _map(val) 
            return torch.flatten(val)

We now instantiate our model and look at a summary (not shown).

In [295]:
imdb_model = IMDBModel(imdb_test.tensors[0].size()[1]) 
summary(imdb_model,
        input_size=imdb_test.tensors[0].size(), 
        col_names=['input_size','output_size', 'num_params'])

Layer (type:depth-idx)                   Input Shape               Output Shape              Param #
IMDBModel                                [25000, 10003]            [800000]                  1,089
├─Linear: 1-1                            [25000, 10003]            [25000, 32]               320,128
Total params: 321,217
Trainable params: 321,217
Non-trainable params: 0
Total mult-adds (Units.GIGABYTES): 8.00
Input size (MB): 1000.30
Forward/backward pass size (MB): 6.40
Params size (MB): 1.28
Estimated Total Size (MB): 1007.98

### 32 units per layer with 0.3 dropout

In [306]:
class IMDBModel(nn.Module):
    def __init__(self, input_size): 
        super(IMDBModel, self).__init__() 
        self.dense1 = nn.Linear(input_size, 32) 
        self.activation = nn.ReLU()
        self.dense2 = nn.Linear(32, 32) 
        self.output = nn.Linear(32, 1)
        self.dropout = nn.Dropout(0.3)
    def forward(self, x):
        val = x
        for _map in [self.dense1,
                     self.activation , 
                     self.dense2, 
                     self.activation , 
                     self.output,
                     self.dropout]:
            val = _map(val) 
            return torch.flatten(val)

In [308]:
imdb_model = IMDBModel(imdb_test.tensors[0].size()[1]) 
summary(imdb_model,
        input_size=imdb_test.tensors[0].size(), 
        col_names=['input_size','output_size', 'num_params'])

Layer (type:depth-idx)                   Input Shape               Output Shape              Param #
IMDBModel                                [25000, 10003]            [800000]                  1,089
├─Linear: 1-1                            [25000, 10003]            [25000, 32]               320,128
Total params: 321,217
Trainable params: 321,217
Non-trainable params: 0
Total mult-adds (Units.GIGABYTES): 8.00
Input size (MB): 1000.30
Forward/backward pass size (MB): 6.40
Params size (MB): 1.28
Estimated Total Size (MB): 1007.98

### 64 units per layer

In [299]:
class IMDBModel(nn.Module):
    def __init__(self, input_size): 
        super(IMDBModel, self).__init__() 
        self.dense1 = nn.Linear(input_size, 64) 
        self.activation = nn.ReLU()
        self.dense2 = nn.Linear(64, 64) 
        self.output = nn.Linear(64, 1)
    def forward(self, x):
        val = x
        for _map in [self.dense1,
                     self.activation , 
                     self.dense2, 
                     self.activation , 
                     self.output]:
            val = _map(val) 
            return torch.flatten(val)

In [301]:
imdb_model = IMDBModel(imdb_test.tensors[0].size()[1]) 
summary(imdb_model,
        input_size=imdb_test.tensors[0].size(), 
        col_names=['input_size','output_size', 'num_params'])

Layer (type:depth-idx)                   Input Shape               Output Shape              Param #
IMDBModel                                [25000, 10003]            [1600000]                 4,225
├─Linear: 1-1                            [25000, 10003]            [25000, 64]               640,256
Total params: 644,481
Trainable params: 644,481
Non-trainable params: 0
Total mult-adds (Units.GIGABYTES): 16.01
Input size (MB): 1000.30
Forward/backward pass size (MB): 12.80
Params size (MB): 2.56
Estimated Total Size (MB): 1015.66

### 64 units per layer with 0.3 dropout

In [318]:
class IMDBModel(nn.Module):
    def __init__(self, input_size): 
        super(IMDBModel, self).__init__() 
        self.dense1 = nn.Linear(input_size, 64) 
        self.activation = nn.ReLU()
        self.dense2 = nn.Linear(64, 64) 
        self.output = nn.Linear(64, 1)
        self.dropout = nn.Dropout(0.3)
    def forward(self, x):
        val = x
        for _map in [self.dense1,
                     self.activation , 
                     self.dense2, 
                     self.activation , 
                     self.output,
                     self.dropout]:
            val = _map(val) 
            return torch.flatten(val)

In [320]:
imdb_model = IMDBModel(imdb_test.tensors[0].size()[1]) 
summary(imdb_model,
        input_size=imdb_test.tensors[0].size(), 
        col_names=['input_size','output_size', 'num_params'])

Layer (type:depth-idx)                   Input Shape               Output Shape              Param #
IMDBModel                                [25000, 10003]            [1600000]                 4,225
├─Linear: 1-1                            [25000, 10003]            [25000, 64]               640,256
Total params: 644,481
Trainable params: 644,481
Non-trainable params: 0
Total mult-adds (Units.GIGABYTES): 16.01
Input size (MB): 1000.30
Forward/backward pass size (MB): 12.80
Params size (MB): 2.56
Estimated Total Size (MB): 1015.66

Having loaded the datasets into a data module and created a SimpleModule, the remaining steps are familiar.