# Some PyTorch Datasets and Models #

The purpose of this assignment is to develop image models using the PyTorch framework, learn how to use existing image models from the PyTorch hub, and see some limitations of these models.

# Problem 0 #

AlexNet is one example of a pre-trained image model, but there are many others.  In this exercise you will load the ResNet50 model and use it to try and classify an image that you have downloaded. 

In [2]:
import torch

In [1]:
import torchvision

In [None]:
torch.hub.list('pytorch/vision')  # This lists all the models available on the pytorch/vision hub.

**Task:** There are 1000 categories in ImageNet. (See below.)  Pick one, find an image representing it on the internet, and see if ResNet50 can classify it.

**Warning:** Be sure to research the arguments for loading ResNet50 into PyTorch. Make sure you have resized the image appropriately.  If you can't find any images on the internet, my wife is a connoisseur of cat pictures and I have placed pictures of our two cats in the Assns folder on the GitHub.

In [None]:
import requests

x = requests.get('https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt')
print(x.text)  # This shows all the categories in imagenet.

# Problem 1 #

The goal in this problem is to see how well convolutional models perform when the data is transformed in obvious ways. We shall be using the MNIST data set, which you should import from torchvision.datasets.

1. Create and train a convolutional neural network to classify the MNIST data.  Track your validation and training accuracy throughout your training, and find your accuracy on the MNIST test dataset.  Your network should consist of the following sequence of layers, which slightly modify the network on pages 473-482 of the textbook--or see [the author's github](https://github.com/rasbt/machine-learning-book/tree/main/ch14):

+ A 2-D convolution with 16 feature maps, kernel size of 5 with a stride of **three** (how much padding?)
+ A ReLU activation function
+ A **mean** pool layer with kernel size 2 and stride **2**
+ A 2-D convolution with 32 feature maps, kernel size of 3 with a stride of 1 (how much padding?)
+ A ReLU activation function
+ A **mean** pool layer with kernel size 2 and stride **2**
+ A linear fully-connected layer with 512 nodes
+ Include a 50% dropout factor as well
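
When thinking about the "how much padding?" questions, it can help to compute the spatial size after each layer. The sketch below uses the standard formula for a convolution or pooling layer without dilation, floor((n + 2p - k) / s) + 1. The helper name `conv_output_size` is made up for this illustration, and the paddings tried are just candidates, not a recommended answer.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


# MNIST images are 28x28. Try a few candidate paddings for the first
# convolution (kernel 5, stride 3) followed by the 2x2 pool with stride 2.
for padding in (0, 1, 2):
    after_conv = conv_output_size(28, kernel=5, stride=3, padding=padding)
    after_pool = conv_output_size(after_conv, kernel=2, stride=2)
    print(f'padding={padding}: conv -> {after_conv}, pool -> {after_pool}')
```

The same helper can be chained through the second convolution and pool to find the number of inputs the fully-connected layer needs.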

2. Apply a transformation (from [torchvision.transforms](https://pytorch.org/vision/0.9/transforms.html)) which rotates each image counterclockwise by 60 degrees.
+ Produce a plot of a random sample of 10 elements of the newly transformed dataset
+ Find the accuracy score of your model on this new dataset.  Does rotating the data seem to have an impact on your accuracy?

3. Try the same thing with an [affine transformation](https://www.mathworks.com/discovery/affine-transformation.html), with any random rotation, a translation of at most 3, and a shear of at most 30 degrees, but no scaling.

# Problem 2 #

In class on Wednesday we went over the "Smile Classification" problem of using a CNN to determine if a celebrity image represented a smiling face.  This is covered on pages 482-497 of Chapter 14 of the textbook, and in [part 2](https://github.com/rasbt/machine-learning-book/tree/main/ch14) of the accompanying Jupyter notebook.  In all of the times that I've trained this model, validation accuracy tops out at about 88%.  On page 497, the author issues a challenge: Try and get validation accuracy above 90%.  He suggests several approaches that you might try:

+ Change the dropout probabilities and the number of filters in the different convolutional layers
+ Replace the global average-pooling (see page 492) with a fully connected layer
+ Overall, just see what happens if you change or modify the CNN architecture

And if that doesn't work, he's fairly confident that "if you are using the entire training dataset with the CNN architecture we trained in this chapter, you should be able to achieve about 90% accuracy".

**Task:** Take him up on the challenge and see if you can obtain more than 90% accuracy on your last validation rounds.
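
To make the second suggestion above concrete, here is a hedged sketch comparing a global average-pooling head with a fully connected head. The feature-map shape (256 channels at 8x8) is an assumption about what the last convolutional block produces for 64x64 inputs; check it against your own model before reusing any of these numbers. The names `gap_head` and `fc_head` are made up for this illustration.

```python
import torch
import torch.nn as nn

# ASSUMPTION: the convolutional layers output 256 feature maps of size 8x8.
features = torch.randn(4, 256, 8, 8)  # a fake batch of 4 feature maps

# Head using global average pooling: each 8x8 map is averaged to one number.
gap_head = nn.Sequential(
    nn.AvgPool2d(kernel_size=8),
    nn.Flatten(),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)

# Alternative head: flatten everything and use a fully connected layer.
fc_head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(256 * 8 * 8, 1),
    nn.Sigmoid(),
)

gap_out = gap_head(features)
fc_out = fc_head(features)

gap_params = sum(p.numel() for p in gap_head.parameters())
fc_params = sum(p.numel() for p in fc_head.parameters())
print(gap_out.shape, fc_out.shape, gap_params, fc_params)
```

The fully connected head has many more parameters, so it may need stronger regularization (for example, dropout before the linear layer) to avoid overfitting.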

**Note:**  To make the data easier to access, I have made a copy of the **celeba** dataset available in each of your math.knox.edu 'cloud' Jupyter notebook servers.  It is available in the folder '/data'.  You don't have to know the details, but you should know that the following code should load the **celeba** dataset for you:

In [2]:
import torchvision 

image_path = '/data'
celeba_train_dataset = torchvision.datasets.CelebA(image_path, split='train', target_type='attr', download=False)
celeba_valid_dataset = torchvision.datasets.CelebA(image_path, split='valid', target_type='attr', download=False)
celeba_test_dataset = torchvision.datasets.CelebA(image_path, split='test', target_type='attr', download=False)

print('Train set:', len(celeba_train_dataset))
print('Validation set:', len(celeba_valid_dataset))
print('Test set:', len(celeba_test_dataset))

Train set: 162770
Validation set: 19867
Test set: 19962


**Warning:** The '/data' folder is **NOT** on the same computer system as the GPU processors, though the server *is* on the same network.  When files are shared over a network using this approach it takes additional time to load the data, so it *will* cause the model to train somewhat more slowly. 

# Problem 3 #

The following code reads a file which contains the list of 40 attributes that are associated with each image in the **celeba** dataset.  Note that entry '31' is the 'Smiling' attribute we used in class.

**Task:** Pick another attribute and use a neural network (perhaps the one from the text or the one you made in Problem 2) to make predictions for one of the other attributes in the list.  No need to produce a test dataset, but show your validation accuracy in the output of your training loop **and** produce a sample list of 10 images and probabilities as was done in the book.

**Warning:** I don't know how good any of these variables are.  It seems to me that "Big_Nose", for instance, is a rather subjective concept. Choose wisely!

In [5]:
with open('/data/celeba/list_attr_celeba.txt') as f:
    first_line = f.readline()
    second_line = f.readline()

myattributes = second_line.split()

In [6]:
myattributes

['5_o_Clock_Shadow',
 'Arched_Eyebrows',
 'Attractive',
 'Bags_Under_Eyes',
 'Bald',
 'Bangs',
 'Big_Lips',
 'Big_Nose',
 'Black_Hair',
 'Blond_Hair',
 'Blurry',
 'Brown_Hair',
 'Bushy_Eyebrows',
 'Chubby',
 'Double_Chin',
 'Eyeglasses',
 'Goatee',
 'Gray_Hair',
 'Heavy_Makeup',
 'High_Cheekbones',
 'Male',
 'Mouth_Slightly_Open',
 'Mustache',
 'Narrow_Eyes',
 'No_Beard',
 'Oval_Face',
 'Pale_Skin',
 'Pointy_Nose',
 'Receding_Hairline',
 'Rosy_Cheeks',
 'Sideburns',
 'Smiling',
 'Straight_Hair',
 'Wavy_Hair',
 'Wearing_Earrings',
 'Wearing_Hat',
 'Wearing_Lipstick',
 'Wearing_Necklace',
 'Wearing_Necktie',
 'Young']

In [7]:
myattributes[31]

'Smiling'
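
Looking up an attribute by name rather than by a hard-coded index can reduce off-by-one mistakes when you switch to a different attribute. The sketch below is one way to do that; the helper `make_attribute_selector` is hypothetical, and the short name list and label tensor are stand-ins for the full 40-entry `myattributes` list and a real CelebA attribute vector.

```python
import torch


def make_attribute_selector(attribute_names, name):
    """Build a function that picks one attribute out of a label vector."""
    index = attribute_names.index(name)  # raises ValueError if name is absent

    def select(attr):
        return attr[index]

    return select


# ASSUMPTION: a shortened stand-in for the full attribute list shown above.
example_names = ['5_o_Clock_Shadow', 'Arched_Eyebrows', 'Attractive',
                 'Bags_Under_Eyes', 'Bald', 'Bangs']
get_bangs = make_attribute_selector(example_names, 'Bangs')

# ASSUMPTION: a made-up label vector with one 0/1 entry per attribute.
example_attr = torch.tensor([0, 1, 1, 0, 0, 1])
label = get_bangs(example_attr)
print(int(label))
```

A selector built from `myattributes` could be passed as the `target_transform=` argument of `torchvision.datasets.CelebA`, so that each sample's target is the single attribute you chose.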