# Convolutions by examples

### Acknowledgement

The code in Sections 1 to 3 is taken from [fastai: lesson 0](http://course.fast.ai/lessons/lesson0.html).

## 1. Preparations

In [3]:
%matplotlib inline
import math,sys,os,numpy as np
from numpy.linalg import norm
from matplotlib import pyplot as plt

In [4]:
import torch
import torchvision
from torchvision import models,transforms,datasets

Download the MNIST data to disk and convert it to a PyTorch-compatible format.

```torchvision.datasets``` features support (download, formatting) for a collection of popular datasets. The list of available datasets in ```torchvision``` can be found [here](http://pytorch.org/docs/master/torchvision/datasets.html).

Note that the download is performed only once. The function will always check first if the data is already on disk.


In [5]:
#to be modified!
root_dir = '/Users/sachaizadi/Documents/Data_science_for_Business/Year2/3_DeepLearning/data'
torchvision.datasets.MNIST(root=root_dir,download=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Processing...
Done!


Dataset MNIST
    Number of datapoints: 60000
    Split: train
    Root Location: /Users/sachaizadi/Documents/Data_science_for_Business/Year2/3_DeepLearning/data
    Transforms (if any): None
    Target Transforms (if any): None

The MNIST dataset consists of small images of handwritten digits. The images are grayscale and have size 28 x 28. There are 60,000 training images and 10,000 test images.

In [6]:
train_set = torchvision.datasets.MNIST(root=root_dir, train=True, download=True)

Define and initialize a data loader for the MNIST data already downloaded on disk.

In [7]:
MNIST_dataset = torch.utils.data.DataLoader(train_set, batch_size=1, shuffle=True, num_workers=1)

For the current notebook, since we are doing no training, we can format data as _numpy ndarrays_ which are easier to plot in matplotlib. The same operations can be easily performed on _pytorch Tensors_.

In [8]:
a = train_set.train_data.numpy().astype(np.float32)/255
b = train_set.train_labels.numpy()

In [9]:
print(a.shape,b.shape)

(60000, 28, 28) (60000,)
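
The same scaling can be done directly on tensors. A minimal sketch, using a small random `uint8` tensor as a stand-in for `train_set.train_data` (the tensor `raw` and its values are assumptions for illustration):

```python
import torch

# Stand-in for train_set.train_data: 4 images with uint8 pixels in [0, 255].
raw = torch.randint(0, 256, (4, 28, 28), dtype=torch.uint8)

# Convert to float32 first, then rescale to [0, 1], as done with numpy above.
scaled = raw.to(torch.float32) / 255

print(scaled.dtype, scaled.shape)
```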


Save data as compressed ndarrays

In [10]:
np.savez_compressed(os.path.join(root_dir, 'train'), images=a, labels=b)
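
`np.savez_compressed` appends the `.npz` extension when the file name does not already have it, which is why the next section loads `train.npz`. A minimal round-trip sketch, using a temporary directory and small random arrays as stand-ins for the MNIST data (both are assumptions for illustration):

```python
import os
import tempfile

import numpy as np

# Small stand-in arrays (hypothetical data, not the real MNIST arrays).
demo_images = np.random.rand(5, 28, 28).astype(np.float32)
demo_labels = np.arange(5)

with tempfile.TemporaryDirectory() as tmp_dir:
    # The '.npz' suffix is added automatically because 'train' has no extension.
    np.savez_compressed(os.path.join(tmp_dir, 'train'),
                        images=demo_images, labels=demo_labels)
    saved_files = os.listdir(tmp_dir)

    # Index the loaded archive with the keyword names used when saving.
    with np.load(os.path.join(tmp_dir, 'train.npz')) as archive:
        loaded_images = archive['images']
        loaded_labels = archive['labels']

print(saved_files)
print(loaded_images.shape, loaded_labels.shape)
```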

## 2. Data visualization

For convenience we define a few functions for formatting and plotting our image data

In [11]:
data = np.load(os.path.join(root_dir, 'train.npz'))
images=data['images']
labels=data['labels']
n=len(images)
images.shape

(60000, 28, 28)

We fetch all images from the _eight_ class and from the _one_ class.

In [12]:
eights=[images[i] for i in range(n) if labels[i]==8]
ones=[images[i] for i in range(n) if labels[i]==1]

In [13]:
len(eights), len(ones)

(5851, 6742)
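
The list comprehensions above can also be written with a boolean mask, which avoids the Python-level loop. A small sketch on made-up data (`demo_images` and `demo_labels` are illustrative stand-ins, not the MNIST arrays):

```python
import numpy as np

# Stand-in data: 6 tiny "images" whose pixel values equal their index.
demo_images = np.stack([np.full((2, 2), i, dtype=np.float32) for i in range(6)])
demo_labels = np.array([8, 1, 3, 8, 1, 1])

# A boolean mask selects all images of one class at once.
demo_eights = demo_images[demo_labels == 8]
demo_ones = demo_images[demo_labels == 1]

print(demo_eights.shape, demo_ones.shape)  # (2, 2, 2) (3, 2, 2)
```

The result is a single array rather than a list of arrays; indexing such as `demo_eights[0]` works the same way.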

## 3. Save images

In [16]:
import scipy.misc
scipy.misc.imsave('./eight.jpg', eights[0])

`imsave` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imwrite`` instead.
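
Following the deprecation notice, the image can be written with `imageio.imwrite` instead. A hedged sketch, assuming the `imageio` package is installed; the helper `to_uint8`, the stand-in array `demo_image` and the output file name are illustrative, and the explicit conversion to 8-bit is there because the images in this notebook are floats in $[0, 1]$:

```python
import os
import tempfile

import numpy as np

def to_uint8(image):
    """Map a float image with values in [0, 1] to 8-bit grayscale."""
    return (np.clip(image, 0.0, 1.0) * 255).round().astype(np.uint8)

# Stand-in for eights[0]: a 28 x 28 horizontal gradient (hypothetical data).
demo_image = np.tile(np.linspace(0.0, 1.0, 28, dtype=np.float32), (28, 1))
demo_uint8 = to_uint8(demo_image)

try:
    import imageio
    # Written to the system temp directory to keep the sketch side-effect free.
    imageio.imwrite(os.path.join(tempfile.gettempdir(), 'eight_demo.png'), demo_uint8)
except ImportError:
    # imageio is an optional dependency for this sketch.
    print('imageio is not installed; skipping the write')
```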
  


We improved the accuracy while reducing the embedding size from a $28\times 28 = 784$ vector to a $4\times 4\times 8 = 128$ vector.

## 5. Practicals: improving classification with Convolutional Neural Net

You will now build a neural net that will learn the weights of the filters.

The first layer of your network will be a convolutional layer with $8$ filters of size $3\times 3$, as we did above. Once flattened, its output will be a vector of size $128 = 4\times 4\times 8$. From this vector, you need to predict whether the corresponding input is a $1$ or an $8$. So you are back to a classification problem, i.e. logistic regression, and from there you can apply the solution of the previous homework!
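
The size 128 can be checked with the usual output-size formula for convolution and pooling layers, $\lfloor (n + 2p - k)/s \rfloor + 1$. The sketch below assumes a $3\times 3$ convolution with padding 1 (which preserves the $28\times 28$ resolution) followed by a $7\times 7$ max-pooling with stride 7. The pooling parameters are an assumption chosen to reach $4\times 4$, since the pooling step is not shown in this section.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1

n_filters = 8
after_conv = output_size(28, kernel=3, stride=1, padding=1)  # 28
after_pool = output_size(after_conv, kernel=7, stride=7)     # 4 (assumed pooling)
flat_size = after_pool * after_pool * n_filters

print(after_conv, after_pool, flat_size)  # 28 4 128
```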

You need to fill in the code below to construct your CNN. You will need to consult the [torch.nn](https://pytorch.org/docs/stable/nn.html) documentation in the PyTorch docs.

In [17]:
import torch.nn as nn
import torch.nn.functional as F

from models import classifier
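
One possible shape for `classifier`, given as a sketch rather than as the actual contents of `models.py`: the convolution and linear layers match the module summary printed when the saved model is reloaded at the end of this notebook, and the log-softmax output matches the `NLLLoss` used for training. The ReLU and the $7\times 7$ max-pooling are assumptions, chosen so that the flattened size is $4\times 4\times 8 = 128$; the class name `SketchClassifier` is hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SketchClassifier(nn.Module):
    """Hypothetical stand-in for models.classifier (1-vs-8 digit classifier)."""

    def __init__(self):
        super().__init__()
        # 8 filters of size 3 x 3; padding=1 keeps the 28 x 28 resolution.
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        # 4 * 4 * 8 = 128 features after pooling, 2 output classes.
        self.fc = nn.Linear(128, 2)

    def forward(self, x):
        x = F.relu(self.conv1(x))            # (N, 8, 28, 28)
        x = F.max_pool2d(x, kernel_size=7)   # (N, 8, 4, 4), assumed pooling
        x = x.flatten(start_dim=1)           # (N, 128)
        return F.log_softmax(self.fc(x), dim=1)

sketch_model = SketchClassifier()
sketch_out = sketch_model(torch.zeros(3, 1, 28, 28))
print(sketch_out.shape)  # torch.Size([3, 2])
```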

In [21]:
conv_class = classifier()
use_gpu = False

Your code should work fine on a batch of 3 images.

In [22]:
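# Slice 3 images (0:3); the original 0:2 slice resized in place to 3 left the
# third image uninitialized, which likely explains the huge value in the output below.
batch_3images = train_set.train_data[0:3].type(torch.FloatTensor).unsqueeze(1)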
conv_class(batch_3images)

tensor([[                                   -39.0323,
                                              0.0000],
        [                                  -117.3601,
                                              0.0000],
        [-5197982905302151351510909791447810048.0000,
                                              0.0000]],
       grad_fn=<LogSoftmaxBackward>)

The following lines of code implement a data loader for the train set and the test set. No modification is needed.

In [23]:
bs = 64

kwargs = {'num_workers': 1, 'pin_memory': True} if use_gpu else {}

l8 = np.array(0)
eights_dataset = [[torch.from_numpy(e.astype(np.float32)).unsqueeze(0), torch.from_numpy(l8.astype(np.int64))] for e in eights]
l1 = np.array(1)
ones_dataset = [[torch.from_numpy(e.astype(np.float32)).unsqueeze(0), torch.from_numpy(l1.astype(np.int64))] for e in ones]
train_dataset = eights_dataset[1000:] + ones_dataset[1000:]
test_dataset = eights_dataset[:1000] + ones_dataset[:1000]

train_loader = torch.utils.data.DataLoader(train_dataset,
    batch_size=bs, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(test_dataset,
    batch_size=bs, shuffle=True, **kwargs)
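
To see what such a loader yields, here is a small sketch on made-up data with the same `[image, label]` structure (the list `demo_dataset` and the batch size of 4 are assumptions for illustration). The default collation stacks the images into one tensor and the scalar labels into a 1-D tensor; the last batch may be smaller.

```python
import numpy as np
import torch
import torch.utils.data

# 10 blank 1 x 28 x 28 "images" with alternating labels 0 and 1 (hypothetical data).
demo_dataset = [[torch.zeros(1, 28, 28),
                 torch.from_numpy(np.array(i % 2).astype(np.int64))]
                for i in range(10)]

demo_loader = torch.utils.data.DataLoader(demo_dataset, batch_size=4, shuffle=False)

batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in demo_loader]
for shapes in batch_shapes:
    print(shapes)
```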

You now need to code the training loop. Store the loss and the accuracy for each epoch.

In [24]:
def train(model,data_loader,loss_fn,optimizer,n_epochs=1):
    
    model.train(True)
    
    loss_train = np.zeros(n_epochs)
    acc_train = np.zeros(n_epochs)
    
    for epoch_num in range(n_epochs):
        running_corrects = 0.0
        running_loss = 0.0
        size = 0

        for data in data_loader:
            inputs, labels = data
            bs = labels.size(0)
            
            if use_gpu:
                # .cuda() returns a copy, so the result must be reassigned
                inputs, labels = inputs.cuda(), labels.cuda()

            y_pred = model(inputs)
            loss = loss_fn(y_pred, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # .item() extracts a Python float so the graph is not kept alive
            running_loss += loss.item()
            
            
            _, predicted = torch.max(y_pred.data, 1)
            running_corrects += (predicted == labels).sum()
            size += bs
            
        epoch_loss = running_loss / size
        epoch_acc = running_corrects.item() / size
        loss_train[epoch_num] = epoch_loss
        acc_train[epoch_num] = epoch_acc
        print('Train - Loss: {:.4f} Acc: {:.4f}'.format(epoch_loss, epoch_acc))
        
    return loss_train, acc_train

In [25]:
conv_class = classifier()
# choose the appropriate loss
loss_fn = nn.NLLLoss()
learning_rate = 1e-3
# your SGD optimizer
optimizer_cl = torch.optim.SGD(conv_class.parameters(), lr=learning_rate)
# and train for 10 epochs
l_t, a_t = train(conv_class,train_loader,loss_fn,optimizer_cl,n_epochs = 10)

Train - Loss: 0.0110 Acc: 0.4377
Train - Loss: 0.0106 Acc: 0.5413
Train - Loss: 0.0103 Acc: 0.6754
Train - Loss: 0.0099 Acc: 0.7886
Train - Loss: 0.0095 Acc: 0.8387
Train - Loss: 0.0091 Acc: 0.8759
Train - Loss: 0.0087 Acc: 0.8967
Train - Loss: 0.0083 Acc: 0.9121
Train - Loss: 0.0079 Acc: 0.9231
Train - Loss: 0.0075 Acc: 0.9322
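
The losses printed above look small because `nn.NLLLoss()` already averages over each batch, and `train` then divides the sum of these batch means by the number of samples. The printed value is therefore roughly the per-sample loss divided by the batch size. A sketch of the effect on made-up uniform predictions (the tensors below are illustrative; `reduction='sum'` is one way to recover a true per-sample average):

```python
import math

import torch
import torch.nn as nn

bs, n_batches = 64, 10
# Uniform log-probabilities over 2 classes: every sample has loss ln(2).
log_probs = torch.full((bs, 2), math.log(0.5))
targets = torch.zeros(bs, dtype=torch.int64)

mean_loss_fn = nn.NLLLoss()                # default reduction='mean'
sum_loss_fn = nn.NLLLoss(reduction='sum')

running_mean, running_sum, size = 0.0, 0.0, 0
for _ in range(n_batches):
    running_mean += mean_loss_fn(log_probs, targets).item()
    running_sum += sum_loss_fn(log_probs, targets).item()
    size += bs

print(running_mean / size)  # about ln(2) / 64 = 0.0108
print(running_sum / size)   # about ln(2) = 0.6931
```

Read this way, the first-epoch value of 0.0110 corresponds to a per-sample loss of about $0.0110 \times 64 \approx 0.70$, close to $\ln 2$, the loss of a classifier that guesses uniformly between two classes.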


Let's train for 10 more epochs.

In [26]:
l_t1, a_t1 = train(conv_class,train_loader,loss_fn,optimizer_cl,n_epochs = 10)

Train - Loss: 0.0070 Acc: 0.9377
Train - Loss: 0.0066 Acc: 0.9424
Train - Loss: 0.0062 Acc: 0.9458
Train - Loss: 0.0059 Acc: 0.9496
Train - Loss: 0.0055 Acc: 0.9517
Train - Loss: 0.0052 Acc: 0.9548
Train - Loss: 0.0049 Acc: 0.9570
Train - Loss: 0.0046 Acc: 0.9586
Train - Loss: 0.0044 Acc: 0.9588
Train - Loss: 0.0041 Acc: 0.9610


Our network seems to learn but we now need to check its accuracy on the test set.

In [27]:
def test(model,data_loader):
    model.train(False)

    running_corrects = 0.0
    running_loss = 0.0
    size = 0

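    # No gradients are needed for evaluation
    with torch.no_grad():
        for data in data_loader:
            inputs, labels = data
            bs = labels.size(0)

            if use_gpu:
                # .cuda() returns a copy, so the result must be reassigned
                inputs, labels = inputs.cuda(), labels.cuda()

            y_pred = model(inputs)
            loss = loss_fn(y_pred, labels)
            running_loss += loss.item()
            _, predicted = torch.max(y_pred.data, 1)
            running_corrects += (predicted == labels).sum()
            size += bs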

    print('Test - Loss: {:.4f} Acc: {:.4f}'.format(running_loss / size, running_corrects.item() / size))

In [28]:
test(conv_class,test_loader)

Test - Loss: 0.0042 Acc: 0.9630


Change the optimizer to Adam.

In [35]:
conv_class = classifier()
# choose the appropriate loss
loss_fn = nn.NLLLoss()
learning_rate = 1e-4
# your Adam optimizer
optimizer_cl = torch.optim.Adam(params=conv_class.parameters(), lr=learning_rate)

# and train for 10 epochs
l_t, a_t = train(conv_class,train_loader,loss_fn,optimizer_cl,n_epochs = 10)

print('\n')

l_t1, a_t1 = train(conv_class,train_loader,loss_fn,optimizer_cl,n_epochs = 10)

print('\n')

test(conv_class,test_loader)

Train - Loss: 0.0105 Acc: 0.7390
Train - Loss: 0.0092 Acc: 0.9761
Train - Loss: 0.0079 Acc: 0.9757
Train - Loss: 0.0066 Acc: 0.9744
Train - Loss: 0.0054 Acc: 0.9735
Train - Loss: 0.0044 Acc: 0.9740
Train - Loss: 0.0037 Acc: 0.9745
Train - Loss: 0.0030 Acc: 0.9765
Train - Loss: 0.0026 Acc: 0.9781
Train - Loss: 0.0022 Acc: 0.9799


Train - Loss: 0.0020 Acc: 0.9806
Train - Loss: 0.0017 Acc: 0.9817
Train - Loss: 0.0016 Acc: 0.9823
Train - Loss: 0.0014 Acc: 0.9830
Train - Loss: 0.0013 Acc: 0.9834
Train - Loss: 0.0012 Acc: 0.9842
Train - Loss: 0.0011 Acc: 0.9842
Train - Loss: 0.0010 Acc: 0.9856
Train - Loss: 0.0010 Acc: 0.9866
Train - Loss: 0.0009 Acc: 0.9866


Test - Loss: 0.0010 Acc: 0.9860


How many parameters did your network learn?

In [30]:
#sum(1 for i in conv_class.parameters())

for param in conv_class.parameters():
    print(param.size())
    #print(param)

torch.Size([8, 1, 3, 3])
torch.Size([8])
torch.Size([2, 128])
torch.Size([2])
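
Summing the number of entries in each of the four tensors above gives the total number of learned parameters. A worked sketch using only the printed shapes:

```python
import math

# Shapes printed above: conv weights, conv biases, linear weights, linear biases.
shapes = [(8, 1, 3, 3), (8,), (2, 128), (2,)]

counts = [math.prod(shape) for shape in shapes]
total = sum(counts)

print(counts)  # [72, 8, 256, 2]
print(total)   # 338
```

On the model itself, the same total should be obtained with `sum(p.numel() for p in conv_class.parameters())`.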


## 6. Saving the full model

In [36]:
torch.save(conv_class, './conv_class.pk')

In [82]:
c = torch.load('/Users/sachaizadi/Desktop/conv_class.pk')
c

classifier(
  (conv1): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fc): Linear(in_features=128, out_features=2, bias=True)
)
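
`torch.save(model, path)` pickles the whole object, so loading it requires the `classifier` class to be importable, and recent PyTorch releases may additionally require passing `weights_only=False` to `torch.load`. A commonly used alternative is to save only the `state_dict`. A minimal sketch, using a small stand-in module (`TinyNet` is a hypothetical name) and a temporary file:

```python
import os
import tempfile

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in with the same layer names as the printed model."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(128, 2)

trained = TinyNet()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'conv_class_state.pt')
    # Save only the tensors (weights and biases), not the Python object.
    torch.save(trained.state_dict(), path)

    # Rebuild the architecture, then copy the saved tensors into it.
    restored = TinyNet()
    restored.load_state_dict(torch.load(path))

same = all(torch.equal(a, b) for a, b in
           zip(trained.state_dict().values(), restored.state_dict().values()))
print(same)  # True
```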

In [37]:
eights[0]

array([[0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.    

In [95]:
y_pred = c(torch.from_numpy(eights[0]).resize_(1, 1, 28, 28))
torch.max(y_pred.data, 1)[1].numpy()[0]

0

In [94]:
y_pred = c(torch.from_numpy(ones[0]).resize_(1, 1, 28, 28))
torch.max(y_pred.data, 1)[1].numpy()[0]

1
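
The outputs 0 and 1 are class indices, not digits: the data loader cell labels every eight with 0 and every one with 1. A small sketch that maps a row of log-probabilities back to the digit (the names `INDEX_TO_DIGIT` and `predict_digit` and the example values are illustrative):

```python
import numpy as np

# Label encoding used when building the datasets:
# class index 0 -> digit 8, class index 1 -> digit 1.
INDEX_TO_DIGIT = {0: 8, 1: 1}

def predict_digit(log_probs):
    """Return the digit whose class has the highest log-probability."""
    return INDEX_TO_DIGIT[int(np.argmax(log_probs))]

print(predict_digit(np.array([-0.01, -4.6])))  # 8
print(predict_digit(np.array([-3.2, -0.04])))  # 1
```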