# Convolutional Neural Net (CNN) with Python

- CNN: can be thought as --> feat. engineering + classification
- ReLU is the most common activation function in CNN
- Typical architecture: Input layers --> (Conv2d/3d --> MaxPooling) --> (fully connected layers) --> output layer
- Ref: https://blog.algorithmia.com/convolutional-neural-nets-in-pytorch/


## Terms

- **Kernel Size** – the size of the filter.
- **Kernel Type** – the values of the actual filter. Some examples include identity, edge detection, and sharpen.
- **Stride** – the rate at which the kernel passes over the input image. A stride of 2 moves the kernel in 2 pixel increments.
- **Padding** – we can add layers of 0s to the outside of the image in order to make sure that the kernel properly passes over the edges of the image.
- **Output Layers** – how many different kernels are applied to the image.
- **Max Pooling** – used to reduce the dimension of the images. As simple as max(pix_1,...,pix_n) 


## 1. Setup

In [1]:
# Library
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms

# Set random seed --> just to replicate experiment!
seed = 42
np.random.seed(seed)
torch.manual_seed(seed)

<torch._C.Generator at 0x7fd92d284c50>

## 2. Download or load the data

Ref: https://www.cs.toronto.edu/~kriz/cifar.html

In [2]:
#The compose function allows for multiple transforms
#transforms.ToTensor() converts our PILImage to a tensor of shape (C x H x W) in the range [0,1]
#transforms.Normalize(mean,std) normalizes a tensor to a (mean, std) for (R, G, B)
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

dpath = '/home/jaime/.dropbox-lcn/Dropbox/jisoft_LARGE/0_data_cifardata'

train_set = torchvision.datasets.CIFAR10(root=dpath, train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root=dpath, train=False, download=True, transform=transform)

Files already downloaded and verified
Files already downloaded and verified


In [3]:
# Classes
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

In [4]:
train_set.train_data.shape

(50000, 32, 32, 3)

## 3. Prepare data: training, validation and testing

In [5]:
from torch.utils.data.sampler import SubsetRandomSampler

#Training
n_training_samples = 20000
train_sampler = SubsetRandomSampler(np.arange(n_training_samples, dtype=np.int64))

#Validation
n_val_samples = 5000
val_sampler = SubsetRandomSampler(np.arange(n_training_samples, n_training_samples + n_val_samples, dtype=np.int64))

#Test
n_test_samples = 5000
test_sampler = SubsetRandomSampler(np.arange(n_test_samples, dtype=np.int64))

In [6]:
train_sampler

<torch.utils.data.sampler.SubsetRandomSampler at 0x7fd8eaaa67b8>

## 4. Design the CNN

Main functions:

- **torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)** – applies convolution
- **torch.nn.relu(x)** – applies ReLU
- **torch.nn.MaxPool2d(kernel_size, stride, padding)** – applies Max Pooling
- **torch.nn.Linear(in_features, out_features)** – fully connected layer (multiply inputs by learned weights)

In [7]:
from torch.autograd import Variable
import torch.nn.functional as F

class CNN0(torch.nn.Module):
    
    #Our batch shape for input x is (3, 32, 32)
    
    def __init__(self):
        super(CNN0, self).__init__()
        
        # Input channels = 3, output channels = 18
        #         - - -
        # kernel: - x -  stride=1 will keep the same dimension! 
        #         - - -
        # Output: 18,32,32
        self.conv1 = torch.nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)
        
        # MaxPool: stride=2 will make jumpt 2-by-2
        #          number of channels will not change
        # Output: 18,16,16
        self.pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        
        # DenseLayer1: flatten the previous output
        # 4608 = 18*16*16 input features, 64 output features (see sizing flow below)
        # Output: 1,64
        self.fc1 = torch.nn.Linear(18 * 16 * 16, 64)
        
        # Denselayer2
        # 64 input features, 10 output features for our 10 defined classes
        # Output: 1,10
        self.fc2 = torch.nn.Linear(64, 10)
        
    def forward(self, x):
        
        # Computes the activation of the first convolution
        # Size changes from (3, 32, 32) to (18, 32, 32)
        x = F.relu(self.conv1(x))
        
        # Size changes from (18, 32, 32) to (18, 16, 16)
        x = self.pool(x)
        
        # Reshape data to input to the input layer of the neural net
        # Size changes from (18, 16, 16) to (1, 4608)
        # Recall that the -1 infers this dimension from the other given dimension
        x = x.view(-1, 18 * 16 *16)
        
        # Computes the activation of the first fully connected layer
        # Size changes from (1, 4608) to (1, 64)
        x = F.relu(self.fc1(x))
        
        #Computes the second fully connected layer (activation applied later)
        #Size changes from (1, 64) to (1, 10)
        x = self.fc2(x)
        return(x)

In [16]:
# Auxiliary function to find out the output size
def outputSize(in_size, kernel_size, stride, padding):

    output = int((in_size - kernel_size + 2*(padding)) / stride) + 1

    return(output)

in_size = 32
kernel_size = 3
stride = 1
padding = 1
outputSize(in_size, kernel_size, stride, padding)


32

In [17]:
train_set

Dataset CIFAR10
    Number of datapoints: 50000
    Split: train
    Root Location: /home/jaime/.dropbox-lcn/Dropbox/jisoft_LARGE/0_data_cifardata
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None

## 5. Data Loader

In [9]:
# DataLoader for each dataset: prepare each batch

# Train loader. It takes in a dataset and a sampler for loading (num_workers deals with system level memory) 
def get_train_loader(batch_size):
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size,
                                           sampler=train_sampler, num_workers=2)
    return(train_loader)

# Test and validation loaders have constant batch sizes, so we can define them directly
test_loader = torch.utils.data.DataLoader(test_set, batch_size=4, sampler=test_sampler, num_workers=2)
val_loader = torch.utils.data.DataLoader(train_set, batch_size=128, sampler=val_sampler, num_workers=2)


## 6. Optimizer and Loss

In [10]:
import torch.optim as optim

# Function makes easy to recall functions for different NN models :)
def createLossAndOptimizer(net, learning_rate=0.001):
    
    #Loss function
    loss = torch.nn.CrossEntropyLoss()
    
    #Optimizer
    optimizer = optim.Adam(net.parameters(), lr=learning_rate)
    
    return(loss, optimizer)

In [11]:
use_cuda = torch.cuda.is_available()
device = torch.device("cuda:0" if use_cuda else "cpu") 

## 7. Training

In [14]:
import time
from sklearn.metrics import classification_report
#import sklearn.metrics.classification_report as clf_report

def trainNet(net, batch_size, n_epochs, learning_rate):
    
    #Print all of the hyperparameters of the training iteration:
    print("===== HYPERPARAMETERS =====")
    print("batch_size=", batch_size)
    print("epochs=", n_epochs)
    print("learning_rate=", learning_rate)
    print("=" * 30)
    
    #Get training data
    train_loader = get_train_loader(batch_size)
    n_batches = len(train_loader)
    
    #Create our loss and optimizer functions
    loss, optimizer = createLossAndOptimizer(net, learning_rate)
    
    #Time for printing
    training_start_time = time.time()
    
    #Loop for n_epochs
    for epoch in range(n_epochs):
        
        running_loss = 0.0
        print_every = n_batches // 10
        start_time = time.time()
        total_train_loss = 0
        
        for i, data in enumerate(train_loader, 0):
            
            #Get inputs
            inputs, labels = data # data are in torch tensor format
            
            #Wrap them in a Variable object
            #print(inputs.shape)
            #print(labels.shape)
            #inputs, labels = Variable(inputs), Variable(labels)
            
            #Set the parameter gradients to zero
            optimizer.zero_grad()
            
            #Forward pass, backward pass, optimize
            outputs = net(inputs)
            
            #print(outputs.shape)
            #print(labels.shape)
            loss_size = loss(outputs, labels)
            loss_size.backward()
            optimizer.step()
            
            #Print statistics
            running_loss += loss_size.data[0]
            total_train_loss += loss_size.data[0]
            
            #Print every 10th batch of an epoch
            if (i + 1) % (print_every + 1) == 0:
                print("Epoch {}, {:d}% \t train_loss: {:.2f} took: {:.2f}s".format(
                        epoch+1, int(100 * (i+1) / n_batches), running_loss / print_every, time.time() - start_time))
                #Reset running loss and time
                running_loss = 0.0
                start_time = time.time()
            
        #At the end of the epoch, do a pass on the validation set
        total_val_loss = 0
        for inputs, labels in val_loader:
            
            #Wrap tensors in Variables
            inputs, labels = Variable(inputs), Variable(labels)
            
            #Forward pass
            val_outputs = net(inputs)
            val_loss_size = loss(val_outputs, labels)
            total_val_loss += val_loss_size.data[0]
            # Report classification accuracy
            #y_true = np.array(labels) # this works because 'labels' does not have gradients associated
            y_true = labels.numpy() # no need to detach()
            _, y_pred = torch.max(val_outputs,1)
            y_pred = y_pred.detach().numpy() # just another way to converto to np
            print('y_true:',y_true)
            print('y_pred:',y_pred)
            print(classification_report(y_true, y_pred, target_names=list(classes)))
            
        print("Validation loss = {:.2f}".format(total_val_loss / len(val_loader)))
        
    print("Training finished, took {:.2f}s".format(time.time() - training_start_time))


In [15]:
# Run the training
CNN = CNN0()
trainNet(CNN, batch_size=32, n_epochs=5, learning_rate=0.001)

===== HYPERPARAMETERS =====
batch_size= 32
epochs= 5
learning_rate= 0.001




Epoch 1, 10% 	 train_loss: 2.11 took: 1.29s
Epoch 1, 20% 	 train_loss: 1.82 took: 1.14s
Epoch 1, 30% 	 train_loss: 1.75 took: 0.98s
Epoch 1, 40% 	 train_loss: 1.65 took: 2.14s
Epoch 1, 50% 	 train_loss: 1.60 took: 1.15s
Epoch 1, 60% 	 train_loss: 1.60 took: 1.28s
Epoch 1, 70% 	 train_loss: 1.52 took: 0.96s
Epoch 1, 80% 	 train_loss: 1.53 took: 1.23s
Epoch 1, 90% 	 train_loss: 1.43 took: 1.19s




y_true: [3 2 7 1 7 2 8 6 1 2 4 3 6 0 9 1 3 8 7 6 0 1 0 4 0 9 0 3 3 7 7 3 1 3 9 7 5
 7 1 8 8 9 7 2 6 8 0 7 7 6 2 5 3 6 7 1 6 6 1 3 3 8 1 1 7 7 7 8 2 5 9 4 0 2
 9 9 4 1 8 0 5 1 4 5 2 4 6 3 6 3 9 4 4 8 0 9 6 9 1 7 4 0 4 1 7 5 5 6 8 1 1
 0 3 3 5 1 7 5 4 8 4 7 8 6 9 7 9 5]
y_pred: [6 2 7 1 3 6 8 7 9 6 6 7 2 0 1 1 5 8 7 4 0 1 8 9 0 9 0 3 3 7 7 5 9 4 9 2 6
 7 1 2 1 9 7 2 8 0 0 7 7 6 2 8 6 6 7 1 6 6 1 2 2 0 9 1 7 2 7 8 2 6 1 4 8 3
 9 9 4 1 8 0 6 6 4 2 2 3 6 1 2 3 9 2 4 8 9 9 6 7 1 7 0 0 2 9 6 2 5 6 1 1 1
 0 7 3 3 1 9 7 2 8 4 4 8 6 9 7 1 6]
             precision    recall  f1-score   support

      plane       0.73      0.73      0.73        11
        car       0.67      0.71      0.69        17
       bird       0.29      0.62      0.40         8
        cat       0.50      0.29      0.36        14
       deer       0.62      0.42      0.50        12
        dog       0.33      0.10      0.15        10
       frog       0.42      0.62      0.50        13
      horse       0.72      0.68     

  .format(len(labels), len(target_names))
  'precision', 'predicted', average, warn_for)
  'recall', 'true', average, warn_for)


Epoch 2, 10% 	 train_loss: 1.40 took: 1.51s
Epoch 2, 20% 	 train_loss: 1.39 took: 1.47s
Epoch 2, 30% 	 train_loss: 1.32 took: 1.17s
Epoch 2, 40% 	 train_loss: 1.34 took: 0.98s
Epoch 2, 50% 	 train_loss: 1.32 took: 1.06s
Epoch 2, 60% 	 train_loss: 1.31 took: 1.22s
Epoch 2, 70% 	 train_loss: 1.33 took: 1.14s
Epoch 2, 80% 	 train_loss: 1.31 took: 1.45s
Epoch 2, 90% 	 train_loss: 1.29 took: 1.02s
y_true: [9 0 8 1 0 3 8 7 7 0 9 7 9 0 2 7 7 2 5 7 1 4 0 6 2 1 3 5 0 2 4 0 3 5 5 4 5
 0 2 8 5 4 5 4 6 2 1 7 7 6 9 8 7 2 9 3 4 5 9 5 8 1 7 9 5 5 5 0 4 6 0 7 8 8
 5 0 7 3 1 2 9 1 9 0 3 2 8 0 5 9 4 0 5 4 1 7 3 5 0 4 6 1 1 6 1 6 2 8 3 4 5
 2 4 6 7 7 5 8 1 5 8 8 6 5 2 8 3 4]
y_pred: [9 8 0 0 9 3 8 7 7 9 9 6 9 0 2 3 9 8 7 7 9 2 2 2 3 1 3 5 0 0 9 0 9 5 3 4 6
 0 2 8 3 4 5 2 6 5 0 7 7 6 8 8 6 5 9 3 6 5 9 7 8 1 7 9 6 2 5 9 4 6 0 4 8 8
 5 0 7 2 9 2 9 9 6 8 1 3 9 0 3 9 2 0 3 5 1 7 5 6 9 3 3 1 9 6 9 3 2 8 3 4 4
 2 3 6 7 5 9 8 0 3 8 8 3 5 2 8 5 6]
             precision    recall  f1-score   support

      plane 

  .format(len(labels), len(target_names))


Epoch 4, 10% 	 train_loss: 1.07 took: 1.19s
Epoch 4, 20% 	 train_loss: 1.04 took: 1.23s
Epoch 4, 30% 	 train_loss: 1.05 took: 0.97s
Epoch 4, 40% 	 train_loss: 1.11 took: 1.22s
Epoch 4, 50% 	 train_loss: 1.08 took: 1.27s
Epoch 4, 60% 	 train_loss: 1.03 took: 1.43s
Epoch 4, 70% 	 train_loss: 1.07 took: 1.12s
Epoch 4, 80% 	 train_loss: 1.06 took: 1.18s
Epoch 4, 90% 	 train_loss: 1.03 took: 1.01s
y_true: [2 2 3 8 4 1 5 8 7 6 0 5 6 0 5 9 3 6 2 7 5 4 4 1 9 9 3 3 4 5 8 2 7 4 3 2 3
 5 0 2 9 9 8 3 3 1 2 7 9 4 8 0 2 2 2 0 6 2 6 2 0 9 4 4 5 3 0 1 6 9 1 4 8 7
 3 4 8 7 3 6 0 8 4 1 7 0 8 8 8 1 9 8 6 5 3 6 3 5 2 7 7 8 3 1 6 5 9 5 6 8 3
 7 3 6 7 2 3 8 3 2 3 0 6 3 8 1 7 2]
y_pred: [7 2 8 8 6 8 2 0 7 6 0 5 6 7 4 9 6 6 0 7 4 4 2 8 1 9 2 4 8 7 6 4 7 5 5 8 5
 3 0 2 9 9 8 2 7 1 2 2 9 4 8 5 2 2 2 0 6 3 6 3 0 9 7 2 5 4 8 9 6 9 1 7 8 4
 6 7 9 7 2 6 0 8 7 1 7 0 8 8 8 1 7 8 6 4 5 6 9 9 7 7 7 6 6 1 4 5 9 0 6 0 6
 7 6 5 7 5 3 8 7 3 3 0 6 6 8 1 2 3]
             precision    recall  f1-score   support

      plane 

  .format(len(labels), len(target_names))


Epoch 5, 10% 	 train_loss: 0.95 took: 1.21s
Epoch 5, 20% 	 train_loss: 0.94 took: 1.10s
Epoch 5, 30% 	 train_loss: 0.92 took: 1.24s
Epoch 5, 40% 	 train_loss: 0.97 took: 1.10s
Epoch 5, 50% 	 train_loss: 0.94 took: 1.33s
Epoch 5, 60% 	 train_loss: 0.98 took: 1.55s
Epoch 5, 70% 	 train_loss: 0.98 took: 1.64s
Epoch 5, 80% 	 train_loss: 1.02 took: 1.34s
Epoch 5, 90% 	 train_loss: 1.00 took: 1.33s
y_true: [6 6 6 8 1 1 7 3 8 6 5 9 9 6 0 4 7 7 5 2 5 1 7 2 1 0 2 4 9 6 8 4 7 2 9 0 2
 7 2 9 6 8 6 1 3 0 9 3 9 8 4 4 5 0 4 1 9 8 7 5 1 0 0 9 9 6 4 8 3 6 0 0 7 5
 2 6 7 2 2 1 2 1 6 6 4 2 3 2 1 8 2 9 8 2 3 8 2 4 1 8 2 2 2 0 2 2 8 4 9 8 0
 8 0 0 4 6 8 3 8 4 1 1 4 9 6 6 8 4]
y_pred: [0 4 6 0 1 1 7 3 8 6 5 8 9 6 0 5 3 7 5 4 5 1 0 0 1 2 0 4 9 3 1 4 3 0 9 0 4
 7 4 9 2 8 1 1 5 9 8 3 9 8 4 2 3 0 4 1 9 8 7 3 1 0 0 9 1 2 4 8 5 2 0 0 4 2
 2 6 5 9 2 1 2 1 4 6 7 0 5 2 1 9 2 9 8 2 4 3 0 0 1 8 7 7 0 0 3 6 8 4 1 8 0
 8 8 0 2 1 8 3 8 2 0 1 4 0 3 6 0 2]
             precision    recall  f1-score   support

      plane 

  .format(len(labels), len(target_names))


In [None]:
torch.nn.CrossEntropyLoss(

In [30]:
a = Variable(torch.ones(5),requires_grad=True)
b = Variable(2*torch.ones(5))
print(a.detach().numpy())
print(b.numpy())
list(classes)

[ 1.  1.  1.  1.  1.]
[ 2.  2.  2.  2.  2.]


['plane',
 'car',
 'bird',
 'cat',
 'deer',
 'dog',
 'frog',
 'horse',
 'ship',
 'truck']

In [21]:
# Examining the data set
train_loader = get_train_loader(10)
allvars = []
for i, data in enumerate(train_loader, 0):
    print('i:',i,'data shape:',data[0].shape,'labels:',data[1].shape)
    print('labels:',data[1])
    allvars.append(Variable(data[0]))

i: 0 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 7,  2,  4,  5,  5,  3,  9,  3,  4,  8])
i: 1 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 2,  7,  2,  1,  5,  1,  8,  4,  0,  7])
i: 2 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 8,  5,  9,  5,  6,  4,  3,  7,  0,  1])
i: 3 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 4,  5,  9,  7,  9,  3,  6,  9,  9,  3])
i: 4 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 3,  8,  9,  9,  9,  8,  9,  5,  6,  6])
i: 5 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 8,  5,  3,  6,  6,  6,  2,  9,  5,  6])
i: 6 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 3,  7,  7,  0,  0,  2,  2,  9,  0,  3])
i: 7 data shape: torch.Size([10, 3, 32, 32]) labels: torch.Size([10])
labels: tensor([ 1,  1,  9,  3,  0,  6,  

RuntimeError: received 0 items of ancdata

In [32]:
classes

('plane',
 'car',
 'bird',
 'cat',
 'deer',
 'dog',
 'frog',
 'horse',
 'ship',
 'truck')

In [31]:
np.array(data[1])

array([6, 0, 1, 6, 2, 2, 9, 9, 9, 5])

In [28]:
torch.__version__

'0.4.0'