# Part 2. Deep Learning Frameworks

Before we go into deep learning modelling, we will first need to have a quick familiarisation with a deep learning framework. We recommend __[Keras](https://keras.io)__, which is built on top of Tensorflow, but alternatively, you can consider __[PyTorch](https://pytorch.org)__. Resources are abundant online on how to use them, but here are some official guides to get you started:
- PyTorch has a [60 Minute Blitz Guide](https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html)
- Tensorflow has an [Intro to Keras guide](https://www.tensorflow.org/guide/keras)

A few words on the difference between Keras and PyTorch - Keras is a high level wrapper on top of Google's Tensorflow, the most popular deep learning framework out there. Being more low level, Tensorflow faces many issues and troubles, which are addressed by the abstractions of Keras, making it a great way to start. Facebook's PyTorch on the other hand is a newcomer which has received massive interest in recent years, and is playing catch up to Tensorflow/Keras.

If you are more interested in how deep learning software has evolved since the days of Caffe and Theano as well as more in depth into what is happening in the software behind the scenes, we also recommend a [full lecture from Stanford](https://www.youtube.com/watch?v=6SlgtELqOWc) on this topic, although this is extra knowledge that isn't fully critical to this week.

Base on the tutorials you go through, you should be ready to build a 2 (or more) layer Multi-Level Perceptron (MLP) with deep learning. With the dataset you have prepared your machine learning model in the previous section, run your data through a MLP model with `Dense` (`Linear`) layers instead. Do some slight model adjustments, and discuss what kind of adjustments lead to improvements in score.

In [8]:
import pickle
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F

import torchvision
import torchvision.transforms as tfms

In [11]:
with open('./cifar10/data_batch_1', 'rb') as fo:
    batch1_dict = pickle.load(fo, encoding='bytes')
X = batch1_dict[b'data']
y = np.array(batch1_dict[b'labels'])

In [19]:
!ls cifar10

batches.meta data_batch_2 data_batch_4 readme.html
data_batch_1 data_batch_3 data_batch_5 test_batch


<font color=magenta>
1. transformations/preprocessing -> totensor, normalisation  <br />
2. load it into a loader  
</font>

In [20]:
with open('./cifar10/test_batch', 'rb') as fo:
    test_dict = pickle.load(fo, encoding='bytes')
X_test = test_dict[b'data']
y_test = np.array(test_dict[b'labels'])

In [35]:
X, y = torch.tensor(X).float(), torch.tensor(y).long()

  if __name__ == '__main__':


In [36]:
X_test, y_test = torch.tensor(X_test).float(), torch.tensor(y_test).long()
test_dataset = torch.utils.data.TensorDataset(X, y)
cifar10_test = torch.utils.data.DataLoader(test_dataset)

  if __name__ == '__main__':


In [37]:
X.shape

torch.Size([10000, 3072])

In [38]:
dataset = torch.utils.data.TensorDataset(X, y)
cifar10_data = torch.utils.data.DataLoader(dataset)

In [57]:
mlp = nn.Sequential(
    nn.Linear(3072, 100),
    nn.ReLU(),
    nn.Linear(100, 10),
)

In [58]:
optimizer = torch.optim.SGD(mlp.parameters(), 0.003)

In [59]:
criterion = nn.CrossEntropyLoss()

In [64]:
cnt = 0
mlp.train()
for data, label in cifar10_data:
    cnt += 1
    mlp.zero_grad()
    pred = mlp(data)
    loss = criterion(pred, label)
    loss.backward()
    optimizer.step()
    if cnt % 500 == 0:
        print(cnt)

500
1000
1500
2000
2500
3000
3500
4000
4500
5000
5500
6000
6500
7000
7500
8000
8500
9000
9500
10000


In [65]:
eval_mlp()

Ave loss: 2.3025026321411133, corrects: 1016


In [53]:
def eval_mlp():
    mlp.eval()
    total_loss = 0
    cnt = 0
    corrects = 0
    for data, label in cifar10_test:
        cnt += 1
        pred = mlp(data)
        loss = criterion(pred, label)
        if torch.argmax(pred) == label.item():
            corrects += 1
        total_loss += loss
    print('Ave loss: {}, corrects: {}'.format(total_loss/len(cifar10_test), corrects))
    mlp.train()

<font color=magenta>
1. loss vs accuracy: for every epoch 1. training_loss, 2. val_loss 3. top1 acc <br />
2. a. preprocess and load b. create architecture c. create training `train()` script d. evaluation `eval() -> val_loss, acc` scripts e. run altogether
</font>