<a href="https://colab.research.google.com/github/Lucas-Aerts/Machine-Learning-Course-2days/blob/main/mnist-deep-learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import warnings
warnings.filterwarnings('ignore')

# MNIST digit classification: Deep Learning

Let's load the data again:

In [None]:
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

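# Download the 70,000 MNIST images as NumPy arrays (no local caching)
mnist = fetch_openml('mnist_784', as_frame=False, cache=False)

X = mnist.data.astype('float32')  # one row of 784 pixel values per image
y = mnist.target.astype('int64')  # labels arrive as strings; cast to integers

# Rescale every pixel column to the [0, 1] range
X = MinMaxScaler().fit_transform(X)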

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
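
To make the preprocessing concrete, here is a minimal pure-Python sketch of what min-max scaling does to a single pixel column, followed by the arithmetic behind the 75/25 split. The helper `min_max_scale` and the toy column are illustrative only and not part of Scikit-learn; mapping a constant column to 0 is our understanding of how `MinMaxScaler` behaves.

```python
def min_max_scale(column):
    """Rescale one feature column to [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        # Nothing to rescale (e.g. an MNIST border pixel that is always 0)
        return [0.0 for _ in column]
    return [(v - lo) / span for v in column]

# Hypothetical values of one pixel across four images
pixel_column = [0, 51, 255, 102]
scaled = min_max_scale(pixel_column)
print(scaled)

# Split sizes for the 70,000 MNIST images with test_size=0.25
n_samples = 70000
n_test = int(n_samples * 0.25)
n_train = n_samples - n_test
print(n_train, n_test)  # 52500 17500
```

Note that the scaling is applied per column (per pixel position), not per image.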

For Deep Learning we will use the [PyTorch](https://pytorch.org/) library. 

PyTorch can fit models on a GPU, if available:

In [None]:
import torch
from torch import nn
import torch.nn.functional as F

device = 'cuda' if torch.cuda.is_available() else 'cpu'

print(device)

## 1. Feed-forward neural network

We can now define a neural network with one hidden layer as a class that inherits from `nn.Module`.

We declare the layers in `__init__()` and describe how an input flows through them in `forward()`:

In [None]:
class myNeuralNetwork(nn.Module):
    def __init__(
            self,
            input_dim,
            hidden_dim,
            output_dim
    ):
        super(myNeuralNetwork, self).__init__()
        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, output_dim)

    def forward(self, x, **kwargs):
        x = F.relu(self.hidden(x))
        x = self.output(x)
        x = F.softmax(x, dim=1)
        return x
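
The last step of `forward()` turns the raw output scores into class probabilities. As a rough illustration of what `F.softmax(x, dim=1)` computes for one row, here is a pure-Python sketch; the function `softmax` and the example scores are made up for this illustration.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# The predicted class is the index with the highest probability
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

The ordering of the scores is preserved, so the largest score always gets the largest probability.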

The skorch library wraps PyTorch model fitting in a Scikit-learn-style interface, so a network can be trained and evaluated with the familiar `fit()` and `predict()` calls.

We can initialize the `myNeuralNetwork` architecture as follows:

In [None]:
# First install the skorch library
!pip install skorch

from skorch import NeuralNetClassifier

input_dim = 784
hidden_dim = 100
output_dim = 10

net = NeuralNetClassifier(
    myNeuralNetwork(input_dim, hidden_dim, output_dim),
    max_epochs=20,
    lr=0.1, #learning rate
    device=device,
)
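
To get a feel for the size of this network, the following sketch counts its trainable parameters by hand. Each `nn.Linear(n_in, n_out)` layer holds a weight matrix plus one bias per output; the helper `linear_params` is a name introduced here for illustration.

```python
def linear_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

input_dim, hidden_dim, output_dim = 784, 100, 10

hidden_params = linear_params(input_dim, hidden_dim)   # 784*100 + 100 = 78500
output_params = linear_params(hidden_dim, output_dim)  # 100*10 + 10 = 1010
total_params = hidden_params + output_params

print(total_params)  # 79510
```

Almost all parameters sit in the hidden layer, because it connects to every one of the 784 input pixels.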

Now we can call the `fit()` function to train the neural network:

In [None]:
net.fit(X_train, y_train)

We can make predictions using the fitted model as follows:

In [None]:
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

y_predicted = net.predict(X_test)

print("Accuracy = {}%\n".format(accuracy_score(y_test, y_predicted)*100))

print("Classification Report\n {}".format(classification_report(y_test, y_predicted, labels=range(0,10))))
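
Accuracy is simply the fraction of test images whose predicted label matches the true label. The sketch below computes it by hand on made-up labels; `accuracy` is an illustrative helper, not the Scikit-learn function.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: one of five digits is misclassified (an 8 read as a 3)
toy_true = [3, 8, 1, 1, 7]
toy_pred = [3, 3, 1, 1, 7]

toy_accuracy = accuracy(toy_true, toy_pred)
print("Accuracy = {}%".format(toy_accuracy * 100))  # Accuracy = 80.0%
```

The classification report goes further and breaks the result down per digit, which shows which classes are confused with each other.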

## 2. Convolutional neural network

A CNN takes each image as a 2D grid of pixels, not as the flattened 1D vector used by the previous network.

Images can also have channels. For color images there are typically 3 channels: one for red, one for green, and one for blue.
For gray-scale images there is just one channel. 

For the MNIST data we reshape the datasets as follows:

In [None]:
print(X.shape)

XCnn = X.reshape(-1, 1, 28, 28)

print(XCnn.shape)
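
The reshape does not change any pixel values; it only regroups the 784 numbers of each image into 28 rows of 28. A small pure-Python sketch of that regrouping follows, assuming row-major order (NumPy's default) and that the flattened MNIST vectors store the image row by row. The helpers `flat_to_2d` and `to_image` are illustrative names.

```python
def flat_to_2d(k, width=28):
    """Map a flat pixel index to its (row, column) position."""
    return divmod(k, width)

def to_image(flat, height=28, width=28):
    """Cut a flat list of height*width values into a list of rows."""
    return [flat[r * width:(r + 1) * width] for r in range(height)]

# Use the flat index itself as the "pixel value" so positions are easy to check
flat = list(range(784))
image = to_image(flat)

print(flat_to_2d(29))  # (1, 1)
print(image[1][1])     # 29
```

The extra dimension of size 1 in `(-1, 1, 28, 28)` is the single gray-scale channel, and `-1` lets NumPy infer the number of images.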

Next we create the training and test sets:

In [None]:
XCnn_train, XCnn_test, y_train, y_test = train_test_split(XCnn, y, test_size=0.25, random_state=42)

print(XCnn_train.shape)
print(y_train.shape)

We define the CNN:

In [None]:
class myCNN(nn.Module):
    def __init__(self, dropout=0.5):
        super(myCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.dropout = nn.Dropout(p=dropout)  # uses the otherwise ignored `dropout` argument
        self.fc1 = nn.Linear(1600, 10)  # 1600 = 64 channels * 5 * 5 after two conv + pool stages

    def forward(self, x):
        x = torch.relu(F.max_pool2d(self.conv1(x), 2))  # -> (N, 32, 13, 13)
        x = torch.relu(F.max_pool2d(self.conv2(x), 2))  # -> (N, 64, 5, 5)

        x = torch.flatten(x, 1)  # -> (N, 1600)
        x = self.dropout(x)  # only active while training

        x = self.fc1(x)
        x = torch.softmax(x, dim=1)  # dim=1: normalize over the 10 classes of each sample

        return x
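
The input size of 1600 for the final linear layer follows from how each stage shrinks the image. The sketch below retraces that arithmetic using the standard output-size formula for convolution and pooling without padding; `conv_out` and `pool_out` are helper names introduced here, and the pooling stride is assumed to equal the pooling window, which is the default of `F.max_pool2d`.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Output width/height of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_out(size, kernel_size):
    """Output width/height of max pooling with stride == kernel_size."""
    return (size - kernel_size) // kernel_size + 1

size = 28
size = pool_out(conv_out(size, 3), 2)  # 28 -> 26 -> 13
size = pool_out(conv_out(size, 3), 2)  # 13 -> 11 -> 5

out_channels = 64  # output channels of the second convolution
flat_features = out_channels * size * size
print(flat_features)  # 1600
```

If the kernel sizes, the number of channels, or the input resolution change, this number has to be recomputed.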

Now we can wrap `myCNN` with skorch so that we can use the `fit()` and `predict()` functions:

In [None]:
cnn = NeuralNetClassifier(
    myCNN,
    max_epochs=20,
    lr=0.002,
    optimizer=torch.optim.Adam,
    device=device,
)

In [None]:
cnn.fit(XCnn_train, y_train)

In [None]:
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

y_predicted_cnn = cnn.predict(XCnn_test)

print("Accuracy = {}%\n".format(accuracy_score(y_test, y_predicted_cnn)*100))

print("Classification Report\n {}".format(classification_report(y_test, y_predicted_cnn, labels=range(0,10))))