Author: Ethan Herron 3-4-2020

This is my Jupyter notebook for the book Deep Learning from Scratch by Seth Weidman. I will follow along with all of the code and add my own insights and questions in these markdown cells.

## Lincoln imports

In [1]:
import numpy as np
from scipy.special import logsumexp
%load_ext autoreload
%autoreload 2

In [2]:
import lincoln
from lincoln.layers import Dense
from lincoln.losses import SoftmaxCrossEntropy, MeanSquaredError
from lincoln.optimizers import Optimizer, SGD, SGDMomentum
from lincoln.activations import Sigmoid, Tanh, Linear, ReLU
from lincoln.network import NeuralNetwork
from lincoln.train import Trainer
from lincoln.utils import mnist
from lincoln.utils.np_utils import softmax

In [3]:
'''
Credit: https://github.com/hsjeong5
'''

import numpy as np
from urllib import request
import gzip
import pickle

filename = [
["training_images","train-images-idx3-ubyte.gz"],
["test_images","t10k-images-idx3-ubyte.gz"],
["training_labels","train-labels-idx1-ubyte.gz"],
["test_labels","t10k-labels-idx1-ubyte.gz"]
]


def download_mnist():
    base_url = "http://yann.lecun.com/exdb/mnist/"
    for name in filename:
        print("Downloading "+name[1]+"...")
        request.urlretrieve(base_url+name[1], name[1])
    print("Download complete.")


def save_mnist():
    mnist = {}
    for name in filename[:2]:
        with gzip.open(name[1], 'rb') as f:
            mnist[name[0]] = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1,28*28)
    for name in filename[-2:]:
        with gzip.open(name[1], 'rb') as f:
            mnist[name[0]] = np.frombuffer(f.read(), np.uint8, offset=8)
    with open("mnist.pkl", 'wb') as f:
        pickle.dump(mnist,f)
    print("Save complete.")


def init():
    download_mnist()
    save_mnist()


def load():
    with open("mnist.pkl",'rb') as f:
        mnist = pickle.load(f)
    return mnist["training_images"], mnist["training_labels"], mnist["test_images"], mnist["test_labels"]

In [4]:
download_mnist()

Downloading train-images-idx3-ubyte.gz...
Downloading t10k-images-idx3-ubyte.gz...
Downloading train-labels-idx1-ubyte.gz...
Downloading t10k-labels-idx1-ubyte.gz...
Download complete.


In [5]:
save_mnist()

Save complete.


In [6]:
X_train, y_train, X_test, y_test = load()

In [7]:
num_labels = len(y_train)
num_labels

60000

In [8]:
# one-hot encode the integer labels (one row per example, one column per digit)
num_labels = len(y_train)
train_labels = np.zeros((num_labels, 10))
for i in range(num_labels):
    train_labels[i][y_train[i]] = 1
    
num_labels = len(y_test)
test_labels = np.zeros((num_labels, 10))
for i in range(num_labels):
    test_labels[i][y_test[i]] = 1
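
The loops above can also be written without an explicit Python loop. This is a minimal sketch, not code from the book: `one_hot` is a hypothetical helper name of mine, and it assumes the labels are integers in the range 0 to 9. Indexing the identity matrix with the label array picks out one row per example.

```python
import numpy as np

def one_hot(labels: np.ndarray, num_classes: int = 10) -> np.ndarray:
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    # Cast to a plain integer type so the labels can be used as row indices.
    labels = np.asarray(labels, dtype=np.int64)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes)[labels]

# Small demo with made-up labels stored as uint8, like the MNIST label files.
demo = one_hot(np.array([3, 0, 9], dtype=np.uint8))
print(demo.shape)  # (3, 10)
```

With the arrays in this notebook, `one_hot(y_train)` should give the same array as `train_labels`.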

# MNIST DEMO

## Scale data to mean 0, var 1

In [9]:
X_train, X_test = X_train - np.mean(X_train), X_test - np.mean(X_train)

In [10]:
np.min(X_train), np.max(X_train), np.min(X_test), np.max(X_test)

(-33.318421449829934,
 221.68157855017006,
 -33.318421449829934,
 221.68157855017006)

In [11]:
X_train, X_test = X_train / np.std(X_train), X_test / np.std(X_train)

In [12]:
np.min(X_train), np.max(X_train), np.min(X_test), np.max(X_test)

(-0.424073894391566, 2.821543345689335, -0.424073894391566, 2.821543345689335)
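
The two scaling steps above use a single mean and a single standard deviation, both computed over every pixel of the training set, and apply the same numbers to the test set. Below is a small sketch of the same idea as one function. The name `standardize` and the random stand-in data are my own and are not part of the book's code.

```python
import numpy as np

def standardize(train: np.ndarray, test: np.ndarray):
    """Scale both arrays using the global mean and std of the training set."""
    train = train.astype(np.float64)
    test = test.astype(np.float64)
    mean = train.mean()   # one scalar over all pixels and all images
    std = train.std()     # computed on the centered or uncentered data alike
    return (train - mean) / std, (test - mean) / std

# Random stand-in data with the same dtype and width as flattened MNIST images.
rng = np.random.default_rng(0)
fake_train = rng.integers(0, 256, size=(100, 784)).astype(np.uint8)
fake_test = rng.integers(0, 256, size=(20, 784)).astype(np.uint8)
train_s, test_s = standardize(fake_train, fake_test)
print(train_s.mean(), train_s.std())
```

Only the training set ends up with mean exactly 0 and standard deviation exactly 1; the test set is close but not identical, because it is scaled with the training statistics.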

In [13]:
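def calc_accuracy_model(model, test_set):
    # Predicted class = index of the largest output for each example.
    preds = np.argmax(model.forward(test_set, inference=True), axis=1)
    # Compare against the integer labels in the global y_test.
    accuracy = np.equal(preds, y_test).sum() * 100.0 / test_set.shape[0]
    print(f'The model validation accuracy is: {accuracy:.2f}%')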

## Softmax Cross Entropy

### Sigmoid Activation

In [25]:
model = NeuralNetwork(layers=[Dense(neurons=89,
                                    activation=Tanh()),
                              Dense(neurons=10,
                                    activation=Sigmoid())],
                      loss = MeanSquaredError(normalize=False),
                      seed=20190119)

trainer = Trainer(model, SGD(0.1))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);
print()
calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.611
Validation loss after 20 epochs is 0.427
Validation loss after 30 epochs is 0.389
Validation loss after 40 epochs is 0.374
Validation loss after 50 epochs is 0.366

The model validation accuracy is: 72.48%


We now try using MeanSquaredError again, but this time normalizing the outputs.
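
To make clear what I mean by normalizing: my reading of `normalize=True` is that each row of predictions is divided by its sum before the squared error is taken, so the ten outputs add up to 1. That is an assumption on my part and I have not checked it against the lincoln source. The `mse` function below is my own sketch and not the library's implementation.

```python
import numpy as np

def mse(preds: np.ndarray, targets: np.ndarray, normalize: bool = False) -> float:
    """Sum of squared errors, averaged over the rows (examples)."""
    if normalize:
        # Assumed behaviour: rescale each row so its entries sum to 1.
        preds = preds / preds.sum(axis=1, keepdims=True)
    return float(np.sum((preds - targets) ** 2) / preds.shape[0])

# One example with three sigmoid-style outputs that do not sum to 1.
preds = np.array([[0.9, 0.8, 0.1]])
targets = np.array([[1.0, 0.0, 0.0]])
raw = mse(preds, targets)
normed = mse(preds, targets, normalize=True)
print(raw, normed)
```

Note that after the rescaling the correct class drops from 0.9 to 0.5 in this toy example, so normalizing changes the predictions themselves and not just the scale of the loss.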

In [27]:
model = NeuralNetwork(layers=[Dense(neurons=89,
                                    activation=Tanh()),
                              Dense(neurons=10,
                                    activation=Sigmoid())],
                      loss = MeanSquaredError(normalize=True),
                      seed=20190119)

trainer = Trainer(model, SGD(0.1))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);
print()
calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.952

Loss increased after epoch 20, final loss was 0.952, 
using the model from epoch 10

The model validation accuracy is: 41.73%


This clearly did not help: validation accuracy fell from 72.48% to 41.73%. So we move on from mean squared error and try the softmax cross entropy loss function.
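
For reference, here is a minimal NumPy sketch of the textbook softmax cross entropy and its gradient with respect to the logits. It is not lincoln's `SoftmaxCrossEntropy`, which may differ in its details, and the function names are mine. This is also why the last layer in the next cells uses a `Linear` activation: the softmax is applied inside the loss.

```python
import numpy as np

def log_softmax(logits: np.ndarray) -> np.ndarray:
    # Subtract the row maximum first so np.exp cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def softmax_cross_entropy(logits: np.ndarray, targets: np.ndarray):
    """Return the mean loss over the batch and its gradient w.r.t. the logits."""
    n = logits.shape[0]
    log_probs = log_softmax(logits)
    loss = -np.sum(targets * log_probs) / n
    # For one-hot targets the gradient is simply softmax(logits) - targets.
    grad = (np.exp(log_probs) - targets) / n
    return loss, grad

# With all-zero logits every class gets probability 1/3, so the loss is log(3).
logits = np.zeros((2, 3))
targets = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
loss, grad = softmax_cross_entropy(logits, targets)
print(round(loss, 4))  # 1.0986
```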

### SCE with Sigmoid function

In [28]:
model = NeuralNetwork(layers=[Dense(neurons=89,
                                    activation=Sigmoid()),
                              Dense(neurons=10,
                                    activation=Linear())],
                      loss = SoftmaxCrossEntropy(),
                      seed=20190119)

trainer = Trainer(model, SGD(0.1))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 130,
            eval_every = 1,
            seed=20190119,
            batch_size=60);
print()
calc_accuracy_model(model, X_test)

Validation loss after 1 epochs is 1.285
Validation loss after 2 epochs is 0.970
Validation loss after 3 epochs is 0.836
Validation loss after 4 epochs is 0.763
Validation loss after 5 epochs is 0.712
Validation loss after 6 epochs is 0.679
Validation loss after 7 epochs is 0.651
Validation loss after 8 epochs is 0.631
Validation loss after 9 epochs is 0.617
Validation loss after 10 epochs is 0.599
Validation loss after 11 epochs is 0.588
Validation loss after 12 epochs is 0.576
Validation loss after 13 epochs is 0.568
Validation loss after 14 epochs is 0.557
Validation loss after 15 epochs is 0.550
Validation loss after 16 epochs is 0.544
Validation loss after 17 epochs is 0.537
Validation loss after 18 epochs is 0.533
Validation loss after 19 epochs is 0.529
Validation loss after 20 epochs is 0.523
Validation loss after 21 epochs is 0.517
Validation loss after 22 epochs is 0.512
Validation loss after 23 epochs is 0.507

Loss increased after epoch 24, final loss was 0.507, 
using the m

### SCE with ReLU

In [29]:
model = NeuralNetwork(layers=[Dense(neurons=89,
                                    activation=ReLU()),
                              Dense(neurons=10,
                                    activation=Linear())],
                      loss = SoftmaxCrossEntropy(),
                      seed=20190119)

trainer = Trainer(model, SGD(0.1))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);
print()
calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 7.379

Loss increased after epoch 20, final loss was 7.379, 
using the model from epoch 10

The model validation accuracy is: 73.86%


In [30]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh()),
            Dense(neurons=10, 
                  activation=Linear())],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

trainer = Trainer(model, SGD(0.1))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);
print()
calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.621
Validation loss after 20 epochs is 0.565
Validation loss after 30 epochs is 0.546
Validation loss after 40 epochs is 0.542
Validation loss after 50 epochs is 0.539

The model validation accuracy is: 91.12%


### SGD Momentum
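
Before running it, a quick sketch of what I understand the momentum update to be: keep a running "velocity" that is a decayed sum of past gradient steps, and move the parameters by that velocity. This is the standard formulation and may not match lincoln's `SGDMomentum` line for line. The function name and the toy problem are mine.

```python
import numpy as np

def sgd_momentum_step(param, grad, velocity, lr=0.1, momentum=0.9):
    """One update: decay the old velocity, add the new gradient step, then move."""
    velocity = momentum * velocity + lr * grad
    return param - velocity, velocity

# Toy problem: minimize f(w) = w**2, whose gradient is 2 * w.
w = np.array([5.0])
v = np.zeros_like(w)
for _ in range(300):
    grad = 2.0 * w
    w, v = sgd_momentum_step(w, grad, v)
print(w)
```

With `momentum=0` this reduces to plain SGD, which is a handy sanity check.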

In [31]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Sigmoid()),
            Dense(neurons=10, 
                  activation=Linear())],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optim = SGDMomentum(0.1, momentum=0.9)

trainer = Trainer(model, optim)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 1,
            seed=20190119,
            batch_size=60);

calc_accuracy_model(model, X_test)

Validation loss after 1 epochs is 0.615
Validation loss after 2 epochs is 0.489
Validation loss after 3 epochs is 0.444

Loss increased after epoch 4, final loss was 0.444, 
using the model from epoch 3
The model validation accuracy is: 91.79%


In [14]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh()),
            Dense(neurons=10, 
                  activation=Linear())],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optim = SGDMomentum(0.1, momentum=0.9)

trainer = Trainer(model, optim)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.385
Validation loss after 20 epochs is 0.310

Loss increased after epoch 30, final loss was 0.310, 
using the model from epoch 20
The model validation accuracy is: 95.62%


### Different learning rate decay
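
The cells in this section pass `final_lr` and `decay_type` to the optimizer, so what changes over training is the learning rate, not the weights. Below is a sketch of how I assume the two schedules move from the starting rate to `final_lr` over the epochs. The function names are mine, and lincoln may apply the decay differently in detail.

```python
def linear_schedule(lr: float, final_lr: float, epochs: int) -> list:
    """Subtract the same amount each epoch so the last epoch uses final_lr."""
    step = (lr - final_lr) / (epochs - 1)  # assumes epochs >= 2
    return [lr - step * e for e in range(epochs)]

def exponential_schedule(lr: float, final_lr: float, epochs: int) -> list:
    """Multiply by the same factor each epoch so the last epoch uses final_lr."""
    factor = (final_lr / lr) ** (1.0 / (epochs - 1))  # assumes epochs >= 2
    return [lr * factor ** e for e in range(epochs)]

lin = linear_schedule(0.15, 0.05, 50)
exp = exponential_schedule(0.2, 0.05, 50)
print(lin[0], lin[-1], exp[0], exp[-1])
```

The exponential schedule drops quickly at first and then flattens out, while the linear one falls at a constant rate.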

In [15]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh()),
            Dense(neurons=10, 
                  activation=Linear())],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optimizer = SGDMomentum(0.15, momentum=0.9, final_lr = 0.05, decay_type='linear')

trainer = Trainer(model, optimizer)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.403
Validation loss after 20 epochs is 0.313
Validation loss after 30 epochs is 0.311
Validation loss after 40 epochs is 0.309

Loss increased after epoch 50, final loss was 0.309, 
using the model from epoch 40
The model validation accuracy is: 95.87%


In [16]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh()),
            Dense(neurons=10, 
                  activation=Linear())],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optimizer = SGDMomentum(0.2, momentum=0.9, final_lr = 0.05, decay_type='exponential')

trainer = Trainer(model, optimizer)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.513
Validation loss after 20 epochs is 0.321
Validation loss after 30 epochs is 0.296

Loss increased after epoch 40, final loss was 0.296, 
using the model from epoch 30
The model validation accuracy is: 95.70%


### Changing initial weights
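
The cells below switch the layers to `weight_init="glorot"`. As a reminder to myself, here is a sketch of the standard Glorot (Xavier) normal initialization, where the weight variance is 2 / (fan_in + fan_out). I am assuming this is the idea behind the option; lincoln's exact scaling may differ, and `glorot_normal` is my own name.

```python
import numpy as np

def glorot_normal(fan_in: int, fan_out: int, seed: int = 0) -> np.ndarray:
    """Draw a (fan_in, fan_out) weight matrix with variance 2 / (fan_in + fan_out)."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(loc=0.0, scale=std, size=(fan_in, fan_out))

# First layer of the network in this notebook: 784 inputs, 89 hidden units.
W = glorot_normal(784, 89)
print(W.shape, W.std())
```

The point of the scaling is that wider layers get smaller weights, so the inputs to `Tanh` stay in the range where its gradient is not close to zero.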

In [17]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh(),
                  weight_init="glorot"),
            Dense(neurons=10, 
                  activation=Linear(),
                  weight_init="glorot")],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optimizer = SGDMomentum(0.15, momentum=0.9, final_lr = 0.05, decay_type='linear')

trainer = Trainer(model, optimizer)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60,
            early_stopping=True);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.316
Validation loss after 20 epochs is 0.254
Validation loss after 30 epochs is 0.213

Loss increased after epoch 40, final loss was 0.213, 
using the model from epoch 30
The model validation accuracy is: 97.03%


In [18]:

model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh(),
                  weight_init="glorot"),
            Dense(neurons=10, 
                  activation=Linear(),
                  weight_init="glorot")],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optimizer = SGDMomentum(0.15, momentum=0.9, final_lr = 0.05, decay_type='exponential')

trainer = Trainer(model, optimizer)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60,
            early_stopping=True);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.310
Validation loss after 20 epochs is 0.248
Validation loss after 30 epochs is 0.233

Loss increased after epoch 40, final loss was 0.233, 
using the model from epoch 30
The model validation accuracy is: 96.78%


In [19]:

model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh(),
                  weight_init="glorot"),
            Dense(neurons=10, 
                  activation=Linear(),
                  weight_init="glorot")],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

optimizer = SGDMomentum(0.2, momentum=0.9, final_lr = 0.05, decay_type='exponential')

trainer = Trainer(model, optimizer)
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60,
            early_stopping=True);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.418
Validation loss after 20 epochs is 0.288
Validation loss after 30 epochs is 0.248

Loss increased after epoch 40, final loss was 0.248, 
using the model from epoch 30
The model validation accuracy is: 96.38%


### Dropout
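
In the cells below I am assuming that `dropout=0.8` is the probability of keeping each unit, not of dropping it. Under that assumption, a sketch of the forward pass looks like this: during training a random binary mask zeroes about 20% of the activations, and at inference time the activations are scaled by the keep probability instead. The function is my own illustration and not lincoln's layer.

```python
import numpy as np

def dropout_forward(x: np.ndarray, keep_prob: float = 0.8,
                    inference: bool = False, rng=None) -> np.ndarray:
    """Apply dropout with the given keep probability."""
    if inference:
        # No randomness at inference: scale to match the expected training output.
        return x * keep_prob
    if rng is None:
        rng = np.random.default_rng()
    # Each entry is kept (1) with probability keep_prob, otherwise zeroed (0).
    mask = rng.binomial(1, keep_prob, size=x.shape)
    return x * mask

x = np.ones((1000, 50))
train_out = dropout_forward(x, keep_prob=0.8, rng=np.random.default_rng(0))
infer_out = dropout_forward(x, keep_prob=0.8, inference=True)
print(train_out.mean(), infer_out.mean())
```

Both means come out near 0.8, which is the reason for the scaling at inference time.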

In [20]:
model = NeuralNetwork(
    layers=[Dense(neurons=89, 
                  activation=Tanh(),
                  weight_init="glorot",
                  dropout=0.8),
            Dense(neurons=10, 
                  activation=Linear(),
                  weight_init="glorot")],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

trainer = Trainer(model, SGDMomentum(0.2, momentum=0.9, final_lr = 0.05, decay_type='exponential'))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 50,
            eval_every = 10,
            seed=20190119,
            batch_size=60,
            early_stopping=True);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.275
Validation loss after 20 epochs is 0.234
Validation loss after 30 epochs is 0.201
Validation loss after 40 epochs is 0.192
Validation loss after 50 epochs is 0.181
The model validation accuracy is: 97.22%


## Deep learning with and without dropout

In [21]:
model = NeuralNetwork(
    layers=[Dense(neurons=178, 
                  activation=Tanh(),
                  weight_init="glorot",
                  dropout=0.8),
            Dense(neurons=46, 
                  activation=Tanh(),
                  weight_init="glorot",
                  dropout=0.8),
            Dense(neurons=10, 
                  activation=Linear(),
                  weight_init="glorot")],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

trainer = Trainer(model, SGDMomentum(0.2, momentum=0.9, final_lr = 0.05, decay_type='exponential'))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 100,
            eval_every = 10,
            seed=20190119,
            batch_size=60,
            early_stopping=True);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.352
Validation loss after 20 epochs is 0.279
Validation loss after 30 epochs is 0.265
Validation loss after 40 epochs is 0.238
Validation loss after 50 epochs is 0.216
Validation loss after 60 epochs is 0.200
Validation loss after 70 epochs is 0.196
Validation loss after 80 epochs is 0.179

Loss increased after epoch 90, final loss was 0.179, 
using the model from epoch 80
The model validation accuracy is: 97.12%


In [26]:
model = NeuralNetwork(
    layers=[Dense(neurons=178, 
                  activation=Tanh(),
                  weight_init="glorot"),
            Dense(neurons=46, 
                  activation=Tanh(),
                  weight_init="glorot"),
            Dense(neurons=10, 
                  activation=Linear(),
                  weight_init="glorot")],
    loss=SoftmaxCrossEntropy(),
    seed=20190119)

trainer = Trainer(model, SGDMomentum(0.2, momentum=0.9, final_lr = 0.05, decay_type='exponential'))
trainer.fit(X_train, train_labels, X_test, test_labels,
            epochs = 100,
            eval_every = 10,
            seed=20190119,
            batch_size=60,
            early_stopping=True);

calc_accuracy_model(model, X_test)

Validation loss after 10 epochs is 0.509
Validation loss after 20 epochs is 0.392
Validation loss after 30 epochs is 0.307
Validation loss after 40 epochs is 0.300
Validation loss after 50 epochs is 0.274
Validation loss after 60 epochs is 0.271
Validation loss after 70 epochs is 0.233

Loss increased after epoch 80, final loss was 0.233, 
using the model from epoch 70
The model validation accuracy is: 96.33%
