### Introduction to Regularization and Batching

Let's return to the MNIST dataset and try to classify it with the newer network.

### Summary of the Model

- **Input Layer:** 784 neurons (one per pixel of a 28 × 28 image)
- **Hidden Layer:** 40 neurons with a ReLU activation function
- **Output Layer:** 10 neurons (one per digit class)
- **Loss function:** sum of squared errors
- **Optimizer:** stochastic gradient descent with a learning rate of 0.005
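
As a quick sanity check on the summary above, the layer sizes fix the shapes of the two weight matrices. The sketch below assumes the network has no bias terms; the names `shapes` and `total_weights` are illustrative only.

```python
# Layer sizes taken from the model summary
pixels, hidden_size, num_labels = 784, 40, 10

# One weight matrix per pair of adjacent layers (assumption: no bias terms)
shapes = {
    "weights_0_1": (pixels, hidden_size),      # input  -> hidden
    "weights_1_2": (hidden_size, num_labels),  # hidden -> output
}

# Total number of trainable weights: 784*40 + 40*10
total_weights = sum(rows * cols for rows, cols in shapes.values())
print(shapes, total_weights)
```

With only 1,000 training images, a network of this size has far more weights than examples, which is worth keeping in mind when reading the training accuracy.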

In [5]:
import sys, numpy as np
from keras.datasets import mnist

In [6]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()



In [7]:
images, labels = (x_train[0:1000].reshape(1000, 28*28)/255, y_train[0:1000])

In [8]:
one_hot_labels = np.zeros((len(labels), 10))

In [12]:
for i,l in enumerate(labels):
    one_hot_labels[i][l] = 1
labels = one_hot_labels
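
The loop above builds the one-hot labels one row at a time. A vectorized alternative is to index an identity matrix with the label array. This is a minimal sketch; the helper name `one_hot` is hypothetical and not part of the notebook.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a 1 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    # Row k of the identity matrix is the one-hot vector for class k
    return np.eye(num_classes)[labels]

sample = one_hot([5, 0, 4])
print(sample.shape)
```

Both approaches should produce the same array; the indexing version simply avoids the Python-level loop.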

In [13]:
test_images = x_test.reshape(len(x_test), 28*28)/255
test_labels = np.zeros((len(y_test), 10))
for i, l in enumerate(y_test):
    test_labels[i][l] = 1

In [14]:
np.random.seed(1)
relu = lambda x: (x>=0) * x     # keep non-negative values, zero out the rest
relu2deriv = lambda x: x>=0     # gradient mask: True wherever the value is >= 0
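
To see what these two helpers do, it can help to run them on a tiny array. The sketch below reuses the same definitions on made-up values. One thing it illustrates: the training loop applies the mask to a layer that has already passed through ReLU, so a `>=` comparison marks every entry as active, while a strict `>` comparison (shown here as an alternative, not the notebook's choice) marks only the positive ones.

```python
import numpy as np

relu = lambda x: (x >= 0) * x
relu2deriv = lambda x: x >= 0

pre_activation = np.array([[-2.0, 0.0, 3.0]])  # made-up values for illustration
activation = relu(pre_activation)              # negative entry becomes zero

mask_ge = relu2deriv(activation)  # mask as defined above, applied post-ReLU
mask_gt = activation > 0          # strict alternative: only positive activations

print(activation, mask_ge, mask_gt)
```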

In [15]:
lr, iter, hidd_size, pixels, num_labels = (0.005, 350, 40, 784, 10)

In [16]:
weights_0_1 = 0.2*np.random.random((pixels, hidd_size)) - 0.1
weights_1_2 = 0.2*np.random.random((hidd_size, num_labels)) - 0.1

In [17]:
for j in range(iter):
    error, correct_cnt = (0.0, 0)
    for i in range(len(images)):
        # Forward pass on one image at a time (batch size 1)
        layer_0 = images[i:i+1]
        layer_1 = relu(np.dot(layer_0, weights_0_1))
        layer_2 = np.dot(layer_1, weights_1_2)

        # Accumulate the squared error and the number of correct predictions
        error += np.sum((labels[i:i+1] - layer_2)**2)
        correct_cnt += int(np.argmax(layer_2) == np.argmax(labels[i:i+1]))

        # Backward pass: propagate the output delta to the hidden layer
        layer_2_delta = (labels[i:i+1] - layer_2)
        layer_1_delta = layer_2_delta.dot(weights_1_2.T)*relu2deriv(layer_1)

        # The deltas are (label - prediction), so the updates are added
        weights_1_2 += lr * layer_1.T.dot(layer_2_delta)
        weights_0_1 += lr * layer_0.T.dot(layer_1_delta)

    # "\r" rewrites the same console line with the average error and accuracy
    sys.stdout.write("\r" + "I:"+str(j) + " Error:" + str(error/float(len(images))) + " Correct:" + str(correct_cnt/float(len(images))))

I:349 Error:0.10881979854066498 Correct:1.0
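
The numbers above are measured on the 1,000 training images only. The `test_images` and `test_labels` arrays prepared earlier can be used to check how the network does on digits it has not seen. The following is a minimal sketch of such a check; the helper name `evaluate` is hypothetical, and the tiny identity-matrix data exists only to make the block self-contained.

```python
import numpy as np

def evaluate(images, one_hot_labels, weights_0_1, weights_1_2):
    """Return (average squared error per image, accuracy) for the two-layer ReLU network."""
    layer_1 = np.maximum(0, images @ weights_0_1)  # hidden layer with ReLU
    layer_2 = layer_1 @ weights_1_2                # linear output layer
    error = np.sum((one_hot_labels - layer_2) ** 2) / len(images)
    predictions = np.argmax(layer_2, axis=1)
    targets = np.argmax(one_hot_labels, axis=1)
    return error, np.mean(predictions == targets)

# Synthetic check: identity weights reproduce identity inputs exactly
demo_error, demo_acc = evaluate(np.eye(3), np.eye(3), np.eye(3), np.eye(3))
print(demo_error, demo_acc)
```

In the notebook, the call would be `evaluate(test_images, test_labels, weights_0_1, weights_1_2)`. Comparing that result with the training accuracy shows how much the network has memorized rather than generalized.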