# Digit Classifier

This notebook uses a neural network that takes an image of a handwritten digit as input and predicts which digit it is (0 to 9).

I am building this classifier while learning the basics of deep learning and neural networks from Michael Nielsen's online book, which is linked in the README file.

## Import dependencies

In [4]:
import numpy as np
import random

## Helper functions

In [5]:
def sigmoid(z):
    """return sigma(z)"""
    return 1.0/(1.0 + np.exp(-z))
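
As a quick sanity check of the helper above, the sketch below evaluates the sigmoid at a few hand-picked points. The inputs are arbitrary illustrative values, not data from the notebook.

```python
import numpy as np

def sigmoid(z):
    """Return sigma(z) = 1 / (1 + exp(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

# illustrative inputs: a large negative value, zero, and a large positive value
z = np.array([-5.0, 0.0, 5.0])
a = sigmoid(z)
print(a)  # close to 0, exactly 0.5, and close to 1
```

The function squashes any real input into the interval (0, 1), which is why it can be used as a neuron activation.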

## The `Network` class

The following class represents a basic neural network, with attributes and methods as follows.

### Attributes
<ul>
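    <li> `size`: number of neurons in each layer </li>
    <li> `num_layers`: number of layers </li>
    <li> `biases`: list of bias arrays, one per layer after the input layer </li>
    <li> `weights`: list of weight matrices, one per pair of adjacent layers </li>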
</ul>

### Methods

<ul>
    <li> `feedforward`: compute σ(wx + b) for the neural network </li>
    <li> `update_mini_batch`: perform gradient descent and backpropagation on a mini-batch </li>
    <li> `SGD`: perform stochastic gradient descent </li>
    <li> `evaluate`: test the network on the test set </li>
</ul>
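
The `evaluate` method is left unimplemented in this notebook. One possible way to score a classifier is sketched below, under two assumptions that the notebook does not state: each test label is an integer digit, and the network output has one entry per digit. The names `count_correct` and `predict` are hypothetical.

```python
import numpy as np

def count_correct(predict, test_data):
    """Count the test examples whose highest-scoring output index equals the label.

    predict: any callable mapping an input x to a vector of scores (hypothetical).
    test_data: list of (x, label) tuples, with label an integer class index (assumed).
    """
    return sum(int(np.argmax(predict(x)) == label) for x, label in test_data)

# toy check: the "network" here just returns its input as the score vector
toy_data = [
    (np.array([0.1, 0.9, 0.0]), 1),  # argmax is 1, label is 1: correct
    (np.array([0.8, 0.1, 0.1]), 2),  # argmax is 0, label is 2: wrong
]
n_correct = count_correct(lambda x: x, toy_data)
print(f'{n_correct} / {len(toy_data)}')
```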

### Working

You can create a neural network by creating an instance of the following class:

```
    myNetwork = Network(size)
```
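
To make the role of `size` concrete, the sketch below builds the bias and weight arrays for a hypothetical `size = [784, 30, 10]`, using the same list comprehensions as the constructor. The layer sizes are assumptions chosen for illustration.

```python
import numpy as np

# hypothetical layer sizes: 784 inputs, 30 hidden neurons, 10 outputs
size = [784, 30, 10]

# same comprehensions as in the Network constructor
biases = [np.random.randn(1, l) for l in size[1:]]
weights = [np.random.randn(l, m) for m, l in zip(size[:-1], size[1:])]

print([b.shape for b in biases])   # [(1, 30), (1, 10)]
print([w.shape for w in weights])  # [(30, 784), (10, 30)]
```

Each weight matrix has one row per neuron in the receiving layer and one column per neuron in the sending layer, so there is one matrix fewer than there are layers.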

In [6]:
class Network:
    
    def __init__(self, size):
        """
        Constructor to initialize the object attributes
        Assumption: layer 0 is the input layer (so it won't have any biases)
        """
        
        self.size = size # number of neurons in each layer
        self.num_layers = len(size) # number of layers
        self.biases = [np.random.randn(1, l) for l in size[1:]] # randomly initialize the biases of each layer, starting from layer 1
        self.weights = [np.random.randn(l, m) for m, l in zip(size[:-1], size[1:])] # randomly initialize the weights between each pair of adjacent layers
        
    
    def feedforward(self, a):
        """ 
        Compute sigma(w.a + b) layer by layer, where w and b are the weights and biases of each layer.
        `a` is the input, given as a column vector of shape (size[0], 1).
        Return the output activations after going through all layers.
        """
        for w, b in zip(self.weights, self.biases):
            # biases are stored as row vectors of shape (1, l), so transpose them to columns
            a = sigmoid(np.dot(w, a) + b.T)
        return a
    
    def backpropagate(self, x, y):
        # TODO: Write the code for backpropagation
        pass
    
    def update_mini_batch(self, mini_batch, eta):
        """
        Calculate the gradients (or derivatives) for all weights and biases in the mini-batch using backpropagation.
        Apply gradient descent with the gradients averaged over the mini-batch, thereby updating the weights and biases of the network.
        
        Parameters
        ----------
        mini_batch: a list of tuples (x,y) randomly selected from the training set
        eta: the learning rate of the algorithm
        """
        
        # gradient arrays for weights and biases, initialized with zeroes
        nabla_w = [np.zeros(w.shape) for w in self.weights] 
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        
        for x,y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backpropagate(x, y) # compute derivatives
            nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
            nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        
        # update weights and biases
        m = len(mini_batch)
        self.weights = [w - (eta*nw)/m for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b - (eta*nb)/m for b, nb in zip(self.biases, nabla_b)]
        
    
    def SGD(self, train_data, epochs, mini_batch_size, eta, test_data = None):
        """ Stochastic Gradient Descent
        Aim: train the neural network using stochastic gradient descent in mini-batches
        
        Parameters
        -----------
        train_data: list of (x,y) tuples listing the inputs (x) and their desired outputs (y);
        epochs: number of complete passes over the training data;
        mini_batch_size: number of elements in the mini-batch;
        eta: learning rate;
        test_data (optional): if provided, the network will be evaluated with the test data after each epoch, and progress will be printed
        """
        
        n = len(train_data)
        
        # each epoch is one full pass over the training set
        for epoch in range(epochs):
            # pick random samples from the training set, and put them in mini-batches
            random.shuffle(train_data)
            mini_batches = [train_data[i: i+mini_batch_size] for i in range(0, n, mini_batch_size)]
            
            # perform gradient descent and back-propagation in every mini-batch
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            
            # evaluate the network on the test set, if one is provided
            if test_data:
                m = len(test_data)
                print(f'Epoch {epoch}: {self.evaluate(test_data)} / {m}')
            else:
                print(f'Epoch {epoch} complete')
    
    
    def evaluate(self, test_data):
        pass
                
        
    
    def printElements(self):
        """ Print the elements of the current instance of the neural network """
        
        print(f'This neural network has: \n{self.num_layers} number of layers, \n{self.size} neurons in each layer, \n{self.biases} as biases for each layer, and \n{self.weights} as weights for each layer.')
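
The `backpropagate` method above is still a TODO. The sketch below shows one way it could be filled in, following the standard backpropagation equations for sigmoid neurons with a quadratic cost. It is written as a standalone function and makes assumptions that differ from the class above: inputs, targets, and biases are all column vectors, and `backprop_sketch` and `sigmoid_prime` are hypothetical names.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop_sketch(weights, biases, x, y):
    """Return (nabla_b, nabla_w) for one example (x, y).

    Assumes the quadratic cost C = 0.5 * ||a_L - y||^2 and column-vector
    shapes: x is (n_in, 1), y is (n_out, 1), each bias is (l, 1).
    """
    # forward pass: remember every weighted input z and activation a
    activation = x
    activations = [x]
    zs = []
    for w, b in zip(weights, biases):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)

    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]

    # error at the output layer
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].T)

    # propagate the error backwards through the remaining layers
    for l in range(2, len(weights) + 1):
        delta = np.dot(weights[-l + 1].T, delta) * sigmoid_prime(zs[-l])
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l - 1].T)

    return nabla_b, nabla_w

# usage on a tiny [2, 3, 1] network with made-up data
rng = np.random.default_rng(0)
size = [2, 3, 1]
biases = [rng.standard_normal((l, 1)) for l in size[1:]]
weights = [rng.standard_normal((l, m)) for m, l in zip(size[:-1], size[1:])]
x = np.array([[0.5], [-0.2]])
y = np.array([[1.0]])
nabla_b, nabla_w = backprop_sketch(weights, biases, x, y)
print([nb.shape for nb in nabla_b])  # [(3, 1), (1, 1)]
print([nw.shape for nw in nabla_w])  # [(3, 2), (1, 3)]
```

The function returns the bias gradients first and the weight gradients second, matching the order in which `update_mini_batch` unpacks them.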

In [7]:
# uncomment the following lines to see how zip works
l1 = [1,2,3]
l2 = [4,5,6]
# print(f'first zip: {list(zip(l1,l2))}')
# for i in zip(l1,l2):
#     print(i)

# l = [1,2,3,4,5]
# print(f'second zip: {list(zip(l, l[1:]))}')
print(np.multiply(l1,l2))
print(list(zip(l1,l2)))

np.random.randn(1,3)

[ 4 10 18]
[(1, 4), (2, 5), (3, 6)]


array([[-0.21487966, -1.57789714,  0.01869331]])

In [9]:
myNet = Network([2,3,1])
myNet.printElements()

This neural network has: 
3 number of layers, 
[2, 3, 1] neurons in each layer, 
[array([[-1.35151069, -0.74583158,  1.3762929 ]]), array([[ 0.58943436]])] as biases for each layer, and 
[array([[ 0.0280287 ,  1.39417031],
       [ 0.86183564,  1.16321671],
       [ 0.49095898,  0.45971998]]), array([[-0.23935568, -0.26587206,  2.9051824 ]])] as weights for each layer.


In [12]:
bias = [np.zeros(b.shape) for b in myNet.biases]

In [11]:
bias

[array([[ 0.,  0.,  0.]]), array([[ 0.]])]
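
The zero-filled list above has the same shapes as the biases, which is how `nabla_b` starts out in `update_mini_batch`. The sketch below walks through the accumulate-then-update step on a single layer. The learning rate and the per-example gradients are made-up values standing in for the output of backpropagation.

```python
import numpy as np

eta = 1.0  # hypothetical learning rate

biases = [np.array([[0.5, -0.5]])]             # one layer, bias shape (1, 2)
nabla_b = [np.zeros(b.shape) for b in biases]  # gradient accumulator, same shapes

# made-up per-example bias gradients for a mini-batch of two examples
per_example_grads = [
    [np.array([[0.1, 0.2]])],
    [np.array([[0.3, 0.0]])],
]

# sum the gradients over the mini-batch
for delta_nabla_b in per_example_grads:
    nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]

# step against the averaged gradient
m = len(per_example_grads)
biases = [b - (eta * nb) / m for b, nb in zip(biases, nabla_b)]
print(biases)  # approximately [[0.3, -0.6]]
```

Dividing by `m` averages the summed gradients, so the step size does not grow with the mini-batch size.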