# Introduction

This is an intuitive guide for those who are just getting into neural networks. The objective is to go step by step through how to build a neural network, making sure each and every step is fully understandable. We will be following the book by Michael Nielsen (link to the book: http://neuralnetworksanddeeplearning.com/index.html), and will try to make the code easier to understand, since a few steps might get lost when going from the book to the code.

We will be working with the Hello World of neural networks, the MNIST dataset, which is a collection of handwritten digits along with their respective labels (0, 1, 2, ...). The network will try to correctly classify the digits, and the accuracy will be displayed and used as a measure of how well the network did.


# Necessary libraries:

If you don't have any of the necessary libraries, just run the following command in a notebook cell (drop the leading "!" if you are using a Windows/Linux terminal):
* Numpy: !pip install numpy

In [10]:
import math
import numpy as np

# First Step: Creating the network

In order to create a network, we will be taking advantage of Python's classes.

If you don't know what a class is, I suggest you take a look at the following link: https://www.w3schools.com/python/python_classes.asp

## Why use classes?

Classes are simply a convenient way to make the code easier to use, grouping related attributes into a single object.
In this manner, we are able to more easily track each attribute of the network.


### Explaining necessary attributes:

In order to create a neural network, a few attributes will be necessary:

* number of inputs (nInputs): How many input neurons the network will create. For this simple NN (Neural Network), each pixel will represent an input neuron. Each MNIST image has 28x28 pixels, so we will have 28x28 = 784 input neurons
* number of hidden layers (nHiddenLayers): How many hidden layers the network should have. For a simple NN, only 1 hidden layer will suffice, but for more complex projects, multiple hidden layers may be used (a network with multiple hidden layers is called a *deep neural network*). The purpose of the hidden layers is to try and "catch" patterns in the image, with each subsequent layer building upon the last. For example, a network with 3 hidden layers could work like the following: the first hl (hidden layer) catches the edges, the second turns the edges into shapes (like circles, squares, triangles), and the third turns the shapes into more complex shapes (like eyes, mouth, etc).
* number of neurons per hidden layer: How many neurons each hidden layer will have. This is a somewhat arbitrary choice. For this network we will be using 30 hidden neurons, simply because this is what Michael Nielsen did in his book.
* number of output neurons (nOutputs): How many output neurons the network should have. It's a good idea for each outcome to have a neuron specific to it, and since we are trying to classify 10 different digits (0 to 9), we will be using 10 output neurons
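
The Network class defined later in this notebook takes these sizes as a single list rather than as separate arguments. As a minimal sketch (the variable names below are used only for illustration and are not part of the class), the attributes above could map onto that list as follows:

```python
# Illustrative only: these variable names are not used by the Network class
nInputs = 28 * 28        # one input neuron per pixel
nHiddenLayers = 1
nHiddenNeurons = 30
nOutputs = 10            # one output neuron per digit (0 to 9)

# [input layer, hidden layer(s)..., output layer]
layers = [nInputs] + [nHiddenNeurons] * nHiddenLayers + [nOutputs]
print(layers)
```

Adding more hidden layers would simply add more entries in the middle of the list.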

Aside from those attributes which the user will give to the network, other attributes will be made based on those initial ones.
### Weights: 
The weights between two neurons from different layers. For simplicity, we are using numpy to build the matrices, with each entry drawn at random from a Gaussian (normal) distribution with mean 0 and standard deviation 1.
The structure we need to build is three-dimensional: a list with one 2D matrix per "weight layer". A 3D structure might be scary at first, but our code was structured to have an intuitive manner of accessing each component:
* The first index is which "weight layer" we are dealing with. A "weight layer" is simply the set of weights between two consecutive layers of neurons
* The second index is the neuron the weight is leaving from
* The third index is the neuron the weight is arriving at

### Bias: 
For the bias, a 2D structure will suffice. As with the weights, the biases are also initialized randomly
* The first index tells which layer the bias belongs to
* The second index tells which neuron the bias belongs to
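
The indexing described for the weights and biases can be sketched with a small standalone example. This is only an illustration: it uses numpy's `default_rng` with an arbitrary seed (an assumption made here, not what the notebook's class uses) and the [784, 30, 10] layout used later in this notebook.

```python
import numpy as np

layers = [784, 30, 10]             # same layout used later in this notebook
rng = np.random.default_rng(0)     # assumption: arbitrary seed, for illustration only

# one weight matrix and one bias vector per pair of consecutive layers
w = [rng.normal(0, 1, size=(layers[l], layers[l + 1])) for l in range(len(layers) - 1)]
b = [rng.normal(0, 1, size=layers[l + 1]) for l in range(len(layers) - 1)]

print([m.shape for m in w])   # shapes of the weight matrices
print([v.shape for v in b])   # shapes of the bias vectors

# weight leaving input neuron 5 and arriving at hidden neuron 2
print(w[0][5][2])
# bias of hidden neuron 2 (b[0] belongs to the first layer after the input)
print(b[0][2])
```

Because each "weight layer" can have a different shape, the weights are kept as a list of 2D arrays instead of a single rectangular 3D array.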

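### Activation:
For the activation of the neurons, a 2D structure will also suffice: the first index tells which layer the neuron is in, and the second tells which neuron it is

The following code will create the class with all the necessary attributes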

In [11]:
class Network:
    """
    The main object we're going to use across this notebook.
    It's a neural network that takes as input a list with the
    number of neurons in each layer
    
    Ex: [2, 3, 1] is a 3 layer network, with 2 input neurons, 3 neurons 
    in the hidden layer and 1 in the output layer
    
    It should also work with more than 3 layers, but that was not tested
    
    It initializes an object with the proper weights, biases, activations and z
    based on the layers list. It also stores the layers list and the number of layers
    
    The weights and biases are initialized following a Gaussian with mean 0
    and standard deviation 1
    """
    def __init__(self, layers: list):        
        np.random.seed(42)        
        b = []
        w = []
        a = []
        z = []
        for l in range(0, len(layers)):
            # skipping one layer for the weights and biases
            if (l+1) < len(layers):
                b.append(np.random.normal(loc=0, scale=1,size=layers[l+1]))
                w.append(np.random.normal(loc=0,scale=1,size=[layers[l],layers[l+1]]))
            a.append(np.zeros(layers[l]))
            z.append(np.zeros(layers[l]))
    
        # b[i][j] -> "i" is which layer, "j" which neuron
        # w[i][j][k] -> "i" is which layer, "j" which neuron of the first layer, "k" which neuron of the second layer
        self.b = b
        self.w = w
        self.a = a
        self.z = z
        self.nLayers = len(layers)
        self.layers = layers

# The sigmoid function
So that the activations of the neurons in the hidden layers do not explode (meaning they do not take an exaggerated value, which in turn would mean an exaggerated importance), we implement the sigmoid function. It squashes any real number into the range between 0 and 1, has a few nice traits that are of use to us, and is simple enough to be implemented easily. The book has more details about this choice (chapter 1, in the section on sigmoid neurons).

In [12]:
def sigmoid(number):
    sigNumber = 1/(1 + np.exp(-number))
    return sigNumber  
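
To get a feel for what the sigmoid does, here is a small standalone sketch that applies it to a handful of made-up values (the function is redefined so the snippet runs on its own):

```python
import numpy as np

def sigmoid(number):
    return 1 / (1 + np.exp(-number))

# made-up inputs, from very negative to very positive
values = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
squashed = sigmoid(values)
print(squashed)
```

Very negative inputs land close to 0, very positive inputs land close to 1, and an input of 0 gives exactly 0.5.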

# Feeding Forward
Now, for the bread and butter of neural networks: the feeding forward step. Feeding forward is one of the most fundamental steps of neural networks, and consists of passing information from previous layers to later layers. Combining weights, activations and biases, our neurons will be able to try and find patterns and relative importances in the input, with each passing layer finding (hopefully) increasingly complex patterns.
In practical terms, the logic is fairly simple: starting from the input layer, all neurons from the previous layer will pass information to ALL neurons of the next layer, making the iconic shape that represents neural networks.
If you don't understand (or want a nice visual explanation), I suggest this video by 3blue1brown explaining how it works: https://youtu.be/aircAruvnKk?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&t=274

## Explaining the logic of our code:
The logic is as follows:
* Starting from the input layer, find the number of neurons of the giving layer and of the receiving (next) layer
* Then, loop through every neuron of the receiving layer in the following manner: first, loop through every neuron of the giving layer, multiplying its activation by the respective weight and summing the results. Second, add the bias, found from the layer and position of the receiving neuron. Third, apply the sigmoid function to the total to get the activation of the receiving neuron.
* Finally, change layers, with the receiving layer now being the giving layer, and the next layer becoming the new receiving layer.
* Repeat until there are no more receiving layers

In [13]:
def feedForward(net: Network) -> Network:
    """
    Feedforwarding the activations to the next layer
    
    It takes as input the network, already with the input image as the activation 
    on the first layer, and then feeds forward to the next layers
    
    It returns the network with all the activations set
    """
    
    # resetting z and the activations so that no info is carried over from
    # the previous number, while keeping the activation of the first (input) layer
    for i in range(1, net.nLayers):
        net.z[i] = np.zeros(net.layers[i])
        net.a[i] = np.zeros(net.layers[i])
    for l in range(0, net.nLayers-1):
        for receivingNeuron in range(net.layers[l+1]):
            for givingNeuron in range(net.layers[l]):
                net.z[l+1][receivingNeuron] += net.a[l][givingNeuron] * net.w[l][givingNeuron][receivingNeuron]
            net.z[l+1][receivingNeuron] += net.b[l][receivingNeuron]
            net.a[l+1][receivingNeuron] = sigmoid(net.z[l+1][receivingNeuron])

            
    return net
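
The three nested loops above compute, for each layer, a matrix product followed by adding the bias and applying the sigmoid. As a sketch of an equivalent formulation that lets numpy do the looping (the function name `feedForwardVectorized` and the small stand-in arrays are assumptions made for this example, not part of the original notebook):

```python
import numpy as np

def sigmoid(number):
    return 1 / (1 + np.exp(-number))

def feedForwardVectorized(inputActivation, weights, biases):
    """Hypothetical helper: returns the list of activations of every layer."""
    activations = [inputActivation]
    for w, b in zip(weights, biases):
        # a @ w sums a[giving] * w[giving][receiving] over all giving neurons
        z = activations[-1] @ w + b
        activations.append(sigmoid(z))
    return activations

# tiny [2, 3, 1] network with fixed numbers so the result is easy to check
inputs = np.array([1.0, 0.5])
weights = [np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]),
           np.array([[1.0], [-1.0], [0.5]])]
biases = [np.array([0.0, 0.1, -0.1]), np.array([0.2])]
activations = feedForwardVectorized(inputs, weights, biases)

# the same first hidden neuron computed with an explicit loop, for comparison
loopZ = sum(inputs[j] * weights[0][j][0] for j in range(2)) + biases[0][0]
print(np.isclose(activations[1][0], sigmoid(loopZ)))
```

The explicit loops are easier to follow step by step, while the matrix form is usually much faster on a 784-input network.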
    
    

In [14]:
def setInput(net: Network, MNISTnumber):
    """
    Inputs the MNIST number into the network, since the number is a 28x28 matrix, 
    we transform it into a 784 array
    
    We also scale the pixels as to be between 0 and 1 for the sigmoid function 
    instead of 0 and 255
    
    Returns the network with the proper activations on all layers, since it 
    passes through the feedforward step
    """
    numberArr = np.asarray(MNISTnumber).flatten()
    # scaling the array so that the range is between 0 and 1
    numberArr = np.interp(numberArr, (numberArr.min(), numberArr.max()), (0, 1))
    for i in range(net.layers[0]):
        net.z[0][i] = numberArr[i]
        net.a[0][i] = numberArr[i]
    net = feedForward(net)
    
    return net
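
The flattening and scaling steps can be seen in isolation on a made-up 2x2 "image" (the pixel values below are invented for this sketch, not taken from MNIST):

```python
import numpy as np

# a made-up 2x2 "image" with pixel values in the 0-255 range
tinyImage = [[0, 255], [51, 102]]

flat = np.asarray(tinyImage).flatten()   # 2x2 matrix -> array of 4 values
# map the smallest pixel value to 0 and the largest to 1
scaled = np.interp(flat, (flat.min(), flat.max()), (0, 1))
print(flat)
print(scaled)
```

Note that the scaling uses each image's own minimum and maximum, which for MNIST images are typically 0 and 255.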

In [20]:
def classify(network: Network):
    maxIndex = np.argmax(network.a[-1])
    return maxIndex, network.a[-1][maxIndex]
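
As a small illustration of what classify does, here is the same argmax logic applied to a made-up set of output activations (the values are invented for this sketch):

```python
import numpy as np

# made-up activations of a 10-neuron output layer
outputActivations = np.array([0.05, 0.10, 0.02, 0.80, 0.01,
                              0.03, 0.20, 0.04, 0.06, 0.07])

digit = np.argmax(outputActivations)     # index of the most active neuron
certainty = outputActivations[digit]     # its activation
print(digit, certainty)
```

Since output neuron i stands for digit i, the index of the most active neuron is the network's guess.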

# Getting the MNIST Dataset
There are several ways of getting the MNIST dataset. I used the keras library just because it seemed the easiest, but feel free to choose what you prefer.
If you don't really want to look for an alternative way, use the following commands to install the keras library (keep in mind that the tensorflow library needed is fairly large):
* !pip install tensorflow
* !pip install keras

Using keras, our data will already be split between training and testing (60,000 training images and 10,000 test images). If you did it any other way, just make sure that roughly 25% of the images go to the test set

In [16]:
from keras.datasets import mnist
(trainX, trainY), (testX, testY) = mnist.load_data()

# Passing images through the network
Note that it may take a few minutes, depending on your hardware

In [24]:
net = Network([784,30,10])
hits = 0
misses = 0
for currentImage in range(10):
    # passing the inputs to our network (setInput already runs feedForward)
    net = setInput(net, trainX[currentImage])
    output, certainty = classify(net)
    if output == trainY[currentImage]:
        hits += 1
    else:
        misses += 1
    #print(f"Network Classified as : {output}, certainty: {certainty}, real number: {trainY[currentImage]}")
# accuracy is the fraction of correct guesses over all guesses
acc = hits/(hits + misses)
print(acc)

0.1
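
Accuracy is usually defined as the number of correct guesses divided by the total number of guesses. As a minimal sketch with made-up predictions and labels (not actual network output), it can also be computed with numpy in one step:

```python
import numpy as np

# made-up predictions and labels for 10 images
predictions = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4])
labels = np.array([5, 3, 4, 7, 9, 2, 0, 3, 8, 4])

hits = int(np.sum(predictions == labels))   # how many guesses match the label
accuracy = hits / len(labels)               # fraction of correct guesses
print(hits, accuracy)
```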


# Congratulations!
You now have a fully functional guess generator. Since our network does not do any learning yet, its output is no better than a random guess, and our accuracy reflects that (roughly, our network guessed 10% of the tests right). In the next notebook, we will implement learning, and our network will finally be of use!