<a href="https://colab.research.google.com/github/shashanksrajak/neural-networks-from-zero/blob/main/1_neural_net_implementation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Implementing a Neural Network from Scratch
The goal is to understand how a neural network works internally and then build one from scratch using the underlying mathematical concepts and tools.

In [2]:
import random
import math
import numpy as np

## Single Neuron
Here I use `NumPy` to vectorize the calculation, but the same thing can be done in plain Python by pairing weights and inputs with `zip`, multiplying them element-wise, and summing the products.

![Neuron](https://cs231n.github.io/assets/nn1/neuron_model.jpeg)
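
The plain-Python alternative mentioned above can be sketched as follows. The helper name `weighted_sum` and the sample values are made up for illustration. The point is only that `zip` pairs each weight with its input and the result agrees with `np.dot`.

```python
import numpy as np

def weighted_sum(w, x, b):
  """Compute z = w . x + b in plain Python (hypothetical helper, not part of the notebook)."""
  # zip pairs each weight with its input; sum adds up the element-wise products
  return sum(wi * xi for wi, xi in zip(w, x)) + b

# illustrative values, not taken from the notebook
w = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 3.0]
b = 0.25

z_python = weighted_sum(w, x, b)                 # 0.5 - 2.0 + 6.0 + 0.25
z_numpy = np.dot(np.array(w), np.array(x)) + b   # same value via NumPy
```

Both versions compute the same `z`. The NumPy form mainly pays off once the inputs grow large or are batched.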

In [22]:
def sigmoid(z):
  return 1/(1+math.exp(-z))

In [39]:
class Neuron:
  def __init__(self, nin):
    """
    A single neuron
    nin : number of inputs to this neuron
    """
    self.w = np.array([random.uniform(-1, 1) for _ in range(nin)]) # weights
    self.b = random.uniform(-1, 1) # bias term

  def __call__(self, x):
    z = np.dot(self.w, x) + self.b # z = wx + b
    a = sigmoid(z)   # we are using sigmoid function but this can also be a user choice

    self.out = a
    return a # scalar value; one neuron outputs one scalar
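
The comment in `__call__` notes that the activation could be left to the user. One way to sketch that is to pass the function in when the neuron is created. The class name `FlexibleNeuron`, the `activation` argument, and the `relu` helper are assumptions for illustration and are not part of the notebook.

```python
import math
import random
import numpy as np

def sigmoid(z):
  return 1 / (1 + math.exp(-z))

def relu(z):
  # ReLU: pass positive values through, clamp negatives to zero
  return max(0.0, z)

class FlexibleNeuron:
  """Hypothetical variant of Neuron whose activation is chosen by the caller."""
  def __init__(self, nin, activation=sigmoid):
    self.w = np.array([random.uniform(-1, 1) for _ in range(nin)])  # weights
    self.b = random.uniform(-1, 1)                                  # bias term
    self.activation = activation                                    # any scalar -> scalar function

  def __call__(self, x):
    z = np.dot(self.w, x) + self.b
    self.out = self.activation(z)
    return self.out

x = np.array([5, 10, 1])
n_sigmoid = FlexibleNeuron(3)                   # falls back to sigmoid
n_relu = FlexibleNeuron(3, activation=relu)     # same neuron structure, different activation
out_sigmoid = n_sigmoid(x)
out_relu = n_relu(x)
```

Keeping sigmoid as the default means existing calls such as `FlexibleNeuron(3)` behave like the original `Neuron(3)`.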

In [8]:
n = Neuron(3)

In [13]:
# sample input
x = np.array([random.randint(1, 10) for _ in range(3)])
x

array([ 5, 10,  1])

In [14]:
n(x)

0.9878307196697438

In [15]:
n.w

array([-0.77995667,  0.94358218, -0.87519861])

In [16]:
n.b

-0.26424324169995606

In [17]:
n.out

0.9878307196697438

## Layer of Neurons

Now that we have a single neuron, we can stack several of them to build a `Layer`. Every neuron in the layer receives the same input and computes its own output from its own `w` and `b` parameters.

![layer](https://miro.medium.com/v2/resize:fit:1200/1*zdVSuUJMnW_HQiVcRzevTQ.png)


In [40]:
class Layer:
  def __init__(self, nin, n) -> None:
    """
    nin : number of inputs to this layer
    n : number of neurons in this layer
    """
    self.neurons = [Neuron(nin) for _ in range(n)]

  def __call__(self, x):
    outs = [neuron(x) for neuron in self.neurons]
    return outs

In [32]:
l = Layer(3, 2)

In [33]:
l.neurons

[<__main__.Neuron at 0x793f5b92d610>, <__main__.Neuron at 0x793f5b92cb90>]

In [37]:
l(x)

[0.5002274224377617, 0.7246384841220469]

## Batch Training
Now let's say we want to pass a batch of training examples through the layer. One way is to loop over the examples and call the layer on each one. Another way is vectorization, which we will see afterwards.

In [43]:
X = np.array([[2, 4, 5], [3, 5, 7], [1, 3, 6], [4, 8, 10]]) # 4 training examples
X

array([[ 2,  4,  5],
       [ 3,  5,  7],
       [ 1,  3,  6],
       [ 4,  8, 10]])

Let's create a `Layer` of 2 neurons so we get 2 outputs per example.

In [44]:
layer = Layer(3, 2)

In [46]:
outs = [layer(example) for example in X]
outs

[[0.22398334825436683, 0.9887969329375543],
 [0.1342579535452807, 0.9986753763349573],
 [0.2851572048063245, 0.9940152356750908],
 [0.14833353781105948, 0.9999216820387337]]

So far we have been iterating with Python loops, but it is time to use matrices so that the whole batch and the whole layer are computed in one efficient, vectorized operation.
1. Represent inputs as a matrix: Instead of processing each training example individually, stack them into a single input matrix X, where each row represents a training example.

2. Represent weights as a matrix: Similarly, you can arrange the weights of all neurons in a layer into a single weight matrix W, where each column corresponds to a neuron's weights.

3. Represent biases as a vector: The biases for all neurons in a layer can be stored in a bias vector b.

4. Matrix multiplication for wx: The wx part of the calculation (wx + b) for all neurons and all training examples can be performed efficiently using matrix multiplication. The result will be a matrix where each element represents the wx value for a specific training example and a specific neuron.

5. Broadcasting for + b: The bias vector b can be added to the result of the matrix multiplication using broadcasting, which will add the bias to the corresponding wx value for each neuron across all training examples.
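
The five steps above can be checked on a tiny example. The numbers below are made up for illustration (2 examples, 3 features, 2 neurons). The sketch only shows the shapes involved and that the matrix form gives the same `wx + b` values as explicit loops.

```python
import numpy as np

# illustrative values: 2 examples, 3 features, 2 neurons
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # shape (2, 3): one row per training example
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # shape (3, 2): one column per neuron
b = np.array([0.5, -0.5])         # shape (2,): one bias per neuron

# (2, 3) @ (3, 2) -> (2, 2); b is broadcast across the rows, so every
# example gets the same per-neuron bias added
Z = X @ W + b

# the same computation with explicit loops, for comparison
Z_loop = np.array([[np.dot(X[i], W[:, j]) + b[j] for j in range(W.shape[1])]
                   for i in range(X.shape[0])])
```

Entry `Z[i, j]` is the pre-activation of neuron `j` for example `i`, which is why the weights have to sit in the columns of `W`.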

We will now reimplement the `Layer` class to use vectorization and leave the `Neuron` class as it is.

We also redefine the `sigmoid` function to use `np.exp` so that it works element-wise on a whole array.

In [83]:
def sigmoid(z):
  return 1/(1+np.exp(-z))

class Layer:
  def __init__(self, nin, n) -> None:
    """
    nin : number of inputs to this layer
    n : number of neurons in this layer
    """
    self.neurons = [Neuron(nin) for _ in range(n)]
    neuron_ws = np.array([neuron.w for neuron in self.neurons])

    self.W = neuron_ws.T
    self.b = np.array([neuron.b for neuron in self.neurons])

  def __call__(self, X):
    """
    X : input matrix
    """
    # outs = [neuron(x) for neuron in self.neurons]
    # instead of this iteration and using call method of neuron we will use mat mul here
    Z = np.matmul(X, self.W) + self.b

    # outs = [[sigmoid(z) for z in row] for row in Z]
    # instead of this list comprehension we can directly call sigmoid because that is vectorized now
    outs = sigmoid(Z)

    return outs

In [84]:
layer = Layer(3, 2)

In [85]:
layer.W # col1 is w for neuron 1 and col2 is w for neuron 2

array([[-0.54303226,  0.47411194],
       [-0.95306267, -0.35183705],
       [ 0.43838726, -0.32361105]])

In [86]:
layer(X)

array([[0.03401578, 0.21426184],
       [0.0186032 , 0.13891009],
       [0.19594388, 0.14863963],
       [0.00234598, 0.03303469]])
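
As a sanity check, the vectorized layer should give the same numbers as calling each neuron on each example. The sketch below re-declares compact copies of `sigmoid`, `Neuron`, and `Layer` so that it runs on its own. The variable names `vectorized`, `looped`, and `matches` are introduced here for illustration only.

```python
import random
import numpy as np

def sigmoid(z):
  return 1 / (1 + np.exp(-z))

class Neuron:
  def __init__(self, nin):
    self.w = np.array([random.uniform(-1, 1) for _ in range(nin)])
    self.b = random.uniform(-1, 1)

  def __call__(self, x):
    return sigmoid(np.dot(self.w, x) + self.b)

class Layer:
  def __init__(self, nin, n):
    self.neurons = [Neuron(nin) for _ in range(n)]
    self.W = np.array([neuron.w for neuron in self.neurons]).T  # shape (nin, n)
    self.b = np.array([neuron.b for neuron in self.neurons])    # shape (n,)

  def __call__(self, X):
    return sigmoid(np.matmul(X, self.W) + self.b)

X = np.array([[2, 4, 5], [3, 5, 7], [1, 3, 6], [4, 8, 10]])  # 4 training examples
layer = Layer(3, 2)

vectorized = layer(X)  # one matrix multiplication for the whole batch
# per-example, per-neuron loop using the very same weights and biases
looped = np.array([[neuron(x) for neuron in layer.neurons] for x in X])
matches = np.allclose(vectorized, looped)
```

Because `W` and `b` are built from the neurons' own parameters, the two paths should agree up to floating-point rounding, which is what `np.allclose` tests.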