##### Following the series: https://www.youtube.com/playlist?list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3

# Part 1 - Neuron Code


In [1]:
import sys
import numpy as np
import matplotlib

Neuron Computation = $\sum_{i=1}^{n} w_i x_i + b$

$n$ is the number of inputs coming into the neuron

$w_i$ is the ith weight associated with the ith input

$x_i$ is the ith input into the neuron

$b$ is a bias term

In [2]:
# Raw Python: example of simple neuron computation
inputs = [1.2,5.3,2.1]
weights = [3.1,2.1,9.4]
bias = 3

output = inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + bias
output

37.59
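The hard-coded sum above only works for exactly three inputs. As a minimal sketch (the function name `neuron_output` is my own, not from the series), the same computation can be written for any number of inputs:

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Pair each input with its weight, multiply, and add everything up
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Same numbers as the cell above
result = neuron_output([1.2, 5.3, 2.1], [3.1, 2.1, 9.4], 3)
print(round(result, 2))  # 37.59
```

Rounding is only there to hide floating-point noise in the printed value.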

# Part 2 - Coding a Layer


<img src="img/net1.png" style="width:200px;height:200px;"/>

In [5]:
# Raw Python: Coding a layer with 3 neurons and 4 inputs
## Don't worry, we will use numpy later to vectorize all of this computation (sorry if this is cringy rn)
inputs = [1,2,3,2.5]

weights1 = [0.2,0.8,-0.5,1]
weights2 = [0.5,-0.91,0.26,-0.5]
weights3 = [-0.26,-0.27,.17,.87]

bias1 = 2
bias2 = 3
bias3 = 0.5

# Notice that outputs is a vector (list) of length 3 since we are coding a layer with 3 neurons
outputs = [0,0,0]

outputs[0] = inputs[0]*weights1[0] + inputs[1]*weights1[1] + inputs[2]*weights1[2] + inputs[3]*weights1[3] + bias1
outputs[1] = inputs[0]*weights2[0] + inputs[1]*weights2[1] + inputs[2]*weights2[2] + inputs[3]*weights2[3] + bias2
outputs[2] = inputs[0]*weights3[0] + inputs[1]*weights3[1] + inputs[2]*weights3[2] + inputs[3]*weights3[3] + bias3

outputs

[4.8, 1.21, 2.385]

# Part 3 - The Dot Product

### Some dimensionality
Array: $l = [1,5,6,2]$, Shape: $(4,)$, Type: 1D array/vector

Array: $lol = [[1,5,6,2],[3,2,1,3]]$, Shape: $(2,4)$, Type: 2D array/Matrix

Array: $lolol = [[[1,5,6,2],[3,2,1,3]],[[5,2,1,2],[6,4,8,4]],[[2,8,5,3],[1,1,9,4]]] $, Shape: $(3,2,4)$, Type: 3D Array/3-Tensor
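To check the shapes listed above, here is a small sketch that builds the same three arrays in numpy and prints their `shape` and `ndim` attributes:

```python
import numpy as np

l = np.array([1, 5, 6, 2])
lol = np.array([[1, 5, 6, 2], [3, 2, 1, 3]])
lolol = np.array([[[1, 5, 6, 2], [3, 2, 1, 3]],
                  [[5, 2, 1, 2], [6, 4, 8, 4]],
                  [[2, 8, 5, 3], [1, 1, 9, 4]]])

# shape lists the size along each axis; ndim is the number of axes
for name, arr in [("l", l), ("lol", lol), ("lolol", lolol)]:
    print(name, arr.shape, arr.ndim)
```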

### Definition
Let $u = [u_1,u_2,...,u_n]$ and $v = [v_1,v_2,...,v_n]$ be vectors in $\mathbb{R}^n$. The dot product of $u$ and $v$ is defined as $u \cdot v = \sum_{i=1}^{n} u_i v_i$
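As a quick sanity check of the definition, the sketch below implements the sum directly (the helper name `dot` is my own) and compares it against `np.dot` on the first neuron's inputs and weights:

```python
import numpy as np

def dot(u, v):
    """Dot product of two equal-length vectors, straight from the definition."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

u = [1, 2, 3, 2.5]          # inputs
v = [0.2, 0.8, -0.5, 1.0]   # weights of the first neuron

manual = dot(u, v)
with_numpy = float(np.dot(u, v))
print(manual, with_numpy)   # both are approximately 2.8
```

Adding the first neuron's bias of 2 to this value gives the 4.8 seen in the layer output.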
              

In [5]:
# Inputs
inputs = [1,2,3,2.5]

# Weight matrix
weights = [[0.2,0.8,-0.5,1],
           [0.5,-0.91,0.26,-0.5],
           [-0.26,-0.27,.17,.87]]

# Vector of biases
biases = [2,3,0.5]

layer_outputs = [] # Output of current layer

# Iterate through each row of the weight matrix and each entry in the bias vector
for neuron_weights, neuron_bias in zip(weights,biases):
    neuron_output = 0 # Output of the neuron
    # Iterate through the inputs and the weights in the current row
    for n_input, weight in zip(inputs, neuron_weights):
        neuron_output += n_input*weight # Multiply input by associated weight
    neuron_output += neuron_bias # Add bias
    layer_outputs.append(neuron_output)
print(layer_outputs)

[4.8, 1.21, 2.385]


In [8]:
# Numpy implementation
# Inputs
inputs = [1,2,3,2.5]

# Weight matrix
weights = [[0.2,0.8,-0.5,1],
           [0.5,-0.91,0.26,-0.5],
           [-0.26,-0.27,.17,.87]]

# Vector of biases
biases = [2,3,0.5]

output = np.dot(weights,inputs) + biases
output

array([4.8  , 1.21 , 2.385])

# Part 4 - Batches, Layers and Objects

- Batches allow us to train networks in a parallel fashion. We can show a network multiple samples at a time, which can help it to generalize better. Imagine trying to fit a line to one data point at a time: the fitted line would fluctuate dramatically with each new data point. If we show all the samples at once, however, this can cause the network to overfit to the training data. A batch size of 32 is pretty common.
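To make the idea of a batch concrete, here is a minimal sketch of slicing a dataset into batches of 32. The helper `iterate_batches` and the random data are made up for illustration and are not part of the series:

```python
import numpy as np

def iterate_batches(X, batch_size=32):
    """Yield consecutive slices of X with at most batch_size rows each."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))  # 100 made-up samples with 4 features each

batch_shapes = [batch.shape for batch in iterate_batches(X, batch_size=32)]
print(batch_shapes)  # [(32, 4), (32, 4), (32, 4), (4, 4)]
```

The last batch is smaller because 100 is not a multiple of 32.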

In [5]:
# Numpy implementation
# Inputs
inputs = [[1,2,3,2.5],
          [2.0,5.0,1.0,2.0],
          [-1.5,2.7,3.3,-0.8]] # Each row in inputs is an individual training example

# Weight matrix
# Layer 1
weights = [[0.2,0.8,-0.5,1],
           [0.5,-0.91,0.26,-0.5],
           [-0.26,-0.27,.17,.87]]
# Layer 2
weights2 = [[0.1,-0.14,0.5],
           [-0.5,0.12,-0.33],
           [-0.44,0.73,-0.13]]

# Vector of biases
biases = [2,3,0.5]
biases2 = [-1,2,-0.5]

layer1_outputs = np.dot(inputs,np.array(weights).T) + biases
layer2_outputs = np.dot(layer1_outputs,np.array(weights2).T) + biases2
layer2_outputs

array([[ 0.5031 , -1.04185, -2.03875],
       [ 0.2406 , -2.283  , -4.9879 ],
       [-0.99314,  1.41254, -0.35655]])

## General feedforward computation in matrix form (without activation function)
#### For first layer:
$O_1 = XW^T + b$

$X$ - batch_size x num_features matrix

$W$ - num_output_neurons x num_features matrix (hence $W^T$ is a num_features x num_output_neurons matrix)

$b$ - 1 x num_output_neurons vector (one bias per neuron, added to every row of $XW^T$)

$O_1$ - batch_size x num_output_neurons matrix

What is the intuition? Each row of $X$ is an individual observation from the training data that contains the values of each feature for that observation. For instance, each feature of a row could represent the gray-scale value of an individual pixel of an image. When we perform the matrix computation $XW^T + b$, for each row in the matrix $X$, we are multiplying the value of each feature by its corresponding weight in the respective column of $W^T$, summing these values, and adding a bias term. This results in the output matrix $O_1$, whose entries can be defined as follows:

$O_{ij} = $ the output of neuron $j$ for the $i^{th}$ sample in the batch
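To see that the matrix form really computes one neuron output per entry, the sketch below reuses the inputs, layer 1 weights, and biases from the batch cell above and compares $XW^T + b$ with an explicit double loop:

```python
import numpy as np

X = np.array([[1, 2, 3, 2.5],
              [2.0, 5.0, 1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])      # batch_size x num_features
W = np.array([[0.2, 0.8, -0.5, 1],
              [0.5, -0.91, 0.26, -0.5],
              [-0.26, -0.27, 0.17, 0.87]])  # num_output_neurons x num_features
b = np.array([2, 3, 0.5])                   # one bias per neuron

# Matrix form: numpy broadcasts b across every row of X @ W.T
O = X @ W.T + b

# Entry by entry: O[i, j] is row i of X dotted with row j of W, plus b[j]
O_loop = np.zeros((X.shape[0], W.shape[0]))
for i in range(X.shape[0]):
    for j in range(W.shape[0]):
        O_loop[i, j] = np.dot(X[i], W[j]) + b[j]

print(O.shape)                 # (3, 3)
print(np.allclose(O, O_loop))  # True
```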

In [11]:
# Layer object
np.random.seed(0)

# Training data X: a batch_size x n_features matrix
X = [[1,2,3,2.5],
    [2.0,5.0,1.0,2.0],
    [-1.5,2.7,3.3,-0.8]]



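class Layer_Dense:
    def __init__(self,n_inputs,n_neurons):
        # Weights are stored as n_inputs x n_neurons, so forward() needs no transpose
        self.weights = 0.1*np.random.randn(n_inputs,n_neurons)
        # One bias per neuron, initialized to zero
        self.biases = np.zeros((1,n_neurons))
    def forward(self,inputs):
        # inputs: batch_size x n_inputs, output: batch_size x n_neurons
        self.output = np.dot(inputs,self.weights) + self.biases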
        
layer1 = Layer_Dense(n_inputs=4,n_neurons=5)
layer2 = Layer_Dense(n_inputs=5,n_neurons=2)

layer1.forward(X)
layer2.forward(layer1.output)
layer2.output

array([[ 0.148296  , -0.08397602],
       [ 0.20705646, -0.04265608],
       [ 0.20124979, -0.07290616]])
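As a rough check of how shapes flow through the stacked layers, the sketch below repeats the `Layer_Dense` class from the cell above so it runs on its own, then prints the weight and output shapes of each layer:

```python
import numpy as np

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases

np.random.seed(0)

X = [[1, 2, 3, 2.5],
     [2.0, 5.0, 1.0, 2.0],
     [-1.5, 2.7, 3.3, -0.8]]

layer1 = Layer_Dense(n_inputs=4, n_neurons=5)
layer2 = Layer_Dense(n_inputs=5, n_neurons=2)

layer1.forward(X)
layer2.forward(layer1.output)

# (batch_size x n_inputs) times (n_inputs x n_neurons) gives (batch_size x n_neurons)
print(layer1.weights.shape, layer1.output.shape)  # (4, 5) (3, 5)
print(layer2.weights.shape, layer2.output.shape)  # (5, 2) (3, 2)
```

The number of neurons in one layer has to match `n_inputs` of the next, otherwise `np.dot` raises a shape error.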