# Chapter 2
## Coding Our First Neurons

## A Single Neuron

In [2]:
# inputs
inputs = [1, 2, 3]
# weights associated with the inputs
weights = [0.2, 0.8, -0.5] 
# bias - need one for each neuron (tunable parameter)
bias = 2

In [3]:
output = (inputs[0] * weights[0] +
          inputs[1] * weights[1] +
          inputs[2] * weights[2] + bias)
print(f"Output: {output}")

Output: 2.3


In [5]:
# We have 4 inputs now

inputs = [1, 2, 3, 2.5]
# need to add the associated weight for this 4th input
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2 # stays the same
output = (inputs[0] * weights[0] +
          inputs[1] * weights[1] +
          inputs[2] * weights[2] +
          inputs[3] * weights[3] + bias)
print(f"Output: {output}")


Output: 4.8
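
The two cells above write out every product by hand, so each new input means another line of code. A minimal sketch of the same calculation for any number of inputs, assuming a hypothetical helper named `neuron_output` (it is not part of the original notebook):

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (hypothetical helper)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # multiply each input by its weight, add the products, then add the bias
    return sum(i * w for i, w in zip(inputs, weights)) + bias


three_inputs = neuron_output([1, 2, 3], [0.2, 0.8, -0.5], 2)
four_inputs = neuron_output([1, 2, 3, 2.5], [0.2, 0.8, -0.5, 1.0], 2)
print(f"3 inputs: {three_inputs:.1f}")
print(f"4 inputs: {four_inputs:.1f}")
```

The values are rounded to one decimal place when printed because floating-point sums can carry tiny rounding errors.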


## A Layer of Neurons

- Each neuron in a layer (layer 3) takes the exact same inputs
- inputs can either be from training data or from the previous layer
![image.png](references/layer.png)

In [1]:
# we will use 1st neuron and add 2 more neurons for layer

# each neuron gets the same inputs
inputs = [1, 2, 3, 2.5]

# each neuron has its own weights
weights1 = [0.2, 0.8, -0.5, 1.0]
weights2 = [0.5, -0.91, 0.26, -0.5]
weights3 = [-0.26, -0.27, 0.17, 0.87]

# each neuron has its own bias
bias1 = 2
bias2 = 3
bias3 = 0.5

outputs = [
    # Neuron 1:
    inputs[0] * weights1[0] +
    inputs[1] * weights1[1] +
    inputs[2] * weights1[2] +
    inputs[3] * weights1[3] + bias1,
    # Neuron 2:
    inputs[0] * weights2[0] +
    inputs[1] * weights2[1] +
    inputs[2] * weights2[2] + 
    inputs[3] * weights2[3] + bias2,
    # Neuron 3:
    inputs[0] * weights3[0] + 
    inputs[1] * weights3[1] + 
    inputs[2] * weights3[2] + 
    inputs[3] * weights3[3] + bias3]

print(f"Output: {outputs}")

Output: [4.8, 1.21, 2.385]


- Each neuron is connected to the same inputs
- The only difference is the weights and biases that each neuron applies to the inputs
- This is a fully connected neural network: every neuron in the current layer has connections to every neuron from the previous layer

In [10]:
# Using Loops to Scale Layers and inputs sizes

inputs = [1, 2, 3, 2.5]
# change weights to one list
weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
# change biases to list
biases = [2, 3, 0.5]

layer_outputs = []
for neuron_weights, neuron_bias in zip(weights, biases):
    neuron_output = 0
    for n_input, weight in zip(inputs, neuron_weights):
        neuron_output += n_input * weight
    neuron_output += neuron_bias
    layer_outputs.append(neuron_output)
print(f"Layer Outputs: {layer_outputs}")

Layer Outputs: [4.8, 1.21, 2.385]


## Tensors, Arrays, and Vectors

What are tensors?
- tensors are a generalization of vectors and matrices
- tensors are closely related to arrays
- subtle differences between tensor/array/matrix

Lists
- comma-separated values between square brackets

In [11]:
# simple list
l = [1,5,2,3]

# list of lists
lol = [[1,5,6,2],
       [3,2,1,3]]

# list of lists of lists
lolol = [[[1,5,6,2],
          [3,2,1,3]],
         [[5,2,1,2],
          [6,4,8,4]],
         [[2,8,5,3],
          [1,1,4,2]]]

# Everything could also be an array 

In [None]:
# this cannot be an array because it's not homologous
# a list is homologous if each list along a dimension is identically long
# (here the two rows have 3 and 2 elements)

another_list_of_lists = [[4,2,3],
                         [5,1]]
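
The rule in the comments above can be checked in plain Python. This is a rough sketch, assuming a hypothetical helper called `shape_of` (not a NumPy function) that returns the shape of a nested list, or `None` when the lists along some dimension differ in length:

```python
def shape_of(nested):
    """Return the shape of a nested list as a tuple, or None if it is ragged.

    Hypothetical helper for illustration only.
    """
    if not isinstance(nested, list):
        return ()  # a plain number has no dimensions
    child_shapes = {shape_of(item) for item in nested}
    if len(child_shapes) != 1 or None in child_shapes:
        return None  # children disagree on shape (or the list is empty)
    return (len(nested),) + child_shapes.pop()


print(shape_of([1, 5, 2, 3]))                    # (4,)
print(shape_of([[1, 5, 6, 2], [3, 2, 1, 3]]))    # (2, 4)
print(shape_of([[4, 2, 3], [5, 1]]))             # None
```

Recent NumPy releases also refuse to build a regular numeric array from a list like `another_list_of_lists`, which is the practical consequence of this rule.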

Matrix
- rectangular array (columns and rows)
- always 2-dimensional array

Can all arrays be matrices?
- No, not all arrays are matrices
- matrices are always 2-dimensional
- An array can be far more than just columns and rows
- Arrays can have many more than 2 dimensions (NumPy, for example, has historically allowed up to 32)

In [12]:
# valid matrix (because it has columns and rows)
# which automatically means it could also be an array

list_matrix_array = [[4,2],
                     [5,1],
                     [8,2]]
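
To make the matrix versus array distinction concrete, here is a short sketch that asks NumPy for the shape and number of dimensions. The variable names `matrix` and `cube` are chosen for this example only:

```python
import numpy as np

list_matrix_array = [[4, 2],
                     [5, 1],
                     [8, 2]]

matrix = np.array(list_matrix_array)
print(matrix.shape)  # (3, 2): 3 rows, 2 columns
print(matrix.ndim)   # 2, so it qualifies as a matrix

# a 3-dimensional array: 3 blocks, each with 2 rows and 4 columns
cube = np.array([[[1, 5, 6, 2], [3, 2, 1, 3]],
                 [[5, 2, 1, 2], [6, 4, 8, 4]],
                 [[2, 8, 5, 3], [1, 1, 4, 2]]])
print(cube.shape)    # (3, 2, 4)
print(cube.ndim)     # 3, so it is an array but not a matrix
```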

What's a Tensor?
- A tensor object is an object that can be represented as an array

What is an Array?
- an ordered homologous container for numbers

What is a Vector?
- A list in python
- 1-dimensional array in Numpy
- this differs from the physics perspective, where a vector is a quantity with magnitude and direction

## Dot Product and Vector Addition

Vector Multiplication
- our previous python implementation of multiplying weight times input can be done using dot product
- there are 2 ways of multiplying two vectors
    - dot product
    - cross product

Dot Product
- dot product is a mathematical operation that takes two equal-length vectors and returns a single number, a scalar

Dot Product Multiplication
- Equation: a · b = a1b1 + a2b2 + a3b3 + ... + anbn
- both vectors have to be the same size

In [4]:
# Example of dot product multiplication

a = [1, 2, 3] # think of this as neuron 1 inputs
b = [2, 3, 4] # think of this as neuron 1 weights

dot_product = a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
print(f"dot product multiplication: {dot_product}")

dot product multiplication: 20
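
The same result can be obtained with NumPy's `np.dot`, which also enforces the equal-size rule. A brief sketch; the flag name `size_mismatch_rejected` is made up for this example:

```python
import numpy as np

a = [1, 2, 3]
b = [2, 3, 4]

result = np.dot(a, b)   # 1*2 + 2*3 + 3*4
print(f"np.dot result: {result}")

# vectors of different sizes cannot be multiplied this way
try:
    np.dot(a, [2, 3])
    size_mismatch_rejected = False
except ValueError as error:
    size_mismatch_rejected = True
    print(f"ValueError: {error}")
```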


Vector Addition 
- two vectors are added element-wise, so both vectors must be the same size
- the result will be a vector of the same size as the input vectors
- Equation: a + b = [a1 + b1, a2 + b2, a3 + b3, ... , an + bn]
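
A small sketch of vector addition, first with plain lists and then with NumPy arrays. Note that `+` on two Python lists concatenates them rather than adding their elements, which is why the list version pairs the elements up with `zip`:

```python
import numpy as np

a = [1, 2, 3]
b = [2, 3, 4]

# plain Python: add matching elements pair by pair
added_with_lists = [x + y for x, y in zip(a, b)]

# NumPy: the + operator on arrays is already element-wise
added_with_numpy = np.array(a) + np.array(b)

print(added_with_lists)   # [3, 5, 7]
print(added_with_numpy)   # [3 5 7]
print(a + b)              # [1, 2, 3, 2, 3, 4] (concatenation, not addition)
```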

## A Single Neuron with Numpy

In [8]:
import numpy as np

# all for one neuron
inputs = [1.0, 2.0, 3.0, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2.0

# np.dot() does the dot product multiplication for us
# what's going on: (0.2*1.0 + 0.8*2.0 + -0.5*3.0 + 1.0*2.5) + 2.0
outputs = np.dot(weights, inputs) + bias
print(f"Single Neuron Output: {outputs}")

Single Neuron Output: 4.8


## A Layer of Neurons with Numpy

In [11]:
import numpy as np

inputs = [1.0, 2.0, 3.0, 2.5]
# weights for 3 different neurons
weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
# biases for 3 different neurons
biases = [2.0, 3.0, 0.5]

layer_outputs = np.dot(weights, inputs) + biases
print(f"Layer Outputs (3 Neurons): {layer_outputs}")

Layer Outputs (3 Neurons): [4.8   1.21  2.385]


In [24]:
inputs = [1.0, 2.0, 3.0, 2.5]
# weights for 3 different neurons
weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
# biases for 3 different neurons
biases = [2.0, 3.0, 0.5]
print(f"inputs shape: {np.shape(inputs)}")
print(f"weights shape: {np.shape(weights)}")

layer = np.dot(weights, inputs)
layer_with_bias = np.dot(weights, inputs) + biases
print(f"Layer (without Bias): {layer}") 
print(f"Layer (with Bias): {layer_with_bias}\n")

inputs shape: (4,)
weights shape: (3, 4)
Layer (without Bias): [ 2.8   -1.79   1.885]
Layer (with Bias): [4.8   1.21  2.385]



## A Batch of Data

- when training, we generally train on batches of data (multiple samples at a time)
- 2 reasons why:
    - 1. it is faster to train on batches because the samples can be processed in parallel
    - 2. batches help the model generalize
        - one sample at a time: if you fit the model to one sample at a time, each update is biased towards that single sample
        - multiple samples at a time: the model makes more general tweaks to the weights and biases because each update reflects several samples (an average over them)

![image.png](references/batches.png)

Amazing Animation of why we use batches
https://nnfs.io/vyu/

In [None]:
# example batch of data
batch = [[1, 5, 2, 1],
         [7, 3, 5, 2],
         [1, 2, 3, 4],
         [5, 6, 8, 9]]

## Matrix Product

Matrix Product: an operation in which we take the dot product of each row of the first matrix with each column of the second matrix
- the size of the second dimension of the left matrix must match the size of the first dimension of the right matrix
- Left Matrix: (5,4)
- Right Matrix: (4,7) # 4 == 4
- the resulting matrix has the size of the left matrix's first dimension and the right matrix's second dimension ---> (5,7)
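
The shape rule can be tried out directly. In this sketch the matrices are filled with placeholder integers from `np.arange`; only the shapes (5,4) and (4,7) come from the notes above:

```python
import numpy as np

left = np.arange(20).reshape(5, 4)    # shape (5, 4), filler values 0..19
right = np.arange(28).reshape(4, 7)   # shape (4, 7), filler values 0..27

product = np.dot(left, right)
print(product.shape)   # (5, 7)

# each entry is the dot product of one row of left with one column of right
row, col = 2, 3
entry = np.dot(left[row, :], right[:, col])
print(entry == product[row, col])   # True
```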

Really good animation of how a matrix product is calculated: https://nnfs.io/jei/

![image.png](references/matrix_product.png)

## Transposition for the Matrix Product

Transposition: modifies a matrix so that the rows become columns and the columns become rows
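
A quick sketch of transposition using the `.T` attribute of a NumPy array, applied to the weights used throughout this chapter:

```python
import numpy as np

weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5],
                    [-0.26, -0.27, 0.17, 0.87]])

transposed = weights.T
print(weights.shape)      # (3, 4)
print(transposed.shape)   # (4, 3)

# row 0 of the original is column 0 of the transpose
print(transposed[:, 0])
```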

## A Layer of Neurons and Batch of Data w/ Numpy

Video: Neural Networks from Scratch - P.4 Batches, Layers, and Objects
https://www.youtube.com/watch?v=TEWy9vZcxW4&list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3&t=91 

- batch size matters a lot
- a larger batch size generally helps the model generalize, because each update is based on more samples
- think of it as a human: you want as many inputs as possible before making a decision, rather than seeing one input at a time
- the reason we don't just feed in the entire dataset at once is that the model will tend to fit all of that data too closely (i.e. overfitting)
- we want a batch size that is large enough for the model to generalize but small enough that it does not overfit

In [1]:
import numpy as np

# this is a single input with 4 features
inputs = [1, 2, 3, 2.5]
weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]

biases = [2, 3, 0.5]

output = np.dot(weights, inputs) + biases
print(f"Output: {output}")

Output: [4.8   1.21  2.385]


In [2]:
import numpy as np

# multiple inputs now
inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]

biases = [2, 3, 0.5]

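# this line raises a ValueError: both arrays have shape (3, 4),
# so the inner dimensions (4 and 3) do not match
output = np.dot(weights, inputs) + biases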
print(f"Output: {output}")

ValueError: shapes (3,4) and (3,4) not aligned: 4 (dim 1) != 3 (dim 0)

In [3]:
import numpy as np

# multiple inputs now
inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]

biases = [2, 3, 0.5]

# the last cell raised an error because the shapes were not aligned
# both arrays have shape (3, 4), but np.dot needs the inner dimensions to match:
# size of index 1 of 1st array in np.dot needs to match size of index 0 of 2nd array
# 4 != 3
# what we need to do is TRANSPOSE the weights array to switch its rows and columns: (3, 4) -> (4, 3)
output = np.dot(inputs, np.array(weights).T) + biases # inputs now come first, weights second
print(f"Output: {output}")

Output: [[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]


Transpose:
- transposing a matrix is flipping the rows and columns
![image.png](references/transpose.png)

How the matrix product is calculated:
![image.png](references/matrix_product.png)
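
One way to convince yourself that the batched calculation is right: each row of the batched output should equal what the single-input version gives for that sample. A sketch of that check, where `one_by_one` is a name made up for this example:

```python
import numpy as np

inputs = np.array([[1, 2, 3, 2.5],
                   [2.0, 5.0, -1.0, 2.0],
                   [-1.5, 2.7, 3.3, -0.8]])
weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5],
                    [-0.26, -0.27, 0.17, 0.87]])
biases = np.array([2, 3, 0.5])

# whole batch at once: (3, 4) times (4, 3) gives (3, 3)
batch_output = np.dot(inputs, weights.T) + biases

# the same samples processed one at a time, as in the single-input cell
one_by_one = np.array([np.dot(weights, sample) + biases for sample in inputs])

print(np.allclose(batch_output, one_by_one))   # True
```

Row `i` of the output holds the 3 neuron outputs for sample `i`, which is why the inputs go first in `np.dot`.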

Adding another Layer

In [7]:
import numpy as np

inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]

biases = [2, 3, 0.5]

weights2 = [[0.1, -0.14, 0.5],
           [-0.5, 0.12, -0.33],
           [-0.44, 0.73, -0.13]]

biases2 = [-1, 2, -0.5]

layer1_outputs = np.dot(inputs, np.array(weights).T) + biases 

# the outputs of layer1 are the inputs of layer2
layer2_outputs = np.dot(layer1_outputs, np.array(weights2).T) + biases2

print(f"Layer 1 Outputs: \n{layer1_outputs}")
print(f"Layer 2 Outputs: \n{layer2_outputs}")

Layer 1 Outputs: 
[[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]
Layer 2 Outputs: 
[[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]


We are going to convert this into an object so it's not so unruly

- we want small values when dealing with neural networks
- we want to initialize the weights and biases to small values between -1 and +1
- the reason we want them below 1 is that if, say, the first layer's weights are 2 or 3 and the second layer's weights are 4 or 5, the values grow with every layer and the output will be huge

Weights and Biases initialization:
- weights: a good starting point is probably the range -0.1 to 0.1
- biases: a good starting point is probably 0
- biases starting at zero might create a problem where all the neurons are dead (i.e. all the neurons output 0) if the weights aren't high enough for them to fire (a dead network)
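
To see why the scale of the weights matters, here is a rough sketch that pushes one sample through 5 stacked layers twice, using the same random weights multiplied by 0.1 and then by 3.0. The layer count, the 4x4 layer size, and the helper name `forward` are assumptions made for this illustration. Biases are left at 0 and there are no activation functions yet:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the run repeatable
x = np.array([[1.0, 2.0, 3.0, 2.5]])
base_weights = [rng.standard_normal((4, 4)) for _ in range(5)]   # 5 layers


def forward(scale):
    """Pass x through every layer with the weights multiplied by scale."""
    out = x
    for w in base_weights:
        out = np.dot(out, scale * w)
    return out


small = np.abs(forward(0.1)).max()
large = np.abs(forward(3.0)).max()
print(f"largest output with scale 0.1: {small:.3e}")
print(f"largest output with scale 3.0: {large:.3e}")
print(f"ratio: {large / small:.3e}")   # about 30**5 = 2.43e+07
```

Because each layer multiplies its input by the weights again, a scale that is 30 times larger ends up about 30 to the power of 5 times larger after 5 layers.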

In [26]:
import numpy as np

np.random.seed(0)

X = [[1, 2, 3, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]]

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        # weights are created with shape (n_inputs, n_neurons), i.e. already transposed,
        # so forward() does not need .T
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases

# these are our layer objects
# n_inputs is 4 because it's the number of features in each input
layer1 = Layer_Dense(n_inputs=4, n_neurons=5)
# n_inputs is 5 because it's the number of neurons in the previous layer
# REMEMBER: size of index 1 of 1st array in np.dot needs to match size of index 0 of 2nd array
layer2 = Layer_Dense(n_inputs=5, n_neurons=2)

# we can now pass our inputs to the layer
layer1.forward(inputs=X)
print(f"Layer 1 Output: \n{layer1.output}\n")

layer2.forward(inputs=layer1.output)
print(f"Layer 2 Output: \n{layer2.output}")

Layer 1 Output: 
[[ 0.10758131  1.03983522  0.24462411  0.31821498  0.18851053]
 [-0.08349796  0.70846411  0.00293357  0.44701525  0.36360538]
 [-0.50763245  0.55688422  0.07987797 -0.34889573  0.04553042]]
Layer 2 Output: 
[[ 0.148296   -0.08397602]
 [ 0.14100315 -0.01340469]
 [ 0.20124979 -0.07290616]]
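
`Layer_Dense` stores its biases with shape `(1, n_neurons)`, while the layer output has one row per sample. A small sketch of how that addition plays out through NumPy broadcasting. It uses made-up bias values (the class above starts them at zero) and an array of ones as a stand-in for a real `np.dot` result:

```python
import numpy as np

layer_output = np.ones((3, 5))   # stand-in for np.dot(inputs, weights): 3 samples, 5 neurons
biases = np.array([[0.1, 0.2, 0.3, 0.4, 0.5]])   # shape (1, 5), made-up values

# (3, 5) + (1, 5): the single bias row is added to every sample's row
shifted = layer_output + biases
print(shifted.shape)   # (3, 5)
print(shifted)
```

Each neuron's bias is applied to that neuron's column for every sample in the batch, so the bias row never has to be copied by hand.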


The next major step is to add activation functions.