**Neural Network Overview**

Input Layer -> Hidden Layers -> Output Layer
Tune the weights and biases to achieve the desired results (the examples below model the input layer).
Note how each neuron has only one bias, regardless of how many inputs it has.
An input can be a raw input value or the output of a neuron in the previous layer.

In [1]:
inputs = [1, 2, 3]
weights = [0.2, 0.8, -0.5]
bias = 2


output = inputs[0] * weights[0]+ inputs[1] * weights[1] + inputs[2] * weights[2] + bias
print(output)

2.3


In [2]:
inputs = [1, 2, 3, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2
output = inputs[0] * weights[0]+ inputs[1] * weights[1] + inputs[2] * weights[2] + inputs[3] * weights[3] + bias

print(output)

4.8


In [3]:
inputs = [1, 2, 3, 2.5]
weights1 = [0.2, 0.8, -0.5, 1.0]
weights2 = [0.5, -0.91, 0.26, -0.5]
weights3 = [-0.26, -0.27, 0.17, 0.87]
bias1 = 2
bias2 = 3
bias3 = 2.5

output = [inputs[0] * weights1[0]+ inputs[1] * weights1[1] + inputs[2] * weights1[2] + inputs[3] * weights1[3] + bias1,
          inputs[0] * weights2[0]+ inputs[1] * weights2[1] + inputs[2] * weights2[2] + inputs[3] * weights2[3] + bias2,
          inputs[0] * weights3[0]+ inputs[1] * weights3[1] + inputs[2] * weights3[2] + inputs[3] * weights3[3] + bias3]

print(output)

[4.8, 1.21, 4.385]


In [4]:
# simplified code for input layer
inputs = [1, 2, 3, 2.5]
weights = [[0.2, 0.8, -0.5, 1.0], [0.5, -0.91, 0.26, -0.5], [-0.26, -0.27, 0.17, 0.87]]
biases = [2, 3, 2.5]

layer_outputs = []
for neuron_weights, neuron_bias in zip(weights, biases):
    neuron_output = 0
    for n_input, weight in zip(inputs, neuron_weights):
        neuron_output += n_input*weight
    neuron_output += neuron_bias
    layer_outputs.append(neuron_output)

print(layer_outputs)

[4.8, 1.21, 4.385]


**Vectors or Matrix or Tensor**

list = [1, 5, 6, 2]   1D array (vector) with shape (4,)

list = [
    [1, 5, 6, 2],   2D array (matrix) with shape (2, 4): 2 rows, 4 columns
    [5, 7, 1, 4]
]

To avoid shape errors in pandas/numpy/tensorflow, ensure that shapes are HOMOLOGOUS (every row has the same length).
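
A quick way to check the shape of a list is to wrap it in a NumPy array and read its `shape`. This is a minimal sketch; the names `vector` and `matrix` are only illustrative.

```python
import numpy as np

# A flat list becomes a 1D array (a vector).
vector = np.array([1, 5, 6, 2])

# A list of equal-length lists becomes a 2D array (a matrix).
matrix = np.array([[1, 5, 6, 2],
                   [5, 7, 1, 4]])

print(vector.shape)              # (4,)
print(matrix.shape)              # (2, 4)
print(vector.ndim, matrix.ndim)  # 1 2
```

The shape reads outermost first: 2 rows, each holding 4 values.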
 
In the context of deep learning, a tensor is an object that can be represented as an array, not just the array itself.

a = [1, 2, 3]
b = [4, 5, 6]

dot_product = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

Note that dot product of two vectors results in a single scalar value.

In [5]:
# dot product sample 

import numpy as np

inputs = [1, 2, 3, 2.5]
weights = [0.2, 0.8, -0.5, 1.0]
bias = 2

output = np.dot(inputs, weights) + bias
print(output)

4.8


In [6]:
import numpy as np
inputs = [1, 2, 3, 2.5] # 4 features
weights = [[0.2, 0.8, -0.5, 1.0], [0.5, -0.91, 0.26, -0.5], [-0.26, -0.27, 0.17, 0.87]]
biases = [2, 3, 2.5]


output = np.dot(weights, inputs) + biases
print(output)

[4.8   1.21  4.385]


**Batch Size**

Showing one sample at a time makes it difficult for the neuron to fit the line.

Why not show all the samples at once to make it easier for the neuron to fit?
It tends to lead to overfitting (hurts generalization to out-of-sample data). A batch size of 32 is a common choice.
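
One way to picture batching is to slice a dataset into chunks of 32 rows. The sketch below assumes a made-up dataset of 100 random samples with 4 features; `X`, `batch_size` and `batch_shapes` are illustrative names, not part of the notes above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))  # hypothetical data: 100 samples, 4 features
batch_size = 32

batch_shapes = []
for start in range(0, len(X), batch_size):
    batch = X[start:start + batch_size]  # the last batch may be smaller
    batch_shapes.append(batch.shape)

print(batch_shapes)  # [(32, 4), (32, 4), (32, 4), (4, 4)]
```

Each batch keeps the 4 feature columns, so it can be fed to the same layer as a single sample.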

In [7]:
inputs = [[1, 2, 3, 2.5], # this one is (3,4)
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]
 
weights = [[0.2, 0.8, -0.5, 1.0], # this one is (3,4) -- shape error if not transposed to (4,3)
            [0.5, -0.91, 0.26, -0.5],
            [-0.26, -0.27, 0.17, 0.87]]

biases = [2, 3, 2.5] # still 3 biases: batching adds samples, not neurons

'''
reminder: dim 1 of the first array needs to match dim 0 of the second array, e.g. (3,4) and (4,3)

error if no transpose of weights matrix
ValueError: shapes (3,4) and (3,4) not aligned: 4 (dim 1) != 3 (dim 0)

'''


output = np.dot(inputs, np.array(weights).T) + biases   # transposed here
print(output)

[[ 4.8    1.21   4.385]
 [ 8.9   -1.81   2.2  ]
 [ 1.41   1.051  2.026]]
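
The shape rule can be checked directly by printing the shapes before and after the transpose. This sketch reuses the same numbers as the cell above and catches the error instead of letting it stop the cell.

```python
import numpy as np

inputs = np.array([[1, 2, 3, 2.5],
                   [2.0, 5.0, -1.0, 2.0],
                   [-1.5, 2.7, 3.3, -0.8]])
weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5],
                    [-0.26, -0.27, 0.17, 0.87]])
biases = np.array([2, 3, 2.5])

print(inputs.shape, weights.shape, weights.T.shape)  # (3, 4) (3, 4) (4, 3)

# Without the transpose the inner dimensions (4 and 3) do not match.
try:
    np.dot(inputs, weights)
except ValueError as err:
    print("shape error:", err)

# With the transpose: (3, 4) dot (4, 3) gives (3, 3), one row per sample.
output = np.dot(inputs, weights.T) + biases
print(output.shape)  # (3, 3)
```

The bias vector has shape (3,), so NumPy adds it to every row: each column (neuron) gets its own bias for every sample in the batch.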


In [21]:
inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]
 
weights = [[0.2, 0.8, -0.5, 1.0], 
            [0.5, -0.91, 0.26, -0.5],
            [-0.26, -0.27, 0.17, 0.87]]

weights2 = [[0.1, -0.14, 0.5], 
            [-0.5, 0.12, -0.33],
            [-0.44, 0.73, -0.13]]

biases = [2, 3, 0.5] 

biases2 = [-1, 2, -0.5]

layer1_outputs = np.dot(inputs, np.array(weights).T) + biases
layer2_outputs = np.dot(layer1_outputs, np.array(weights2).T) + biases2
print(layer2_outputs)

[[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]


In [46]:
import numpy as np
np.random.seed(0)

X = [[1, 2, 3, 2.5],
    [2.0, 5.0, -1.0, 2.0],
    [-1.5, 2.7, 3.3, -0.8]]

# remember Neil, n_inputs is feature numbers (4) and neurons refers to any number you want

class Layer_Dense:
    def __init__(self, n_inputs, n_neurons): # weights are random, biases start at zero; shapes follow n_inputs and n_neurons
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons)) # np.zeros takes the shape as a single tuple, hence the double parentheses
    def forward(self, inputs):
        self.outputs = np.dot(inputs, self.weights) + self.biases 

'''
Remember the shape rule we talked about earlier: the number of neurons in layer 1
must be the same as the number of inputs to layer 2
'''


layer1 = Layer_Dense(4, 5) 
layer2 = Layer_Dense(5, 2)

layer1.forward(X)
layer2.forward(layer1.outputs)
print(layer1.outputs)
print(layer2.outputs)

[[ 0.10758131  1.03983522  0.24462411  0.31821498  0.18851053]
 [-0.08349796  0.70846411  0.00293357  0.44701525  0.36360538]
 [-0.50763245  0.55688422  0.07987797 -0.34889573  0.04553042]]
[[ 0.148296   -0.08397602]
 [ 0.14100315 -0.01340469]
 [ 0.20124979 -0.07290616]]
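
On the double parentheses in `np.zeros((1, n_neurons))`: `np.zeros` expects the whole shape as its first argument, so the shape is passed as one tuple. A small sketch of what happens with and without the tuple (the value of `n_neurons` here is arbitrary):

```python
import numpy as np

n_neurons = 5

# Shape passed as a single tuple: one row, n_neurons columns.
biases = np.zeros((1, n_neurons))
print(biases.shape)  # (1, 5)

# Without the tuple, the second argument is read as the dtype, which fails.
try:
    np.zeros(1, n_neurons)
except TypeError as err:
    print("TypeError:", err)
```

`np.random.randn` is different: it takes the dimensions as separate arguments, which is why `np.random.randn(n_inputs, n_neurons)` needs only one pair of parentheses.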


**Activation Functions**

Every neuron in a hidden layer has an activation function. The activation function is applied after the dot product (plus bias).

1. Step Function - if x > 0: then y = 1. If x <= 0: y = 0
2. Sigmoid Function - It has a granular output, which makes backpropagation and loss calculation work better than with a step function.
3. ReLU - If x > 0: then y = x. If x <= 0: y = 0. (used due to speed; unlike sigmoid it largely avoids the vanishing gradient problem, though a neuron can get stuck outputting 0) - offset the activation point by tweaking the bias
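
The three functions listed above can be sketched with NumPy as follows. The function names `step`, `sigmoid` and `relu` and the sample values in `x` are illustrative choices, not taken from a library.

```python
import numpy as np

def step(x):
    # 1 where x > 0, otherwise 0
    return np.where(x > 0, 1.0, 0.0)

def sigmoid(x):
    # squashes any value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # keeps positive values, clips everything else to 0
    return np.maximum(0, x)

x = np.array([-2.0, 0.0, 3.0])
print(step(x))     # [0. 0. 1.]
print(sigmoid(x))
print(relu(x))     # [0. 0. 3.]
```

All three work element-wise, so they can be applied to a whole layer output at once.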

In [1]:
inputs = [0, 2, -1, 3.3, -2.7, 1.1, 2.2, -100]
output = []

for i in inputs:
    if i > 0:
        output.append(i)
    elif i <= 0:
        output.append(0)

print(output)

[0, 2, 0, 3.3, 0, 1.1, 2.2, 0]


**Softmax Activation**

Softmax turns the raw outputs of the neural network into a probability distribution.
It accounts for negative values without losing their meaning, as opposed to taking absolute values or squaring (which would make -9 and 9 look the same).


In [4]:
import math
layer_outputs = [4.8, 1.21, 2.385]

E = math.e # 2.71828

exp_values = []

for output in layer_outputs:
    exp_values.append(E**output)

print(exp_values)

norm_base = sum(exp_values)
norm_values = []

for value in exp_values:
    norm_values.append(value/norm_base)

print(norm_values)
print(sum(norm_values)) # adds up to 1, or 0.9999999999999999 because of floating point error





[121.51041751873483, 3.353484652549023, 10.859062664920513]
[0.8952826639572619, 0.024708306782099374, 0.0800090292606387]
0.9999999999999999


In [21]:
import numpy as np

layer_outputs = [[4.8, 1.21, 2.385],
                 [8.9, -1.81, 0.2],
                 [1.41, 1.051, 0.026],]

exp_values = np.exp(layer_outputs)
print(np.sum(layer_outputs, axis = 1, keepdims = True)) # row sums of the raw outputs, to show what axis = 1 with keepdims = True returns

norm_values =  exp_values / np.sum(exp_values, axis = 1, keepdims = True)
print(norm_values)

#norm_values = exp_values/np.sum(exp_values)
#print(norm_values)
#print(sum(norm_values))

[[8.395]
 [7.29 ]
 [2.487]]
[[8.95282664e-01 2.47083068e-02 8.00090293e-02]
 [9.99811129e-01 2.23163963e-05 1.66554348e-04]
 [5.13097164e-01 3.58333899e-01 1.28568936e-01]]
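
A common refinement, not used in the cells above, is to subtract each row's maximum before exponentiating so that large outputs cannot overflow `np.exp`. The result is unchanged because the shift cancels out in the division. A minimal sketch on the same `layer_outputs` (the name `shifted` is illustrative):

```python
import numpy as np

layer_outputs = np.array([[4.8, 1.21, 2.385],
                          [8.9, -1.81, 0.2],
                          [1.41, 1.051, 0.026]])

# Shift each row so its largest value is 0; exp() then never exceeds 1.
shifted = layer_outputs - np.max(layer_outputs, axis=1, keepdims=True)
exp_values = np.exp(shifted)

# Normalize per row so each sample's outputs sum to 1.
norm_values = exp_values / np.sum(exp_values, axis=1, keepdims=True)

print(norm_values)
print(np.sum(norm_values, axis=1))
```

The printed probabilities should match the unshifted version up to floating point error.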


**Loss Function**

1. For Regression : Mean Absolute Error or Mean Squared Error is Common
2. For Classification: Categorical Cross Entropy
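
For the regression losses in item 1, a minimal sketch with made-up targets and predictions (`y_true` and `y_pred` are hypothetical values chosen for easy arithmetic):

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])  # hypothetical targets
y_pred = np.array([2.5, 0.0, 2.0, 8.0])   # hypothetical predictions

errors = y_pred - y_true

mae = np.mean(np.abs(errors))  # Mean Absolute Error
mse = np.mean(errors ** 2)     # Mean Squared Error

print(mae)  # 0.5
print(mse)  # 0.375
```

MSE punishes the single error of 1.0 more heavily than the two errors of 0.5, which is the main practical difference between the two.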

Solving for x:

e ** x = b, so x = log(b)

In machine learning, log means the natural log, which is based on e (Euler's number, about 2.718).

Classes: 3
Label: 0
One Hot: [1, 0, 0]
Prediction: [0.7, 0.1, 0.2]


In [32]:
import numpy as np
import math
b = 5.2
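np.log(b) # natural log of b; the value is not printed here
print(math.e ** 1.6486586255873816) # 1.6486586255873816 is np.log(5.2), so this recovers b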

# Categorical Cross Entropy (negative log)

softmax_output = [0.7, 0.1, 0.2]
target_output = [1, 0, 0]
target_class = 0
loss = -(math.log(softmax_output[0])*target_output[0] +
         math.log(softmax_output[1])*target_output[1] +
         math.log(softmax_output[2])*target_output[2])

print(loss)

# same as -math.log(0.7), since only the target class term is non-zero

5.199999999999999
0.35667494393873245
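
The same loss can be computed for a whole batch by picking out the predicted probability of each sample's target class. This is a sketch under assumptions: the second and third rows of `softmax_outputs` and the labels in `class_targets` are made up, and the clipping step (to avoid `log(0)`) is an addition not shown in the cell above.

```python
import numpy as np

softmax_outputs = np.array([[0.7, 0.1, 0.2],
                            [0.1, 0.5, 0.4],     # hypothetical sample
                            [0.02, 0.9, 0.08]])  # hypothetical sample
class_targets = np.array([0, 1, 1])              # hypothetical labels

# Clip so that a predicted probability of exactly 0 cannot produce log(0).
clipped = np.clip(softmax_outputs, 1e-7, 1 - 1e-7)

# For each row, take the probability assigned to its target class.
correct_confidences = clipped[np.arange(len(clipped)), class_targets]

losses = -np.log(correct_confidences)
mean_loss = np.mean(losses)

print(losses)
print(mean_loss)
```

The first entry of `losses` is the same -log(0.7) value as the single-sample calculation, and a more confident correct prediction (0.9) gives a smaller loss.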
