
# Neural Networks

**1. Neurons :** A neuron takes inputs, does some calculation with them, and produces one output.

3 things are happening here :



*   First, each input is multiplied by a weight
        
        x1 -> x1 * w1
        x2 -> x2 * w2

*   Second, all the weighted inputs are added together with a bias :

        (x1 * w1) + (x2 * w2) + b
       
       

*  Finally, the sum is passed through an activation function

        y = f(x1 * w1 + x2*w2 + b)
       
Here, the activation function f is the sigmoid function :

        f(x) = 1 / (1 + e^(-x))
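
As a quick illustration, the three steps can be traced with plain Python for a single neuron. The input, weight, and bias values below are assumptions chosen only for this sketch.

```python
import math

def sigmoid(x):
    # Sigmoid activation: f(x) = 1 / (1 + e^(-x))
    return 1 / (1 + math.exp(-x))

# Illustrative values (assumed for this sketch)
x1, x2 = 2, 3   # inputs
w1, w2 = 0, 1   # weights
b = 4           # bias

# Step 1: multiply each input by its weight
weighted = [x1 * w1, x2 * w2]

# Step 2: add the weighted inputs together with the bias
total = sum(weighted) + b

# Step 3: pass the sum through the activation function
y = sigmoid(total)

print(total)        # 7
print(round(y, 4))  # 0.9991
```

Because the sigmoid squashes any real number into the range (0, 1), a large positive sum such as 7 gives an output very close to 1.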


- **Coding a Neuron**

In [0]:
import numpy as np

In [0]:
def sigmoid(x):
  return 1/(1+ np.exp(-x))

In [0]:
class Neuron:
  
  def __init__(self, weights, bias):
    self.weights = weights
    self.bias = bias
   
  def feedforward(self, inputs):
    # Weight the inputs, add this neuron's own bias, then apply the activation
    total = np.dot(self.weights, inputs) + self.bias
    return sigmoid(total)

In [4]:
weights = np.array([0, 1])
bias = 4

n = Neuron(weights, bias)

x = np.array([2, 3])

print(n.feedforward(x))

0.9990889488055994


**2. Combining Neurons into a Neural Network**

A neural network is a bunch of neurons connected together.

A hidden layer is any layer between the input (first) layer and the output (last) layer. There can be multiple hidden layers.


- **Coding a Neural Network**

A neural network with
- 2 inputs, 
- a hidden layer with 2 neurons(h1, h2), 
- an output layer with 1 neuron(o1).

Each neuron has the same weights and bias
- w = [0,1]
- b = 0

In [0]:
import numpy as np

In [6]:
class NeuralNetwork:
  
  def __init__(self):
    weights = np.array([0, 1])
    bias = 0
    
    self.h1 = Neuron(weights, bias)
    self.h2 = Neuron(weights, bias)
    self.o1 = Neuron(weights, bias)
    
  def feedforward(self, x):
    out_h1 = self.h1.feedforward(x)
    out_h2 = self.h2.feedforward(x)
    
    out_o1 = self.o1.feedforward(np.array([out_h1, out_h2]))
    
    return out_o1
  
network = NeuralNetwork()
x = np.array([2, 3])
print(network.feedforward(x))

0.7216325609518421
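
The same forward pass can also be sketched with one weight matrix per layer instead of separate Neuron objects. This is only an alternative formulation for illustration; the names `W_hidden`, `b_hidden`, `w_out`, `b_out`, and `feedforward_matrix` are assumptions, and every neuron uses w = [0, 1] and b = 0 as described above.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# One row of weights per hidden neuron (h1, h2); all neurons use w = [0, 1], b = 0
W_hidden = np.array([[0, 1],
                     [0, 1]])
b_hidden = np.array([0, 0])

# Output neuron o1 takes [h1, h2] as its inputs
w_out = np.array([0, 1])
b_out = 0

def feedforward_matrix(x):
    hidden = sigmoid(W_hidden @ x + b_hidden)   # outputs of h1 and h2
    return sigmoid(w_out @ hidden + b_out)      # output of o1

x = np.array([2, 3])
out = feedforward_matrix(x)
print(out)
```

Stacking the weights this way computes every neuron of a layer in one matrix product, which is how larger networks are usually written.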


**3. Training a Neural Network**

Say we have the following measurements:

| Name    | Weight | Height | Gender |
|---------|--------|--------|--------|
| Alice   | 133    | 65     | F      |
| Bob     | 160    | 72     | M      |
| Charlie | 152    | 70     | M      |
| Diana   | 120    | 60     | F      |



Let's train our network to predict someone's gender given their weight and height:

- Input Layer : weight, height
- Hidden Layer : h1, h2
- Output Layer : gender

We'll represent Male with a 0 and Female with a 1, and we'll also shift the data to make it easier to use (subtracting 135 from each weight and 66 from each height) :

| Name    | Weight (minus 135) | Height (minus 66) | Gender |
|---------|--------------------|-------------------|--------|
| Alice   | -2                 | -1                | 1      |
| Bob     | 25                 | 6                 | 0      |
| Charlie | 17                 | 4                 | 0      |
| Diana   | -15                | -6                | 1      |
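
The shift can be reproduced in a few lines of NumPy. This is a sketch; the array and variable names (`raw`, `genders`, `shifted`, `labels`) are assumptions for illustration.

```python
import numpy as np

# Raw measurements: one row per person, columns are [weight, height]
raw = np.array([
    [133, 65],  # Alice
    [160, 72],  # Bob
    [152, 70],  # Charlie
    [120, 60],  # Diana
])
genders = ["F", "M", "M", "F"]

# Subtract 135 from every weight and 66 from every height (broadcast per column)
shifted = raw - np.array([135, 66])

# Encode Female as 1 and Male as 0
labels = np.array([1 if g == "F" else 0 for g in genders])

print(shifted.tolist())  # [[-2, -1], [25, 6], [17, 4], [-15, -6]]
print(labels.tolist())   # [1, 0, 0, 1]
```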




**Loss**

Before we train our network, we first need a way to quantify how "good" it's doing so that it can try to do "better". That's what the loss is.

We'll use the mean squared error (MSE) loss:

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{true} - y_{pred})^2
$$

- $n$ is the number of samples, which is 4 (Alice, Bob, Charlie, Diana).
- $y$ represents the variable being predicted, which is Gender.
- $y_{true}$ is the true value of the variable (the correct answer). For example, $y_{true}$ for Alice would be 1 (Female).
- $y_{pred}$ is the predicted value of the variable. It's whatever our network outputs.

$(y_{true} - y_{pred})^2$ is known as the squared error. Our loss function simply takes the average over all squared errors (hence the name mean squared error). The better our predictions are, the lower our loss will be.

Better predictions = Lower Loss.

***Training a network = trying to minimize its loss***


- **Coding MSE Loss**

In [0]:
import numpy as np

In [0]:
def mse_loss(y_true, y_pred):
  return((y_true - y_pred)**2).mean()

In [9]:
y_true = np.array([1,0,0,1])
y_pred = np.array([0,0,0,0])

print(mse_loss(y_true, y_pred))

0.5
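
To see that better predictions give a lower loss, the same function can be evaluated on predictions that are closer to the true labels. The prediction values in `closer` are made up for illustration.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return ((y_true - y_pred) ** 2).mean()

y_true = np.array([1, 0, 0, 1])

all_zeros = np.array([0, 0, 0, 0])        # predicts Male for everyone
closer = np.array([0.9, 0.1, 0.2, 0.8])   # hypothetical, mostly correct outputs

loss_zeros = mse_loss(y_true, all_zeros)
loss_closer = mse_loss(y_true, closer)

print(round(loss_zeros, 3))   # 0.5
print(round(loss_closer, 3))  # 0.025
```

The squared errors for `closer` are 0.01, 0.01, 0.04 and 0.04, so their mean is 0.1 / 4 = 0.025.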


**4. Training a Neural Network - Minimize the Loss**

We now have a clear goal : minimize the loss of the neural network. We know we can change the network's weights and biases to influence its predictions, but how do we do so in a way that decreases loss?

For simplicity, let's pretend we only have Alice in our dataset:

| Name  | Weight (minus 135) | Height (minus 66) | Gender |
|-------|--------------------|-------------------|--------|
| Alice | -2                 | -1                | 1      |


Then the mean squared error loss is just Alice's squared error:

$$
\begin{aligned}
\text{MSE} &= \frac{1}{1} \sum_{i=1}^{1} (y_{true} - y_{pred})^2 \\
&= (y_{true} - y_{pred})^2 \\
&= (1 - y_{pred})^2
\end{aligned}
$$
        
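Another way to think about loss is as a function of weights and biases. Let's label each weight and bias in our network: w1 and w2 are the weights feeding h1 (with bias b1), w3 and w4 feed h2 (with bias b2), and w5 and w6 feed o1 (with bias b3).

We can then write loss as a multivariable function: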

L(w1,w2,w3,w4,w5,w6, b1,b2,b3)

How would loss L change if we changed w1? That's a question the partial derivative ∂L/∂w1 can answer.

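Because L depends on w1 only through ypred, the chain rule gives:

$$
\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_{pred}} \cdot \frac{\partial y_{pred}}{\partial w_1}
$$

Since w1 only affects ypred through h1, the second factor can be broken down the same way: ∂ypred/∂w1 = (∂ypred/∂h1) * (∂h1/∂w1).

This system of calculating partial derivatives by working backwards is known as **backpropagation**.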
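
One way to make the chain rule concrete is to compute ∂L/∂w1 for Alice analytically and compare it against a numerical estimate. This is a sketch under assumed starting values (all weights 1, all biases 0); the names `params` and `loss_for_alice` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    fx = sigmoid(x)
    return fx * (1 - fx)

x = np.array([-2, -1])   # Alice (shifted weight, shifted height)
y_true = 1

# Assumed starting point: every weight is 1, every bias is 0
params = {"w1": 1.0, "w2": 1.0, "w3": 1.0, "w4": 1.0, "w5": 1.0, "w6": 1.0,
          "b1": 0.0, "b2": 0.0, "b3": 0.0}

def loss_for_alice(p):
    h1 = sigmoid(p["w1"] * x[0] + p["w2"] * x[1] + p["b1"])
    h2 = sigmoid(p["w3"] * x[0] + p["w4"] * x[1] + p["b2"])
    y_pred = sigmoid(p["w5"] * h1 + p["w6"] * h2 + p["b3"])
    return (y_true - y_pred) ** 2

# Analytic gradient via the chain rule, working backwards from the loss
sum_h1 = params["w1"] * x[0] + params["w2"] * x[1] + params["b1"]
sum_h2 = params["w3"] * x[0] + params["w4"] * x[1] + params["b2"]
h1, h2 = sigmoid(sum_h1), sigmoid(sum_h2)
sum_o1 = params["w5"] * h1 + params["w6"] * h2 + params["b3"]
y_pred = sigmoid(sum_o1)

d_L_d_ypred = -2 * (y_true - y_pred)
d_ypred_d_h1 = params["w5"] * deriv_sigmoid(sum_o1)
d_h1_d_w1 = x[0] * deriv_sigmoid(sum_h1)
analytic = d_L_d_ypred * d_ypred_d_h1 * d_h1_d_w1

# Numerical estimate: nudge w1 slightly in both directions (central difference)
eps = 1e-6
plus = dict(params, w1=params["w1"] + eps)
minus = dict(params, w1=params["w1"] - eps)
numeric = (loss_for_alice(plus) - loss_for_alice(minus)) / (2 * eps)

print(analytic, numeric)  # the two values should agree closely
```

The result is positive, which means that increasing w1 slightly from this starting point would increase the loss.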



**Training : Stochastic Gradient Descent**

We'll use an optimization algorithm called stochastic gradient descent (SGD) that tells us how to change our weights and biases to minimize loss.


w1 <- w1 - η ( ∂L/∂w1 )

η is a constant called the learning rate that controls how fast we train. All we're doing is subtracting η ∂L/∂w1 from w1 :

- If ∂L/∂w1 is positive, w1 will decrease, which makes L decrease.
- If ∂L/∂w1 is negative, w1 will increase, which makes L decrease.

If we do this for every weight and bias in the network, the loss will slowly decrease and our network will improve.
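
The update rule can be tried on a toy problem before applying it to the network. The sketch below runs plain gradient descent on an assumed one-parameter loss L(w) = (w - 3)^2, which is not part of the network above; it only shows how repeated updates move w toward the minimum.

```python
def loss(w):
    # Toy loss with its minimum at w = 3 (assumed for illustration)
    return (w - 3) ** 2

def d_loss_d_w(w):
    # Derivative of the toy loss: dL/dw = 2 * (w - 3)
    return 2 * (w - 3)

learn_rate = 0.1   # eta
w = 0.0            # starting guess

history = [loss(w)]
for step in range(50):
    # w <- w - eta * dL/dw
    w -= learn_rate * d_loss_d_w(w)
    history.append(loss(w))

print(round(w, 3))               # 3.0
print(history[0] > history[-1])  # True
```

Here the derivative is negative while w is below 3, so each update increases w and the loss shrinks at every step.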



**Code : A Complete Neural Network**

In [0]:
import numpy as np

In [0]:
def sigmoid(x):
  return 1/(1+ np.exp(-x))

In [0]:
def deriv_sigmoid(x):
  fx = sigmoid(x)
  return fx * (1-fx)
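
The shortcut used in `deriv_sigmoid` follows from differentiating the sigmoid directly. A short derivation, starting from the definition f(x) = 1 / (1 + e^(-x)) given earlier:

```latex
\begin{aligned}
f(x)  &= \left(1 + e^{-x}\right)^{-1} \\
f'(x) &= \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
       = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} \\
      &= f(x)\,\bigl(1 - f(x)\bigr)
\end{aligned}
```

The last step uses 1 - f(x) = e^(-x) / (1 + e^(-x)), so the derivative can be computed from the sigmoid's own output.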

In [0]:
def mse_loss(y_true, y_pred):
  return((y_true - y_pred)**2).mean()

In [15]:
class NeuralNetwork:
  def __init__(self):
    self.w1 = np.random.normal()
    self.w2 = np.random.normal()
    self.w3 = np.random.normal()
    self.w4 = np.random.normal()
    self.w5 = np.random.normal()
    self.w6 = np.random.normal()
    
    self.b1 = np.random.normal()
    self.b2 = np.random.normal()
    self.b3 = np.random.normal()
    
  def feedforward(self, x):
    h1 = sigmoid(self.w1 * x[0] + self.w2 * x[1] + self.b1)
    h2 = sigmoid(self.w3 * x[0] + self.w4 * x[1] + self.b2)
    o1 = sigmoid(self.w5 * h1 + self.w6 * h2 + self.b3)
    return o1
  
  def train(self, data, all_y_trues):
    learn_rate = 0.1
    epochs = 1000
    
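    for epoch in range(epochs):
      
      # SGD: update the weights after each individual sample
      for x, y_true in zip(data, all_y_trues):
        
        # --- Feedforward, keeping the pre-activation sums for backprop
        sum_h1 = self.w1 * x[0] + self.w2 * x[1] + self.b1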
        h1 = sigmoid(sum_h1)
        
        sum_h2 = self.w3 * x[0] + self.w4 * x[1] + self.b2
        h2 = sigmoid(sum_h2)

        sum_o1 = self.w5 * h1 + self.w6 * h2 + self.b3
        o1 = sigmoid(sum_o1)
        
        y_pred = o1
        
        # --- Calculate partial derivatives.
        # --- Naming: d_L_d_w1 represents "partial L / partial w1"
        d_L_d_ypred = -2 * (y_true - y_pred)

        # Neuron o1
        d_ypred_d_w5 = h1 * deriv_sigmoid(sum_o1)
        d_ypred_d_w6 = h2 * deriv_sigmoid(sum_o1)
        d_ypred_d_b3 = deriv_sigmoid(sum_o1)

        d_ypred_d_h1 = self.w5 * deriv_sigmoid(sum_o1)
        d_ypred_d_h2 = self.w6 * deriv_sigmoid(sum_o1)

        # Neuron h1
        d_h1_d_w1 = x[0] * deriv_sigmoid(sum_h1)
        d_h1_d_w2 = x[1] * deriv_sigmoid(sum_h1)
        d_h1_d_b1 = deriv_sigmoid(sum_h1)

        # Neuron h2
        d_h2_d_w3 = x[0] * deriv_sigmoid(sum_h2)
        d_h2_d_w4 = x[1] * deriv_sigmoid(sum_h2)
        d_h2_d_b2 = deriv_sigmoid(sum_h2)

        # --- Update weights and biases
        # Neuron h1
        self.w1 -= learn_rate * d_L_d_ypred * d_ypred_d_h1 * d_h1_d_w1
        self.w2 -= learn_rate * d_L_d_ypred * d_ypred_d_h1 * d_h1_d_w2
        self.b1 -= learn_rate * d_L_d_ypred * d_ypred_d_h1 * d_h1_d_b1

        # Neuron h2
        self.w3 -= learn_rate * d_L_d_ypred * d_ypred_d_h2 * d_h2_d_w3
        self.w4 -= learn_rate * d_L_d_ypred * d_ypred_d_h2 * d_h2_d_w4
        self.b2 -= learn_rate * d_L_d_ypred * d_ypred_d_h2 * d_h2_d_b2

        # Neuron o1
        self.w5 -= learn_rate * d_L_d_ypred * d_ypred_d_w5
        self.w6 -= learn_rate * d_L_d_ypred * d_ypred_d_w6
        self.b3 -= learn_rate * d_L_d_ypred * d_ypred_d_b3
        
      # --- Calculate total loss every 10 epochs
      if epoch % 10 == 0:
        y_preds = np.apply_along_axis(self.feedforward, 1, data)
        loss = mse_loss(all_y_trues, y_preds)
        print("Epoch %d loss: %.3f" % (epoch, loss))
       
      
      
data = np.array([
    [-2,-1],
    [25,6],
    [17,4],
    [-15,-6]
])

all_y_trues = np.array([
    1,
    0,
    0,
    1
])

network = NeuralNetwork()
network.train(data, all_y_trues)

Epoch 0 loss: 0.228
Epoch 10 loss: 0.123
Epoch 20 loss: 0.081
Epoch 30 loss: 0.060
Epoch 40 loss: 0.047
Epoch 50 loss: 0.039
Epoch 60 loss: 0.032
Epoch 70 loss: 0.028
Epoch 80 loss: 0.024
Epoch 90 loss: 0.021
Epoch 100 loss: 0.019
Epoch 110 loss: 0.017
Epoch 120 loss: 0.016
Epoch 130 loss: 0.014
Epoch 140 loss: 0.013
Epoch 150 loss: 0.012
Epoch 160 loss: 0.011
Epoch 170 loss: 0.011
Epoch 180 loss: 0.010
Epoch 190 loss: 0.009
Epoch 200 loss: 0.009
Epoch 210 loss: 0.008
Epoch 220 loss: 0.008
Epoch 230 loss: 0.008
Epoch 240 loss: 0.007
Epoch 250 loss: 0.007
Epoch 260 loss: 0.007
Epoch 270 loss: 0.006
Epoch 280 loss: 0.006
Epoch 290 loss: 0.006
Epoch 300 loss: 0.006
Epoch 310 loss: 0.005
Epoch 320 loss: 0.005
Epoch 330 loss: 0.005
Epoch 340 loss: 0.005
Epoch 350 loss: 0.005
Epoch 360 loss: 0.005
Epoch 370 loss: 0.004
Epoch 380 loss: 0.004
Epoch 390 loss: 0.004
Epoch 400 loss: 0.004
Epoch 410 loss: 0.004
Epoch 420 loss: 0.004
Epoch 430 loss: 0.004
Epoch 440 loss: 0.004
Epoch 450 loss: 0.004

In [16]:
emily = np.array([-7, -3])
frank = np.array([20,2])

print("Emily: %.3f" % network.feedforward(emily))
print("Frank: %.3f" % network.feedforward(frank))

Emily: 0.966
Frank: 0.039


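The output is close to 1 for Emily and close to 0 for Frank, so with Female = 1 and Male = 0 the network predicts:

- Emily : Female
- Frank : Male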
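
Turning the raw output into a label is a thresholding step. A minimal sketch, assuming a cutoff of 0.5 (the notebook does not fix one) and a hypothetical helper named `to_label`:

```python
def to_label(output, threshold=0.5):
    # Female was encoded as 1 and Male as 0, so outputs above the
    # threshold are read as Female and the rest as Male.
    return "Female" if output > threshold else "Male"

print(to_label(0.966))  # Female
print(to_label(0.039))  # Male
```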