# Deep Learning

Deep learning is a subset of machine learning, the field dedicated to studying and developing machines that can learn. Deep learning is generally implemented with neural networks, which take input data, learn from it, and then make predictions.

### Neural networks explained in phases: the definition is refined as we progress through the book
Def1: A neural network is one or more weights that you can multiply by the input data in order to make a prediction.

**Input data**: A number that you recorded somewhere in the real world, such as today's temperature, a batsman's average score, or yesterday's stock price.

**Prediction**: Given input data, a neural network produces a prediction. For example, given today's temperature, it is likely to rain; given yesterday's stock price, today's stock price will be xxx.xx.

**How a neural network learns**: By trial and error. It first makes a prediction for a known input, compares the prediction with the true value, and then adjusts the weight up or down. It repeats this process until the predictions are good enough.

A typical neural network works as follows:

![image.png](attachment:image.png)

# Neural Networks

A typical neural network, like any other model, works in 3 steps:
1. Predict
2. Compare
3. Learn

In the image above, we have the input data, a knob (known as the weight, which is the term we use from here on) that maps the input data to the prediction, and the prediction itself, obtained by multiplying the input data by the knob's setting. Adjusting the weight until it gives the most accurate prediction is what we call learning.
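The predict, compare, learn cycle described above can be sketched as a simple trial-and-error loop. This is only a minimal illustration, not necessarily the method developed later in these notes: the function name `learn_by_trial_and_error`, the starting weight, the step size, and the example input and target are all assumptions made for the sketch.

```python
def predict(input_value, weight):
    # Predict: scale the input by the weight (the "knob").
    return input_value * weight


def learn_by_trial_and_error(input_value, target, weight=0.5,
                             step=0.001, max_iterations=5000):
    """Nudge the weight up or down, keeping whichever direction lowers the error."""
    for _ in range(max_iterations):
        # Compare: squared error for the current weight and for both nudges.
        error = (predict(input_value, weight) - target) ** 2
        up_error = (predict(input_value, weight + step) - target) ** 2
        down_error = (predict(input_value, weight - step) - target) ** 2

        # Stop when neither direction improves the prediction.
        if error <= up_error and error <= down_error:
            break

        # Learn: move the weight in the direction with the smaller error.
        if up_error < down_error:
            weight += step
        else:
            weight -= step
    return weight


# Hypothetical example: we want an input of 0.5 to produce a prediction of 0.8.
learned_weight = learn_by_trial_and_error(input_value=0.5, target=0.8)
print(round(learned_weight, 2), round(predict(0.5, learned_weight), 2))
```

The loop stops as soon as neither nudge reduces the error, so the step size bounds how close the weight can get to the ideal value.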

**The example we will use to build a basic neural network: a baseball team wants to predict whether it will win or lose based on features such as the average number of toes per player, the win/loss record, the number of fans, etc.**

## A neural network consists of 3 parts
### 1. data

The number of data points to send through the network at a time depends on the input data. If the input is an image, say of a cat, then the data sent at one time would be all the pixels that make up that single image.
   
***Rule of thumb for the amount of input data: always present enough information to the network, where "enough information" is loosely defined as how much a human might need to make the same prediction.***

You can create a network only when you understand the shape of the input and output datasets.
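To make the idea of input shape concrete, here is a small sketch. The 28x28 image size is an assumption chosen purely for illustration (the notes do not specify one), and the pixel values are placeholders.

```python
import numpy as np

# Hypothetical 28x28 grayscale image; all zeros, purely for illustration.
image = np.zeros((28, 28))

# One datapoint for the network is the whole image, flattened into one vector.
network_input = image.reshape(-1)

print(image.shape)          # (28, 28)
print(network_input.shape)  # (784,)
```

Knowing that each datapoint has 784 values tells us how many inputs the network must accept.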

### 2. machine

### 3. prediction

# Forward Propagation

The neural network first takes in the input data points and makes a prediction. This is called forward propagation.

Let us build a neural network with one input datapoint and one weight mapping the input to the output.

In [1]:
# An empty neural network with an initial weight(set by us)

weight = 0.1

def neural_network(input, weight):
    prediction = input * weight
    return prediction

In [4]:
# Inserting one input data point
number_of_toes = [8.5, 9.5, 10, 9]
input = number_of_toes[0]
pred = neural_network(input, weight)
print(round(pred, 4))

0.85


### What does a neural network do

It multiplies the input by a weight, i.e., it scales the input by a certain amount. If the weight were 2, it would double the input; if it were 0.01, it would make the input 100 times smaller. So the weight can scale the input up or down.
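The scaling effect described above can be checked directly. This sketch reuses the single-weight network from earlier, with the parameter renamed to `input_value` (an assumption made here to avoid shadowing Python's built-in `input`).

```python
def neural_network(input_value, weight):
    # The prediction is simply the input scaled by the weight.
    return input_value * weight


toes = 8.5

# Try the weights mentioned in the text, plus the original 0.1.
for weight in (2, 0.01, 0.1):
    print("weight =", weight, "-> prediction =", neural_network(toes, weight))
```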

A neural network treats the input data as `information` and the weights as `knowledge`. It uses the knowledge in the weights to interpret the information in the input.

The neural network that we built has no access to its previous predictions: if we now pass `number_of_toes[1]` to the network, it does not remember the prediction it made for `number_of_toes[0]`. Later we will see how to give a neural network a `short term memory` by feeding multiple inputs to it.
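A short sketch can show that each prediction is independent of the others. It reuses the single-weight network and the `number_of_toes` list from above; the parameter name `input_value` is an assumption made here for readability.

```python
weight = 0.1
number_of_toes = [8.5, 9.5, 10, 9]


def neural_network(input_value, weight):
    # No state is stored between calls; only the current input matters.
    return input_value * weight


# Predict for every game in order.
predictions = [neural_network(toes, weight) for toes in number_of_toes]

# Predicting the second game on its own gives exactly the same value,
# because the network keeps no record of the first prediction.
second_alone = neural_network(number_of_toes[1], weight)
print(predictions[1] == second_alone)  # True
```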

Weights are a measure of sensitivity between the input data and the prediction: the larger the weight, the more a change in the input changes the prediction.

With our weight knob, the network estimates the likelihood of winning the match. But the network we built does not have enough information to make a reliable prediction. If the team had an average of 0 toes, they would probably play really badly, but baseball is much more complicated than that, so our network may or may not work. Next we are going to build a neural network that accepts multiple inputs, or multiple pieces of information, at the same time so that it can make more informed decisions.

## Making predictions with multiple inputs

A neural network can accept multiple inputs at the same time. For this experiment, we also consider the team's **win/loss record** and its **number of fans**.

In [7]:
# An empty network with multiple inputs - core python implementation

weights = [0.1, 0.2, 0]

def neural_network(input, weights):
    pred = w_sum(input, weights)
    return pred

# Performing weighted sum of input
def w_sum(a, b):
    assert len(a) == len(b)
    output = 0
    
    for i in range(len(a)):
        output += a[i] * b[i]
    return output

In [8]:
# Inserting one datapoint

# This dataset is the current status at the beginning of 
# each game for the first four games in a season: 
# toes = current average number of toes per player
# wlrec = current games won (percent)
# nfans = fan count (in millions).

toes = [8.5, 9.5, 9.9, 9.0] 
wlrec = [0.65, 0.8, 0.8, 0.9] 
nfans = [1.2, 1.3, 0.5, 1.0]

input = [toes[0], wlrec[0], nfans[0]]

pred = neural_network(input, weights)
print(pred)

0.9800000000000001


From the dataset above, we used the first datapoint, i.e., the average number of toes per player, the share of games won so far in the season, and the number of fans (in millions), all as of the first game.

| Input | Weight | Local prediction |
| --- | --- | --- |
| 8.5 | 0.1 | 0.85 |
| 0.65 | 0.2 | 0.13 |
| 1.2 | 0 | 0 |

<br/>
<center>total sum = 0.85 + 0.13 + 0 = 0.98</center>

So in the above neural network, we multiply each input by its weight and then add the individual (local) predictions to make a single prediction. It multiplies three inputs by three knob weights and sums them. This is what we call a weighted sum, also known as a dot product.

![image.png](attachment:image.png)
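The same weighted sum can be applied to every game in the dataset, not just the first. The sketch below is a compact variant of `w_sum` built on `zip` and `sum`; that rewrite and the loop over all four games are additions for illustration, not part of the original cells.

```python
weights = [0.1, 0.2, 0]

toes = [8.5, 9.5, 9.9, 9.0]
wlrec = [0.65, 0.8, 0.8, 0.9]
nfans = [1.2, 1.3, 0.5, 1.0]


def w_sum(inputs, weights):
    # Weighted sum: multiply each input by its weight, then add the results.
    assert len(inputs) == len(weights)
    return sum(value * weight for value, weight in zip(inputs, weights))


# Build one datapoint per game and predict each one independently.
predictions = [w_sum(game, weights) for game in zip(toes, wlrec, nfans)]

for game_number, pred in enumerate(predictions, start=1):
    print("game", game_number, "prediction", round(pred, 2))
```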

In [22]:
%%time

# Above functions without numpy
def elementwise_multiplication(vec_a, vec_b):
    assert len(vec_a) == len(vec_b)
    output = []
    for i in range(len(vec_a)):
        output.append(vec_a[i] * vec_b[i])
    
    return output



def elementwise_addition(vec_a, vec_b):
    assert len(vec_a) == len(vec_b)
    output = []
    for i in range(len(vec_a)):
        output.append(vec_a[i] + vec_b[i])
    
    return output


def vector_sum(vec_a):
    s = 0
    for i in vec_a:
        s += i
        
    return s


def vector_average(vec_a):
    s = vector_sum(vec_a)
    avg = s / len(vec_a)
        
    return avg


# Dot Product using above functions
def dot_product(vec_a, vec_b):
    el_mul = elementwise_multiplication(vec_a, vec_b)
    vec_sum = vector_sum(el_mul)
    return vec_sum

print("Elementwise Multiplication ", elementwise_multiplication(input, weights))
print("Elementwise Addition ", elementwise_addition(input, weights))
print("Vector sum ", vector_sum(input))
print("Vector average ", vector_average(input))
print("Dot Product ", dot_product(input, weights))

Elementwise Multiplication  [0.8500000000000001, 0.13, 0.0]
Elementwise Addition  [8.6, 0.8500000000000001, 1.2]
Vector sum  10.35
Vector average  3.4499999999999997
Dot Product  0.9800000000000001
CPU times: total: 0 ns
Wall time: 1.02 ms


**To make the computation faster and more flexible than with Python lists, we can use vectorized arrays, which are provided by the NumPy (Numerical Python) package.**

The dot product can be read as a measure of similarity between the inputs and the weights: the prediction is largest when large inputs line up with large weights.
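A small sketch can illustrate the similarity interpretation of the dot product. The vectors `a`, `b`, and `c` are made-up examples, not data from the baseball dataset.

```python
import numpy as np

# Hypothetical binary vectors chosen to show overlap versus no overlap.
a = np.array([0, 1, 0, 1])
b = np.array([1, 0, 1, 0])
c = np.array([0, 1, 1, 0])

print(np.dot(a, a))  # 2: identical vectors overlap in both non-zero positions
print(np.dot(a, c))  # 1: the vectors share one non-zero position
print(np.dot(a, b))  # 0: the vectors share no non-zero position
```

Read this way, a weight vector scores highest on inputs whose large values sit in the same positions as its own large values.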

In [23]:
%%time

# !pip install numpy -- RUN IF NOT INSTALLED
import numpy as np

input, weights = np.array(input), np.array(weights)

def elementwise_multiplication(vec_a, vec_b):
    assert len(vec_a) == len(vec_b)
    output = vec_a * vec_b
    return output



def elementwise_addition(vec_a, vec_b):
    assert len(vec_a) == len(vec_b)
    output = vec_a + vec_b
    
    return output


def vector_sum(vec_a):
    s = np.sum(vec_a)
    return s


def vector_average(vec_a):
    avg = np.mean(vec_a)
    return avg


# Dot Product using above functions
def dot_product(vec_a, vec_b):
    output = np.dot(vec_a, vec_b)
    return output

print("Elementwise Multiplication ", elementwise_multiplication(input, weights))
print("Elementwise Addition ", elementwise_addition(input, weights))
print("Vector sum ", vector_sum(input))
print("Vector average ", vector_average(input))
print("Dot Product ", dot_product(input, weights))

Elementwise Multiplication  [0.85 0.13 0.  ]
Elementwise Addition  [8.6  0.85 1.2 ]
Vector sum  10.35
Vector average  3.4499999999999997
Dot Product  0.9800000000000001
CPU times: total: 0 ns
Wall time: 1 ms


Because the dataset above is very small, the timing difference is barely noticeable, but on large arrays NumPy's vectorized operations are typically much faster than equivalent Python loops.
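To see the difference on a larger input, one option is to time both approaches with the standard `timeit` module. This is a rough sketch: the vector length of 100,000 and the repeat count are arbitrary assumptions, and the measured times will vary by machine.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size, large enough for the gap to show
list_a = [float(i) for i in range(n)]
list_b = [0.5] * n
arr_a, arr_b = np.array(list_a), np.array(list_b)


def dot_loop(vec_a, vec_b):
    # Pure-Python dot product: multiply pairwise and accumulate.
    total = 0.0
    for x, y in zip(vec_a, vec_b):
        total += x * y
    return total


# Time 10 runs of each implementation.
loop_time = timeit.timeit(lambda: dot_loop(list_a, list_b), number=10)
numpy_time = timeit.timeit(lambda: np.dot(arr_a, arr_b), number=10)

print("Python loop:", loop_time, "seconds")
print("NumPy dot:  ", numpy_time, "seconds")
```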

# Gradient Descent

# Backward Propagation

# Regularization and Batching

# Activation Functions

# CNN

# NLP

# LSTM