# Deep Learning in Python
- neural networks
- deep learning models
- using Keras 2.0
- with Dan Becker from Kaggle, Keras and TensorFlow libraries

Applications: robotics, NLP, image recognition, AI

# 1. Basics of Deep Learning and Neural Networks
Interactions
- Neural Networks account for interactions really well
- Deep learning uses powerful neural networks
    - text, images, videos, audio, source code
- Deep learning models capture interactions
    - can take 2 features and calculate their interaction to predict the outcome
    - reality is that there are a lot of interactions
- Structure
    - Input Layer - features
    - Output Layer - predicted
    - Hidden Layer - everything in between input and output
        - each component is a NODE
        - called "hidden" because its values are not in the data and can't be observed directly
        - the more nodes, the more interactions we capture
        

### 1.1 Forward propagation = using data to make predictions
Example: bank transactions
- make predictions based on:
    - number of children
    - number of existing accounts
- customer with 2 children, 3 accounts
    - line from input to each node
    - each line has a weight for how strongly the input affects the hidden node 
    - weights are the parameters we train (change) when the neural network is fit to data
- Dot product
    - hidden node prediction = sum(input*weight)
    - output node - repeat the same multiply-add process (sum(node*weight))

Forward propagation = input to hidden node to output
- dot product (multiply-add process)
- forward propagation for one data point at a time
- output is the prediction for that data point


In [1]:
# Forward propagation code

import numpy as np
# recall 2 children, 3 existing accounts
input_data = np.array([2,3])
# 2 hidden nodes
weights = {'node_0': np.array([1,1]),
           'node_1': np.array([-1,1]),
           'output': np.array([2,-1])}
node_0_value = (input_data * weights['node_0']).sum()
node_1_value = (input_data * weights['node_1']).sum()

hidden_layer_values = np.array([node_0_value, node_1_value])
print(hidden_layer_values)
# [5 1]

# get output
output = (hidden_layer_values * weights['output']).sum()
print(output)
# 9
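
The multiply-add steps above are dot products, so the same forward pass can also be written with matrix operations. This is a minimal sketch, assuming the two hidden-node weight vectors are stacked into one 2x2 matrix; the names `hidden_weights` and `output_weights` are illustrative and not from the course.

```python
import numpy as np

input_data = np.array([2, 3])  # 2 children, 3 existing accounts

# Each row holds the weights feeding one hidden node (node_0, then node_1)
hidden_weights = np.array([[1, 1],
                           [-1, 1]])
output_weights = np.array([2, -1])

# One matrix-vector product computes both hidden nodes at once
hidden_layer_values = hidden_weights @ input_data

# The output node is a dot product of hidden values and output weights
output = hidden_layer_values @ output_weights

print(hidden_layer_values)  # [5 1]
print(output)               # 9
```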


### Example: Coding the forward propagation algorithm
In this exercise, you'll write code to do forward propagation (prediction) for your first neural network:

Ch1Ex4

Each data point is a customer. The first input is how many accounts they have, and the second input is how many children they have. The model will predict how many transactions the user makes in the next year. You will use this data throughout the first 2 chapters of this course.

The input data has been pre-loaded as input_data, and the weights are available in a dictionary called weights. The array of weights for the first node in the hidden layer are in weights['node_0'], and the array of weights for the second node in the hidden layer are in weights['node_1'].

The weights feeding into the output node are available in weights['output'].

NumPy will be pre-imported for you as np in all exercises.

In [None]:
# Calculate node 0 value: node_0_value
node_0_value = (input_data * weights['node_0']).sum()

# Calculate node 1 value: node_1_value
node_1_value = (input_data * weights['node_1']).sum()

# Put node values into array: hidden_layer_outputs
hidden_layer_outputs = np.array([node_0_value, node_1_value])

# Calculate output: output
output = (hidden_layer_outputs * weights['output']).sum()

# Print output
print(output)

# -39

### 1.2 Activation functions
- to achieve maximum predictive power, the hidden layers need an activation function
- allows model to capture non-linearities
- applied to node inputs to produce node output
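
One way to see why the activation function matters: without it, two stacked layers of multiply-add collapse into a single linear function of the inputs. The sketch below illustrates this with made-up weights (`W1` and `w2` are hypothetical values, not the course's data) and a NumPy-based `relu` helper.

```python
import numpy as np

def relu(x):
    # Element-wise ReLU: negative values become 0
    return np.maximum(x, 0)

x = np.array([-1, 2])
W1 = np.array([[3, 3],
               [1, -5]])      # hypothetical hidden-layer weights (one row per node)
w2 = np.array([2, -1])        # hypothetical output weights

# No activation: the two layers equal one combined linear layer
linear_output = w2 @ (W1 @ x)
collapsed_output = (w2 @ W1) @ x
print(linear_output, collapsed_output)  # 17 17

# With ReLU in the hidden layer, the result is no longer a linear function of x
relu_output = w2 @ relu(W1 @ x)
print(relu_output)  # 6
```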

Improving our neural network
- used to use tanh
- now ReLU is the standard

ReLU (Rectified Linear Activation)


In [None]:
# activation functions with tanh

import numpy as np
input_data = np.array([-1, 2])
weights = {'node_0': np.array([3,3]),
           'node_1': np.array([1,5]),
           'output': np.array([2,-1])}

# note distinguishing input and output
node_0_input = (input_data * weights['node_0']).sum()
node_0_output = np.tanh(node_0_input)
node_1_input = (input_data * weights['node_1']).sum()
node_1_output = np.tanh(node_1_input)

hidden_layer_outputs = np.array([node_0_output, node_1_output])
output = (hidden_layer_outputs * weights['output']).sum()
print(output)

### 1.2.a ReLU - The Rectified Linear Activation Function
As Dan explained to you in the video, an "activation function" is a function applied at each node. It converts the node's input into some output.

The rectified linear activation function (called ReLU) has been shown to lead to very high-performance networks. This function takes a single number as an input, returning 0 if the input is negative, and the input if the input is positive.

Here are some examples:
- relu(3) = 3 
- relu(-3) = 0 

In [None]:
# ReLU

def relu(input):
    '''Define your relu activation function here'''
    # Calculate the value for the output of the relu function: output
    # return 0 if input is negative, else return input
    output = max(input, 0)
    
    # Return the value just calculated
    return(output)

# Calculate node 0 value: node_0_output
node_0_input = (input_data * weights['node_0']).sum()
node_0_output = relu(node_0_input)

# Calculate node 1 value: node_1_output
node_1_input = (input_data * weights['node_1']).sum()
node_1_output = relu(node_1_input)

# Put node values into array: hidden_layer_outputs
hidden_layer_outputs = np.array([node_0_output, node_1_output])

# Calculate model output (do not apply relu)
model_output = (hidden_layer_outputs * weights['output']).sum()

# Print model output
print(model_output)

# 52
# we predicted 52 transactions
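
The `relu()` above uses Python's built-in `max`, which works on one number at a time. As a possible alternative (not part of the course exercise), `np.maximum` applies the same rule element-wise, so a whole array of node inputs can be passed at once.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, others pass through."""
    return np.maximum(x, 0)

print(relu(3))    # 3
print(relu(-3))   # 0

# Works on arrays too: negatives are clipped to 0
node_inputs = np.array([-2.0, 0.0, 1.5])
print(relu(node_inputs).tolist())  # [0.0, 0.0, 1.5]
```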

### 1.2.b Applying the network to many observations/rows of data
You'll now define a function called predict_with_network() which will generate predictions for multiple data observations, which are pre-loaded as input_data. As before, weights are also pre-loaded. In addition, the relu() function you defined in the previous exercise has been pre-loaded.

In [None]:
# Define predict_with_network()
def predict_with_network(input_data_row, weights):
    """accepts two arguments - input_data_row and weights - and 
    returns a prediction from the network as the output."""
    # Calculate node 0 value
    node_0_input = (input_data_row * weights['node_0']).sum()
    node_0_output = relu(node_0_input)

    # Calculate node 1 value
    node_1_input = (input_data_row * weights['node_1']).sum()
    node_1_output = relu(node_1_input)

    # Put node values into array: hidden_layer_outputs
    hidden_layer_outputs = np.array([node_0_output, node_1_output])
    
    # Calculate model output
    input_to_final_layer = (hidden_layer_outputs * weights['output']).sum()
    model_output = relu(input_to_final_layer)
    
    # Return model output
    return(model_output)


# Create empty list to store prediction results
results = []
for input_data_row in input_data:
    # Append prediction to results
    results.append(predict_with_network(input_data_row, weights))

# Print results
print(results)

# [52, 63, 0, 148]
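
The row-by-row loop can also be written as one batched computation. The pre-loaded `input_data` and `weights` are not shown in these notes, so the values below are assumptions, chosen only because they reproduce the results recorded above; `hidden_weights` and `output_weights` are illustrative names.

```python
import numpy as np

# Assumed data: one row per customer (not shown in the original notes)
input_data = np.array([[3, 5],
                       [1, -1],
                       [0, 0],
                       [8, 4]])

# Assumed weights, one row per hidden node
hidden_weights = np.array([[2, 4],     # node_0
                           [4, -5]])   # node_1
output_weights = np.array([2, 7])

# Hidden layer for all rows at once, followed by ReLU
hidden = np.maximum(input_data @ hidden_weights.T, 0)

# Output layer, with ReLU applied as in predict_with_network()
results = np.maximum(hidden @ output_weights, 0)

print(results.tolist())  # [52, 63, 0, 148]
```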

### 1.3 Deeper networks
Multiple hidden layers
- can scale to even 1000 layers
- same propagation process
- assume all layers use ReLU activation function
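
Because every layer repeats the same multiply-add and activation step, forward propagation through any number of layers can be sketched as a loop. The `forward` helper and the weight values below are hypothetical. Following the assumption above, ReLU is applied at every layer, including the output.

```python
import numpy as np

def relu(x):
    # Element-wise ReLU
    return np.maximum(x, 0)

def forward(input_row, layer_weights):
    """Propagate one data point through a list of weight matrices.

    Each matrix has one row per node in that layer.
    """
    values = input_row
    for w in layer_weights:
        values = relu(w @ values)  # multiply-add, then activation
    return values

# Hypothetical network: two hidden layers of 2 nodes, then 1 output node
layers = [np.array([[1, 1], [-1, 1]]),
          np.array([[1, -1], [2, 0]]),
          np.array([[1, 1]])]

print(forward(np.array([2, 3]), layers))  # [14]
```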

Representation learning (aka deep learning)
- deep networks internally build representations of patterns in the data
- partially replace the need for feature engineering
- called representation learning b/c subsequent layers build increasingly sophisticated representations of raw data until we get to a stage to make predictions
- example in images:
    - early nodes: diagonal line
    - later node: face node
    - next node: cat node    

Deep learning
- pro: modeler doesn't need to specify the interactions
- when you train the model, the neural network gets weights that find the relevant patterns to make better predictions


### 1.3.a Multi-layer neural networks
In this exercise, you'll write code to do forward propagation for a neural network with 2 hidden layers. Each hidden layer has two nodes. The input data has been preloaded as input_data. The nodes in the first hidden layer are called node_0_0 and node_0_1. Their weights are pre-loaded as weights['node_0_0'] and weights['node_0_1'] respectively.

The nodes in the second hidden layer are called node_1_0 and node_1_1. Their weights are pre-loaded as weights['node_1_0'] and weights['node_1_1'] respectively.

We then create a model output from the hidden nodes using weights pre-loaded as weights['output'].

![Image of neural network](https://s3.amazonaws.com/assets.datacamp.com/production/course_3524/datasets/ch1ex10.png)
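
The notes do not include a solution for this exercise. The following is one possible sketch that uses the node names from the description. The `input_data` and `weights` values are placeholders assumed for illustration, since the pre-loaded values are not shown here. ReLU is applied to both hidden layers but not to the output.

```python
import numpy as np

def relu(x):
    # Return 0 for negative inputs, otherwise the input itself
    return max(x, 0)

def predict_with_network(input_data, weights):
    # First hidden layer
    node_0_0_output = relu((input_data * weights['node_0_0']).sum())
    node_0_1_output = relu((input_data * weights['node_0_1']).sum())
    hidden_0_outputs = np.array([node_0_0_output, node_0_1_output])

    # Second hidden layer takes the first hidden layer's outputs as input
    node_1_0_output = relu((hidden_0_outputs * weights['node_1_0']).sum())
    node_1_1_output = relu((hidden_0_outputs * weights['node_1_1']).sum())
    hidden_1_outputs = np.array([node_1_0_output, node_1_1_output])

    # Output node: no activation applied in this sketch
    return (hidden_1_outputs * weights['output']).sum()

# Placeholder values (assumptions, not taken from the notes)
input_data = np.array([3, 5])
weights = {'node_0_0': np.array([2, 4]),
           'node_0_1': np.array([4, -5]),
           'node_1_0': np.array([-1, 2]),
           'node_1_1': np.array([1, 2]),
           'output': np.array([2, 7])}

print(predict_with_network(input_data, weights))  # 182
```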

Adding image in Jupyter notebook

For web image:
![Image of something](web_address)

Local image:
![Image of something](folder/picture.png)

# 2. Optimize a Neural Network with backward propagation

# 3. Building Deep Learning models with keras

# 4. Fine-tuning keras models
