# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) An Overview of Neural Networks

# Learning Objectives:
- Describe the structure of neural networks
- Explain how neural networks capture relationships in our data
- Code a neural network by hand!

## Let's talk about neural networks!

![](./assets/network1.png)

### Biological "Inspiration"

A neuron is the means by which electrical signals are transmitted through the human brain.

![](./assets/neuron2.png)

Neurons run throughout our nervous system and are connected in an enormous, tangled web that is difficult to understand. Electrical signals travel through neurons, stopping short in some cases and continuing in others.

![](./assets/neuron1.jpg)

While artificial neural networks don't really work like the neurons in our body, the biological picture is a useful visual, and the two do share some analogous features.

Much like neurons are the building blocks of the nervous system, the **perceptron** is the building block of a neural network.

## The Perceptron

There are five main parts to the perceptron:

1. Input Layer
2. Weights
3. Activation Function
4. Bias
5. Output

![](./assets/perceptron.gif)

We'll use this diagram to walk through the different pieces of the perceptron.

### Input Layer
![](./assets/perceptron.gif)

On the left-hand side above, there are four inputs. I'll call them $x_1$, $x_2$, $x_3$, and $x_n$, since this notation generalizes beyond this particular image. We call this the **input layer**.

This input layer is the set of independent variables we read into our model. If we want to predict commute time, we might include number of Metro stops, whether or not it is raining, and distance from GA in miles as some of our input variables. An observation's values would be passed in here.

### Weights
![](./assets/perceptron.gif)

On each arrow, there is a weight, denoted $w_1$, $w_2$, $w_3$, and $w_n$. We call these our **weights** or **synaptic weights**.

The weights are similar to the coefficients in a regression model: they are the values we usually seek to estimate. We might specify starting weights and then iterate to find the best values. (We'll cover that later.)
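To make the role of the weights concrete, here is a minimal sketch of the weighted sum using the commute-time inputs mentioned in the input layer section. The feature values and weights are made up purely for illustration; they are not estimated from any data.

```python
import numpy as np

# Hypothetical observation: [Metro stops, raining (1 = yes), miles from GA]
x = np.array([5.0, 1.0, 3.0])

# Hypothetical weights, one per input (illustrative values only)
w = np.array([2.0, 4.0, 1.5])

# Each input is multiplied by its weight, then the products are summed
weighted_sum = np.dot(x, w)
print(weighted_sum)
```

A larger weight means the corresponding input moves the sum more for each unit of change.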

### Activation Function
![](./assets/perceptron.gif)

Near the middle, there is a function denoted $\sigma(\cdot)$. We call this our **activation function**.

In a biological neuron, electrical impulses accumulate and must reach a certain threshold in order for the neuron to "fire" (the resulting spike is called an "action potential"). Either the neuron reaches this threshold and fires, or it does not reach the threshold and does not fire.

In artificial neural networks, we have to specify this activation function, and it need not be "all or nothing." Our choice of activation function is important, as it determines which signals get through and, ultimately, the output.

![](./assets/activation_function.png)

Another example is ReLU (rectified linear unit), which ensures that the output is nonnegative by computing $\sigma(z) = \max(z,0)$.
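As a rough sketch, here are three common activation functions written with NumPy. The step and sigmoid functions are included as assumed examples of "all or nothing" and smooth activations; the function names are chosen here for illustration.

```python
import numpy as np

def step(z):
    """All-or-nothing activation: 1 if z is positive, otherwise 0."""
    return np.where(z > 0, 1.0, 0.0)

def sigmoid(z):
    """Smooth activation that squashes z into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified linear unit: max(z, 0), applied elementwise."""
    return np.maximum(z, 0.0)

z = np.array([-2.0, 0.0, 3.0])
print(step(z))
print(sigmoid(z))
print(relu(z))
```

Note that `np.maximum` works elementwise on arrays, whereas Python's built-in `max` expects single numbers.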

### Bias
![](./assets/perceptron.gif)

Inside the $\sigma$, we see $b +$ the inner product of our weights and inputs. This value $b$ is referred to as the **bias**. The bias shifts the weighted sum before the activation function is applied, so we can choose it, in conjunction with the activation function, to move the point at which the perceptron "fires."
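One way to see what the bias does is to hold the inputs and weights fixed and vary only $b$. The sketch below assumes a sigmoid activation and uses made-up inputs, weights, and biases.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0])    # hypothetical inputs
w = np.array([0.5, 0.25])   # hypothetical weights
biases = [-3.0, 0.0, 3.0]   # hypothetical biases to compare

# Same weighted sum every time; only the bias changes
outputs = [sigmoid(b + np.dot(x, w)) for b in biases]

for b, out in zip(biases, outputs):
    print(f"bias={b:+.1f} -> output={out:.3f}")
```

A more negative bias pushes the output toward 0, while a more positive bias pushes it toward 1.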

### Output
![](./assets/perceptron.gif)

On the right-hand side, we see an arrow extending forward. The result of $\sigma(b+\sum_{i=1}^nx_iw_i)$ is our **output**.
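Putting the five parts together, a perceptron can be sketched as a single function. This is a minimal illustration rather than a definitive implementation: the function name `perceptron`, the choice of ReLU, and all numbers below are assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def perceptron(x, w, b, activation):
    """Return activation(b + sum_i x_i * w_i) for one observation."""
    return activation(b + np.dot(x, w))

x = np.array([1.0, -2.0, 0.5])   # hypothetical inputs
w = np.array([0.4, 0.1, 2.0])    # hypothetical weights
b = -0.5                         # hypothetical bias

output = perceptron(x, w, b, relu)
print(output)
```

Passing the activation in as an argument makes it easy to swap ReLU for another function without touching the rest of the code.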

### Practice Problems:

1. Suppose I want to take in $n$ binary variables and return the `and` operator. How could I specify my perceptron?

2. Suppose I want to take in $n$ binary variables and return the `or` operator. How could I specify my perceptron?

3. Suppose I want to take in $n$ binary variables and return the `xor` operator. How could I specify my perceptron?

4. Suppose I want to take in $n$ quantitative variables and return the sum of all positive inputs. How could I specify my perceptron?
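If you want to check a candidate answer to the binary problems, a small helper can try every combination of inputs. This is only a sketch: the helper name `matches_target`, the step activation, and the toy target below are assumptions, and the toy target is deliberately not one of the practice problems.

```python
from itertools import product
import numpy as np

def step(z):
    """All-or-nothing activation: 1 if z is positive, otherwise 0."""
    return 1 if z > 0 else 0

def matches_target(w, b, target):
    """Check a step-activation perceptron against target on all binary inputs."""
    n = len(w)
    for bits in product([0, 1], repeat=n):
        prediction = step(b + np.dot(bits, w))
        if prediction != target(bits):
            return False
    return True

# Toy target: return the first input and ignore the second
first_input = lambda bits: bits[0]
print(matches_target(np.array([1.0, 0.0]), -0.5, first_input))
```

To test your own answer, swap in your weights and bias and replace `first_input` with a function that computes the operator in question.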

### Let's Manually Set Up a Network!

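```python
import numpy as np

# One observation with two input features
input_data = np.array([30, 2])

# Weights feeding each hidden node, and weights feeding the output node
weights = {'node_0': np.array([2, 4]),
           'node_1': np.array([-1, 1]),
           'output': np.array([2, 1])}
```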

```python
# Calculate node 0 value: node_0_value
node_0_value = (input_data * weights['node_0']).sum()

# Calculate node 1 value: node_1_value
node_1_value = (input_data * weights['node_1']).sum()

# Put node values into array: hidden_layer_outputs
hidden_layer_outputs = np.array([node_0_value, node_1_value])

# Calculate output: output
output = (hidden_layer_outputs * weights['output']).sum()

# Print output
print(output)
```
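The node-by-node arithmetic above can also be written as matrix multiplication. Here is a minimal sketch using the same input and weights as the cell above, redefined so the block runs on its own; the names `hidden_weights` and `output_weights` are introduced here for illustration.

```python
import numpy as np

input_data = np.array([30, 2])

# Stack the hidden-node weights as the rows of one matrix
hidden_weights = np.array([[2, 4],     # node_0
                           [-1, 1]])   # node_1
output_weights = np.array([2, 1])

# One matrix-vector product gives the value of every hidden node at once
hidden_layer_outputs = hidden_weights @ input_data

# The output node is a dot product of its weights with the hidden values
output = output_weights @ hidden_layer_outputs
print(output)
```

This form scales more gracefully than one line per node once a layer has more than a handful of nodes.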

```python
def relu(x):
    """Return max(x, 0) for a single number (the ReLU activation)."""
    # Calculate the value for the output of the relu function: output
    output = max(x, 0)

    # Return the value just calculated
    return output
```

```python
# Calculate node 0 value: node_0_output
node_0_input = (input_data * weights['node_0']).sum()
node_0_output = relu(node_0_input)

# Calculate node 1 value: node_1_output
node_1_input = (input_data * weights['node_1']).sum()
node_1_output = relu(node_1_input)

# Put node values into array: hidden_layer_outputs
hidden_layer_outputs = np.array([node_0_output, node_1_output])

# Calculate model output (do not apply relu)
model_output = (hidden_layer_outputs * weights['output']).sum()

# Print model output
print(model_output)
```

```python
# Define predict_with_network()
def predict_with_network(input_data_row, weights):

    # Calculate node 0 value
    node_0_input = (input_data_row * weights['node_0']).sum()
    node_0_output = relu(node_0_input)

    # Calculate node 1 value
    node_1_input = (input_data_row * weights['node_1']).sum()
    node_1_output = relu(node_1_input)

    # Put node values into array: hidden_layer_outputs
    hidden_layer_outputs = np.array([node_0_output, node_1_output])

    # Calculate model output (this version applies relu to the output too)
    input_to_final_layer = (hidden_layer_outputs * weights['output']).sum()
    model_output = relu(input_to_final_layer)

    # Return model output
    return model_output


# Create empty list to store prediction results
results = []
# input_data holds a single observation, so view it as a one-row 2D array;
# looping over the 1D array directly would pass in one scalar at a time
for input_data_row in np.atleast_2d(input_data):
    # Append prediction to results
    results.append(predict_with_network(input_data_row, weights))

# Print results
print(results)
```
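When there are many observations, the row-by-row loop can be replaced with matrix operations over the whole batch. The sketch below assumes the same weights as above; the extra rows in `input_rows` are hypothetical and exist only to show several predictions at once.

```python
import numpy as np

# Hypothetical batch: each row is one observation with two features
input_rows = np.array([[30, 2],
                       [3, 5],
                       [-1, -2]])

hidden_weights = np.array([[2, 4],     # node_0
                           [-1, 1]])   # node_1
output_weights = np.array([2, 1])

# Hidden layer for every row at once, followed by ReLU
hidden = np.maximum(input_rows @ hidden_weights.T, 0)

# Output layer, with ReLU applied as in predict_with_network above
predictions = np.maximum(hidden @ output_weights, 0)
print(predictions)
```

Each entry of `predictions` corresponds to one row of `input_rows`.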

### Setting Up a Deeper Network

```python
import numpy as np

input_data = np.array([-1, 4])

# Node names follow the pattern node_<hidden layer>_<node within layer>
weights = {'node_0_0': np.array([3, 3]),
           'node_0_1': np.array([3, 3]),
           'node_1_0': np.array([3, 3]),
           'node_1_1': np.array([3, 3]),
           'output': np.array([2, -1])}
```

```python
def predict_with_network(input_data):
    # Calculate node 0 in the first hidden layer
    node_0_0_input = (input_data * weights['node_0_0']).sum()
    node_0_0_output = relu(node_0_0_input)

    # Calculate node 1 in the first hidden layer
    node_0_1_input = (input_data * weights['node_0_1']).sum()
    node_0_1_output = relu(node_0_1_input)

    # Put node values into array: hidden_0_outputs
    hidden_0_outputs = np.array([node_0_0_output, node_0_1_output])

    # Calculate node 0 in the second hidden layer
    node_1_0_input = (hidden_0_outputs * weights['node_1_0']).sum()
    node_1_0_output = relu(node_1_0_input)

    # Calculate node 1 in the second hidden layer
    node_1_1_input = (hidden_0_outputs * weights['node_1_1']).sum()
    node_1_1_output = relu(node_1_1_input)

    # Put node values into array: hidden_1_outputs
    hidden_1_outputs = np.array([node_1_0_output, node_1_1_output])

    # Calculate model output: model_output
    model_output = (hidden_1_outputs * weights['output']).sum()

    # Return model_output
    return model_output

output = predict_with_network(input_data)
print(output)
```
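Writing out every node by hand gets tedious as networks grow. One possible generalization, sketched below, stores each hidden layer as a weight matrix and loops over the layers. The function name `forward` and the list-of-matrices layout are assumptions made for this example; the numbers match the deeper network above.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def forward(x, hidden_layers, output_weights):
    """Apply each hidden layer (weights, then ReLU), then a linear output."""
    activations = x
    for layer_weights in hidden_layers:
        activations = relu(layer_weights @ activations)
    return output_weights @ activations

input_data = np.array([-1, 4])
hidden_layers = [np.array([[3, 3], [3, 3]]),   # node_0_0, node_0_1
                 np.array([[3, 3], [3, 3]])]   # node_1_0, node_1_1
output_weights = np.array([2, -1])

print(forward(input_data, hidden_layers, output_weights))
```

Adding a third hidden layer then only requires appending one more matrix to `hidden_layers`.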

## From the Perceptron to a Neural Network

Jumping from perceptrons to neural networks is quite straightforward. The outputs of a perceptron become the inputs to new perceptrons.

![](./assets/network2.png)

- We have six independent variables.
- We have, as always, one input layer and one output layer.
- We have two hidden layers. This can and will vary substantially from problem to problem.
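To get a feel for how quickly the number of weights and biases grows with the architecture, here is a small sketch that counts them for a fully connected network. The six inputs and single output follow the description above, but the hidden layer sizes (4 and 3) are assumed for illustration and are not read from the diagram.

```python
# Layer sizes: 6 inputs, two hidden layers (sizes assumed), 1 output
layer_sizes = [6, 4, 3, 1]

total_parameters = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    n_weights = n_in * n_out   # one weight per connection between layers
    n_biases = n_out           # one bias per node in the receiving layer
    total_parameters += n_weights + n_biases

print(total_parameters)
```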

## How do Neural Networks Work?

1. We specify the architecture of our neural network. (Much like model building, this is based on our data and assumptions, and we get better with practice.)
2. We decide how many `epochs` $n$ we want to run and how many `batches` $k$ go in an epoch.
3. Split training data into $k$ batches.
4. Feed first batch into neural network.
5. Calculate error and update weights/bias accordingly.
6. Feed next batch into neural network.
7. Repeat steps 5 and 6 until all $k$ batches have gone through exactly once. This ends the epoch.
8. Repeat steps 4-7 until we have completed $n$ epochs.
9. Make adjustments as necessary and/or use model for prediction.
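The epoch and batch loop in the steps above can be sketched as follows. To keep the example short, the "network" is a single linear neuron trained with gradient descent on mean squared error. The synthetic data, learning rate, per-epoch shuffling, and update rule are all assumptions made for illustration, not the method this lesson prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data (assumed): y = 2*x1 - 3*x2 + 1
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -3.0]) + 1.0

w = np.zeros(2)        # starting weights
b = 0.0                # starting bias
n_epochs = 50          # n in the steps above
n_batches = 10         # k in the steps above
learning_rate = 0.1    # assumed step size

for epoch in range(n_epochs):
    # Shuffle the row indices, then split them into k batches
    order = rng.permutation(len(X))
    for batch_idx in np.array_split(order, n_batches):
        X_batch, y_batch = X[batch_idx], y[batch_idx]

        # Feed the batch through the model and calculate the error
        error = X_batch @ w + b - y_batch

        # Update weights and bias using the gradient of mean squared error
        w -= learning_rate * 2 * X_batch.T @ error / len(batch_idx)
        b -= learning_rate * 2 * error.mean()

print(w, b)
```

On this noise-free toy data, the learned weights and bias should end up very close to the values used to generate `y`.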

This glosses over many of the details, but it provides a broad picture of what is occurring.

**Check**: What are we worried about?

# Additional Resources

- Comprehensive list of activation functions: [StackExchange](http://stats.stackexchange.com/questions/115258/comprehensive-list-of-activation-functions-in-neural-networks-with-pros-cons)
- Why convolutional neural networks for images: [cs231 - Stanford](http://cs231n.github.io/convolutional-networks/)
- The University of Toronto's neural network course on [Coursera](https://www.coursera.org/learn/neural-networks) is regarded as among the best
- Online book on deep learning: [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/chap1.html)
- The [Neural Network Zoo](http://www.asimovinstitute.org/neural-network-zoo/)