# Artificial Neural Networks (ANNs)
The artificial neural network (ANN) is the fundamental building block of deep learning. We will create a simple ANN using Keras in this lecture, but first let's learn about the essential mechanisms of an artificial neural network.

# Neuron (Node)
A neuron (node) receives input values, processes them, and transmits an output to the next layer.

<img src="images/ann/neuron.png" height="65%" width="65%"></img>
- The input values are standardized (and sometimes normalized)
- The output value can be continuous (number), binary (yes or no), or categorical (dummy variables)

The inputs are transmitted to one or more neurons, which process them to determine an output value.

In real life, a human's input values come from the five senses: sight, touch, sound, taste, and smell. These input values are transmitted to neurons, which process them and determine an output.

For example, if I touch a fire with my hand, the touch input signals a neuron, which processes it and determines that I need to pull my hand away from the fire.

### Weights
Each synapse (signal) has a weight that measures the significance of that signal. Weights are crucial: they are the values that get adjusted as the neural network learns. This is where gradient descent and backpropagation come into play, but we'll get to that later.

<img src="images/ann/weights_neuron.png" height="75%" width="75%"></img>
- W1, W2, and Wm are the individual weights for each synapse (arrow, or signal)

Inside the neuron, a value is computed that determines whether or not to pass the signal to the next neuron in the next layer. This value is determined by applying an activation function to the weighted sum of the input values.
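As a minimal sketch of this computation, the snippet below calculates the weighted sum for a single neuron and applies a simple threshold to it. The input values, the weights, and the choice of 0 as the threshold are all made up for illustration.

```python
import numpy as np

# Hypothetical input values (assumed to be feature scaled already)
inputs = np.array([0.5, -1.2, 0.8])    # x1, x2, xm
# Hypothetical weights, one per synapse
weights = np.array([0.4, 0.1, -0.6])   # W1, W2, Wm

# Weighted sum of the inputs: W1*x1 + W2*x2 + Wm*xm
weighted_sum = np.dot(weights, inputs)

# Threshold activation (assumed threshold of 0):
# pass the signal (1) only if the weighted sum is at least 0
output = 1 if weighted_sum >= 0 else 0

print(weighted_sum, output)
```

Here the negative weighted sum means the neuron does not pass the signal on.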

# Activation Function
A function applied to the weighted sum of the input values (often with a bias term added) to determine the neuron's output. There are many types of activation functions, and some work better than others depending on the neural network.

Note that the input values should be feature scaled (standardized or normalized) for these functions to work properly.
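A short sketch of the two kinds of feature scaling follows. It assumes the common definitions, where standardization means subtracting the mean and dividing by the standard deviation, and normalization means min-max rescaling to the range 0 to 1. The raw values are hypothetical.

```python
import numpy as np

# Hypothetical raw values for one input feature
values = np.array([50.0, 100.0, 150.0, 200.0])

# Standardization: result has mean 0 and standard deviation 1
standardized = (values - values.mean()) / values.std()

# Normalization (min-max): result lies between 0 and 1
normalized = (values - values.min()) / (values.max() - values.min())

print(standardized)
print(normalized)
```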

Below are different types of activation functions.
- The x-value is the weighted sum of the input value(s)
- The y-value is the neuron's contribution to the output value(s)

### 1. Threshold Function
A "binary" (yes or no) activation function: it outputs 1 if the weighted sum reaches a threshold (commonly 0) and 0 otherwise.

<img src="images/ann/threshold_function.png" height="50%" width="50%"></img>

### 2. Sigmoid Function
A smooth activation function that rises gradually from 0 to 1, which makes it very useful in the output layer for predicting the probability of success.

<img src="images/ann/sigmoid_function.png" height="50%" width="50%"></img>

### 3. Rectifier Function
One of the most popular activation functions: the output is 0 for negative x-values and increases linearly with x for x-values above 0.

<img src="images/ann/rectifier_function.png" height="50%" width="50%"></img>

### 4. Hyperbolic Tangent (tanh)
Similar to the sigmoid function, but its output ranges from -1 to 1, so the function's value can be negative.

<img src="images/ann/tanh_function.png" height="50%" width="50%"></img>
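The four activation functions above can be sketched in a few lines of NumPy. This is only an illustration: the function names are chosen here, and the threshold function assumes a cutoff at 0.

```python
import numpy as np

def threshold(x):
    # 1 if the weighted sum is at least 0 (assumed cutoff), otherwise 0
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):
    # Smooth curve between 0 and 1
    return 1.0 / (1.0 + np.exp(-x))

def rectifier(x):
    # 0 for negative values, x itself for positive values
    return np.maximum(0.0, x)

def tanh(x):
    # Smooth curve between -1 and 1
    return np.tanh(x)

# x plays the role of the weighted sum of the input values
x = np.array([-2.0, 0.0, 2.0])
for fn in (threshold, sigmoid, rectifier, tanh):
    print(fn.__name__, fn(x))
```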

# How Do Neural Networks Work?
Let's learn how neural networks actually work.

### Shallow Neural Network
Machine learning algorithms without deep learning can be modelled as shown below.

<img src="images/ann/basic_neural_network.png" height="50%" width="50%"></img>

This neural network is very basic: there are only independent variables (input layer), parameter tuning variables (weights), and a dependent variable (output layer). Many machine learning models that do not involve deep learning, such as linear and logistic regression, can be viewed this way.

In deep learning, there are additional "hidden" layers between the input and output layers, which can increase the accuracy of the model.

### Deep Learning Neural Network
Let's assume a neural network has already been trained and observe how it works.

The neural network below is trying to predict the price of a house based on area, bedrooms, distance to city, and age.

<img src="images/ann/neural_network_house_price.png" height="50%" width="50%"></img>

Each neuron in the hidden layer accepts only some of the input values, because the weights on its synapses (signals) determine whether or not a signal is significant enough for that neuron.

For example, the middle neuron in the hidden layer focuses on only the "Area", "Bedrooms", and "Age" input values. Perhaps the trained neuron determined that younger people prefer a large area and lots of bedrooms, so it only accepts the signals from those input values to determine whether the criteria are met.

Another example is the last neuron in the hidden layer, which focuses on only the "Age". Perhaps the neuron determined that a house older than 100 years is priced significantly higher for historical reasons. This is a good example of when to use the rectifier activation function: if the age is over 100 years, the neuron's contribution to the output increases with age, and otherwise its contribution is 0.
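The "Age" neuron can be sketched as a rectifier applied to a shifted input. The function name `age_neuron`, the weight of 1, and the offset of -100 are hypothetical values picked to match the 100-year example; a trained network would learn its own values, and it would normally work on feature-scaled inputs rather than raw years.

```python
def age_neuron(age_years, weight=1.0, offset=-100.0):
    # Rectifier: contributes nothing until the house is older than 100 years,
    # then the contribution grows linearly with age
    return max(0.0, weight * age_years + offset)

print(age_neuron(30))    # a 30-year-old house contributes 0.0
print(age_neuron(120))   # a 120-year-old house contributes 20.0
```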

Together, all the neurons can be used to predict the price of a house as seen in the output layer.
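The whole prediction can be sketched as one forward pass through the network. Everything numeric here is made up for illustration: the input values, the number of hidden neurons (three), and all weights. A weight of 0 is how this sketch represents a neuron ignoring an input.

```python
import numpy as np

def rectifier(x):
    return np.maximum(0.0, x)

# Inputs: area, bedrooms, distance to city, age (hypothetical, feature scaled)
x = np.array([1.0, 0.5, -0.3, 2.0])

# Hypothetical hidden-layer weights: one row per hidden neuron,
# one column per input. A weight of 0 means the input is ignored.
W_hidden = np.array([
    [0.6, 0.0, -0.4, 0.0],   # uses area and distance to city
    [0.5, 0.3, 0.0, -0.2],   # uses area, bedrooms, and age
    [0.0, 0.0, 0.0, 0.8],    # uses age only
])

# Hypothetical weights from the hidden neurons to the output neuron
w_output = np.array([1.0, 2.0, 0.5])

# Each hidden neuron applies the rectifier to its weighted sum
hidden = rectifier(W_hidden @ x)

# The output layer combines the hidden neurons into one predicted value
price = np.dot(w_output, hidden)

print(hidden, price)
```

The printed value is on an arbitrary scale, since the weights are not trained; the point is only how the layers combine.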