# Artificial Neural Networks
https://en.wikipedia.org/wiki/Artificial_neural_network  
Artificial neural networks (ANNs) are a popular class of machine learning models inspired by the biological neural networks in our brains. As with other machine learning models, fitting a neural network to input data with labeled outputs lets it "learn" the function that maps inputs to outputs, and then generalize to new examples.  

This demo focuses on visualizing what a neural network looks like and how it operates, and gives examples of how to create and train one.

## Tools

In [None]:
# Used for array calculations
import numpy as np

In [None]:
# Used to define and train neural network models
from tensorflow import keras

In [None]:
# Used to visualize graphs (in this case: neural networks)
import networkx as nx

In [None]:
import itertools

## Anatomy of Simple Neural Networks
https://en.wikipedia.org/wiki/Vector_(mathematics_and_physics)  
https://en.wikipedia.org/wiki/Tensor  
https://en.wikipedia.org/wiki/Vectorization_(mathematics)  
As with most machine learning models, data needs to be **vectorized**, i.e. turned into an array of numbers, so that the computer can work with it.  
Sometimes we might change the shape of the vector to be a multidimensional array instead of a simple array. We may then think of the vector as a **tensor**.  
For the purposes of this demo we will only be looking at data in the form of simple vectors, i.e. 1D arrays.  
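
As a small illustration of the vector/tensor distinction, the sketch below flattens a 2D array into a 1D vector and reshapes it back. The `image` array and its values are made up for this example and are not part of the demo's data.

```python
import numpy as np

# Hypothetical data: a 2x3 grayscale "image" stored as a 2D array (a rank-2 tensor)
image = np.array([[0.0, 0.5, 1.0],
                  [0.2, 0.4, 0.6]])

# Vectorize: flatten the 2D array into a simple 1D array of 6 numbers
vector = image.reshape(-1)

# The same numbers can be reshaped back into the original 2D form
restored = vector.reshape(2, 3)

print(image.shape, vector.shape)
```

Reshaping changes only how the numbers are indexed, not the numbers themselves.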

A neural network takes an **input vector** and transforms it, via a series of linear and non-linear transformations, into an **output vector**.
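
To make "a series of linear and non-linear transformations" concrete, here is a minimal NumPy sketch of a forward pass through fully connected layers with sigmoid activations. The helper names `sigmoid` and `forward`, the random weights, and the 2-8-1 layer sizes are assumptions chosen for illustration; this is not how Keras implements it internally.

```python
import numpy as np

def sigmoid(z):
    """Non-linear activation that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Apply each layer in turn: a linear map (a @ W + b), then the sigmoid."""
    a = x
    for W, b in params:
        a = sigmoid(a @ W + b)
    return a

# Hypothetical parameters for a 2 -> 8 -> 1 network (random weights, zero biases)
rng = np.random.default_rng(0)
params = [
    (rng.normal(size=(2, 8)), np.zeros(8)),  # hidden layer
    (rng.normal(size=(8, 1)), np.zeros(1)),  # output layer
]

x = np.array([0.5, -1.0])  # input vector
y = forward(x, params)     # output vector with a single entry
print(y)
```

Each weight matrix has shape (inputs, outputs), so the output of one layer has exactly the size the next layer expects as input.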

In [None]:
# Define the NN's layers 
layers = [] # Input layer is simply defined by an input shape
layers.append(keras.layers.Dense(8, activation='sigmoid', input_shape=(2,))) # 1st Hidden layer
layers.append(keras.layers.Dense(1, activation='sigmoid')) # Output layer

# Join the layers together
model = keras.Sequential(layers)

In [None]:
# Create a graph object from the NN
NN_graph = nx.DiGraph()
for i_layer, layer in enumerate(model.layers):
    # Each Dense layer stores a weight matrix of shape (inputs, outputs) and one bias per output
    weights, biases = layer.get_weights()
    for (i_input, i_output), weight in np.ndenumerate(weights):
        input_label = f"{i_layer}:{i_input}"
        output_label = f"{i_layer+1}:{i_output}"
        if i_input == 0:
            # Attach the bias to each output node once
            NN_graph.add_node(output_label, bias=biases[i_output])
        NN_graph.add_edge(input_label, output_label, weight=weight)

In [None]:
# Layout and display the graph
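
One possible way to lay out such a graph, sketched under assumptions: place each layer in its own column and spread its nodes vertically. The helper `layered_positions` is hypothetical (not part of networkx) and assumes the `"layer:index"` node labels used when building the graph, with layer sizes 2, 8 and 1.

```python
def layered_positions(layer_sizes):
    """Map node labels "layer:index" to (x, y) coordinates, one column per layer."""
    positions = {}
    for i_layer, size in enumerate(layer_sizes):
        for i_node in range(size):
            # Center each column vertically around y = 0
            y = (size - 1) / 2 - i_node
            positions[f"{i_layer}:{i_node}"] = (float(i_layer), y)
    return positions

# Input layer of 2 nodes, hidden layer of 8 nodes, output layer of 1 node
pos = layered_positions([2, 8, 1])
print(pos["0:0"], pos["2:0"])
```

With networkx and matplotlib available, a dictionary like this could be passed as the `pos` argument of `nx.draw`, e.g. `nx.draw(NN_graph, pos=pos, with_labels=True)`.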


In [None]:
# Display the model weights and biases as tensors
for i,layer in enumerate(model.layers):
    weights = layer.get_weights()[0]
    biases = layer.get_weights()[1]
    print("Layer {} Weights:".format(i))
    print(weights)
    print("Layer {} Biases:".format(i))
    print(biases)
    print('')

## Training
https://en.wikipedia.org/wiki/Universal_approximation_theorem  
https://en.wikipedia.org/wiki/Machine_learning  
https://en.wikipedia.org/wiki/Mathematical_optimization  
The training, or learning, of a machine learning model is accomplished by optimizing the model's parameters to minimize the error between the model's predicted values and the true labeled values from the training data.  

Mathematically we can write training as solving the following problem:  
$$ \min_{\vec{w}} \sum_{i} C \left (  f \left ( \vec{X}_{i}, \vec{w} \right ),  \vec{y}_{i} \right ) $$
Where:  
$\vec{w}$ is the vector of parameters (weights) of the model.  
$\vec{X}_{i}$ is the input vector of the $i$th datapoint in the training set.  
$\vec{y}_{i}$ is the output vector (label) of the $i$th datapoint in the training set.  
$f$ is the function evaluated by the model.  
$C$ is the loss (cost) function.  

As one can see, this is the classic formulation of an optimization problem.
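
The formulation above can be illustrated with a minimal gradient descent sketch. Everything here is an assumption made for illustration: the model $f$ is taken to be a simple linear map rather than a neural network, $C$ is the squared error, the data is synthetic, and the learning rate and step count are arbitrary. It is not the optimizer Keras uses.

```python
import numpy as np

def f(X, w):
    """Hypothetical model: a linear map from each input vector to a scalar prediction."""
    return X @ w

def total_loss(w, X, y):
    """Sum over all datapoints of the squared-error cost C(f(X_i, w), y_i)."""
    return np.sum((f(X, w) - y) ** 2)

def loss_gradient(w, X, y):
    """Gradient of total_loss with respect to the parameters w."""
    return 2.0 * X.T @ (f(X, w) - y)

# Synthetic training set: 50 input vectors with labels from known "true" weights
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
w_true = np.array([1.5, -0.5])
y = X @ w_true

# Gradient descent: repeatedly step against the gradient of the summed loss
w = np.zeros(2)
initial_loss = total_loss(w, X, y)
learning_rate = 0.005
for _ in range(200):
    w = w - learning_rate * loss_gradient(w, X, y)
final_loss = total_loss(w, X, y)

print(initial_loss, final_loss, w)
```

For a neural network the same loop applies in principle, but the gradient is computed by backpropagation and the loss surface is no longer a simple bowl.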