# Neural Networks

## Motivation

Neural Networks: General-purpose learning algorithm for modeling non-linearity

... if you train it with "enough" data

## Non-linear inputs

- Images
- Text
- Speech
- XOR

## Limitations of linear models

Not "linearly separable"

![xor](assets/neural/xor.png)

Can't draw boundary to separate x's and o's

## Modeling non-linearity

Transform $x$ into $\phi(x)$ so that the data becomes linearly separable

![xor](assets/neural/xor_phi.png)

$\phi(x)$ is the basis for a "neuron"
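
As a rough illustration of the idea, the sketch below applies one possible transform to the four XOR points. The particular choice $\phi(x_1, x_2) = (x_1 x_2,\ x_1 + x_2)$ and the hand-picked weights are assumptions made for this example, not something a network would necessarily learn.

```python
# XOR truth table: inputs and labels (1 = "x", 0 = "o").
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]


def phi(x):
    """Hand-picked transform (an assumption for illustration): (x1*x2, x1+x2)."""
    x1, x2 = x
    return (x1 * x2, x1 + x2)


def linear_classifier(z, w=(-2.0, 1.0), b=-0.5):
    """Predict 1 when w.z + b > 0. The weights and bias are chosen by hand."""
    score = w[0] * z[0] + w[1] * z[1] + b
    return 1 if score > 0 else 0


# A single straight line in phi-space now separates the two classes.
predictions = [linear_classifier(phi(x)) for x in points]
print(predictions)  # [0, 1, 1, 0]
```

No choice of `w` and `b` gives the same result when the classifier is applied to the raw points directly, which is what "not linearly separable" means.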

## Neuron

$$y = W\phi(x) + b$$

$$\phi(x) = g(W'x + b')$$

Trainable: $W', b', W, b$

$g(x)$ is a non-linear function, e.g. Sigmoid
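
The two equations above can be sketched as a forward pass in plain Python. The weight values here are made up for illustration (in practice $W', b', W, b$ are learned during training), and the helper names `matvec` and `neuron_layer` are hypothetical.

```python
import math


def sigmoid(z):
    """The non-linear function g."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def neuron_layer(x, W_prime, b_prime, W, b):
    # phi(x) = g(W'x + b')
    phi = [sigmoid(z + bp) for z, bp in zip(matvec(W_prime, x), b_prime)]
    # y = W phi(x) + b
    return [z + bi for z, bi in zip(matvec(W, phi), b)]


# Illustrative (untrained) parameters: 2 inputs -> 2 hidden values -> 1 output.
W_prime = [[1.0, -1.0], [-1.0, 1.0]]
b_prime = [0.0, 0.0]
W = [[1.0, 1.0]]
b = [-1.0]

y = neuron_layer([1.0, 0.0], W_prime, b_prime, W, b)
print(y)
```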

## Neuron (Perceptron)

![neuron](assets/neural/neuron.png)

(image: Neural Network Methods in Natural Language Processing, Goldberg, 2017)

## Neural Network

Multiple neurons in 1 layer make up an "Artificial Neural Network"

![neural network](assets/neural/300px-Colored_neural_network.svg.png)

(image: [Wikipedia](https://en.wikipedia.org/wiki/Artificial_neural_network))

## Neural Network (Deep)

Multiple "hidden" layers of neurons make up a "Deep Neural Network"

![multi-layer perceptron](assets/neural/deep_nn.png)

(image: Goldberg, 2017)

## Properties of a Neural Network

|Term|Description|Examples|
|--|--|--|
|Input dimension|How many inputs|4|
|Output dimension|How many outputs|3|
|Number of hidden layers|Number of layers, excluding input and output|2|
|Activation type|Type of non-linear function|sigmoid, ReLU, tanh|
|Hidden layer type|How the neurons are connected together|Fully-connected, Convolutional|

## Activation types

What non-linearity is applied

![activations](assets/neural/activations.png)

(image: Goldberg, 2017)
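
For reference, the three activation types named in the properties table (sigmoid, tanh, ReLU) can be written in a few lines of plain Python. This is a minimal sketch for scalar inputs; libraries such as Keras apply the same functions element-wise to whole arrays.

```python
import math


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Passes positive values through unchanged and clips negatives to 0."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```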

## Layer types

How the neurons are connected together, and what operations are performed with x, W, and b:

- Dense
- Convolutional
- Recurrent
- Residual

More detail to come...

## Walkthrough: Neural Network Architectures in Keras

In this walkthrough, we will use Keras to examine the architecture of some well-known neural networks.

### Setup - Graphviz

We'll be installing Graphviz for visualizing the architectures.

1. Download and install the Graphviz binaries from: https://graphviz.gitlab.io/download/
2. Add the Graphviz `bin` directory to your PATH environment variable, e.g. `C:/Program Files (x86)/Graphviz2.38/bin`

### Setup - Conda environment

1. Create a new conda environment called `mldds03`:
   - Launch an `Anaconda Python` command window
   - Run `conda create -n mldds03 python=3`
2. Activate the conda environment: `conda activate mldds03`
3. Install: `conda install jupyter keras pydot`
4. Navigate to the courseware folder: `cd mldds-courseware`
5. Launch Jupyter: `jupyter notebook` and open this notebook
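
Once the environment is set up, a quick way to confirm that the Graphviz `dot` executable is reachable from Python is to look it up on the PATH. This is only a convenience check using the standard library; the helper name `find_graphviz_dot` is made up for this sketch.

```python
import shutil


def find_graphviz_dot():
    """Return the full path to the `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_graphviz_dot()
if dot_path is None:
    print("Graphviz 'dot' was not found on PATH; revisit the Graphviz setup steps.")
else:
    print("Found Graphviz 'dot' at:", dot_path)
```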

### Pre-trained Neural Networks in Keras

"Pre-trained" neural networks are available under `keras.applications`

https://keras.io/applications/

These are trained on the ImageNet dataset (http://www.image-net.org/), which contains millions of images.

Many of the neural network architectures in Keras come from previous years' submissions to the annual ImageNet challenge.

In [None]:
import keras

print(keras.__version__)

### MobileNet

MobileNet is a pre-trained ImageNet DNN optimized to run on smaller devices.

Keras documentation: https://keras.io/applications/#mobilenet

The documentation page links to the original research paper that proposed this network architecture.

In [None]:
from keras.applications import mobilenet

mobilenet_model = mobilenet.MobileNet(weights='imagenet')
mobilenet_model.summary()

### ResNet50

ResNet50 is another pre-trained ImageNet DNN. This is a larger network than MobileNet (almost 26 million parameters). It improves accuracy by introducing residual connections, which are connections that skip layers.

https://keras.io/applications/#resnet50
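
To make "connections that skip layers" concrete, here is a minimal sketch of a residual connection in plain Python. The `layer` function is a hypothetical stand-in (scale, shift, then ReLU) and is far simpler than the convolutional blocks used in ResNet50; the point is only that the block's input is added back to the layer's output.

```python
def layer(x, weight=0.5, bias=0.1):
    """A stand-in for one or more layers (hypothetical: scale, shift, then ReLU)."""
    return [max(0.0, weight * xi + bias) for xi in x]


def residual_block(x):
    """Output = layer(x) + x: the input skips past the layer and is added back."""
    return [fi + xi for fi, xi in zip(layer(x), x)]


x = [1.0, -2.0, 3.0]
print(layer(x))           # what the layer alone produces
print(residual_block(x))  # layer output plus the skipped input
```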

In [None]:
from keras.applications import resnet50

resnet_model = resnet50.ResNet50(weights='imagenet')
resnet_model.summary()

### Creating Neural Networks using Keras

Finally, let's try something simpler.

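Let's create a 1-layer network: a single `Dense` unit with 4 inputs. With a linear activation this would be plain linear regression; the sigmoid activation used here additionally squashes the output into the range (0, 1).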

In [None]:
# Reference: https://gist.github.com/fchollet/b7507f373a3446097f26840330c1c378
from keras.models import Sequential
from keras.layers import Dense

simple_model = Sequential()
simple_model.add(Dense(1, input_dim=4, activation='sigmoid')) # 4 inputs, 1 output
simple_model.compile(optimizer='rmsprop', loss='mse')

simple_model.summary()
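
The parameter count that `summary()` reports for a `Dense` layer can be checked by hand: each unit has one weight per input plus one bias. The helper `dense_param_count` is a hypothetical name used only for this sketch.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of Dense layers.

    layer_sizes lists the input dimension followed by the number of
    units in each layer, e.g. [4, 1] for 4 inputs feeding 1 unit.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights + biases
    return total


print(dense_param_count([4, 1]))  # 5
```

The same arithmetic applies to any stack of fully-connected layers, so it is a handy sanity check when reading larger summaries.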

In [None]:
keras.models.Sequential?

In [None]:
keras.layers.Dense?

In [None]:
keras.Model.compile?

How about a 2-layer network to make it a deep neural network?

In [None]:
from keras.layers import Activation

deeper_model = Sequential()

# imagine we tuned hyperparameters and derived this magic architecture to solve all our problems
deeper_model.add(Dense(12, input_dim=42, activation='relu')) # 42 inputs, 12 outputs
deeper_model.add(Dense(4, activation='relu')) # 12 inputs, 4 outputs
deeper_model.add(Activation("softmax")) # turn the 4 outputs into class probabilities
# a 4-way softmax output pairs with categorical (not binary) cross-entropy
deeper_model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

deeper_model.summary()
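
The final softmax activation in this model turns its 4 outputs into values that are positive and sum to 1, so they can be read as class probabilities. A minimal plain-Python sketch follows; the input scores are made up, and subtracting the maximum first is a common numerical-stability trick rather than something specific to Keras.

```python
import math


def softmax(scores):
    """Convert a list of real-valued scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # shift to avoid overflow
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1, -1.0])
print(probs)
print(sum(probs))
```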

### Visualizing Neural Net Architectures in Keras

https://keras.io/visualization/

In [None]:
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

model_to_dot?

In [None]:
SVG(model_to_dot(simple_model, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
SVG(model_to_dot(deeper_model, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
SVG(model_to_dot(mobilenet_model, show_shapes=True).create(prog='dot', format='svg'))

In [None]:
SVG(model_to_dot(resnet_model, show_shapes=True).create(prog='dot', format='svg'))

## Training a neural network

A neural network is trained using:
- Stochastic Gradient Descent (we covered this in Training Basics)
- Back Propagation
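
The two ingredients can be seen together in a deliberately tiny sketch: a single sigmoid neuron trained on four made-up data points. Back propagation here is just the chain rule applied to one neuron, and stochastic gradient descent updates the parameters after each sample. The data, learning rate, and epoch count are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_squared_error(w, b, data):
    return sum((sigmoid(w * x + b) - y) ** 2 for x, y in data) / len(data)


def train(data, epochs=200, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)  # "stochastic": visit samples in random order
        for x, y in samples:
            # Forward pass: z = w*x + b, y_hat = sigmoid(z), loss = (y_hat - y)^2
            y_hat = sigmoid(w * x + b)
            # Backward pass (chain rule): dL/dz = dL/dy_hat * dy_hat/dz
            dz = 2.0 * (y_hat - y) * y_hat * (1.0 - y_hat)
            # Gradient descent step: dL/dw = dz * x, dL/db = dz
            w -= learning_rate * dz * x
            b -= learning_rate * dz
    return w, b


# Toy data: negative inputs belong to class 0, positive inputs to class 1.
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
loss_before = mean_squared_error(0.0, 0.0, data)
w, b = train(data)
loss_after = mean_squared_error(w, b, data)
print(loss_before, loss_after)
```

In a multi-layer network the same chain rule is applied layer by layer, from the output back towards the input.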

## Back Propagation

## Reading List

|Material|Read it for|URL|
|--|--|--|
|Lecture 1: Deep Learning Challenge. Is There Theory?|Intro to Deep Learning|https://stats385.github.io/lecture_slides (lecture 1)|
|Lecture 2: Overview of Deep Learning from a Practical Point of View|More background on Neural Nets|https://stats385.github.io/lecture_slides (lecture 2)|
|Guide to the Sequential Model|Basic usage of Keras for neural net training|https://keras.io/getting-started/sequential-model-guide/|