## Deep learning playground

**What are the basic building blocks?**

* **Artificial neurons** - elementary units that combine their inputs to produce an output (the activation)
* **Activation functions** - non-linear transformations applied to the neurons' outputs
* **Network layers** - groups of artificial neurons that organize the network into successive stages
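
As a rough illustration of how these three pieces fit together, here is a minimal NumPy sketch. The weights, biases and input values are made up for the example, and `neuron` and `layer` are hypothetical helper names, not part of any library.

```python
import numpy as np

def neuron(x, w, b):
    """Weighted sum of the inputs plus a bias (before any activation)."""
    return np.dot(w, x) + b

def layer(x, W, b, activation=np.tanh):
    """A layer: one row of W (and one entry of b) per neuron, then the activation."""
    return activation(W @ x + b)

x = np.array([1.0, -2.0])            # two input features
w = np.array([0.5, 0.25])            # illustrative weights
print(neuron(x, w, b=0.1))           # computes 0.5*1 + 0.25*(-2) + 0.1

W = np.array([[0.5, 0.25],
              [1.0, 0.0],
              [0.0, 1.0]])           # 3 neurons, 2 inputs each
b = np.zeros(3)
hidden = layer(x, W, b)              # shape (3,), tanh keeps values in (-1, 1)
print(hidden)
```

Stacking several such layers, each followed by a non-linear activation, gives the kind of network explored in the playground.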

"Tinker with a Neural Network in your browser" - [playground.tensorflow.org][tf-playground]

<a href="https://playground.tensorflow.org">
    <img src="images/tf-playground.png" width="400px" />
</a>

[tf-playground]:https://playground.tensorflow.org

## Import libraries

In [None]:
# Load libraries
import torch
print("Torch version:", torch.__version__)

import torchvision
print("Torchvision version:", torchvision.__version__)

import matplotlib
print("Matplotlib version:", matplotlib.__version__)

import seaborn as sns
print("Seaborn version:", sns.__version__)

import IPython
print("IPython version:", IPython.__version__)

import numpy as np
print("Numpy version:", np.__version__)

import sklearn
print("Scikit-learn version:", sklearn.__version__)

In [None]:
# Setup Matplotlib and Seaborn
%matplotlib inline
#%config InlineBackend.figure_format = 'retina' # If you have a retina screen
import matplotlib.pyplot as plt

sns.set() # Activate Seaborn default style
blue, green, red = sns.color_palette()[:3] # Color palette

## Circle data set

In [None]:
from sklearn.datasets import make_circles
from sklearn.preprocessing import scale

# Generate circle data set
X, y = make_circles(
    n_samples=200, shuffle=True, noise=0.1, random_state=0, factor=0.3)

# Rescale data
X = scale(X)

# Plot data points
plt.figure(figsize=(4, 4))
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
plt.gca().set_aspect('equal', adjustable='box')
plt.show()

## Decision surface

**What is the decision surface?**

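* The function that assigns a score, and hence a predicted class, to every point of the input space; it is what we want to learn in this **classification** task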
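
To make this concrete, here is a small sketch that evaluates a scoring function on a grid and turns its sign into a class. The `score` function is a hypothetical stand-in for a trained model, and the grid range and size are arbitrary.

```python
import numpy as np

def score(points):
    """Hypothetical scoring function: positive to the right of the line x1 = 0."""
    return points[:, 0]

# Evaluate the score on a regular grid covering the input space
x_values = np.linspace(-2, 2, num=5)
y_values = np.linspace(-2, 2, num=5)
xx, yy = np.meshgrid(x_values, y_values)
points = np.c_[xx.ravel(), yy.ravel()]     # shape (25, 2)

zz = score(points).reshape(xx.shape)       # decision surface, shape (5, 5)
predicted_class = (zz > 0).astype(int)     # class 1 where the score is positive
print(predicted_class)
```

The decision boundary is the set of points where the score crosses zero, which is why a contour drawn at level 0 separates the two classes.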

In [None]:
import warnings

# Plot the decision surface
def decision_surface(x1, x2, y, axis, predict_fn, n=100):
    # Same scale for x- and y-axis
    axis.set_aspect('equal', adjustable='box')

    # Plot data points
    class1_idx = (y == 1)
    styling = {'edgecolors': 'white', 'linewidth': 0.5, 'zorder': 10}
    axis.scatter(x1[class1_idx], x2[class1_idx], color=red, label='class 1', **styling)
    axis.scatter(x1[~class1_idx], x2[~class1_idx], color=blue, label='class 0', **styling)

    if predict_fn is not None:
        # Generate grid
        xlim, ylim = axis.get_xlim(), axis.get_ylim()
        x_values = np.linspace(*xlim, num=n)
        y_values = np.linspace(*ylim, num=n)
        xx, yy = np.meshgrid(x_values, y_values)
        points = np.c_[xx.flatten(), yy.flatten()]
        
        # Compute predictions
        preds = predict_fn(points)
        zz = np.array(preds).reshape(xx.shape)

        # Draw decision boundary
        with warnings.catch_warnings(): 
            # Matplotlib throws UserWarnings when there are no contour lines to draw
            warnings.simplefilter('ignore', category=UserWarning)
            axis.contour(xx, yy, zz, levels=[0], colors='gray', zorder=1)

        # Plot decision surface
        axis.imshow(zz, alpha=0.2, origin='lower', extent=[*xlim, *ylim], vmin=-1, vmax=1, cmap=plt.cm.coolwarm, zorder=1, aspect='auto')
        
    # Add labels
    axis.legend(frameon=True, facecolor='white').set_zorder(20)

In [None]:
# Plot the data
fig = plt.figure(figsize=(4, 4))
axis = fig.gca()

f = lambda X: X[:, 0]
decision_surface(X[:, 0], X[:, 1], y, axis, predict_fn=f)
plt.show()

> **Task:** Improve the decision boundary! For example, try the Euclidean norm `np.linalg.norm(X, ord=2, axis=1)`

## Deep learning model

**How do neural networks learn?**

* **Loss function** - A way to quantify how much error the model makes
* **Gradient descent** - A way to adjust our model to decrease the loss value

**What does the network learn?**

* **Parameters** - a set of weights and biases
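
These ideas can be sketched on a toy problem with a single parameter. This is a minimal NumPy illustration, not how PyTorch implements it: the data, the learning rate and the names `loss_fn` and `grad_fn` are assumptions made for the example.

```python
import numpy as np

# Toy data generated from y = 3 * x, so the best weight is w = 3
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

def loss_fn(w):
    """Mean squared error of the one-parameter model y_hat = w * x."""
    return np.mean((w * x - y) ** 2)

def grad_fn(w):
    """Derivative of the loss with respect to w."""
    return np.mean(2 * (w * x - y) * x)

w = 0.0                   # initial parameter
lr = 0.01                 # learning rate (illustrative value)
losses = []
for step in range(200):
    losses.append(loss_fn(w))
    w -= lr * grad_fn(w)  # step against the gradient to reduce the loss

print(w, losses[0], losses[-1])
```

In a real network the gradient is not derived by hand: backpropagation computes it for every weight and bias.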

Set of videos about deep learning by Grant Sanderson - in particular [Chapter 3: backpropagation][backprop-video]

<a href="https://youtu.be/Ilg3gGewQ5U?t=3m46s">
    <img src="https://img.youtube.com/vi/Ilg3gGewQ5U/maxresdefault.jpg" width="400px" />
</a>

[backprop-video]:https://youtu.be/Ilg3gGewQ5U?t=3m46s

## How to build and train one in PyTorch?

What is PyTorch? Source - [pytorch.org/about][pytorch-about]

> PyTorch is a python package that provides two high-level features:
> * Tensor computation (like numpy) with strong GPU acceleration
> * Deep Neural Networks built on a tape-based autodiff system
>
> Usually one uses PyTorch either as:
> * A replacement for numpy to use the power of GPUs.
> * a deep learning research platform that provides maximum flexibility and speed

**PyTorch components**

<img src="images/pytorch-about.png" width="400px" />

[pytorch-about]:https://pytorch.org/about/

## Implement a Neural Network

**What are the steps to build a network in PyTorch?**

```python
# Create the model
...

for epoch in range(10**5):
    # Forward pass
    ...
    
    # Backpropagation
    ...
    
    if epoch%100 == 0:
        # Plot decision surface
        ...
```

PyTorch implementation

In [None]:
# Create model
model = torch.nn.Sequential(
    torch.nn.Linear(in_features=2, out_features=4),
    torch.nn.Tanh(),
    torch.nn.Linear(in_features=4, out_features=2),
)

# Criterion and optimizer for "training"
criterion = torch.nn.CrossEntropyLoss() # Classification
optimizer = torch.optim.SGD(model.parameters(), lr=0.03)

def forward(X):
    # Convert the NumPy input to a float tensor and pass it to the network
    # (torch.autograd.Variable is deprecated: tensors track gradients directly)
    X_tensor = torch.as_tensor(X, dtype=torch.float32)
    output = model(X_tensor)
    
    return output

def backpropagation(output):
    # Clear the gradients
    optimizer.zero_grad()
    
    # Compute error (CrossEntropyLoss expects integer class labels)
    y_tensor = torch.as_tensor(y, dtype=torch.long)
    loss = criterion(output, y_tensor)
    
    # Backpropagation
    loss.backward()
    
    # Let the optimizer adjust our model
    optimizer.step()
    
    # Return the loss as a plain Python float
    return loss.item()
    
# Create a figure to visualize the results
# note: you can reduce the figure size if it's too slow
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))
    
try:
    loss_values = []
    
    for epoch in range(10**5):
        # Forward pass
        output = forward(X)

        # Backpropagation
        loss = backpropagation(output)
        loss_values.append(loss)

        if epoch%100 == 0:
            # Plot decision surface
            ax1.cla()
            ax1.set_title('Epoch {}'.format(epoch))
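            # Plot the logit difference (class 1 minus class 0): its zero level is the decision boundary
            logit_diff = lambda X: (forward(X)[:, 1] - forward(X)[:, 0]).detach()
            decision_surface(X[:, 0], X[:, 1], y, ax1, logit_diff)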
            ax2.cla()
            ax2.set_title('Loss')
            ax2.plot(loss_values)

            # Jupyter trick
            IPython.display.clear_output(wait=True)
            IPython.display.display(fig)

except KeyboardInterrupt:
    # Clear output
    IPython.display.clear_output()
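
For intuition about the criterion used above, here is a NumPy sketch of what a cross-entropy loss on raw network outputs computes. `cross_entropy` is a hypothetical helper written for illustration; it should agree with `torch.nn.CrossEntropyLoss` under its default mean reduction, but it is not how PyTorch implements it.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean negative log-probability of the correct class.

    logits: array of shape (n_samples, n_classes), raw network outputs
    targets: integer class labels of shape (n_samples,)
    """
    # Subtract the row-wise maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    # Log-softmax: log of the normalized class probabilities
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample
    return -log_probs[np.arange(len(targets)), targets].mean()

logits = np.array([[2.0, -1.0],    # confident in class 0
                   [0.0, 0.0]])    # undecided between the two classes
targets = np.array([0, 1])
loss = cross_entropy(logits, targets)
print(loss)
```

A confident, correct prediction contributes a loss close to zero, while an undecided one contributes log(2) in the two-class case.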

## Moons data set

In [None]:
from sklearn.datasets import make_moons

# Moons data set
X, y = make_moons(
    n_samples=200, shuffle=True, noise=0.1, random_state=0)

# Rescale data
X = scale(X)

# Plot data points
fig = plt.figure(figsize=(4, 4))
decision_surface(X[:, 0], X[:, 1], y, fig.gca(), predict_fn=None)

> **Task:** Implement a network with 2 hidden layers, "ReLU" activations and train it with a learning rate of 0.1 - [network in playground][moons-network]

[moons-network]:http://playground.tensorflow.org/#activation=relu&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.1&regularizationRate=0&noise=0&networkShape=6,4&seed=0.73171&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false

## Exclusive or data set

In [None]:
# "exclusive or" data set
X = np.random.uniform(low=-1, high=1, size=(200, 2))
# np.int was removed in NumPy 1.24; the built-in int is equivalent here
y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0).astype(int)

# Rescale data
X = scale(X)

# Plot data points
fig = plt.figure(figsize=(4, 4))
decision_surface(X[:, 0], X[:, 1], y, fig.gca(), predict_fn=None)

## Small challenge - visualize the output of a hidden unit

<!---
def f(X, layer_idx, unit_idx):
    activation = torch.autograd.Variable(torch.FloatTensor(X))
    for layer in model[:layer_idx]:
        activation = layer(activation)
    return activation[:, unit_idx].data
    
fig = plt.figure(figsize=(4, 4))
decision_surface(X[:, 0], X[:, 1], y, fig.gca(), predict_fn=lambda X: f(X, 1, 0))
-->

In [None]:
# TODO

## Additional resources

Nice visualizations

* "Neural network 3D simulation" - [Video by Denis Dmitriev][3d-simulation]

[3d-simulation]:https://www.youtube.com/watch?v=3JQ3hYko51Y

To go deeper

* Commonly used activation functions - [cs231n course][cs231-actfun]
* How the backpropagation algorithm works - [Michael Nielsen's book, Chapter 2][nndl-chap2]
* A visual proof that neural nets can compute any function - [Michael Nielsen's book, Chapter 4][nndl-chap4]

[cs231-actfun]:http://cs231n.github.io/neural-networks-1/#actfun
[nndl-chap2]:http://neuralnetworksanddeeplearning.com/chap2.html
[nndl-chap4]:http://neuralnetworksanddeeplearning.com/chap4.html