# Neural Networks: part 1, blocks

The necessary ingredients for training a neural network are:

* model
* loss
* optimizer
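
In PyTorch, these three ingredients map onto a handful of objects. The block below is a minimal sketch, assuming a linear model, cross-entropy loss, and plain SGD with a learning rate of 0.01. All three are chosen here only for illustration, and a random batch stands in for real data.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Model: maps 784 input features to 10 class scores (logits).
model = nn.Linear(784, 10)

# Loss: cross-entropy between logits and integer class labels.
loss_fn = nn.CrossEntropyLoss()

# Optimizer: updates the model parameters using their gradients.
optimizer = optim.SGD(model.parameters(), lr=0.01)

# One training step on a random batch (a stand-in for real data).
x = torch.rand(32, 784)
y = torch.randint(0, 10, (32,))

optimizer.zero_grad()          # clear gradients from any previous step
loss = loss_fn(model(x), y)    # forward pass
loss.backward()                # backward pass: compute gradients
optimizer.step()               # parameter update

loss_before = loss.item()
loss_after = loss_fn(model(x), y).item()
print(loss_before, loss_after)
```

Swapping the model, the loss, or the optimizer leaves the four-line training step unchanged, which is why the rest of the notebook only varies the model.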
    
    
Your task is to experiment with NN architectures for solving FashionMNIST.

The goals are to:

* Get familiar with torch.nn
* Get intuition about how NNs work
* Get familiar with various basic building blocks
* Understand overfitting/underfitting
* Understand the role of the validation set

When you are done, compile a simple PDF report (e.g. copy and paste figures into a Google Doc and save it as a PDF) and put it into the Dropbox folder.
    
<img width=400 src="https://github.com/gmum/nn2018/raw/master/lab/fig/4/smoothie.png">

Refs:

* Great introduction to Convolutional networks: http://cs231n.github.io/convolutional-networks/
* Interpreting neural networks: https://distill.pub/2018/building-blocks/
* Playground used in the notebook: http://playground.tensorflow.org/

# Setup

In [3]:
%load_ext autoreload
%autoreload 2

import numpy as np
import tqdm
import json

import torch
import torch.nn.functional as F

from torch import optim
from torch import nn
from torch.autograd import Variable

from keras.datasets import fashion_mnist

%matplotlib inline
import matplotlib.pylab as plt
import matplotlib as mpl

from torch.autograd import gradcheck

mpl.rcParams['lines.linewidth'] = 2
mpl.rcParams['figure.figsize'] = (7, 7)
mpl.rcParams['axes.titlesize'] = 12
mpl.rcParams['axes.labelsize'] = 12

# Get FashionMNIST (see 1b_FMNIST.ipynb for data exploration)
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

# Flatten each 28x28 image into a 784-dim vector (linear layers expect 2D input)
x_train = x_train.reshape(-1, 784)
x_test = x_test.reshape(-1, 784)

# 0-1 normalization
x_train = x_train / 255.
x_test = x_test / 255.

# Convert to Torch Tensor. Just to avoid boilerplate code later
x_train = torch.from_numpy(x_train).type(torch.FloatTensor)
x_test = torch.from_numpy(x_test).type(torch.FloatTensor)
y_train = torch.from_numpy(y_train).type(torch.LongTensor)
y_test = torch.from_numpy(y_test).type(torch.LongTensor)

# Keep 1k examples each for train, valid (train indices 1000-2000) and test,
# just so the notebook runs faster
x_valid, y_valid = x_train[1000:2000], y_train[1000:2000]
x_train, y_train = x_train[0:1000], y_train[0:1000]
x_test, y_test = x_test[0:1000], y_test[0:1000]


# Gaining intuition

## DNN as building an increasingly complex decision boundary

There are many ways of thinking about DNNs, and developing an intuition for them is one of the goals of this course.

Go to http://playground.tensorflow.org/

## Interpreting neurons

Go to https://distill.pub/2018/building-blocks/

# Overfitting

<img width=400 src="https://github.com/gmum/nn2018/raw/master/lab/fig/5/overfit1.png">
<img width=400 src="https://github.com/gmum/nn2018/raw/master/lab/fig/5/overfit2.png">
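
One common way to spot overfitting in practice is to log training and validation loss per epoch and watch for the point where validation loss stops improving while training loss keeps falling. The sketch below assumes two such lists are already available. The helper names are hypothetical, and the numbers are made up for illustration; they are not measured results.

```python
def best_epoch(valid_loss):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(valid_loss)), key=valid_loss.__getitem__)


def generalization_gap(train_loss, valid_loss):
    """Per-epoch difference between validation and training loss."""
    return [v - t for t, v in zip(train_loss, valid_loss)]


# Made-up curves for illustration only, not measured results.
train_loss = [2.0, 1.2, 0.8, 0.5, 0.3, 0.2]
valid_loss = [2.1, 1.4, 1.1, 1.0, 1.1, 1.3]

gaps = generalization_gap(train_loss, valid_loss)
stop_at = best_epoch(valid_loss)
print("gap per epoch:", gaps)
print("lowest validation loss at epoch", stop_at)
```

A gap that keeps widening after the best validation epoch is the usual signature of overfitting. High loss on both sets points to underfitting instead.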

# Starting point

This section gives a basic model. Please adapt the training loop from the previous notebook yourself.

In [5]:
def build_simple_mlp(input_dim, output_dim):
    model = torch.nn.Sequential()
    model.add_module("linear_1", torch.nn.Linear(input_dim, 512, bias=False))
    model.add_module("nonlinearity_1", torch.nn.Sigmoid())
    model.add_module("linear_2", torch.nn.Linear(512, output_dim, bias=False))
    return model
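
For orientation, a full-batch training loop for this model could look roughly like the sketch below. It is not the loop from the previous notebook. The `train` and `accuracy` helpers, the Adam optimizer, the learning rate, and the number of epochs are all assumptions made for illustration. Random tensors (`x_tr`, `y_tr`, `x_va`, `y_va`) stand in for the FashionMNIST tensors so that the block runs on its own.

```python
import torch
from torch import nn, optim


def build_simple_mlp(input_dim, output_dim):
    # Same architecture as the basic model, repeated so this block is self-contained.
    return nn.Sequential(
        nn.Linear(input_dim, 512, bias=False),
        nn.Sigmoid(),
        nn.Linear(512, output_dim, bias=False),
    )


def accuracy(model, x, y):
    # Fraction of examples whose highest-scoring class matches the label.
    model.eval()
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()


def train(model, x_train, y_train, x_valid, y_valid, n_epochs=20, lr=1e-3):
    # Hypothetical helper: full-batch training that logs metrics every epoch.
    loss_fn = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    history = {"train_loss": [], "train_acc": [], "valid_acc": []}
    for _ in range(n_epochs):
        model.train()
        optimizer.zero_grad()
        loss = loss_fn(model(x_train), y_train)
        loss.backward()
        optimizer.step()
        history["train_loss"].append(loss.item())
        history["train_acc"].append(accuracy(model, x_train, y_train))
        history["valid_acc"].append(accuracy(model, x_valid, y_valid))
    return history


# Random stand-in data with the same shapes as the FashionMNIST tensors.
torch.manual_seed(0)
x_tr, y_tr = torch.rand(200, 784), torch.randint(0, 10, (200,))
x_va, y_va = torch.rand(100, 784), torch.randint(0, 10, (100,))

model = build_simple_mlp(784, 10)
history = train(model, x_tr, y_tr, x_va, y_va)
print(history["train_loss"][-1], history["valid_acc"][-1])
```

Plotting `history["train_acc"]` against `history["valid_acc"]` gives the curves needed for the experiments in the later sections.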

# Find architecture fitting XOR

Go to http://playground.tensorflow.org/ and select the second dataset.
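
The same experiment can be reproduced in PyTorch. The sketch below uses the four-point XOR truth table as a simplified stand-in for the playground's XOR dataset. The hidden size of 16, the tanh activation, the Adam optimizer, and the number of steps are assumptions, not values prescribed by this notebook.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# The four XOR points: the label is 1 when exactly one input is 1.
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# A single hidden layer is enough; a purely linear model cannot separate XOR.
model = nn.Sequential(
    nn.Linear(2, 16),
    nn.Tanh(),
    nn.Linear(16, 1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=0.05)

for _ in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Threshold the logits at 0 (probability 0.5) to get class predictions.
with torch.no_grad():
    preds = (model(x) > 0).long().flatten().tolist()
print(preds)
```

Removing the hidden layer and the nonlinearity turns this into logistic regression, which is a quick way to see why XOR needs at least one hidden layer.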

TODO: Improve this part

# Convolutional networks

We will have a separate lab on convolutions. A crash course on CNNs:

<img width=300 src="http://cs231n.github.io/assets/nn1/neural_net2.jpeg">

<img width=400 src="http://cs231n.github.io/assets/cnn/stride.jpeg">

CNN hyperparameters:

* Number of filters
* Filter size
* Stride (usually less important)
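
These hyperparameters can be exposed directly as arguments of a model builder. The sketch below assumes a single convolutional layer without padding, followed by ReLU, 2x2 max-pooling, and a linear classifier. `build_simple_cnn` is a hypothetical helper and not part of the notebook.

```python
import torch
from torch import nn


def build_simple_cnn(n_filters=16, filter_size=3, stride=1, output_dim=10):
    # Spatial size after the convolution (no padding) and after 2x2 pooling.
    conv_out = (28 - filter_size) // stride + 1
    pooled = conv_out // 2
    return nn.Sequential(
        nn.Conv2d(1, n_filters, kernel_size=filter_size, stride=stride),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(n_filters * pooled * pooled, output_dim),
    )


# The setup cell flattens images to 784 values; a CNN needs (N, 1, 28, 28).
x_flat = torch.rand(5, 784)          # stand-in for a slice of x_train
x_img = x_flat.view(-1, 1, 28, 28)

model = build_simple_cnn(n_filters=16, filter_size=3, stride=1)
logits = model(x_img)
print(logits.shape)
```

Because the setup cell stores the images flattened, the `view(-1, 1, 28, 28)` reshape is needed before any batch is passed to a convolutional model.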

Refs:

* Images from http://cs231n.github.io/convolutional-networks/
* How to create CNNs in PyTorch: https://github.com/vinhkhuc/PyTorch-Mini-Tutorials/blob/master/5_convolutional_net.py

# MLP vs CNN

Compare a good CNN to a good MLP, tuning the hyperparameters of each on the validation set.

# CNN: effect of filter size

Starting from a good CNN from the previous section, examine the effect of filter size.
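
When varying the filter size, it helps to know how it changes the spatial size of the feature map, since that determines the input size of the next layer. The sketch below uses the standard formula floor((W - F + 2P) / S) + 1, with input size W, filter size F, padding P, and stride S. `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size of a convolution: floor((W - F + 2P) / S) + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# Effect of filter size on a 28x28 FashionMNIST image (stride 1, no padding).
for f in (1, 3, 5, 7):
    out = conv_output_size(28, f)
    params_per_filter = f * f * 1 + 1  # weights for 1 input channel, plus a bias
    print(f"filter {f}x{f}: output {out}x{out}, {params_per_filter} parameters per filter")
```

Larger filters shrink the output faster and cost more parameters per filter, so filter size and the size of the following linear layer have to be changed together.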

# Depth 

Starting from the basic MLP, examine the effect of depth, going from 1 hidden layer to the deepest network you can train on your machine.
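
To run such a sweep, the basic MLP can be generalized so that depth becomes an argument. The sketch below assumes that every hidden layer has the same width and keeps the sigmoid nonlinearity and `bias=False` of the basic model. `build_deep_mlp` is a hypothetical helper.

```python
import torch
from torch import nn


def build_deep_mlp(input_dim, output_dim, depth, width=512):
    # `depth` hidden layers of equal width, each followed by a sigmoid.
    layers = []
    in_features = input_dim
    for _ in range(depth):
        layers.append(nn.Linear(in_features, width, bias=False))
        layers.append(nn.Sigmoid())
        in_features = width
    layers.append(nn.Linear(in_features, output_dim, bias=False))
    return nn.Sequential(*layers)


x = torch.rand(4, 784)  # stand-in batch
for depth in (1, 2, 4, 8):
    model = build_deep_mlp(784, 10, depth)
    n_linear = sum(isinstance(m, nn.Linear) for m in model)
    print(depth, n_linear, tuple(model(x).shape))
```

With `depth=1` this has the same layers as the basic MLP. Deep sigmoid networks are known to be hard to train, so swapping in a different nonlinearity is one thing worth trying if the deeper models stop learning.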

# Width

Starting from the basic MLP, examine the effect of width, going up to the widest network you can train on your machine.
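
A width sweep can be set up by turning the hidden size into an argument. The sketch below also counts parameters, because that count is what eventually limits how wide the network can get on a given machine. `build_wide_mlp`, `count_parameters`, and the list of widths are assumptions made for illustration.

```python
import torch
from torch import nn


def build_wide_mlp(input_dim, output_dim, width):
    # Same layout as the basic MLP, with the hidden width as a parameter.
    return nn.Sequential(
        nn.Linear(input_dim, width, bias=False),
        nn.Sigmoid(),
        nn.Linear(width, output_dim, bias=False),
    )


def count_parameters(model):
    # Total number of scalar weights in the model.
    return sum(p.numel() for p in model.parameters())


param_counts = {}
for width in (16, 64, 256, 1024, 4096):
    model = build_wide_mlp(784, 10, width)
    param_counts[width] = count_parameters(model)
    print(width, param_counts[width])
```

With a single hidden layer and no biases, the parameter count is `(784 + 10) * width`, so it grows linearly with width.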