# Tutorial 4: Flux Library: Convolutional Networks, Example Layers

In this tutorial we will cover the following:
 - What is a Convolutional Neural Network (See Hinton 2015)
 - Data sources for CNNs, using [MLDatasets](https://github.com/JuliaML/MLDatasets.jl) to load MNIST
 - A look at [Convolution Layers, and their arguments](https://github.com/FluxML/Flux.jl/blob/master/src/layers/conv.jl#L35)
 - Conv layers: padding, stride, width/height, activation, and determining what padding="same" would be
 - Convolutional Layers as dimensional manipulation
 - [Inverse Convolutional Networks](https://github.com/FluxML/Flux.jl/blob/master/src/layers/conv.jl#L71), possibly with VAE
 - Putting it all together with [MNIST Classifier](https://github.com/FluxML/model-zoo/blob/master/vision/mnist/conv.jl) and [VAE w/ Conv/Deconv](https://github.com/FluxML/model-zoo/blob/master/vision/mnist/vae.jl)

# Load Flux
We are just going to load Flux, plus MLDatasets for the data and Test for the koan checks.

In [None]:
using Flux
using MLDatasets
using Test

# What is a Convolutional Neural Network (See Hinton 2015)
A convolutional neural network (CNN) is a neural network that includes convolutional
transformations among its layers. CNNs are used for both classification and generative tasks.
The recent revolution in CNNs was spawned in the Hinton lab, where a CNN
outperformed alternative approaches on ImageNet.
[Krizhevsky 2012 paper on ImageNet (from Hinton's Lab)](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)

# Data Sources for Convolutional Neural Networks
We are going to use MNIST, which is available from MLDatasets.

In [None]:
test_x, test_y = MNIST.testdata()
train_x, train_y = MNIST.traindata()

train_x_size = (0,) # fix me!
train_y_size = (0,); # fix me !
@test size(train_x) == train_x_size
@test size(train_y) == train_y_size

# Reshaping to WHCN Format
For our MNIST dataset, we will want to work with image data that fits
the following standardized dimensions: (W, H, C, N),
where W is width in pixels, H is height in pixels, C is number of channels, and N is number of samples.
The array should have a floating-point element type, e.g. `Array{Float32, 4}`.
This koan asks you to sample 100 images from MNIST and convert them to WHCN format.
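
As a minimal sketch of the reshape itself, the snippet below uses a random array as a stand-in for real images (the name `imgs` and the batch size of 5 are assumptions made only for illustration). It shows that moving to WHCN just inserts a singleton channel dimension and leaves the pixel data untouched.

```julia
# Hypothetical stand-in for a batch of grayscale images: 28x28 pixels, 5 samples.
imgs = rand(Float32, 28, 28, 5)

# Insert a singleton channel dimension between height and the sample index.
imgs_whcn = reshape(imgs, 28, 28, 1, size(imgs, 3))

size(imgs_whcn)    # (28, 28, 1, 5)
eltype(imgs_whcn)  # Float32

# reshape does not move data, so sample 2 is still the same image.
imgs_whcn[:, :, 1, 2] == imgs[:, :, 2]  # true
```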

In [None]:
N = 100
x_train_whcn = train_x
X = reshape(float.(train_x[:,:,1:N]), 28, 28, 1, N)
y = train_y # Fix me !
@test size(X) == (28,28,1,100)
@test size(y) == (100,)

# Finding the output dimensions of a Conv Layer
Given the convolution, find the output dimensions after applying it to MNIST.

In [None]:
layer = Conv((1, 1), 1 => 32, relu, stride = (1, 1))
output_dims = (0,) # Modify me!
@test size(layer(X)) == output_dims

# Inspecting Conv Weights
We can directly access the weights used by a Conv layer through its `weight` field.
For the following example, determine what the dimensions of the layer's `weight` would be!

In [None]:
conv_weights = Conv((3,4), 1 => 16).weight
conv_weights_dim = (0,0,0,0) # modify me!
conv_weights_dim = (3,4,1,16)
@test size(conv_weights) == conv_weights_dim

# Find the Convolution that fits a shape
We want to transform our 100 images of MNIST such that
our W/H is 27, and our number of channels is 42. What is a layer that will accomplish this?

In [None]:
layer = Conv((1,1), 1 => 42, identity) # modify me!
@test size(layer(X)) == (27, 27, 42, size(X,4))

Note: we introduce the size(X,4) motif here instead of hardcoding the number
of examples. Further, the activation function we are using, identity, returns its input unchanged:

In [None]:
@test identity.(X) == X
@test identity.(y) == y

# Stride
Stride is the number of pixels, along width and height, that the convolutional
filter moves between successive applications. It is one way to control how densely
the filter is applied to each image: a larger stride gives a smaller output.
Taking the following layer, modify its stride so that the output has the tested dimensions.
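
To make the effect of stride concrete, here is a small sketch in plain Julia that lists where a filter starts along one dimension. The helper `filter_starts` is a hypothetical name introduced for this illustration, not part of Flux, and it assumes no padding.

```julia
# Start positions of a filter of length `k` sliding over `n` pixels.
# Assumes no padding, so the filter must fit entirely inside the image.
filter_starts(n, k, stride) = collect(1:stride:(n - k + 1))

filter_starts(8, 3, 1)  # [1, 2, 3, 4, 5, 6]
filter_starts(8, 3, 2)  # [1, 3, 5]
```

The number of start positions is the output length along that dimension, so doubling the stride roughly halves the output.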

In [None]:
layer = Conv((1, 1), 1 => 32, relu, stride = (1, 1))
@test size(layer(X)) == (14, 28, 32, size(X,4))

# Padding
Padding is the number of zero-valued pixels added at the edges of the image before
the convolutional filter is applied. A greater padding will increase the total size of the output, and vice versa.
For our Conv object, the argument to set padding is `pad`. The default is no padding, i.e. (0,0).
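
The combined effect of kernel size, stride, and padding on one dimension can be sketched with the usual output-length formula. The helper `conv_out` and its keyword names are assumptions made for this example (they are not Flux functions), and dilation is assumed to be 1.

```julia
# Output length along one dimension for input length `n` and kernel length `k`,
# with stride `s` and `pad_lo` / `pad_hi` pixels of padding at the two ends.
conv_out(n, k; s = 1, pad_lo = 0, pad_hi = 0) = fld(n + pad_lo + pad_hi - k, s) + 1

conv_out(32, 5)                          # 28: no padding, the kernel trims the edges
conv_out(32, 5; s = 2)                   # 14: stride 2 roughly halves the output
conv_out(32, 5; pad_lo = 2, pad_hi = 2)  # 32: padding restores the input length
```

Applying the helper once per spatial dimension gives the W and H of the output; the channel count comes from the `in => out` pair of the layer.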

# Padding Koan.
Using our MNIST dataset, what will be the output dimension if we use a pad of (0,1)?

In [None]:
layer = Conv((1,1), 1 => 32, relu, stride = (1,1), pad = (0,1))
size_layer_X = (28, 28, 1, 100) # size(X), we need size(layer(X))
size_layer_X = (28, 30, 32, 100)
@test  size(layer(X)) == size_layer_X

# Padding Koan # 2
The Conv argument `pad` can also take a tuple with four elements.
The elements of the 4-tuple are
(width pad begin, width pad end, height pad begin, height pad end).
Alter the tuple passed to Conv such that the application to X has
the dimensions (29, 29, 32, 100).

In [None]:
padding_argument = (0, 0, 0, 0) # modify this to get the desired output dimension
layer = Conv((1, 1), 1 => 32, relu, stride = (1,1), pad = padding_argument)
size_layer_X = (29,29,32,100)
@test size(layer(X)) == size_layer_X

Note: TensorFlow has an option called `padding="SAME"`. However,
this is not available yet for Flux. Instead, the equivalent `pad` values have to be worked out by hand.
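
One way to work out "same"-style padding by hand, sketched for the stride-1 case only: the total padding along a dimension must be the kernel length minus one. The helper `same_pad` is a hypothetical name for this illustration; putting the odd pixel at the end follows TensorFlow's SAME convention.

```julia
# (begin, end) padding along one dimension so that a stride-1 convolution
# with kernel length `k` keeps the input length unchanged.
function same_pad(k)
    total = k - 1             # pixels the kernel would otherwise trim
    lo = total ÷ 2            # integer division
    return (lo, total - lo)   # any odd pixel goes at the end
end

same_pad(3)  # (1, 1)
same_pad(4)  # (1, 2)

# Combined into the four-element form described above, for a (3, 4) kernel:
(same_pad(3)..., same_pad(4)...)  # (1, 1, 1, 2)
```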

# ConvTranspose
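It is possible to reverse the shape change of a convolution using Flux's `ConvTranspose`,
which acts as an inverse convolutional transform with respect to dimensions (it does not
recover the original pixel values). Given a convolutional transform, layer1, create a
layer2 that reshapes layer1(X) into the shape of X.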
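
The shape arithmetic behind this can be sketched in plain Julia. The helpers `conv_len` and `convtranspose_len` are hypothetical names for this example, assuming symmetric padding and dilation 1; they model one spatial dimension only.

```julia
# Output length of a convolution along one dimension.
conv_len(n, k; s = 1, pad = 0) = fld(n + 2pad - k, s) + 1

# Output length of a transposed convolution along one dimension.
convtranspose_len(n, k; s = 1, pad = 0) = (n - 1) * s - 2pad + k

shrunk = conv_len(10, 3)                  # 8
restored = convtranspose_len(shrunk, 3)   # 10, back to the input length
```

With stride 1, a transposed convolution with the same kernel size undoes the spatial shrinkage; the channel pair of the transposed layer also has to map the channel count back.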

In [None]:
layer1 = Conv((2,2), 1 => 16, identity)
layer2 = Conv((0,0), 0=>0, identity) # modify me !
m = Chain(layer1, layer2)
@test size(m(X)) == size(X)

# Putting together a Convolutional Network
Here is an example net that accepts images from MNIST and outputs a probability
distribution over the 10 potential digits.
Note the use of `Conv`, `MaxPool`, and `softmax`, as well as
the introduction of an anonymous function to reshape the data!
[Example from model-zoo](https://github.com/FluxML/model-zoo/blob/master/vision/mnist/conv.jl)
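
The input size of the `Dense` layer in this network follows from tracking the image side length through three blocks of a 3x3 convolution with pad 1 followed by a 2x2 max-pool. A rough sketch of that bookkeeping, using hypothetical helper names (`after_conv`, `after_pool`, `trace_side`) and assuming the pooling stride equals the window size:

```julia
# Side length after a convolution with kernel `k`, stride `s`, symmetric padding `pad`.
after_conv(n, k; s = 1, pad = 0) = fld(n + 2pad - k, s) + 1

# Side length after max-pooling with a square window and stride equal to the window.
after_pool(n, window) = fld(n - window, window) + 1

function trace_side(n, nblocks)
    for _ in 1:nblocks
        n = after_conv(n, 3; pad = 1)  # 3x3 conv with pad 1 keeps the side length
        n = after_pool(n, 2)           # 2x2 max-pool halves it, rounding down
    end
    return n
end

side = trace_side(28, 3)  # 28 -> 14 -> 7 -> 3
flat = side * side * 32   # 288 features per image after flattening
```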

In [None]:
m = Chain(
    #= First convolution, operating upon a 28x28 image =#
    Conv((3, 3), 1=>16, pad=(1,1), relu),
    MaxPool((2,2)),
    #= Second convolution, operating upon a 14x14 image =#
    Conv((3, 3), 16=>32, pad=(1,1), relu),
    MaxPool((2,2)),
    #= Third convolution, operating upon a 7x7 image =#
    Conv((3, 3), 32=>32, pad=(1,1), relu),
    MaxPool((2,2)),
    #= Reshape 3d tensor into a 2d one, at this point it should be (3, 3, 32, N) =#
    #= which is where we get the 288 in the `Dense` layer below: =#
    x -> reshape(x, :, size(x, 4)),
    Dense(288, 10),
    #= Finally, softmax to get nice probabilities =#
    softmax,
)

# Apply a CNN to MINST
Taking our chain from the model zoo, above, and applying it to a single image from MNIST,
what would we expect the shape of the output to be?

In [None]:
X_1 = reshape(X[:,:,:,1], (28,28,1,1))
y_predicted =  m(X_1)
y_predicted_shape = (0, 0) # modify me !
@test size(y_predicted) == y_predicted_shape

# Conv is reversible!
We can actually take a convolution transform and reverse it!
This is the transformation behind auto-encoders trained with
variational inference.
TODO: talk about the model here: https://github.com/adamwespiser/variational-autoencoders

In [None]:
#= end module =#

*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*