# Tutorial: Deep Learning Toolkits

## Overview

* DNN toolkits - why and what?
* Interactive tutorials for popular toolkits

# Why toolkits?

## A reusable tool bag of components

* Network structure: layers, connectivity
* Loss functions defining the optimization problem
* Optimization methods for training (SGD, SGD with momentum, Adam, ...)
* Computation of gradients (reverse-mode automatic differentiation, aka "backpropagation")
* Computational backends (CPU/GPU/TPU/...)
* Packaged example training datasets
* Deployment frameworks
* ...
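
To make the list above concrete, here is a minimal pure-Python sketch of the pieces a toolkit bundles: a model, a loss, gradients, and an optimizer. It is an illustration only and uses no toolkit API. The toy linear model, the data, and all function names are assumptions made for this example, and the gradients are derived by hand, which is exactly the step reverse-mode AD would automate.

```python
import random

def model(params, x):
    # "Network structure": a single linear unit, y = w * x + b
    w, b = params
    return w * x + b

def loss_and_grads(params, batch):
    # "Loss function": mean squared error over the batch.
    # Gradients are written out by hand; a toolkit would get them from AD.
    n = len(batch)
    loss, grad_w, grad_b = 0.0, 0.0, 0.0
    for x, y in batch:
        err = model(params, x) - y
        loss += err * err / n
        grad_w += 2.0 * err * x / n
        grad_b += 2.0 * err / n
    return loss, (grad_w, grad_b)

def sgd_momentum_step(params, grads, velocity, lr=0.1, beta=0.9):
    # "Optimization method": SGD with (heavy-ball) momentum
    velocity = tuple(beta * v - lr * g for v, g in zip(velocity, grads))
    params = tuple(p + v for p, v in zip(params, velocity))
    return params, velocity

def train(data, steps=500, batch_size=8, seed=0):
    rng = random.Random(seed)
    params, velocity = (0.0, 0.0), (0.0, 0.0)
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # random minibatch
        _, grads = loss_and_grads(params, batch)
        params, velocity = sgd_momentum_step(params, grads, velocity)
    return params

# Toy noise-free dataset generated from y = 3x - 1
data = [(k / 10.0, 3.0 * (k / 10.0) - 1.0) for k in range(-10, 11)]
params = train(data)
final_loss, _ = loss_and_grads(params, data)
print("w, b =", params, "loss =", final_loss)
```

In a real toolkit each of these functions is replaced by a library component, and the computation can be moved to a GPU without changing the training loop.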



# Some toolkits

![Deep learning toolkit trends on arXiv](arvix_graph_cut.png)

(from https://deepsense.ai/keras-or-pytorch/)

# We will look at

* [Keras](https://keras.io): *Simple* Python framework (various backends)
* [PyTorch](https://pytorch.org/): *Flexible*, Python-native framework (cf. NumPy)
* [TensorFlow](https://www.tensorflow.org): *"Industrial strength"* C++ framework with a Python frontend
* [Flux](http://fluxml.ai): *Flexible, transparent*, pure-Julia framework

# Tutorial overview

* Toolkit-specific installation
* Dataset import
* Model & loss function setup
* Training
* Pretty pictures!
   * Generated digits
   * Latent space structure
* Compare code from different toolkits

# Dataset and model

* We will teach the computer to draw "handwritten" digits
* Standard MNIST dataset (boo!)
* Train a variational autoencoder
* *Generate* new random digits

## Dataset

* https://en.wikipedia.org/wiki/MNIST_database

![MNIST digits](MnistExamples.png)


## Generic Autoencoder

* https://en.wikipedia.org/wiki/Autoencoder
* Compresses redundant information into low dimensions

![Autoencoder structure](Autoencoder_structure.png)
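
The compression idea can be sketched with a tiny linear autoencoder in pure Python. This is a toy under stated assumptions, not the model used in the tutorials: the 2-D data is made up so that the second coordinate is always twice the first (fully redundant), the code $z$ is a single number, and the names `encode`, `decode` and `train_autoencoder` are hypothetical. Gradients are again written by hand.

```python
def encode(e, x):
    # Encoder: project the 2-D input onto one number (the code z)
    return e[0] * x[0] + e[1] * x[1]

def decode(d, z):
    # Decoder: expand the 1-D code back to 2-D
    return (d[0] * z, d[1] * z)

def reconstruction_loss(e, d, data):
    # Mean squared reconstruction error
    total = 0.0
    for x in data:
        x_hat = decode(d, encode(e, x))
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)

def train_autoencoder(data, steps=500, lr=0.1):
    e = [0.5, 0.5]  # encoder weights
    d = [0.5, 0.5]  # decoder weights
    n = len(data)
    for _ in range(steps):
        grad_e = [0.0, 0.0]
        grad_d = [0.0, 0.0]
        for x in data:
            z = encode(e, x)
            x_hat = decode(d, z)
            r = (x_hat[0] - x[0], x_hat[1] - x[1])  # residual
            rd = r[0] * d[0] + r[1] * d[1]
            for k in range(2):
                grad_d[k] += 2.0 * r[k] * z / n
                grad_e[k] += 2.0 * rd * x[k] / n
        # Plain full-batch gradient descent
        e = [e[k] - lr * grad_e[k] for k in range(2)]
        d = [d[k] - lr * grad_d[k] for k in range(2)]
    return e, d

# Redundant 2-D data: every point lies on the line x2 = 2 * x1
data = [(t / 10.0, 2.0 * t / 10.0) for t in range(-10, 11)]
e, d = train_autoencoder(data)
loss = reconstruction_loss(e, d, data)
print("reconstruction loss:", loss)
```

Because the data has only one degree of freedom, a one-number code is enough to reconstruct it almost exactly. Real autoencoders use nonlinear layers, and the compression is lossy.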


## Variational Autoencoder

* Force the code $z$ to follow a known distribution (a multivariate Gaussian)
* Can compress input, just like a normal autoencoder
* Can *generate* new output by sampling from the known distribution of $z$
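
A rough sketch of the two ingredients behind these points, in pure Python. It assumes the common setup where the encoder outputs a mean and a log-variance per latent dimension and the target distribution is a standard Gaussian. The function names and `toy_decoder` are hypothetical stand-ins, since a trained decoder network would take its place.

```python
import math
import random

def sample_code(mu, log_var, rng):
    # Draw z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions.
    # Adding this term to the loss pulls the code towards the known distribution.
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def toy_decoder(z):
    # Hypothetical stand-in for a trained decoder: maps a 2-D code
    # to four "pixel" intensities in (0, 1)
    return [1.0 / (1.0 + math.exp(-(z[0] + z[1] * i))) for i in range(4)]

def generate(decoder, latent_dim, rng):
    # Generation: sample z from the known distribution, then decode it
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return decoder(z)

rng = random.Random(0)
z = sample_code([0.0, 1.0], [0.0, -2.0], rng)
print("sampled code:", z)
print("KL penalty:", kl_to_standard_normal([0.0, 1.0], [0.0, -2.0]))
print("generated output:", generate(toy_decoder, 2, rng))
```

The KL term is zero exactly when the encoder outputs a standard Gaussian (`mu = 0`, `log_var = 0`) and grows as the code distribution drifts away from it.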