Have a Taste of DNN on the Shoulders of Theano
====

![image](fruit.png)

## Outline
* Introduction to Theano
* Softmax regression
* Highlights of DNN
* Neural network
    * Multilayer perceptron
    * Forward propagation
    * Backward propagation
* Sparse autoencoder
    * Autoencoder
    * Sparse autoencoder
* Building deep networks for classification
    * Model structure
    * Pre-training
    * Fine-tuning
    * Experimental results

## Introduction to Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions.
* Efficient symbolic differentiation
* Efficient handling of matrices
* Tight integration with NumPy
* Dynamic C code generation
* Transparent use of GPU

Fast to develop and fast to run
![image](fast.png)

Machine learning libraries built on top of Theano
* Pylearn2
    * Great flexibility; a good choice for trying out ML ideas
* PyMC3
    * Probabilistic programming for building Bayesian statistical models
* Sklearn-theano
    * Easy-to-use deep learning tool
* Lasagne
    * Lightweight library to build neural networks

Models that have been built with Theano
* Neural networks
* Convolutional Neural Networks (CNN)
* Recurrent Neural Networks (RNN)
* Long Short-Term Memory (LSTM)
* Autoencoders
* GoogLeNet
* OverFeat
...


Symbolic variables in Theano
* Variable (C, Java, Python, etc.)
    * A segment of physical storage in RAM
    * Operations are based on value passing between variables
* Tensor (Theano)
    * A mathematical symbol
    * No physical storage in RAM to hold its value
    * Operations build connections between tensors (a computation graph) instead of computing values
* Shared variable (Theano)
    * Hybrid of variable and tensor
    * Tensor with physical storage in RAM to hold its value
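
The three kinds of variable above can be illustrated with a toy expression graph in plain Python. This is only a sketch of the idea, not how Theano is implemented: the classes `Node`, `Symbol`, `Const`, `Shared`, `Op` and the helper `as_node` are hypothetical names invented for this example and are not part of the Theano API.

```python
class Node(object):
    """Base class: an expression node that records structure, not a value."""

    def __add__(self, other):
        return Op('add', self, as_node(other))

    def __pow__(self, other):
        return Op('pow', self, as_node(other))


class Symbol(Node):
    """Tensor-like: just a name, with no storage for a value."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        return env[self.name]      # the value is supplied only at call time


class Const(Node):
    """A literal number appearing in an expression."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, env):
        return self.value


class Shared(Node):
    """Shared-variable-like: a graph node that also owns a stored value."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, env):
        return self.value


class Op(Node):
    """An operation linking two nodes; building it computes nothing."""

    def __init__(self, kind, left, right):
        self.kind, self.left, self.right = kind, left, right

    def evaluate(self, env):
        a = self.left.evaluate(env)
        b = self.right.evaluate(env)
        return a + b if self.kind == 'add' else a ** b


def as_node(obj):
    """Wrap plain numbers so they can take part in the graph."""
    return obj if isinstance(obj, Node) else Const(obj)


x = Symbol('x')                   # symbolic: no value yet
w = Shared(10.0)                  # symbolic, but its value lives in the node
y = x ** 2 + w                    # builds Op nodes; nothing is computed here

first = y.evaluate({'x': 3.0})    # 3.0 ** 2 + 10.0
w.value = 1.0                     # update the stored state
second = y.evaluate({'x': 3.0})   # same graph, new shared value
print("{0} {1}".format(first, second))
```

The expression `y` is built once and evaluated twice; only the state held by the shared node changes between the two calls, which is the role shared variables typically play for model parameters.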


In [12]:
import theano
import theano.tensor as T

x = T.dvector(name='x')
def f(x):
    return x ** 2
y = f(x)

theano.printing.pydotprint(theano.function([x], y), '1.png')

The output file is available at 1.png


![image](1.png)

_theano.function_ compiles a symbolic graph into a callable Python function, which brings Theano variables to life.

In [9]:
import theano
import theano.tensor as T
import numpy as np

x = T.dvector(name='x')
def f(x):
    return x ** 2
y = f(x)

pow2 = theano.function(inputs=[x], outputs=y)

a = np.array([1,2,3], dtype=theano.config.floatX)
b = pow2(a)
print("a is {a}, b is {b}".format(a=a, b=b))

a is [ 1.  2.  3.], b is [ 1.  4.  9.]


In [11]:
import theano
import theano.tensor as T
import numpy as np

x = T.dvector('x')
y = x.sum()
grad = T.grad(cost=y, wrt=[x])
grad_func = theano.function(inputs=[x], outputs=grad)

a = np.array([1,2,3], dtype=theano.config.floatX)
print(grad_func(a))

[array([ 1.,  1.,  1.])]
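
The gradient printed above (all ones, since $\partial \sum_i x_i / \partial x_j = 1$) can be sanity-checked without Theano. The following is a minimal sketch using a central finite difference in NumPy; the helper name `numerical_grad` and the step size `eps` are assumptions made for this illustration.

```python
import numpy as np


def numerical_grad(f, a, eps=1e-6):
    """Central-difference estimate of the gradient of a scalar function f at a."""
    a = np.asarray(a, dtype=float)
    grad = np.zeros_like(a)
    for i in range(a.size):
        step = np.zeros_like(a)
        step[i] = eps                      # perturb one coordinate at a time
        grad[i] = (f(a + step) - f(a - step)) / (2 * eps)
    return grad


a = np.array([1.0, 2.0, 3.0])
g_sum = numerical_grad(np.sum, a)                    # close to [1, 1, 1]
g_sq = numerical_grad(lambda v: np.sum(v ** 2), a)   # close to 2 * a
print(g_sum)
print(g_sq)
```

A numerical estimate like this costs two function evaluations per input dimension, whereas symbolic differentiation produces the whole gradient from one expression.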


## Softmax Regression

### Introduction

In softmax regression, we are interested in multi-class classification. Suppose the training set contains $m$ samples $\{(x^{(1)}, y^{(1)}),\ldots,(x^{(m)}, y^{(m)})\}$, where each label $y^{(i)}\in \{1,2,\ldots,k\}$ is one of $k$ classes.

Given a test sample $x$, we want to estimate the probability that $x$ belongs to class $j$, i.e., $p(y=j|x)$, for all possible $j$.

\begin{equation}
h_\theta(x) = 
\left[
  \begin{array}{c}
  p(y=1|x;\theta)\\
  p(y=2|x;\theta)\\
  \vdots\\
  p(y=k|x;\theta)\\
  \end{array}
\right]
=\frac{1}{\sum_{j=1}^{k}{e^{\theta_j^Tx+b_j}}}
\left[
  \begin{array}{c}
  e^{\theta_1^Tx+b_1}\\
  e^{\theta_2^Tx+b_2}\\
  \vdots\\
  e^{\theta_k^Tx+b_k}\\
  \end{array}
\right]
\end{equation}
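
To make the hypothesis concrete, here is a small NumPy sketch that evaluates $h_\theta(x)$ for one sample. It assumes the $\theta_j$ are stacked as the rows of a matrix `theta` and the $b_j$ as a vector `b`; the function name `softmax_hypothesis` and the toy numbers are made up for illustration. Classes are indexed from 0 in the code, whereas the text numbers them from 1.

```python
import numpy as np


def softmax_hypothesis(theta, b, x):
    """Return the vector h_theta(x) of class probabilities.

    theta: (k, n) array whose j-th row is theta_j
    b:     (k,) array of biases b_j
    x:     (n,) input sample
    """
    scores = theta.dot(x) + b         # theta_j^T x + b_j for every class j
    scores = scores - scores.max()    # shift for numerical stability; the ratio is unchanged
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()


theta = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])        # k = 3 classes, n = 2 features
b = np.zeros(3)
x = np.array([1.0, 2.0])

h = softmax_hypothesis(theta, b, x)   # scores are [1, 2, 3]
print(h)                              # approximately [0.090, 0.245, 0.665]
```

Subtracting the maximum score before exponentiating multiplies the numerator and denominator by the same constant, so the probabilities match the formula while avoiding overflow for large scores.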