### Introduction to Theano for Deep Learning
This notebook presents the main functionality of the Theano library for developing Deep Learning models. 
For more information and examples, see the [Theano documentation](http://deeplearning.net/software/theano/introduction.html).

### Python revisited
Here we review some basic Python functionality that helps in understanding Theano.

#### Functions in Python
The code snippet below shows how to create a function in Python. More detailed examples and a Python tutorial can be found [here](https://www.learnpython.org/).

In [1]:
#  sum two elements and return the value
def sum_func(x,y): 
    return x + y
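
As a quick check, the function can be called with positional or keyword arguments, and with any values that support `+`. The sketch below repeats the definition so that it runs on its own; the example arguments are purely illustrative.

```python
# sum two elements and return the value (same definition as above)
def sum_func(x, y):
    return x + y

# positional and keyword arguments both work
total = sum_func(2, 3)
total_kw = sum_func(x=1.5, y=2.5)
# `+` also concatenates lists, so the same function joins two lists
joined = sum_func([1, 2], [3])
print(total, total_kw, joined)
```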

#### Lists in Python
Lists are array-like containers for storing values. Here we show how to create a list in Python and access an element by its index.

In [2]:
list_ex = [1, 3, 4, 6, 56]
list_ex[0]

1

#### Dictionaries in Python
A dictionary maps keys to values. The code snippet below shows how to create one and look up a value by its key.

In [3]:
# create a dictionary
dict_ex = {'key': 123, 'key2': 458}
# access a dictionary field
dict_ex["key2"]

458
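
Two other dictionary operations come up often: adding a new key and looking up a key that may be missing. The following is a small sketch; the keys `'key3'` and `'missing_key'` are made up for illustration.

```python
dict_ex = {'key': 123, 'key2': 458}

# add a new key-value pair
dict_ex['key3'] = 789

# .get returns a default instead of raising KeyError for a missing key
fallback = dict_ex.get('missing_key', 0)

# iterate over keys and values together
for name, value in dict_ex.items():
    print(name, value)
```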

#### List comprehension
A list comprehension is a compact way to create a list. 

In [4]:
# create a list with 10 elements; each element is two times its index plus one
list_comp_ex = [i*2 + 1 for i in range(10)]
list_comp_ex

[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
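
To make the shorthand concrete, the sketch below builds the same list with an explicit loop and then shows a comprehension with a filter condition. The name `odd_squares` is only an illustration.

```python
# explicit loop that builds the same list as the comprehension
list_loop = []
for i in range(10):
    list_loop.append(i * 2 + 1)

list_comp_ex = [i * 2 + 1 for i in range(10)]
print(list_loop == list_comp_ex)

# a comprehension can also filter: squares of the odd numbers below 10
odd_squares = [i * i for i in range(10) if i % 2 == 1]
print(odd_squares)
```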

## How to use Theano
Here we go through Theano step by step. We assume you have already installed Theano and its dependencies by following the link provided above.

### Import Theano and numpy
Theano is built on the [numpy](http://cs231n.github.io/python-numpy-tutorial/) array structure, so we need to import both numpy and Theano. 

In [5]:
import theano as T
import numpy  as np

### 1 - Simple code snippet
Here we take a look at a simple code example that uses Theano at its core. 

In [6]:
# create a symbolic float32 vector and give it the name 'x'
x = T.tensor.fvector('x')
# create a shared variable from a numpy array of initial values
W = T.shared(np.asarray([0.3,0.7]),'W')
# define y as the sum of the element-wise product of x and W
y = (x * W).sum()
# create a Theano function that takes x as input and returns y as output
func_t = T.function(inputs=[x],outputs=y)
# call the function for testing
output = func_t([1, 1])
output

array(1.0)

### So what exactly is Theano?
Theano represents our model as a symbolic graph that is executed later. In other words, when we create a model with Theano, we first define the symbolic graph of variables and operations. Afterwards, we run this graph on specific input(s) to get output(s). 

Let's take the code above as an example. We created our Theano object y, which knows that its value can be calculated as the dot product of x and W. The important point is that none of the mathematical operations have been performed at this stage. In fact, x did not have any value when this line of code was executed. 

***So all we are doing here is building a graph of the variables and operations needed to turn a given input into an output.***
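
The define-first, run-later idea can be imitated in a few lines of plain Python. The sketch below is only an analogy and is not how Theano is implemented: the `Node` class and the `variable` helper are hypothetical names invented for this illustration.

```python
class Node:
    """A tiny symbolic expression: it stores how to compute, not a value."""

    def __init__(self, fn, name):
        self.fn = fn      # maps a dict of input values to a result
        self.name = name  # readable description of the expression

    def __mul__(self, other):
        # element-wise product of two list-valued expressions
        return Node(
            lambda env: [a * b for a, b in zip(self.fn(env), other.fn(env))],
            '(%s * %s)' % (self.name, other.name))

    def sum(self):
        return Node(lambda env: sum(self.fn(env)), 'sum(%s)' % self.name)


def variable(name):
    # placeholder whose value is looked up only when the graph is run
    return Node(lambda env: env[name], name)


x = variable('x')
W = variable('W')
y = (x * W).sum()   # builds the graph; nothing is computed yet
print(y.name)

# only now are concrete values supplied and the graph evaluated
result = y.fn({'x': [1.0, 1.0], 'W': [0.3, 0.7]})
print(result)
```

Building `y` only records the operations; the numbers enter when `y.fn` is called, which mirrors the role of calling a compiled Theano function.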

### Variables in Theano
It is crucial to understand variables and how to define them in Theano, because the graph of our model depends on the variables and their types. Each variable is defined with a specific type; for instance, **x** is a vector of 32-bit floats. 

The table below lists the most commonly used constructors and their types. More information and further variable types can be found [here](http://deeplearning.net/software/theano/library/tensor/basic.html).


| Constructor | Type | Number of dimensions |
| :- | -: | :-: |
| fvector | float32 | 1 |
| ivector | int32 | 1 |
| fscalar | float32 | 0 |
| fmatrix | float32 | 2 |
| ftensor3 | float32 | 3 |
| dtensor3 | float64 | 3 |
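
The type and number of dimensions in the table describe the kind of NumPy array that fits each variable. As a NumPy-only sketch (no Theano calls are made, and the variable names are illustrative), matching arrays could look like this:

```python
import numpy as np

vec_f32 = np.asarray([1.0, 2.0], dtype=np.float32)  # matches fvector
vec_i32 = np.asarray([1, 2], dtype=np.int32)         # matches ivector
mat_f32 = np.zeros((2, 3), dtype=np.float32)         # matches fmatrix
ten_f64 = np.zeros((2, 3, 4), dtype=np.float64)      # matches dtensor3
scalar_f32 = np.float32(0.5)                          # matches fscalar

# print dtype and number of dimensions for each array
for arr in (vec_f32, vec_i32, mat_f32, ten_f64):
    print(arr.dtype, arr.ndim)
```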


#### Shared Variables
We can also define shared variables, which keep their value between function calls and can be used by several functions. For example, when we create a neural network model, we can define its weights as shared variables. 

In the code above, we defined **W** as a shared variable. 

When a GPU is in use, Theano moves shared variables to the GPU by default. For computational efficiency, it is therefore advisable to define values that are reused across functions and function calls as shared variables. 

Shared variables can be read and set outside of any Theano function as follows. 

In [7]:
# get the value
W.get_value()
# set the value
W.set_value([0.1, 0.9])

### Functions in Theano

Functions are straightforward to create in Theano, as the code above shows. We typically use functions to feed inputs to our model and get outputs from it. More information can be found [here](http://deeplearning.net/software/theano/library/compile/function.html).

### 2 - Let's create a simple training example

In [9]:
# create a theano variable x as a vector and name it 'x'
x = T.tensor.fvector('x')

#### define the target value for training

In [10]:
# target value as a theano variable
target = T.tensor.fscalar('target')

In [12]:
# weights as a shared array
W = T.shared(np.asarray([0.4, 0.6]), 'W')
# theano object y, which is based on variables x and W
y = (x * W).sum()

#### now we will define our cost function, which will be used during the training phase
It is the squared distance between the target and the prediction y.

In [13]:
cost = T.tensor.sqr(target - y)

#### compute partial derivatives (gradients) of the parameters
The parameters are updated during training according to the gradient of the cost function. Here we use Theano's built-in grad function. 
When calling grad, we pass the cost and the list of parameters with respect to which we want the gradients.


In [14]:
gradients = T.tensor.grad(cost, [W])
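
For this particular cost the gradient can also be derived by hand: with y = sum_i x_i * W_i and cost = (target - y)^2, the derivative with respect to W_i is -2 * (target - y) * x_i. The NumPy sketch below (no Theano involved) evaluates that formula for the example values used in this notebook and compares it against central finite differences. The helper `cost_fn` and the step size `eps` are choices made for this illustration.

```python
import numpy as np

x = np.array([1.0, 1.0])
W = np.array([0.4, 0.6])
target = 20.0

def cost_fn(weights):
    # squared distance between the target and the prediction
    y = (x * weights).sum()
    return (target - y) ** 2

# analytic gradient: d cost / d W_i = -2 * (target - y) * x_i
y = (x * W).sum()
grad_analytic = -2.0 * (target - y) * x

# central finite differences as an independent check
eps = 1e-6
grad_numeric = np.zeros_like(W)
for i in range(W.size):
    step = np.zeros_like(W)
    step[i] = eps
    grad_numeric[i] = (cost_fn(W + step) - cost_fn(W - step)) / (2 * eps)

print(grad_analytic)
print(grad_numeric)
```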

#### define a symbolic graph of parameter update rule
Here we apply gradient descent with a learning rate of 0.01 to update the parameters. The update rule is itself specified as a symbolic graph.


In [15]:
W_updated = W - (0.01 * gradients[0])

#### define list for holding updates
Each entry of this list is a pair: the first element is the shared variable we want to update, and the second is the expression for the new value to assign to that variable.

In [18]:
update = [(W, W_updated)]

#### finally, define the function that runs the graph for training


In [19]:
F = T.function(inputs = [x, target], outputs = y, updates=update)

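### now let's train our simple example
Each call to F returns y computed with the current W and then applies the update, so the printed values depend on how many updates W has already received. The output recorded below comes from a session in which W had already been updated, which is why it does not start at the initial prediction of 1.0.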

In [21]:
for i in range(20):
    output = F([1.0, 1.0], 20.0)
    print(output)

7.36817991616
7.87345271952
8.35851461073
8.82417402631
9.27120706525
9.70035878264
10.1123444313
10.5078506541
10.8875366279
11.2520351628
11.6019537563
11.937875606
12.2603605818
12.5699461585
12.8671483122
13.1524623797
13.4263638845
13.6893093291
13.941736956
14.1840674777
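
The same training loop can be reproduced without Theano, which makes the update rule easy to inspect. The sketch below applies the hand-derived gradient -2 * (target - y) * x with the same learning rate, starting from W = [0.4, 0.6]. Under these assumptions the first prediction is 1.0, and the sequence printed above appears from the eleventh step onward, which suggests the recorded output came from a session in which F had already been called ten times.

```python
W = [0.4, 0.6]
x = [1.0, 1.0]
target = 20.0
learning_rate = 0.01

outputs = []
for step in range(30):
    # prediction with the current weights
    y = sum(xi * wi for xi, wi in zip(x, W))
    outputs.append(y)  # record y before W is updated
    # gradient of (target - y)^2 with respect to each weight
    grad = [-2.0 * (target - y) * xi for xi in x]
    # gradient descent step
    W = [wi - learning_rate * gi for wi, gi in zip(W, grad)]

print(outputs[0])   # prediction with the initial weights
print(outputs[10])  # prediction after ten updates
```

Each step shrinks the remaining error (target - y) by a factor of 0.96, so the prediction approaches the target of 20 gradually.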


### Basic math/linear algebra operations in Theano

Here we will see some basic operations in Theano and how to use them

In [37]:
# create a shared matrix and get its value
a = T.shared(np.asarray([[1.3,2.8],[3.2,4.4]]),'A')
a.eval()

array([[ 1.3,  2.8],
       [ 3.2,  4.4]])

#### elementwise operations

In [26]:
# add a to itself and divide every element by 4 (call c.eval() to see the result)
c = ((a + a) / 4.0)

#### dot product

In [29]:
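# matrix product of a with itself
# note: the output shown below was recorded when a held [[1, 2], [3, 4]]
c = T.tensor.dot(a, a)
c.eval()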

array([[ 7, 10],
       [15, 22]])

#### Activation functions


In [32]:
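# sigmoid is part of the neural networks module
c = T.tensor.nnet.sigmoid(a)
# tanh overwrites c, so the output below is the tanh result
# note: it was recorded when a held [[1, 2], [3, 4]]
c = T.tensor.tanh(a)
c.eval()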

array([[ 0.76159416,  0.96402758],
       [ 0.99505475,  0.9993293 ]])

#### softmax

In [39]:
c = T.tensor.nnet.softmax(a)
c.eval()

array([[ 0.18242552,  0.81757448],
       [ 0.23147522,  0.76852478]])
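
The output above is a row-wise softmax: each row is exponentiated and divided by its own sum, so every row adds up to 1. The NumPy sketch below reproduces the computation for the same matrix; subtracting the row maximum first is a common numerical-stability step added here and does not change the result.

```python
import numpy as np

a = np.asarray([[1.3, 2.8], [3.2, 4.4]])

# subtract the row-wise maximum for numerical stability
shifted = a - a.max(axis=1, keepdims=True)
exp_a = np.exp(shifted)
# normalize each row so that it sums to 1
softmax = exp_a / exp_a.sum(axis=1, keepdims=True)
print(softmax)
```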

#### min and max

In [41]:
# maximum over all elements
c = a.max()
c.eval()
# maximum of each row; only this last result is displayed
c = a.max(axis=1)
c.eval()

array([ 2.8,  4.4])

In [42]:
# minimum over all elements
c = a.min()
c.eval()
# minimum of each row; only this last result is displayed
c = a.min(axis=1)
c.eval()

array([ 1.3,  3.2])

#### sum 

In [43]:
c = a.sum()
c.eval()

array(11.7)

In [44]:
c = a.sum(axis=1)
c.eval()

array([ 4.1,  7.6])

#### argmax

In [45]:
c = T.tensor.argmax(a)
c.eval()

array(3)

In [46]:
c = T.tensor.argmax(a,axis=1)
c.eval()

array([1, 1])
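
The reductions above follow NumPy's axis convention, and the argmax without an axis returns an index into the flattened matrix. The NumPy sketch below (no Theano calls) spells this out for the same matrix; `np.unravel_index` is used here to turn the flat index back into a row and column.

```python
import numpy as np

a = np.asarray([[1.3, 2.8], [3.2, 4.4]])

row_sums = a.sum(axis=1)   # one value per row
col_sums = a.sum(axis=0)   # one value per column
flat_argmax = a.argmax()   # index into the flattened array

# convert the flat index back into (row, column)
row, col = np.unravel_index(flat_argmax, a.shape)
print(row_sums, col_sums, flat_argmax, row, col)
```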

#### reshape 

In [49]:
a = T.shared(np.asarray([[1,2],[3,4]]), 'a')
c = a.reshape((1,4))
c.eval()

array([[1, 2, 3, 4]])

#### zeros and ones like array

In [50]:
c = T.tensor.zeros_like(a)
c.eval()

array([[0, 0],
       [0, 0]])

In [52]:
c = T.tensor.ones_like(a)
c.eval()

array([[1, 1],
       [1, 1]])

#### Indexing
Python-style indexing makes many tasks easier. In the example below, we create a separate list b containing row indices and use it to build a new matrix that contains exactly the rows we want from the original matrix. This is useful when dealing with word embeddings: we can put word ids into a list and use it to retrieve the corresponding sequence of embeddings from the whole embedding matrix.

In [54]:
a = T.shared(np.asarray([[1.0,2.0],[3.0,4.0]]), 'a')
a.eval()

array([[ 1.,  2.],
       [ 3.,  4.]])

In [57]:
b = [1,1,0]
c = a[b]
c.eval()

array([[ 3.,  4.],
       [ 3.,  4.],
       [ 1.,  2.]])
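
The word-embedding use case can be sketched with NumPy, which follows the same indexing rule. The vocabulary, the embedding values, and the sentence below are all made up for illustration.

```python
import numpy as np

# hypothetical vocabulary and embedding matrix, one row per word
vocab = {'the': 0, 'cat': 1, 'sat': 2}
embeddings = np.asarray([[0.1, 0.2],
                         [0.3, 0.4],
                         [0.5, 0.6]])

sentence = ['the', 'cat', 'sat', 'the']
# map each word to its row index
word_ids = [vocab[w] for w in sentence]
# indexing with a list of ids returns one embedding row per word
sentence_vectors = embeddings[word_ids]
print(sentence_vectors.shape)
```

Repeated ids simply return the same row more than once, just as `b = [1,1,0]` does above.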

In [61]:
# assign a value: here we set the first row of a to [0, 0]
c = T.tensor.set_subtensor(a[0],[0.0, 0.0])
c.eval()

array([[ 0.,  0.],
       [ 3.,  4.]])
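
Note that `set_subtensor` returns a new symbolic expression and leaves `a` itself unchanged. In NumPy terms this resembles copying the array first and then assigning into the copy, as in the following sketch (a NumPy analogy, not Theano code):

```python
import numpy as np

a = np.asarray([[1.0, 2.0], [3.0, 4.0]])

# copy first so that the original array keeps its values
c = a.copy()
c[0] = [0.0, 0.0]

print(a)
print(c)
```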