# Theano Tutorials

[1) Adding two numbers](#1_add)

[2) Adding two matrices](#2_add_matrices)

[3) Logistic Function](#3_logistic)

[4) Computing More than one Thing at the Same Time](#4_more_than_one_output)

[5) Setting a Default Value for an Argument](#5_default_value)

[6) Using Shared Variables](#6_shared_var)

[7) Copying functions](#7_copying_functions)

[8) Using Random Numbers](#8_random_numbers)

In [3]:
import theano
from theano import tensor as T
from theano import pp
from theano import shared
from theano import function
from theano import In
from theano.tensor.shared_randomstreams import RandomStreams
from theano.sandbox.rng_mrg import MRG_RandomStreams
import numpy as np

<a name='1_add'></a>
## 1) Add two numbers together

**Step 1**
- The first step is to define two symbols (Variables) representing the quantities that you want to add.
- In Theano, **all symbols must be typed**.
- In particular, **T.dscalar** is the type we assign to **“0-dimensional arrays (scalar) of doubles (d)”**
- **'dscalar' is a Theano Type. dscalar is not a class**. Therefore, neither x nor y is actually an instance of dscalar. They are **instances of TensorVariable**.
- x and y are, however, assigned the theano Type dscalar in their type field.
- By calling T.dscalar with a string argument, you create a Variable representing a floating-point scalar quantity with the given name. If you provide no argument, the symbol will be unnamed. Names are not required, but they can help debugging.

In [4]:
x = T.dscalar('x')
y = T.dscalar('y')
print(type(x))
print(x.type)

<class 'theano.tensor.var.TensorVariable'>
TensorType(float64, scalar)


- The following types are available
    - **byte**: bscalar, bvector, bmatrix, brow, bcol, btensor3, btensor4, btensor5    
    - **16-bit integers**: wscalar, wvector, wmatrix, wrow, wcol, wtensor3, wtensor4, wtensor5    
    - **32-bit integers**: iscalar, ivector, imatrix, irow, icol, itensor3, itensor4, itensor5    
    - **64-bit integers**: lscalar, lvector, lmatrix, lrow, lcol, ltensor3, ltensor4, ltensor5
    - **float**: fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4, ftensor5
    - **double**: dscalar, dvector, dmatrix, drow, dcol, dtensor3, dtensor4, dtensor5
    - **complex**: cscalar, cvector, cmatrix, crow, ccol, ctensor3, ctensor4, ctensor5
    
- You, the user—not the system architecture—have to choose whether your program will use 32- or 64-bit integers (i prefix vs. the l prefix) and floats (f prefix vs. the d prefix).
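
The prefix letters stand for fixed-width element types. As a rough cross-check, the sketch below uses NumPy dtypes (not Theano itself) to print the width behind each prefix. The `PREFIX_TO_DTYPE` table is an illustrative name introduced here, and it assumes the usual convention that `b` is int8, `w` is int16, `i` is int32, `l` is int64, `f` is float32, `d` is float64 and `c` is complex64.

```python
import numpy as np

# Hypothetical lookup table (not part of Theano): prefix letter -> NumPy dtype
PREFIX_TO_DTYPE = {
    'b': np.int8,       # byte
    'w': np.int16,      # 16-bit integer
    'i': np.int32,      # 32-bit integer
    'l': np.int64,      # 64-bit integer
    'f': np.float32,    # float
    'd': np.float64,    # double
    'c': np.complex64,  # complex
}

for prefix in 'bwilfdc':
    dtype = np.dtype(PREFIX_TO_DTYPE[prefix])
    print('{0}scalar -> {1} ({2} bytes per element)'.format(
        prefix, dtype.name, dtype.itemsize))
```

Choosing `f` over `d` (or `i` over `l`) halves the storage per element, which is why the choice is left to you rather than inferred from the machine.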

**Step 2**
- The second step is to combine x and y into their sum z
- **z is yet another Variable** which represents the addition of x and y
- You can use the **pp function** to pretty-print the computation associated with z

In [5]:
z = x+y
print(pp(z))

(x + y)


**Step 3**
- The last step is to create a function taking x and y as inputs and giving z as output
- The first argument to **function()** is a list of Variables that will be provided as inputs to the function. The second argument is a single Variable or a list of Variables. In either case, the second argument is what we want to see as output when we apply the function.
- f may then be used like a normal Python function.

In [6]:
f = function([x,y], z)
print(f(2,3))

5.0


### Bonus Tip
- As a shortcut, you can skip step 3, and just use a variable’s **eval method**. 
- The **eval()** method is not as flexible as function() but it can do everything we’ve covered in the tutorial so far. It has the added benefit of not requiring you to import function().
- We pass **eval()** a **dictionary mapping symbolic theano variables to the values to substitute for them, and it returns the numerical value of the expression**.
- eval() will be slow the first time you call it on a variable – it needs to call function() to compile the expression behind the scenes. Subsequent calls to eval() on that same variable will be fast, because the variable caches the compiled function.

In [7]:
z.eval({x:2,y:3})

array(5.0)

<a name='2_add_matrices'></a>
## 2) Adding two matrices

- The only change from the previous example is that you need to instantiate x and y using the matrix Types
- **dmatrix** is the Type for matrices of doubles.
- We can use our new function on 2D arrays.
- We can also use NumPy arrays directly as inputs.
- It is possible to add scalars to matrices, vectors to matrices, scalars to vectors, etc. The behavior of these operations is defined by [broadcasting](http://deeplearning.net/software/theano/tutorial/broadcasting.html#tutbroadcasting)

In [10]:
x = T.dmatrix('x')
y = T.dmatrix('y')
z = x+y
f = function([x,y], z)
print(f([[1, 2], [3, 4]], [[10, 20], [30, 40]]))
print(f(np.array([[1, 2], [3, 4]]), np.array([[10, 20], [30, 40]])))

[[ 11.  22.]
 [ 33.  44.]]
[[ 11.  22.]
 [ 33.  44.]]
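
To make the broadcasting remark concrete, here is a minimal NumPy sketch of the arithmetic involved. It deliberately avoids Theano: as far as the Theano documentation describes it, Theano decides which dimensions are broadcastable from a variable's type rather than at run time, so treat this only as an illustration of the resulting values. The names `m` and `v` are introduced here for the example.

```python
import numpy as np

m = np.array([[1., 2.], [3., 4.]])   # 2x2 matrix
v = np.array([10., 20.])             # length-2 vector

scalar_plus_matrix = 100. + m   # the scalar is stretched over every element
vector_plus_matrix = v + m      # the vector is added to each row of the matrix

print(scalar_plus_matrix)
print(vector_plus_matrix)
```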


<a name='3_logistic'></a>
## 3) Logistic Function

$$s(x) = \frac{1}{1 + e^{-x}}$$

- We want to apply the above logistic function elementwise to each element of a matrix of doubles.
- Logistic is performed elementwise because all of its operations (division, addition, exponentiation, and negation) are themselves elementwise operations.

In [11]:
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
logistic = function([x], s)
print(logistic([[0, 1], [-1, -2]]))

[[ 0.5         0.73105858]
 [ 0.26894142  0.11920292]]
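
The logistic function can also be written as $s(x) = \frac{1 + \tanh(x/2)}{2}$. The NumPy sketch below (a stand-in for the Theano graph, not a replacement for it) checks that both forms agree on the input used above; `s_direct` and `s_tanh` are names chosen for this example.

```python
import numpy as np

x = np.array([[0., 1.], [-1., -2.]])

s_direct = 1 / (1 + np.exp(-x))       # the definition used in the cell above
s_tanh = (1 + np.tanh(x / 2)) / 2     # the equivalent tanh form

# Every step is elementwise, so both results keep the 2x2 shape of x
print(np.allclose(s_direct, s_tanh))
```

In Theano the same expression would presumably be built with `T.tanh` in place of `np.tanh`.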


<a name='4_more_than_one_output'></a>
## 4) Computing More than one Thing at the Same Time

- Theano supports functions with multiple outputs. 
- For example, we can compute the elementwise difference, absolute difference, and squared difference between two matrices a and b at the same time
- **Note** - **dmatrices()** produces as many outputs as names that you provide. It is a shortcut for allocating symbolic variables.

In [12]:
a, b = T.dmatrices('a','b')
diff = a-b
abs_diff = abs(diff)
squared_diff = diff**2
f = function([a,b], [diff, abs_diff, squared_diff])
print(f([[1, 1], [1, 1]], [[0, 1], [2, 3]]))

[array([[ 1.,  0.],
       [-1., -2.]]), array([[ 1.,  0.],
       [ 1.,  2.]]), array([[ 1.,  0.],
       [ 1.,  4.]])]


<a name='5_default_value'></a>
## 5) Setting a Default Value for an Argument

- Let’s say you want to define a function that adds two numbers, except that if you only provide one number, the other input is assumed to be one
- We make use of the **In class**, which allows you to specify properties of your function’s parameters in greater detail. Here we give a default value of 1 for 'y' by creating an **In instance** with its value field set to 1.

In [13]:
x, y = T.dscalars('x', 'y')
z = x+y
f = function([x, In(y, value=1)], z)
print(f(33))
print(f(33,2))

34.0
35.0


- Inputs with default values must follow inputs without default values (like Python’s functions). 
- There can be multiple inputs with default values. 
- These parameters can be set positionally or by name, as in standard Python.

In [14]:
x,y,w = T.dscalars('x', 'y', 'w')
z = (x+y)*w
f = function([x, In(y, value=1), In(w, value=2, name='w_by_name')], z)
print(f(33))
print(f(33, 2))
print(f(33, 0, 1))
print(f(33, w_by_name=1))
print(f(33, w_by_name=1, y=0))

68.0
70.0
33.0
34.0
33.0


**Note:** **In** does not know the names of the local variables y and w that are passed as arguments. The symbolic variable objects have name attributes (set by dscalars in the example above), and these are the names of the keyword parameters in the functions that we build. This is the mechanism at work in In(y, value=1). In the case of In(w, value=2, name='w_by_name'), we override the symbolic variable’s name attribute with a name to be used for this function.
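
For comparison, the calling convention of the compiled function mirrors an ordinary Python function with default arguments. The plain-Python stand-in below is only an analogy (it involves no Theano graph), but it reproduces the five results printed above.

```python
def f(x, y=1, w_by_name=2):
    """Plain-Python stand-in for the Theano function built above."""
    return float((x + y) * w_by_name)

results = [
    f(33),                     # both defaults used
    f(33, 2),                  # y set positionally
    f(33, 0, 1),               # y and w set positionally
    f(33, w_by_name=1),        # w set by its overridden name
    f(33, w_by_name=1, y=0),   # both set by name
]
print(results)
```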

<a name='6_shared_var'></a>
## 6) Using Shared Variables

- Functions can also carry internal state. For example, let’s say we want to make an accumulator: at the beginning, the state is initialized to zero. Then, on each function call, the state is incremented by the function’s argument.
- Our accumulator function adds its argument to the internal state, and returns the old state value.

In [17]:
state = shared(0)
inc = T.iscalar('inc')
accumulator = function([inc], state, updates=[(state, state+inc)])

- **Shared variables :-**
    - The **shared()** function constructs so-called shared variables. 
    - These are hybrid symbolic and non-symbolic variables **whose value may be shared between multiple functions**. 
    - Shared variables can be used in symbolic expressions just like the objects returned by dmatrices(...) but they also have an internal value that defines the value taken by this symbolic variable in all the functions that use it. It is called a shared variable because its value is shared between many functions. 
    - The value can be accessed and modified by the **.get_value() and .set_value()** methods
    
    
- **Updates :-**
    - The other new element in the accumulator code is the **updates** parameter of function().
    - updates must be supplied with a list of pairs of the form (shared-variable, new expression). It can also be a dictionary whose keys are shared-variables and values are the new expressions. 
    - It means “whenever this function runs, it will **replace the .value of each shared variable with the result of the corresponding expression**”. 
    - Above, our accumulator replaces the state's value with the sum of the state and the increment amount.

In [18]:
print(state.get_value())
print(accumulator(1))
print(state.get_value())
print(accumulator(300))
print(state.get_value())

0
0
1
1
301
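
The order of events matters here: the output is computed from the old state, and only then is the update applied. The plain-Python class below is a hypothetical analogy for that behaviour (`PyAccumulator` is a name invented for this sketch, and it is not how Theano implements shared variables).

```python
class PyAccumulator(object):
    """Hypothetical plain-Python analogy for the Theano accumulator."""

    def __init__(self, value=0):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value

    def __call__(self, inc):
        old = self._value          # the output is read from the old state...
        self._value = old + inc    # ...and then the update is applied
        return old

acc = PyAccumulator(0)
trace = [acc.get_value(), acc(1), acc.get_value(), acc(300), acc.get_value()]
print(trace)
```

The printed list follows the same sequence as the five values shown above.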


- It is possible to reset the state. Just use the **.set_value() method**

In [19]:
state.set_value(-1)
print(accumulator(3))
print(state.get_value())

-1
2


- As mentioned above, you can define more than one function to use the same shared variable. 
- These functions can all update the value.

In [20]:
decrementor = function([inc], state, updates=[(state, state-inc)])
print(decrementor(2))
print(state.get_value())

2
0


- **givens parameter** :-
    - It may happen that you expressed some formula using a shared variable, but you do not want to use its value. In this case, you can use the **givens parameter** of function, which replaces a particular node in a graph for the purpose of one particular function.
    - A good way of thinking about givens is as a mechanism that allows you to replace any part of your formula with a different expression that evaluates to a **tensor of the same shape and dtype**.
    - [For more on 'givens' parameter](http://deeplearning.net/software/theano/tutorial/examples.html#using-shared-variables)

In [24]:
fn_of_state = state*2 + inc
# The type of foo must match the shared variable we are replacing with givens
foo = T.scalar(dtype=state.dtype)
skip_shared = function([inc, foo], fn_of_state, givens=[(state, foo)])
print(skip_shared(1, 3))  # we're using 3 for the state, not state.get_value()
print(state.get_value())  # old state still there, but we didn't use it

7
0


<a name='7_copying_functions'></a>
## 7) Copying Functions

- Theano functions can be copied, which can be useful for creating similar functions but with different shared variables or updates. 
- This is done using the **copy() method of function objects**. 
- The optimized graph of the original function is copied, so compilation only needs to be performed once.

In [25]:
# Let’s start from the accumulator defined above
state = shared(0)
inc = T.iscalar('inc')
accumulator = function([inc], state, updates=[(state, state+inc)])
print(accumulator(10))
print(state.get_value())

0
10


- We can use **copy()** to create a similar accumulator but with its own internal state using the **swap parameter**, which is a **dictionary of shared variables to exchange**

In [27]:
new_state = shared(0)
new_accumulator = accumulator.copy(swap={state:new_state})
print(new_accumulator(100))
print(new_state.get_value())

# The state of the first function is left untouched
print(state.get_value())

[array(0)]
100
10


- We now create a copy with **updates removed** using the **delete_updates parameter**, which is set to **False by default**

In [28]:
null_accumulator = accumulator.copy(delete_updates=True)
print(null_accumulator(100))
print(state.get_value())

[array(10)]
10


- As expected, the shared state is no longer updated

<a name='8_random_numbers'></a>
## 8) Using Random Numbers