# Theano tutorial (`Module-theano`)

### [Theano tutorial](http://theano.readthedocs.org/en/latest/tutorial/index.html#tutorial)

### [Newmu/Theano-Tutorials](https://github.com/Newmu/Theano-Tutorials)

Use `pip` to install the `Theano` package:

`pip install theano`

The release that `pip` installs is potentially out of date. The installed version can be checked as follows:

In [2]:
import theano as th
th.__version__

'0.8.1'

In [3]:
import matplotlib.pyplot as plt
import numpy as np
import theano as th
import theano.tensor as T
# from theano import tensor as T

In [4]:
# declare two symbolic floating-point scalars
a = T.dscalar()
b = T.dscalar() # scalar() also works

# create a simple expression
c = a + b
# compile it; named `add` to avoid shadowing Python's built-in sum()
add = th.function(inputs=[a, b], outputs=c)
print(add(3, 4))

y = a * b
mul = th.function(inputs=[a, b], outputs=y)

print(mul(1, 2))
print(mul(3, 3))

7.0
2.0
9.0
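
The pattern above (declare symbolic inputs, build an expression, then compile it into a callable) can be illustrated with a toy expression graph in plain Python. This is only a sketch of the idea, not how Theano is implemented; the names `Sym`, `evaluate`, and `function` are hypothetical and exist only for this illustration.

```python
import operator


class Sym:
    """Hypothetical stand-in for a symbolic scalar (not part of Theano)."""

    def __init__(self, op=None, args=()):
        self.op, self.args = op, args

    def __add__(self, other):
        return Sym(operator.add, (self, other))

    def __mul__(self, other):
        return Sym(operator.mul, (self, other))


def evaluate(node, env):
    """Evaluate the expression graph, looking up leaf variables in env."""
    if node.op is None:  # leaf node: an input variable
        return env[node]
    return node.op(*(evaluate(arg, env) for arg in node.args))


def function(inputs, output):
    """Return a callable that binds positional arguments to the inputs."""
    def compiled(*values):
        return evaluate(output, dict(zip(inputs, values)))
    return compiled


a, b = Sym(), Sym()            # two symbolic scalars
add = function([a, b], a + b)  # nothing is computed until the call
mul = function([a, b], a * b)
print(add(3.0, 4.0), mul(3.0, 3.0))
```

As in the Theano cell, building `a + b` only records the operation; the arithmetic happens when the returned callable is invoked with concrete values.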


The following tensor type constructors are available. The prefix letter selects the dtype and the rest of the name selects the number of dimensions:

- byte (`int8`): `bscalar`, `bvector`, `bmatrix`, `brow`, `bcol`, `btensor3`, `btensor4`
- 16-bit integers (`int16`): `wscalar`, `wvector`, `wmatrix`, `wrow`, `wcol`, `wtensor3`, `wtensor4`
- 32-bit integers (`int32`): `iscalar`, `ivector`, `imatrix`, `irow`, `icol`, `itensor3`, `itensor4`
- 64-bit integers (`int64`): `lscalar`, `lvector`, `lmatrix`, `lrow`, `lcol`, `ltensor3`, `ltensor4`
- float (`float32`): `fscalar`, `fvector`, `fmatrix`, `frow`, `fcol`, `ftensor3`, `ftensor4`
- double (`float64`): `dscalar`, `dvector`, `dmatrix`, `drow`, `dcol`, `dtensor3`, `dtensor4`
- complex (`complex64`): `cscalar`, `cvector`, `cmatrix`, `crow`, `ccol`, `ctensor3`, `ctensor4`
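
As a quick reference, the prefix letters in the list above can be written as a lookup table. The sketch below uses NumPy only to report element sizes; the dictionary name `PREFIX_DTYPE` is made up for this example, and the dtype names follow the usual Theano convention (in particular, `c` is assumed to mean `complex64`).

```python
import numpy as np

# Prefix letter -> dtype name, following the list above.
PREFIX_DTYPE = {
    "b": "int8",
    "w": "int16",
    "i": "int32",
    "l": "int64",
    "f": "float32",
    "d": "float64",
    "c": "complex64",  # assumption: single-precision complex
}

for prefix, name in PREFIX_DTYPE.items():
    dtype = np.dtype(name)
    print(f"{prefix}scalar -> {name} ({dtype.itemsize} bytes per element)")
```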


In [5]:
th.pp(a), th.pp(c)

('<TensorType(float64, scalar)>',
 '(<TensorType(float64, scalar)> + <TensorType(float64, scalar)>)')

The following examples come from the Theano documentation:

- http://theano.readthedocs.org/en/latest/tutorial/examples.html

Create a function that accepts a matrix as input.

In [6]:
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
logistic = th.function([x], s)
logistic([[0, 1], [-1, -2]])

array([[ 0.5       ,  0.73105858],
       [ 0.26894142,  0.11920292]])
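
The result can be cross-checked without Theano. The following NumPy-only sketch applies the same formula, `1 / (1 + exp(-x))`, elementwise to the same input; the helper name `logistic_np` is introduced here only for illustration.

```python
import numpy as np


def logistic_np(x):
    """Elementwise logistic function, 1 / (1 + exp(-x))."""
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))


result = logistic_np([[0, 1], [-1, -2]])
print(result)
```

The printed values should agree with the Theano output above to the displayed precision.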

### Theano functions can return multiple values 

In [7]:
a, b = T.dmatrices('a', 'b')
diff = a - b
abs_diff = abs(diff)
diff_squared = diff**2
f = th.function([a, b], [diff, abs_diff, diff_squared])
f

<theano.compile.function_module.Function at 0x1129c1b38>

In [8]:
d, ad, ds = f([[1, 1], [1, 1]], [[0, 1], [2, 3]])
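
The cell above does not display its results. As a rough check of what the three outputs should contain, the same arithmetic can be done directly in NumPy; the variable names below are chosen for this sketch and are not part of the notebook.

```python
import numpy as np

a_val = np.array([[1.0, 1.0], [1.0, 1.0]])
b_val = np.array([[0.0, 1.0], [2.0, 3.0]])

diff_np = a_val - b_val          # elementwise difference
abs_diff_np = np.abs(diff_np)    # elementwise absolute value
diff_squared_np = diff_np ** 2   # elementwise square

print(diff_np, abs_diff_np, diff_squared_np, sep="\n")
```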

### Theano functions can have default values

In [9]:
x = T.dscalar()
y = T.dscalar()
z = x + y
f = th.function(inputs=[x, th.In(y, value=1)], outputs=z)
f(33)

array(34.0)

### Theano functions can modify shared variables

Define the shared variable `state` and the function `accumulator`.

In [10]:
state = th.shared(0)
inc = T.iscalar('inc')
accumulator = th.function([inc], state, updates=[(state, state+inc)])

Check the value of `state`, then change it and look again. 

In [11]:
print(state.get_value())

accumulator(1)
print(state.get_value())

accumulator(300)
print(state.get_value())

0
1
301


The shared variable can be reset, in this case to `-1`.

In [12]:
state.set_value(-1)
accumulator(3)
print(state.get_value())

2
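
The behaviour of `updates` can be mimicked in plain Python: the function's output is computed from the current state, and only then is the new state stored. The class below is a hypothetical analogue written for this explanation, not Theano code.

```python
class Accumulator:
    """Hypothetical plain-Python analogue of a function with `updates`."""

    def __init__(self, value=0):
        self.state = value

    def __call__(self, inc):
        previous = self.state        # the output uses the old state
        self.state = previous + inc  # then the update is applied
        return previous


acc = Accumulator(0)
returned = [acc(1), acc(300)]
print(returned, acc.state)  # returned values are the states before each update

acc.state = -1  # analogous to state.set_value(-1)
acc(3)
print(acc.state)
```

This mirrors the sequence of values printed above: the state goes from `0` to `1` to `301`, and after the reset to `-1` a call with `3` leaves it at `2`.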


### Short linear regression example

The `linspace` function, in this case, returns an array of `101` equally spaced numbers starting with `-1` and ending with `1`.

In [13]:
trX = np.linspace(start=-1, stop=1, num=101)
trX[0:5]

array([-1.  , -0.98, -0.96, -0.94, -0.92])

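The `randn` function returns, in this case, an array of `101` random values drawn from the standard normal distribution (mean `0`, standard deviation `1`). Multiplying by `0.33` scales the standard deviation to `0.33`. Note that `trY`, as defined here, is pure noise and does not depend on `trX`.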

In [14]:
trY = np.random.randn(*trX.shape) * 0.33
trY[0:5]

array([-0.16264745,  0.83387417, -0.04307967, -0.11861804, -0.18834144])

In [15]:
X = T.scalar()
Y = T.scalar()

def model(X, w):
    return X * w

# w is the model weight: a shared variable, initially zero (0)
w = th.shared(np.asarray(0.0,
                         dtype=th.config.floatX))
# model is X times w
y = model(X, w)
X, w, y

(<TensorType(float64, scalar)>,
 <TensorType(float64, scalar)>,
 Elemwise{mul,no_inplace}.0)

### Formula to calculate the mean squared error

In [16]:
cost = T.mean(T.sqr(y - Y))
cost

mean

In [17]:
gradient = T.grad(cost=cost, wrt=w)
th.pp(gradient)

'(((fill(sqr(((<TensorType(float64, scalar)> * <TensorType(float64, scalar)>) - <TensorType(float64, scalar)>)), fill(Sum{acc_dtype=float64}(sqr(((<TensorType(float64, scalar)> * <TensorType(float64, scalar)>) - <TensorType(float64, scalar)>))), TensorConstant{1.0})) * ((<TensorType(float64, scalar)> * <TensorType(float64, scalar)>) - <TensorType(float64, scalar)>)) * TensorConstant{2}) * <TensorType(float64, scalar)>)'

In [18]:
g = th.function(inputs=[X,Y], outputs=gradient)

In [19]:
g(-3,-2), w.get_value()

(array(-12.0), array(0.0))
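
The value `-12.0` can be verified by hand. For a single sample the cost is `(X*w - Y)**2`, so its derivative with respect to `w` is `2 * (X*w - Y) * X`. The sketch below evaluates that formula at `X = -3`, `Y = -2`, `w = 0` and compares it with a central finite difference; the function names are made up for this check.

```python
def cost_fn(w, x, y):
    """Squared error of the model x * w against the target y."""
    return (x * w - y) ** 2


def analytic_grad(w, x, y):
    """Derivative of cost_fn with respect to w."""
    return 2.0 * (x * w - y) * x


def numeric_grad(w, x, y, eps=1e-6):
    """Central finite-difference approximation of the same derivative."""
    return (cost_fn(w + eps, x, y) - cost_fn(w - eps, x, y)) / (2 * eps)


print(analytic_grad(0.0, -3.0, -2.0))
print(numeric_grad(0.0, -3.0, -2.0))
```

Both values should be `-12` up to floating-point rounding, matching what `T.grad` produced symbolically.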

In [20]:
updates = [[w, w - gradient * 0.01]]
updates

[[<TensorType(float64, scalar)>, Elemwise{sub,no_inplace}.0]]

In [21]:
train = th.function(inputs=[X, Y], outputs=cost, updates=updates, allow_input_downcast=True)

In [22]:
train(2,4)

array(16.0)

In [27]:
for i in range(100):
    for x, y in zip(trX, trY):
        train(x, y)

In [28]:
print(w.get_value())  # close to 0 here, because trY is pure noise; it would be around 2 only if trY were 2 * trX + noise

-0.030906874681516143
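
The fitted weight ends up near `0` rather than `2` because `trY` was generated as noise alone. The NumPy-only sketch below repeats the same per-sample gradient descent (learning rate `0.01`, `100` passes) on two targets: pure noise, and `2 * x + noise`. The helper `sgd_fit`, the random seed, and the second target are assumptions added for comparison; they are not part of the notebook.

```python
import numpy as np


def sgd_fit(xs, ys, lr=0.01, epochs=100):
    """Per-sample gradient descent on (x * w - y)**2, starting from w = 0."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            grad = 2.0 * (x * w - y) * x
            w -= lr * grad
    return w


rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
xs = np.linspace(-1, 1, 101)
noise = rng.standard_normal(xs.shape) * 0.33

w_noise = sgd_fit(xs, noise)            # target unrelated to x
w_signal = sgd_fit(xs, 2 * xs + noise)  # target with true slope 2
print(w_noise, w_signal)
```

With the noise-only target the weight stays close to `0`, as in the output above; with the linear target it settles close to `2`.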
