# Basics of Theano

## Adding two numbers

Let's make a very simple function:

In [1]:
import theano
import theano.tensor as T
from theano import function
x = T.dscalar('x')
# x is a symbolic double-precision (float64) scalar named 'x'
y = T.dscalar('y')
# y is a symbolic double-precision (float64) scalar named 'y'
z = x + y
f = function([x, y], z)

Now that we’ve created our function, we can use it:

In [2]:
f(2, 3)

array(5.0)

In [3]:
f(16.3, 12.1)

array(28.4)

## Adding two matrices

Adding two matrices works the same way. The only change from the previous example is that you instantiate `x` and `y` using the matrix Types:

In [4]:
x = T.dmatrix('x')
y = T.dmatrix('y')
z = x + y
f = function([x, y], z)

`dmatrix` is the Type for matrices of doubles. Then we can use our new function on 2D arrays:

In [5]:
f([[1, 2], [3, 4]], [[10, 20], [30, 40]])

array([[ 11.,  22.],
       [ 33.,  44.]])

The returned value is a NumPy array. We can also use NumPy arrays directly as inputs:

In [6]:
import numpy
f(numpy.array([[1, 2], [3, 4]]), numpy.array([[10, 20], [30, 40]]))

array([[ 11.,  22.],
       [ 33.,  44.]])

The following types are available:
* **byte**: `bscalar, bvector, bmatrix, brow, bcol, btensor3, btensor4`
* **16-bit integers**: `wscalar, wvector, wmatrix, wrow, wcol, wtensor3, wtensor4`
* **32-bit integers**: `iscalar, ivector, imatrix, irow, icol, itensor3, itensor4`
* **64-bit integers**: `lscalar, lvector, lmatrix, lrow, lcol, ltensor3, ltensor4`
* **float**: `fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4`
* **double**: `dscalar, dvector, dmatrix, drow, dcol, dtensor3, dtensor4`
* **complex**: `cscalar, cvector, cmatrix, crow, ccol, ctensor3, ctensor4`
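The naming pattern in this list can be decoded mechanically: the first letter selects the element type and the rest selects the number of dimensions. The following is a minimal NumPy-only sketch of that reading. The helper `describe` and both lookup tables are hypothetical names introduced here for illustration and are not part of Theano. It assumes that "byte" means `int8`, "float" means `float32`, "double" means `float64`, and "complex" means `complex64`, and that rows and columns count as 2-D.

```python
import numpy as np

# Prefix letter -> element type (assumed mapping, see the lead-in above).
PREFIX_TO_DTYPE = {
    "b": np.int8,
    "w": np.int16,
    "i": np.int32,
    "l": np.int64,
    "f": np.float32,
    "d": np.float64,
    "c": np.complex64,
}

# Suffix -> number of dimensions (rows and columns are treated as 2-D).
SUFFIX_TO_NDIM = {
    "scalar": 0,
    "vector": 1,
    "matrix": 2,
    "row": 2,
    "col": 2,
    "tensor3": 3,
    "tensor4": 4,
}


def describe(type_name):
    """Hypothetical helper: split a name such as 'dmatrix' into (dtype name, ndim)."""
    prefix, suffix = type_name[0], type_name[1:]
    return np.dtype(PREFIX_TO_DTYPE[prefix]).name, SUFFIX_TO_NDIM[suffix]


print(describe("dmatrix"))
print(describe("iscalar"))
```

Under these assumptions, `describe("dmatrix")` gives `('float64', 2)`, which matches the description of `dmatrix` as the Type for matrices of doubles.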

# Shared Variables

Let’s define the accumulator function. It adds its argument to the internal state, and returns the old state value.

In [7]:
from theano import shared
state = shared(0)
inc = T.iscalar('inc')
accumulator = function([inc], state, updates=[(state, state+inc)])

This code introduces a few new concepts. The `shared` function constructs so-called shared variables. These are hybrid symbolic and non-symbolic variables whose value may be shared between multiple functions. The value can be accessed and modified by the `.get_value()` and `.set_value()` methods.

The other new thing in this code is the `updates` parameter of `function`. `updates` must be supplied with a list of pairs of the form (shared-variable, new expression). It defines an update rule for the function: “whenever this function runs, it will replace the value of each shared variable with the result of the corresponding expression”.

Let's have a try!

In [8]:
state.get_value()

array(0)

In [9]:
accumulator(1)

array(0)

In [10]:
state.get_value()

array(1)

In [11]:
accumulator(300)

array(1)

In [12]:
state.get_value()

array(301)

It is possible to reset the state. Just use the `.set_value()` method:

In [13]:
state.set_value(-1)
accumulator(3)

array(-1)

In [14]:
state.get_value()

array(2)
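The behaviour seen in the cells above (the call returns the old state, and only then does the state change) can be mimicked in plain Python. The sketch below is an analogy for the semantics only, not how Theano implements shared variables. `SharedValue` and `make_accumulator` are hypothetical names introduced here.

```python
class SharedValue:
    """Hypothetical stand-in for a shared variable: a mutable value holder."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value


def make_accumulator(state):
    """Build a function that returns the old state, then applies the update."""

    def accumulator(inc):
        old = state.get_value()      # the output is computed from the old state
        state.set_value(old + inc)   # the update rule is applied afterwards
        return old

    return accumulator


state = SharedValue(0)
accumulator = make_accumulator(state)

print(accumulator(1))      # 0
print(state.get_value())   # 1
print(accumulator(300))    # 1
print(state.get_value())   # 301
```

Because the state lives outside the function, any other function built around the same `SharedValue` would see the updated value, which is the sense in which the value is "shared".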

You might be wondering why the updates mechanism exists. You can always achieve a similar result by returning the new expressions and working with them in NumPy as usual. The updates mechanism exists mainly for efficiency and syntactic convenience:
* It is quick and convenient for in-place algorithms, such as gradient descent.
* Shared variables can be allocated on the GPU for parallel computing.

With `updates` and `shared` in hand, how easy is logistic regression?
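As a reading aid for the cell that follows, the model it builds can be written out as below. The symbols $N$ (number of examples), $x_i$ (row $i$ of `x`), $y_i$ (entry $i$ of `y`) and $p_i$ (entry $i$ of `p_1`) are notation introduced here. The constants 0.01 and 0.1 are taken from the code. The gradient formulas are what differentiating the cost by hand gives. `T.grad` derives the gradients symbolically, so this is a sketch of the mathematics and not a description of Theano's internals.

$$
\begin{aligned}
p_i &= \frac{1}{1 + \exp(-x_i \cdot w - b)} \\
\mathrm{xent}_i &= -y_i \log p_i - (1 - y_i)\log(1 - p_i) \\
\mathrm{cost} &= \frac{1}{N}\sum_{i=1}^{N} \mathrm{xent}_i + 0.01 \sum_j w_j^2 \\
\frac{\partial\,\mathrm{cost}}{\partial w} &= \frac{1}{N}\sum_{i=1}^{N} (p_i - y_i)\, x_i + 0.02\, w, \qquad
\frac{\partial\,\mathrm{cost}}{\partial b} = \frac{1}{N}\sum_{i=1}^{N} (p_i - y_i) \\
w &\leftarrow w - 0.1\,\frac{\partial\,\mathrm{cost}}{\partial w}, \qquad
b \leftarrow b - 0.1\,\frac{\partial\,\mathrm{cost}}{\partial b}
\end{aligned}
$$

The last line is the pair of update rules passed to `updates`: each call to `train` performs one gradient-descent step on the shared variables `w` and `b`.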

In [15]:
# -*- coding: utf-8 -*-
"""
Created on Fri Oct 23 22:57:39 2015

@author: geogria
"""

import numpy
import theano
import theano.tensor as T
rng = numpy.random

N = 400
feats = 784
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
training_steps = 10000

# Declare Theano symbolic variables
x = T.matrix("x",dtype='float64')
y = T.vector("y",dtype='float64')
w = theano.shared(rng.randn(feats), name="w")
b = theano.shared(0., name="b")
print("Initial model:")
print(w.get_value())
print(b.get_value())

# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))   # Probability that target = 1
prediction = p_1 > 0.5                    # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw, gb = T.grad(cost, [w, b])             # Compute the gradient of the cost
                                          # (we shall return to this in a
                                          # following section of this tutorial)

# Compile
train = theano.function(
          inputs=[x,y],
          outputs=[prediction, xent],
          updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)))
predict = theano.function(inputs=[x], outputs=prediction)

# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])


print("Final model:")
print(w.get_value())
print(b.get_value())
print("target values for D:")
print(D[1])
print("prediction on D:")
print(predict(D[0]))
print("error rate on D:")
# Fraction of misclassified examples, as a percentage (avoids integer division)
print("The error rate of logistic regression on current data is %f%%" % (100 * numpy.mean(predict(D[0]) != D[1])))

Initial model:
[ 0.26387655  1.30884819 -1.40505943 -1.70166612  1.02481198 -1.50681843
 -1.00673125  0.97593099  1.19517985 -0.59950998  0.11646974  0.13021119
  1.80328951 -0.12183797 -0.23934532  0.49098185 -0.49768521 -0.25462934
  0.8029802  -0.68715925 -0.76375023 -0.18694247  0.49796417  2.18232051
  0.83950499  0.28321799  0.52423084  0.53046588 -0.26948    -0.75118415
 -0.24592226  0.48168119  1.43678952 -0.02914829 -0.21078667  0.63208399
 -0.68087956 -0.52923221  0.65394403  0.57312532 -0.04877498  1.72828063
 -0.10654639  1.53751918 -0.9505946   0.66380084  1.35181672  0.18216064
 -1.77447859 -0.11738793  0.44273407  1.00585611 -0.30748124 -0.35192503
 -1.38858729 -0.46111469 -1.60291084 -2.71783744  0.68913571 -2.58118772
 -1.16847936 -0.07467098 -1.00171459 -1.34013243 -0.18655731  2.44109409
  0.60422183  0.29821798  1.12941728  1.05607663 -1.86983459 -1.60347768
  1.16484494 -1.28153907 -1.4022972  -0.88358031 -0.03239483  0.29141669
 -0.23749775  0.5495334   0.06250661