# Theano Variables

In [1]:
# Theano Tensor 
import theano.tensor as T

We can create a scalar, vector, and matrix as follows:

In [3]:
c = T.scalar('c')
v = T.vector('v')
A = T.matrix('A')

We also have tensors, which handle dimensionality 3 and up. These are commonly used when dealing with images that have _not_ been flattened. For instance, if we had $N$ images of size 28x28 and wanted to store each one as a square rather than as a flat vector, we would have an $N \times 28 \times 28$ (3-dimensional) tensor.
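To make the shapes concrete, here is a small NumPy sketch (not Theano code) comparing the unflattened and flattened layouts. The value `N = 5` and the all-zero pixel data are assumptions chosen purely for illustration. In Theano itself, the matching symbolic type would be created with `T.tensor3`.

```python
import numpy as np

N = 5  # hypothetical number of images, chosen only for illustration

# Unflattened: each image keeps its 28x28 square layout.
images = np.zeros((N, 28, 28))

# Flattened: each image becomes a single row of 28*28 = 784 pixels.
flat = images.reshape(N, -1)

print(images.ndim, images.shape)
print(flat.ndim, flat.shape)
```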

Notice that the variables we have created so far _do not have values_; they are just symbols. This means we can even do algebra on them:

In [5]:
# Dot product
w = A.dot(v)

How do we actually assign values to these variables? This is where _Theano functions_ come into play.

In [6]:
import theano

In [8]:
matrix_times_vector = theano.function(inputs=[A,v], outputs=w)

Now we can import numpy so we can create real arrays and call the function:

In [9]:
import numpy as np

In [11]:
A_val = np.array([[1,2], [3,4]])
v_val = np.array([5,6])

w_val = matrix_times_vector(A_val, v_val)
w_val

array([17., 39.])
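The result can be checked by hand: each entry of the output is the dot product of one row of `A_val` with `v_val`. The sketch below uses only NumPy. The names `by_hand` and `w_check` are introduced here for illustration.

```python
import numpy as np

A_val = np.array([[1, 2], [3, 4]])
v_val = np.array([5, 6])

# Row-by-row dot products written out explicitly.
by_hand = np.array([1 * 5 + 2 * 6,   # first row of A_val with v_val
                    3 * 5 + 4 * 6])  # second row of A_val with v_val

# The same product computed by NumPy.
w_check = A_val.dot(v_val)

print(by_hand, w_check)
```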

One of the greatest benefits of Theano is that it links all of these variables into a graph. It can use that structure to calculate gradients for us, using the chain rule. In Theano, regular variables are _not_ updatable. To be updatable, a variable must be a _shared_ variable.

In [12]:
x = theano.shared(20.0, 'x')

We can now create a simple cost function that we can minimize by hand and that we know has a global minimum.

In [15]:
cost = x*x + x + 1

Now we can tell Theano how to update $x$ by giving it an update expression, here one step of gradient descent with a learning rate of 0.3:

In [16]:
x_update = x - 0.3*T.grad(cost, x)

What is nice about Theano is that it calculates gradients automatically. The `grad` function takes two arguments: the first is the expression to differentiate (here, the cost), and the second is the variable to differentiate with respect to.
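As a sanity check on what the gradient should be: for this cost, the derivative worked out by hand is $2x + 1$. The plain-Python sketch below compares that formula with a central finite difference. This is only a numerical check, not how Theano computes gradients (Theano differentiates the graph symbolically). The helper names `cost_fn`, `grad_analytic`, and `grad_numeric` are hypothetical.

```python
def cost_fn(x):
    # Same cost as in the notebook: x^2 + x + 1
    return x * x + x + 1

def grad_analytic(x):
    # Derivative worked out by hand
    return 2 * x + 1

def grad_numeric(x, h=1e-6):
    # Central finite-difference approximation of the derivative
    return (cost_fn(x + h) - cost_fn(x - h)) / (2 * h)

x0 = 20.0  # the initial value of the shared variable
print(grad_analytic(x0), grad_numeric(x0))

# One gradient-descent step with the notebook's learning rate of 0.3
x1 = x0 - 0.3 * grad_analytic(x0)
print(x1)
```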

We can now create a Theano train function. It is like the previous function we created, except that we add a new argument, `updates`. The `updates` argument takes a list of tuples, and each tuple has two elements:
1. The shared variable to update.
2. The update expression to use. 

In [17]:
train = theano.function(inputs=[], outputs=cost, updates=[(x, x_update)])

We have created a function to train, but we haven't actually called it yet. Notice that $x$ is not an input; it is the thing that we update. In later examples the inputs will be the data and labels. So the `inputs` parameter takes the data and labels, and the `updates` parameter takes the model parameters paired with their update expressions.

Now we can write a loop to call the training function:

In [18]:
for i in range(25):
  cost_val = train()
  print(cost_val)

421.0
67.99000000000001
11.508400000000002
2.4713440000000007
1.0254150400000002
0.7940664064
0.7570506250240001
0.75112810000384
0.7501804960006143
0.7500288793600982
0.7500046206976159
0.7500007393116186
0.750000118289859
0.7500000189263775
0.7500000030282203
0.7500000004845152
0.7500000000775223
0.7500000000124035
0.7500000000019845
0.7500000000003176
0.7500000000000506
0.7500000000000082
0.7500000000000013
0.7500000000000001
0.7500000000000001


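The cost converges quickly to the expected minimum of 0.75: setting the derivative $2x + 1$ to zero gives $x = -0.5$, and the cost there is $0.25 - 0.5 + 1 = 0.75$. We can print the optimal value of $x$ using the `get_value` method: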

In [19]:
x.get_value()

array(-0.5)
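The training run above can be reproduced in plain Python, which makes the role of `updates` easier to see. This is a minimal sketch rather than Theano's implementation: it assumes the hand-derived gradient $2x + 1$, and it mirrors the ordering visible in the printed costs, where the first value (421.0) is the cost at the starting point $x = 20$, before any update is applied.

```python
x = 20.0             # plays the role of the shared variable
learning_rate = 0.3  # same step size as in x_update
costs = []

for i in range(25):
    # Output: the cost at the current value of x
    cost_val = x * x + x + 1
    costs.append(cost_val)

    # Update: one gradient-descent step, using d(cost)/dx = 2x + 1
    x = x - learning_rate * (2 * x + 1)

print(costs[0])   # cost at the starting point
print(costs[-1])  # cost after the run has converged
print(x)          # close to the minimizer -0.5
```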