# Adding Two Scalars

In [5]:
import numpy as np
import theano.tensor as T
from theano import function

In Theano, all symbols must be _typed_. In particular, below we will use `T.dscalar` as the type for a "0-dimensional array (scalar) of type double". This is a Theano `type`. 

In [25]:
x = T.dscalar('x')
y = T.dscalar('y')

Note that `dscalar` is _not_ a class, so `x` and `y` are not instances of `dscalar`. They are instances of `TensorVariable`. However, `x` and `y` are assigned the Theano Type `dscalar` in their `type` field, as you can see below:

In [26]:
type(x)

theano.tensor.var.TensorVariable

In [27]:
x.type

TensorType(float64, scalar)

In [28]:
T.dscalar

TensorType(float64, scalar)

In [29]:
x.type is T.dscalar

True

By calling `T.dscalar` with a string argument, you create a _Variable_ representing a floating-point scalar quantity with the given name. If you provide no argument, the symbol will be unnamed. Names are not required, but they often help with debugging.

Next, we can combine `x` and `y` into their sum, `z`:

In [30]:
z = x + y

`z` is yet another variable, which represents the addition of `x` and `y`. You can use the `pp` function to pretty-print the computation associated with `z`:

In [31]:
from theano import pp
print(pp(z))

(x + y)


The last step is to create a function taking `x` and `y` as inputs, and giving `z` as output: 

In [32]:
f = function([x, y], z)

In [33]:
f(2, 3)

array(5.)

Note that there is a slight delay when executing the `function` instruction. This is because, behind the scenes, `f` is being compiled into C code.

The first argument to `function` is a list of Variables that will be provided as inputs to the function. The second argument is a single Variable or a list of Variables. For either case, the second argument is what we want to see as output when we apply the function. `f` may then be used like a normal Python function.

# Adding Two Matrices
This next step simply requires that we instantiate `x` and `y` using the matrix Types:

In [43]:
x = T.dmatrix('x')
y = T.dmatrix('y')
z = x + y
f = function([x, y] , z)

We can pass in either Python lists or NumPy arrays:

In [41]:
f([[1,2],[3,4]], [[10,20],[30,40]])

array([[11., 22.],
       [33., 44.]])

In [42]:
f(np.array([[1,2], [3,4]]), np.array([[10,20], [30,40]]))

array([[11., 22.],
       [33., 44.]])

# Shared Variables
It is also possible to make a function with an internal state. For example, let's say we want to make an accumulator: at the beginning, the state is initialized to zero. Then, on each function call, the state is incremented by the function's argument. 

First, let's define the `accumulator` function. It adds its argument to the internal state, and returns the old state value. 

In [53]:
from theano import shared
state = shared(0)
inc = T.iscalar('inc')
accumulator = function([inc], state, updates=[(state, state+inc)])

The above code introduces a few new concepts. The `shared` function constructs so-called _shared variables_. These are hybrid symbolic and non-symbolic variables: they can be used in symbolic expressions just like the objects returned by `T.dmatrix(...)`, but they also hold an internal value that defines the value taken by the symbolic variable in every function that uses it. It is called a _shared_ variable because this value is shared between many functions. The value can be read and modified with the `.get_value()` and `.set_value()` methods.

The other new thing in this code is the `updates` parameter of `function`. `updates` must be supplied with a list of pairs of the form `(shared-variable, new expression)`. It can also be a dictionary whose keys are shared variables and whose values are the new expressions. Either way, it means "whenever this function runs, it will replace the value of each shared variable with the result of the corresponding expression". Above, our accumulator replaces the state's value with the sum of the state and the increment amount.

We can now try this out:

In [54]:
print(state.get_value())

0


In [55]:
accumulator(1)

array(0)

In [56]:
print(state.get_value())

1


In [57]:
accumulator(300)

array(1)

In [58]:
print(state.get_value())

301
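
The return values above show that each call outputs the state as it was before the update, and only then stores the new value. As a rough mental model, that ordering can be sketched in plain Python. The names `SharedCell` and `make_function` are hypothetical helpers invented for this illustration. They are not part of Theano and say nothing about how Theano implements updates internally.

```python
class SharedCell:
    """Hypothetical stand-in for a shared variable: a mutable box."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value


def make_function(output, updates):
    """Build a callable that mimics the output-then-update ordering.

    output:  callable(arg) -> value returned to the caller
    updates: list of (cell, callable(arg) -> new value) pairs
    """
    def call(arg):
        result = output(arg)  # evaluated against the old state
        # Evaluate every new value against the old state before storing any.
        new_values = [(cell, expr(arg)) for cell, expr in updates]
        for cell, value in new_values:
            cell.set_value(value)
        return result

    return call


state = SharedCell(0)
accumulator = make_function(
    output=lambda inc: state.get_value(),
    updates=[(state, lambda inc: state.get_value() + inc)],
)

print(accumulator(1), state.get_value())    # 0 1
print(accumulator(300), state.get_value())  # 1 301
```

In this model, all new values are computed before any cell is overwritten, so an update expression never sees a half-updated state.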


We can also reset the state by using the `.set_value()` method:

In [59]:
state.set_value(-1)

In [60]:
accumulator(3)

array(-1)

In [61]:
print(state.get_value())

2


As we mentioned above, you can define more than one function to use the same shared variable. These functions can all update the value.

In [63]:
decrementor = function([inc], state, updates=[(state, state-inc)])

In [64]:
decrementor(2)

array(2)

In [65]:
print(state.get_value())

0


You might be wondering why the updates mechanism exists. You can always achieve a similar result by returning the new expressions, and working with them in NumPy as usual. The updates mechanism can be a syntactic convenience, but it is mainly there for efficiency. Updates to shared variables can sometimes be done more quickly using in-place algorithms (e.g. low-rank matrix updates). Also, Theano has more control over where and how shared variables are allocated, which is one of the important elements of getting good performance on the GPU.
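
To make the efficiency argument concrete, here is a loose NumPy analogy (not Theano code, and not a description of what Theano does internally). It contrasts returning a new array with writing the result into an existing buffer. The variable names are chosen only for this sketch.

```python
import numpy as np

state = np.zeros(3)
inc = np.array([1.0, 2.0, 3.0])

# Functional style: the sum lives in a freshly allocated array.
new_state = state + inc
allocated_new_buffer = not np.shares_memory(new_state, state)

# In-place style: the sum is written into the existing buffer.
result = np.add(state, inc, out=state)
reused_buffer = result is state

print(allocated_new_buffer, reused_buffer)  # True True
print(state)                                # [1. 2. 3.]
```

For a small vector the difference is negligible, but for large parameter arrays updated on every training step, avoiding a fresh allocation each time can matter.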

# Logistic Regression Example
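
As a reading aid, the expressions built by the code in this section correspond to the formulas below. The notation is introduced here for illustration only: $x_i$ is the $i$-th row of the input matrix, $y_i$ its target, $N$ the number of training samples, and $j$ indexes the weights.

$$
\begin{aligned}
p_i &= \frac{1}{1 + \exp\bigl(-(x_i \cdot w + b)\bigr)} \\
\mathrm{xent}_i &= -y_i \log p_i - (1 - y_i)\log(1 - p_i) \\
\mathrm{cost} &= \frac{1}{N}\sum_{i=1}^{N} \mathrm{xent}_i + 0.01 \sum_j w_j^2 \\
w &\leftarrow w - 0.1\,\frac{\partial\,\mathrm{cost}}{\partial w}, \qquad
b \leftarrow b - 0.1\,\frac{\partial\,\mathrm{cost}}{\partial b}
\end{aligned}
$$

The last line is the gradient-descent step applied on every training iteration, with a step size of 0.1.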

In [67]:
import numpy as np
import theano 
import theano.tensor as T
rng = np.random  # NumPy is imported as np above

In [83]:
N = 400      # Training sample size
feats = 784  # Number of input variables

# generate a dataset: D = (input_values, target_class)
# D[0].shape = (400, 784), D[1].shape = (400,)
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
training_steps = 10000

# Declare Theano symbolic variables
x = T.dmatrix('x')
y = T.dvector('y')

# Initialize the weight vector w randomly
# 
# This and the following bias variable b
# are shared so they keep their updated values
# between training iterations (updates)
w = theano.shared(rng.randn(feats), name='w')

# Initialize bias term
b = theano.shared(0., name='b')

print('Initial model: ')
print(w.get_value())
print(b.get_value())

# Construct Theano Expression Graph 
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))       # Probability that target = 1 
prediction = p_1 > 0.5                        # Prediction threshold
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()    # The cost to minimize (mean loss + L2 regularization)
gw, gb = T.grad(cost, [w,b])                  # Gradient of the cost w.r.t. weight vector w and bias term b

# Compile
train = theano.function(
  inputs=[x,y], 
  outputs=[prediction, xent], 
  updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb))
)
predict = theano.function(inputs=[x], outputs=prediction)

# Train
for i in range(training_steps):
  pred, err = train(D[0], D[1])
  
print("Final model:")
print(w.get_value())
print(b.get_value())
print("target values for D:")
print(D[1])
print("prediction on D:")
print(predict(D[0]))

Initial model: 
[-0.28307552 -0.11711561 -0.67435859 -0.71231062 -0.009103   -1.18966854
 -0.77646446 -0.44784415 -0.63582262 -0.04495561 -0.21439312  1.38320957
 -1.30081328  0.31912496 -0.97557031 -0.85442273  0.24643977  0.25432972
 -0.06801892  1.32482988 -0.40875127  1.1462789   1.16642494 -0.16878324
 -0.82915517 -0.80398955  0.05212891 -0.09409004 -1.21074183  0.38909923
  0.49830529 -1.67231392 -0.95509322 -0.06741542 -0.10903575  0.86879037
 -1.3241949  -1.41218204  0.33924891 -0.27191577 -0.28371561 -0.50220569
  0.20304429  0.53546847  0.10693703  0.09831726  0.63717152 -0.33578441
  1.61789648 -0.00569742 -0.60483961 -1.07379933  0.30860778 -2.0485849
  0.38854608  0.11394413  0.0039074   1.62745394  0.83758397 -0.49460041
  1.06559908 -0.05419296  0.78746853  1.69154266 -1.44937989  0.71072351
 -0.75380912  0.2223313   1.0930451  -0.45018581  0.2855118  -0.23506212
  0.05537093 -0.09736918 -0.2835469   0.26627825  0.61577604  1.16462771
  1.76625284  0.20198312  0.04457348

Final model:
[-7.32553997e-03  1.27922373e-01  5.40192121e-02 -7.99073136e-02
 -3.29718856e-02  4.34225176e-02  1.00935763e-01  1.67103554e-01
 -1.97260799e-01 -1.50839129e-01  4.58798786e-02 -6.40497456e-02
 -5.43569231e-02 -2.53379467e-03  9.81826841e-03  4.41456073e-02
 -4.09482073e-02  1.68933601e-01 -7.99995998e-02 -6.95000593e-02
  1.71376562e-02 -1.12115828e-02  4.50506951e-02 -6.26846228e-03
 -1.67830111e-01 -3.48564063e-02  7.95880379e-02 -1.24361733e-01
  4.02783284e-02  4.45342066e-02  7.94553873e-03  9.94931809e-02
  7.27536743e-02 -5.61061985e-02 -1.07771314e-01  7.60161147e-02
 -3.30110992e-02  7.90311192e-02  1.69184821e-01 -4.46505156e-02
  7.49738825e-02  9.68492953e-02 -4.78058505e-02  3.11550561e-02
 -2.03373735e-03 -6.39608867e-02  6.30132998e-02 -1.61270786e-04
 -6.88545499e-02  5.36754298e-02  5.19537839e-02 -8.52084383e-02
 -6.02131927e-03 -1.00905619e-01  1.56084346e-01 -4.86321261e-02
 -7.35985301e-02  3.89621485e-02  1.81863345e-01 -7.30520518e-02
  5.77493577