# The DyNet computation graph
Computation graphs are the fundamental abstraction used in frameworks such as TensorFlow, Theano, and PyTorch. A computation graph describes a set of mathematical operations as nodes and edges, where nodes represent data (for example, a scalar value representing an input to the model or a matrix representing a set of learnable parameters) and edges represent function calls (for example, multiplying a value by another value).

The [class notes](http://www.cs.cornell.edu/courses/cs5740/2018sp/lectures/04-nn-compgraph.pdf) contain more information about computation graphs.

## Entering/exiting the computation graph
Below we learn how to put data into the computation graph and perform a forward pass to get that data back.

In [1]:
import dynet as dy
import numpy as np

my_scalar = np.random.randint(0,100)
my_vector = np.random.random([3])
my_matrix = np.random.random([3,3])

In [2]:
my_scalar

55

In [3]:
my_vector

array([0.42570447, 0.1507006 , 0.18234419])

In [4]:
my_matrix

array([[0.63670543, 0.29752975, 0.75228366],
       [0.39132351, 0.37960974, 0.69654221],
       [0.3036    , 0.38567218, 0.13911835]])

Now that we have some random data, let's put it into the DyNet computation graph. First, we have to renew the computation graph.

In [5]:
dy.renew_cg()

scalar_exp = dy.scalarInput(my_scalar)
vector_exp = dy.inputVector(my_vector)
matrix_exp = dy.inputTensor(my_matrix)

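Let's look at the size of these expressions using the [`dim`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.Expression.dim) function. It returns a tuple whose first element is the shape of the expression and whose second element is the batch size (1 here, since we are not batching).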

In [6]:
scalar_exp.dim()

((1,), 1)

In [7]:
vector_exp.dim()

((3,), 1)

In [8]:
matrix_exp.dim()

((3, 3), 1)

And then let's get the data back by calling the [`value`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.Expression.value) function. Depending on the dimensions of your expression, it might return a float (if it's a scalar), a list (if it's a vector), or a numpy array (if it's a matrix).

In [9]:
scalar_exp.value()

55.0

In [10]:
vector_exp.value()

[0.4257044792175293, 0.15070059895515442, 0.1823441982269287]

In [11]:
matrix_exp.value()

array([[0.63670546, 0.29752976, 0.75228363],
       [0.39132351, 0.37960973, 0.6965422 ],
       [0.30359998, 0.38567218, 0.13911834]])
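
Notice that the values coming back differ from the NumPy originals in the last few digits (for example, `0.3036` went in and `0.30359998` came out). The most likely cause is that DyNet stores tensor data as 32-bit floats while NumPy defaults to 64-bit floats. The sketch below reproduces that kind of rounding with the standard library only; `to_float32` is a hypothetical helper for illustration, not a DyNet function.

```python
import struct

def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

original = 0.3036                      # an entry of my_matrix as NumPy printed it
round_tripped = to_float32(original)   # what survives single-precision storage

# The round-tripped value is very close to the original, but not identical.
print(original, round_tripped, abs(original - round_tripped))
```

In practice this means comparisons between DyNet outputs and NumPy inputs should use a tolerance rather than exact equality.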

You can also create DyNet expressions of any size by using several functions provided by DyNet, including [`zeros`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.zeros), [`ones`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.ones), and sampling from various random distributions.

In [12]:
zeros = dy.zeros((3, 3))
zeros.value()

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [13]:
random_uniform = dy.random_uniform((3, 3), -1.0, 1.0)
random_uniform.value()

array([[ 0.86002839,  0.56079078,  0.70130646],
       [-0.08160084,  0.59550714,  0.63070405],
       [-0.21578628, -0.91960108, -0.11020499]])

## Basic mathematical operators
Now we will learn about some of the basic math operators DyNet provides. We will also look at forward passes in the graph.

DyNet supports basic element-wise operations such as exponentiation, trigonometric functions, and nonlinearities on any expression.

In [14]:
matrix_exp.value()

array([[0.63670546, 0.29752976, 0.75228363],
       [0.39132351, 0.37960973, 0.6965422 ],
       [0.30359998, 0.38567218, 0.13911834]])

In [15]:
dy.exp(matrix_exp).value()

array([[1.89024317, 1.34652841, 2.12184   ],
       [1.47893691, 1.46171391, 2.00680161],
       [1.35472703, 1.47060251, 1.14926004]])

In [16]:
dy.tanh(matrix_exp).value()

array([[0.56265211, 0.28905034, 0.63650942],
       [0.37250063, 0.36236846, 0.60216838],
       [0.29460362, 0.36762327, 0.13822773]])

In [17]:
dy.rectify(matrix_exp).value()

array([[0.63670546, 0.29752976, 0.75228363],
       [0.39132351, 0.37960973, 0.6965422 ],
       [0.30359998, 0.38567218, 0.13911834]])

By the way, any time `value` or `forward` is called, a forward pass is performed on the graph. This means that the computations are actually carried out and a numerical value is returned. All nodes and edges in the graph that contribute to the value you request will be used. Before either of these functions is called, the graph is just a set of nodes and edges describing a computation. We will discuss forward passes in more detail in the batching section.
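
To make the idea of deferred computation concrete, here is a minimal pure-Python sketch of a graph that only computes when a value is requested. It is an illustration under simplifying assumptions, not how DyNet is implemented; the `Node` class and its `evaluations` counter are hypothetical names introduced for this example.

```python
import math

class Node:
    """A graph node: either an input value or an operation on other nodes."""

    def __init__(self, op=None, args=(), constant=None):
        self.op = op              # function to apply, or None for input nodes
        self.args = args          # upstream nodes this node depends on
        self.constant = constant  # stored data for input nodes
        self.evaluations = 0      # how many times this node has been computed

    def value(self):
        """Run a forward pass over this node and everything it depends on."""
        if self.op is None:
            return self.constant
        self.evaluations += 1
        return self.op(*(arg.value() for arg in self.args))

x = Node(constant=2.0)
y = Node(op=math.exp, args=(x,))               # describes exp(x)
z = Node(op=lambda a, b: a + b, args=(y, x))   # describes exp(x) + x
unused = Node(op=math.tanh, args=(x,))         # built but never requested

built_only = (y.evaluations, z.evaluations)    # nothing has been computed yet
result = z.value()                             # the forward pass happens here
print(built_only, result, unused.evaluations)
```

Only the nodes that contribute to `z` are evaluated; `unused` is part of the graph but is never computed because nothing asked for its value.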

DyNet overloads some simple binary operators, including +, -, \*, and /. This means you can perform any of these operations with an Expression and a Python scalar, and the operation will be applied to every element of the expression.

In [18]:
(matrix_exp + 1.0).value()

array([[1.6367054 , 1.2975297 , 1.75228357],
       [1.39132357, 1.3796097 , 1.69654226],
       [1.30359995, 1.38567221, 1.13911831]])

In [19]:
(matrix_exp / 2.0).value()

array([[0.31835273, 0.14876488, 0.37614182],
       [0.19566175, 0.18980487, 0.3482711 ],
       [0.15179999, 0.19283609, 0.06955917]])

DyNet also provides component-wise operations on multiple expressions:

In [20]:
(matrix_exp + random_uniform).value()

array([[ 1.4967339 ,  0.85832053,  1.45359015],
       [ 0.30972266,  0.97511685,  1.32724619],
       [ 0.08781371, -0.53392887,  0.02891335]])

In [21]:
dy.cdiv(matrix_exp, random_uniform).value()

array([[ 0.7403307 ,  0.53055394,  1.07268882],
       [-4.79558134,  0.63745618,  1.10438836],
       [-1.40694761, -0.41939071, -1.26235974]])
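
As a cross-check, the same scalar and component-wise arithmetic can be reproduced in NumPy. This is a sketch that copies the values printed above for `matrix_exp` and `random_uniform` (a fresh run will produce different random numbers); it is not DyNet code, and the variable names are chosen only for this example.

```python
import numpy as np

# Values copied from the printed outputs of matrix_exp and random_uniform.
matrix = np.array([[0.63670546, 0.29752976, 0.75228363],
                   [0.39132351, 0.37960973, 0.6965422 ],
                   [0.30359998, 0.38567218, 0.13911834]])
uniform = np.array([[ 0.86002839,  0.56079078,  0.70130646],
                    [-0.08160084,  0.59550714,  0.63070405],
                    [-0.21578628, -0.91960108, -0.11020499]])

plus_one = matrix + 1.0              # scalar applied to every element
halved = matrix / 2.0                # scalar applied to every element
elementwise_sum = matrix + uniform   # counterpart of matrix_exp + random_uniform
elementwise_div = matrix / uniform   # counterpart of dy.cdiv(matrix_exp, random_uniform)

print(elementwise_div)
```

Note how the small denominator `-0.08160084` produces the comparatively large quotient in the second row, which matches the `dy.cdiv` output above.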

Some operations are useful for summarizing information about an Expression or reshaping it.

In [22]:
dy.sum_elems(matrix_exp).value()

3.9823849201202393

In [23]:
dy.mean_elems(vector_exp).value()

0.2529164254665375

In [24]:
dy.reshape(matrix_exp, (9, 1)).dim()

((9, 1), 1)
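
The two summaries above can be checked by hand. The following plain-Python sketch recomputes them from the values printed earlier for `matrix_exp` and `vector_exp`; it assumes nothing about DyNet beyond those printed numbers.

```python
# Values copied from the printed outputs of matrix_exp and vector_exp.
matrix = [[0.63670546, 0.29752976, 0.75228363],
          [0.39132351, 0.37960973, 0.6965422],
          [0.30359998, 0.38567218, 0.13911834]]
vector = [0.4257044792175293, 0.15070059895515442, 0.1823441982269287]

total = sum(sum(row) for row in matrix)   # counterpart of dy.sum_elems(matrix_exp)
mean = sum(vector) / len(vector)          # counterpart of dy.mean_elems(vector_exp)

print(total, mean)
```

Both results agree with the DyNet outputs up to single-precision rounding.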

A few special operations can be used on lists of expressions.

In [25]:
dy.esum([matrix_exp, random_uniform]).value()

array([[ 1.4967339 ,  0.85832053,  1.45359015],
       [ 0.30972266,  0.97511685,  1.32724619],
       [ 0.08781371, -0.53392887,  0.02891335]])

## Parameter Collections and Parameters
DyNet has a [`ParameterCollection`](http://dynet.readthedocs.io/en/latest/python_ref.html#parametercollection-and-parameters) object which is used to store optimizable tensors (e.g., a bias vector or weight matrix). 

[`Parameters`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.ParameterCollection) contain tensor data. To load them into the graph, you first need to call [`dy.parameter()`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.parameter).

[`LookupParameters`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.LookupParameters) represents a table of parameters. In general, these are used as lists of vectors, where you can look up the appropriate vector and add it to the graph by indexing the lookup parameters as you would a normal Python list.

Below we will see examples of the two types of parameters. First, we have to create the parameter collection.

In [26]:
pc = dy.ParameterCollection()

Then we can create a parameters object. Let's make a weight vector object that we can multiply our matrices with.

In [27]:
parameters = pc.add_parameters((3, 1))

graph_params = dy.parameter(parameters)
graph_params.value()

array([[-0.15800512],
       [ 0.0979048 ],
       [ 0.00921047]])

Above, we created a parameter vector of size 3 x 1, loaded it into the graph, and got its value. We can also check the value by calling [`as_array`](http://dynet.readthedocs.io/en/latest/python_ref.html#dynet.Parameters.as_array):

In [28]:
parameters.as_array()

array([[-0.15800512],
       [ 0.0979048 ],
       [ 0.00921047]])

We can perform a few computations in the graph:

In [29]:
m1 = matrix_exp * graph_params
m2 = random_uniform * graph_params
result = m1 + m2
result.value()

array([[-0.13906968],
       [ 0.05875542],
       [-0.06588291]])
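
Note that `*` between two expressions, as used in the cell above, is a matrix product rather than a component-wise one (DyNet's component-wise product is `dy.cmult`). As a sanity check, the NumPy sketch below repeats the computation with the values printed earlier in this notebook; the arrays are copied from those outputs and will differ on a fresh run.

```python
import numpy as np

# Values copied from the printed outputs above.
matrix = np.array([[0.63670546, 0.29752976, 0.75228363],
                   [0.39132351, 0.37960973, 0.6965422 ],
                   [0.30359998, 0.38567218, 0.13911834]])
uniform = np.array([[ 0.86002839,  0.56079078,  0.70130646],
                    [-0.08160084,  0.59550714,  0.63070405],
                    [-0.21578628, -0.91960108, -0.11020499]])
weights = np.array([[-0.15800512],
                    [ 0.0979048 ],
                    [ 0.00921047]])

m1 = matrix @ weights    # (3, 3) times (3, 1) gives (3, 1)
m2 = uniform @ weights
result = m1 + m2

print(result)
```

The result is a 3 x 1 column that matches the DyNet output above up to rounding, which confirms that the weight vector was applied as a matrix-vector product.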

Recall that until we call `value`, no computations have actually been performed. 

Now let's create some lookup parameters.

In [30]:
lookup_parameters = pc.add_lookup_parameters((100, 3))

We get values from the lookup parameters by using syntax similar to Python indexing:

In [31]:
lookup_vector = lookup_parameters[my_scalar]
lookup_vector.value()

[0.7301192283630371, 0.0772939920425415, 0.5839259624481201]
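
Conceptually, the lookup parameters created with shape `(100, 3)` behave like a table of 100 vectors with 3 entries each. The plain-Python sketch below mimics that structure; the uniform initialization and the names `table`, `num_entries`, and `dim` are assumptions made for illustration and do not reflect how DyNet initializes or stores its parameters.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

num_entries, dim = 100, 3
# One row per entry; each row is a small vector of learnable values.
table = [[random.uniform(-1.0, 1.0) for _ in range(dim)]
         for _ in range(num_entries)]

index = 55                    # any integer in [0, num_entries) is a valid key
lookup_vector = table[index]  # select the row, as lookup_parameters[index] does

print(lookup_vector)
```

Because `my_scalar` was drawn with `np.random.randint(0, 100)`, it always falls in the valid range of 0 to 99 for a table with 100 entries.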

Sometimes during model development it's necessary to save and load learned parameters. Saving a parameter collection writes all of its `Parameters` and `LookupParameters` objects to disk. However, it won't save other training state, such as learning rate coefficients and optimizer parameters, so be careful.

In [33]:
save_filename = "save.dy"
pc.save(save_filename)      # write every parameter in the collection to disk
pc.populate(save_filename)  # load the saved values back into the collection