In [1]:
import dynet as dy

The parameter collection contains all of the learned parameters in the model.

In [2]:
pc = dy.ParameterCollection()

# Parameters
First we will look at how to use Parameters in DyNet.

In [3]:
weights = pc.add_parameters((2, 4))
biases = pc.add_parameters((4,), init=dy.UniformInitializer(0.1))

`weights` is a matrix of size (2, 4). `biases` is a vector of length 4, and is initialized uniformly in [-0.1, 0.1].

In [4]:
weights.as_array()

array([[ 0.53551078,  0.73564613,  0.48881745, -0.47133124],
       [ 0.61683357, -0.10449088, -0.52180678,  0.52510762]])

In [5]:
biases.as_array()

array([ 0.02503315, -0.00600196,  0.0618191 ,  0.05418576])
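To make the initializer's behaviour concrete, here is a minimal NumPy stand-in for drawing a length-4 vector uniformly from [-0.1, 0.1]. It does not call DyNet; the names `rng`, `scale`, and `biases_np` and the seed are illustrative assumptions, and the sampled values will differ from the ones printed above.

```python
import numpy as np

# NumPy stand-in (not DyNet): sample 4 values uniformly from [-scale, scale].
rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
scale = 0.1
biases_np = rng.uniform(-scale, scale, size=4)

# Every sampled value should lie inside the interval.
in_range = bool(np.all(np.abs(biases_np) <= scale))
print(biases_np.shape, in_range)
```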

Now we will perform some basic computations with these parameters. There are two important things to note. First, you should reset the computation graph by calling `dy.renew_cg()`. Second, whenever you want to use the parameters we created, you should wrap them in a call to `dy.parameter(.)`. This transforms them into a `dy.Expression`, which is the type for data in the computation graph.

First though, we will create a vector to pass into the network.

In [6]:
import numpy as np
my_array = np.random.rand(2)
my_array

array([0.46854802, 0.79034192])

To put it into the computation graph, we need to call `dy.inputTensor`.

In [7]:
dy.renew_cg()
my_array = dy.inputTensor(my_array)

Then we pass it through the network by multiplying it by `weights` and adding the `biases`.

Also note that we have to reshape the input array so that we can perform the multiplication. This adds a dimension to the vector. Before the reshape, it was a 1D vector of length 2. After, it's a 2D matrix of size 1x2. This allows us to multiply it by the 2x4 `weights` matrix. The multiplication gives us a matrix of size (1, 4), so we must transpose it to size (4, 1) before adding the length-4 `biases`.

In [8]:
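# Wrap the parameters so they can be used as expressions in the graph.
weights_exp = dy.parameter(weights)
biases_exp = dy.parameter(biases)

# (2,) -> (1, 2) so the input can be multiplied by the (2, 4) weights.
my_array = dy.reshape(my_array, (1, 2))
result = my_array * weights_exp    # shape (1, 4)
result = dy.transpose(result)      # shape (4, 1)
result = result + biases_exp       # add the length-4 biases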

Several functions allow us to investigate the `result`. `dim()` gives us the shape of the expression together with its batch size, and `value()` gives us the actual values in the expression.

By the way, before calling `value()`, the computation graph hasn't actually performed any computations. Only when you request the actual numerical value of an expression does it perform the necessary computations to get the result.
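The idea of deferring work until a value is requested can be illustrated with a toy, pure-Python sketch. This is not how DyNet is implemented; `LazyExpr` and `constant` are hypothetical names invented for this illustration. Operators only record what to compute, and `value()` triggers the actual arithmetic.

```python
class LazyExpr:
    """Toy expression node: stores how to compute a value, not the value itself."""

    def __init__(self, compute):
        self._compute = compute
        self.evaluated = False

    def __add__(self, other):
        # Build a new node; nothing is computed yet.
        return LazyExpr(lambda: self.value() + other.value())

    def __mul__(self, other):
        # Build a new node; nothing is computed yet.
        return LazyExpr(lambda: self.value() * other.value())

    def value(self):
        # Only here does any arithmetic actually run.
        self.evaluated = True
        return self._compute()


def constant(x):
    """Hypothetical helper: wrap a plain number as a leaf node."""
    return LazyExpr(lambda: x)


a = constant(2.0)
b = constant(3.0)
c = a * b + a            # builds the graph only

before = c.evaluated     # no computation has happened yet
total = c.value()        # forces evaluation of the whole graph
after = c.evaluated

print(before, total, after)
```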

In [9]:
result.dim()

((4, 1), 1)

In [10]:
result.value()

array([[ 0.76345509],
       [ 0.25610006],
       [-0.12155221],
       [ 0.24835899]])
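As a sanity check, the same computation can be reproduced in plain NumPy using the values printed above. This is a sketch that mirrors the shapes step by step, not DyNet code; the names `weights_np`, `biases_np`, `x`, and `result_np` are illustrative.

```python
import numpy as np

# Values copied from the printed outputs above.
weights_np = np.array([[0.53551078, 0.73564613, 0.48881745, -0.47133124],
                       [0.61683357, -0.10449088, -0.52180678, 0.52510762]])
biases_np = np.array([0.02503315, -0.00600196, 0.0618191, 0.05418576])
x = np.array([0.46854802, 0.79034192])

row = x.reshape(1, 2) @ weights_np             # (1, 2) @ (2, 4) -> (1, 4)
column = row.T                                 # (1, 4) -> (4, 1)
result_np = column + biases_np.reshape(4, 1)   # (4, 1) + (4, 1) -> (4, 1)

expected = np.array([[0.76345509],
                     [0.25610006],
                     [-0.12155221],
                     [0.24835899]])
matches = bool(np.allclose(result_np, expected, atol=1e-6))
print(result_np.shape, matches)
```

The bias is reshaped explicitly in this sketch because NumPy would otherwise broadcast a (4, 1) array plus a (4,) array into a (4, 4) result.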

# Lookup parameters
Now we will learn how to use `dy.LookupParameters`. Generally speaking, lookup parameters are a matrix whose rows you can select by indexing, in contrast to normal parameters, which require a call to `dy.parameter(.)`. They are also more efficient to use than normal parameters.

In [11]:
lookup_params = pc.add_lookup_parameters((3, 4))
lookup_params.as_array()

array([[-0.65459633,  0.19909722,  0.32690293,  0.7955175 ],
       [-0.3908107 ,  0.80685657,  0.44233733,  0.8011722 ],
       [-0.66098285,  0.55475658,  0.68738383, -0.63972837]])

Lookup parameters can be used to store word vectors or feature embeddings. In this case, we have 3 rows and 4 columns, meaning that we have 3 unique vectors of size 4.
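Conceptually, using a lookup table for word vectors amounts to mapping each word to a row index and selecting that row. The NumPy sketch below illustrates this with the values printed above; the vocabulary (`cat`, `dog`, `fish`) and the `embed` helper are hypothetical and exist only for this illustration.

```python
import numpy as np

# Values copied from the printed lookup table above.
table = np.array([[-0.65459633, 0.19909722, 0.32690293, 0.7955175],
                  [-0.3908107, 0.80685657, 0.44233733, 0.8011722],
                  [-0.66098285, 0.55475658, 0.68738383, -0.63972837]])

# Hypothetical vocabulary: each word owns one row of the table.
vocab = {"cat": 0, "dog": 1, "fish": 2}


def embed(word):
    """Return the length-4 vector stored in the row assigned to `word`."""
    return table[vocab[word]]


dog_vec = embed("dog")
print(dog_vec)
```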

Let's look up some vectors.

In [12]:
dy.renew_cg()
embedding_0 = lookup_params[0]
embedding_1 = lookup_params[1]
embedding_2 = lookup_params[2]

Let's verify the values.

In [13]:
embedding_0.value()

[-0.6545963287353516,
 0.19909721612930298,
 0.32690292596817017,
 0.7955175042152405]

In [14]:
embedding_1.value()

[-0.39081069827079773,
 0.8068565726280212,
 0.4423373341560364,
 0.8011721968650818]

In [15]:
embedding_2.value()

[-0.6609828472137451,
 0.5547565817832947,
 0.6873838305473328,
 -0.6397283673286438]

These values match the rows of `lookup_params` printed above. Now let's look at a few useful things you can do in DyNet. First, we can compute the mean of the three vectors.

In [16]:
embeddings_mean = dy.esum([embedding_0, embedding_1, embedding_2]) / 3.
embeddings_mean.value()

[-0.5687966346740723,
 0.5202368497848511,
 0.48554134368896484,
 0.3189871311187744]
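The mean can be cross-checked in NumPy from the lookup table values printed earlier. This sketch only mirrors the arithmetic (an element-wise sum divided by 3) and does not use DyNet; `table` and `mean_np` are illustrative names.

```python
import numpy as np

# Values copied from the printed lookup table above.
table = np.array([[-0.65459633, 0.19909722, 0.32690293, 0.7955175],
                  [-0.3908107, 0.80685657, 0.44233733, 0.8011722],
                  [-0.66098285, 0.55475658, 0.68738383, -0.63972837]])

# Element-wise sum of the three rows divided by 3, as in the cell above.
mean_np = table.sum(axis=0) / 3.0

# Mean printed by the cell above, rounded to 8 decimal places.
expected = np.array([-0.56879663, 0.52023685, 0.48554134, 0.31898713])
matches = bool(np.allclose(mean_np, expected, atol=1e-6))
print(matches)
```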