To globally install webppl-nn, run:

```
mkdir -p ~/.webppl
npm install --prefix ~/.webppl https://github.com/null-a/webppl-nn
```
Once installed, you can make the package available to `program.wppl` by running:

```
webppl program.wppl --require webppl-nn
```
This package is very experimental. Expect frequent breaking changes. It currently requires the development version of WebPPL (i.e. the tip of the `dev` branch).
In WebPPL we can represent "neural" networks as parameterized functions (built on adnn), typically from vectors to vectors. This package provides a number of helper functions that capture common patterns in the shapes of these functions. These helpers typically take a name and input/output dimensions as arguments.
```
var net = affine('net', {in: 3, out: 5});
var out = net(ones([3, 1])); // dims(out) == [5, 1]
```
Larger networks are built with function composition. The `stack` helper makes the common pattern of stacking "layers" more readable:

```
var mlp = stack([
  sigmoid,
  affine('layer2', {in: 5, out: 1}),
  tanh,
  affine('layer1', {in: 5, out: 5})
]);
```
By default, the parameters for such functions are created internally using the `param` method. An alternative method can be specified using the `param` argument. For example, the model parameter helpers can be used here:

```
var net1 = linear('net1', {in: 20, out: 10, param: modelParam});
var net2 = linear('net2', {in: 20, out: 10, param: modelParamL2(1)});
```
Note that parameters are created when a network constructor (`linear`, `affine`, etc.) is called. This is a change from earlier versions of webppl-nn, where parameter creation was delayed until the function representing the network was applied to an input. As a consequence, in typical usage, network constructors should now be called from within `Optimize`, rather than from outside of it. See the VAE example to see what this looks like in practice.
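As a hedged sketch of this pattern (the model body and the options passed to `Optimize` are illustrative, and the exact `Optimize` call signature depends on your WebPPL version):

```
var model = function() {
  // The constructor runs here, inside Optimize, so the network's
  // parameters are created during optimization.
  var net = affine('net', {in: 2, out: 1});
  var out = net(ones([2, 1]));
  // ... condition on data, etc.
  return out;
};
Optimize(model, {steps: 100});
```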
WebPPL parameters are primarily used to parameterize guide programs. In the model, the analog of a parameter is a prior guided by a delta distribution. This choice of guide gives a point estimate of the value of the random choice in the posterior when performing inference as optimization.
WebPPL includes a helper `modelParam`, which creates model parameters using an improper uniform distribution as the prior. Since it is not possible to sample from this improper distribution, `modelParam` can only be used with optimization based algorithms.
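For example (a minimal sketch; the parameter name and dimensions are illustrative):

```
// A 2x2 matrix-valued model parameter with an improper uniform prior.
// Usable only with optimization based inference (e.g. Optimize).
var w = modelParam({name: 'w', dims: [2, 2]});
```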
This package provides an additional helper, `modelParamL2`, which can be used to create model parameters that have a Gaussian prior. When performing inference as optimization, this prior acts as a regularizer. Since `modelParamL2` creates a Gaussian random choice, it can be used with all sampling based inference algorithms.
To allow the width of the prior to be specified, `modelParamL2` takes a single argument specifying the standard deviation of the Gaussian. It returns a function that takes an object in the same format as `param` and `modelParam`:

```
var w = modelParamL2(1)({name: 'w', dims: [2, 2]});
```
Note that in general, model parameters and parameters created with `param` are somewhat different in their behavior. For example, these two fragments of code are equivalent:

```
// 1.
var p = param({name: 'p'});
f(p);
g(p);

// 2.
f(param({name: 'p'}));
g(param({name: 'p'}));
```
However, if `param({name: 'p'})` is replaced with (for example) `modelParamL2(1)({name: 'p'})`, then they are not equivalent. The reason is that each call to the function returned by `modelParamL2(1)` adds a random choice to the model. In the common setting of optimizing the ELBO, each such random choice has the effect of extending the optimization objective with a weight decay term for its parameter. That is, additional calls to `modelParamL2(1)` (for a particular parameter) incur additional weight decay penalties.
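To make this concrete, a sketch (with `f` and `g` standing in for arbitrary functions of the parameter):

```
// Two random choices, hence two weight decay penalties for 'p':
f(modelParamL2(1)({name: 'p'}));
g(modelParamL2(1)({name: 'p'}));

// One random choice, hence one weight decay penalty:
var p = modelParamL2(1)({name: 'p'});
f(p);
g(p);
```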
These return a parameterized function of a single argument that maps a vector of length `in` to a vector of length `out`.
The `init` argument can be used to specify the initialization of the weight matrix. It accepts a function that takes the shape of the matrix as its argument and returns a matrix of that shape. The default is `xavier`.
Example usage:

```
var idMatrixInit = function(dims) {
  return idMatrix(dims[0]);
};
linear('l', {in: 10, out: 10, init: idMatrixInit});
```
See `bias` for details of the `initb` argument.
Returns a parameterized function of a single argument that maps vectors of length `out` to vectors of length `out`.
The `initb` argument specifies the value with which each element of the bias vector is initialized. The default is `0`.
Example usage:

```
bias('b', {out: 10, initb: -1});
```
These return a parameterized function of two arguments that maps a state vector of length `hdim` and an input vector of length `xdim` to a new state vector.
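Assuming one of these constructors is named `gru` (the heading naming the constructors did not survive extraction), usage might look like:

```
var net = gru('net', {hdim: 5, xdim: 3});
var state0 = zeros([5, 1]);
var state1 = net(state0, ones([3, 1])); // a new state vector of length 5
```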
Leaky rectified linear unit.
Maps vectors of length `n` to probability vectors of length `n + 1`. In contrast to the `softmax` function, a network with `squishToProbSimplex` at the output and no regularization is not over-parameterized. However, with regularization, a network with `softmax` at the output will not be over-parameterized either.
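For example (a sketch; the exact output values are not shown):

```
// Maps a vector of length 2 to a probability vector of length 3.
var v = squishToProbSimplex(zeros([2, 1]));
// v has length 3 and its entries sum to one
```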
Returns a function that creates model parameters with a `Gaussian({mu: 0, sigma: sd})` prior. The returned function has the same interface as `param` and `modelParam`.
Returns the composition of the array of functions `fns`. The functions in `fns` are applied in right to left order.
Returns the `n` by `n` identity matrix.
Returns a vector with length `length` in which all entries are zero except for the entry at `index`, which is one.
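Assuming this helper is named `oneHot` and takes the index first and the length second (the heading naming it did not survive extraction), a sketch:

```
oneHot(1, 3); // a length-3 vector: zero everywhere except at index 1
```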
Returns the vector obtained by concatenating the elements of `arr`. (`arr` is assumed to be an array of vectors.)

```
concat([ones([2, 1]), zeros([2, 1])]); // => Vector([1, 1, 0, 0])
```
Implements a variant of the parameter initialization scheme described in "Understanding the difficulty of training deep feedforward neural networks". This is the default initialization scheme for matrix-valued network parameters.
MIT