webppl-nn

Installation

To globally install webppl-nn, run:

mkdir -p ~/.webppl
npm install --prefix ~/.webppl https://github.com/null-a/webppl-nn

Usage

Once installed, you can make the package available to program.wppl by running:

webppl program.wppl --require webppl-nn

Status

This package is very experimental. Expect frequent breaking changes.

Compatibility

This package currently requires the development version of WebPPL. (i.e. The tip of the dev branch.)

Introduction

Neural Nets

In WebPPL we can represent "neural" networks as parameterized functions, typically from vectors to vectors. (By building on adnn.) This package provides a number of helper functions that capture common patterns in the shape of these functions. These helpers typically take a name and input/output dimensions as arguments.

var net = affine('net', {in: 3, out: 5});
var out = net(ones([3, 1])); // dims(out) == [5, 1]

Larger networks are built with function composition. The stack helper makes the common pattern of stacking "layers" more readable.

var mlp = stack([
  sigmoid,
  affine('layer2', {in: 5, out: 1}),
  tanh,
  affine('layer1', {in: 5, out: 5})
]);

By default, the parameters for such functions are created internally using the param method. An alternative method can be specified using the param argument. For example, the model parameter helpers can be used here:

var net1 = linear('net1', {in: 20, out: 10, param: modelParam});
var net2 = linear('net2', {in: 20, out: 10, param: modelParamL2(1)});

Note that parameters are created when a network constructor (linear, affine, etc.) is called. This is a change from earlier versions of webppl-nn, where parameter creation was delayed until the function representing the network was applied to an input.

As a consequence, in typical usage, network constructors should now be called from within Optimize, rather than from outside of Optimize. See the VAE example to see what this looks like in practice.

Model Parameters

WebPPL parameters are primarily used to parameterize guide programs. In the model, the analog of a parameter is a prior guided by a delta distribution. This choice of guide gives a point estimate of the value of the random choice in the posterior when performing inference as optimization.

WebPPL includes a helper modelParam which creates model parameters using an improper uniform distribution as the prior. Since it is not possible to sample from this improper distribution modelParam can only be used with optimization based algorithms.

This package provides an additional helper modelParamL2 which can be used to create model parameters that have a Gaussian prior. When performing inference as optimization this prior acts as a regularizer. Since modelParamL2 creates a Gaussian random choice, it can be used with all sampling based inference algorithms.

To allow the width of the prior to be specified, modelParamL2 takes a single argument specifying the standard deviation of the Gaussian. This returns a function that takes an object in the same format as param and modelParam.

var w = modelParamL2(1)({name: 'w', dims: [2, 2]});

Note that in general, model parameters and parameters created with param are somewhat different in their behavior. For example, these two fragments of code are equivalent:

// 1.
var p = param({name: 'p'});
f(p);
g(p);

// 2.
f(param({name: 'p'}))
g(param({name: 'p'}))

However, if param({name: 'p'}) is replaced with modelParamL2(1)({name: 'p'}) for example, then they are not equivalent. The reason is that each call to the function returned by modelParamL2(1) (for example) adds a random choice to the model. In the common setting of optimizing the ELBO for example, each such random choice has the effect of extending the optimization objective with a weight decay term for its parameter. i.e. Additional calls to modelParamL2(1) (for a particular parameter) incur additional weight decay penalties.

Examples

Variational Auto-encoder

Reference

Networks

`linear(name, {in, out[, param, init]})`

`affine(name, {in, out[, param, init, initb]})`

These return a parameterized function of a single argument that maps a vector of length in to a vector of length out.

The init argument can be used to specify the initialization of the weight matrix. It accepts a function that takes the shape of the matrix as its argument and returns a matrix of that shape. The default is xavier.

Example usage:

var idMatrixInit = function(dims) {
  return idMatrix(dims[0]);
};
linear('l', {in: 10, out: 10, init: idMatrixInit});

See bias for details of the initb argument.

`bias(name, {out[, param, initb]})`

Returns a parameterized function of a single argument that maps vectors of length out to vectors of length out.

The initb argument specifies the value with which each element of the bias vector is initialized. The default is 0.

Example usage:

bias('b', {out: 10, initb: -1});

`rnn(name, {hdim, xdim, [, param, ctor, output]})`

`gru(name, {hdim, xdim, [, param, ctor]})`

`lstm(name, {hdim, xdim, [, param]})`

These return a parameterized function of two arguments that maps a state vector of length hdim and an input vector of length xdim to a new state vector.

Non-linearities

`sigmoid(x)`

`tanh(x)`

`relu(x)`

`lrelu(x)`

Leaky rectified linear unit.

`softplus(x)`

`softmax(x)`

`squishToProbSimplex(x)`

Maps vectors of length n to probability vectors of length n + 1.

In contrast to the softmax function, a network with squishToProbSimplex at the output and no regularization is not over parameterized. However, with regularization, a network with softmax at the output will not be over parameterized either.

Model Parameters

`modelParamL2(sd)`

Returns a function that creates model parameters with a Gaussian({mu: 0, sigma: sd}) prior. The returned function has the same interface as param and modelParam.

Utilities

`stack(fns)`

Returns the composition of the array of functions fns. The functions in fns are applied in right to left order.

`idMatrix(n)`

Returns the n by n identity matrix.

`oneHot(index, length)`

Returns a vector with length length in which all entries are zero except for the entry at index which is one.

`concat(arr)`

Returns the vector obtained by concatenating the elements of arr. (arr is assumed to be an array of vectors.)

concat([ones([2, 1]), zeros([2, 1])]); // => Vector([1, 1, 0, 0])

`xavier(dims)`

Implements a variant of the parameter initialization scheme described in Understanding the difficulty of training deep feedforward neural networks.

This is the default initialization scheme for matrix valued network parameters.

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

webppl-nn

Installation

Usage

Status

Compatibility

Introduction

Neural Nets

Model Parameters

Examples

Reference

Networks

`linear(name, {in, out[, param, init]})`

`affine(name, {in, out[, param, init, initb]})`

`bias(name, {out[, param, initb]})`

`rnn(name, {hdim, xdim, [, param, ctor, output]})`

`gru(name, {hdim, xdim, [, param, ctor]})`

`lstm(name, {hdim, xdim, [, param]})`

Non-linearities

`sigmoid(x)`

`tanh(x)`

`relu(x)`

`lrelu(x)`

`softplus(x)`

`softmax(x)`

`squishToProbSimplex(x)`

Model Parameters

`modelParamL2(sd)`

Utilities

`stack(fns)`

`idMatrix(n)`

`oneHot(index, length)`

`concat(arr)`

`xavier(dims)`

License

Files

README.md

Latest commit

History

README.md

File metadata and controls

webppl-nn

Installation

Usage

Status

Compatibility

Introduction

Neural Nets

Model Parameters

Examples

Reference

Networks

linear(name, {in, out[, param, init]})

affine(name, {in, out[, param, init, initb]})

bias(name, {out[, param, initb]})

rnn(name, {hdim, xdim, [, param, ctor, output]})

gru(name, {hdim, xdim, [, param, ctor]})

lstm(name, {hdim, xdim, [, param]})

Non-linearities

sigmoid(x)

tanh(x)

relu(x)

lrelu(x)

softplus(x)

softmax(x)

squishToProbSimplex(x)

Model Parameters

modelParamL2(sd)

Utilities

stack(fns)

idMatrix(n)

oneHot(index, length)

concat(arr)

xavier(dims)

License

`linear(name, {in, out[, param, init]})`

`affine(name, {in, out[, param, init, initb]})`

`bias(name, {out[, param, initb]})`

`rnn(name, {hdim, xdim, [, param, ctor, output]})`

`gru(name, {hdim, xdim, [, param, ctor]})`

`lstm(name, {hdim, xdim, [, param]})`

`sigmoid(x)`

`tanh(x)`

`relu(x)`

`lrelu(x)`

`softplus(x)`

`softmax(x)`

`squishToProbSimplex(x)`

`modelParamL2(sd)`

`stack(fns)`

`idMatrix(n)`

`oneHot(index, length)`

`concat(arr)`

`xavier(dims)`