Merge pull request #778 from probmods/async-docs

Add documentation for parallel async parameter updates
probmods · Feb 19, 2017 · 750a199
2 parents e76e04f + 7d2610c
commit 750a199
Showing 6 changed files with 253 additions and 208 deletions.
diff --git a/docs/index.rst b/docs/index.rst
@@ -21,7 +21,7 @@ WebPPL Documentation
    sample
    distributions
    inference/index
-   optimization
+   optimization/index
    functions/index
    globalstore
    packages

diff --git a/docs/optimization.rst b/docs/optimization.rst
diff --git a/docs/optimization/async.rst b/docs/optimization/async.rst
@@ -0,0 +1,34 @@
+.. _async:
+
+Parallelization
+===============
+
+Sharing parameters across processes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By default, parameters are stored in memory and don't persist across executions.
+
+As an alternative, WebPPL supports sharing parameters between WebPPL processes using MongoDB. This can be used to persist parameters across runs, speed up optimization by running multiple identical processes in parallel, and optimize multiple objectives simultaneously.
+
+To use the MongoDB store, select it at startup time as follows::
+
+   webppl model.wppl --param-store mongo
+
+Parameters are associated with a *parameter set id*, and sharing only takes place between executions that use the same id. To control sharing, specify a particular id with the ``--param-id`` command-line argument::
+
+   webppl model.wppl --param-store mongo --param-id my-parameter-set
+
+The MongoDB store requires a running MongoDB instance. By default, WebPPL looks for MongoDB at ``localhost:27017`` and uses the collection ``parameters``. Both can be changed by setting the environment variables ``WEBPPL_MONGO_URL`` and ``WEBPPL_MONGO_COLLECTION``.
+
+Running multiple identical processes in parallel
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To simplify launching multiple identical processes with shared parameters, WebPPL provides a ``parallelRun`` script in the ``scripts`` folder. For example, to run ten processes that all execute ``model.wppl`` with parameter set id ``my-parameter-set``, run::
+
+   scripts/parallelRun model.wppl 10 my-parameter-set
+
+Any extra arguments are passed on to WebPPL, so this works::
+
+   scripts/parallelRun model.wppl 10 my-parameter-set --require webppl-json
+
+For a few initial results on the use of parallel parameter updates for LDA, see `this presentation <https://gist.github.com/stuhlmueller/8ab174bfa441e797a5d1c65e5ce5dcc5>`_.
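Note on ``docs/optimization/async.rst``: the launch pattern described in the "Running multiple identical processes in parallel" section can be sketched as below. This is a minimal sketch, not the actual ``scripts/parallelRun``. ``buildLaunchPlan`` is a hypothetical helper, and the sketch assumes each process is started with the ``--param-store mongo`` and ``--param-id`` flags documented in the same file.

```javascript
// Hypothetical helper for illustration only; this is NOT scripts/parallelRun.
// It builds the argument list for each of `n` identical WebPPL processes
// that share one parameter set through the MongoDB store.
function buildLaunchPlan(modelFile, n, paramId, extraArgs) {
  var plan = [];
  for (var i = 0; i < n; i++) {
    plan.push(
      // Assumption: sharing is enabled with the documented flags.
      [modelFile, '--param-store', 'mongo', '--param-id', paramId]
        .concat(extraArgs || []) // extra arguments are passed through unchanged
    );
  }
  return plan;
}

var plan = buildLaunchPlan('model.wppl', 10, 'my-parameter-set',
                           ['--require', 'webppl-json']);
console.log(plan.length);
console.log('webppl ' + plan[0].join(' '));
```

Each entry could then be handed to Node's ``child_process.spawn('webppl', args)``. Because every process uses the same parameter set id, they would all read and update the same stored parameters.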
diff --git a/docs/optimization/index.rst b/docs/optimization/index.rst
@@ -0,0 +1,46 @@
+.. _optimization:
+
+Optimization
+============
+
+Optimization provides an alternative approach to :ref:`marginal
+inference <inference>`.
+
+In this section we refer to the program for which we would like to
+obtain the marginal distribution as the *target program*.
+
+If we take a target program and add a :ref:`guide distribution
+<guides>` to each random choice, then we can define the *guide
+program* as the program obtained by sampling from the guide
+distribution at each ``sample`` statement and ignoring all ``factor``
+statements.
+
+If we endow this guide program with adjustable parameters, then we can
+optimize those parameters so as to minimize the distance between the
+joint distribution of the choices in the guide program and those in
+the target. For example::
+
+   Optimize({
+     steps: 10000,
+     model: function() {
+       var x = sample(Gaussian({ mu: 0, sigma: 1 }), {
+         guide: function() {
+           return Gaussian({ mu: param(), sigma: 1 });
+         }});
+       factor(-(x-2)*(x-2));
+       return x;
+     }});
+
+This general approach includes a number of well-known algorithms as
+special cases.
+
+It is supported in WebPPL by :ref:`a method for performing
+optimization <optimize>`, primitives for specifying :ref:`parameters
+<parameters>`, and the ability to specify guides.
+
+.. toctree::
+   :maxdepth: 2
+
+   optimize
+   parameters
+   async
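Note on the ``Optimize`` example in ``docs/optimization/index.rst``: the following plain JavaScript sketch shows what the optimization of the single guide parameter converges to. It assumes the "distance" being minimized is the KL divergence from guide to target (equivalently, that the ELBO is maximized), and it swaps in an exact, hand-derived gradient with plain gradient ascent in place of whatever stochastic estimator and update rule ``Optimize`` actually uses. The names ``elboGradient`` and ``fitGuideMean``, the step size, and the step count are illustrative choices, not part of WebPPL.

```javascript
// Target:  x ~ Gaussian(0, 1), followed by factor(-(x - 2)^2)
// Guide:   x ~ Gaussian(mu, 1), with mu the only adjustable parameter
//
// Under the guide, E[x^2] = mu^2 + 1 and E[(x - 2)^2] = (mu - 2)^2 + 1, so
//   E_q[log p(x)] = -(mu^2 + 1) / 2 - ((mu - 2)^2 + 1) + const.
// The guide's entropy does not depend on mu, hence
//   d ELBO / d mu = -mu - 2 * (mu - 2) = 4 - 3 * mu.
function elboGradient(mu) {
  return 4 - 3 * mu;
}

// Plain gradient ascent on the ELBO (an assumption made for this sketch).
function fitGuideMean(initialMu, stepSize, steps) {
  var mu = initialMu;
  for (var i = 0; i < steps; i++) {
    mu += stepSize * elboGradient(mu);
  }
  return mu;
}

var fittedMu = fitGuideMean(0, 0.1, 200);
console.log(fittedMu); // close to 4/3, where the gradient vanishes
```

The fixed point ``mu = 4/3`` is also the mean of the exact posterior, which is proportional to ``exp(-x^2/2 - (x-2)^2)``. The guide cannot match the posterior's variance of 1/3 because its ``sigma`` is fixed at 1.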