Merge branch 'GPflow:develop' into cp_switchdim

GPflow · Jul 1, 2021 · 62eeadd
2 parents e257a1c + 268e315
commit 62eeadd

Showing 70 changed files with 2,411 additions and 401 deletions.
diff --git a/.circleci/config.yml b/.circleci/config.yml
@@ -40,7 +40,7 @@ jobs:
       - checkout
       - run:
           name: Install type checker
-          command: pip install mypy
+          command: pip install mypy types-pkg_resources
       - run:
           name: Run type checker
           command: mypy gpflow tests
@@ -68,7 +68,7 @@ jobs:
       - run:
           name: Install black and isort
           command: |
-            pip install black==19.10b0 isort
+            pip install black==20.8b1 isort
       - run:
           name: Run format check
           command: |

diff --git a/HOWTO_RELEASE.md b/HOWTO_RELEASE.md
@@ -4,7 +4,7 @@
    - They should cover all (non-GitHub-related) commits (PRs) on the `develop` branch since the most recent release.
    - They should make clear to users whether they might benefit from this release and what backwards incompatibilities they might face.
 
-2. Bump the version numbers in the `develop` branch, in the VERSION file **and** in doc/source/conf.py ([example PR: #1395](https://github.com/GPflow/GPflow/pull/1395)).
+2. Bump the version numbers in the `develop` branch, in the VERSION file **and** in doc/source/conf.py ([example PR: #1666](https://github.com/GPflow/GPflow/pull/1666)).
    Copy the RELEASE.md template for the following release-in-progress.
 
 3. Create a release PR from `develop` to `master`.
@@ -18,18 +18,8 @@
 5. You are almost done now! Go to https://circleci.com and monitor that tests for your newly-created tag passed and the job for pushing the pip package succeeded. CircleCI matches on the “v{VERSION}” tag to kick-start the release process.
    - [example CI workflow: 2434](https://app.circleci.com/pipelines/github/GPflow/GPflow/2434/workflows/f1274aa7-18c6-45a3-8d59-cab573305b64)
 
-6. Take a break: before you continue, wait until the new release [shows up on PyPi](https://pypi.org/project/gpflow/#history).
+6. Take a break; wait until the new release [shows up on PyPi](https://pypi.org/project/gpflow/#history).
 
-Context: Circleci of our doc-repo “GPflow/docs.git” is automatically triggered to re-build the ReadTheDocs documentation when pushed to gpflow/develop branch. This is not the case when pushing to gpflow/master. 
-Therefore, after pushing the release to PyPi, we need to trigger the circleci build for docs/master. To do this, go to doc-repo “GPflow/docs.git” master branch and update in .circleci/config.yml (line 14) the VERSION number. Commit and push this change to the master branch, and the docs will build. 
-
-```bash
-git clone -b master git@github.com:GPflow/docs.git
-cd docs
-open .circleci/config.yml
-```
-
-Update the version number of `pip install -q gpflow=={x.x.x}` in line 14. Commit change and push to master.
 
 Done done! Go and celebrate our hard work :)
 
diff --git a/Makefile b/Makefile
@@ -2,6 +2,7 @@ BLACK_CONFIG=-t py36 -l 100
 BLACK_TARGETS=gpflow tests doc setup.py
 ISORT_CONFIG=--atomic -l 100 --trailing-comma --remove-redundant-aliases --multi-line 3
 ISORT_TARGETS=gpflow tests setup.py
+MYPY_TARGETS=gpflow tests setup.py
 
 .PHONY: help clean dev-install install package format format-check type-check test check-all
 
@@ -38,7 +39,7 @@ format-check:
 	isort --check-only $(ISORT_CONFIG) $(ISORT_TARGETS)
 
 type-check:
-	mypy gpflow tests
+	mypy $(MYPY_TARGETS)
 
 test:
 	pytest -n auto --dist loadfile -v --durations=10 tests/

diff --git a/RELEASE.md b/RELEASE.md
@@ -19,7 +19,7 @@ Release notes for all past releases are available in the ['Releases' section](ht
 
 * <INSERT MAJOR FEATURE HERE, USING MARKDOWN SYNTAX>
 * <IF RELEASE CONTAINS MULTIPLE FEATURES FROM SAME AREA, GROUP THEM TOGETHER>
-  
+
 ## Bug Fixes and Other Changes
 
 * <SIMILAR TO ABOVE SECTION, BUT FOR OTHER IMPORTANT CHANGES / BUG FIXES>
@@ -33,15 +33,10 @@ This release contains contributions from:
 <INSERT>, <NAME>, <HERE>, <USING>, <GITHUB>, <HANDLE>
 
 
-# Release 2.2.0 (next upcoming release)
+# Release 2.2.2 (next upcoming release in progress)
 
 <INSERT SMALL BLURB ABOUT RELEASE FOCUS AREA AND POTENTIAL TOOLCHAIN CHANGES>
 
-## Breaking Changes
-
-* <DOCUMENT BREAKING CHANGES HERE>
-* <THIS SECTION SHOULD CONTAIN API AND BEHAVIORAL BREAKING CHANGES>
-
 ## Known Caveats
 
 * <CAVEATS REGARDING THE RELEASE (BUT NOT BREAKING CHANGES).>
@@ -50,20 +45,70 @@ This release contains contributions from:
 
 ## Major Features and Improvements
 
-* <INSERT MAJOR FEATURE HERE, USING MARKDOWN SYNTAX>
-* <IF RELEASE CONTAINS MULTIPLE FEATURES FROM SAME AREA, GROUP THEM TOGETHER>
-
+* Refactor posterior base class to support other model types.
+
 ## Bug Fixes and Other Changes
 
-* <SIMILAR TO ABOVE SECTION, BUT FOR OTHER IMPORTANT CHANGES / BUG FIXES>
-* <IF A CHANGE CLOSES A GITHUB ISSUE, IT SHOULD BE DOCUMENTED HERE>
-* <NOTES SHOULD BE GROUPED PER AREA>
+* Fix unit test failure when using TensorFlow 2.5.0 (#1684)
+* Upgrade black formatter to version 20.8b1 (#1694)
+* Remove erroneous DeprecationWarnings (#1693)
 
 ## Thanks to our Contributors
 
 This release contains contributions from:
 
-<INSERT>, <NAME>, <HERE>, <USING>, <GITHUB>, <HANDLE>
+johnamcleod, st--
+
+
+# Release 2.2.1
+
+Bugfix for creating the new posterior objects with `PrecomputeCacheType.VARIABLE`.
+
+
+# Release 2.2.0
+
+The main focus of this release is the new "Posterior" object introduced by
+PR #1636, which allows for a significant speed-up of post-training predictions
+with the `SVGP` model (partially resolving #1599).
+
+* For end-users, by default nothing changes; see Breaking Changes below if you
+  have written your own _implementations_ of `gpflow.conditionals.conditional`.
+* After training an `SVGP` model, you can call `model.posterior()` to obtain a
+  Posterior object that precomputes all quantities not depending on the test
+  inputs (e.g. Cholesky of Kuu), and provides a `posterior.predict_f()` method
+  that reuses these cached quantities. `model.predict_f()` computes exactly the
+  same quantities as before and does **not** give any speed-up.
+* `gpflow.conditionals.conditional()` forwards to the same "fused" code-path as
+  before.
+
+## Breaking Changes
+
+* `gpflow.conditionals.conditional.register` is deprecated and should not be
+  called outside of the GPflow core code.  If you have written your own
+  implementations of `gpflow.conditionals.conditional()`, you have two options
+  to use your code with GPflow 2.2:
+  1. Temporary work-around: Instead of `gpflow.models.SVGP`, use the
+     backwards-compatible `gpflow.models.svgp.SVGP_deprecated`.
+  2. Convert your conditional() implementation into a subclass of
+     `gpflow.posteriors.AbstractPosterior`, and register
+     `get_posterior_class()` instead (see the "Variational Fourier Features"
+     notebook for an example).
+
+## Known Caveats
+
+* The Posterior object is currently only available for the `SVGP` model. We
+  would like to extend this to the other models such as `GPR`, `SGPR`, or `VGP`, but
+  this effort is beyond what we can currently provide. If you would be willing
+  to contribute to those efforts, please get in touch!
+* The Posterior object does not currently provide the `GPModel` convenience
+  functions such as `predict_f_samples`, `predict_y`, `predict_log_density`.
+  Again, if you're willing to contribute, get in touch!
+
+## Thanks to our Contributors
+
+This release contains contributions from:
+
+stefanosele, johnamcleod, st--
 
 
 # Release 2.1.5
@@ -80,7 +125,7 @@ This release contains contributions from:
 
 * Improves compatibility between monitoring API and Scipy optimizer (#1642).
 * Adds `_add_noise_cov` method to GPR model class to make it more easily extensible (#1645).
-  
+
 ## Bug Fixes
 
 * Fixes a bug in ModelToTensorBoard when `max_size=-1` (#1619)
@@ -101,4 +146,3 @@ This release contains contributions from:
 This release contains contributions from:
 
 johnamcleod, st--, vatsalaggarwal, sam-willis, vdutor
-
diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-2.1.5
+2.2.1
diff --git a/doc/source/conf.py b/doc/source/conf.py
@@ -92,9 +92,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = "2.1"
+version = "2.2"
 # The full version, including alpha/beta/rc tags.
-release = "2.1.5"
+release = "2.2.1"
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

diff --git a/doc/source/generate_module_rst.py b/doc/source/generate_module_rst.py
@@ -101,14 +101,12 @@ def set_global_path(path):
 
 
 def is_documentable_module(m: Any) -> bool:
-    """Return `True` if m is module to be documented automatically, `False` otherwise.
-    """
+    """Return `True` if m is module to be documented automatically, `False` otherwise."""
     return inspect.ismodule(m) and "gpflow" in m.__name__ and m.__name__ not in IGNORE_MODULES
 
 
 def is_documentable_component(m: Any) -> bool:
-    """Return `True` if a function or class to be documented automatically, `False` otherwise.
-    """
+    """Return `True` if a function or class to be documented automatically, `False` otherwise."""
     if inspect.isfunction(m):
         return "gpflow" in m.__module__ and m.__module__ not in IGNORE_MODULES
     elif inspect.isclass(m):
@@ -120,8 +118,7 @@ def is_documentable_component(m: Any) -> bool:
 
 
 def is_documentable(m: Any) -> bool:
-    """Return `True` if a function, class, or module to be documented automatically, else `False`.
-    """
+    """Return `True` if a function, class, or module to be documented automatically, else `False`."""
     return is_documentable_component(m) or is_documentable_module(m)
 
 
@@ -189,8 +186,7 @@ def get_module_rst_string(module: ModuleType, level: int) -> str:
 
 
 def get_public_attributes(node: Any) -> Any:
-    """Get the public attributes ('children') of the current node, accessible from this node.
-    """
+    """Get the public attributes ('children') of the current node, accessible from this node."""
     return [getattr(node, a) for a in dir(node) if not a.startswith("_")]
 
 
@@ -206,7 +202,10 @@ def write_to_rst_file(node_name: str, rst_content: List[str]) -> None:
 
     level_underline = RST_LEVEL_SYMBOLS[0] * len(node_name)
     rst_file = SPHINX_FILE_STRING.format(
-        title=node_name, content="".join(rst_content), date=DATE_STRING, headerline=level_underline,
+        title=node_name,
+        content="".join(rst_content),
+        date=DATE_STRING,
+        headerline=level_underline,
     )
 
     path_to_file = path + "/index.rst"

diff --git a/doc/source/notebooks/advanced/coregionalisation.pct.py b/doc/source/notebooks/advanced/coregionalisation.pct.py
@@ -121,7 +121,10 @@
 # fit the covariance function parameters
 maxiter = ci_niter(10000)
 gpflow.optimizers.Scipy().minimize(
-    m.training_loss, m.trainable_variables, options=dict(maxiter=maxiter), method="L-BFGS-B",
+    m.training_loss,
+    m.trainable_variables,
+    options=dict(maxiter=maxiter),
+    method="L-BFGS-B",
 )
 
 

diff --git a/doc/source/notebooks/advanced/fast_predictions.pct.py b/doc/source/notebooks/advanced/fast_predictions.pct.py
@@ -0,0 +1,108 @@
+# ---
+# jupyter:
+#   jupytext:
+#     text_representation:
+#       extension: .py
+#       format_name: light
+#       format_version: '1.5'
+#       jupytext_version: 1.11.2
+#   kernelspec:
+#     display_name: Python 3
+#     language: python
+#     name: python3
+# ---
+
+# + [markdown] id="eYrSpUncKGSk"
+# # Faster predictions by caching
+
+# + [markdown] id="PLuPjfS7KLQ-"
+# The default behaviour of `predict_f` in GPflow models is to compute the predictions from scratch on each call. This is convenient when predicting and training are interleaved, and simplifies the use of these models. There are some use cases, such as Bayesian optimisation, where prediction (at different test points) happens much more frequently than training. In these cases it is convenient to cache parts of the calculation which do not depend upon the test points, and reuse those parts between predictions.
+#
+# There are three models to which we want to add this caching capability: GPR, (S)VGP and SGPR. The VGP and SVGP can be considered together; the difference between the models is whether to condition on the full training data set (VGP) or on the inducing variables (SVGP).
+
+# + [markdown] id="EACkO-iRKM5T"
+# ## Posterior predictive distribution
+#
+# The posterior predictive distribution evaluated at a set of test points $\mathbf{x}_*$ for a Gaussian process model is given by:
+# \begin{equation*}
+# p(\mathbf{f}_*|X, Y) = \mathcal{N}(\mu, \Sigma)
+# \end{equation*}
+#
+# In the case of the GPR model, the parameters $\mu$ and $\Sigma$ are given by:
+# \begin{equation*}
+# \mu = K_{nm}[K_{mm} + \sigma^2I]^{-1}\mathbf{y}
+# \end{equation*}
+# and
+# \begin{equation*}
+# \Sigma = K_{nn} - K_{nm}[K_{mm} + \sigma^2I]^{-1}K_{mn}
+# \end{equation*}
+#
+# The posterior predictive distribution for the VGP and SVGP model is parameterised as follows:
+# \begin{equation*}
+# \mu = K_{nu}K_{uu}^{-1}\mathbf{u}
+# \end{equation*}
+# and
+# \begin{equation*}
+# \Sigma = K_{nn} - K_{nu}K_{uu}^{-1}K_{un}
+# \end{equation*}
+#
+# Finally, the parameters for the SGPR model are:
+# \begin{equation*}
+# \mu = K_{nu}L^{-T}L_B^{-T}\mathbf{c}
+# \end{equation*}
+# and
+# \begin{equation*}
+# \Sigma = K_{nn} - K_{nu}L^{-T}(I - B^{-1})L^{-1}K_{un}
+# \end{equation*}
+#
+# Where the mean function is not the zero function, the predictive mean should have the mean function evaluated at the test points added to it.
+
+# + [markdown] id="GX1U-fYPKPrt"
+# ## What can be cached?
+#
+# We cache two separate values: $\alpha$ and $Q^{-1}$. These correspond to the parts of the mean and covariance functions respectively which do not depend upon the test points. Specifically, in the case of the GPR model these are:
+# \begin{equation*}
+#     \alpha = [K_{mm} + \sigma^2I]^{-1}\mathbf{y}\\ Q^{-1} = [K_{mm} + \sigma^2I]^{-1}
+# \end{equation*}
+# in the case of the VGP and SVGP model these are:
+# \begin{equation*}
+#     \alpha = K_{uu}^{-1}\mathbf{u}\\ Q^{-1} = K_{uu}^{-1}
+# \end{equation*}
+# and in the case of the SGPR model these are:
+# \begin{equation*}
+#     \alpha = L^{-T}L_B^{-T}\mathbf{c}\\ Q^{-1} = L^{-T}(I - B^{-1})L^{-1}
+# \end{equation*}
+#
+#
+# Note that in the (S)VGP case, $\alpha$ is the parameter as proposed by Opper and Archambeau for the mean of the predictive distribution.
+
+# + [markdown] id="FzCgor4nKUcW"
+# ## Example
+#
+# We will construct an SVGP model to demonstrate the faster predictions from using the cached data in the GPflow posterior classes (subclasses of `gpflow.posteriors.AbstractPosterior`).
+
+# + id="BMnIdXNiKU6t"
+import gpflow
+import numpy as np
+
+
+model = gpflow.models.SVGP(
+    gpflow.kernels.SquaredExponential(),
+    gpflow.likelihoods.Gaussian(),
+    np.linspace(-1.1, 1.1, 1000)[:, None],
+)
+
+Xnew = np.linspace(-1.1, 1.1, 1000)[:, None]
+# -
+
+# The `predict_f` method on the `GPModel` class performs no caching.
+
+# %%timeit
+model.predict_f(Xnew)
+
+# To make use of the caching, first retrieve the posterior object from the model. The posterior object has methods to predict the parameters of marginal distributions at test points, in the same way as the `predict_f` method of the `GPModel`.
+
+posterior = model.posterior()
+
+# %%timeit
+posterior.predict_f(Xnew)
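
The caching idea in the notebook above can be illustrated without GPflow. The following is a minimal NumPy sketch for the GPR case only, following the notebook's notation ($m$ for training points, $n$ for test points). The class `CachedGPRPosterior`, the helper functions, the kernel hyperparameters, and the toy data are all hypothetical names and values chosen for illustration; this is not GPflow's implementation, and it uses an explicit matrix inverse for brevity where a Cholesky factor would normally be preferred.

```python
import numpy as np


def squared_exponential(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq_dist = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


class CachedGPRPosterior:
    """Hypothetical helper: precompute alpha and Q^{-1} once, reuse them per call."""

    def __init__(self, X, y, noise_variance=0.1):
        self.X = X
        Kmm = squared_exponential(X, X)
        # Q^{-1} = [K_mm + sigma^2 I]^{-1}; explicit inverse keeps the sketch short.
        self.Qinv = np.linalg.inv(Kmm + noise_variance * np.eye(X.shape[0]))
        # alpha = [K_mm + sigma^2 I]^{-1} y
        self.alpha = self.Qinv @ y

    def predict_f(self, Xnew):
        # Only the terms that depend on the test points are computed here.
        Knm = squared_exponential(Xnew, self.X)
        Knn = squared_exponential(Xnew, Xnew)
        mean = Knm @ self.alpha
        cov = Knn - Knm @ self.Qinv @ Knm.T
        return mean, cov


def predict_from_scratch(X, y, Xnew, noise_variance=0.1):
    """Reference prediction that redoes the training-data solve on every call."""
    Kmm = squared_exponential(X, X) + noise_variance * np.eye(X.shape[0])
    Knm = squared_exponential(Xnew, X)
    Knn = squared_exponential(Xnew, Xnew)
    mean = Knm @ np.linalg.solve(Kmm, y)
    cov = Knn - Knm @ np.linalg.solve(Kmm, Knm.T)
    return mean, cov


# Toy data (assumed for illustration only).
rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 50)[:, None]
y = np.sin(3.0 * X) + 0.1 * rng.standard_normal(X.shape)
Xnew = np.linspace(-1.1, 1.1, 20)[:, None]

posterior = CachedGPRPosterior(X, y)
mean_cached, cov_cached = posterior.predict_f(Xnew)
mean_scratch, cov_scratch = predict_from_scratch(X, y, Xnew)
```

Both paths should agree up to floating-point error. The cached path only pays for the test-point kernel evaluations and matrix products on each call, which is the saving the notebook describes.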
diff --git a/doc/source/notebooks/advanced/gps_for_big_data.pct.py b/doc/source/notebooks/advanced/gps_for_big_data.pct.py
@@ -191,7 +191,7 @@ def plot(title=""):
 def run_adam(model, iterations):
     """
     Utility function running the Adam optimizer
-    
+
     :param model: GPflow model
     :param iterations: number of iterations
     """

diff --git a/doc/source/notebooks/advanced/mcmc.pct.py b/doc/source/notebooks/advanced/mcmc.pct.py
@@ -294,7 +294,7 @@ def plot_joint_marginals(samples, parameters, y_axis_label):
 K = kernel.K(X) + np.eye(N) * 1e-6
 
 f = rng.multivariate_normal(mean=np.zeros(N), cov=K, size=(C)).T
-Y = np.argmax(f, 1).reshape(-1,).astype(int)
+Y = np.argmax(f, 1).flatten().astype(int)
 # One-hot encoding
 Y_hot = np.zeros((N, C), dtype=bool)
 Y_hot[np.arange(N), Y] = 1
@@ -303,7 +303,7 @@ def plot_joint_marginals(samples, parameters, y_axis_label):
 
 # %%
 plt.figure(figsize=(12, 6))
-order = np.argsort(X.reshape(-1,))
+order = np.argsort(X.flatten())
 
 for c in range(C):
     plt.plot(X[order], f[order, c], ".", color=colors[c], label=str(c))
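
The two hunks above replace `reshape(-1,)` with `flatten()`. As a small sketch on made-up arrays (the values of `f` and `X` here are assumptions, not the notebook's data), the two spellings give the same values. The one behavioural difference is that `flatten()` always returns a copy, whereas `reshape` returns a view when it can.

```python
import numpy as np

# Made-up stand-ins for the notebook's latent functions and inputs.
f = np.array([[0.1, 0.9, 0.3], [0.7, 0.2, 0.5]])
X = np.array([[0.3], [-1.0], [0.8]])

# Old and new spelling of the label computation.
labels_old = np.argmax(f, 1).reshape(-1,).astype(int)
labels_new = np.argmax(f, 1).flatten().astype(int)

# Old and new spelling of the sort order.
order_old = np.argsort(X.reshape(-1,))
order_new = np.argsort(X.flatten())

same_labels = np.array_equal(labels_old, labels_new)
same_order = np.array_equal(order_old, order_new)

# reshape returns a view of a contiguous array; flatten returns a copy.
reshape_shares = np.shares_memory(X, X.reshape(-1,))
flatten_shares = np.shares_memory(X, X.flatten())
```

Since neither result is modified in place in the notebook code shown, the copy made by `flatten()` does not change the outcome.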