# Advanced Examples

In [1]:
%config InlineBackend.print_figure_kwargs = {'bbox_inches': 'tight', 'dpi': 110}
%load_ext autoreload
%autoreload 2
import logging, warnings
logging.getLogger("pymc").setLevel(logging.FATAL)
warnings.filterwarnings("ignore")

## PyMC

The [Example](example.html) page introduces how to use *muse-inference* for a problem defined with PyMC. Here we consider a more complex problem to highlight additional features. In particular:

* We can estimate any number of parameters with any shapes. Here we have a 2-dimensional array $\mu$ and a scalar $\theta$. Note that by default, *muse-inference* considers any variables which do not depend on others as "parameters" (i.e. the "leaves" of the probabilistic graph). However, the algorithm is not limited to such parameters, and any choice can be selected by providing a list of `params` to the `PyMCMuseProblem` constructor.

* We can work with distributions with limited domain support. For example, below we use the $\rm Beta$ distribution with support on $(0,1)$ and the $\rm LogNormal$ distribution with support on $(0,\infty)$. All necessary transformations are handled internally.

* The data and latent space can include any number of variables, with any shapes. Below we demonstrate an $x$ and $z$ which are 2-dimensional arrays. 
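The internal transformations mentioned above are not shown on this page. As a rough illustration of the general idea only (not the library's actual implementation), a variable with bounded support can be mapped onto the whole real line, worked with there, and mapped back. The sketch below uses the standard logit and log transforms with NumPy; all helper names are hypothetical.

```python
import numpy as np

def unit_interval_to_real(p):
    """Logit transform: maps values in (0, 1) onto the real line."""
    p = np.asarray(p, dtype=float)
    return np.log(p) - np.log1p(-p)

def real_to_unit_interval(u):
    """Inverse logit (sigmoid): maps any real value back into (0, 1)."""
    u = np.asarray(u, dtype=float)
    return 1.0 / (1.0 + np.exp(-u))

def positive_to_real(y):
    """Log transform: maps values in (0, inf) onto the real line."""
    return np.log(np.asarray(y, dtype=float))

def real_to_positive(v):
    """Exponential: maps any real value back into (0, inf)."""
    return np.exp(np.asarray(v, dtype=float))

# A Beta-distributed parameter lives in (0, 1) ...
mu = np.array([0.3, 0.7])
u = unit_interval_to_real(mu)

# ... and an arbitrary step in the unconstrained space still lands in (0, 1).
mu_stepped = real_to_unit_interval(u + 5.0)

# A LogNormal-distributed latent variable lives in (0, inf).
z = np.array([0.1, 2.0, 30.0])
z_roundtrip = real_to_positive(positive_to_real(z))
```

Because every real number maps back to a valid value, an optimizer working in the unconstrained space can never step outside the support.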

First, load the relevant packages:

In [2]:
%pylab inline
import pymc as pm
from muse_inference.pymc import PyMCMuseProblem

Populating the interactive namespace from numpy and matplotlib


Then define the problem,

In [3]:
def gen_funnel(x=None, σ=None, μ=None, rng=None):
    with pm.Model() as model:
        μ = pm.Beta("μ", 2, 5, size=2) if μ is None else μ
        σ = pm.Normal("σ", 0, 3) if σ is None else σ
        z = pm.LogNormal("z", μ, np.exp(σ/2), size=(100, 2))
        x = pm.Normal("x", z, 1, observed=x)
    return model

generate the model and some data, given some chosen true values of parameters,

In [4]:
θ_true = dict(μ=[0.3, 0.7], σ=1)
with gen_funnel(rng=RandomState(0), **θ_true):
    x_obs = pm.sample_prior_predictive(1, random_seed=0).prior.x[0,0]
model = gen_funnel(x=x_obs)
prob = PyMCMuseProblem(model)

and finally, run MUSE:

In [5]:
θ_start = dict(μ=[0.5, 0.5], σ=0)
result = prob.solve(θ_start=θ_start, progress=True)

MUSE: 100%|██████████| 5050/5050 [00:08<00:00, 588.72it/s]  




get_H: 100%|██████████| 70/70 [00:00<00:00, 93.45it/s]




When there are multiple parameters, the starting guess should be specified as a dictionary, as above.

The parameter estimate is returned as a dictionary,

In [6]:
result.θ

{'μ': array([0.39753593, 0.43684889]), 'σ': array(0.90340398)}

and the covariance as a matrix, with parameters concatenated in the order they appear in the model (or in the order specified in `params`, if that was used):

In [7]:
result.Σ

array([[ 2.12552465e-02,  5.31172549e-04, -3.29325092e-05],
       [ 5.31172549e-04,  3.38848825e-02, -8.03978415e-03],
       [-3.29325092e-05, -8.03978415e-03,  9.27782381e-03]])

The `result.ravel` and `result.unravel` functions can be used to convert between dictionary and vector representations of the parameters. For example, to compute the standard deviation for each parameter (the square root of the diagonal of the covariance):

In [8]:
result.unravel(np.sqrt(np.diag(result.Σ)))

{'μ': array([0.14579179, 0.18407847]), 'σ': array(0.09632146)}

or to convert the mean parameters to a vector:

In [9]:
result.ravel(result.θ)

array([0.39753593, 0.43684889, 0.90340398])
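To make the dictionary/vector conversion concrete, here is a minimal NumPy sketch that re-creates the idea with two hypothetical helpers, `ravel_dict` and `unravel_like` (these are illustrations, not the functions attached to `result`). It uses the estimate and covariance printed above to compute the standard deviations, the correlation matrix, and how far each estimate lies from the true value in units of its standard deviation.

```python
import numpy as np

# Values copied from the outputs printed above.
theta_hat = {"μ": np.array([0.39753593, 0.43684889]), "σ": np.array(0.90340398)}
theta_true = {"μ": np.array([0.3, 0.7]), "σ": np.array(1.0)}
Sigma = np.array([
    [ 2.12552465e-02,  5.31172549e-04, -3.29325092e-05],
    [ 5.31172549e-04,  3.38848825e-02, -8.03978415e-03],
    [-3.29325092e-05, -8.03978415e-03,  9.27782381e-03],
])

def ravel_dict(d):
    """Hypothetical helper: concatenate the dictionary values into one flat vector."""
    return np.concatenate([np.ravel(v) for v in d.values()])

def unravel_like(vec, template):
    """Hypothetical helper: split a flat vector into the shapes found in `template`."""
    out, start = {}, 0
    for name, value in template.items():
        n = np.size(value)
        out[name] = np.reshape(vec[start:start + n], np.shape(value))
        start += n
    return out

std = np.sqrt(np.diag(Sigma))             # per-parameter standard deviation
corr = Sigma / np.outer(std, std)         # correlation matrix
pull = (ravel_dict(theta_hat) - ravel_dict(theta_true)) / std

print(unravel_like(std, theta_hat))
print(unravel_like(pull, theta_hat))
```

With these numbers, every estimate lies within roughly 1.5 standard deviations of its true value.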

## Jax

We can also use [Jax](https://jax.readthedocs.io/) to define the problem. In this case we write out functions to generate forward samples and to compute the posterior, and Jax provides the necessary gradients for free. To use Jax, load the necessary packages:

In [10]:
from functools import partial
import jax
import jax.numpy as jnp
from muse_inference.jax import JittableJaxMuseProblem, JaxMuseProblem
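As a small illustration of what "gradients for free" means, independent of *muse-inference* itself, `jax.grad` turns a scalar-valued function into a function that returns its derivative. The function `log_gauss` below is a hypothetical toy example, not part of the library.

```python
import jax
import jax.numpy as jnp

def log_gauss(θ, x):
    """Unnormalized Gaussian log-density of data x with mean θ and unit variance."""
    return -jnp.sum((x - θ)**2) / 2

# By default jax.grad differentiates with respect to the first argument (θ).
grad_log_gauss = jax.grad(log_gauss)

x = jnp.array([0.5, 1.5, 2.5])
g_auto = grad_log_gauss(1.0, x)     # derivative computed automatically by Jax
g_hand = jnp.sum(x - 1.0)           # derivative worked out by hand: sum(x - θ)
```

The two values agree, and no derivative code had to be written for `log_gauss`.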

Let's implement the noisy funnel problem from the [Example](example.html) page. To do so, extend either `JaxMuseProblem`, or, if your code is able to be JIT compiled by Jax, extend `JittableJaxMuseProblem` and decorate the functions with `jax.jit`:

In [11]:
class JaxFunnelMuseProblem(JittableJaxMuseProblem):

    def __init__(self, N):
        super().__init__()
        self.N = N

    @partial(jax.jit, static_argnums=0)
    def sample_x_z(self, key, θ):
        keys = jax.random.split(key, 2)
        z = jax.random.normal(keys[0], (self.N,)) * jnp.exp(θ/2)
        x = z + jax.random.normal(keys[1], (self.N,))
        return (x, z)

    @partial(jax.jit, static_argnums=0)
    def logLike(self, x, z, θ):
        # the normalization of p(z|θ) contributes one θ/2 per latent component, i.e. self.N in total
        return -(jnp.sum((x - z)**2) + jnp.sum(z**2) / jnp.exp(θ) + self.N*θ) / 2

    @partial(jax.jit, static_argnums=0)
    def logPrior(self, θ):
        return -θ**2 / (2*3**2)
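To see where the terms in `logLike` come from: `sample_x_z` draws each latent component from a zero-mean Gaussian with variance $e^\theta$ and each data component from a unit-variance Gaussian centered on the corresponding latent component. The joint log-density is the sum of those two Gaussian log-densities, and the $\theta$-dependent normalization of the latent density contributes $\theta/2$ per latent component. The NumPy sketch below (function names are hypothetical) checks that the compact expression differs from the full density only by a constant that does not depend on $\theta$.

```python
import numpy as np

def normal_logpdf(y, mean, var):
    """Log-density of a univariate Gaussian, evaluated elementwise."""
    return -0.5 * (np.log(2 * np.pi * var) + (y - mean)**2 / var)

def full_log_joint(x, z, θ):
    """log p(x|z) + log p(z|θ), including all normalization constants."""
    return np.sum(normal_logpdf(x, z, 1.0)) + np.sum(normal_logpdf(z, 0.0, np.exp(θ)))

def compact_log_joint(x, z, θ):
    """The same density with θ-independent constants dropped."""
    N = z.size
    return -(np.sum((x - z)**2) + np.sum(z**2) / np.exp(θ) + N * θ) / 2

rng = np.random.default_rng(0)
N = 50
z = rng.normal(size=N)
x = z + rng.normal(size=N)

# The difference should be the same for every θ.
offsets = [full_log_joint(x, z, θ) - compact_log_joint(x, z, θ) for θ in (-1.0, 0.0, 2.0)]
```

Each entry of `offsets` equals $-N\log(2\pi)$, so dropping that constant has no effect on gradients with respect to `z` or `θ`.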

Now generate some simulated data, which we set into `prob.x`. Note also the use of `PRNGKey` (rather than `RandomState` for PyMC/Numpy) for random number generation. 

In [12]:
prob = JaxFunnelMuseProblem(10000)
key = jax.random.PRNGKey(0)
(x, z) = prob.sample_x_z(key, 0)
prob.x = x
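For readers used to NumPy's stateful generators, the sketch below illustrates how explicit Jax keys behave: reusing a key reproduces the same draw, and `jax.random.split` produces subkeys for distinct random streams. It is a standalone example and does not use *muse-inference*.

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)

# The same key always yields the same draw: there is no hidden global state.
a = jax.random.normal(key, (3,))
b = jax.random.normal(key, (3,))

# To get distinct random streams, split the key and use each subkey once.
key_z, key_x = jax.random.split(key, 2)
z = jax.random.normal(key_z, (3,))
noise = jax.random.normal(key_x, (3,))
```

This is why `sample_x_z` above splits its key before drawing `z` and the noise added to form `x`.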



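And finally, run MUSE. The first call is a warmup, which lets Jax JIT-compile the functions so that the second call reflects only the solve itself: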

In [13]:
prob.solve(θ_start=0., rng=jax.random.PRNGKey(1)) # warmup

<muse_inference.muse_inference.MuseResult at 0x7f296e598b20>

In [14]:
result = prob.solve(θ_start=0., rng=jax.random.PRNGKey(1), progress=True)

MUSE: 100%|██████████| 5050/5050 [00:00<00:00, 8361.35it/s] 




get_H: 100%|██████████| 30/30 [00:00<00:00, 384.61it/s]




Note that the solution here is obtained around 10X faster than the PyMC version of this problem on the [Example](example.html) page (the cloud machines which build these docs don't always achieve the 10X, but you should see it if you run these examples locally). The Jax interface has much lower overhead, which is noticeable for very fast posteriors like the one above.

One convenient aspect of using Jax is that the parameters, `θ`, and latent space, `z`, can be any [pytree](https://jax.readthedocs.io/en/latest/pytrees.html), i.e. tuples, dictionaries, nested combinations of them, etc. (there is no requirement on the data format of the `x` variable). To demonstrate, consider a problem which is just two copies of the noisy funnel problem:

In [15]:
class JaxPyTreeFunnelMuseProblem(JittableJaxMuseProblem):

    def __init__(self, N):
        super().__init__()
        self.N = N

    @partial(jax.jit, static_argnums=0)
    def sample_x_z(self, key, θ):
        (θ1, θ2) = (θ["θ1"], θ["θ2"])
        keys = jax.random.split(key, 4)
        z1 = jax.random.normal(keys[0], (self.N,)) * jnp.exp(θ1/2)
        z2 = jax.random.normal(keys[1], (self.N,)) * jnp.exp(θ2/2)        
        x1 = z1 + jax.random.normal(keys[2], (self.N,))
        x2 = z2 + jax.random.normal(keys[3], (self.N,))        
        return ({"x1":x1, "x2":x2}, {"z1":z1, "z2":z2})

    @partial(jax.jit, static_argnums=0)
    def logLike(self, x, z, θ):
        # each copy's p(z|θ) normalization contributes self.N * θ (one θ/2 per latent component)
        return (
            -(jnp.sum((x["x1"] - z["z1"])**2) + jnp.sum(z["z1"]**2) / jnp.exp(θ["θ1"]) + self.N*θ["θ1"]) / 2
            -(jnp.sum((x["x2"] - z["z2"])**2) + jnp.sum(z["z2"]**2) / jnp.exp(θ["θ2"]) + self.N*θ["θ2"]) / 2
        )

    @partial(jax.jit, static_argnums=0)
    def logPrior(self, θ):
        return - θ["θ1"]**2 / (2*3**2) - θ["θ2"]**2 / (2*3**2)

Here, `x`, `θ`, and `z` are all dictionaries. We generate the problem as usual, passing in parameters as dictionaries,

In [16]:
θ_true = dict(θ1=-1., θ2=2.)
θ_start = dict(θ1=0., θ2=0.)

In [17]:
prob = JaxPyTreeFunnelMuseProblem(10000)
key = jax.random.PRNGKey(0)
(x, z) = prob.sample_x_z(key, θ_true)
prob.x = x

and run MUSE:

In [18]:
prob.solve(θ_start=θ_start, rng=jax.random.PRNGKey(0)) # warmup

<muse_inference.muse_inference.MuseResult at 0x7f295c8aa520>

In [19]:
result = prob.solve(θ_start=θ_start, rng=jax.random.PRNGKey(0), progress=True)

MUSE: 100%|██████████| 5050/5050 [00:09<00:00, 521.89it/s] 




get_H: 100%|██████████| 50/50 [00:00<00:00, 190.24it/s]




The result is returned as a pytree:

In [20]:
result.θ

{'θ1': DeviceArray(-1.0020632, dtype=float32),
 'θ2': DeviceArray(2.027134, dtype=float32)}

and the covariance as a matrix:

In [21]:
result.Σ

array([[ 3.3823610e-03, -4.6946143e-06],
       [-4.6946157e-06,  2.4767642e-04]], dtype=float32)

The `result.ravel` and `result.unravel` functions can be used to convert between pytree and vector representations of the parameters. For example, to compute the standard deviation for each parameter (the square root of the diagonal of the covariance):

In [22]:
result.unravel(np.sqrt(np.diag(result.Σ)))

{'θ1': DeviceArray(0.05815807, dtype=float32),
 'θ2': DeviceArray(0.01573774, dtype=float32)}

or to convert the mean parameters to a vector:

In [23]:
result.ravel(result.θ)

DeviceArray([-1.0020632,  2.027134 ], dtype=float32)
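The same pytree-to-vector conversion can be sketched with Jax's own `jax.flatten_util.ravel_pytree`, which returns a flat vector together with a function that restores the original structure. This is offered as an illustration of the concept, not as a description of how `result.ravel` is implemented. The numbers are copied from the outputs above, and the sketch expresses each estimate's offset from the true value in units of its standard deviation.

```python
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Values copied from the outputs printed above.
θ_true = dict(θ1=jnp.array(-1.0), θ2=jnp.array(2.0))
θ_hat = dict(θ1=jnp.array(-1.0020632), θ2=jnp.array(2.027134))
Σ = jnp.array([[ 3.3823610e-03, -4.6946143e-06],
               [-4.6946157e-06,  2.4767642e-04]])

# Flatten both pytrees; `unravel` maps a flat vector back to a dictionary.
θ_hat_vec, unravel = ravel_pytree(θ_hat)
θ_true_vec, _ = ravel_pytree(θ_true)

std_vec = jnp.sqrt(jnp.diag(Σ))
pull = unravel((θ_hat_vec - θ_true_vec) / std_vec)
print(pull)
```

Both estimates lie within two standard deviations of the true values used to simulate the data.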