In [48]:
import genjax
import jax
import numpy as np
import seaborn as sns

sns.set_theme(style="white")

# Pretty printing.
console = genjax.console(width=80)

# Reproducibility.
key = jax.random.PRNGKey(314159)

Gen helps probabilistic programmers design and implement models and inference algorithms by automating the (often) complicated inference math. The generative function interface is the key abstraction layer which provides this automation. Generative function language designers can extend the interface to new generative function objects - providing domain-specific patterns and optimizations which users can automatically take advantage of.

One key class of generative function languages is _combinators_ - higher-order functions which accept generative functions as input, and produce a new generative function type as output.

Combinators functionally transform the generative structure that we pass into them, expressing useful patterns - including chain-like computations, IID sampling patterns, or generative computations which form grammar-like structures. 

Combinators also expose optimization opportunities - by registering the patterns as generative functions, implementors (e.g. the library authors) can specialize the implementation of the generative function interface methods. Users of combinators can then take advantage of this interface specialization to express asymptotically optimal updates (useful in e.g. MCMC kernels), or optimized importance weight calculations.

In this notebook, we'll be discussing `Unfold` - a combinator for expressing generative computations which are reminiscent of state-space (or Markov) models. To keep things simple, we'll explore a hidden Markov model example - but combinator usage generalizes to models with much richer structure.

## Introducing `Unfold`

Let's discuss `Unfold`.^[A quick reminder: when in doubt, you can use the `console` created in the first cell (`console = genjax.console(width=80)`) to inspect the classes which we discuss in the notebooks.]

How do we make an instance of `Unfold`? We start from an existing generative function which is a _kernel_ - a generative function whose return value has the same type as the state it accepts, so that its output can be fed back in as its next input. Given a kernel, we can create a valid `Unfold` instance.^[This is not strictly true. `Unfold` also allows you to pass in a set of _static arguments_ which are provided to the kernel _after the state argument_, unchanged, at each time step. We show this at the bottom of the notebook.]

Here's an example kernel:

In [49]:
@genjax.Static
def kernel(prev_latent):
    new_latent = genjax.normal(prev_latent, 1.0) @ "z"
    _new_obs = genjax.normal(new_latent, 1.0) @ "x"
    return new_latent

In [50]:
key, sub_key = jax.random.split(key)
tr = jax.jit(kernel.simulate)(sub_key, (0.3,))
tr

StaticTrace(gen_fn=StaticGenerativeFunction(source=<function kernel at 0x16c8a4d30>), args=(Array(0.3, dtype=float32, weak_type=True),), retval=Array(-1.3722037, dtype=float32), address_choices=Trie(inner={'z': DistributionTrace(gen_fn=TFPDistribution(make_distribution=<class 'tensorflow_probability.substrates.jax.distributions.normal.Normal'>), args=(Array(0.3, dtype=float32, weak_type=True), Array(1., dtype=float32, weak_type=True)), value=Array(-1.3722037, dtype=float32), score=Array(-2.3170712, dtype=float32)), 'x': DistributionTrace(gen_fn=TFPDistribution(make_distribution=<class 'tensorflow_probability.substrates.jax.distributions.normal.Normal'>), args=(Array(-1.3722037, dtype=float32), Array(1., dtype=float32, weak_type=True)), value=Array(-1.7015922, dtype=float32), score=Array(-0.9731869, dtype=float32))}), cache=Trie(inner={}), score=Array(-3.2902582, dtype=float32))

To create an `Unfold` instance, we provide two things:

* The kernel generative function.
* A static maximum chain length (`max_length`). Dynamically, `Unfold` may not unroll all the way up to this maximum - but for JAX/XLA compilation, we need to provide this maximum value as an invariant upper bound for any invocation of `Unfold`.

In [51]:
chain = genjax.Unfold(max_length=10)(kernel)
chain

UnfoldCombinator(max_length=10, kernel=StaticGenerativeFunction(source=<function kernel at 0x16c8a4d30>))
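The role of the static upper bound can be sketched in plain JAX. The following is a minimal illustration, not GenJAX's implementation: `unfold_sketch` and its random-walk step are hypothetical stand-ins for the kernel above, and the loop always runs for `max_length` steps while a dynamic `index` decides which steps actually evolve the state.

```python
import jax
import jax.numpy as jnp


def unfold_sketch(key, index, init_state, max_length=10):
    """Hypothetical sketch: scan for a static `max_length`, freezing the state after `index`."""

    def step(carry, t):
        key, state = carry
        key, sub_key = jax.random.split(key)
        # Random-walk proposal, mirroring z_t ~ Normal(z_{t-1}, 1.0).
        proposed = state + jax.random.normal(sub_key)
        # Evolve only while t <= index; otherwise carry the state forward unchanged.
        new_state = jnp.where(t <= index, proposed, state)
        return (key, new_state), new_state

    # The scan length is fixed at trace time, as XLA requires static shapes.
    _, states = jax.lax.scan(step, (key, init_state), jnp.arange(max_length))
    return states


states = jax.jit(unfold_sketch)(jax.random.PRNGKey(0), 5, jnp.float32(0.3))
print(states)  # entries after position 5 repeat the value at position 5
```

The output always has shape `(max_length,)`, whatever `index` is; only the number of entries that differ from their predecessor changes.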

To invoke an interface method, `Unfold` expects its arguments as a `Tuple`, where the first element is the maximum **index** in the resulting chain, and the second element is the initial state.

::: {.callout-important}

## Usage of index argument vs. a length argument

Note how we've bolded **index** above - think of the index value as denoting an upper bound on active indices for the resulting chain. An _active index_ is one in which the value was evolved using the `kernel` from the previous value. Passing in `index = 5` means: all values after `return[5]` are not evolved, they're just filled with the `return[5]` value.

Indexing follows Python convention - so e.g. passing in `0` as the index means that the kernel is applied to the initial state **exactly once**; after that, evolution halts and the remaining entries are filled with that value.

:::
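To make the index convention concrete, here is a small deterministic sketch in plain JAX. It assumes a hypothetical "kernel" that simply adds `1.0` to the state, and a hypothetical constant `MAX_LENGTH`; it is an illustration of the convention described above, not GenJAX code.

```python
import jax
import jax.numpy as jnp

MAX_LENGTH = 10  # hypothetical static upper bound on the chain length


def count_up(index, init_state):
    """Apply a deterministic `+ 1.0` step at every position t <= index."""

    def step(state, t):
        new_state = jnp.where(t <= index, state + 1.0, state)
        return new_state, new_state

    _, states = jax.lax.scan(step, init_state, jnp.arange(MAX_LENGTH))
    return states


single_step = count_up(0, jnp.float32(0.0))
six_steps = count_up(5, jnp.float32(0.0))
print(single_step)  # [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
print(six_steps)    # [1. 2. 3. 4. 5. 6. 6. 6. 6. 6.]
```

With `index = 0` the step is applied once; with `index = 5` it is applied six times, and every later entry repeats the value at position 5.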

In [52]:
key, sub_key = jax.random.split(key)
tr = jax.jit(chain.simulate)(sub_key, (5, 0.3))
tr

UnfoldTrace(unfold=UnfoldCombinator(max_length=10, kernel=StaticGenerativeFunction(source=<function kernel at 0x16c8a4d30>)), inner=StaticTrace(gen_fn=StaticGenerativeFunction(source=<function kernel at 0x16c8a4d30>), args=(Array([ 0.3       ,  0.6131936 ,  0.00573514,  0.5613615 , -0.8505292 ,
       -0.57859993,  0.        ,  0.        ,  0.        ,  0.        ],      dtype=float32),), retval=Array([ 0.6131936 ,  0.00573514,  0.5613615 , -0.8505292 , -0.57859993,
       -0.559737  ,  0.        ,  0.        ,  0.        ,  0.        ],      dtype=float32), address_choices=Trie(inner={'z': DistributionTrace(gen_fn=TFPDistribution(make_distribution=<class 'tensorflow_probability.substrates.jax.distributions.normal.Normal'>), args=(Array([ 0.3       ,  0.6131936 ,  0.00573514,  0.5613615 , -0.8505292 ,
       -0.57859993,  0.        ,  0.        ,  0.        ,  0.        ],      dtype=float32), Array([1., 1., 1., 1., 1., 1., 0., 0., 0., 0.], dtype=float32)), value=Array([ 0.6131936 ,  0

In [53]:
tr.indices

AttributeError: 'UnfoldTrace' object has no attribute 'indices'

In [None]:
tr.get_retval()

Array([ 0.77070504,  1.2511517 , -0.05651412, -1.0273771 , -1.9366605 ,
       -2.0199924 , -2.0199924 , -2.0199924 , -2.0199924 , -2.0199924 ],      dtype=float32, weak_type=True)

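The trace keeps track of where the chain stopped evolving, according to the index argument to `Unfold` (the `tr.indices` attribute queried above is not available on `UnfoldTrace` in the GenJAX version which produced this output). In `tr.get_retval()`, we see that the final dynamic value occurs at `index = 5`; evolution stops afterwards, and the remaining entries repeat that value.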

## Combinator choice maps

Typically, each combinator has its own choice map type. The choice map simultaneously represents the structure of the generative choices which the transformed combinator generative function makes, as well as optimization opportunities which a user can take advantage of.

Let's study the choice map for `UnfoldTrace`.

In [None]:
chm = tr.get_choices()
chm

VectorChoiceMap(inner=Mask(mask=Array([ True,  True,  True,  True,  True,  True, False, False, False,
       False], dtype=bool), value=HierarchicalChoiceMap(trie=Trie(inner={'z': ChoiceValue(value=Array([ 0.77070504,  1.2511517 , -0.05651412, -1.0273771 , -1.9366605 ,
       -2.0199924 ,  0.        ,  0.        ,  0.        ,  0.        ],      dtype=float32)), 'x': ChoiceValue(value=Array([ 1.1478616 ,  1.9526132 ,  0.77730536, -0.6052703 , -3.665616  ,
       -2.0058045 ,  0.        ,  0.        ,  0.        ,  0.        ],      dtype=float32))}))))

Again, let's look at the indices.

In [None]:
chm.indices

AttributeError: 'VectorChoiceMap' object has no attribute 'indices'

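The choice map also keeps track of which indices are active, and which indices are inactive. Although `chm.indices` is not available here, the `mask` field of the `VectorChoiceMap` printed above carries this information: it is `True` for indices 0 through 5, and `False` afterwards.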

Inactive indices **do not** participate in inference metadata computations - so e.g. if we ask for the score of the trace:

In [None]:
tr.get_score()

Array(-15.245218, dtype=float32)

The score is the same as the sum of the sub-trace scores at the active indices 0 through 5, i.e. the slice `[0:6]`.

In [None]:
np.sum(tr.get_subtree("z").get_score()[0:6] + tr.get_subtree("x").get_score()[0:6])

AttributeError: 'UnfoldTrace' object has no attribute 'get_subtree'
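The masked-sum claim can also be checked without going through the trace API. The sketch below uses plain JAX rather than any GenJAX method: it copies the `z` and `x` values from the choice map printed above, assumes the initial state `0.3` from the `simulate` call, recomputes the per-step normal log densities, and sums only the active indices.

```python
import jax.numpy as jnp
from jax.scipy.stats import norm

index = 5
init_state = 0.3  # assumed: the initial state passed to `chain.simulate`

# Values copied from the printed choice map (zero-padded beyond index 5).
z = jnp.array([0.77070504, 1.2511517, -0.05651412, -1.0273771, -1.9366605,
               -2.0199924, 0.0, 0.0, 0.0, 0.0])
x = jnp.array([1.1478616, 1.9526132, 0.77730536, -0.6052703, -3.665616,
               -2.0058045, 0.0, 0.0, 0.0, 0.0])

# Each z_t is scored against the previous latent; each x_t against z_t.
prev = jnp.concatenate([jnp.array([init_state]), z[:-1]])
z_scores = norm.logpdf(z, loc=prev, scale=1.0)
x_scores = norm.logpdf(x, loc=z, scale=1.0)

# Only indices 0..index are active; inactive entries contribute nothing.
active = jnp.arange(z.shape[0]) <= index
total = jnp.sum(jnp.where(active, z_scores + x_scores, 0.0))
print(total)  # about -15.245, in line with the `tr.get_score()` output above
```

The padded entries still produce finite log densities, which is why the mask (rather than the padding itself) is what keeps them out of the total.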

The reason why we have an `index` argument is that we can dynamically choose how much of the chain contributes to the generative computation. This `index` argument can come from another generative function - it need not be a JAX trace-time static value.

With this in mind, it's best to think of `Unfold` as representing a space of processes which unroll up to some maximum static length - but the active generative process can halt before that maximum length.
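The distinction between the static maximum length and the dynamic index can be illustrated in plain JAX. In this sketch (the function `masked_cumsum` and the list `trace_log` are hypothetical names, not part of GenJAX), the array length is fixed at trace time, while the index is an ordinary traced argument, so calling the compiled function with different indices does not trigger retracing.

```python
import jax
import jax.numpy as jnp

trace_log = []  # grows only when JAX traces (and compiles) the function


@jax.jit
def masked_cumsum(index, xs):
    trace_log.append("traced")  # Python side effect: runs at trace time only
    # The length of `xs` is static; `index` is a dynamic, traced value.
    active = jnp.arange(xs.shape[0]) <= index
    return jnp.cumsum(jnp.where(active, xs, 0.0))


xs = jnp.ones(10)
short_run = masked_cumsum(2, xs)
long_run = masked_cumsum(7, xs)
print(short_run[-1], long_run[-1], len(trace_log))
```

Both calls reuse the same compiled computation: the output shape is always `(10,)`, and only the number of contributing entries changes with `index`.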