# [dev] PyMC4 Design overview. Generators, Coroutines and all the things

For a brief introduction you can first read [PEP0342](https://www.python.org/dev/peps/pep-0342/), but I will cover most of it here with a practical example. We will draft a PPL (probabilistic programming language) on top of `tfp`. The challenge is to stay flexible while the computational graph is built dynamically.

## A simple case

In [1]:
import tensorflow_probability as tfp
import tensorflow as tf
from tensorflow_probability import distributions as tfd

Let's first look at the "HOW-TO PPL". We need a probabilistic program that lets us compute the `logp` of an arbitrary model. But how do we do this? In a static graph backend like Theano we could write things declaratively and manipulate the computational graph. With a dynamic graph we have a problem: there is no explicit graph representation. Instead, we could write a function that does all the work.

Ideally it should look like this:

In [2]:
def model(x):
    scale = tfd.HalfCauchy(0, 1)
    coefs = tfd.Normal(tf.zeros(x.shape[1]), 1, )
    predictions = tfd.Normal(tf.linalg.matvec(x, coefs), scale)
    return predictions

But this function will not work: `coefs` and `scale` are distribution objects, not tensors, so `tf.linalg.matvec` cannot use them. You can try it yourself.

In [None]:
model(tf.random.normal((100, 10)))

What can we do instead? We want to track the function evaluation on the fly. Is there a way to do this? Yes.

The very first attempt to cope with this was a wrapper around the distribution object. The wrapper was intended to catch a call to the distribution and use the context to figure out what to do. That is nice, but there is a more natural way. What we really want is to borrow control for a while and decide what to do, and coroutines give us that for free.

In [4]:
def model(x):
    scale = yield tfd.HalfCauchy(0, 1)
    coefs = yield tfd.Normal(tf.zeros(x.shape[1]), 1, )
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale)
    return predictions

Now the model body reads as expected, while `yield` lets it hand control away. Before evaluating this function, let's figure out what `yield` does.

In [5]:
def generator(message):
    print("I am a generator and I yield", message)
    responce = yield message
    print("I am a generator and I got", responce)
    return "good bye"

In [6]:
g = generator("(generators are cool)")

In [7]:
mes = g.send(None)

I am a generator and I yield (generators are cool)


In [8]:
print(mes)

(generators are cool)


In [9]:
g.send("(yeah, bro, generators are cool)")

I am a generator and I got (yeah, bro, generators are cool)


StopIteration: good bye

What happened here:

* we had a simple generator and were able to communicate with it via `send`
* after `send` is called (the first call must pass `None`), the generator runs to the next `yield` expression and yields what it is asked to yield
* the return value of `send` is exactly the message from `yield message`
* the left-hand side of `responce = yield message` is set by the next `send`, and no earlier
* once the generator has no `yield` expressions left and reaches `return`, it raises `StopIteration` with the return value as its first argument
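
The same protocol can be condensed into a tiny driver. This is only a sketch in plain Python: `drive` and `echo_twice` are hypothetical names made up for illustration, not part of PyMC4 or `tfp`.

```python
def drive(gen, replies):
    """Prime `gen`, answer each yield with the next reply, collect everything.

    Returns a tuple (yielded_messages, return_value).
    """
    yielded = []
    try:
        # The first send must be None: no `yield` is waiting for a value yet.
        yielded.append(gen.send(None))
        for reply in replies:
            # `reply` becomes the value of the `yield` the generator paused at.
            yielded.append(gen.send(reply))
    except StopIteration as stop:
        # `return x` inside a generator surfaces as StopIteration carrying x.
        return yielded, stop.value
    raise RuntimeError("generator still has pending yields")


def echo_twice():
    first = yield "give me the first value"
    second = yield "give me the second value"
    return first + second


messages, result = drive(echo_twice(), [1, 2])
```

Here `messages` holds the two requests in order and `result` is the sum of the two replies.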

Now we are ready to evaluate our model by hand

In [10]:
state = dict(dists=dict(), samples=dict())

In [11]:
state["input"] = tf.random.normal((3, 10))
m = model(state["input"])

In [12]:
scale_dist = next(m)

In [13]:
print(scale_dist)

tfp.distributions.HalfCauchy("HalfCauchy/", batch_shape=[], event_shape=[], dtype=float32)


Okay, we are here

```python
def model(x):
    scale = yield tfd.HalfCauchy(0, 1) # <--- HERE
    coefs = yield tfd.Normal(tf.zeros(x.shape[1]), 1, )
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale)
    return predictions
```

What should we do with this distribution? We can choose forward sampling, in which case we sample from it. The user should get the sampled value back seamlessly, while on our side we would like to store the intermediate values and distributions.

In [14]:
assert scale_dist.name not in state["dists"]
state["samples"][scale_dist.name] = scale_dist.sample()
state["dists"][scale_dist.name] = scale_dist

In [15]:
coefs_dist = m.send(state["samples"][scale_dist.name])

In [16]:
print(coefs_dist)

tfp.distributions.Normal("Normal/", batch_shape=[10], event_shape=[], dtype=float32)


```python
def model(x):
    scale = yield tfd.HalfCauchy(0, 1)
    coefs = yield tfd.Normal(tf.zeros(x.shape[1]), 1, ) # <--- HERE
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale)
    return predictions
```

We do the same thing

In [17]:
assert coefs_dist.name not in state["dists"]
state["samples"][coefs_dist.name] = coefs_dist.sample()
state["dists"][coefs_dist.name] = coefs_dist

In [18]:
preds_dist = m.send(state["samples"][coefs_dist.name])

In [19]:
print(preds_dist)

tfp.distributions.Normal("Normal/", batch_shape=[3], event_shape=[], dtype=float32)


```python
def model(x):
    scale = yield tfd.HalfCauchy(0, 1)
    coefs = yield tfd.Normal(tf.zeros(x.shape[1]), 1, )
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale) # <--- HERE
    return predictions
```

We are now facing the predictive distribution. Here we have several options:
* sample from it: we get the prior predictive
* set a custom value instead of a sample: similar to Pearl's `do`-operator, but at the leaf nodes it is equivalent to conditioning. We might be interested in this to compute the unnormalized posterior
* replace it with another distribution, arbitrary magic
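
These three options can be sketched as one decision function. Everything here is an assumption made for illustration: `resolve`, `observed`, `replacements` and the `ToyUniform` stand-in are hypothetical names, and the stand-in only mimics the `name` and `sample` parts of a `tfd` distribution so that the sketch runs without TensorFlow.

```python
import random


class ToyUniform:
    """Hypothetical stand-in distribution; only `name` and `sample` are needed."""

    def __init__(self, low, high, name):
        self.low, self.high, self.name = low, high, name

    def sample(self):
        return random.uniform(self.low, self.high)


def resolve(dist, observed=None, replacements=None):
    """Decide what to send back into the model for a yielded distribution.

    Returns (distribution_to_record, value_to_send).
    """
    observed = observed or {}
    replacements = replacements or {}
    name = dist.name  # look everything up under the name the model used
    if name in replacements:
        # option 3: swap the distribution for another one
        dist = replacements[name]
    if name in observed:
        # option 2: use a fixed value instead of a sample
        return dist, observed[name]
    # option 1: forward sampling
    return dist, dist.sample()


y = ToyUniform(0.0, 1.0, "y/")
_, sampled = resolve(y)
_, fixed = resolve(y, observed={"y/": 5.0})
swapped_dist, swapped = resolve(y, replacements={"y/": ToyUniform(10.0, 11.0, "y/")})
```

The order of the checks is a design choice of this sketch: a replacement is applied first, so an observed value would still win over sampling from the replacement.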

In [20]:
assert preds_dist.name not in state["dists"]
state["samples"][preds_dist.name] = tf.zeros(preds_dist.batch_shape)
state["dists"][preds_dist.name] = preds_dist

AssertionError: 

Gotcha, we found duplicated names in our toy graphical model. We can easily tell our user to rewrite the model to get rid of duplicate names

In [21]:
m.throw(RuntimeError(
    "We found duplicate names in your cool model: {}, "
    "so far we have other variables in the model, {}".format(
        preds_dist.name, set(state["dists"].keys()), 
    )
))

RuntimeError: We found duplicate names in your cool model: Normal/, so far we have other variables in the model, {'Normal/', 'HalfCauchy/'}

The good thing is that we *communicate* with the user and can give meaningful exceptions with little pain.
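
What makes `throw` a two-way channel is that the exception is raised at the `yield` where the generator is paused, so the model code may even catch it and react. A minimal sketch in plain Python, where `model_with_fallback` is a hypothetical example and not PyMC4 code:

```python
def model_with_fallback():
    try:
        value = yield "first request"
    except ValueError as err:
        # The driver rejected the request; react and ask for something else.
        value = yield "fallback request after: {}".format(err)
    return value


g = model_with_fallback()
first = next(g)
# The exception is raised inside the generator at the paused `yield`;
# `throw` returns whatever the generator yields next.
second = g.throw(ValueError("bad name"))
try:
    g.send(42)
except StopIteration as stop:
    result = stop.value
```

If the generator does not catch the exception, it propagates back to the caller of `throw`, which is what happened with the `RuntimeError` above.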

The correct model should look like this:

```python
def model(x):
    scale = yield tfd.HalfCauchy(0, 1)
    coefs = yield tfd.Normal(tf.zeros(x.shape[1]), 1, )
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale, name="Normal_1") # <--- HERE we asked our user to change the name
    return predictions
```


Let's set all the names according to the new model and interact with the user again using the same model.

In [22]:
m.gi_running

False
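
A word of caution: `gi_running` is `False` whenever the generator is not executing at that very moment, so on its own it does not tell a paused generator from a finished one. A hedged sketch of a more telling check with the standard library's `inspect.getgeneratorstate` (the `two_steps` generator is a made-up example):

```python
import inspect


def two_steps():
    yield "first"
    yield "second"


g = two_steps()
state_created = inspect.getgeneratorstate(g)    # not started yet
next(g)
state_suspended = inspect.getgeneratorstate(g)  # paused at a `yield`
running_while_paused = g.gi_running             # False, although it is alive
try:
    # An uncaught exception thrown into the generator finishes it.
    g.throw(RuntimeError("boom"))
except RuntimeError:
    pass
state_closed = inspect.getgeneratorstate(g)     # finished for good
```

A finished generator also has `gi_frame` set to `None`.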

Our generator is now dead: the exception we threw propagated out of it, which finished the generator. We can't interact with it any more, so let's create a new one and re-evaluate it with the same sampled values (a hint at how to get the desired `logp` function).

In [23]:
def model(x):
    scale = yield tfd.HalfCauchy(0, 1)
    coefs = yield tfd.Normal(tf.zeros(x.shape[1]), 1, )
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale, name="Normal_1") # <--- HERE we asked our user to change the name
    return predictions

In [24]:
m = model(state["input"])
print(m.send(None))
print(m.send(state["samples"]["HalfCauchy/"]))
print(m.send(state["samples"]["Normal/"]))
try:
    m.send(tf.zeros(state["input"].shape[0]))
except StopIteration as e:
    stop_iteration = e
else:
    raise RuntimeError("No exception met")

tfp.distributions.HalfCauchy("HalfCauchy/", batch_shape=[], event_shape=[], dtype=float32)
tfp.distributions.Normal("Normal/", batch_shape=[10], event_shape=[], dtype=float32)
tfp.distributions.Normal("Normal_1/", batch_shape=[3], event_shape=[], dtype=float32)


In [25]:
print(stop_iteration)

tf.Tensor([0. 0. 0.], shape=(3,), dtype=float32)


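Instead of returning a value from the last `send`, the generator raises `StopIteration` because it is exhausted: it reached the `return` statement with no more `yield` on the way. As checked here, the return value travels inside the exception (`stop_iteration.value`, which is also `stop_iteration.args[0]`). The coroutine protocol is described in [PEP0342](https://www.python.org/dev/peps/pep-0342/); returning a value from a generator was added later by [PEP0380](https://www.python.org/dev/peps/pep-0380/).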

## Automate the things

We are all lazy humans and can't stand doing repetitive things. In our model evaluation we followed pretty simple rules:
* asserting the name is not used
* checking if we should sample or place a specific value instead
* recording distributions and samples

The next step is to make a function that does all this for us. In this tutorial we keep it dead simple.

In [26]:
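def interact(gen, state):
    # `gen` is a zero-argument callable that returns a fresh model generator
    control_flow = gen()
    return_value = None
    while True:
        try:
            # resume the model with the previous value, get the next distribution
            dist = control_flow.send(return_value)
            if dist.name in state["dists"]:
                # raise the error inside the user's model, at the offending yield
                control_flow.throw(RuntimeError(
                    "We found duplicate names in your cool model: {}, "
                    "so far we have other variables in the model, {}".format(
                        dist.name, set(state["dists"].keys()), 
                    )
                ))
            if dist.name in state["samples"]:
                # a value was provided up front: use it instead of sampling
                return_value = state["samples"][dist.name]
            else:
                return_value = dist.sample()
                state["samples"][dist.name] = return_value
            state["dists"][dist.name] = dist
        except StopIteration as e:
            # the model reached `return`; its value is attached to the exception
            if e.args:
                return_value = e.args[0]
            else:
                return_value = None
            break
    return return_value, state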

This implementation assumes a generator function that takes no arguments, which keeps things simple; that is why the model call is wrapped in a `lambda` below.

In [27]:
preds, state = interact(lambda: model(tf.random.normal((3, 10))), state=dict(dists=dict(), samples=dict()))

In [28]:
state

{'dists': {'HalfCauchy/': <tfp.distributions.HalfCauchy 'HalfCauchy/' batch_shape=[] event_shape=[] dtype=float32>,
  'Normal/': <tfp.distributions.Normal 'Normal/' batch_shape=[10] event_shape=[] dtype=float32>,
  'Normal_1/': <tfp.distributions.Normal 'Normal_1/' batch_shape=[3] event_shape=[] dtype=float32>},
 'samples': {'HalfCauchy/': <tf.Tensor: id=129, shape=(), dtype=float32, numpy=3.9818556>,
  'Normal/': <tf.Tensor: id=155, shape=(10,), dtype=float32, numpy=
  array([-2.4645033 , -1.9799898 ,  0.99812627, -0.17554197, -1.1550732 ,
         -1.5893505 , -1.2463187 , -0.3275891 ,  0.7519032 , -1.1133935 ],
        dtype=float32)>,
  'Normal_1/': <tf.Tensor: id=181, shape=(3,), dtype=float32, numpy=array([  6.8296857,   5.369957 , -11.437586 ], dtype=float32)>}}

In [29]:
preds

<tf.Tensor: id=181, shape=(3,), dtype=float32, numpy=array([  6.8296857,   5.369957 , -11.437586 ], dtype=float32)>

We get everything as expected. To calculate `logp` you just iterate over the distributions and match them with the corresponding values. But let's dive deeper.
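
Before moving on, here is a minimal sketch of that `logp` loop, assuming a state dictionary shaped like the one above. To stay runnable without TensorFlow it uses a hypothetical `ToyNormal` stand-in and a hypothetical `joint_logp` helper; with `tfd` distributions the same loop would call `dist.log_prob(value)` and sum the result over the batch with `tf.reduce_sum`.

```python
import math


class ToyNormal:
    """Hypothetical scalar stand-in for `tfd.Normal` (not part of any library)."""

    def __init__(self, loc, scale, name):
        self.loc, self.scale, self.name = loc, scale, name

    def log_prob(self, value):
        # log-density of a normal distribution at `value`
        z = (value - self.loc) / self.scale
        return -0.5 * z * z - math.log(self.scale) - 0.5 * math.log(2.0 * math.pi)


def joint_logp(state):
    """Sum the log-density of every recorded distribution at its recorded value."""
    total = 0.0
    for name, dist in state["dists"].items():
        total += dist.log_prob(state["samples"][name])
    return total


toy_state = {
    "dists": {"a/": ToyNormal(0.0, 1.0, "a/"), "b/": ToyNormal(1.0, 2.0, "b/")},
    "samples": {"a/": 0.0, "b/": 1.0},
}
logp = joint_logp(toy_state)
```

Both values sit at the mean of their distribution, so the result reduces to `-log(4 * pi)`.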

## One level deeper

Recall the motivating example from [PR#125](https://github.com/pymc-devs/pymc4/pull/125)

In [30]:
def Horseshoe(mu=0, tau=1., s=1., name=None):
    with tf.name_scope(name):
        scale = yield tfd.HalfCauchy(0, s, name="scale")
        noise = yield tfd.Normal(0, tau, name="noise")
        return scale * noise + mu


def linreg(x):
    scale = yield tfd.HalfCauchy(0, 1, name="scale")
    coefs = yield Horseshoe(tf.zeros(x.shape[1]), name="coefs")
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale, name="predictions")
    return predictions

In [31]:
preds, state = interact(lambda: linreg(tf.random.normal((3, 10))), state=dict(dists=dict(), samples=dict()))

AttributeError: 'generator' object has no attribute 'name'

Oops, we got an `AttributeError`. What we want is a nested model, but a nested model is not a distribution. Since our model is itself a generator function, the return value of `Horseshoe(tf.zeros(x.shape[1]), name="coefs")` is a generator, and of course a generator has no `name` attribute. Okay, we can ask the user to use the `yield from` construction to delegate to the nested generator.
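
For reference, this is what `yield from` does in plain Python: it passes the inner generator's yields out to the driver, passes the driver's `send` values in, and evaluates to the inner generator's return value. The `inner` and `outer` generators are made-up examples.

```python
def inner():
    a = yield "inner asks for a"
    b = yield "inner asks for b"
    return a * b


def outer():
    # The driver talks to `inner` through `outer` without noticing the nesting;
    # the whole expression evaluates to what `inner` returns.
    product = yield from inner()
    c = yield "outer asks for c"
    return product + c


g = outer()
requests = [g.send(None)]   # first request comes from `inner`
requests.append(g.send(2))  # a = 2
requests.append(g.send(3))  # b = 3, `inner` returns 6, `outer` asks for c
try:
    g.send(4)               # c = 4, `outer` returns 6 + 4
except StopIteration as stop:
    result = stop.value
```

From the driver's side the three requests arrive as one flat stream, which is why the plain `interact` works with a model written this way.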

In [32]:
def linreg_ugly(x):
    scale = yield tfd.HalfCauchy(0, 1, name="scale")
    coefs = yield from Horseshoe(tf.zeros(x.shape[1]), name="coefs")
    predictions = yield tfd.Normal(tf.linalg.matvec(x, coefs), scale, name="predictions")
    return predictions

In [33]:
preds, state = interact(lambda: linreg_ugly(tf.random.normal((3, 10))), state=dict(dists=dict(), samples=dict()))

Okay, we passed this thing

In [34]:
state["dists"]

{'scale/': <tfp.distributions.HalfCauchy 'scale/' batch_shape=[] event_shape=[] dtype=float32>,
 'coefs/scale/': <tfp.distributions.HalfCauchy 'coefs/scale/' batch_shape=[] event_shape=[] dtype=float32>,
 'coefs/noise/': <tfp.distributions.Normal 'coefs/noise/' batch_shape=[] event_shape=[] dtype=float32>,
 'predictions/': <tfp.distributions.Normal 'predictions/' batch_shape=[3] event_shape=[] dtype=float32>}

We got nested models working, but it requires `yield from`. This is UGLY, you say, and I agree. This harsh world is too verbose, so let's make it a bit smoother. We can rewrite our `interact` function to accept nested models in just a few lines.

In [35]:
import types
def interact_nested(gen, state):
    # for now we should check input type
    if not isinstance(gen, types.GeneratorType):
        control_flow = gen()
    else:
        control_flow = gen

    return_value = None
    while True:
        try:
            dist = control_flow.send(return_value)
            # this makes nested models possible
            if isinstance(dist, types.GeneratorType):
                return_value, state = interact_nested(dist, state)
                # ^ in a few lines of code, go recursive
            else:
                if dist.name in state["dists"]:
                    # report the name of the distribution we just received
                    control_flow.throw(RuntimeError(
                        "We found duplicate names in your cool model: {}, "
                        "so far we have other variables in the model, {}".format(
                            dist.name, set(state["dists"].keys()), 
                        )
                    ))
                if dist.name in state["samples"]:
                    return_value = state["samples"][dist.name]
                else:
                    return_value = dist.sample()
                    state["samples"][dist.name] = return_value
                state["dists"][dist.name] = dist
        except StopIteration as e:
            if e.args:
                return_value = e.args[0]
            else:
                return_value = None
            break
    return return_value, state

Remember, we had problems here:
```python
preds, state = interact(lambda: linreg(tf.random.normal((3, 10))), state=dict(dists=dict(), samples=dict()))
```

Additionally we can specify the observed variable

In [36]:
preds, state = interact_nested(lambda: linreg(tf.random.normal((3, 10))), state=dict(dists=dict(), samples={"predictions/":tf.zeros(3)}))

In [37]:
state["dists"]

{'scale/': <tfp.distributions.HalfCauchy 'scale/' batch_shape=[] event_shape=[] dtype=float32>,
 'coefs/scale/': <tfp.distributions.HalfCauchy 'coefs/scale/' batch_shape=[] event_shape=[] dtype=float32>,
 'coefs/noise/': <tfp.distributions.Normal 'coefs/noise/' batch_shape=[] event_shape=[] dtype=float32>,
 'predictions/': <tfp.distributions.Normal 'predictions/' batch_shape=[3] event_shape=[] dtype=float32>}

In [38]:
state["samples"]

{'predictions/': <tf.Tensor: id=349, shape=(3,), dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 'scale/': <tf.Tensor: id=384, shape=(), dtype=float32, numpy=1.5178704>,
 'coefs/scale/': <tf.Tensor: id=417, shape=(), dtype=float32, numpy=0.28995466>,
 'coefs/noise/': <tf.Tensor: id=441, shape=(), dtype=float32, numpy=-1.6426383>}

Cool, we've covered the central idea behind the PyMC4 core engine. There is some extra work to do to make `interact_nested` really powerful:

* resolve transforms
* resolve reparametrizations
* variational inference
* better error messages
* lazy returns in posterior predictive mode

Some of this functionality may be found in the corresponding [PR#125](https://github.com/pymc-devs/pymc4/pull/125)