In [1]:
%pylab inline

from pyro.contrib.brm.design import RealValued, Categorical
from pyro.contrib.brm.family import Normal, HalfNormal
from pyro.contrib.brm.model import model_repr
from pyro.contrib.brm.fit import marginals
from pyro.contrib.brm.priors import Prior

from pyro.contrib.brm.oed import SequentialOED, design_space_cols, design_space

Populating the interactive namespace from numpy and matplotlib


### How do we handle participants with OED + brmp?

Suppose we know ahead of time that we have a fixed set of three participants, each of whom might be shown one of two stimuli. We might then want to do OED with the following model:

In [2]:
oed = SequentialOED(
    'y ~ stimulus + (stimulus | participant)',
    cols=[
        RealValued('y'),
        Categorical('stimulus', ['a','b']),
        Categorical('participant', ['x','y','z'])
    ],
    priors=[
    ]
)

For reference, here's some information about the model this defines:

In [3]:
print(model_repr(oed.model_desc))

Population
----------------------------------------
Coef Priors:
stimulus[a]     | Cauchy(loc=0.0, scale=1.0)
stimulus[b]     | Cauchy(loc=0.0, scale=1.0)
Group 0
----------------------------------------
Factors: participant
Num Levels: 3
Corr. Prior: LKJ(eta=1.0)
S.D. Priors:
stimulus[a]     | HalfCauchy(scale=3.0)
stimulus[b]     | HalfCauchy(scale=3.0)
Response
----------------------------------------
Family: Normal()
Link:
  Parameter: mu
  Function:  identity
Priors:
sigma           | HalfCauchy(scale=3.0)


By default, the system (currently) takes as the design space the full Cartesian product of the levels of all factors (categorical columns) in the model:

In [4]:
oed.design_space()

[('a', 'x'), ('a', 'y'), ('a', 'z'), ('b', 'x'), ('b', 'y'), ('b', 'z')]
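To make the "full Cartesian product" concrete, here is a minimal sketch in plain Python that reproduces the list above from the two factors. It is not the library's implementation: the `factors` dictionary and the `full_design_space` helper are hypothetical names introduced only for illustration.

```python
from itertools import product

# Hypothetical stand-in for the column metadata: factor name -> levels.
# The response column `y` is not a factor, so it does not appear here.
factors = {
    'stimulus': ['a', 'b'],
    'participant': ['x', 'y', 'z'],
}

def full_design_space(factors):
    """Return every combination of levels, one tuple per candidate design."""
    return list(product(*factors.values()))

print(full_design_space(factors))
# [('a', 'x'), ('a', 'y'), ('a', 'z'), ('b', 'x'), ('b', 'y'), ('b', 'z')]
```

The size of this space is the product of the number of levels per factor (here 2 × 3 = 6), so it grows quickly as factors or levels are added.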

<i>Aside: similarly, if the model were to include an interaction, e.g. `y ~ participant:stimulus`, then the model would be set up assuming that all combinations of these factors can occur. Something different happens in the typical use of brmp, where a concrete dataset is given rather than a description of the factors/columns. In that case, the model is set up to handle (i.e. has coefs. for) only those combinations present in the data.</i>

... and asking for the next trial searches over this entire space:

In [5]:
oed.next_trial()[0:3]

(('b', 'z'),
 5,
 [(('a', 'x'), -1.1991877555847168),
  (('a', 'y'), -1.1910467147827148),
  (('a', 'z'), -1.1890052556991577),
  (('b', 'x'), -1.1875406503677368),
  (('b', 'y'), -1.1943681240081787),
  (('b', 'z'), -1.1701955795288086)])
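In the output above, the first two elements of the result appear to be the selected design and its index into the design space, and the third is a list of per-design values. The notebook does not say how the selection is made; the sketch below only checks that the output is consistent with picking the design with the largest value, which is an assumption on my part rather than a documented property of `next_trial`.

```python
# Per-design values copied verbatim from the output above.
scores = [
    (('a', 'x'), -1.1991877555847168),
    (('a', 'y'), -1.1910467147827148),
    (('a', 'z'), -1.1890052556991577),
    (('b', 'x'), -1.1875406503677368),
    (('b', 'y'), -1.1943681240081787),
    (('b', 'z'), -1.1701955795288086),
]

# Assumption: the chosen design is the arg max over the reported values.
best_index, (best_design, best_score) = max(
    enumerate(scores), key=lambda item: item[1][1]
)

print(best_design, best_index)
# ('b', 'z') 5
```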

This is OK, but it won't always be what we want. For example, we might want to know which stimulus we ought to present to a particular participant. What might the interface for that be? One possibility is to have `next_trial` take arguments that specify subsets of levels (for zero or more columns) from which the product is built, thereby restricting the search. For example, to restrict the search to only those designs involving participant `x` we would use:

In [6]:
oed.next_trial(participant=['x'])[0:3]

(('b', 'x'),
 1,
 [(('a', 'x'), -1.1555356979370117), (('b', 'x'), -1.147426724433899)])
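The restriction described above can be sketched in plain Python as a product over per-column subsets, where any column without an explicit subset keeps all of its levels. This is an illustration of the proposed interface, not the library's code: `factors` and `restricted_design_space` are hypothetical names.

```python
from itertools import product

# Hypothetical stand-in for the column metadata: factor name -> levels.
factors = {
    'stimulus': ['a', 'b'],
    'participant': ['x', 'y', 'z'],
}

def restricted_design_space(factors, **restrictions):
    """Build the design space from a subset of levels for zero or more factors."""
    for name, levels in restrictions.items():
        if name not in factors:
            raise ValueError(f'unknown factor: {name!r}')
        unknown = set(levels) - set(factors[name])
        if unknown:
            raise ValueError(f'unknown levels for {name!r}: {sorted(unknown)}')
    # Use the requested subset where given, otherwise all levels of the factor.
    subsets = [restrictions.get(name, levels) for name, levels in factors.items()]
    return list(product(*subsets))

print(restricted_design_space(factors, participant=['x']))
# [('a', 'x'), ('b', 'x')]
```

Calling the helper with no keyword arguments falls back to the full product, so the unrestricted search is the special case of zero restrictions.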

### Questions

* Is this a reasonable approach?
* Does it handle the cases we're interested in? If not, what else is required?
* What if we don't know the number of participants ahead of time? In general, changing the metadata (such as adding extra levels to a categorical column) will change the model, which might be a messy thing to do during OED. For example, whether a particular `priors` specification is valid depends on the metadata. On the other hand, this particular move (adding a level to a *grouping* column) has a limited effect on the model, and might be something that could be accommodated on the fly.