ADVI, NUTS and Metropolis produce significantly different results #1163

@kayhan-batmanghelich

Description

Dear PyMC developers,

I am trying to build a simple mixed-effects model using pymc3 (see the code below). I have tried NUTS, ADVI, and Metropolis for inference. Aside from the variations in speed for this model (time: ADVI ≈ Metropolis << NUTS), the results are significantly different. To be more specific, I am interested in estimating $h^2$, which is the proportion of variance explained by the random effect.
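For clarity, the model in the code below can be written as (my notation, matching the variable names in the snippet):

$$
y = \beta_0 + L z + X w + \varepsilon,\qquad
z \sim \mathcal{N}\!\left(0,\; h^2 \sigma^2 I_N\right),\qquad
\varepsilon \sim \mathcal{N}\!\left(0,\; (1-h^2)\,\sigma^2 I_N\right),
$$

so that, conditional on $\beta_0$ and $w$, $\operatorname{Var}(y) = \sigma^2\left(h^2 L L^\top + (1-h^2) I_N\right)$ and $h^2$ is the fraction of the total variance $\sigma^2$ attributed to the random effect.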

This is the code:

import numpy as np
import pymc3 as pm
import theano.tensor as T

# N: number of samples, M: number of fixed effects,
# L: (N, N) factor of the random-effect covariance,
# X: (N, M) design matrix, y: (N, 1) observations -- all assumed already defined.
with pm.Model() as mixedEffect_model:
    ### hyperpriors
    tau = pm.Gamma('tau', 1.e-3, 1.e-3)
    sigma2 = 1.0 / tau
    h2 = pm.Uniform('h2')
    beta_0 = pm.Uniform('beta_0', lower=-1000, upper=1000)  # a replacement for an improper prior
    w = pm.Uniform('w', lower=-1000, upper=1000, shape=(M, 1))  # a replacement for an improper prior

    # random effect: z ~ N(0, h2 * sigma2), mapped through L
    z = pm.Normal('z', mu=0., sd=(h2 * sigma2) ** 0.5, shape=(N, 1))
    g = T.dot(L, z)

    # likelihood; named y_obs so it does not shadow the data array y
    y_obs = pm.Normal('y', mu=beta_0 + g + T.dot(X, w),
                      sd=((1 - h2) * sigma2) ** 0.5,
                      shape=(N, 1), observed=y)

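The snippet above assumes `N`, `M`, `L`, `X`, and `y` are already defined from the data. For a self-contained run, synthetic inputs could be generated along these lines (the shapes, the relatedness-style matrix `K`, and the true parameter values here are purely illustrative, not from my actual data):

```python
import numpy as np

rng = np.random.RandomState(0)
N, M = 200, 3                      # illustrative sample size and number of fixed effects

A = rng.randn(N, N)
K = A @ A.T / N                    # a positive-definite "relatedness" covariance
L = np.linalg.cholesky(K)          # factor so that L @ L.T == K

X = rng.randn(N, M)                # fixed-effect covariates
w_true = rng.randn(M, 1)
h2_true, sigma2_true = 0.5, 1.0

g = L @ rng.randn(N, 1) * np.sqrt(h2_true * sigma2_true)      # random effect
eps = rng.randn(N, 1) * np.sqrt((1 - h2_true) * sigma2_true)  # residual noise
y = 1.0 + X @ w_true + g + eps                                # intercept beta_0 = 1
```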
ADVI:

with mixedEffect_model:
    v_params = pm.variational.advi(n=5000)

with mixedEffect_model:
    trace = pm.variational.sample_vp(v_params, draws=5000)

import matplotlib.pyplot as plt
_ = plt.hist(trace['h2'], bins=100)

[figure: histogram of trace['h2'] under ADVI]

Metropolis:

with mixedEffect_model:
    start = pm.find_MAP()
    step = pm.Metropolis()
    trace = pm.sample(5000, step, start=start)

_ = plt.hist(trace['h2'], bins=100)

[figure: histogram of trace['h2'] under Metropolis]

NUTS (slow):

with mixedEffect_model:
    start = pm.find_MAP()
    step = pm.NUTS()
    trace = pm.sample(5000, step, start=start)

_ = plt.hist(trace['h2'], bins=100)

[figure: histogram of trace['h2'] under NUTS]

Given that w and beta_0 are treated as parameters with (approximately) improper priors, I expect the results to be very close to the maximum likelihood estimate. Using maximum likelihood (more specifically, Restricted Maximum Likelihood, REML), I obtain values around 0.5 for $h^2$ with a standard error of about 0.07. The results above are very different, both from the ML estimate and across the different inference engines. NUTS is closest to ML, but given that the results are so drastically different, I don't know how to explain it.
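For a concrete comparison with the REML numbers, the posterior summary I am looking at can be computed from the samples like this (the array below is a stand-in; in the real run one would use `h2_samples = trace['h2']`):

```python
import numpy as np

# stand-in for trace['h2']; in practice: h2_samples = trace['h2']
h2_samples = np.random.RandomState(1).beta(20, 20, size=5000)

post_mean = h2_samples.mean()
post_sd = h2_samples.std()
ci_low, ci_high = np.percentile(h2_samples, [2.5, 97.5])

print("posterior mean %.3f, sd %.3f, 95%% interval [%.3f, %.3f]"
      % (post_mean, post_sd, ci_low, ci_high))
```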

Any idea?
