rng_seed appears to no effect in map2stan() #47

kmiddleton · 2016-08-30T13:44:57Z

Setting rng_seed does not appear to ensure repeatable analyses with map2stan(). For example, a simple model from Rethinking:

library(rethinking)
data("Howell1")

d <- Howell1
d2 <- d[ d$age >= 18 , ]

m1 <- map2stan(
  alist(
    height ~ dnorm(mu, sigma),
    mu ~ dnorm(178, 20),
    sigma ~ dunif(0, 50)),
  data = d2,
  WAIC = FALSE,
  seed = 8
)

m2 <- map2stan(
  alist(
    height ~ dnorm(mu, sigma),
    mu ~ dnorm(178, 20),
    sigma ~ dunif(0, 50)),
  data = d2,
  WAIC = FALSE,
  seed = 8
)

m1
m2

Produce slightly different results.

> m1
map2stan model fit
1000 samples from 1 chain

Formula:
height ~ dnorm(mu, sigma)
mu ~ dnorm(178, 20)
sigma ~ dunif(0, 50)

Log-likelihood at expected values: -1219.41 
Deviance: 2438.83 
DIC: 2442.6 
Effective number of parameters (pD): 1.89 

> m2
map2stan model fit
1000 samples from 1 chain

Formula:
height ~ dnorm(mu, sigma)
mu ~ dnorm(178, 20)
sigma ~ dunif(0, 50)

Log-likelihood at expected values: -1219.41 
Deviance: 2438.82 
DIC: 2442.72 
Effective number of parameters (pD): 1.95

all.equal(m1, m2) show lots of differences, which seem to be fundamental rather than related to sampling (e.g., timing).:

 [1] "Attributes: < Component “coef”: Mean relative difference: 0.0001766267 >"                                                                                                                                                                                                                         
 [2] "Attributes: < Component “deviance”: Mean relative difference: 3.669022e-06 >"                                                                                                                                                                                                                     
 [3] "Attributes: < Component “DIC”: Mean relative difference: 6.030028e-05 >"                                                                                                                                                                                                                          
 [4] "Attributes: < Component “pD”: Mean relative difference: 0.03561425 >"       
...
[54] "Attributes: < Component “start”: Component 1: Component 1: Mean relative difference: 0.2771743 >"                                                                                                                                                                                                 
[55] "Attributes: < Component “start”: Component 1: Component 2: Mean relative difference: 0.1926305 >"                                                                                                                                                                                                 
[56] "Attributes: < Component “vcov”: Mean relative difference: 0.03782353 >"

These differences are admittedly small, but my thinking is that the models should be identical except for timing differences (the case when using seed = is a pure stan() call).

The text was updated successfully, but these errors were encountered:

rmcelreath · 2016-08-30T15:13:09Z

Yeah, this is a mysterious issue. The map2stan argument is rng_seed. It's passed to stan as seed but seems not to have an effect. No idea why, since passing a value in a raw stan call works. But I can't yet figure out a way that my code is the issue. You can even check the internal stanfit object and see that stan is saving the passed seed. It just doesn't appear to use it?

Maybe something will occur to me. In the meantime, just use set.seed before the map2stan call.

kmiddleton · 2016-08-30T17:29:59Z

It might be the init parameter in lines 1356 and 1358 of map2stan.R. The comment in this SO question suggests that seed only covers the RNG from the Boost library, and that init draws separately.

set.seed() coupled with set_rng does appear to work:

library(rethinking)
data("Howell1")

d <- Howell1
d2 <- d[ d$age >= 18 , ]

set.seed(5)

m1 <- map2stan(
  alist(
    height ~ dnorm(mu, sigma),
    mu ~ dnorm(178, 20),
    sigma ~ dunif(0, 50)),
  data = d2,
  WAIC = FALSE,
  rng_seed = 8
)

set.seed(5)

m2 <- map2stan(
  alist(
    height ~ dnorm(mu, sigma),
    mu ~ dnorm(178, 20),
    sigma ~ dunif(0, 50)),
  data = d2,
  WAIC = FALSE,
  rng_seed = 8
)

m1
m2

all.equal(m1, m2)

The only differences are in the date, time, dso, and model_cpp.

jgabry · 2016-08-30T19:41:44Z

On Tuesday, August 30, 2016, Kevin Middleton notifications@github.com
wrote:

The comment in this SO question
http://stackoverflow.com/questions/27204096/supplying-seed-to-stan-doesnt-guarantee-the-same-chains
suggests that seed only covers the RNG from the Boost library, and that
init draws separately.

Yes, that's correct. If any initial value generation is being done in R
(whether by you or by passing a function to init) then you'll have to set
both the R seed and the seed for the Boost library.

rmcelreath · 2016-08-31T13:25:01Z

Thanks. Sounds like an easy fix, with this information. Will try to roll it into the next push.

rmcelreath · 2017-03-16T08:29:08Z

This is fixed in experimental branch.

rmcelreath added the bug label Aug 31, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rng_seed appears to no effect in map2stan() #47

rng_seed appears to no effect in map2stan() #47

kmiddleton commented Aug 30, 2016

rmcelreath commented Aug 30, 2016

kmiddleton commented Aug 30, 2016

jgabry commented Aug 30, 2016

rmcelreath commented Aug 31, 2016

rmcelreath commented Mar 16, 2017

rng_seed appears to no effect in map2stan() #47

rng_seed appears to no effect in map2stan() #47

Comments

kmiddleton commented Aug 30, 2016

rmcelreath commented Aug 30, 2016

kmiddleton commented Aug 30, 2016

jgabry commented Aug 30, 2016

rmcelreath commented Aug 31, 2016

rmcelreath commented Mar 16, 2017