Results from print() differ from monitor() for the same Stan fit #280

danielcfurr · 2016-03-17T00:29:47Z

Summary:

Results from print() differ from monitor() for the same Stan fit.

Description:

When a stanfit object is passed to monitor(), the results differ completely from those of print(). When an array of posterior draws from extract() is passed to monitor(), the results match print() except for n_eff.

Reproducible Steps:

# Example from stan() help file:
library(rstan)
scode <- "
parameters {
  real y[2]; 
} 
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
} 
"
fit1 <- stan(model_code = scode, iter = 10, verbose = FALSE) 
print(fit1, digits_summary = 3)

# Compare output from print() to monitor()

m1 <- monitor(fit1, digits_summary = 3)
m2 <- monitor(extract(fit1, permuted = FALSE, inc_warmup = TRUE),
              digits_summary = 3)

Current Output:

Inference for Stan model: 08aca439b1af079914fdcfd62fb992d8.
4 chains, each with iter=10; warmup=5; thin=1; 
post-warmup draws per chain=5, total post-warmup draws=20.

       mean se_mean    sd   2.5%    25%    50%    75%  97.5% n_eff  Rhat
y[1] -0.124   0.168 0.739 -1.213 -0.637 -0.075  0.236  1.092    19 1.124
y[2]  0.005   0.563 1.865 -3.222 -0.871  0.143  0.535  3.954    11 2.015
lp__ -0.903   0.190 0.849 -2.421 -1.348 -0.522 -0.239 -0.022    20 1.213

Samples were drawn using NUTS(diag_e) at Wed Mar 16 17:09:32 2016.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

VERSUS m1

Inference for the input samples (4 chains: each with iter=5; warmup=2):

       mean se_mean    sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
y[1]  0.005   0.182 0.629 -0.946 -0.348 -0.075  0.223  1.080    12   NA
y[2]  1.032   0.738 1.425  0.024  0.165  0.394  1.088  4.018     4   NA
lp__ -0.697   0.401 0.870 -2.473 -0.733 -0.326 -0.169 -0.015     5   NA

For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

VERSUS m2

Inference for the input samples (4 chains: each with iter=10; warmup=5):

       mean se_mean    sd   2.5%    25%    50%    75%  97.5% n_eff  Rhat
y[1] -0.124   0.166 0.739 -1.213 -0.637 -0.075  0.236  1.092    20 1.124
y[2]  0.005   0.661 1.865 -3.222 -0.871  0.143  0.535  3.954     8 2.015
lp__ -0.903   0.190 0.849 -2.421 -1.348 -0.522 -0.239 -0.022    20 1.213

For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

Expected Output:

If applicable, the output you expected from RStan.

RStan Version:

2.9.0.3

R Version:

R version 3.2.2 (2015-08-14)

Operating System:

Windows 7.1

bob-carpenter · 2016-03-17T00:31:53Z

Thanks for the very careful bug report --- we really appreciate it.

betanalpha · 2016-03-17T00:38:40Z

print() doesn’t include warmup iterations — are the outputs the same when you set inc_warmup = FALSE?

maverickg · 2016-03-17T00:55:45Z

Also, by default monitor() treats the first half of the iterations it is given as warmup (see the doc).
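The arithmetic behind the two monitor() headers in the report can be sketched in base R. This is only an illustration under assumptions: the draws are simulated rather than taken from a real fit, `default_warmup` is a hypothetical helper that mirrors the documented default `warmup = floor(dim(sims)[1]/2)`, and the m1 case assumes that only the 5 post-warmup draws per chain reach monitor() when a stanfit object is passed, which is what the "iter=5; warmup=2" header suggests.

```r
# Simulated draws standing in for a real fit:
# 10 iterations x 4 chains x 1 parameter (iterations 1-5 play the role of warmup).
set.seed(1)
sims_all <- array(rnorm(10 * 4), dim = c(10, 4, 1))

# Hypothetical helper mirroring monitor()'s documented default for `warmup`.
default_warmup <- function(sims) floor(dim(sims)[1] / 2)

# Case m2: the array still contains warmup, so the default drops the 5 true
# warmup iterations and keeps the 5 post-warmup draws per chain.
warmup_m2 <- default_warmup(sims_all)
kept_m2 <- dim(sims_all)[1] - warmup_m2

# Case m1 (assumed): only post-warmup draws are passed on, so the default
# drops 2 more iterations and only 3 draws per chain are summarized.
sims_post <- sims_all[6:10, , , drop = FALSE]
warmup_m1 <- default_warmup(sims_post)
kept_m1 <- dim(sims_post)[1] - warmup_m1
```

Under these assumptions m1 summarizes 3 draws per chain and m2 summarizes 5, which would explain why m1 disagrees with print() while m2 agrees on everything but n_eff.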

danielcfurr · 2016-03-17T01:08:49Z

If I amend the example like this:

m1 <- monitor(fit1, digits_summary = 3)
m2 <- monitor(extract(fit1, permuted = FALSE, inc_warmup = FALSE),
              digits_summary = 3)

then m1 and m2 agree.

But from a user perspective, the fact that print(fit1) and monitor(fit1) return different results is confusing behavior and doesn't leave me with a convenient way to programmatically access correct results (those matching print(fit1)).

bob-carpenter · 2016-03-17T01:28:51Z

I'm continuing on the issue, which isn't really appropriate,
but I don't want to bug users with this.

I agree with Daniel that this is confusing behavior.
But then I don't understand the purpose of monitor() --- when would
I use that rather than print?

Given that permuted=FALSE and inc_warmup=FALSE are defaults for
extract, then it's really just the extra extract() we're talking
about in terms of inconvenience, right?

danielcfurr · 2016-03-17T03:53:50Z

I use monitor() to get a matrix of posterior summary statistics, which most
often I use to build plots. This is the basis of the plots I've used in my
case studies for example. print() shows the same info but does not return a
matrix. (Unless I'm totally mistaken about print. I'm not at a computer to
check.)

You're right that the inconvenience amounts to just an additional line of
code, so it isn't a big deal for me to adjust. Thing is that I've been
using monitor() in this naive sort of way for a long time and just realized
the discrepancy. My experiences aren't necessarily universal, but I feel
like other users may easily make the same error.

Also, I apologize if I misused the issues tracker earlier with my
opinionated response. I don't really know much about the norms in software
development.

maverickg · 2016-03-18T00:31:07Z

This is NOT a bug.

I'm just pasting some doc as the answer to some of the questions raised in this "issue".

The doc of monitor has the following (note the argument warmup and its default value).

Usage:

     monitor(sims, warmup = floor(dim(sims)[1]/2),
               probs = c(0.025, 0.25, 0.5, 0.75, 0.975),
               digits_summary = 1, print = TRUE, ...)

In addition, the doc of print.stanfit has

See Also:

     S4 class ‘stanfit’ and particularly its method ‘summary’, which is
     used to obtain the values that are printed out.
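Following that pointer, here is a minimal sketch of two ways to get the summary as a matrix. The helper names `print_summary_matrix` and `monitor_post_warmup` are hypothetical, `fit` is assumed to be a stanfit object such as fit1 from the report, and rstan is assumed to be loaded. The first helper relies on the summary method named in the doc above. The second passes post-warmup draws to monitor() with `warmup = 0` so that nothing further is discarded; its n_eff may still differ from print(), as the m2 output in the report shows.

```r
# Assumes library(rstan) has been called and `fit` is a stanfit object.

# Matrix of the values shown by print(fit), taken from the summary method.
print_summary_matrix <- function(fit) {
  summary(fit)$summary
}

# monitor() on post-warmup draws only, with no further iterations discarded.
monitor_post_warmup <- function(fit, ...) {
  sims <- as.array(fit)  # iterations x chains x parameters, warmup excluded
  rstan::monitor(sims, warmup = 0, print = FALSE, ...)
}

# Usage (not run here, needs a fitted model):
# s <- print_summary_matrix(fit1)
# s[, c("mean", "n_eff", "Rhat")]
```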

bob-carpenter · 2016-03-18T00:49:08Z

As Jiqiang says, not a bug per se, so I'm closing this issue.

bob-carpenter added the bug label Mar 17, 2016

bob-carpenter added this to the 2.9.0++ milestone Mar 17, 2016

bob-carpenter closed this as completed Mar 18, 2016
