Results from print() differ from monitor() for the same Stan fit #280

Closed
danielcfurr opened this issue Mar 17, 2016 · 8 comments

@danielcfurr

Summary:

Results from print() differ from monitor() for the same Stan fit.

Description:

When a stanfit object is passed to monitor(), the results differ totally from print(). When an array of posterior draws from extract() is passed to monitor(), print() and monitor() show different results for n_eff but are otherwise identical.

Reproducible Steps:

# Example from stan() help file:
library(rstan)
scode <- "
parameters {
  real y[2]; 
} 
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
} 
"
fit1 <- stan(model_code = scode, iter = 10, verbose = FALSE) 
print(fit1, digits_summary = 3)

# Compare output from print() to monitor()

m1 <- monitor(fit1, digits_summary = 3)
m2 <- monitor(extract(fit1, permuted = FALSE, inc_warmup = TRUE),
              digits_summary = 3)

Current Output:

Inference for Stan model: 08aca439b1af079914fdcfd62fb992d8.
4 chains, each with iter=10; warmup=5; thin=1; 
post-warmup draws per chain=5, total post-warmup draws=20.

       mean se_mean    sd   2.5%    25%    50%    75%  97.5% n_eff  Rhat
y[1] -0.124   0.168 0.739 -1.213 -0.637 -0.075  0.236  1.092    19 1.124
y[2]  0.005   0.563 1.865 -3.222 -0.871  0.143  0.535  3.954    11 2.015
lp__ -0.903   0.190 0.849 -2.421 -1.348 -0.522 -0.239 -0.022    20 1.213

Samples were drawn using NUTS(diag_e) at Wed Mar 16 17:09:32 2016.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

VERSUS m1

Inference for the input samples (4 chains: each with iter=5; warmup=2):

       mean se_mean    sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
y[1]  0.005   0.182 0.629 -0.946 -0.348 -0.075  0.223  1.080    12   NA
y[2]  1.032   0.738 1.425  0.024  0.165  0.394  1.088  4.018     4   NA
lp__ -0.697   0.401 0.870 -2.473 -0.733 -0.326 -0.169 -0.015     5   NA

For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

VERSUS m2

Inference for the input samples (4 chains: each with iter=10; warmup=5):

       mean se_mean    sd   2.5%    25%    50%    75%  97.5% n_eff  Rhat
y[1] -0.124   0.166 0.739 -1.213 -0.637 -0.075  0.236  1.092    20 1.124
y[2]  0.005   0.661 1.865 -3.222 -0.871  0.143  0.535  3.954     8 2.015
lp__ -0.903   0.190 0.849 -2.421 -1.348 -0.522 -0.239 -0.022    20 1.213

For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

Expected Output:

If applicable, the output you expected from RStan.

RStan Version:

2.9.0.3

R Version:

R version 3.2.2 (2015-08-14)

Operating System:

Windows 7.1

@bob-carpenter bob-carpenter added this to the 2.9.0++ milestone Mar 17, 2016
@bob-carpenter

Thanks for the very careful bug report --- we really appreciate it.

@betanalpha
Contributor

print() doesn’t include warmup iterations — are the outputs the same when you set inc_warmup = FALSE?

@maverickg
Contributor

And by default, monitor considers the first half as warmup iterations (see
the doc).
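
To make the default concrete, here is a minimal base-R sketch of the warmup arithmetic, using the default expression from monitor()'s documented usage, floor(dim(sims)[1]/2). The arrays are hypothetical random stand-ins with the same dimensions as the draws in the report (10 or 5 iterations, 4 chains, 3 parameters), and default_warmup is an illustrative helper, not an rstan function.

```r
# Default warmup from monitor()'s documented usage:
#   warmup = floor(dim(sims)[1] / 2)
# default_warmup() is a hypothetical helper that only mirrors that expression.
default_warmup <- function(sims) floor(dim(sims)[1] / 2)

# Stand-in arrays (iterations x chains x parameters) filled with random noise.
set.seed(1)
with_warmup <- array(rnorm(10 * 4 * 3), dim = c(10, 4, 3))  # like inc_warmup = TRUE
post_warmup <- array(rnorm(5 * 4 * 3),  dim = c(5, 4, 3))   # like inc_warmup = FALSE

w_full <- default_warmup(with_warmup)  # 5
w_post <- default_warmup(post_warmup)  # 2

# Draws per chain that remain after the default warmup is discarded.
kept_full <- dim(with_warmup)[1] - w_full  # 5
kept_post <- dim(post_warmup)[1] - w_post  # 3
```

This is consistent with the headers in the report: iter=10; warmup=5 when the warmup draws are included, and iter=5; warmup=2 when monitor() is given only the 5 post-warmup draws per chain and then discards 2 more.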

@danielcfurr
Author

If I amend the example like this:

m1 <- monitor(fit1, digits_summary = 3)
m2 <- monitor(extract(fit1, permuted = FALSE, inc_warmup = FALSE),
              digits_summary = 3)

then m1 and m2 agree.

But from a user perspective, the fact that print(fit1) and monitor(fit1) return different results is confusing behavior and doesn't leave me with a convenient way to programmatically access correct results (those matching print(fit1)).

@bob-carpenter

I'm continuing on the issue, which isn't really appropriate,
but I don't want to bug users with this.

I agree with Daniel that this is confusing behavior.
But then I don't understand the purpose of monitor() --- when would
I use that rather than print?

Given that permuted=FALSE and inc_warmup=FALSE are defaults for
extract, then it's really just the extra extract() we're talking
about in terms of inconvenience, right?

@danielcfurr
Author

I use monitor() to get a matrix of posterior summary statistics, which most
often I use to build plots. This is the basis of the plots I've used in my
case studies for example. print() shows the same info but does not return a
matrix. (Unless I'm totally mistaken about print. I'm not at a computer to
check.)

You're right that the inconvenience amounts to just an additional line of
code, so it isn't a big deal for me to adjust. Thing is that I've been
using monitor() in this naive sort of way for a long time and just realized
the discrepancy. My experiences aren't necessarily universal, but I feel
like other users may easily make the same error.

Also, I apologize if I misused the issues tracker earlier with my
opinionated response. I don't really know much about the norms in software
development.

@maverickg
Contributor

This is NOT a bug.

I'm just pasting some of the doc to answer the questions raised in this "issue".

The doc for monitor has the following (note the warmup argument and its default value).

Usage:

     monitor(sims, warmup = floor(dim(sims)[1]/2),
               probs = c(0.025, 0.25, 0.5, 0.75, 0.975),
               digits_summary = 1, print = TRUE, ...)

In addition, the doc for print.stanfit has the following.

See Also:

     S4 class ‘stanfit’ and particularly its method ‘summary’, which is
     used to obtain the values that are printed out.
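
Following that pointer, here is a hedged sketch of two ways to get a matrix that should correspond to what print() shows. It assumes a stanfit object such as fit1 from the original report. The helper names printed_summary and monitor_post_warmup are hypothetical. Passing warmup = 0 is an assumption about how to stop monitor() from discarding half of draws that are already post-warmup; it is based on the warmup argument in the usage above.

```r
library(rstan)

# Hypothetical helper: the matrix of values that print() displays,
# taken from the stanfit summary method referenced in the doc.
printed_summary <- function(fit) {
  summary(fit)$summary
}

# Hypothetical helper: run monitor() on post-warmup draws only and tell it
# not to drop any further iterations (warmup = 0 is an assumption here).
monitor_post_warmup <- function(fit, digits = 3) {
  draws <- extract(fit, permuted = FALSE, inc_warmup = FALSE)
  monitor(draws, warmup = 0, digits_summary = digits, print = FALSE)
}
```

With the fit from the report, printed_summary(fit1) and monitor_post_warmup(fit1) could then be compared column by column. Per the report, n_eff may still differ slightly between the two.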

@bob-carpenter

As Jiqiang says, not a bug per se, so I'm closing this issue.
