Control precision of sampling ASCII output #2515

bob-carpenter · 2018-04-22T20:06:27Z

From @aaronjg on April 20, 2018 20:19

Summary:

When draws are written to the sample file, precision is lost. Values read back in with read_stan_csv should be identical to those stored in the rstan object.

Description:

When rstan writes to a file it keeps only about 6 significant digits, which causes the sample file to differ from what is stored in the rstan object.

Reproducible Steps:

stan.model <- compile_model("bernoulli.stan")
source("bernoulli.data.R")
out <- sampling(stan.model,chains=1,sample_file="out.csv",seed=1)

Current Output:

extract(out,permute=FALSE,inc_warmup=TRUE)[1:10,,]

          parameters
iterations      theta      lp__
      [1,] 0.10831226 -7.699964
      [2,] 0.10831226 -7.699964
      [3,] 0.10831226 -7.699964
      [4,] 0.10719339 -7.719830
      [5,] 0.08838579 -8.110978
      [6,] 0.23460585 -6.755824
      [7,] 0.22917849 -6.762448
      [8,] 0.17383842 -6.967571
      [9,] 0.17383842 -6.967571
     [10,] 0.19948957 -6.838531

head -n 35 out.csv| tail

lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
-7.69996,0.821754,1,2,3,0,7.95282,0.108312
-7.69996,1.87223e-146,10.4034,1,1,0,9.9209,0.108312
-7.69996,0.0719279,1.59718,1,1,0,7.70735,0.108312
-7.71983,0.999813,0.180632,1,1,0,7.72391,0.107193
-8.11098,0.997778,0.23924,3,7,0,8.11104,0.0883858
-6.75582,0.995139,0.366777,2,5,0,8.47479,0.234606
-6.76245,0.978753,0.609755,2,3,0,6.91575,0.229178
-6.96757,0.901618,1.01542,1,1,0,6.9752,0.173838
-6.96757,0.00030679,1.36694,1,1,0,7.87545,0.173838
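The difference between the two listings is consistent with each value being written with six significant digits rather than six decimal places. A minimal Python sketch, assuming the CSV writer behaves like a `%g`-style format with precision 6 (the helper name `six_sig` is hypothetical; the input values are copied from the `extract` output above):

```python
# theta and lp__ values as printed by extract() in the issue.
in_memory = [0.10831226, -7.699964, 0.08838579, -8.110978, 0.23460585]


def six_sig(x: float) -> str:
    """Format x with six significant digits (assumed to mimic the CSV writer)."""
    return format(x, ".6g")


# Each formatted value can be compared by eye with the rows of out.csv.
csv_like = [six_sig(x) for x in in_memory]
print(csv_like)
```

Under that assumption the formatted strings line up with the corresponding entries in the sample file, including `0.0883858`, which has seven decimal places but six significant digits.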

Expected Output:


File written should have the same values as the extract command.

RStan Version:

Compiled from: 4706b82028a7fc3a31cbdf6c60beed4c49233562

R Version:

"R version 3.4.4 (2018-03-15)"

Operating System:

Ubuntu 14.04

Copied from original issue: stan-dev/rstan#518


bob-carpenter · 2018-04-22T20:06:28Z

From @bgoodri on April 20, 2018 20:25

This is a Stan thing rather than an RStan one, and I believe it is intentional.

bob-carpenter · 2018-04-22T20:06:28Z

We could double the file size and clog up I/O for that extra precision, but most computations don't retain much more precision than what we already write out. So even though floating point gives you about 16 digits of precision, after sampling the values are usually not that accurate.

Ideally, we'd have a feature to control the precision.
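As a rough illustration of the size trade-off mentioned here, the sketch below compares the text length of one double written with 6 versus 17 significant digits. It is plain Python formatting, not Stan's writer, and the value `1/3` is chosen only for illustration:

```python
x = 1.0 / 3.0

short = format(x, ".6g")   # six significant digits, as in the sample file
full = format(x, ".17g")   # enough digits to recover the double exactly

# Character counts give a feel for how much each column would grow.
print(len(short), len(full))

# Only the longer form reads back to the identical double.
print(float(short) == x, float(full) == x)
```

Integer-valued columns such as `treedepth__` would not grow, so the overall increase in file size depends on the mix of columns.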

bob-carpenter · 2018-04-22T20:06:29Z

I'm going to move this to being a Stan feature request. My guess is that we'll wind up providing a binary output format before fixing it, though you never know. It should be easy to extend the precision; it's just a matter of how to control it in the calls.

aaronjg · 2018-04-22T20:27:27Z

I don't particularly expect the extra precision to add much to the inference. However, as I was moving from keeping the results in memory to streaming to a file and loading them back in, I was expecting identical results and had some tests fail because of it. Having a binary output format seems ideal.

jgabry · 2018-04-22T21:27:13Z

Yeah this would be nice but at least it should be deterministic currently, so a tolerance level for the tests will work reliably.

If it’s not documented anywhere we should do that too.

bob-carpenter · 2018-04-23T14:50:02Z

The outputs aren't random now; the precision is just limited, either hard coded into the I/O or taken by default (I don't even know which). I understand that this level of consistency with round-trip I/O would be nice, but that's usually too much to ask of floating point. We could get more precision, though I don't know if round trips are possible. The usual recommendation is to never compare floating point values other than to within a known precision.
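A minimal Python sketch of the tolerance-based comparison recommended here, using one draw from the issue. The helper `matches_written_precision` and its default tolerance are assumptions for illustration, not part of Stan or RStan. The last check relies on a general property of IEEE 754 doubles: 17 significant decimal digits are enough to read back the identical value.

```python
import math

in_memory = 0.10831226         # theta as held in the fitted object
from_csv = float("0.108312")   # the same draw as written to out.csv


def matches_written_precision(a: float, b: float, sig_figs: int = 6) -> bool:
    """Compare two values up to the precision that survives the text output."""
    # Rounding to n significant digits changes a value by at most
    # 0.5 * 10**(1 - n) in relative terms; allow twice that as margin.
    return math.isclose(a, b, rel_tol=10.0 ** (1 - sig_figs))


exact_equal = in_memory == from_csv
close_enough = matches_written_precision(in_memory, from_csv)
print(exact_equal, close_enough)

# With 17 significant digits the text form round-trips exactly.
x = 1.0 / 3.0
round_trips = float(format(x, ".17g")) == x
print(round_trips)
```

A test suite that streams draws to a file could use a check like this in place of exact equality.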

rok-cesnovar · 2021-02-02T11:52:20Z

This was added to cmdstan via the sig_figs argument for 2.25. Closing.

bob-carpenter mentioned this issue Apr 22, 2018

Precision Differs RStan and sample file stan-dev/rstan#518

Closed

bob-carpenter changed the title ~~Precision Differs RStan and sample file~~ Control precision of sampling ASCII output Apr 22, 2018

bob-carpenter added feature good first issue labels Apr 22, 2018

bob-carpenter added this to the v3 milestone Apr 22, 2018

This was referenced Jun 4, 2018

control CSV writer output precision #2245

Closed

Consolidate and refactor output from algorithms #2534

Open

alashworth mentioned this issue Mar 12, 2019

Control precision of sampling ASCII output alashworth/test-issue-import#191

Open

rok-cesnovar closed this as completed Feb 2, 2021
