Provide access to initial values #2950

jgabry · 2020-08-08T21:36:22Z

We've recently discussed that CmdStan has no way to give users access to the initial values it used.

It sounds like this is actually an io/services issue because the contents of a stan::io::var_context object would need to be dumped to text. Is that right?

I don't know how difficult this would be, but to me providing access to this information seems pretty important.

The most important information to convey to users, I think, would be the inits for parameters on the constrained scale. Transformed parameters, generated quantities, and unconstrained-scale values are potentially useful but not as high a priority.

bob-carpenter · 2020-08-09T19:44:03Z

We need to answer at least the following before building this:

1. Is this a new service argument callback or do we piggyback on the iteration writer using iteration = 0 or something like that? If the latter, what's the effect on existing interfaces? If the former, what's the callback structure (probably just like the iteration writer, but it should be stated)?
2. Do we output transformed parameters and generated quantities as part of the initialization? They're not given by the var_context, but by running write_array. I'm not sure if that happens as is, because we don't have any need for the transformed parameters or generated quantities of the initial state now. If we do add it, it's going to cause slightly different output than last iteration because the RNG will get advanced.
3. Is the output on the constrained or unconstrained scale? Presumably the former, but the feature request needs to make this clear.

I would suggest grabbing the values after the var_context is used to extract the values and the results fed through write_array. We can't read from the var_context twice because it can advance the RNG and we don't have an easy and generic way to set it back.

jgabry · 2020-08-10T18:18:43Z

> We need to answer at least the following before building this:
>
> Is this a new service argument callback or do we piggyback on the iteration writer using iteration = 0 or something like that? If the latter, what's the effect on existing interfaces? If the former, what's the callback structure (probably just like the iteration writer, but it should be stated)?

Good question. I don't know enough of the current implementation details.

> Do we output transformed parameters and generated quantities as part of the initialization? They're not given by the var_context, but by running write_array. I'm not sure if that happens as is, because we don't have any need for the transformed parameters or generated quantities of the initial state now. If we do add it, it's going to cause slightly different output than last iteration because the RNG will get advanced.

If we have to just do the parameters to avoid RNG issues then I think that's better than nothing. The parameters are of greatest interest here anyway.

> Is the output on the constrained or unconstrained scale? Presumably the former, but the feature request needs to make this clear.

Good point. Yes, constrained scale. I updated the initial post.

bob-carpenter · 2020-08-10T18:54:46Z

For new callback vs. using existing one, you could trace down how RStan uses the iterations.

The RNG issue just means we can't call the var_context to print and then call it again for sampling because the RNG state won't match. Similarly, we can generate transformed parameters and generated quantities as long as we only do it once. Those only come out on the constrained scale.
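
The "only do it once" constraint can be illustrated with a small standalone sketch. It does not use Stan's var_context or Stan's RNG: `draw_inits`, `init_pair`, `read_twice`, and `read_once` are hypothetical names, and `std::mt19937` with uniform(-2, 2) draws is an assumption made only to show why values read for printing must be the same values handed to the sampler.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for reading random inits; every call advances rng.
std::vector<double> draw_inits(std::mt19937& rng, std::size_t n) {
  std::uniform_real_distribution<double> unif(-2.0, 2.0);
  std::vector<double> inits(n);
  for (double& x : inits) x = unif(rng);
  return inits;
}

struct init_pair {
  std::vector<double> reported;  // what the user would be shown
  std::vector<double> used;      // what the sampler would start from
};

// Anti-pattern: read once to print, then read again to sample.
init_pair read_twice(unsigned seed, std::size_t n) {
  std::mt19937 rng(seed);
  init_pair p;
  p.reported = draw_inits(rng, n);
  p.used = draw_inits(rng, n);  // rng has advanced, so these values differ
  return p;
}

// Read once and give the same values to both consumers.
init_pair read_once(unsigned seed, std::size_t n) {
  std::mt19937 rng(seed);
  std::vector<double> inits = draw_inits(rng, n);
  return init_pair{inits, inits};
}
```

With `read_twice` the reported values no longer describe where sampling started; with `read_once` they do, and the RNG is advanced exactly as it would be without any reporting.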

mitzimorris · 2020-08-10T21:46:17Z

> I would suggest grabbing the values after the var_context is used to extract the values and the results fed through write_array. We can't read from the var_context twice because it can advance the RNG and we don't have an easy and generic way to set it back.

This is doable given the instantiated model and the var_context for the initial parameters. It would be very similar to how the standalone generated quantities method works.

syclik · 2020-08-18T19:54:31Z

Sorry about the late reply. There's actually an init_writer() that should have the initial values written out to it.

https://github.com/stan-dev/stan/blob/develop/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp#L51
https://github.com/stan-dev/stan/blob/develop/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp#L65

It's on the unconstrained scale. There are convenience functions to get it back to the constrained scale.
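
To make the unconstrained/constrained distinction concrete, here is a minimal sketch of two standard transforms: an exponential map for a lower-bounded parameter and a scaled inverse logit for an interval-bounded one. The function names are hypothetical and are not the convenience functions referred to above; this only illustrates the kind of mapping involved.

```cpp
#include <cmath>

// Map an unconstrained value y to a value strictly above `lower`.
double constrain_lower(double y, double lower) {
  return lower + std::exp(y);
}

// Map an unconstrained value y into the open interval (lower, upper)
// using the inverse logit function.
double constrain_interval(double y, double lower, double upper) {
  const double inv_logit = 1.0 / (1.0 + std::exp(-y));
  return lower + (upper - lower) * inv_logit;
}
```

Under these transforms an unconstrained value of 0 becomes 1 for a parameter with lower bound 0, and 0.5 for a parameter bounded on (0, 1), which is why unconstrained inits are hard for users to read directly.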

To get to the original post, @jgabry, I think it's a matter of writing it out... it's already there.

mitzimorris · 2020-08-18T20:29:29Z

In CmdStan's command.hpp this isn't hooked up to anything:
https://github.com/stan-dev/cmdstan/blob/fd7d8faf88a791d078500cc5fd7458a8d68f7b17/src/cmdstan/command.hpp#L154

syclik · 2020-08-18T21:25:28Z

That's right. It just needs to be stored somewhere.

As a prototype, if you replaced that writer with one that writes out to std::cout, you should see the initial values. I'll try to bring up some code to show that.

mitzimorris · 2020-08-19T00:00:03Z

I understand that it needs to be stored somewhere.
The question is where and how to label it so that it's easily interpretable.

syclik · 2020-08-19T00:16:16Z

If you replace the linked line (command.hpp L154) with this one, the initial values will print right before the first iteration:

  stan::callbacks::stream_writer init_writer(std::cout);

It looks something like this...

...

Gradient evaluation took 1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
Adjust your expectations accordingly!


0.277565
Iteration:    1 / 2000 [  0%]  (Warmup)

where 0.277565 is the initial value. (If you run it with init=0, you'll see that it's exactly 0.)
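
For readers without a CmdStan build at hand, the effect of swapping in a stream-backed writer can be mimicked in a standalone sketch. `simple_writer`, `ostream_writer`, `report_inits`, and `capture_inits` are hypothetical simplifications, not the actual classes in stan::callbacks; the sketch assumes only that the service hands the writer a vector of doubles and that the default writer discards what it receives.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified callback interface: the default discards values.
struct simple_writer {
  virtual ~simple_writer() = default;
  virtual void operator()(const std::vector<double>&) {}
};

// Writer that prints each vector as one comma-separated line.
struct ostream_writer : simple_writer {
  explicit ostream_writer(std::ostream& out) : out_(out) {}
  void operator()(const std::vector<double>& values) override {
    for (std::size_t i = 0; i < values.size(); ++i) {
      if (i > 0) out_ << ',';
      out_ << values[i];
    }
    out_ << '\n';
  }

 private:
  std::ostream& out_;
};

// Stand-in for a service function that reports inits through the callback.
void report_inits(const std::vector<double>& inits, simple_writer& init_writer) {
  init_writer(inits);
}

// Convenience helper: capture what an ostream_writer would print.
std::string capture_inits(const std::vector<double>& inits) {
  std::ostringstream out;
  ostream_writer writer(out);
  report_inits(inits, writer);
  return out.str();
}
```

Passing a plain `simple_writer` to `report_inits` produces no output at all, which mirrors the situation described earlier in the thread where the writer is not hooked up to anything.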

> I understand that it needs to be stored somewhere.
> The question is where and how to label it so that it's easily interpretable.

Got it. For that, I have no good solution... We could put it in as the first line of the CSV, but that would wreak havoc with off-by-one errors everywhere. We could add it as a comment, but that's not super useful. We could write it to a different file...

These all seem non-optimal.

mitzimorris · 2020-08-19T03:37:19Z

> These all seem non-optimal.

Exactly.
Which is why Aki's hack is needed (for now). It is documented in the CmdStan User's Guide, section 9.5.4:
https://mc-stan.org/docs/2_24/cmdstan-guide/mcmc-config.html#initializing-parameters

jgabry · 2020-08-21T17:20:08Z

Thanks for the additional info. I agree none of these are optimal, but writing it to a different file seems like the least bad of the non-optimal solutions ;)
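
A minimal sketch of the separate-file option, assuming a one-row CSV with a header of parameter names. The layout, the function names `write_inits_csv` and `inits_csv_string`, and the parameter names used in the example are all hypothetical; this is not an existing CmdStan output format.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Write a header row of parameter names followed by one row of init values.
void write_inits_csv(std::ostream& out,
                     const std::vector<std::string>& names,
                     const std::vector<double>& values) {
  if (names.size() != values.size()) {
    throw std::invalid_argument("names and values must have the same length");
  }
  for (std::size_t i = 0; i < names.size(); ++i) {
    if (i > 0) out << ',';
    out << names[i];
  }
  out << '\n';
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) out << ',';
    out << values[i];
  }
  out << '\n';
}

// Convenience helper for inspecting the result without touching the disk.
std::string inits_csv_string(const std::vector<std::string>& names,
                             const std::vector<double>& values) {
  std::ostringstream out;
  write_inits_csv(out, names, values);
  return out.str();
}
```

In practice the stream would be a `std::ofstream` opened on a per-chain file. Keeping the inits out of the main CSV avoids the off-by-one concern raised above, and the header row addresses the labeling question.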

jgabry added the feature label Aug 8, 2020

jgabry mentioned this issue Aug 8, 2020

CmdStan User's Guide - document "Aki trick" to see initial parameter values stan-dev/docs#255

Closed

jgabry mentioned this issue Nov 2, 2020

knowing the initial values for sampler even I don't set up the initials stan-dev/cmdstanr#331

Open

jgabry mentioned this issue Nov 10, 2020

Consider more efficient serialization of CmdStanFit objects stan-dev/cmdstanr#340

Closed
