Added argument to initialize Euclidean metric for samplers #576

Merged
mitzimorris merged 5 commits into stan-dev:develop from
bbbales2:feature/set-mass-matrix
Oct 26, 2017
@bbbales2
Member

Submission Checklist

  • Run tests: ./runCmdStanTests.py src/test
  • Declare copyright holder and open-source license: see below

Summary:

This is the CmdStan interface component of: stan-dev/stan#2260

This should allow initialization of the Euclidean metric for samplers with an R dump file.

Questions:

  1. Where should the metric_file argument go? I think it makes sense under "init" but I didn't put it there cause that would mean breaking a bunch of other stuff (that depends on passing in init=whatever arguments). I just made it a high level argument for now.

  2. I only added check-that-it's-working tests. Lemme know if I should add check-if-it's-failing tests. I figured the Stan tests should handle most of that though (I didn't really add any functionality beyond the interface)

  3. Should the unit_e samplers allow initialization of their Euclidean metrics? The interface isn't there for them.

  4. Nomenclature-wise, are we setting the metric? Or are we initializing the Euclidean metric? Or are we setting a Euclidean metric? Or what is the verbiage equivalent to "setting the mass matrix"?

How to Verify:

./runCmdStanTests.py src/test/interface/metric_test.cpp

Side Effects:

Documentation:

I still need to do it

Reviewer Suggestions:

@mitzimorris or @sakrejda, whoever feels inclined

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): University of California, Santa Barbara

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@mitzimorris
Member

mitzimorris commented Oct 15, 2017

the challenge in adding this, as you've discovered, is the way that the command.hpp code sets up the cascade of dependent arguments. if you look at the current cmdstan manual (which needs to be updated as part of the PR), the set of arguments for which passing in a metric file would be valid are the ones for "sample algorithm = hmc"

method = sample (Default)
        sample
          num_samples = 1000 (Default)
          num_warmup = 1000 (Default)
          save_warmup = 0 (Default)
          thin = 1 (Default)
          adapt
            engaged = 1 (Default)
            gamma = 0.050000000000000003 (Default)
            delta = 0.80000000000000004 (Default)
            kappa = 0.75 (Default)
            t0 = 10 (Default)
            init_buffer = 75 (Default)
            term_buffer = 50 (Default)
            window = 25 (Default)
          algorithm = hmc (Default)
            hmc
              engine = nuts (Default)
                nuts
                  max_depth = 10 (Default)
              metric = diag_e (Default)
              stepsize = 1 (Default)
              stepsize_jitter = 0 (Default)

given this, you could add argument "metric_file" as a subargument to argument "hmc" - you should be able to check that when the metric is "diag_e" the metric file is diagonal, and when the metric is "dense_e" the metric file is dense, and then you should be able to make this feature more general.
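
A minimal sketch of the shape check suggested above, written in Python for illustration only. The helper names `check_inv_metric` and `_cholesky` are hypothetical and are not part of CmdStan or Stan; the positive-definiteness test for the dense case is an added assumption based on the documentation text later in this thread.

```python
import math


def _cholesky(a):
    """Cholesky-factor a square matrix; raise ValueError if it is not
    symmetric positive definite."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            if abs(a[i][j] - a[j][i]) > 1e-12:
                raise ValueError("matrix is not symmetric")
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return lower


def check_inv_metric(metric, inv_metric, n_params):
    """Raise ValueError unless inv_metric has the shape implied by metric.

    metric: "diag_e" (vector of length n_params) or
            "dense_e" (n_params x n_params matrix given as a list of rows).
    """
    if metric == "diag_e":
        is_nested = any(isinstance(v, (list, tuple)) for v in inv_metric)
        if len(inv_metric) != n_params or is_nested:
            raise ValueError("diag_e expects a vector of length n_params")
        if any(v <= 0 for v in inv_metric):
            raise ValueError("diagonal entries must be positive")
    elif metric == "dense_e":
        rows_ok = all(
            isinstance(r, (list, tuple)) and len(r) == n_params
            for r in inv_metric
        )
        if len(inv_metric) != n_params or not rows_ok:
            raise ValueError("dense_e expects an n_params x n_params matrix")
        _cholesky(inv_metric)
    else:
        raise ValueError("unsupported metric: " + metric)
```

With this sketch, a three-element vector passes for `diag_e` with three parameters, while the same vector is rejected for `dense_e`.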

@bbbales2
Member Author

valid are only "sample algorithm = hmc"

Hmm, that does make sense.

which needs to be updated as part of the PR

Will do. Was just being lazy.

@bbbales2
Member Author

This is ready for review

@sakrejda

sakrejda commented Oct 23, 2017 via email


@sakrejda left a comment


Only comment I had was maybe say in the doc that metric is used as a starting point for adaptation when both are specified (assuming this is what happens?). Looks good otherwise.

metric, \code{inv\_metric} should be a positive-definite square matrix with
number of rows and columns equal to the number of parameters in the model.
The file pointed at by \code{metric\_file} should have the same format as
the input data. This option can be used with and without adaptation enabled.


It would be helpful to add what happens if it's used with adaptation enabled.

@bbbales2
Member Author

Only comment I had was maybe say in the doc that metric is used as a starting point for adaptation when both are specified (assuming this is what happens?)

Good point. I ran methods with adaptation and they didn't fail, but I need to make sure the adaptation isn't simply ignoring the provided initial matrix and re-adapting a new one. I'll check that and update the doc to make it clear.

@mitzimorris
Member

mitzimorris commented Oct 24, 2017

now that we've got this all plumbed through, I'm still trying to make sense of the use cases for this feature:

  1. the model is fully converged, we want to run the fitted model to generate more samples - in this case, additional param adapt engaged=0 will stop adaptation and go straight to sampling. in which case, the fully consistent config requires:
     • random seed=<seed>
     • metric_file=<filename>
     • stepsize=<stepsize>
  2. the model is taking a long time to converge, we want to checkpoint where it's at and restart.

running tests for case (2), I'm not sure adaptation is respecting stepsize.

here's my test metric file - same as used for stan unit test:

inv_metric <- structure(c(0.787405, 0.884987, 1.19869),.Dim=c(3))
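
For illustration, a diagonal metric file in the one-line R dump form shown above could be produced with a short Python helper. The function name `format_diag_inv_metric` is hypothetical; only the variable name `inv_metric` and the `structure(c(...),.Dim=c(n))` layout are taken from the test file in this comment.

```python
def format_diag_inv_metric(values):
    """Format a diagonal inverse metric as a one-line R dump assignment."""
    body = ", ".join(repr(float(v)) for v in values)
    return "inv_metric <- structure(c({}),.Dim=c({}))".format(body, len(values))


def write_diag_inv_metric(path, values):
    """Write the R dump line to `path` (for example e_diag_3D.R)."""
    with open(path, "w") as f:
        f.write(format_diag_inv_metric(values) + "\n")
```

Calling `write_diag_inv_metric("e_diag_3D.R", [0.787405, 0.884987, 1.19869])` would write a file with the same contents as the test metric file above.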

using model stan/src/test/test-models/good/mcmc/hmc/common/gauss3D.stan, which I copied into the cmdstan directory cmdstan/src/test/test-models, same filename.

trying this command sequence:

./src/test/test-models/gauss3D  random seed=12345 sample num_samples=200 num_warmup=199 algorithm=hmc metric_file=e_diag_3D.R stepsize=0.001
 grep -iA3 "step" output.csv 

I set the stepsize smaller and smaller - if I run this with just one iteration, stepsize changes a lot - should it?

@mitzimorris
Member

regarding my previous comment, fix for problems w/r/t stepsize and adaptation should go in at the stan services level - not a cmdstan problem.

@bob-carpenter
Member

Use case (1) and (2), definitely. A third case is external algorithms---Aki needs that for something, I believe.

@sakrejda

Use case 3: 1) Fit the model to data A; 2) pretend like fitting the model to data A + B where B is much much smaller won't change posterior geometry much; 3) fit the model to data A + B without adaptation in parallel since now there's no adaptation to worry about (you could still do multiple chains with different starting points).

@mitzimorris
Member

w/r/t testing - do we have tests for use cases 2 and 3?
I've edited my comment above re use cases w/ my attempts to convince myself that this feature works for use case 2, and I'm not sure it does.

@sakrejda

Shouldn't tests be in stan::services? Or do you mean in general showing that they are worthwhile use cases?

@mitzimorris
Member

yes, tests should be in stan::services.
current tests in stan::services test that sampler initializes itself with pre-specified metric - a unit test at the feature level, not a functional test at the use case level.

@bbbales2
Member Author

Only comment I had was maybe say in the doc that metric is used as a starting point for adaptation when both are specified (assuming this is what happens?)

I checked this. The way adaptation works, the provided metric is tossed if adaptation is enabled. So the way I have the docs written now is wrong. If someone provides a metric, adaptation of that metric should be disabled (otherwise it's misleading). I can make these options mutually exclusive.

Is there a way to separately turn off timestep and metric adaptation? For my use case for this I was hoping I could just leave the timestep adaptation to Stan (and just provide the metric).

Looking at the code it seems like either they both happen or neither happen: https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/adapt_diag_e_nuts.hpp#L31

Should we add this in? Or make it so that if someone provides a metric, they're liable for the timestep as well?

Either way, don't merge this pull :P.

@mitzimorris
Member

hmm - if that's what the code is doing, that contradicts what Michael said here:

The stepsize parameter defines the initial step size from which the
algorithm begins. If adaptation is engaged then this is quickly modified,
but it can be helpful to start with a small step size on particularly nasty
problems to facilitate adaptation.

(https://groups.google.com/forum/#!topic/stan-users/O-PNZhzVjTI)

@mitzimorris
Member

OTOH, the current Stan manual says this:

Stan can be configured with a user-specified step size or it can estimate an optimal step size during warmup using dual averaging.
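
To make the two quotes above concrete, here is a hedged Python sketch of step size adaptation by dual averaging, following the scheme in Hoffman and Gelman's NUTS paper. The default values are the `delta`, `gamma`, `kappa`, and `t0` settings from the argument listing earlier in this thread. The class name is hypothetical, and the assumption that Stan's sampler shrinks toward `log(10 * initial_stepsize)` in exactly this way has not been checked against the Stan source here.

```python
import math


class DualAveragingStepSize:
    """Sketch of dual-averaging step size adaptation (not Stan's code)."""

    def __init__(self, initial_stepsize, delta=0.8, gamma=0.05,
                 kappa=0.75, t0=10):
        # Assumption: iterates are shrunk toward log(10 * initial step size).
        self.mu = math.log(10.0 * initial_stepsize)
        self.delta = delta
        self.gamma = gamma
        self.kappa = kappa
        self.t0 = t0
        self.counter = 0
        self.s_bar = 0.0
        self.x_bar = 0.0

    def update(self, accept_stat):
        """Take one adaptation step; return the step size for the next iteration."""
        self.counter += 1
        accept_stat = min(accept_stat, 1.0)
        eta = 1.0 / (self.counter + self.t0)
        # Running average of (target acceptance - observed acceptance).
        self.s_bar = (1.0 - eta) * self.s_bar + eta * (self.delta - accept_stat)
        x = self.mu - self.s_bar * math.sqrt(self.counter) / self.gamma
        x_eta = self.counter ** (-self.kappa)
        # Weighted average of the log step size iterates.
        self.x_bar = (1.0 - x_eta) * self.x_bar + x_eta * x
        return math.exp(x)

    def final_stepsize(self):
        """Averaged step size to use once adaptation ends."""
        return math.exp(self.x_bar)
```

Under these assumptions, the first update moves the step size to a multiple of the initial value regardless of how small that value is, which would be consistent with the observation that the step size changes a lot after a single iteration.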

@mitzimorris
Member

@bbbales2 - you're misreading the code -

Looking at the code it seems like either they both happen or neither happen: https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/hmc/nuts/adapt_diag_e_nuts.hpp#L31

the variable named adapt_flag_ gets changed by the sampler during its run.

the stepsize can be set at initialization:

https://github.com/stan-dev/stan/blob/476975dacfc13ff7a1aec4cf23ff0fd11a64caea/src/stan/mcmc/hmc/base_hmc.hpp#L152

@mitzimorris
Member

this PR looks good, and I believe the stan code for the samplers will do the right thing. however, it would be good for the docs to spell out the 3 use cases and appropriate config:
(1) specify stepsize, metric, set "adapt engaged=0". the appropriate "num_samples" is determined by desired precision of your quantity of interest which in turn depends on "N_eff" for that QoI.
(2) specify stepsize, metric and any other non-default settings used in initial run
(3) depends on external algorithm
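
As a sketch of the configuration for use case (1), the following Python helper assembles an argument list from the settings named above. The helper name `build_resample_args` is hypothetical. The argument order mirrors the test command earlier in this thread, and placing `adapt engaged=0` before `algorithm=hmc` is an assumption based on the argument tree from the manual; check both against the CmdStan documentation before relying on them.

```python
def build_resample_args(seed, metric_file, stepsize, num_samples=1000):
    """Arguments for use case (1): fixed metric and step size, adaptation off."""
    return [
        "random", "seed={}".format(seed),
        "sample", "num_samples={}".format(num_samples),
        "adapt", "engaged=0",
        "algorithm=hmc",
        "metric_file={}".format(metric_file),
        "stepsize={}".format(stepsize),
    ]


# Example: compose a command line for the gauss3D test model from this thread.
command = " ".join(
    ["./src/test/test-models/gauss3D"]
    + build_resample_args(12345, "e_diag_3D.R", 0.001, num_samples=200)
)
```

Keeping the seed, metric file, and step size together in one place makes it harder to rerun with only part of the configuration from the initial fit.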

@betanalpha
Contributor

betanalpha commented Oct 25, 2017 via email

Member

@mitzimorris left a comment


all good - many thanks!

@mitzimorris
Member

doc looks great! very clear.

@bbbales2
Member Author

Np thanks for the review

@mitzimorris merged commit 87b0446 into stan-dev:develop on Oct 26, 2017
@mitzimorris
Member

#563
