Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Particle marginal Metropolis Hastings Example #13

Merged
merged 4 commits into from Aug 3, 2017

Conversation

@LeahPrice
Copy link
Collaborator

@LeahPrice LeahPrice commented Aug 3, 2017

Adding an example using particle marginal Metropolis Hastings. This example is based on a very similar model to pfNonlinBS. So that a single function can be used to simulate data from these models, the simNonlin function has been generalized and documentation updated accordingly.

LeahPrice added 2 commits Aug 2, 2017
The log normalising constant estimate was not initialised to 0 (which causes problems when doing multiple SMC runs by calling Initialise() and Iterate() functions multiple times)
Adding an example using particle marginal Metropolis Hastings. This example is based on a very similar model to pfNonlinBS. So that a single function can be used to simulate data from these models, the simNonlin function has been generalised and documentation updated accordingly.
@adamjohansen
Copy link
Collaborator

@adamjohansen adamjohansen commented Aug 3, 2017

Hi Leah, thanks for this. This is a trivial thing, but it's likely to recur so I'll mention it...

There are a few more minor things popping up in the travis checks; nothing major, but it's a good idea to take notice of notes and warnings as they appear. Check out lines 900-905 in particular and see Dirk's commit bdc201c for a prototypical solution to this type of NOTE.

Copy link
Collaborator

@adamjohansen adamjohansen left a comment

Good catch on the initialisation, this could have caused some subtle problems later on if you hadn't noticed this.

@LeahPrice
Copy link
Collaborator Author

@LeahPrice LeahPrice commented Aug 3, 2017

Thanks Adam. I ran R CMD check on my computer and saw I didn't get any warnings so I'll have to make sure I look more closely at the Travis checks next time. Are lines 907-912 an issue too? I haven't touched the linear regression examples in this pull request so I'm not sure where that's coming from. I'm using Lazydata: TRUE so I might need to look into what else could be causing the warning. Are Travis checks only done for pull requests? I tried to find the checks from Dirk's most recent commit but I couldn't.

By the way, I wrote a message on the PMMH issue ticket this morning thanking you for your comments and saying I'd followed your advice with the cosine sequence offset. It seems it's been sitting on my computer all day, but I guess there's no point hitting send now.

@LeahPrice
Copy link
Collaborator Author

@LeahPrice LeahPrice commented Aug 3, 2017

I realized I can just do --as-cran in my CMD check, so feel free to disregard my question about whether Travis checks are only done for pull requests.

Copy link
Collaborator

@adamjohansen adamjohansen left a comment

This is essentially there, but it would be good if you could resolve the NAMESPACE issue identified by travis, tweak any of the code that you want to in response to any of these rather trivial comments, and see if there's anything you can do about the identation/whitespace in your editor.

If it's too much of a pain to force your editor to indent consistently then it can be fixed later, it would just be nice to be as efficient as possible.

\code{\link{simNonlin}} for a function to simulate from the model and
\code{\link{nonLinPMMH}} for an example of particle
marginal Metropolis Hastings applied to a non-linear state space model.
}
\author{Adam M. Johansen and Dirk Eddelbuettel}

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

and yourself...

#include "nonLinPMMH.h"

namespace nonLinPMMH {
const double a_prior = 0.01;

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

Indentation seems a bit variable (cf. 27, 42 amongst others). This is easily fixable, but can your editor be persuaded to "do the right thing"?

This comment has been minimized.

@LeahPrice

LeahPrice Aug 3, 2017
Author Collaborator

I think I have fixed this now.

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

Thanks. It looks much more consistent now. BTW I hope you didn't spend too much time on this as it can be done quite automatically (even if you don't like emacs, you can always use it occasionally for things like auto-indent, for example).

(I know things like this seem trivial but it's amazing how much more readable code is if it's indented in a consistent way and when you're working with a big code base it /really/ helps.)

sigw(i) = theta_prop.sigw;
loglike(i) = loglike_prop;
logprior(i) = logprior_prop;
Rcpp::Rcout << "ACCEPTED proposal (" << setw(6) << sigv(i) << ", " << setw(6) << sigw(i) << ") at iteration " << setw(4) << i+1 << " with loglike " << setw(8) << loglike(i) << std::endl;

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

Do we really want to produce this output? I can see it has value for debugging but it's not clear to me that it's something a normal user would want to see.

This comment has been minimized.

@LeahPrice

LeahPrice Aug 3, 2017
Author Collaborator

I had this stuff printed out for debugging and also to give the user a sense of how long is left. This example takes a while to run (especially if you want to do the 50,000 iterations suggested by the paper!), but I guess it's fairly standard to not get a progress update.

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

Fair enough. I wouldn't personally want ~10k lines of output printed to the console when doing something like this; a boolean "verbose" or "quiet" argument might be a good idea so that the user can suppress it. Personally, I'd just print an "iteration xx/yy" every 100 or 1000 iterations if I wanted to reassure the user that something was happening.

I don't recall how the buffering works with Rcout, but you might need some sort of flush instruction to guarantee that the output appears in a timely fashion, too. @eddelbuettel?

logweight += R::dnorm(y(lTime),mean,theta_prop.sigw,TRUE);
}

}

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

More trivia: It's conventional (and good practice) to end source files with a new line (some compilers expect the final source line to be terminated with one).

double MH_ratio;
for (unsigned int i = 1; i<lMCMCits; i++){
// RW proposal for parameters
theta_prop.sigv = R::rnorm(sigv(i-1),0.15);

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

The code looks fine and as we've said before working is far more important than fast, but one thought which might be worth bearing in mind for the future.

In this context, we know how many normal and uniform random variables we need in advance. From a speed perspective, it would make more sense to call R::rnorm only once or twice to obtain vectors of these RVs and similarly to call R::runif only once rather than once per iteration. I.e. To generate vectors of all of the RVs required in advance outside of the main loop. There's a memory cost associated with it, but one which isn't enormous and it's best to minimize the number of calls back to R. It might be interesting to use rprof (or profvis) to compare the two approaches, actually.

This comment has been minimized.

@LeahPrice

LeahPrice Aug 3, 2017
Author Collaborator

Thanks, that's a good point. I've used vectors for this now. I tried using profvis quickly to compare the two approaches. I did 100 iterations without any plotting and the non-vectorised time was 91040 (milliseconds I think) and vectorised was 92240, so it was slightly slower with vectorisation. That's probably not a very fair test though -- 100 iterations is unrealistically small.

I can do a proper check with a more realistic number of iterations and look at time and memory if that's useful.

This comment has been minimized.

@adamjohansen

adamjohansen Aug 3, 2017
Collaborator

I wouldn't worry too much; I'm slightly surprised by the numbers but this isn't a priority anyway.

@adamjohansen
Copy link
Collaborator

@adamjohansen adamjohansen commented Aug 3, 2017

The data() NOTE is another small thing. I didn't mention it because, as you observe, it's unrelated to this pull request. It's nice if individual commits / pulls are coherent change sets.

It would, in the fullness of time, be worth cleaning this up in that it currently dumps the data into the global workspace which is bad practice because (a) the name might collide with something which is already there and (b) it will persist there until it's deleted. Using the "envir=environment()" parameter in the call to data might be sufficient to avoid this.

Dirk might have stronger views about data in R packages, but I'm largely agnostic as long as we don't do anything too antisocial (and putting stuff in a global workspace when it doesn't really need to be there might qualify as mildly antisocial).

Another think @eddelbuettel is better placed to answer, is whether we ought to do a bit of work to kill the notes at line 924 of the travis log (cf. RcppCore/Rcpp#636).

Correcting indentation in nonLinPMMH.h and nonLinPMMH.cpp, storing vectors for random component of RW and correcting authors in simNonlin and nonLinPMMH.
@LeahPrice
Copy link
Collaborator Author

@LeahPrice LeahPrice commented Aug 3, 2017

I tried using envir=environment() and the NOTE doesn't appear anymore. Thanks for this fix. If @eddelbuettel is also happy with this, then I'll update it in a separate pull request. I have the news file updated too and ready for a future pull request.

@eddelbuettel
Copy link
Collaborator

@eddelbuettel eddelbuettel commented Aug 3, 2017

Yes, it is good practice to run R CMD check --as-cran. You can also tell you system to do it (ie if you use RSrudio) and/or just do it by hand. "Real submissions" also have to do it via the R-devel version which is often pickier still (and you get access to that for free via win-builder and/or R-hub if you don't have R-devel built locally).

I'll merge this now and take a look later.

@eddelbuettel eddelbuettel merged commit fe4ea98 into rcppsmc:master Aug 3, 2017
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@LeahPrice LeahPrice deleted the LeahPrice:examplePMMH branch Aug 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.