Particle marginal Metropolis Hastings Example #13

LeahPrice · 2017-08-03T02:45:49Z

Adding an example using particle marginal Metropolis Hastings. This example is based on a very similar model to pfNonlinBS. So that a single function can be used to simulate data from these models, the simNonlin function has been generalized and documentation updated accordingly.

The log normalising constant estimate was not initialised to 0 (which causes problems when doing multiple SMC runs by calling Initialise() and Iterate() functions multiple times)

Adding an example using particle marginal Metropolis Hastings. This example is based on a very similar model to pfNonlinBS. So that a single function can be used to simulate data from these models, the simNonlin function has been generalised and documentation updated accordingly.

adamjohansen · 2017-08-03T06:45:57Z

Hi Leah, thanks for this. This is a trivial thing, but it's likely to recur so I'll mention it...

There are a few more minor things popping up in the travis checks; nothing major, but it's a good idea to take notice of notes and warnings as they appear. Check out lines 900-905 in particular and see Dirk's commit bdc201c for a prototypical solution to this type of NOTE.

adamjohansen

Good catch on the initialisation, this could have caused some subtle problems later on if you hadn't noticed this.

LeahPrice · 2017-08-03T07:10:45Z

Thanks Adam. I ran R CMD check on my computer and saw I didn't get any warnings so I'll have to make sure I look more closely at the Travis checks next time. Are lines 907-912 an issue too? I haven't touched the linear regression examples in this pull request so I'm not sure where that's coming from. I'm using Lazydata: TRUE so I might need to look into what else could be causing the warning. Are Travis checks only done for pull requests? I tried to find the checks from Dirk's most recent commit but I couldn't.

By the way, I wrote a message on the PMMH issue ticket this morning thanking you for your comments and saying I'd followed your advice with the cosine sequence offset. It seems it's been sitting on my computer all day, but I guess there's no point hitting send now.

LeahPrice · 2017-08-03T07:17:59Z

I realized I can just do --as-cran in my CMD check, so feel free to disregard my question about whether Travis checks are only done for pull requests.

adamjohansen

This is essentially there, but it would be good if you could resolve the NAMESPACE issue identified by travis, tweak any of the code that you want to in response to any of these rather trivial comments, and see if there's anything you can do about the identation/whitespace in your editor.

If it's too much of a pain to force your editor to indent consistently then it can be fixed later, it would just be nice to be as efficient as possible.

adamjohansen · 2017-08-03T07:24:44Z

man/pfNonlinBS.Rd

+\code{\link{simNonlin}} for a function to simulate from the model and
+\code{\link{nonLinPMMH}} for an example of particle
+marginal Metropolis Hastings applied to a non-linear state space model.
+}
 \author{Adam M. Johansen and Dirk Eddelbuettel}


and yourself...

adamjohansen · 2017-08-03T07:26:28Z

src/nonLinPMMH.cpp

+#include "nonLinPMMH.h"
+
+namespace nonLinPMMH {
+    const double a_prior = 0.01;


Indentation seems a bit variable (cf. 27, 42 amongst others). This is easily fixable, but can your editor be persuaded to "do the right thing"?

I think I have fixed this now.

Thanks. It looks much more consistent now. BTW I hope you didn't spend too much time on this as it can be done quite automatically (even if you don't like emacs, you can always use it occasionally for things like auto-indent, for example).

(I know things like this seem trivial but it's amazing how much more readable code is if it's indented in a consistent way and when you're working with a big code base it /really/ helps.)

adamjohansen · 2017-08-03T07:30:44Z

src/nonLinPMMH.cpp

+			sigw(i) = theta_prop.sigw;
+			loglike(i) = loglike_prop;
+			logprior(i) = logprior_prop;
+			Rcpp::Rcout  << "ACCEPTED proposal (" << setw(6) << sigv(i) << ", " << setw(6) <<  sigw(i) << ") at iteration " << setw(4) << i+1 << " with loglike " << setw(8) << loglike(i) << std::endl;


Do we really want to produce this output? I can see it has value for debugging but it's not clear to me that it's something a normal user would want to see.

I had this stuff printed out for debugging and also to give the user a sense of how long is left. This example takes a while to run (especially if you want to do the 50,000 iterations suggested by the paper!), but I guess it's fairly standard to not get a progress update.

Fair enough. I wouldn't personally want ~10k lines of output printed to the console when doing something like this; a boolean "verbose" or "quiet" argument might be a good idea so that the user can suppress it. Personally, I'd just print an "iteration xx/yy" every 100 or 1000 iterations if I wanted to reassure the user that something was happening.

I don't recall how the buffering works with Rcout, but you might need some sort of flush instruction to guarantee that the output appears in a timely fashion, too. @eddelbuettel?

adamjohansen · 2017-08-03T07:34:18Z

src/nonLinPMMH.cpp

+	logweight += R::dnorm(y(lTime),mean,theta_prop.sigw,TRUE);
+}
+
+}


More trivia: It's conventional (and good practice) to end source files with a new line (some compilers expect the final source line to be terminated with one).

adamjohansen · 2017-08-03T07:38:29Z

src/nonLinPMMH.cpp

+	double MH_ratio;
+	for (unsigned int i = 1; i<lMCMCits; i++){
+		// RW proposal for parameters
+		theta_prop.sigv = R::rnorm(sigv(i-1),0.15);


The code looks fine and as we've said before working is far more important than fast, but one thought which might be worth bearing in mind for the future.

In this context, we know how many normal and uniform random variables we need in advance. From a speed perspective, it would make more sense to call R::rnorm only once or twice to obtain vectors of these RVs and similarly to call R::runif only once rather than once per iteration. I.e. To generate vectors of all of the RVs required in advance outside of the main loop. There's a memory cost associated with it, but one which isn't enormous and it's best to minimize the number of calls back to R. It might be interesting to use rprof (or profvis) to compare the two approaches, actually.

Thanks, that's a good point. I've used vectors for this now. I tried using profvis quickly to compare the two approaches. I did 100 iterations without any plotting and the non-vectorised time was 91040 (milliseconds I think) and vectorised was 92240, so it was slightly slower with vectorisation. That's probably not a very fair test though -- 100 iterations is unrealistically small.

I can do a proper check with a more realistic number of iterations and look at time and memory if that's useful.

I wouldn't worry too much; I'm slightly surprised by the numbers but this isn't a priority anyway.

adamjohansen · 2017-08-03T08:02:40Z

The data() NOTE is another small thing. I didn't mention it because, as you observe, it's unrelated to this pull request. It's nice if individual commits / pulls are coherent change sets.

It would, in the fullness of time, be worth cleaning this up in that it currently dumps the data into the global workspace which is bad practice because (a) the name might collide with something which is already there and (b) it will persist there until it's deleted. Using the "envir=environment()" parameter in the call to data might be sufficient to avoid this.

Dirk might have stronger views about data in R packages, but I'm largely agnostic as long as we don't do anything too antisocial (and putting stuff in a global workspace when it doesn't really need to be there might qualify as mildly antisocial).

Another think @eddelbuettel is better placed to answer, is whether we ought to do a bit of work to kill the notes at line 924 of the travis log (cf. RcppCore/Rcpp#636).

Correcting indentation in nonLinPMMH.h and nonLinPMMH.cpp, storing vectors for random component of RW and correcting authors in simNonlin and nonLinPMMH.

LeahPrice · 2017-08-03T09:22:23Z

I tried using envir=environment() and the NOTE doesn't appear anymore. Thanks for this fix. If @eddelbuettel is also happy with this, then I'll update it in a separate pull request. I have the news file updated too and ready for a future pull request.

eddelbuettel · 2017-08-03T10:27:27Z

Yes, it is good practice to run R CMD check --as-cran. You can also tell you system to do it (ie if you use RSrudio) and/or just do it by hand. "Real submissions" also have to do it via the R-devel version which is often pickier still (and you get access to that for free via win-builder and/or R-hub if you don't have R-devel built locally).

I'll merge this now and take a look later.

LeahPrice added 2 commits August 3, 2017 08:49

Initialising logNC = 0

a9c45f0

The log normalising constant estimate was not initialised to 0 (which causes problems when doing multiple SMC runs by calling Initialise() and Iterate() functions multiple times)

adamjohansen reviewed Aug 3, 2017

View reviewed changes

Adding abline to imports

f31701d

adamjohansen requested changes Aug 3, 2017

View reviewed changes

Minor changes

d726cda

Correcting indentation in nonLinPMMH.h and nonLinPMMH.cpp, storing vectors for random component of RW and correcting authors in simNonlin and nonLinPMMH.

adamjohansen approved these changes Aug 3, 2017

View reviewed changes

eddelbuettel merged commit fe4ea98 into rcppsmc:master Aug 3, 2017

LeahPrice deleted the examplePMMH branch August 4, 2017 01:04

adamjohansen mentioned this pull request Aug 4, 2017

Updating news and fixing data note #15

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Particle marginal Metropolis Hastings Example #13

Particle marginal Metropolis Hastings Example #13

LeahPrice commented Aug 3, 2017

adamjohansen commented Aug 3, 2017

adamjohansen left a comment •

edited

Loading

LeahPrice commented Aug 3, 2017

LeahPrice commented Aug 3, 2017

adamjohansen left a comment

adamjohansen Aug 3, 2017

adamjohansen Aug 3, 2017

LeahPrice Aug 3, 2017

adamjohansen Aug 3, 2017

adamjohansen Aug 3, 2017

LeahPrice Aug 3, 2017

adamjohansen Aug 3, 2017 •

edited

Loading

adamjohansen Aug 3, 2017

adamjohansen Aug 3, 2017

LeahPrice Aug 3, 2017

adamjohansen Aug 3, 2017

adamjohansen commented Aug 3, 2017 •

edited

Loading

LeahPrice commented Aug 3, 2017

eddelbuettel commented Aug 3, 2017

Particle marginal Metropolis Hastings Example #13

Particle marginal Metropolis Hastings Example #13

Conversation

LeahPrice commented Aug 3, 2017

adamjohansen commented Aug 3, 2017

adamjohansen left a comment • edited Loading

Choose a reason for hiding this comment

LeahPrice commented Aug 3, 2017

LeahPrice commented Aug 3, 2017

adamjohansen left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adamjohansen Aug 3, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adamjohansen commented Aug 3, 2017 • edited Loading

LeahPrice commented Aug 3, 2017

eddelbuettel commented Aug 3, 2017

adamjohansen left a comment •

edited

Loading

adamjohansen Aug 3, 2017 •

edited

Loading

adamjohansen commented Aug 3, 2017 •

edited

Loading