-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add user output (convergence diagnostics) for PMMH example #50
Conversation
- adding two arguments to onLinPMMH_impl() and nonLinPMMH() to alter user output: - boolean verbose: if TRUE, then (additionally to percentage completion) mean parameter estimates, current estimated log-likelihood and log-prior values are reported; if FALSE, only percentage of completion is printed - int msg_frq: number of iterations after which percentage completion is printed; defaults to every 100 iterations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me (at a casual glance)
Indeed, it looks good and does what is needed. A few very minor thoughts;
|
This looks great to me also and I agree with Adam's suggestions. It might also be nice to add the acceptance rate to the output of nonLinPMMH_impl. |
Additions are: - add a ChangeLog entry - change check_prog to integer (with integer equality comparison in l.117) - use static_cast() instead of double() for all (both) occasions - add NumericVector acceptance_rate to output of nonLinPMMH_impl() to store all acceptance rates
Thanks for all remarks! @adamjohansen hopefully addressed your points (special thanks for the note on static_cast() with the accompanying SO-post, this was completely new to me!). @LeahPrice added a NumericVector to the dataframe output that reports all acceptance rates (may seem redundant as the acceptance rate tend to converge to a rather constant number as well but the output is a dataframe so it requires all its columns to be of the same length, hence all acceptance rates and not just the last from the final iteration). |
Looks great. I feel tremendously pedantic at this point, but the convention with the changelog is to identify the files in which the changes took place with a
I guess this is a bit outdated, but old habits die hard and we've done this so far. (I think we're more or less following the GNU style insofar as there is one, see https://www.gnu.org/prep/standards/html_node/Style-of-Change-Logs.html#Style-of-Change-Logs). |
I don't think this is pedantic at all, and the earlier I get used to the style of this project the better! Also: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks perfect to me; can you merge this if you're happy, @LeahPrice?
I tend to keep the ChangeLog more or less up to date as I go, as much because I'd probably forget about things otherwise as anything, but as long as its consistent with what's been done by the time a pull request reaches the repository it should be fine.
I picked up the ChangeLog habit from other GNU project and use it in almost all my projects to, well, "log changes". The entries are typically a little more detailed than git log entries. (Emacs makes it easy: in a file, press |
@ilyaZar I just looked into why the continuous integration failed, on both Travis CI and at GitHub Actions. It is one of those small things: You expanded the interface of [.... more stuff omitted ...]
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘RcppSMC’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/usr/local/lib/R/site-library/00LOCK-rcppsmc/00new/RcppSMC/libs/RcppSMC.so':
/usr/local/lib/R/site-library/00LOCK-rcppsmc/00new/RcppSMC/libs/RcppSMC.so: undefined symbol: _Z15nonLinPMMH_implN4arma3ColIdEEmm
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/usr/local/lib/R/site-library/RcppSMC’
* restoring previous ‘/usr/local/lib/R/site-library/RcppSMC’
Warning message:
In install.packages(pkgs = f, lib = lib, repos = if (isMatchingFile(f)) NULL else repos) :
installation of package ‘.’ had non-zero exit status
edd@rob:~/git/rcppsmc(pr50)$ c++filt _Z15nonLinPMMH_implN4arma3ColIdEEmm
nonLinPMMH_impl(arma::Col<double>, unsigned long, unsigned long)
edd@rob:~/git/rcppsmc(pr50)$ So I just added those two (as GitHub lets us commit (by default, unless opted out) to repos of PR submitters). We soon should have 'green' CI as we prefer. Keep an eye on it going forward. |
Thanks, Dirk, I'd made a note to look at what was causing the trouble but as ever you're more efficient... |
Hah. I am timely for tasks if and when they fall into the very small window of morning coffee hackery before the real day starts. |
@eddelbuettel sorry! And thanks, I was not aware how Travis CI and Github Actions handle these and for, whatever reason, assumed that generated files should not bee committed... So to just to sort things out for the future: As far as I understood so far, compileAttributes() deals with exporting C++ functions to R i.e. both generated files (as they only contain the external C wrappers required to call exported C++ functions and their corresponding .Call wrappers required to call them) are "platform independent boilerplate" and thus can always be included in a commit? (so as to be always on the safe side, not worrying if some function signature changed...) Or would it be particularly annoying or bad practice, say for small changes to the function body, to always include them in a commit? |
Yes. Just experiement with it a little in a corner and it should become clear. And what also becomes clear after a while is that people around the world write code in projects. This isn't math -- we don't stand around a board with chalk all over our pants. It's code, and it runs. So more like 'engineering' if you will. And that is why we use GitHub with all the bells and whistles. Which ... is a lot but that is why we have GSoC to make new victims^Hvolunteers more familiar with how it works. Just ask -- there are no silly questions! So the iron rule: don't break things (at least not in the main branch). PRs tend to get ignored when they do not show 'green'. As for 'small commits versus large' and small versus large change sets: that is a more personal matter. I try to keep them in self contained chunks. But e.g. last evening I had to 'tame' and adjust CI for the work project, and it took me six commits but only commits trigger GitHub Actions runs here. At the end the change was overall simple so I "rebased" and "squashed" what was six commits (for purely procedural reasons) into just one and PRed that earlier. So it depends. Your change set was great as start. We're here to help you with the small mechanics -- like the |
Thanks for all the details @eddelbuettel ! Sorry @LeahPrice , I probably realised too late that draft pull requests cannot be merged until I mark them as "ready for review", now it should be ok. |
These changes look good to me. Thanks everyone and sorry for the slow reply! |
Excellent. It's always nice to be able to cross something of a list. There is still one tiny glitch we should sort out, not something that's critical but we ought to keep what documentation we do have as current as we can:
which really just means we need to add the new arguments that got added to the interface in this PR to the documentation in rcppsmc/man/nonLinPMMH.Rd |
There might be two things to consider here:
This is what happens when I run 'devtools::document()' on my plattform:
There seems to be non UTF-8 character(s) somewhere hidden, I'll try to find them on this weekend :) ! |
Logically further commits belong in a separate PR given that this one has already been merged, at least IMO -- and we should really take this discussion elsewhere, but the ongoing discussion at the end of a closed PR is entirely my fault. Although smctc did use doxygen, and the C++ code seems more or less to have the right /// comments for this, I don't recall using Roxygen for RcppSMC and there indeed doesn't seem to be any evidence of it in the R code. I suspect the Rd files were produced manually, but I think @LeahPrice has worked on these rather more recently than I have. I don't think there was a strong reason for that, but others may remember something I don't. Non UTF-8 characters seem unlikely to be wanted... they often seem to be hyphen-like symbols in references copied from elsewhere. |
We haven't used Roxygen so far but I'm not opposed to using it. It might be good to stick with the current approach for now and then we can all discuss whether we want to switch to Roxygen later. I believe Dirk said he'd prefer to only use Roxygen to generate the .Rd files when we were having discussions about this. Is that right, @eddelbuettel? Apologies for putting this discussion here. I'm happy to write it elsewhere too if we decide to create an issue ticket or discussion on the topic. |
We are splattering discussion now over
We can chat next week on the call (kidding aside that should be useful). Now, kudos to @adamjohansen for spotting the discrepance. Help updates sometimes fall to the way side but that is also what we have R CMD check for. I don't mind roxygen2, I use it in newer projects but I have too many existing ones (still with working Rd files) to obsess over converting old ones. We can spend time there, but I would rather schedule that at a lower priority and keep it to the end. My $0.02, if others feel strongly (and do the work :-) ) I will not object, of course. |
For completeness, the standard workflow (and the builder in RStudio does that for you too, whether or not you use
The key point here is that the former updates the (generated) R file for which the documentation has to match, so the sequence matters for the latter to pick this up and refresh the Rd file. Again, the tools generally do that, one just has to remain aware and check that it really happens. Or else But for any changes meriting a ChangeLog entry (which, really, should be most) it is worth to go a little slower if in doubt. Lastly, the utf8 nag is, I think, automatic these days. We can just add the line |
@eddelbuettel: agreed, we don't want lots of parallel channels and there's not much benefit to using email in parallel with all of the other stuff. The point about video calls and time is well taken, I'm certainly not expecting us to do this regularly (at least not with all 4 of us given the time zones) but I do think there's some value in having one "meeting" to say "hi" and put names to faces. FWIW, I have no objection to sticking "Use roxygen" up as a new issue, especially if it can be at least partially automated, although I agree its not a high priority. |
This addresses bullet point 5 of #25
mean parameter estimates, current estimated log-likelihood and log-prior values
are reported; if FALSE, only percentage of completion is printed
defaults to every 100 iterations
The output is not exactly a one-liner but still quite readable, I hope. I have not added the log-target density yet which, as suggested by @adamjohansen , may be useful to investigate to determine if the chain is in a transient regime or reached a typical set (I can add this if you like).