Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve history getters #37

Merged
merged 9 commits into from
Mar 13, 2018
Merged

Improve history getters #37

merged 9 commits into from
Mar 13, 2018

Conversation

tibi77
Copy link
Contributor

@tibi77 tibi77 commented Mar 10, 2018

Hello RcppSMC Developers,

I think this resolves #31 point 3.

@LeahPrice
Copy link
Collaborator

Hi tibi77,

About your previous question - I would recommend downloading the package devtools which is useful when you’re developing packages. To recompile and build the package, you can use

devtools::load_all('rcppsmc')
devtools::build('rcppsmc')

You’ll also need to install Rtools if you’re on Windows.

This looks like a good start but it would be really nice to see the new code in action. The file src/LinReg_LA_adapt.cpp uses the old method of getting the history and you'll notice that it's pretty ugly. Do you think you could update that file so that you don't need the line std::vector<smc::historyelement<rad_state> > Hist = Sampler->GetHistory();? This would also be a good way of checking that your updates are working as you expected. @adamjohansen @eddelbuettel - do you agree?

All the best,
Leah

…ot see the use of the function/method GetHistory form the sampler header.
@adamjohansen
Copy link
Collaborator

Hi tibi77 (Can you share a little more? We like to know at least who we should acknowledge for code contributions...). Thanks for the code, it's always appreciated.

Leah's suggestion seems like a good one to me; having a working example's always a good idea.

This looks better to me. I had a quick look last night and couldn't quite see how it could work; this looks as though it should. I've got a few specific comments I'll make in a minute.

Copy link
Collaborator

@adamjohansen adamjohansen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, a few trivial quibbles and one thought about a possible further improvement but this seems to tidy up the interface somewhat.

///Returns a reference to the particle set stored in the history.
population<Space> & GetHistoryPopulationRefs(long n) {return History[n].GetValues();}
///Returns the history flags
historyflags GetHistoryFlags(long n) {return History[n].GetFlags();}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this not to be declared const?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually find strange to declare the references as const but in this particular situation i don not think that it will be a problem to declare them as const. I come from a C programming background so i am just beginning to grasp the subtleties of the language (i am working in cpp just for an year or so). I will remake it.

///Returns the current particle set stored in the history.
population<Space> GetHistoryPopulation(long n) const {return History[n].GetValues();}
///Returns a reference to the particle set stored in the history.
population<Space> & GetHistoryPopulationRefs(long n) {return History[n].GetValues();}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My inclination would have been for this to be a const function returning a const reference (both for consistency for what that's worth and because I don't /think/ we want this reference to be used to make changes to sampler internals, but if anyone can see a good reason for that behaviour then do shout...).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree, at the moment I can't think of any reason why people would want to change the history in this way.

population<Space> & GetHistoryPopulationRefs(long n) {return History[n].GetValues();}
///Returns the history flags
historyflags GetHistoryFlags(long n) {return History[n].GetFlags();}
///Returns the history ###################
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misleading comment: it returns the historical effective sample size...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have used this just as a placeholder :D.. I will write a good comment now.

historyflags GetHistoryFlags(long n) {return History[n].GetFlags();}
///Returns the history ###################
long GetHistoryESS(long n) {return History[n].GetESS();}
///Returns the history ###################
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another misleading comment; it returns the number of mcmc repeats used during a particular historical iteration...

loglike(i,n) = Hist[n].GetValues().GetValueN(i).loglike;
logprior(i,n) = Hist[n].GetValues().GetValueN(i).logprior;
theta.slice(n).row(i) = Sampler->GetHistoryPopulation(n).GetValueN(i).theta.t();
loglike(i,n) = Sampler->GetHistoryPopulation(n).GetValueN(i).loglike;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a general thought not directly related to the undoubted usefulness of the current PR, but if we envisage users wanting to access a particular value from a particular iteration on a regular basis then perhaps a GetHistoryValue(n,i) helper which is equivalent to GetHistoryPopulation(n).GetValueN(i) and a corresponding GetHistoryWeight(n,i) might be worth adding. Any thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be harder for the users to use it. I have worked at the nautilus project and I think that i have learned from them that we need to stop with the OOP when the documentation necessary to understand the code is longer than the previous code itself. But i would like to get the opinion of a more seasoned member of the community.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would imagine having two getters like GetValueN(i) and GetValueN(i,t) would be sensible. I think it would be nice for users to be able to access as many things as possible directly through the sampler, but I don't have any particularly strong preferences.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More code and more documentation isn't always good, for sure, I have a vague feeling this might be a useful enough feature to warrant adding, but it doesn't block merging what is already proposed here.

Copy link
Contributor Author

@tibi77 tibi77 Mar 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make another commit with this changes later. I think that we may exaggerate with the changes if I include this in the commit as well. And also the matter is a bit disputable.

@LeahPrice
Copy link
Collaborator

Thanks Tibi77.

A couple of minor things:
I think it makes more sense to return the ESS as type double (and that is what the history function GetESS() does).

Also, you can delete the function

/// Returns the Effective Sample Size of the specified particle generation.
double GetESS(long lGeneration) const;

now. You've used a much neater way of getting the ESS (I came from a stats background and was learning best practices in C++ as I did GSoC) but I wonder whether the name of the previous function could be useful. What do people think of using function overloading for the getters? I.e. you would call GetParticleValueN(long n) for the current iteration and GetParticleValueN(long n, long t) for the history. I think you've fixed the issue with the getters so I'd be happy to accept this if others are, but it's just something to think about.

@tibi77
Copy link
Contributor Author

tibi77 commented Mar 12, 2018

So, regarding my interest in the project and my history with programming I would be more than happy if i can discuss it through a irc or something similar because i feel that i am poluating the issue section right now. But as a short intro: I am now in my second year in university studying computer science and engineering in Bucharest, Romania. I an research assistant for the numerical methods department in my university and i enjoy thinking about solution to difficult problems using maths and programming combined. I have found R for the first time when i was tired to use Matlab as the primary tool for research purposes. I am looking right now for something to do, I think that i have a lot of free time and i do not really know where to invest it. Also i would be interested in doing the GSoC along R.
If you want to know more about me I will be happy to come back with an email to a mailing list or to be a part of a irc channel.

@tibi77
Copy link
Contributor Author

tibi77 commented Mar 12, 2018

Regarding what LeahPrice have said I am usually reluctant on deleting code without knowing first what the person who have written it had in mind.

@@ -154,12 +154,12 @@ namespace smc {
///Returns the current particle set stored in the history.
population<Space> GetHistoryPopulation(long n) const {return History[n].GetValues();}
///Returns a reference to the particle set stored in the history.
population<Space> & GetHistoryPopulationRefs(long n) {return History[n].GetValues();}
population<Space> & GetHistoryPopulationRefs(long n) const {return History[n].GetValues();}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to return a const population<Space> & here, to avoid unintended consequences: I don't see this as a way for facilitating changing the history of the sampler from externally.

@adamjohansen
Copy link
Collaborator

Looking good.

I would be inclined to add an extra const to the reference return type, but perhaps others differ on that?

In terms of Leah's point, yes it would be possible to remove that function but if we remove an exposed function we should at least make sure we document the change in interface clearly. As Leah suggested perhaps it would make sense to simply use the old function name for the newer interface here?

Finally, we should make sure that the ChangeLog is up to date, showing what's changed in this PR, are you happy to do that @tibi77? (If you'd prefer we can merge this and then I can do a quick tidy up, but it's usually easiest for the author of code to do this directly.)

…e. Also I find that this function should return long and have a long parameter. Also I think that I can make the GetESS(void) implementation better. I am going to make another commit to update the ChangeLog after you check this commit.
///Returns the history effective sample size.
long GetHistoryESS(long n) {return History[n].GetESS();}
/// Returns the Effective Sample Size of the specified particle generation.
long GetESS(long n) {return History[n].GetESS();}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, why should this function return a long rather than a double? ESS is conveniently interpreted as an effective sample size, but its real valued rather than an integer...

(I'd also make this function const for consistency with the other getters.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I think I will make it to return double. Also the const will be a great addition. To keep the programming style in line.

@adamjohansen
Copy link
Collaborator

Thanks, @tibi77, but I don't see why the GetESS function should return a long rather than a double -- am I missing something?

@tibi77
Copy link
Contributor Author

tibi77 commented Mar 13, 2018

Depending on the compiler a long could store more information than a double. So I thought that it will be better to make it return long instead of double.

@adamjohansen
Copy link
Collaborator

Well, that depends what more information means of course... it's not generally a good idea to coerce real values into integer types; the loss of precision associated with a double precision float relative to a long when dealing with those integers small enough to fit into a long is likely to be rather more than offset by the loss of precision associated with storing non-integer values in a long.

@@ -156,7 +156,7 @@ namespace smc {
///Returns the history flags
historyflags GetHistoryFlags(long n) const {return History[n].GetFlags();}
/// Returns the Effective Sample Size of the specified particle generation.
long GetESS(long n) {return History[n].GetESS();}
const double GetESS(long n) {return History[n].GetESS();}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the risk of getting bogged down in trivia, I meant that this should be a const function [https://www.geeksforgeeks.org/const-member-functions-c/]
not that the return type needed a const qualifier, i.e. that it should be declared as
double GettEss(long n) const
because getters shouldn't change anything. Returning by const value is a bit esoteric and probably unnecessary.

C++ constness is fiddly...

@adamjohansen
Copy link
Collaborator

Looks good; I'll wait for the ChangeLog updates before merging to keep things simple.

@adamjohansen adamjohansen merged commit d9d4916 into rcppsmc:master Mar 13, 2018
Copy link
Collaborator

@eddelbuettel eddelbuettel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty close. The format is automagic if you use Emacs (as Adam and I do :) else a bit of getting used to.

Top line is fine: iso date, two spaces, name, two spaces, email.

Next entries with a TAB and the then roughly fit to the 70 or so colums. It looks like your 2nd and 3rd line need help.

@eddelbuettel
Copy link
Collaborator

And I just cleaned that up in 7f205af . Thanks all!

@adamjohansen
Copy link
Collaborator

Thanks, @eddelbuettel -- sorry, going too quickly and missed that.

@tibi77
Copy link
Contributor Author

tibi77 commented Mar 13, 2018

I am using vim and I have it to automatically expand the tabs for me. I am trying to migrate to Emacs for a while now. Also my text is wrapped at 80 characters. Where can I find the Emacs configuration that you are using?

Thanks @eddelbuettel for doing the modifications for me!

@eddelbuettel
Copy link
Collaborator

Emacs configuration is a bit of 'game' in the 'time sink' sense. I got started as a grad student, I try to not loose too much time over it -- but some modes (ie ESS for R) are fantastic and need a little bit tweaking.

Now, for ChangeLog nothing should be needed as this convention started with the FSF / GNU. To test, I just did emacs -q -nw ChangeLog (suppressing init files, and just using console mode) --- and as expected 'ChangeLog mode' was show in the status line. You can then do C-h m (ie Control-h followed by m) to get help for the current mode. The only command I regularly use is C-x 4 a which, from ony source file below a ChangeLog file finds that ChangeLog file and puts and try in. I also have two state variables in ~/.emacs to override my name and email, just in case, to keep the local hostname out:

;; make changelog entries complete   
(setq user-mail-address "edd@debian.org")
(setq user-full-name "Dirk Eddelbuettel")

Emacs is in my (perfectly biased way) totally worth learning (beeing using it 20+ years) but you can take your time and be incremental.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some Self-contained Improvements
4 participants