New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-7549: GaussianProcess updates #96
Conversation
data = np.zeros((pp, dd), dtype=float) | ||
tol = 1.0e-10 | ||
|
||
f = open(os.path.join(testPath, "data", "kd_test_data.sav")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use with open
so you don't need the close
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
data[i][j] = float(s[j]) | ||
|
||
kd = gp.KdTreeD() | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of these try/catch blocks? Are you expecting the pex.Exception
to happen, or are you trying to intercept an error that should not happen and print it out? If the former, why not use the more standard with self.assertRaises
, if the latter, shouldn't this be a test failure? If it's something non-standard then there should be a comment explaining it before the first try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the latter. I have removed the try/except blocks.
This pull request adds tests to all of the user-facing methods in GaussianProcess.cc. These tests ensure (or ought to) that exceptions are thrown if users pass in invalid inputs. I also took the opportunity to flesh-out testGaussianProcess.py so that the functionality of GaussianProcess.cc is tested more methodically. I added some methods to the GaussianProcess class to allow users to inspect its contents (specifically methods to return the data points and function values underlying the GaussianProcess). |
@jmeyers314 Just wanted to make sure you saw this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @danielsf .
Looks good to me, just a few small comments. My one high-level comment, which may be outside the scope of this issue, is that the kdtree class here looks like it might be useful beyond just gaussian process interpolation. Should it be moved somewhere more generally visible/accessible in the stack?
@@ -603,6 +603,42 @@ class GaussianProcess { | |||
std::shared_ptr< Covariogram<T> > const &covarIn); | |||
|
|||
/** | |||
* @brief return the number of data points stored in the GaussianProcess | |||
*/ | |||
int getPoints() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use getNPoints() or something like that perhaps? getPoints sounds too much like getData to me.
I suppose this applies to the private variable _pts
as well: -> _npts
or something.
with self.assertRaises(pex.Exception): | ||
gg.selfInterpolate(sigma, -1, nData-1) | ||
gg.selfInterpolate(sigma, -1, nData-1) | ||
# the following segfaults, for unknown reasons, so run directly instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this still segfault?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I have restored the code block.
with self.assertRaises(pex.Exception): | ||
gg.selfInterpolate(sigma, -1, nData-1) | ||
gg.selfInterpolate(sigma, -1, nData-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of what made the reported bug difficult to track before was that while most of the time this line did raise an exception, it was actually not the intended exception. Is there anyway within pex.Exception to check that the right exception is being raised? If so, no need to implement that everywhere, but probably a good idea to implement it here where we no there was a problem in the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can use unittest
's code to check the message that comes with the exception. Probably a good idea.
Regarding the high-level comment: at the time I wrote this code, I think we had several kdTree implementations floating around, but hadn't picked one to be our "official" kdTree class (thus I was allowed to go ahead and implement one specifically for GaussianProcess). I don't know if that is still the case, but you're right: picking a "blessed" kdTree implementation is probably something we should do. |
Sorry so slow, @danielsf . Didn't realize you had completed these all 9 days ago. I'm happy to merge now. |
This is the minimal change needed to fix the intermittent failure that spawned this ticket. selfInterpolate() was not noticing if users passed in a negative index (which makes no sense) or asked for nData nearest neighbors (not possible with selfInterpolate, since selfInterpolate ignores the point at which interpolation is requested when assembling the list of neighbors (so nData-1 is the largest number of neighbors you can request).
This is in GaussianProcess.cc. If you ask the KdTree to find zero, less than zero, or greater than _pts nearest neighbors, a RuntimeError will be thrown.
so that, if adding a point of improper size, the kdTree exception gets raised before any other work is done
specifically, test that, if you pass a single function value to a GaussianProcess on many functions, you get an exception
This is necessary because, if _min and _max are used, the point gets mapped onto a renormalized parameter space before it is added to the KDTree. This mapping already assumes that the point is of the correct dimensionality.
This reverts commit 0fdee37.
(i.e. access the points and function values on which the GaussianProcess is built)
there were some try/except blocks in testGaussianProcess.py that would have prevented the test from failing in the event an unexpected exception was raised. These have been removed.
There was a self.assertRaises() test that was marked as segFaulting. It no longer segFaults, due to the more careful input checking implemented in this ticket. I have restored the test.
in both GaussianProcess and KdTree
by checking the contents of the exception message
4ed7bce
to
06e4eb9
Compare
No description provided.