Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-7549: GaussianProcess updates #96

Merged
merged 65 commits into from Oct 19, 2016
Merged

DM-7549: GaussianProcess updates #96

merged 65 commits into from Oct 19, 2016

Conversation

timj
Copy link
Member

@timj timj commented Sep 16, 2016

No description provided.

data = np.zeros((pp, dd), dtype=float)
tol = 1.0e-10

f = open(os.path.join(testPath, "data", "kd_test_data.sav"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use with open so you don't need the close?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

data[i][j] = float(s[j])

kd = gp.KdTreeD()
try:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of these try/catch blocks? Are you expecting the pex.Exception to happen, or are you trying to intercept an error that should not happen and print it out? If the former, why not use the more standard with self.assertRaises, if the latter, shouldn't this be a test failure? If it's something non-standard then there should be a comment explaining it before the first try.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the latter. I have removed the try/except blocks.

@danielsf
Copy link
Contributor

This pull request adds tests to all of the user-facing methods in GaussianProcess.cc. These tests ensure (or ought to) that exceptions are thrown if users pass in invalid inputs.

I also took the opportunity to flesh-out testGaussianProcess.py so that the functionality of GaussianProcess.cc is tested more methodically.

I added some methods to the GaussianProcess class to allow users to inspect its contents (specifically methods to return the data points and function values underlying the GaussianProcess).

@danielsf
Copy link
Contributor

danielsf commented Oct 5, 2016

@jmeyers314 Just wanted to make sure you saw this

Copy link
Contributor

@jmeyers314 jmeyers314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @danielsf .

Looks good to me, just a few small comments. My one high-level comment, which may be outside the scope of this issue, is that the kdtree class here looks like it might be useful beyond just gaussian process interpolation. Should it be moved somewhere more generally visible/accessible in the stack?

@@ -603,6 +603,42 @@ class GaussianProcess {
std::shared_ptr< Covariogram<T> > const &covarIn);

/**
* @brief return the number of data points stored in the GaussianProcess
*/
int getPoints() const;
Copy link
Contributor

@jmeyers314 jmeyers314 Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use getNPoints() or something like that perhaps? getPoints sounds too much like getData to me.

I suppose this applies to the private variable _pts as well: -> _npts or something.

with self.assertRaises(pex.Exception):
gg.selfInterpolate(sigma, -1, nData-1)
gg.selfInterpolate(sigma, -1, nData-1)
# the following segfaults, for unknown reasons, so run directly instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still segfault?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I have restored the code block.

with self.assertRaises(pex.Exception):
gg.selfInterpolate(sigma, -1, nData-1)
gg.selfInterpolate(sigma, -1, nData-1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of what made the reported bug difficult to track before was that while most of the time this line did raise an exception, it was actually not the intended exception. Is there anyway within pex.Exception to check that the right exception is being raised? If so, no need to implement that everywhere, but probably a good idea to implement it here where we no there was a problem in the past.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can use unittest's code to check the message that comes with the exception. Probably a good idea.

@danielsf
Copy link
Contributor

danielsf commented Oct 5, 2016

Regarding the high-level comment: at the time I wrote this code, I think we had several kdTree implementations floating around, but hadn't picked one to be our "official" kdTree class (thus I was allowed to go ahead and implement one specifically for GaussianProcess). I don't know if that is still the case, but you're right: picking a "blessed" kdTree implementation is probably something we should do.

@jmeyers314
Copy link
Contributor

Sorry so slow, @danielsf . Didn't realize you had completed these all 9 days ago. I'm happy to merge now.

This is the minimal change needed to fix the intermittent
failure that spawned this ticket.  selfInterpolate() was
not noticing if users passed in a negative index (which
makes no sense) or asked for nData nearest neighbors (not
possible with selfInterpolate, since selfInterpolate ignores
the point at which interpolation is requested when assembling
the list of neighbors (so nData-1 is the largest number of
neighbors you can request).
This is in GaussianProcess.cc.  If you ask the KdTree to find
zero, less than zero, or greater than _pts nearest neighbors,
a RuntimeError will be thrown.
so that, if adding a point of improper size, the kdTree exception
gets raised before any other work is done
specifically, test that, if you pass a single function value to a
GaussianProcess on many functions, you get an exception
This is necessary because, if _min and _max are used,
the point gets mapped onto a renormalized parameter space before
it is added to the KDTree.  This mapping already assumes that the point
is of the correct dimensionality.
(i.e. access the points and function values on which the
GaussianProcess is built)
there were some try/except blocks in testGaussianProcess.py that would
have prevented the test from failing in the event an unexpected exception
was raised.  These have been removed.
There was a self.assertRaises() test that was marked as segFaulting.
It no longer segFaults, due to the more careful input checking implemented
in this ticket.  I have restored the test.
in both GaussianProcess and KdTree
by checking the contents of the exception message
@danielsf danielsf merged commit 06e4eb9 into master Oct 19, 2016
@ktlim ktlim deleted the tickets/DM-7549 branch August 25, 2018 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants