BF: nan entries cause segfault #690

omarocegueda · 2015-07-29T20:32:34Z

This fixes the segmentation fault caused when attempting to interpolate an image at nan entries.

omarocegueda · 2015-07-29T20:35:03Z

dipy/align/imaffine.py

@@ -144,6 +144,8 @@ def set_affine(self, affine):
            self.affine_inv = None
            return
        try:
+            if np.isnan(np.sum(affine)):


Using sum is faster than, e.g., min for finding NaNs, according to this:
http://stackoverflow.com/questions/6736590/fast-check-for-nan-in-numpy

Surely these two lines should go above the try block? I think the np.sum trick is meant to avoid large temporary arrays, but for me a simple np.any(np.isnan(x)) is faster still, and I think it's easier to read.
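For reference, a minimal sketch comparing the two checks discussed above. The helper names and sample arrays are illustrative assumptions, not code from this PR. Apart from readability, the two checks differ in one case: a sum over an array holding both +inf and -inf is itself NaN, so the sum-based check reports a NaN that is not actually there.

```python
import numpy as np


def has_nan_any(a):
    """Element-wise check: True only if some entry is NaN."""
    return bool(np.any(np.isnan(a)))


def has_nan_sum(a):
    """Sum-based check: NaN propagates through the sum.

    Note that inf + (-inf) is also NaN, so this can report a NaN
    for an array that only contains infinities.
    """
    with np.errstate(invalid="ignore"):  # silence the inf - inf warning
        return bool(np.isnan(np.sum(a)))


clean = np.eye(4)                      # a valid affine
broken = np.eye(4)
broken[0, 3] = np.nan                  # an affine with one NaN entry
mixed_inf = np.array([np.inf, -np.inf, 1.0])

print(has_nan_any(clean), has_nan_sum(clean))
print(has_nan_any(broken), has_nan_sum(broken))
print(has_nan_any(mixed_inf), has_nan_sum(mixed_inf))
```

Both helpers agree on the first two arrays; only the third separates them.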

arokem · 2015-07-29T22:11:56Z

Does this actually address all of the errors mentioned in #654?

In particular, one of the things reported there was an error in test_imaffine:test_mi_gradient (for example, see https://travis-ci.org/nipy/dipy/jobs/73233253#L2162), which is not a segfault. Is this addressed through these changes?

omarocegueda · 2015-07-29T23:38:05Z

I don't know if this will solve the other bug, but the segmentation fault occurs in several buildbots:
http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/138/steps/shell_8/logs/stdio
http://nipy.bic.berkeley.edu/builders/dipy-py2.6-32/builds/557/steps/shell_6/logs/stdio
http://nipy.bic.berkeley.edu/builders/dipy-py2.7-osx-10.7/builds/310/steps/shell_6/logs/stdio
http://nipy.bic.berkeley.edu/builders/dipy-py2.7-osx-10.8/builds/378/steps/shell_6/logs/stdio

The other bug appears in this buildbot:
http://nipy.bic.berkeley.edu/builders/dipy-py3.4/builds/192/steps/shell_6/logs/stdio
@matthew-brett, can I have access to that machine too?
Do you guys think it will be better to add both fixes to this PR?

arokem · 2015-07-29T23:41:33Z

OK - just wanted to make sure that I have the full picture.
On Wed, Jul 29, 2015 at 4:38 PM, Omar Ocegueda notifications@github.com wrote:

> I don't know if this will solve the other bug, but the segmentation fault
> occurs in several buildbots:
> http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/138/steps/shell_8/logs/stdio
> http://nipy.bic.berkeley.edu/builders/dipy-py2.6-32/builds/557/steps/shell_6/logs/stdio
> http://nipy.bic.berkeley.edu/builders/dipy-py2.7-osx-10.7/builds/310/steps/shell_6/logs/stdio
> http://nipy.bic.berkeley.edu/builders/dipy-py2.7-osx-10.8/builds/378/steps/shell_6/logs/stdio

Yes - I expect this PR will solve these issues. Have you tried a
try_branch with any of these?

> The other bug appears in this buildbot:
> http://nipy.bic.berkeley.edu/builders/dipy-py3.4/builds/192/steps/shell_6/logs/stdio
> @matthew-brett https://github.com/matthew-brett, can I have access to
> that machine too?
> Do you guys think it will be better to add both fixes to this PR?

Do you have some sense of what is causing it? I don't think it matters much
if this one also gets resolved here. If you think this resolves the
segfaults, I am ready to merge this one, and you can take the other fix on
another PR, but if you prefer to keep going here, that's fine too. Your
call.


omarocegueda · 2015-07-30T00:11:22Z

I didn't run a try_branch, but I tested the fix manually on the buildbot Matthew gave me access to (it didn't break after the fix).

I'm not sure about the other bug. It may be as simple as a precision issue (e.g. the result was something like 0.9989, very close to the assertion value but still failing). I need to reproduce the bug and find out why it fails there and not on other platforms.

matthew-brett · 2015-07-30T13:51:54Z

This is just option 3?

matthew-brett · 2015-07-30T13:53:30Z

Omar - I'll check with the owner of that buildbot machine and the OSX machine - will get back to you.

matthew-brett · 2015-07-30T14:53:54Z

Now I think I see that this is option 2 and 3.

omarocegueda · 2015-07-31T18:55:12Z

Thanks @matthew-brett! Yes, this is options 2 and 3. I just reproduced the bug on the buildbot. The root cause is that iteration over dictionary keys is no longer deterministic in Python 3:
http://stackoverflow.com/questions/14956313/dictionary-ordering-non-deterministic-in-python3

This explains the "intermittent" behavior. The assertion was failing because the inner product between the numeric and analytical gradients was about 0.994. I think it is safe to reduce the threshold to 0.99. For the non-deterministic behavior, I guess the way to go is to replace the dictionary with a list. What do you think?
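As an illustration of the kind of check being discussed, here is a minimal sketch that compares an analytical gradient with a central-difference estimate through their normalized inner product. The objective function, step size, and test point are hypothetical, and it is an assumption here that the inner product in the test is normalized; this is not the code in test_mi_gradient.

```python
import math


def objective(x):
    """Hypothetical smooth function standing in for the similarity metric."""
    return math.sin(x[0]) + x[1] ** 2 + x[0] * x[2]


def analytical_gradient(x):
    """Hand-derived gradient of `objective`."""
    return [math.cos(x[0]) + x[2], 2.0 * x[1], x[0]]


def numerical_gradient(f, x, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def normalized_inner_product(a, b):
    """Cosine of the angle between two gradient vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


x0 = [0.3, -1.2, 0.7]
similarity = normalized_inner_product(
    analytical_gradient(x0), numerical_gradient(objective, x0)
)
# The threshold proposed in this thread
assert similarity > 0.99
```

A value near 1 means the two gradients point in almost the same direction; the threshold only bounds how far apart they may be.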

matthew-brett · 2015-07-31T18:56:33Z

Where is the iteration-over-dictionary code?

omarocegueda · 2015-07-31T18:58:35Z

Here is the dictionary:
https://github.com/nipy/dipy/blob/master/dipy/align/tests/test_imaffine.py#L25
and here is the loop that is failing:
https://github.com/nipy/dipy/blob/master/dipy/align/tests/test_imaffine.py#L268

matthew-brett · 2015-07-31T19:02:26Z

How about for ttype in sorted(factors) to make it deterministic?

omarocegueda · 2015-07-31T19:03:34Z

Sure! That's fine too. I'll do the fix.

omarocegueda · 2015-07-31T19:30:15Z

Alright! This now addresses both failures (checked directly on the failing buildbots).

matthew-brett · 2015-07-31T19:44:50Z

dipy/align/tests/test_imaffine.py

@@ -172,7 +172,7 @@ def test_align_origins_3d():

 def test_affreg_all_transforms():
    # Test affine registration using all transforms with typical settings
-    for ttype in factors.keys():
+    for ttype in sorted(factors):


Could you add a comment explaining why the factor keys must be sorted (to preserve the relationship between the random numbers and the dict keys / values)? Ditto for the other instances.

Comments added. Actually only one of the tests needed the sort, but I think it is ok to still sort the keys in the other three places, just in case we extend the tests in the future.
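A minimal sketch of why sorting matters when random numbers are drawn inside the loop. The dictionary contents and the helper are hypothetical stand-ins, not the test module's actual factors. With a fixed seed, the i-th random draw goes to whichever key is visited i-th, so the pairing between keys and draws is reproducible only if the visiting order is. On the Python 3 versions in use at the time of this PR, string hashing was randomized per process (the default since Python 3.3), so plain dict iteration order could change between runs; since Python 3.7 dicts preserve insertion order, and sorting still protects against changes in how the dict is built.

```python
import random

# Hypothetical stand-in for the test module's `factors` dictionary
factors = {"TRANSLATION": 1.0, "RIGID": 0.5, "SCALING": 0.25, "AFFINE": 0.1}


def draw_per_key(mapping, seed=0):
    """Assign one seeded random draw to each key, visiting keys in sorted order."""
    rng = random.Random(seed)
    draws = {}
    for key in sorted(mapping):  # sorted -> same visiting order on every run
        draws[key] = rng.random() * mapping[key]
    return draws


# Same content, different insertion order: the pairing of keys to draws
# is unchanged because iteration goes through sorted().
shuffled = dict(reversed(list(factors.items())))
assert draw_per_key(factors) == draw_per_key(shuffled)
```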

matthew-brett · 2015-08-01T08:45:15Z

There seems to be a new merge commit here - 33451ca - introducing lots of changes not relevant to this PR?

omarocegueda · 2015-08-01T14:42:40Z

Right, now I see what happened... fortunately there is cherry-pick!
Thanks!

matthew-brett · 2015-08-01T16:16:06Z

dipy/align/imaffine.py

@@ -143,6 +143,8 @@ def set_affine(self, affine):
        if self.affine is None:
            self.affine_inv = None


Sorry to make small suggestions not relevant to this PR, but how about putting this line after self.affine = affine, so that self.affine_inv is always defined (as None or an affine) even if the inverse raises an error? Otherwise it keeps whatever value it had before, which could be confusing. It is probably also worth noting in the docstring that the method sets self.affine_inv.

Actually, don't worry, I'll do a PR for that.
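The suggestion above can be sketched as follows. This is a hypothetical stand-in class, not dipy's implementation: the class name and the use of ValueError are assumptions made for illustration. The point is only the ordering: the inverse is reset before anything can raise, so a failed call never leaves a stale inverse behind.

```python
import numpy as np


class AffineHolder:
    """Hypothetical stand-in used only to illustrate the ordering."""

    def __init__(self):
        self.affine = None
        self.affine_inv = None

    def set_affine(self, affine):
        """Set `affine` and, as a side effect, `affine_inv`."""
        self.affine = affine
        # Reset first: if validation or inversion fails below,
        # affine_inv is None rather than the previous inverse.
        self.affine_inv = None
        if affine is None:
            return
        matrix = np.asarray(affine, dtype=float)
        # Reject NaN entries before attempting the inversion
        if np.any(np.isnan(matrix)):
            raise ValueError("affine contains NaN entries")
        try:
            self.affine_inv = np.linalg.inv(matrix)
        except np.linalg.LinAlgError as exc:
            raise ValueError("affine cannot be inverted") from exc


holder = AffineHolder()
holder.set_affine(np.eye(4))       # valid: inverse is stored

bad = np.eye(4)
bad[0, 0] = np.nan
try:
    holder.set_affine(bad)         # invalid: raises, inverse is reset
except ValueError:
    pass
print(holder.affine_inv)
```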

matthew-brett · 2015-08-01T17:09:51Z

Great - thanks for fixing - merging now.

MRG: fix bugs in affine registration

NaN entries in affines causing segfault on some platforms. Relaxing similarity threshold for random number tests.

omarocegueda added 4 commits August 1, 2015 09:12

BF: handle nan entries

061fdaf

BF: iterate over dictionary keys deterministically.

8a348fc

BF: use np.any(np.isnan(.)) instead of np.isnan(np.sum())

0f5d547

DOC: explain why it is convenient to sort dictionary entries

8bf1318

omarocegueda force-pushed the fix_nans branch from c4c5850 to 8bf1318 Compare August 1, 2015 14:24

matthew-brett added a commit that referenced this pull request Aug 1, 2015

Merge pull request #690 from omarocegueda/fix_nans

4943433

matthew-brett merged commit 4943433 into dipy:master Aug 1, 2015

matthew-brett mentioned this pull request Aug 1, 2015

WIP: affine map tests #693

Closed
