
Affine registration PR 3/3 #654

Merged
merged 47 commits on Jul 26, 2015

Conversation

omarocegueda
Contributor

This is the last PR of the affine registration series. It also includes a tutorial explaining how to do an ANTS2-like affine registration. The following graph shows the performance of this implementation compared to ANTS2, using sparse sampling (30% of the voxels).
[image: accuracy comparison of this implementation vs. ANTS2 with sparse sampling]

EDIT: we now have the same accuracy as ANTS2 on this dataset (image updated). Our implementation is also a bit faster than before (still a bit slower than ANTS2); the updated times are:

(sparse means using 30% of the voxels for PDF estimation)
ANTS2 sparse: 11.0 minutes
Dipy sparse: 12.0 minutes
FLIRT: 4.41 minutes
Nipy: 49 seconds

@@ -96,310 +97,6 @@ def get_direction_and_spacings(affine, dim):
A = affine[:dim,:dim]
return A.dot(np.diag(1.0/scalings)), scalings


class ScaleSpace(object):
Contributor Author

I moved the ScaleSpace class to its own module because now we have two different ways of building a scale space (the ANTS-like anisotropic scale space and the ANTS2-like isotropic one). The new scale space is IsotropicScaleSpace.

@arokem
Contributor

arokem commented May 25, 2015

Seems to currently be failing on the minimal scipy requirement (0.9), because the setting of inputs to the optimization has changed since:

https://travis-ci.org/nipy/dipy/jobs/63966654#L2007

Any way we can provide some kind of compatibility shim in dipy.core.optimize to help with that?

@arokem
Contributor

arokem commented May 25, 2015

Any chance to increase test coverage in imaffine? It's currently at 90% (which is great), but I suspect that some non-default settings are not getting exercised at all.

@omarocegueda
Contributor Author

Hi @Garyfallidis, @arokem, @matthew-brett,
I think this is a bug in dipy.core.optimize for scipy<=0.12:
https://github.com/nipy/dipy/blob/master/dipy/core/optimize.py#L169
According to this:
http://docs.scipy.org/doc/scipy-0.9.0/reference/generated/scipy.optimize.fmin_l_bfgs_b.html#scipy.optimize.fmin_l_bfgs_b
this function expects fprime as its third positional argument, but is given the tuple args instead, which is not callable and causes this error when the optimizer attempts to evaluate the gradient:
https://travis-ci.org/nipy/dipy/jobs/63966654#L2021
As a quick fix, I am simply passing jac:
omarocegueda@6dbcd4a#diff-6b28d8cc8891944f1a6adca73299277aR154
but I think this may need a bit more thought about what's a good way to keep compatibility with scipy<=0.12. What do you think?
Thanks! =)
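One possible shape for such a shim, sketched purely for illustration (the function name and return convention here are hypothetical, not dipy.core.optimize's actual API): dispatch on the installed SciPy version so that the args tuple never lands in the positional fprime slot of the legacy interface.

```python
# Hypothetical sketch of a SciPy-version compatibility shim for L-BFGS-B.
# For scipy <= 0.12 we would use the legacy fmin_l_bfgs_b signature, where
# the gradient callable must be passed as fprime; newer SciPy exposes
# scipy.optimize.minimize, which takes the gradient via jac=.
def build_lbfgsb_call(scipy_version, func, x0, jac=None, args=()):
    major, minor = (int(p) for p in scipy_version.split('.')[:2])
    if (major, minor) <= (0, 12):
        # Legacy API: gradient goes in the fprime slot, args by keyword,
        # so args can never be mistaken for a callable.
        return ('fmin_l_bfgs_b', {'func': func, 'x0': x0,
                                  'fprime': jac, 'args': args})
    # Modern API: pass the gradient through the jac keyword.
    return ('minimize', {'fun': func, 'x0': x0, 'jac': jac,
                         'args': args, 'method': 'L-BFGS-B'})
```

The caller would then unpack the returned name and keyword dict to invoke the right scipy entry point; the point of the sketch is only that the version check happens in one place.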

@arokem
Contributor

arokem commented Jul 27, 2015

Oh durnit - I relaunched that run, and now it's working.

It was failing on this assertion:

https://github.com/nipy/dipy/blob/master/dipy/align/tests/test_imaffine.py#L307

But only on python 3.3 (?!). Let's hope that was just a fluke.

On Mon, Jul 27, 2015, Omar Ocegueda wrote:

Hi @arokem,
I don't see the failure, is that the right link?

@omarocegueda
Contributor Author

Thanks for reporting that! This kind of weird failure usually indicates a memory access violation or an uninitialized buffer. I'll double check.

@matthew-brett
Contributor

No need to revert, we're just discussing the details of the inheritance, and I doubt anyone will start using that stuff in the next week or so.

@omarocegueda
Contributor Author

Hi @arokem,

Oh durnit - I relaunched that run, and now it's working.

Marc just reported the same failure in Python 3.4. I am investigating but since this is already in master, do you think it is preferable to skip the test for now and open an issue so it stops breaking the tests (I can re-activate the test after I figure out what the issue is about)?

@arokem
Contributor

arokem commented Jul 29, 2015

Yeah - I saw that. I am somewhat reluctant to remove this from our tests,
because it's better to have a constant reminder of this, than to forget
about it in the shuffle of doing other things... Do you have a clear path
forward to fix this? Any idea why this didn't show up during the review on
this PR? I haven't looked closely at the code that leads to this yet.


@omarocegueda
Contributor Author

It's hard to reproduce (that's what I am trying to do right now); I think that's why it didn't show up during review. It works correctly on my Ubuntu and Windows 8.1 machines, and Travis didn't complain either.

@arokem
Contributor

arokem commented Jul 29, 2015

The worst kind... I will also give it a try.

@matthew-brett
Contributor

I see this now too:

http://nipy.bic.berkeley.edu/builders/dipy-py2.6-32/builds/555/steps/shell_6/logs/stdio

dipy.align.tests.test_imaffine.test_align_origins_3d ... ok
dipy.align.tests.test_imaffine.test_affreg_all_transforms ... 
command timed out: 1200 seconds without output, attempting to kill
process killed by signal 9

Maybe that would be a good machine to start on?

@arokem
Contributor

arokem commented Jul 29, 2015

@matthew-brett - is that really the same issue? Superficially it looks like another test than this failure. Not saying we don't need to fix this one as well - just trying to understand.

@matthew-brett
Contributor

Actually, even better, it's failing in a similar way on OSX, with a segfault:

http://nipy.bic.berkeley.edu/builders/dipy-py2.7-osx-10.8/builds/377/steps/shell_6/logs/stdio

That machine is much faster.

Same on my laptop:

$ nosetests dipy/align/tests/test_imaffine.py:test_affreg_all_transforms
nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
dipy.align.tests.test_imaffine.test_affreg_all_transforms ... Segmentation fault: 11

@matthew-brett
Contributor

Ariel - I'm guessing there's some memory access violation somewhere, maybe causing both problems. I'm just recommending this one as a place to start because it's easier to reproduce, hoping that it will lead us to a fix for both issues.

@arokem
Contributor

arokem commented Jul 29, 2015

Actually - I get the test_mi_gradient failure on my laptop as well. At least for one run out of one. Now trying to see if it's an intermittent thing.


@omarocegueda
Contributor Author

Oh! That's awesome! @matthew-brett, when you say "maybe that would be a good machine to start on", do you mean using a try-branch for debugging, or is there a way to get access to the actual machine?

@arokem
Contributor

arokem commented Jul 29, 2015

Sadly, intermittent: one for two.

@matthew-brett
Contributor

I can't give you access to the '10.8' machine (actually, it's a 10.10 machine now), but I can give you access to the 10.7 machine that is failing in the same way. I'm afraid that machine is in really bad shape and running very slowly. You should now have access via your ssh-rsa key for jomaroceguedag@gmail.com: ssh buildslave@169.229.158.3. If that gets too painful, we can ask Min, who owns the faster machine, if he will give you access.

@matthew-brett
Contributor

That machine already had the code compiled. To get going using the virtualenv that the tests used:

$ cd osx-10.7/dipy-py2_7-osx-10_7/build/
$ source venv/bin/activate

@omarocegueda
Contributor Author

Thanks @matthew-brett! I just activated the venv and I am now trying to build with make ext from
/Users/buildslave/osx-10.7/dipy-py2_7-osx-10_7/source
but gcc fails with:
lipo: can't figure out the architecture type of: /var/folders/8n/t5rvqnld23n0mn_lbhkgyg4w0000gt/T//ccvZTm15.out error: command 'gcc-4.2' failed with exit status 1

Is that the right thing to do?

@matthew-brett
Contributor

You can get the parameters that buildbot runs with from looking at the file /Library/LaunchDaemons/edu.berkeley.bic.pimba.osx-10.7.plist. In this case I think you need:

export CC=clang
export MACOSX_DEPLOYMENT_TARGET=10.6
make ext

@omarocegueda
Contributor Author

Awesome! It's compiling now =)

@omarocegueda
Contributor Author

Hi @matthew-brett, @arokem,
The segmentation fault is caused by the interpolation functions, e.g.:
https://github.com/nipy/dipy/blob/master/dipy/align/vector_fields.pyx#L201
Here, I am assuming that the coordinates are valid floating point values (not nan); infinity is handled correctly. The problem is that at some point the optimizer requests the metric to be evaluated with a ScaleTransform whose parameter is nan, which creates an affine containing nans. When we ask numpy to invert this matrix with np.linalg.inv, numpy returns a matrix full of nans instead of raising an exception. Since no exception is raised, I assume that it's a valid transform and attempt to transform the moving image with the provided matrix, which calls the interpolation function with nan coordinates.
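The numpy behavior described above can be demonstrated in isolation (a quick standalone snippet, not dipy code):

```python
import numpy as np

# Inverting a matrix full of nans does not raise LinAlgError: the
# singularity check only trips on an exactly-zero pivot, so the nans
# propagate silently and the caller gets a nan-filled "inverse" back.
bad_affine = np.full((4, 4), np.nan)
inverse = np.linalg.inv(bad_affine)
print(np.isnan(inverse).all())
```

Any code that treats a non-raising np.linalg.inv call as proof of a valid matrix will therefore happily pass nans downstream.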

We have three options here:

  1. Add comparisons of the form "if x != x: return" to all interpolation functions, which will add some overhead to the whole registration module because these functions are called once per voxel each time we interpolate. I will need to actually measure how large this overhead is.
  2. Check that the matrix does not contain nans at the beginning of all the warping functions. The overhead will likely be much lower than (1).
  3. Raise an exception from imaffine when the requested parameter is invalid. This is a very simple, 2-line fix, but we will still assume that the input arguments to the module are valid, and will probably get the "segfault" again if the user provides an invalid matrix or displacement field.

I personally would vote for (2). What do you think?

@arokem
Contributor

arokem commented Jul 29, 2015

Yes, unless I am missing something, 2 definitely sounds like the best way to go about this.

@matthew-brett
Contributor

How about doing 2 and 3? Is there any reason not to raise the error for option 3?

@arokem
Contributor

arokem commented Jul 29, 2015

P.S. Does this explain why the error is intermittent? With the random seed
fixed at the beginning of this test, that's not obvious to me.


@omarocegueda
Contributor Author

Oh! Of course! We need both 2 and 3: raise the error from the warping functions, then catch it in imaffine and send inf to the optimizer when it attempts to evaluate the metric at an invalid point. We don't want the error to propagate to the user, because we want the optimizer to continue exploring even when it evaluates the metric at an invalid point.

@arokem, I'm not sure if this explains the intermittent behavior. I think it depends on what happens when we attempt to access an array at index 'nan'. Let me investigate a bit more about that, but for now this should also fix this memory error:
http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/138/steps/shell_8/logs/stdio
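Combining (2) and (3) could be sketched as follows; all names here are hypothetical stand-ins for the real imaffine machinery:

```python
import numpy as np

def metric_value_at(params, params_to_affine, distance):
    """Evaluate the cost at `params`, returning +inf for invalid points.

    `params_to_affine` builds the candidate matrix and `distance` is the
    actual similarity metric; both are illustrative stand-ins. Screening
    the matrix for nans (option 2) lets us reject the point before any
    per-voxel work, and returning +inf instead of raising lets the
    optimizer back off and keep exploring.
    """
    affine = np.asarray(params_to_affine(params), dtype=np.float64)
    if np.isnan(affine).any():
        return np.inf  # invalid parameters: report the worst possible cost
    return distance(affine)
```

With this wrapper, a line search that steps into a nan-producing parameter simply sees an infinitely bad cost value rather than crashing in the interpolation code.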

@omarocegueda
Contributor Author

what happens when we attempt to access an array at index 'nan'

Sorry, I meant: access an array at the index that results from converting nan to an integer (the resulting index might sometimes yield a valid memory address and sometimes not)...

@Garyfallidis
Contributor

Hello @omarocegueda, I noticed that you haven't renamed the following functions in what is now in master:
In [1]: from dipy.align.imaffine import align_
align_centers_of_mass align_geometric_centers align_origins

Can you please use transform_X instead, as we discussed in this PR?
Thx in advance.

@coveralls

Coverage Status

Changes Unknown when pulling 741e3c0 on omarocegueda:imaffine_pr3 into nipy:master.
