
LiFE #419

Merged: 62 commits, Dec 10, 2014

Conversation

@arokem (Contributor) commented Sep 17, 2014:

This is just to let you know that I am working on an implementation of the Linear Fascicle Evaluation algorithm, described in our recent paper: http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3098.html

There might be an opportunity to speed things up using Cython. In particular, I am thinking about ways to speed up both sl_signal and voxel2fiber, and any suggestions on that are most welcome.
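
For illustration only (this toy stand-in is not the actual voxel2fiber code), one avenue worth checking before reaching for Cython is whether the per-point bookkeeping can be vectorized with NumPy:

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy stand-in for voxel2fiber bookkeeping: ~200k streamline points,
    # each assigned to one of 1000 flattened voxel indices.
    vox_idx = rng.integers(0, 1000, size=200_000)

    # Pure-Python loop: the kind of hotspot Cython is good at removing.
    counts_loop = np.zeros(1000, dtype=np.int64)
    for v in vox_idx:
        counts_loop[v] += 1

    # Vectorized alternative; often fast enough to make Cython unnecessary.
    counts_vec = np.bincount(vox_idx, minlength=1000)

    assert np.array_equal(counts_loop, counts_vec)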

@Garyfallidis (Contributor) commented:

Let me know when this will be ready to look at. Thx!

@arokem (Contributor, author) commented Oct 4, 2014:

Almost ready, I think. I also need to make a compelling example. For this, do we have the data from which track300.trk was made? Or should I import one of the tracking examples and go from there?

@Garyfallidis (Contributor) commented:

Nope, you will need to upload new data and bundles/streamlines for your tutorial.

@Garyfallidis (Contributor) commented:

Maybe create some streamlines from the Stanford data and then select a bundle that you like? Can you put both the streamlines and any created bundles online (with the fetchers etc.)? That could be useful for other projects too.
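
For reference, a minimal sketch of the fetch step, assuming dipy's Stanford HARDI fetchers (fetch_stanford_hardi and read_stanford_hardi in dipy.data); the tracking step that turns this into streamlines is omitted here:

    from dipy.data import fetch_stanford_hardi, read_stanford_hardi

    fetch_stanford_hardi()             # downloads once, caches under ~/.dipy
    img, gtab = read_stanford_hardi()  # nibabel image plus gradient table
    data = img.get_fdata()             # 4D array: x, y, z, diffusion directions
    print(data.shape, gtab.bvals.shape)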

@arokem (Contributor, author) commented Oct 4, 2014:

Got it. Working on it. I am thinking of importing from one of the other tracking tutorials (e.g. tracking_eudx_odf.py) and using these tracks.


@Garyfallidis (Contributor) commented:

It all depends on how fast you want your tutorial to run. If you want it to run quickly, it will be easier to fetch the streamlines you need.

@arokem (Contributor, author) commented Oct 4, 2014:

tracking_eudx_odf runs really fast, especially if you've already run the DTI example beforehand, so I might just use the tracks that are generated in that example.


@Garyfallidis (Contributor) commented:

ok

@arokem (Contributor, author) commented Oct 5, 2014:

Alright - open season! This PR is now open for comments!


arokem force-pushed the life branch 2 times, most recently from 43d5524 to 3ab746a, on October 7, 2014, 16:05.
arokem changed the title from "WIP: Life" to "LiFE" on Oct 7, 2014.
arokem force-pushed the life branch 8 times, most recently from 92cf409 to 10dcf5f, on October 12, 2014, 19:19.
@Garyfallidis (Contributor) commented:

Hi @arokem, when I run the life tutorial I am getting this error:

ValueError                                Traceback (most recent call last)
/usr/lib/python2.7/dist-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
    202         else:
    203             filename = fname
--> 204         __builtin__.execfile(filename, *where)

/home/eleftherios/Devel/dipy/doc/examples/life.py in <module>()
    228 """
    229
--> 230 sum_signals = np.asarray(FF.life_matrix.sum(-1)).squeeze()
    231 tracks_per_voxel = np.asarray(FF.life_matrix.astype(bool).sum(axis=-1)).squeeze()
    232

/usr/local/lib/python2.7/dist-packages/scipy/sparse/compressed.pyc in sum(self, axis)
    546             return self.data.sum()
    547         else:
--> 548             return spmatrix.sum(self, axis)
    549         raise ValueError("axis out of bounds")
    550

/usr/local/lib/python2.7/dist-packages/scipy/sparse/base.pyc in sum(self, axis)
    627             return (self * np.asmatrix(np.ones((n, 1), dtype=res_dtype))).sum()
    628         else:
--> 629             raise ValueError("axis out of bounds")
    630
    631     def mean(self, axis=None):

ValueError: axis out of bounds
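
The failure comes from older SciPy versions rejecting negative axes in sparse-matrix sum (NumPy arrays accept axis=-1; scipy.sparse at 0.13 did not). A small sketch of the portable spelling:

    import numpy as np
    from scipy.sparse import csr_matrix

    m = csr_matrix(np.arange(12).reshape(3, 4))

    # m.sum(-1) raised "axis out of bounds" on scipy 0.13; the positive
    # axis is equivalent for a 2-D sparse matrix and works everywhere.
    row_sums = np.asarray(m.sum(axis=1)).squeeze()
    print(row_sums)  # [ 6 22 38]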

@arokem (Contributor, author) commented Oct 13, 2014:

Do all the tests run fine for you?


@Garyfallidis (Contributor) commented:

Yes, they do actually.

@Garyfallidis (Contributor) commented:

NumPy 1.8.2 and SciPy 0.13.3.

@arokem (Contributor, author) commented Oct 13, 2014:

OK - this part of the example is not that important. I've actually been thinking about rewriting it anyway. It makes more sense to compare model performance against a less contrived baseline, for example a naive model that says the signal is the mean of all the measurements across directions.
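
A minimal sketch of that baseline on made-up numbers (the data here is purely illustrative): predict every direction by the mean across directions, and score a model against that error.

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy signal in one voxel across 64 diffusion-weighted directions.
    signal = 100 + 10 * rng.standard_normal(64)

    # Naive baseline: the prediction for every direction is the mean signal.
    baseline = np.full_like(signal, signal.mean())
    rmse = np.sqrt(np.mean((signal - baseline) ** 2))
    print(rmse)  # a model should beat this error to be worth its parameters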


An inline review comment was attached to these import lines:

import scipy.linalg as la

from dipy.reconst.base import ReconstModel, ReconstFit
from dipy.core.onetime import ResetMixin, auto_attr
Review comment (Contributor):

Both imported but unused.

Reply (Contributor, author):

Thanks for noticing. Fixed!

@francopestilli commented:

Hi Ariel, can something like this help: http://stackoverflow.com/questions/11784329/python-memory-usage-of-numpy-arrays

@arokem (Contributor, author) commented Dec 8, 2014:

Thanks - that's quite useful. I think that the tricky bit is profiling the internal variables of the gradient descent algorithm.

This should also be helpful: https://pypi.python.org/pypi/memory_profiler
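
The approach from the Stack Overflow link boils down to the nbytes attribute; for a CSR sparse matrix like the LiFE matrix, the footprint is the sum of its three underlying arrays. A sketch:

    import numpy as np
    from scipy.sparse import csr_matrix

    a = np.zeros((1000, 1000))
    print(a.nbytes)  # 8000000 bytes: 10**6 float64 entries, 8 bytes each

    m = csr_matrix(a)
    # CSR storage is three plain arrays, so sum their nbytes.
    print(m.data.nbytes + m.indices.nbytes + m.indptr.nbytes)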


@arokem (Contributor, author) commented Dec 8, 2014:

OK - I lied, here's what I did on a bus ride through the East Quebec wilderness:

http://nbviewer.ipython.org/gist/arokem/1f3529f967f334af74b7

The plot at the bottom is a worst-case scenario: a linear extrapolation from the data, based on the approximately 10k streamlines that I could easily run this for. Let me know whether you think this is an appropriate analysis.

@arokem (Contributor, author) commented Dec 9, 2014:

Oh, the dangers of extrapolating from limited data!

Here's an updated analysis of this: http://nbviewer.ipython.org/gist/arokem/f91d436af3f0d3084af4

Looks like the memory usage is bounded by a fixed factor of the data size: in this case no more than 4.5 GB, though it could grow if the ROI within which the streamlines are defined grows larger.

@Garyfallidis (Contributor) commented:

Okay, great. Now we have a better idea of what we can improve in the future. For example, the maps that take up space could perhaps be saved as memmaps, if that doesn't reduce performance. And...
merged!!!!!
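
A hedged sketch of that memmap idea (the filename and shape below are hypothetical): numpy.memmap backs an array with a file on disk, so it does not have to be held in RAM all at once.

    import numpy as np

    # Hypothetical file and shape; writes go through the OS page cache to disk.
    mm = np.memmap("life_matrix.dat", dtype=np.float64, mode="w+",
                   shape=(10_000, 1_000))
    mm[0, :] = 1.0
    mm.flush()  # make sure the data hits the file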

Garyfallidis added a commit that referenced this pull request on Dec 10, 2014.
Garyfallidis merged commit 023f476 into dipy:master on Dec 10, 2014.
@Garyfallidis (Contributor) commented:

Congrats!

@arokem (Contributor, author) commented Dec 10, 2014:

Sweet!

