New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added to quantize_evecs: multiprocessing and v #1097

Closed
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@xjhc

xjhc commented Jul 16, 2016

Added a multiprocessing option and the ability to choose which eigenvector to use as primary vector for fiber tracking.

Added to quantize_evecs: multiprocessing and v
Added a multiprocessing option and the ability to choose which eigenvector to use as primary vector for fiber tracking.
@coveralls

This comment has been minimized.

coveralls commented Jul 16, 2016

Coverage Status

Coverage decreased (-0.1%) to 82.797% when pulling 747a57f on xjhc:quantize_evecs-mp_v into 64ed68d on nipy:master.

@codecov-io

This comment has been minimized.

codecov-io commented Jul 16, 2016

Current coverage is 80.86% (diff: 84.94%)

Merging #1097 into master will increase coverage by 0.01%

@@             master      #1097   diff @@
==========================================
  Files           200        201     +1   
  Lines         23023      23110    +87   
  Methods           0          0          
  Messages          0          0          
  Branches       2309       2316     +7   
==========================================
+ Hits          18614      18687    +73   
- Misses         3933       3944    +11   
- Partials        476        479     +3   

Powered by Codecov. Last update 64ed68d...e7e3b00

@coveralls

This comment has been minimized.

coveralls commented Jul 16, 2016

Coverage Status

Coverage decreased (-0.1%) to 82.797% when pulling 012d91a on xjhc:quantize_evecs-mp_v into 64ed68d on nipy:master.

@arokem

This comment has been minimized.

Member

arokem commented Jul 17, 2016

Interesting. Do you have any idea whether this actually improves performance in practice?

@xjhc

This comment has been minimized.

xjhc commented Jul 17, 2016

Paralleling over N cores gives roughly a N-fold speedup for a test dataset of size 1.5 GB. Because the original implementation also reshapes the array (for vectorization), there is very little additional overhead.

This allows processing of 210 GB array of eigenvectors (i.e. 70 GB array of the primary eigenvector) within ~2 hours on Intel Xeon E5-2667 v2 with 20 x SSD on RAID 10.

Also, the ability to choose the primary eigenvector is useful for studies like doi: 10.1016/j.neuroimage.2015.01.061, doi:10.1016/j.cell.2016.05.010 and doi: 10.1117/1.JBO.20.3.036003, where the eigenvector associated with the smallest eigenvalue is actually the primary eigenvector for tracking.

@arokem

This comment has been minimized.

Member

arokem commented Jul 18, 2016

OK - sounds like there is a clear use case for this. I do have a couple of comments about the code. The first is that we should add some tests for this functionality. The other is a suggestion: instead of having two different variables for nbr_processes and parallel, would it make sense to have just one variable nbr_processes? Set it to 1 per default, which means no parallelism, -1 means as many processes as possible (the return value of cpu_count), etc. Does that make sense, or am I missing something?

@arokem

This comment has been minimized.

Member

arokem commented Jul 28, 2016

@xjhc : any more thoughts here? Have you had a chance to write tests for this functionality?

@xjhc

This comment has been minimized.

xjhc commented Jul 28, 2016

Thanks for the mention - I hadn't checked in on Github lately.

And thanks for the insightful suggestions.

  • I'll add the tests (in accordance with other tests in this project); I really should've done this before.
  • I'll make the update for the nbr_processes.
@xjhc

This comment has been minimized.

xjhc commented Aug 4, 2016

Just an update - I finished the tasks, but haven't had the opportunity to test it out yet. I'll have access to our workstation in two days, and I hope to get it done then. Thanks for your patience!

@arokem

This comment has been minimized.

Member

arokem commented Aug 7, 2016

OK - great. Let us know when you have a chance to try this out, and to push the changes here for review.

Combined variables and added tests
- Instead of having two different variables for nbr_processes and
parallel, just use nbr_processes.
- Added tests for quantize_evecs and for new features of quantize_evecs
@xjhc

This comment has been minimized.

xjhc commented Aug 7, 2016

Everything's good to go.

@coveralls

This comment has been minimized.

coveralls commented Aug 7, 2016

Coverage Status

Coverage increased (+0.02%) to 82.925% when pulling e7e3b00 on xjhc:quantize_evecs-mp_v into 64ed68d on nipy:master.

@arokem

This comment has been minimized.

Member

arokem commented Sep 29, 2016

Sorry for the long gap here. Could you please run a PEP8 checker on the tests? There are a lot of missing white-space in that file. We've generally been following a convention where tests of a module are all in one file, in this case in test_dti. Would you mind moving the tests into that file? Thanks again.

@arokem

This comment has been minimized.

Member

arokem commented Oct 21, 2016

Pinging @xjhc : Have you had a chance to run a PEP8 checker on this? Thanks!

@arokem

This comment has been minimized.

Member

arokem commented Dec 18, 2016

This is superseded by #1164

@arokem arokem closed this Dec 18, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment