New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

local PCA using SVD #1223

Merged
merged 59 commits into from Jun 21, 2017

Conversation

Projects
None yet
8 participants
@arokem
Member

arokem commented Apr 14, 2017

This extends #1108 and supersedes it, addressing some of the issue raised in the review of that previous PR.

  1. Following a comment by @samuelstjean (https://github.com/nipy/dipy/pull/1108/files#r81179260), I implemented the PCA using the SVD, instead of decomposing the covariance matrix. This is slower by about 50%, when done using scipy.linalg.svd, but as suggested there, using the direct lapack API performs as fast as the covariance matrix decomposition approach. I am worried that this will not work with older versions of scipy, but let's see what Travis says.

  2. I fixed a bunch of other things, typos, structure of the code overall (the whole thing was wrapped in a bit if...else structure for error handling, which I moved to the top).

@arokem arokem force-pushed the arokem:localpca_slow-svd branch from e09df39 to 7654e1a Apr 14, 2017

@codecov-io

This comment has been minimized.

codecov-io commented Apr 14, 2017

Codecov Report

Merging #1223 into master will increase coverage by 0.09%.
The diff coverage is 97.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1223      +/-   ##
==========================================
+ Coverage   87.05%   87.14%   +0.09%     
==========================================
  Files         226      228       +2     
  Lines       28261    28494     +233     
  Branches     3026     3043      +17     
==========================================
+ Hits        24602    24831     +229     
- Misses       2976     2980       +4     
  Partials      683      683
Impacted Files Coverage Δ
dipy/denoise/tests/test_noise_estimate.py 100% <100%> (ø) ⬆️
dipy/denoise/localpca.py 95.08% <95.08%> (ø)
dipy/denoise/tests/test_lpca.py 98.6% <98.6%> (ø)
dipy/core/graph.py 75% <0%> (+1.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 01135ee...34474a9. Read the comment docs.

@coveralls

This comment has been minimized.

coveralls commented Apr 15, 2017

Coverage Status

Coverage increased (+0.07%) to 88.491% when pulling 7654e1a on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 15, 2017

Coverage Status

Coverage increased (+0.06%) to 88.485% when pulling 6016102 on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 15, 2017

Coverage Status

Coverage increased (+0.06%) to 88.485% when pulling 6016102 on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 15, 2017

Coverage Status

Coverage increased (+0.06%) to 88.485% when pulling 6016102 on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@arokem

This comment has been minimized.

Member

arokem commented Apr 15, 2017

As predicted, older versions of Scipy don't have the lapack API (https://travis-ci.org/nipy/dipy/jobs/222247543#L2671), so we might want to abandon this, or do this conditional on newer versions of scipy. Anyone have any strong opinions about that?

I still have a couple more things to clean up here. I'd like to go over the example once over, and we still need to deal with the issue described in this comment: https://github.com/nipy/dipy/pull/1108/files#r81159634

@coveralls

This comment has been minimized.

coveralls commented Apr 15, 2017

Coverage Status

Coverage increased (+0.06%) to 88.485% when pulling 6016102 on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@samuelstjean

This comment has been minimized.

Contributor

samuelstjean commented Apr 15, 2017

You definitely want to use the svd based version, as the covariance based version will see its condition number grow by the square or the svd based one (or rather, the covariance one will be unstable by a factor O(n2) while the svd one is only O(n), roughly). And that will most likely lead to weird corner issues where the eigenvalues are badly estimated and the whole thing will get cut off.

You might want to do a thin svd wrapper which returns either the lapack svd or the scipy svd depending on the version found for that.

Oh yeah, even more recent version (like 0.18 and up) let's you use those form cython even, but that may be overkill for now.

@samuelstjean

This comment has been minimized.

Contributor

samuelstjean commented Apr 15, 2017

Don't have time to check in more details, but I'll be back around mid may, so that leaves time to figure stuff out. Did not recheck the whole logic as the old version was, well,, not really working, but it could be due to the covariance based thing and it just did not kick in on simple phantoms I guess.

Anyway, it's also missing the signal bias inversion formula (which is 3-4 lines, so it should be doable), but if you want more realistic phantoms and data to throw at it just ask. Better fix it now than 3-4 months later because Maxime and I find problems in it, much easier to fix it at the source directly than revert it afterwards after all.

@coveralls

This comment has been minimized.

coveralls commented Apr 16, 2017

Coverage Status

Coverage increased (+0.05%) to 88.476% when pulling 9c66e84 on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 16, 2017

Coverage Status

Coverage increased (+0.05%) to 88.476% when pulling 9c66e84 on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 16, 2017

Coverage Status

Coverage increased (+0.05%) to 88.476% when pulling c5b550d on arokem:localpca_slow-svd into d3cb337 on nipy:master.

@arokem arokem force-pushed the arokem:localpca_slow-svd branch 2 times, most recently from 2b5913c to 5daa747 Apr 18, 2017

@coveralls

This comment has been minimized.

coveralls commented Apr 19, 2017

Coverage Status

Changes Unknown when pulling 3c9fef8 on arokem:localpca_slow-svd into ** on nipy:master**.

@coveralls

This comment has been minimized.

coveralls commented Apr 19, 2017

Coverage Status

Coverage increased (+0.06%) to 88.623% when pulling 3c9fef8 on arokem:localpca_slow-svd into 53d8741 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 19, 2017

Coverage Status

Coverage increased (+0.06%) to 88.623% when pulling 3c9fef8 on arokem:localpca_slow-svd into 53d8741 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Apr 21, 2017

Coverage Status

Coverage increased (+0.06%) to 88.623% when pulling 483e57f on arokem:localpca_slow-svd into 53d8741 on nipy:master.

@arokem

This comment has been minimized.

Member

arokem commented Apr 22, 2017

I believe this PR is almost done. The one thing I still want to do is to add a check for ill-conditioned patch-size to number-of-directions inputs, when the number of voxels in the patch is smaller than the number of directions in the data (see also discussion here: #1108 (comment)).

But if someone wants to start taking a look at this, please feel free to do so. You can also see the changes relative to #1108 here: https://github.com/riddhishb/dipy/pull/1/files (but it's a bit of a tohuwabohu, because @riddhishb hasn't rebased in a few months...)

@coveralls

This comment has been minimized.

coveralls commented Apr 22, 2017

Coverage Status

Coverage increased (+0.06%) to 88.623% when pulling 2ca0dea on arokem:localpca_slow-svd into 53d8741 on nipy:master.

@Garyfallidis

This comment has been minimized.

Member

Garyfallidis commented May 8, 2017

@villalonreina and all can you test this brunch and tell us if it works for you?

@arokem

This comment has been minimized.

Member

arokem commented May 8, 2017

OK - error handling added. This one is ready to go from my point of view. Please review!

@arokem arokem force-pushed the arokem:localpca_slow-svd branch from 04b4e01 to 12ce970 May 8, 2017

@coveralls

This comment has been minimized.

coveralls commented May 8, 2017

Coverage Status

Coverage increased (+0.06%) to 88.655% when pulling 12ce970 on arokem:localpca_slow-svd into b7aa619 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented May 8, 2017

Coverage Status

Coverage increased (+0.06%) to 88.655% when pulling 12ce970 on arokem:localpca_slow-svd into b7aa619 on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented May 8, 2017

Coverage Status

Coverage increased (+0.06%) to 88.655% when pulling 12ce970 on arokem:localpca_slow-svd into b7aa619 on nipy:master.

@riddhishb

This comment has been minimized.

Contributor

riddhishb commented May 9, 2017

@arokem I am going to give this a try, I dont see any apparent problem with the implementation,

@riddhishb riddhishb closed this May 9, 2017

@arokem arokem force-pushed the arokem:localpca_slow-svd branch from d2a4e98 to 3b4ec77 Jun 20, 2017

@coveralls

This comment has been minimized.

coveralls commented Jun 21, 2017

Coverage Status

Changes Unknown when pulling 3b4ec77 on arokem:localpca_slow-svd into ** on nipy:master**.

1 similar comment
@coveralls

This comment has been minimized.

coveralls commented Jun 21, 2017

Coverage Status

Changes Unknown when pulling 3b4ec77 on arokem:localpca_slow-svd into ** on nipy:master**.

@arokem arokem force-pushed the arokem:localpca_slow-svd branch from 3b4ec77 to 34474a9 Jun 21, 2017

@arokem

This comment has been minimized.

Member

arokem commented Jun 21, 2017

This is now coming back green (on my fork: https://travis-ci.org/arokem/dipy/builds/245181737). @Garyfallidis : could you please take one more look, and merge this if it's ready?

@Garyfallidis

This comment has been minimized.

Member

Garyfallidis commented Jun 21, 2017

Sure, waiting for Travis.

@arokem

This comment has been minimized.

Member

arokem commented Jun 21, 2017

@Garyfallidis

This comment has been minimized.

Member

Garyfallidis commented Jun 21, 2017

Sure, okay its 12:15 am here. Will merge it tomorrow morning.

@coveralls

This comment has been minimized.

coveralls commented Jun 21, 2017

Coverage Status

Coverage increased (+0.1%) to 85.757% when pulling 34474a9 on arokem:localpca_slow-svd into 01135ee on nipy:master.

1 similar comment
@coveralls

This comment has been minimized.

coveralls commented Jun 21, 2017

Coverage Status

Coverage increased (+0.1%) to 85.757% when pulling 34474a9 on arokem:localpca_slow-svd into 01135ee on nipy:master.

@coveralls

This comment has been minimized.

coveralls commented Jun 21, 2017

Coverage Status

Coverage increased (+0.1%) to 85.757% when pulling 34474a9 on arokem:localpca_slow-svd into 01135ee on nipy:master.

1 similar comment
@coveralls

This comment has been minimized.

coveralls commented Jun 21, 2017

Coverage Status

Coverage increased (+0.1%) to 85.757% when pulling 34474a9 on arokem:localpca_slow-svd into 01135ee on nipy:master.

@arokem

This comment has been minimized.

Member

arokem commented Jun 21, 2017

It's all green now! 💚

@Garyfallidis Garyfallidis merged commit 241b0d8 into nipy:master Jun 21, 2017

4 checks passed

codecov/patch 97.88% of diff hit (target 87.05%)
Details
codecov/project 87.14% (+0.09%) compared to 01135ee
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.1%) to 85.757%
Details
e_s += " that matches the spatial dimensions of the data."
raise ValueError(e_s)
tau = np.median(np.ones(arr.shape[:-1]) * ((tau_factor * sigma) ** 2))

This comment has been minimized.

@ssheybani

ssheybani Jul 14, 2017

Contributor

@arokem Sorry, I don't understand why you are using the median of Sigma, instead of the whole array. I couldn't find it in the paper. Isn't it more accurate to use the whole array, which has a specific value for each voxel?

This comment has been minimized.

@arokem

arokem Jul 15, 2017

Member

Good point. The paper does state that they are using "local estimates of the noise variance" for sigma/tau.

But following this comment: #1223 (review)

I made the change to a median instead: #1223 (comment)

Using one estimate for the entire array saves memory. I take the median, rather than the mean to avoid sensitivity to outliers in the sigma array (which may arise around edges and such).

@arokem arokem referenced this pull request Sep 4, 2018

Closed

Local PCA Slow Version #1108

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

RF: Reduce the memory signature of `tau`, `sigma`.
Addresses nipy#1223 (comment)

Simplify some of the code around the error-handling

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

Cast inputs to float64, if needed. Reduce memory bottle-neck.
Delete intermediate variables, divide up variable declarations.

Addresses nipy#1223 (comment)

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment