Local PCA Slow Version #1108

Closed

riddhishb wants to merge 10 commits into nipy:master from riddhishb:localpca_slow

Conversation

@riddhishb (Contributor) commented Aug 9, 2016

This PR contains only the Python implementation of the localPCA algorithm, which is free of bugs and can be merged. A separate PR will be made very soon to incorporate the Cythonized version of this algorithm.

@Garyfallidis, do have a look at this.

@riddhishb riddhishb added the gsoc2016 label Aug 9, 2016

@coveralls commented Aug 9, 2016

Coverage increased (+0.1%) to 83.024% when pulling 28328f9 on riddhishb:localpca_slow into e309c15 on nipy:master.

@codecov-io commented Aug 9, 2016

Current coverage is 80.98% (diff: 98.77%)

Merging #1108 into master will increase coverage by 0.01%

@@             master      #1108   diff @@
==========================================
  Files           205        219     +14   
  Lines         23329      24756   +1427   
  Methods           0          0           
  Messages          0          0           
  Branches       2341       2499    +158   
==========================================
+ Hits          18889      20049   +1160   
- Misses         3958       4195    +237   
- Partials        482        512     +30   

Powered by Codecov. Last update 31bc6fe...fbd6d9e

@RafaelNH (Contributor) commented Aug 9, 2016

Hi @riddhishb! Can you rename the localpca_slow.py file to localpca.py? I know there will be a faster version in a future PR; however, calling this one slow is misleading because the current version is not that slow, right?

@coveralls commented Aug 9, 2016

Coverage increased (+0.1%) to 83.024% when pulling 29544c9 on riddhishb:localpca_slow into e309c15 on nipy:master.

[1]: Manjon JV, Coupe P, Concha L, Buades A, Collins DL
"Diffusion Weighted Image Denoising Using Overcomplete
Local PCA"

@RafaelNH (Contributor), Aug 10, 2016

Remove the extra blank space.

@Garyfallidis (Member), Sep 11, 2016

Yeah, @riddhishb, you need a better name. Is this noise estimation useful only for LPCA?

@riddhishb (Contributor), Sep 12, 2016

@Garyfallidis Yes, this is only useful for the localPCA; what do you suggest a good name would be?
Also, I noticed there are some build errors and merge conflicts; let me correct them.

@samuelstjean (Contributor), Sep 29, 2016

So, before this gets close to merging: can this function be used for anything else needing noise estimation? If not, it would be faster to embed it in the lpca code directly, as you end up doing two PCAs otherwise, which is probably the most expensive operation here.

@RafaelNH (Contributor), Sep 29, 2016

Well, in theory this function can be used for other techniques needing noise estimation. Right, @riddhishb?
Also, even if we embed this code in the lpca code directly, we will always need to do two PCAs; the first PCA is done only on part of the data.

@samuelstjean (Contributor), Sep 29, 2016

Well, you can do PCA, take the smallest eigenvalue, compute the local noise std field, then hard threshold and rebuild in one step. The second part for the noise bias (which is not included here) is separate and uses the difference image as the new noise field for bias correction.
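As a rough standalone sketch of that one-pass scheme (illustrative names and threshold factor, not dipy's API):

```python
import numpy as np

def denoise_patch(X, factor=2.3):
    # X: (n_samples, n_volumes) patch matrix, one row per voxel in the patch
    mean = X.mean(axis=0)
    Xc = X - mean
    C = Xc.T @ Xc / (X.shape[0] - 1)   # local covariance
    vals, vecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    sigma2 = vals[0]                   # smallest eigenvalue ~ local noise variance
    keep = vecs[:, vals > factor ** 2 * sigma2]   # hard threshold
    # project out the discarded components and rebuild in the same pass
    return Xc @ keep @ keep.T + mean, sigma2
```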

@samuelstjean (Contributor), Sep 29, 2016

Oh wait, is that the version where the noise field is estimated from multiple b0s, or from the lowest component of the data? In the first case you would need two PCAs, yes (and thus the function could be accessible from outside, invalidating the previous remark).

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
def fast_noise_estimate(data, gtab):

@RafaelNH (Contributor), Aug 10, 2016

We have to find a more suggestive name for this!

@riddhishb (Contributor), Oct 15, 2016

What would be a better name? PCA noise estimate?

Returns
-------

@RafaelNH (Contributor), Aug 10, 2016

Remove this blank space.

PCA, PLOS 2013
"""

@RafaelNH (Contributor), Aug 10, 2016

Remove this blank space.

be taken around each voxel
rician : boolean
    If True the noise is estimated as Rician, otherwise Gaussian noise
    is assumed.

@RafaelNH (Contributor), Aug 10, 2016

@riddhishb - I think the code for this is missing, right?

:] + temp / (1 + np.linalg.norm(d, ord=0))

denoised_arr = thetax / theta

@samuelstjean (Contributor), Aug 15, 2016

You seem to be missing the debias step from eq. 5 for Rician noise: http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0073021

It would be useful to have it as a keyword, like in nlmeans, for using it on Gaussian noise for example.

@riddhishb (Contributor), Aug 16, 2016

What I actually found is that the adaptation step (eq. 5 and 6) worsens the results, and it doesn't match Manjon's MATLAB implementation at all. We compared our results with their output using the current algorithm and they are really, really close. Plus, even though the MATLAB implementation has an option to choose between Gaussian and Rician noise, the error map between the two outputs is exactly zero, which makes us believe that they might have used it minus the debias step.

@samuelstjean (Contributor), Aug 16, 2016

That is, hmm, unexpected. I had a look at the code and they do:

lpca denoising

if rician:
    recompute noise field with another equation/function than the first one
    apply eq. 5/6

Or something close to that, which could be the discrepancy you see. They indeed seem to always use the part you have for the noise field correction, but they have an extra step debiasing the signal if the Rician check box is activated (similar to -2*var in nlmeans), which takes place completely after the pca reconstruction.

I can have a look at what the recomputed noise field does (I helped test the original lpca MATLAB version), but unfortunately I was also asked not to share it, so I can have a quick look at why the function name is different/what broadly changed.

@samuelstjean (Contributor), Aug 16, 2016

Ah yes, also this correction is most noticeable at low SNR, as with a high enough SNR a Gaussian approximation is enough, so that could be why you don't see much difference if you have a high SNR. Here is a graph from the original paper that proposed this correction:

[Figure: correction factor as a function of SNR, from the original paper]

It is equation 4 from the lpca paper, and closely related to eq. 5. As you can see, above SNR 4 the correction factor is almost 1, and thus does not produce that much of a difference in the Rician noise case. It is more pronounced for other cases, but at low SNR it should give you different results.

Figure 1 (or equation 8 in this ref) is from Koay, C. G., & Basser, P. J. (2006). Analytically exact correction scheme for signal extraction from noisy magnitude MR signals. Journal of Magnetic Resonance, 179(2), 317-322. http://doi.org/10.1016/j.jmr.2006.01.016
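For reference, that correction factor (eq. 8 of Koay & Basser) is short enough to write down; the following is a standalone sketch of the published formula, not code from this PR:

```python
import numpy as np
from scipy.special import i0e, i1e

def koay_correction_factor(theta):
    """xi(theta) from Koay & Basser (2006), eq. 8, for Gaussian SNR theta.

    Written with the exponentially scaled Bessel functions i0e/i1e, so the
    exp(-theta**2 / 2) prefactor cancels analytically and nothing overflows.
    """
    t2 = theta ** 2
    bessel = (2 + t2) * i0e(t2 / 4) + t2 * i1e(t2 / 4)
    return 2 + t2 - (np.pi / 8) * bessel ** 2

# ~0.43 at theta = 0, ~0.60 at theta = 1, and ~0.97 by theta = 4,
# matching the "almost 1 above SNR 4" reading of the figure.
print(koay_correction_factor(np.array([0.0, 1.0, 4.0, 10.0])))
```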

@riddhishb (Contributor), Aug 17, 2016

@samuelstjean This makes sense! I had the Rician correction implemented exactly as you pointed out, but I don't know if it was wrong or something else; it just didn't work for me. I got a very decent result compared to the MATLAB implementation. However, if you can look at how eq. 5/6 are implemented in MATLAB and give some pointers as to how they can be implemented in this function, that would be great!

@RafaelNH (Contributor), Aug 17, 2016

Hi guys! This Rician bias correction is indeed important! However, my suggestion is to implement it in a separate PR, after the GSoC submission. The Rician bias correction is a related issue but, since it is applied after the denoising procedure, we can deal with it separately! We can even make it a general function compatible with other denoising techniques.

@samuelstjean (Contributor), Aug 17, 2016

So, while my tracking computes: it seems they recompute the final noise variance on the residual maps plus the correction factor, similar to the way it is computed on the smallest pca components at first. Since you now have a 4D local map of noise variance, they take the median at the end to get a robust 3D estimate.

For the debias step itself, it gets a tad more complicated, as you need to compute a numerical inverse of the mapping function. So eq. 5 works with the local SNR once again, this time using the computed denoised value and the new value from the residual noise map at each voxel.

This value of E(x)/sigma is then used in eq. 6 to build a table of values (or it can just be applied to each voxel; they seem to do it this way to save some computation time).

And a new comment appeared between me starting and finishing this message, so here goes:

It would be easier to get it included now, as it is part of the algorithm. Since apparently, when looking at a difference of FA, you see stuff around a 1-2% difference, the value you can get out with and without bias correction is in the same order or higher if you move to high b-values/finer voxels (and thus less signal).

Well anyway, let's see how it goes, but all the parts seem to be in place already except for the one-liner of eq. 6. While it is true that you could use it for other functions, nlmeans works in the squared domain and already includes such a correction, so I don't know how applicable it would be to other stuff.
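To make the eq. 5/6 mechanics concrete, here is a standalone sketch of the lookup-table inversion described above; the Rician mean formula is the standard one, and none of the names come from the MATLAB code or this PR:

```python
import numpy as np
from scipy.special import i0e, i1e

def rician_mean_over_sigma(theta):
    # E[M]/sigma for a Rician signal with Gaussian SNR theta; the scaled
    # Bessel functions i0e/i1e absorb the exp(-x/2) prefactor.
    x = theta ** 2 / 2
    return np.sqrt(np.pi / 2) * ((1 + x) * i0e(x / 2) + x * i1e(x / 2))

# Tabulate the (monotone) forward mapping once, then invert by interpolation.
theta_grid = np.linspace(0, 30, 3001)
measured_grid = rician_mean_over_sigma(theta_grid)

def debias(denoised, sigma):
    # Map each voxel's measured SNR back to a bias-corrected value.
    return np.interp(denoised / sigma, measured_grid, theta_grid) * sigma
```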

theta = np.zeros(arr.shape, dtype=np.float64)
thetax = np.zeros(arr.shape, dtype=np.float64)

patch_size = 2 * patch_radius + 1

@samuelstjean (Contributor), Aug 15, 2016

The default implementation uses a patch size of 4 directly [1], so you might want to change the keyword to accept a patch size directly rather than a patch radius. You will probably need to play with border effects to get this to fit exactly with any array, though.

http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0073021, section Local PCA Denoising (LPCA)

@samuelstjean (Contributor), Sep 29, 2016

This also means you don't need padding nor playing with negative indices; you can just feed the blocks directly to the pca.


sigma = (np.ones(arr.shape, dtype=np.float64)
         * sigma[..., np.newaxis])
tou = (np.ones(arr.shape, dtype=np.float64) * tou[..., np.newaxis])

@samuelstjean (Contributor), Aug 15, 2016

For the threshold, they use the Greek letter tau (unless tou is a valid way of writing it; I've never seen it before).

@riddhishb (Contributor), Aug 16, 2016

@samuelstjean Oh, I will change the name.

@riddhishb riddhishb force-pushed the riddhishb:localpca_slow branch from 5cecc67 to 1d97120 Aug 25, 2016

@RafaelNH (Contributor) left a comment

Hi @riddhishb! Here are my last comments for this PCA! In my view, this can be merged after addressing them. Regarding the implementation of the Rician correction algorithm: I agree with @samuelstjean that it is a related issue, but I will still argue that we can leave it for another PR. I think this way we can speed up the work, because if @riddhishb currently does not have time to address it, I don't mind leading that work in a next PR. How does that sound?


if arr.ndim == 4:

    tou = 2.3 * 2.3 * sigma

@RafaelNH (Contributor), Sep 29, 2016

As @samuelstjean mentioned in PR #1053, this should be tou = (2.3 * sigma) ** 2

@riddhishb (Contributor), Oct 15, 2016

The sigma from the function is actually sigma squared, but yeah, I will change the naming so that there is no confusion.
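In other words, the code and the suggested expression agree once the naming is fixed; a hypothetical rename (sigma_sq is not the variable name in this PR) would read:

```python
sigma_sq = fast_noise_estimate(data.astype(np.float64), gtab)  # already sigma ** 2
tau = 2.3 ** 2 * sigma_sq                                      # == (2.3 * sigma) ** 2
```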

@riddhishb (Contributor), Oct 15, 2016

Changed this.

Denoise images using Local PCA
===============================
Using the local PCA based denoising for diffusion

@RafaelNH (Contributor), Sep 29, 2016

Suggestion:

Here we show how one can use the PCA to denoise diffusion images [Manjon2013]_. This approach might be advantageous for diffusion-weighted data as it takes into account the diffusion directional information.

* Threshold the eigenvalues based on the local estimate of sigma and then do
the PCA reconstruction
Let's load the necessary modules

@RafaelNH (Contributor), Sep 29, 2016

'Let's start by loading ...'

@riddhishb (Contributor), Oct 15, 2016

All changes to the heading and docstrings incorporated.

diffusion data.
The basic idea behind local PCA based diffusion denoising can
be explained in the following three basic steps

@RafaelNH (Contributor), Sep 29, 2016

The basic idea behind the local PCA denoising strategy consists of three basic steps:

import matplotlib.pyplot as plt
from time import time
from dipy.denoise.localpca import localpca
from dipy.denoise.fast_noise_estimate import fast_noise_estimate

@RafaelNH (Contributor), Sep 29, 2016

You still have to rename this function.


t = time()

sigma = np.array(fast_noise_estimate(data.astype(np.float64), gtab))

@RafaelNH (Contributor), Sep 29, 2016

Same here.

information in the diffusion MR data. It performs PCA on a local 4D patch,
thresholds it using the local variance estimate from the noise estimation
function, and then performs the PCA reconstruction, which gives us the denoised
estimate.

@RafaelNH (Contributor), Sep 29, 2016

This information is redundant. Here, just explain how to use the localpca function and what its inputs are!


denoised_arr = localpca(data, sigma=sigma, patch_radius=2)

print("Time taken for local PCA (slow)", -t + time())

@RafaelNH (Contributor), Sep 29, 2016

Why not 'time() - t'?

print("Time taken for local PCA (slow)", -t + time())

"""
Let us plot the axial slice of the original and denoised data.

@RafaelNH (Contributor), Sep 29, 2016

How about:
To visually inspect the noise suppression of the pca algorithm, we plot below the axial slices of the original and denoised data.

patch_size * patch_size, arr.shape[3])
# compute the mean and normalize

@RafaelNH (Contributor), Sep 29, 2016

Hi @riddhishb! I was reviewing the math behind this, and when patch_size ** 3 < arr.shape[3] you will have zero eigenvalues! I suggest adding an error message at the beginning of this function if patch_size ** 3 is smaller than the number of gradient directions, informing the user to give a larger value of patch_size.

@riddhishb (Contributor), Oct 15, 2016

I do not get this. Why is this true? Every time I have experimented, patch_size ** 3 has been smaller than the number of gradient directions.

@RafaelNH (Contributor), Oct 15, 2016

What patch_size do you normally use?

When patch_size ** 3 < arr.shape[3], the covariance matrix is not full rank. In this case you will have the following number of zero eigenvalues:
N = arr.shape[3] - patch_size ** 3
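A quick standalone check of this rank argument (illustrative numbers only, not code from the PR):

```python
import numpy as np

rng = np.random.default_rng(0)
patch_size, n_volumes = 3, 60
X = rng.standard_normal((patch_size ** 3, n_volumes))  # 27 samples, 60 volumes
C = X.T @ X                                            # rank is at most 27
eigvals = np.linalg.eigvalsh(C)
n_zero = np.sum(eigvals < 1e-10 * eigvals.max())
print(n_zero)  # 33, i.e. arr.shape[3] - patch_size ** 3
```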

@RafaelNH (Contributor), Oct 15, 2016

Perhaps instead of adding the error message, you could just remove the N lower eigenvalues when patch_size ** 3 is smaller than arr.shape[3]. In the tests we have to make sure that we cover both cases: patch_size ** 3 < arr.shape[3] and patch_size ** 3 > arr.shape[3].

@riddhishb (Contributor), Oct 15, 2016

I think I use patch sizes 3 and 5 for testing in normal cases. Oh, I think that's right; removing the N lower eigenvalues makes sense.

@riddhishb (Contributor) commented Sep 29, 2016

@RafaelNH I think you are right. I will get some time over the weekend; this PR has taken long enough, and I will see that all the changes are done. Fingers crossed :)

@arokem (Member) commented Sep 29, 2016

Don't forget to rebase on master!


@RafaelNH (Contributor) commented Sep 29, 2016

Cool! During the summer I was also able to implement the Marchenko-Pastur based PCA! After having your PR merged, I am planning to extend your work with the selection of signal components based on the Marchenko-Pastur distribution! So I am looking forward to your revision, @riddhishb!

@riddhishb (Contributor) commented Sep 29, 2016

@RafaelNH That's great!! Let's all work on it; maybe have a chat on Hangouts as well.

@RafaelNH (Contributor) commented Sep 29, 2016

@riddhishb! Sounds good! But let's do this step by step. First milestone is to have this PR merged!

@samuelstjean (Contributor) left a comment

I just went through quick stuff I remembered off the top of my head, plus stuff I discussed with Rafael in Lisbon, so don't take this as a full-blown review.

for i0 in range(-1, 2):
    for j0 in range(-1, 2):
        for k0 in range(-1, 2):
            sum_reg += I[i + i0, j + j0, k + k0] / 27.0

@samuelstjean (Contributor), Sep 29, 2016

You might want to use the shape of the 3D block instead of hard-coding 3x3x3 = 27, in case someone changes it later.
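Something along these lines would do it (a sketch against the snippet above; the radius variable is a hypothetical name):

```python
radius = 1
n_neighbours = (2 * radius + 1) ** 3  # 27 for the current 3x3x3 block
for i0 in range(-radius, radius + 1):
    for j0 in range(-radius, radius + 1):
        for k0 in range(-radius, radius + 1):
            sum_reg += I[i + i0, j + j0, k + k0] / n_neighbours
```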

mean[i + i0, j + j0, k + k0] += temp1
count[i + i0, j + j0, k + k0] += 1

sigma = np.divide(sigma, count)

@samuelstjean (Contributor), Sep 29, 2016

The (unbiased, for samples) standard deviation needs an N - 1 denominator; you seem to use sum((X - mean)**2) / N instead.
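In numpy terms, the distinction is ddof (a standalone illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0])
biased = np.sum((x - x.mean()) ** 2) / len(x)          # what the code computes
unbiased = np.sum((x - x.mean()) ** 2) / (len(x) - 1)  # sample variance
assert np.isclose(biased, np.var(x))            # ddof=0 is numpy's default
assert np.isclose(unbiased, np.var(x, ddof=1))  # ddof=1 gives the N - 1 form
```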

dtype=np.float64)

# Compute the covariance matrix C = X_transpose X
C = np.transpose(X).dot(X)

@samuelstjean (Contributor), Sep 29, 2016

This is more of a performance and stability issue, but you can also use the SVD to compute the PCA, which is more stable: https://en.wikipedia.org/wiki/Principal_component_analysis#Singular_value_decomposition. It falls in the same category as "don't invert matrices with the Gauss-Jordan method", "don't compute the likelihood directly but use the log-likelihood", etc.

Unfortunately, back in the day the SVD in scipy was terribly slow, so I gave up on it. Nowadays you can hook directly into the lapack version instead, so it should be pretty fast: https://docs.scipy.org/doc/scipy/reference/linalg.lapack.html. That's also what the original implementation uses, in a multithreaded loop for efficiency.
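A sketch of the SVD route, with a random stand-in for the patch matrix (names are illustrative, not this PR's):

```python
import numpy as np

X = np.random.standard_normal((27, 60))  # stand-in patch matrix: samples x volumes
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
eig_vals = s ** 2 / (Xc.shape[0] - 1)    # eigenvalues of the sample covariance
components = Vt                          # principal axes, one per row
# Equivalent to eigendecomposing C = Xc.T @ Xc / (n - 1), but without
# squaring the condition number of Xc.
```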

@coveralls commented Sep 29, 2016

Coverage increased (+0.3%) to 83.34% when pulling 85763d2 on riddhishb:localpca_slow into 31bc6fe on nipy:master.


@samuelstjean (Contributor) commented Oct 1, 2016

I gave it a run on some real data, and the noise estimation throws some warnings:

In [22]: sigma = np.array(fast_noise_estimate(data.astype(np.float64), gtab))
/home/samuel/anaconda3/bin/ipython:1: RuntimeWarning: overflow encountered in multiply
  #!/home/samuel/anaconda3/bin/python

/home/samuel/anaconda3/bin/ipython:1: RuntimeWarning: overflow encountered in square
  #!/home/samuel/anaconda3/bin/python
/home/samuel/anaconda3/bin/ipython:1: RuntimeWarning: invalid value encountered in multiply
  #!/home/samuel/anaconda3/bin/python

I checked the local standard deviation; it seemed a bit high (this is real data, so who knows the right value; you might want to test a synthetic dataset at a lower SNR).

[Screenshot: local standard deviation map]

It also seems to overblur compared to the original lpca; could be the default settings from the example.

[Screenshots: original lpca result vs. this PR's output]

@coveralls commented Oct 15, 2016

Coverage increased (+0.02%) to 83.047% when pulling fbd6d9e on riddhishb:localpca_slow into 31bc6fe on nipy:master.


@Garyfallidis (Member) commented Nov 28, 2016

@riddhishb can you give us an update on this PR?

@Garyfallidis (Member) commented Dec 13, 2016

Ping @riddhishb :)

@arokem arokem referenced this pull request Apr 14, 2017

Merged

local PCA using SVD #1223

@arokem (Member) commented Sep 4, 2018

I believe this was superseded by #1223. Closing.

@arokem arokem closed this Sep 4, 2018
