
A functional implementation of Random matrix local pca. #1917

Merged
merged 22 commits into dipy:master on Aug 1, 2019

Conversation

RafaelNH
Contributor

@RafaelNH RafaelNH commented Jul 21, 2019

Hi all! I am working on a functional implementation of the Random matrix local PCA for Dipy's next release. I've already added the classifier based on the moments of the Marchenko–Pastur probability distribution according to Veraart et al., 2016 - which seems to pass my added tests. Now, I just need to:

  • Add a test of the main localpca function.

  • Update the example (if you want to start playing with the new procedure, you can run it in the current example by skipping the sigma estimation part and setting the sigma parameter to None when calling the localpca function, as sketched below - results are looking good and the example runs in a couple of minutes).
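
For reference, a minimal sketch of that call (assuming data is the 4D dMRI array already loaded in the example; this is a sketch, not the exact tutorial code):

    from dipy.denoise.localpca import localpca

    # skip the prior sigma estimation and let the MP-PCA classifier handle it
    den = localpca(data, sigma=None)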

In the meantime, I need your opinion on the following aspect: the new feature does not require an estimate of the noise sigma, so what should I do with the previously implemented noise estimation procedure? @arokem, @skoudoro, did you mention that this procedure is not working well? The current version of the code is still compatible with this previous procedure; however, if it is not working, I don't mind removing it to simplify the code a bit.

PS: I also hope this work helps the future integration of the code in PR#1781.

@codecov-io

codecov-io commented Jul 21, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@4de2d70).
The diff coverage is 97.36%.


@@            Coverage Diff            @@
##             master    #1917   +/-   ##
=========================================
  Coverage          ?   84.65%           
=========================================
  Files             ?      119           
  Lines             ?    14739           
  Branches          ?     2341           
=========================================
  Hits              ?    12478           
  Misses            ?     1712           
  Partials          ?      549
Impacted Files               Coverage Δ
dipy/denoise/localpca.py     94.23% <97.36%> (ø)

@skoudoro
Member

Hi @RafaelNH, Thank you for this!

The new feature does not require the estimation of noise sigma so what should I do with the previous implemented noise estimate procedure?

If the new feature does not need it, I do not think we should keep it.

@ShreyasFadnavis
Member

@RafaelNH Thank you for this wonderful PR! Can we do this: can we add the noise_classifier to noise_estimate.py, with an option to switch between lpca and mppca? Does this make sense?

@Garyfallidis
Contributor

@RafaelNH not sure if @ShreyasFadnavis was clear. We need to keep the existing lpca implementation and add the mppca as a new function. Do NOT replace lpca.

@RafaelNH
Contributor Author

@Garyfallidis and @ShreyasFadnavis, I've tried to address your comment about having a separate function for mppca; however, when doing this, I was just replicating code.

As I mentioned in our personal chat, the mppca can be seen as a general version of the localpca algorithm proposed by Manjon et al. In general, the core of the PCA denoising algorithms is thresholding the eigenvalues of the principal components using the following expression:

th = (tau * sigma) ** 2

where th is the threshold, sigma is the noise standard deviation estimate, and tau is a scaling factor relating sigma to the eigenvalue threshold. While Manjon et al. used a prior estimate for sigma and an empirical value for tau, Veraart et al. showed that optimal values for tau and sigma can be computed from random matrix theory. For instance, tau can be given as (1 + sqrt(N/M)), where N is the number of diffusion-weighted images and M is the number of voxels in the sliding window. As you can see from that expression, a fixed empirical value is far from optimal when running this algorithm on different diffusion datasets or when changing the sliding window patch radius.
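
To make this concrete, here is a minimal sketch of the thresholding step described above (the function and variable names are illustrative, not the exact code in this PR):

    import numpy as np

    def pca_threshold_sketch(X, sigma, tau_factor=None):
        # X: M x N patch matrix (M voxels in the sliding window, N volumes)
        M, N = X.shape
        if tau_factor is None:
            # random-matrix-based scaling factor described above
            tau_factor = 1.0 + np.sqrt(N / M)
        th = (tau_factor * sigma) ** 2       # th = (tau * sigma) ** 2
        cov = np.dot(X.T, X) / M             # N x N covariance of the patch
        eigvals = np.linalg.eigvalsh(cov)    # eigenvalues in ascending order
        return eigvals > th                  # True for components kept as signal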

In this PR, I basically generalized the localpca function and changed its default settings to mppca. However, if you still want to use Manjon's suggestions, you can run the algorithm as:

den = localpca(data, sigma=sigma_estimate, tau_factor=2.3)

Please let me know your comments.

Member

@ShreyasFadnavis ShreyasFadnavis left a comment

Overall looks very good, @RafaelNH! Some minor suggestions on my end!

@Garyfallidis
Contributor

Is this still a WIP?

@RafaelNH
Contributor Author

@Garyfallidis - yes, still WIP, but I am almost there - I will finish it tomorrow morning!

@pep8speaks

pep8speaks commented Jul 31, 2019

Hello @RafaelNH, thank you for updating!

Line 319:32: E127 continuation line over-indented for visual indent

Comment last updated at 2019-07-31 21:37:56 UTC

@RafaelNH RafaelNH changed the title from "(WIP) A functional implementation of Random matrix local pca." to "A functional implementation of Random matrix local pca." on Jul 31, 2019
@RafaelNH
Contributor Author

Hi all - examples updated! This should be a nice first version of the MP-PCA for release 1.0. Let me know if there are further comments to address before merging.

@ShreyasFadnavis - I agree that it would be nice to have this method in the noise_estimate.py file. However, as I've mentioned above, this might not be a straightforward thing to do. We really need time to think and discuss how to do this without duplicating code. I suggest doing this in a follow-up PR after this one is merged. At the moment, I am explaining how to obtain the different noise std estimates in the two PCA example files.

Member

@ShreyasFadnavis ShreyasFadnavis left a comment

Minor documentation corrections. After these, this PR is GTG :) +1 for merge on my end!

theta[ix1:ix2, jx1:jx2, kx1:kx2] += this_theta
thetax[ix1:ix2, jx1:jx2, kx1:kx2] += Xest * this_theta
if return_sigma is True and sigma is None:
    var[ix1:ix2, jx1:jx2, kx1:kx2] += this_var * this_theta
    thetavar[ix1:ix2, jx1:jx2, kx1:kx2] += this_theta

denoised_arr = thetax / theta
denoised_arr.clip(min=0, out=denoised_arr)
denoised_arr[~mask] = 0
Member

@RafaelNH this is not your fault, but just a previous bug I guess:
Can you change this to:
denoised_arr[mask==0] = 0

Denoise images using Local PCA
===============================
=======================================================
Denoise images using Local PCA and empirical thresholds
Member

Denoise images using Local PCA via empirical thresholds

of anatomical information for different techniques such as DTI [Manjon2013]_,
spherical deconvolution [Veraart2016a] and DKI [Henri2018]_.

The basic idea behind the PCA-based denoising algorithm is to remove the
Member

The basic idea behind the PCA-based denoising algorithms is to remove the
Principal Components of the data that are mostly classified as noise.

The principal components classification can be performed based on prior noise
variance estimates [Manjon2013]_ (see :ref:`denoise_localpca`)or automatically
based on the Marcenko-Pastur distribution [Veraa2016a]_. In addition to noise
suppression, the PCA algorithm can be used to estimate the noise standard
Member

In addition to the noise
suppression, the PCA algorithm can be used to get the standard deviation of the estimated noise [Veraa2016b].

print("Entire denoised data saved in denoised_mppca.nii.gz")

"""
Additionally, we show on this example how the PCA denoising algorithm effects
Member

Additionally, we show in this example how the PCA denoising algorithm affects
different diffusion measurements.

  • note this new feature does not require a prior sigma estimate
  • test_lpca are now ~30% faster
  • bug was also detected and fixed - sigma should be a 3D volume
  • the order of output variable of this function was also changed
  • fix a bug on the expected functionality of a 3D sigma matrix. Previously, localpca was not exploring the spatial information of the 3D sigma matrix.
  • …ts than pca denoising using an inputed Random matrix optimal sigma
  • Proof read example
  • Add noise std estimate using MP-PCA
  • Add SNR estimate
Member

@skoudoro skoudoro left a comment

Hi @RafaelNH,

Thank you for this nice work. Overall it looks good.

I just find it really confusing to use localpca everywhere. You switch algorithms by just modifying the sigma parameter, which is not clear everywhere. I think it would be good to be more explicit. You could have an interface function. I propose creating something like:

def mppca(arr, mask=None, pca_method='eig', patch_radius=2, tau_factor=None,
          return_sigma=False, out_dtype=None):
    return localpca(arr, sigma=None, mask=mask, pca_method=pca_method,
                    patch_radius=patch_radius, tau_factor=tau_factor,
                    return_sigma=return_sigma, out_dtype=out_dtype)

It does not do much, but it is really useful and clarifies which algorithm you are using. Of course, you will have to update your MP-PCA tutorial with this function name.
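
For example, the call in the MP-PCA tutorial would then read (just a sketch using the proposed name):

    denoised_arr = mppca(data, patch_radius=2)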

----------
L : array (n,)
Array containing the PCA eigenvalues in ascending order.
wdim : int
Member

wdim or nvoxels

The basic idea behind the PCA-based denoising algorithms is to remove the
components of the data that are classified as noise. The Principal Components
classification can be performed based on prior noise variance estimates
[Manjon2013]_ (see :ref:`denoise_localpca`)or automatically based on the
Member

missing space after or


denoised_arr = localpca(data, sigma=None, patch_radius=2)

print("Time taken for local PCA ", -t + time())
Member

"Time taken for MP-PCA"

@Garyfallidis
Contributor

I like the design, but I also agree with Serge that we need an mppca utility function that calls localpca. It will help with the tutorial too. Also, in the localpca function, have Manjon cited first, and in mppca, have Veraart cited first.

@Garyfallidis
Contributor

Apart from these few comments that we have here, this is getting super close to being merged and included in this release. Rafael, please act quickly! Thank you in advance.

@RafaelNH
Contributor Author

@Garyfallidis, @skoudoro - Done! Many thanks for your comments. I've created a general pca function (genpca) containing the main body of both pca algorithms (localpca and mppca). I've only kept the essential input parameters for the individual localpca and mppca functions. I think this indeed improves the function design a lot.
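
A minimal sketch of the layout described here (signatures and defaults are illustrative, not necessarily the exact merged API; the genpca body is elided):

    def genpca(arr, sigma=None, mask=None, patch_radius=2, pca_method='eig',
               tau_factor=None, return_sigma=False, out_dtype=None):
        # shared body: sliding-window PCA, eigenvalue thresholding, reconstruction
        ...

    def localpca(arr, sigma, mask=None, patch_radius=2, pca_method='eig',
                 tau_factor=2.3, out_dtype=None):
        # empirical-threshold variant (Manjon et al.); requires a prior sigma
        # (2.3 is the empirical tau_factor mentioned earlier in this thread)
        return genpca(arr, sigma=sigma, mask=mask, patch_radius=patch_radius,
                      pca_method=pca_method, tau_factor=tau_factor,
                      return_sigma=False, out_dtype=out_dtype)

    def mppca(arr, mask=None, patch_radius=2, pca_method='eig',
              return_sigma=False, out_dtype=None):
        # Marchenko-Pastur variant (Veraart et al.); no prior sigma needed
        return genpca(arr, sigma=None, mask=mask, patch_radius=patch_radius,
                      pca_method=pca_method, tau_factor=None,
                      return_sigma=return_sigma, out_dtype=out_dtype)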

@skoudoro skoudoro added this to the 1.0 milestone Aug 1, 2019
@Garyfallidis Garyfallidis merged commit 4ef70d6 into dipy:master Aug 1, 2019
@15thai

15thai commented Aug 1, 2019

Hi, I just saw that you already have great work on Random matrix local pca. I just have one stupid question:
if your 3D patch data goes from patch_radius to arr.shape - patch_radius, does the border problem still exist after reconstructing the denoised image?
Thank you.

@skoudoro
Member

skoudoro commented Aug 1, 2019

Yes, I agree, @15thai, I think there is a small border problem, but it is easy to fix with np.pad using the patch radius. It will be for release 1.1.
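
A minimal sketch of that idea (array names and the padding mode are illustrative, not the actual fix):

    import numpy as np
    from dipy.denoise.localpca import localpca

    pr = 2  # patch_radius
    # arr: 4D diffusion data; pad the three spatial axes so border voxels get full patches
    padded = np.pad(arr, ((pr, pr), (pr, pr), (pr, pr), (0, 0)), mode='edge')
    denoised_padded = localpca(padded, sigma=None, patch_radius=pr)
    # crop back to the original field of view
    denoised = denoised_padded[pr:-pr, pr:-pr, pr:-pr, :]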
