
A functional implementation of Random matrix local pca. #1917

Merged
merged 22 commits into from Aug 1, 2019

Conversation

@RafaelNH
Contributor

commented Jul 21, 2019

Hi all! I am working on a functional implementation of the Random matrix local PCA for Dipy's next release. I've already added the classifier based on the moments of the Marchenko–Pastur probability distribution according to Veraart et al., 2016 - which seems to pass my added tests. Now, I just need to:

  • Add a test of the main localpca function.

  • Update the example (If you want to start playing with the new procedure, you can run it in the current example by skipping the sigma estimation part and setting the sigma parameter to None when calling the localpca function - results are looking good and example runs in a couple of minutes).
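For context, the classifier based on the moments of the Marchenko–Pastur distribution can be sketched roughly as follows. This is a hedged illustration only, not Dipy's actual implementation; the function name `mp_classifier` and the exact stopping rule are assumptions:

```python
import numpy as np

# Sketch of an MP-based classifier in the spirit of Veraart et al. (2016):
# starting from the smallest eigenvalues, grow the set of "noise" components
# while their spread stays within the Marchenko-Pastur bound
#   4 * sqrt(gamma) * sigma^2,  gamma = (#noise components) / (#voxels).

def mp_classifier(vals, nvoxels):
    """vals: PCA eigenvalues in ascending order. Returns (sigma2, n_noise)."""
    n = vals.size
    n_noise = 1
    sigma2 = vals[0]
    for c in range(2, n + 1):
        cand = vals[:c]
        s2 = cand.mean()                 # noise variance estimate from the set
        gamma = c / nvoxels
        if cand[-1] - cand[0] < 4.0 * np.sqrt(gamma) * s2:
            sigma2, n_noise = s2, c      # spread still MP-compatible
        else:
            break                        # a signal component was reached
    return sigma2, n_noise
```

With this rule, eigenvalues well separated from the noise bulk are classified as signal, and the mean of the remaining eigenvalues doubles as a noise variance estimate.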

In the meantime, I need your opinion on the following aspect: the new feature does not require the estimation of the noise sigma, so what should I do with the previously implemented noise estimation procedure? @arokem, @skoudoro, did you mention that this procedure is not working well? The current version of the code is still compatible with the previous procedure; however, if it is not working, I don't mind removing it to simplify the code a bit.

PS: I also hope this work helps the future integration of the code in PR#1781.

@codecov-io


commented Jul 21, 2019

Codecov Report

❗️ No coverage uploaded for pull request base (master@4de2d70).
The diff coverage is 97.36%.


@@            Coverage Diff            @@
##             master    #1917   +/-   ##
=========================================
  Coverage          ?   84.65%           
=========================================
  Files             ?      119           
  Lines             ?    14739           
  Branches          ?     2341           
=========================================
  Hits              ?    12478           
  Misses            ?     1712           
  Partials          ?      549
Impacted Files Coverage Δ
dipy/denoise/localpca.py 94.23% <97.36%> (ø)
@skoudoro

Member

commented Jul 22, 2019

Hi @RafaelNH, Thank you for this!

The new feature does not require the estimation of noise sigma so what should I do with the previous implemented noise estimate procedure?

If it is not needed, I do not think we should keep it.

@ShreyasFadnavis

Member

commented Jul 22, 2019

@RafaelNH Thank you for this wonderful PR! Can we add the noise_classifier to noise_estimate.py, with an option to switch between lpca and mppca? Does this make sense?

@Garyfallidis

Member

commented Jul 24, 2019

@RafaelNH not sure if @ShreyasFadnavis was clear. We need to keep the existing lpca implementation and add the mppca as a new function. Do NOT replace lpca.

@RafaelNH

Contributor Author

commented Jul 30, 2019

@Garyfallidis and @ShreyasFadnavis, I've tried to address your comment about having a separate function for mppca; however, doing so just meant replicating code.

As I mentioned in our personal chat, mppca can be seen as a generalized version of the localpca algorithm proposed by Manjon et al. In general, the core of a PCA denoising algorithm is thresholding the eigenvalues of the principal components using the following expression:

th = (tau * sigma) ** 2

where th is the threshold, sigma is the noise standard deviation estimate, and tau is a scaling factor relating sigma to the eigenvalue threshold. While Manjon et al. used a prior estimate for sigma and an empirical value for tau, Veraart et al. showed that optimal values for tau and sigma can be computed from random matrix theory. For instance, tau can be given as (1+sqrt(N/M)), where N is the number of diffusion-weighted images and M is the number of voxels in the sliding window. As you can see from that expression, a fixed empirical value is far from optimal when running this algorithm on different diffusion datasets or when changing the sliding window patch radius.

In this PR, I basically generalized the localpca function and changed its default settings for mppca. However, if you still want to use Manjon's suggestions, you can run the algorithm as:

den = localpca(data, sigma=sigma_estimate, tau_factor=2.3)

Please let me know your comments.
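As a rough illustration of the thresholding described above, consider the following sketch. The helpers `eigenvalue_threshold` and `denoise_patch` are hypothetical names for illustration only, not Dipy's API:

```python
import numpy as np

# Hedged sketch of eigenvalue thresholding in a local PCA denoiser,
# using the expression from the discussion:
#   th = (tau * sigma) ** 2,  with tau = 1 + sqrt(N / M),
# where N = number of diffusion-weighted volumes, M = voxels per window.

def eigenvalue_threshold(sigma, n_dwi, n_voxels):
    """Squared eigenvalue cut-off for a given noise level sigma."""
    tau = 1.0 + np.sqrt(n_dwi / n_voxels)
    return (tau * sigma) ** 2

def denoise_patch(X, sigma):
    """Suppress principal components of a patch X (M voxels x N volumes)
    whose covariance eigenvalues fall below the threshold."""
    M, N = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean
    C = Xc.T @ Xc / M                        # N x N sample covariance
    vals, vecs = np.linalg.eigh(C)           # eigenvalues in ascending order
    keep = vals > eigenvalue_threshold(sigma, N, M)
    W = vecs[:, keep]                        # basis of retained components
    return Xc @ W @ W.T + mean               # project and restore the mean
```

Setting sigma to a larger value removes more components; with sigma = 0 every component is kept and the patch is returned unchanged.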

@ShreyasFadnavis
Member

left a comment

Overall looks very good @RafaelNH ! Some minor suggestions on my end!

dipy/denoise/localpca.py
@Garyfallidis

Member

commented Jul 30, 2019

Is this still a WIP?

@RafaelNH

Contributor Author

commented Jul 31, 2019

@Garyfallidis - yes, still WIP, but I am almost there - I will finish it tomorrow morning!

@pep8speaks


commented Jul 31, 2019

Hello @RafaelNH, thank you for updating!

Line 319:32: E127 continuation line over-indented for visual indent

Comment last updated at 2019-07-31 21:37:56 UTC

@RafaelNH RafaelNH changed the title (WIP) A functional implementation of Random matrix local pca. A functional implementation of Random matrix local pca. Jul 31, 2019

@RafaelNH

Contributor Author

commented Jul 31, 2019

Hi all - examples updated! This should be a nice first version of the MP-PCA for release 1.0. Let me know if there are further comments to address before merging.

@ShreyasFadnavis - I agree that it would be nice to have this method in the noise_estimate.py file. However, as I've mentioned above, this might not be a straightforward thing to do. We really need time to think about and discuss how to do this without duplicating code. I suggest doing this in a follow-up PR after this one is merged. At the moment, I am explaining how to obtain the different noise std estimates in the two PCA example files.

@ShreyasFadnavis
Member

left a comment

Minor documentation corrections. After these, this PR is GTG :) +1 for merge on my end!

theta[ix1:ix2, jx1:jx2, kx1:kx2] += this_theta
thetax[ix1:ix2, jx1:jx2, kx1:kx2] += Xest * this_theta
if return_sigma is True and sigma is None:
    var[ix1:ix2, jx1:jx2, kx1:kx2] += this_var * this_theta
    thetavar[ix1:ix2, jx1:jx2, kx1:kx2] += this_theta

denoised_arr = thetax / theta
denoised_arr.clip(min=0, out=denoised_arr)
denoised_arr[~mask] = 0

@ShreyasFadnavis (Member) Jul 31, 2019

@RafaelNH this is not your fault, but just a previous bug I guess:
Can you change this to..
denoised_arr[mask==0] = 0

Denoise images using Local PCA
===============================
=======================================================
Denoise images using Local PCA and empirical thresholds

@ShreyasFadnavis (Member) Jul 31, 2019

Denoise images using Local PCA via empirical thresholds

of anatomical information for different techniques such as DTI [Manjon2013]_,
spherical deconvolution [Veraart2016a] and DKI [Henri2018]_.
The basic idea behind the PCA-based denoising algorithm is to remove the

@ShreyasFadnavis (Member) Jul 31, 2019

The basic idea behind the PCA-based denoising algorithms is to remove the
Principal Components of the data that are mostly classified as noise.

The principal components classification can be performed based on prior noise
variance estimates [Manjon2013]_ (see :ref:`denoise_localpca`)or automatically
based on the Marcenko-Pastur distribution [Veraa2016a]_. In addition to noise
suppression, the PCA algorithm can be used to estimate the noise standard

@ShreyasFadnavis (Member) Jul 31, 2019

In addition to the noise
suppression, the PCA algorithm can be used to get the standard deviation of the estimated noise [Veraa2016b].

print("Entire denoised data saved in denoised_mppca.nii.gz")

"""
Additionally, we show on this example how the PCA denoising algorithm effects

@ShreyasFadnavis (Member) Jul 31, 2019

Additionally, we show in this example how the PCA denoising algorithm affects
different diffusion measurements.

RafaelNH added 18 commits Jul 21, 2019
RF: Refactoring local pca code to incorporate mp threshold
note this new feature does not require a prior sigma estimate
TEST: define gtab as global variable
test_lpca are now ~30% faster
TEST, BF: Test PCA sigma return functionality
bug was also detected and fixed - sigma should be a 3D volume
RF: convert pca_classifier as a auxiliar internal function
the order of output variable of this function was also changed
RF, BF: Implement optimal tau_factor for pca denoising and:
fix a bug on the expected functionality of a 3D sigma matrix. Previously, localpca was not exploring the spatial information of the 3D sigma matrix.
TEST: Check that pca denoising using sigma=None provides similar results to pca denoising using an inputted Random matrix optimal sigma
DOC: Complete MP-PCA example:
Proof read example
Add noise std estimate using MP-PCA
Add SNR estimate

@RafaelNH RafaelNH force-pushed the RafaelNH:mpPCA branch from 80ddb47 to c4a009a Jul 31, 2019

@skoudoro
Member

left a comment

Hi @RafaelNH,

Thank you for this nice work. Overall it looks good.

I just find it really confusing to use localpca everywhere. You switch algorithms just by modifying the sigma parameter, which is not clear everywhere. I think it would be good to be more explicit. You could have an interface function. I propose creating something like:

def mppca(arr, mask=None, pca_method='eig', patch_radius=2, tau_factor=None,
          return_sigma=False, out_dtype=None):
    return localpca(arr, sigma=None, mask=mask, pca_method=pca_method,
                    patch_radius=patch_radius, tau_factor=tau_factor,
                    return_sigma=return_sigma, out_dtype=out_dtype)

It does not do much, but it is really useful and clarifies which algorithm you are using. Of course, you will have to update your MP-PCA tutorial with this function name.

----------
L : array (n,)
Array containing the PCA eigenvalues in ascending order.
wdim : int

@skoudoro (Member) Jul 31, 2019

wdim or nvoxels

The basic idea behind the PCA-based denoising algorithms is to remove the
components of the data that are classified as noise. The Principal Components
classification can be performed based on prior noise variance estimates
[Manjon2013]_ (see :ref:`denoise_localpca`)or automatically based on the

@skoudoro (Member) Jul 31, 2019

missing space after or


denoised_arr = localpca(data, sigma=None, patch_radius=2)

print("Time taken for local PCA ", -t + time())

@skoudoro (Member) Jul 31, 2019

"Time taken for MP-PCA"

@Garyfallidis

Member

commented Jul 31, 2019

I like the design. But I also agree with Serge that we need an mppca utility function that calls localpca. It will help with the tutorial too. Also, on the localpca function, have Manjon cited first, and on mppca, have Veraart cited first.

@Garyfallidis

Member

commented Jul 31, 2019

Apart from the few comments that we have here, this is getting super close to being merged and included in this release. Rafael, please act quickly! Thank you in advance.

@RafaelNH

Contributor Author

commented Jul 31, 2019

@Garyfallidis, @skoudoro - Done! Many thanks for your comments. I've created a general pca function (genpca) containing the main body for both pca algorithms (localpca and mppca). I've only maintained the essential input parameters for the individual localpca and mppca functions. I think this indeed improves the function design a lot.

@skoudoro skoudoro added this to the 1.0 milestone Aug 1, 2019

@Garyfallidis Garyfallidis merged commit 4ef70d6 into nipy:master Aug 1, 2019

2 of 3 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
Codacy/PR Quality Review Up to standards. A positive pull request.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@15thai


commented Aug 1, 2019

Hi, I just saw that you have great work on Random matrix local pca already. I just have one question:
If your 3D patch data runs from patch_radius to arr.shape - patch_radius, does the border problem still exist after reconstructing the denoised image?
Thank you.

@skoudoro

Member

commented Aug 1, 2019

Yes, I agree, @15thai, I think there is a small border problem, but it is easy to fix with np.pad using the patch radius. It will be addressed in release 1.1.
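A minimal sketch of that np.pad fix, assuming a generic 4D (x, y, z, gradients) array and a stand-in `denoise` callable (hypothetical, not Dipy's implementation):

```python
import numpy as np

def denoise_with_padding(arr, denoise, patch_radius=2):
    """Pad a 4D volume by the patch radius so that border voxels get
    full sliding windows, denoise, then crop the padding away."""
    p = patch_radius
    # Pad only the three spatial axes; replicate edge values.
    padded = np.pad(arr, [(p, p), (p, p), (p, p), (0, 0)], mode="edge")
    out = denoise(padded)
    return out[p:-p, p:-p, p:-p, :]
```

The cropped output has the same shape as the input, so the surrounding pipeline does not need to change.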
