
NF : nlmeans now support arrays of noise std #572

Merged
merged 4 commits into dipy:master on Sep 21, 2015

Conversation

samuelstjean
Contributor

So that's it: the upper wrapper function now supports 3D arrays in addition to a single value for the noise standard deviation. All the inner functions now also work on arrays, but that is transparent to the user. In addition, one can now supply their own noise variance estimation, such as a volume computed with piesno for example.

All the tests pass, for whatever it's worth, and I also ran it on a T2 image I had lying around; it seems to work nicely, in the same way as the old one.
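The scalar-or-array handling the wrapper does could be sketched roughly as follows (a minimal numpy sketch only, not the actual dipy code; `as_sigma_array` is a hypothetical helper name):

```python
import numpy as np

def as_sigma_array(sigma, spatial_shape):
    # Hypothetical helper: broadcast a scalar noise std to a full 3D
    # volume, or validate a user-supplied 3D estimate (e.g. from piesno).
    if np.isscalar(sigma):
        return np.full(spatial_shape, sigma, dtype=np.float64)
    sigma = np.asarray(sigma, dtype=np.float64)
    if sigma.shape != spatial_shape:
        raise ValueError("sigma must be a scalar or match the data's "
                         "spatial shape %s" % (spatial_shape,))
    return sigma

# Single value, as before:
s1 = as_sigma_array(15.0, (20, 20, 20))
# Spatially varying estimate, same shape as the data:
s2 = as_sigma_array(np.full((20, 20, 20), 15.0), (20, 20, 20))
```

Either way the inner functions only ever see a full array, which is why the change is transparent to them.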

@@ -18,14 +18,14 @@ def test_nlmeans_padding():

 def test_nlmeans_static():
     S0 = 100 * np.ones((20, 20, 20), dtype='f8')
-    S0n = nlmeans(S0, sigma=1, rician=False)
+    S0n = nlmeans(S0, sigma=np.ones((20, 20, 20)), rician=False)
Contributor

I wouldn't replace the test here. Instead add the new test, together with the old one.

Contributor Author

The tests are defined on the inner nlmeans Cython function, while the wrapper nlmeans sets everything up for you. Because of this, I had to change the tests, or they would simply not work, since the inner mechanics are different.

The 3D/4D caller function still works as-is, which is what people are supposed to be calling.

@stefanv stefanv modified the milestone: 0.9 Mar 4, 2015
@@ -36,7 +36,7 @@ def nlmeans_3d(arr, mask=None, sigma=None, patch_radius=1,
     """

     if arr.ndim != 3:
-        raise ValueError('arr needs to be a 3D ndarray')
+        raise ValueError('data needs to be a 3D ndarray', arr.shape)
Contributor

No need to replace arr with data here, unless you also change arr to data in the parameters. But if you do that, then we will need to write it down in the API changes.

@Garyfallidis
Contributor

Hello, sorry for the delay in reviewing this. I think an important issue here is that there is no benchmark.

I would like to know how much performance we lose by adding a block of sigmas by default. Can you write a benchmark where you compare the execution speed of a single sigma versus an array of sigmas? Please use a full brain (DWI and T1) to get more realistic timings.

If the speed gets worse, we need to support both options: sigma as one number or as an array, so that at least the people who were already using nlmeans will not see any delays in execution time.
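A benchmark along those lines could be as simple as timing the same call with both sigma flavours (a sketch only; the toy `smooth` kernel below merely stands in for nlmeans, and `bench` is a made-up helper):

```python
import time
import numpy as np

def bench(fn, *args, repeat=3):
    # Best-of-N wall-clock timing; taking the best run reduces
    # scheduler and cache noise.
    best = float("inf")
    for _ in range(repeat):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

# Toy stand-in kernel: sigma is read once (scalar) vs per voxel (array).
def smooth(data, sigma):
    return data / (1.0 + np.asarray(sigma) ** 2)

data = np.random.rand(64, 64, 64)
t_scalar = bench(smooth, data, 1.0)
t_array = bench(smooth, data, np.ones(data.shape))
print("scalar: %.4fs, array: %.4fs" % (t_scalar, t_array))
```

The same harness applied to the real nlmeans on a full-brain DWI and a T1 would give the timings requested here.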

@samuelstjean
Contributor Author

For a 2 mm stack of 10 DWIs, the old runtime is 2 min and the new runtime is 2 min 40 s. For a T1 it's something like 1 min 20 s vs 1 min 40 s, I think. So it would be better to have two code paths, but that adds a lot of code duplication, of course.

@fmorency

Since #576 should be merged soon, this PR is the next step toward being able to use piesno easily. I will review the code, and I would like to relaunch the discussion about the benchmark results. @Garyfallidis, what's your call on this?

@arokem
Contributor

arokem commented Mar 19, 2015

Closed in favor of #576

@arokem arokem closed this Mar 19, 2015
@samuelstjean
Contributor Author

Why was this closed? These two issues are mostly unrelated. So far (to my knowledge), no function accepts a variable noise estimation array as provided by piesno. Enhancing nlmeans with this fix would actually make it the first one to use such an estimation, and we could then add such support to things like the restore fitting algorithm.

@arokem arokem reopened this Mar 19, 2015
@arokem
Contributor

arokem commented Mar 19, 2015

Sorry - my bad.

@arokem
Contributor

arokem commented Mar 19, 2015

I meant to close #390, of course.


@samuelstjean
Contributor Author

Oh well, it's only the first step in refactoring a bunch of stuff, but it will at least provide a basic example of how to do it and the possible improvements it can bring. This may be less apparent on 1.5T data, but you can actually see the noise varying spatially on 3T data as you move away from the coils toward the deep nuclei ;)

@Garyfallidis
Contributor

Hi @samuelstjean, because the delay is significant, the "code duplication" makes sense. Please create a variable that checks whether a single sigma value or an array of sigmas is provided, and execute the relevant code accordingly.

Also, while you are at it, can you address issue #608?
The default behaviour should be to use all cores.

Keep it up!
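The check being asked for could look like this (a sketch under the assumption that the two Cython code paths are exposed as separate inner functions; `_nlmeans_scalar` and `_nlmeans_array` are hypothetical names, stubbed out here):

```python
import numpy as np

def _nlmeans_scalar(data, sigma):
    # Stand-in for the fast single-sigma Cython path.
    return data - sigma

def _nlmeans_array(data, sigma):
    # Stand-in for the per-voxel-sigma Cython path.
    return data - sigma

def nlmeans_dispatch(data, sigma):
    # Take the fast path when sigma is a single number, so users
    # passing one value keep the original performance.
    if np.isscalar(sigma):
        return _nlmeans_scalar(data, float(sigma))
    sigma = np.asarray(sigma, dtype=np.float64)
    if sigma.shape != data.shape[:3]:
        raise ValueError("sigma must be a scalar or a 3D array matching "
                         "the data's spatial shape")
    return _nlmeans_array(data, sigma)
```

Both paths must of course stay in lockstep, which is the code-duplication worry raised later in this thread.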

@samuelstjean
Contributor Author

Maybe someday; it would require two code paths, and I don't have time to do this properly right now, with the dicom parser being more useful at the moment. Besides, adding the variable to keep track of the number of cores should be relatively easy, so @fmorency could do it.

@Garyfallidis
Contributor

Look, if you don't fix this problem of calling different code for different sigma inputs, I will not merge this PR. I thought it would be trivial for you to also fix what @fmorency raised, as you are in this codebase right now and you know what this algorithm does. So it would be much easier and faster for you than for @fmorency. If you think that it is too time consuming and difficult for you, that is okay. But you should definitely fix the issue of calling different code when there is one sigma or many. Otherwise, this PR is incomplete.

@samuelstjean
Contributor Author

Yes, I agree; I did not mean that it is good as it is right now. I'll just let it simmer and come back to it later, like end of April/May/this summer maybe.

In the meantime, adding the number-of-cores selection variable could be done separately and is easily doable in my opinion. Plus, there are a bunch of people doing little fixes for GSoC eligibility; you could have one of them do it to familiarize themselves with the codebase.

@arokem
Contributor

arokem commented Jun 24, 2015

Is this PR still a thing? Any chance to rebase it and revisit?

@samuelstjean
Contributor Author

I don't know if it should be split in two or what; it's pretty much the same except for the array access.
Making it explicit as two functions would duplicate stuff, but it would also be more flexible later for adding improvements and whatnot, so I don't know.

@arokem
Contributor

arokem commented Jun 25, 2015

See @Garyfallidis's comments above.

@samuelstjean
Contributor Author

I still think the small amount of added computing time is not worth splitting the code in two and risking changing one version without the other. While it would give more flexibility, it could also just be a lot of work for not much gain after all, hence I'm still undecided.

@samuelstjean samuelstjean deleted the nlmean_array branch July 13, 2015 13:18
@Garyfallidis
Contributor

@samuelstjean did you close this PR intentionally or accidentally?

@samuelstjean
Contributor Author

Intentional; it has been simmering since February, so I'll just restart fresh another day instead, by splitting stuff in two.

@Garyfallidis
Contributor

ok

@arnaudbore

I think nlmeans, piesno and even estimate_sigma are not well integrated as a whole.
Is there any update on this part of the code, @samuelstjean, @Garyfallidis? What about NLSAM?

@arokem
Contributor

arokem commented Aug 31, 2015

Thanks for the feedback! Could you please be a bit more specific? What difficulties have you encountered? What would you propose as a rational organization of this part (these parts) of the library? Feel free to give an example of the kind of code you would like to write when using these elements, to demonstrate how you would like to see these things better integrated.

Also - what's NLSAM? Link?

@samuelstjean
Contributor Author

Well, easy: this PR adds support for arrays of variance (which is the output of piesno), while nlmeans only supports single values. So if estimate_sigma fails, the user is stuck and cannot use piesno directly, because of limitations in the nlmeans function itself. So basically piesno serves no direct purpose unless you use advanced/custom functions not in mainline dipy. Well, it's just that people who want to use more features usually don't like pulling random people's branches, I guess.

NLSAM is the other denoising algorithm I'm developing, but that's going to come 2-3 PRs later, as I build upon those I made recently (the stabilisation PR for example) instead of putting 2000+ lines in at once.
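For the piesno integration discussed here: assuming piesno yields one noise estimate per axial slice, expanding that into the full 3D sigma volume this PR makes nlmeans accept is essentially a broadcast (illustrative glue code with a made-up name, not dipy API):

```python
import numpy as np

def sigma_volume_from_slices(sigma_per_slice, spatial_shape):
    # Expand a per-slice noise estimate into a full 3D sigma volume
    # (slices assumed to run along the last axis).
    sigma_per_slice = np.asarray(sigma_per_slice, dtype=np.float64)
    if sigma_per_slice.shape != (spatial_shape[2],):
        raise ValueError("expected one sigma per slice on the last axis")
    return np.broadcast_to(sigma_per_slice, spatial_shape).copy()

# e.g. noise rising from 10 to 20 across 30 slices:
sigma_vol = sigma_volume_from_slices(np.linspace(10, 20, 30), (64, 64, 30))
```

The resulting volume can then be passed as the sigma argument, which is exactly the use case this PR enables.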

@arnaudbore

Thank you @samuelstjean for the quick answer. Is there anything to do/code/merge to add this feature to nlmeans?

Is there an issue (labelled as enhancement) about NLSAM to follow its development?
NLSAM: http://samuelstjean.github.io/pages/downloads/st-jean_et_al_ismrm_2014.pdf

@samuelstjean
Contributor Author

Nah, it's only in my personal (private) repo and in 2-3 abstracts + my master's thesis so far. Until a bunch of other functions are in, it will be missing stuff, and I hate doing WIP PRs as they get drenched with commits which are probably not interesting.

@arnaudbore

First things first: nlmeans should be able to use piesno's output, otherwise it doesn't make sense to have a different way to compute sigma and not be able to use it automatically. I don't really understand what is missing here to make it work, @arokem, @Garyfallidis, @samuelstjean?

@arokem
Contributor

arokem commented Aug 31, 2015

Thanks - that really clarifies things for me. A similar request came up in the context of PR #677; see the post here:
http://gsoc2015dipydki.blogspot.co.uk/2015/08/rnh-post-12-attempt-to-further-improve.html

IIUC, you mean that you want to be able to do something like:

noise = piesno(data)
denoised = nlmeans(noise, data)

Seems like it should be possible to make that work, at least under some circumstances (see the post from Rafael, and the discussion on #677).


@arnaudbore

Thank you @arokem.
This is exactly what I was talking about. Like @RafaelNH in #677, I think we really need to work on the denoising part so people don't get confused about the way to use these functions (and @samuelstjean, if you could keep working on the NLSAM algorithm, it would be G.R.E.A.T).

@samuelstjean, is the support of arrays for nlmeans ready for merging?

@samuelstjean
Contributor Author

It has always been, since February.

@Garyfallidis
Contributor

Hi all. About the array support: as far as I know, Sam hasn't yet made the effort to keep the initial performance when a single sigma is provided rather than an entire array. What I mean is that his introduction of arrays makes the general algorithm slower even if only one sigma is provided. This is bad because we often use nlmeans for very large datasets, and it's bad to go from 4 hours to 6 hours even though we don't use multiple sigmas.

Also, @arnaudbore, I think NLSAM will probably be added after the journal paper has been published.

But yes we should improve what we already have. And do it before the coming release.

@arnaudbore

Thanks @Garyfallidis, I'm going to follow the development of NLSAM!!
@samuelstjean, can you find a way to make it run faster? Do you need help? Anything that can help you on this?

@samuelstjean
Contributor Author

The only speed drawback seems to be that accessing arrays in C is slower than having a single variable; that seems to be the only performance hit incurred. That being said, I also think it's a terrible idea to duplicate the whole thing just because of that, since if anything needs to be fixed/improved in the future, I'm pretty sure one version or the other will be left behind and produce different results because of it.

Now, I have no idea how the two could be used without major copy-pasting of stuff while keeping only a single version which would branch out in each case as needed, so if you have suggestions, feel free to post them. And about the speed performance, I don't think it can be made faster, as the change is only a few lines about memcpy stuff and whatnot.

@arnaudbore

@samuelstjean, are you 100% sure it will end up with different results? Did you run some tests? I understand that it's never a good feeling to have to duplicate code. If it's the only way to avoid going from 4 to 6 hours when you have only one sigma, let's go for it. I think @Garyfallidis will agree with me if I say that it is better to have this in the next release, even with duplicated code, than not having it at all.

@samuelstjean
Contributor Author

Not sure what you mean; I said the problem with two different code paths is that one is bound to be forgotten when the other is changed, thus leading to various problems. Just changing the function to accept an array will produce exactly the same result if the array is constant (I tested it).

About runtime, I recall something more like 2 min vs 2 min 40 s on a simple T1 image; no idea if that is 40 s per volume, though. Let's say a T1 has 8 times more voxels than regular DWIs (1 mm iso vs 2 mm iso). So even taking into account that the change applies to all volumes, that would rather be 40 s / 8 × 65 DWIs ≈ 5.4 min more than the regular version, which is not that bad in my opinion.

@Garyfallidis
Contributor

Okay, 2' (120 s) vs 2' 40" (160 s): this is 1/4 more time and not 1/3 more time, which is what I said. So, Sam, why don't you recalculate the time to make sure that the difference is indeed that small, and then reopen the PR?

@samuelstjean
Contributor Author

All done, and it does not seem to add much time on a 1.8 mm dataset with 65 DWIs.

stjs2902 ~/frank $ [frank] time run_nlmeans.py dwi.nii.gz runtime.nii.gz --noise_est basic --mask b0_brain_mask.nii.gz -N 1
real 5m58.365s
user 21m50.078s
sys 0m1.568s

stjs2902 ~/frank $ [frank] time run_nlmeans.py dwi.nii.gz runtime.nii.gz --noise_est basic --mask b0_brain_mask.nii.gz -N 1
real 6m50.071s
user 25m6.314s
sys 0m3.144s

So, roughly 50 s more in all, which is almost nothing.

@Garyfallidis
Contributor

Okay, this is good news. Reopen the PR. Let's have a quick look and then merge ASAP.


# this line takes most of the time! mem access
d = cache[(B + a) * BS * BS + (B + b) * BS + (B + c)] - cache[(m + a) * BS * BS + (n + b) * BS + (o + c)]
summ += d * d
sigm += sigma_block[(m + a) * BS * BS + (n + b) * BS + (o + c)]
Contributor

There is a lot of computation going on here to obtain the index: (m + a) * BS * BS does not change when b and c do. The same goes for (n + b) * BS w.r.t. c. (See also line 169.)

I don't know if the compiler already optimizes this for you, but if you need a speed-up it's worth trying.
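The suggested hoisting, shown in plain Python for illustration (the real kernel is Cython; `patch_sum` and the 5-element block side are made up for the example):

```python
# Loop-invariant hoisting of the flat-index arithmetic discussed above.
# `cache` is a flattened BS*BS*BS block; a, b, c scan a patch of the
# given radius around voxel (m, n, o).
BS = 5

def patch_sum(cache, m, n, o, radius):
    total = 0.0
    for a in range(-radius, radius + 1):
        row = (m + a) * BS * BS          # invariant w.r.t. b and c
        for b in range(-radius, radius + 1):
            col = row + (n + b) * BS     # invariant w.r.t. c
            for c in range(-radius, radius + 1):
                total += cache[col + (o + c)]
    return total

def patch_sum_naive(cache, m, n, o, radius):
    # Original form: the full index is recomputed in the inner loop.
    total = 0.0
    for a in range(-radius, radius + 1):
        for b in range(-radius, radius + 1):
            for c in range(-radius, radius + 1):
                total += cache[(m + a) * BS * BS + (n + b) * BS + (o + c)]
    return total
```

Both versions compute the same sum; whether the hoisted one is faster depends on how well the C compiler already does this, which is what gets tested below in the thread.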

Contributor Author

Does it really change that much for 5-6 additions and multiplications?

Contributor

Why don't you try it out and tell me if it does :P

Contributor Author

Saves like 4 seconds after all, so not worth the trouble.

Contributor

Ok, the compiler must already do that trivial optimization then. Thanks for testing it.

@samuelstjean samuelstjean reopened this Sep 1, 2015
@Garyfallidis
Contributor

sure okay.

@Garyfallidis
Contributor

@samuelstjean, you have forgotten to update the docstring for the sigma parameter in nlmeans. It still says that the expected parameter is a float, but it can now also be an array of floats, correct?

Apart from this, this PR is failing because of the rbf problem, which is irrelevant to this PR. So it feels like I should go ahead and merge it, unless other people think that I should wait first for the rbf problem. @arokem and @stefanv, any input on this? It seems to need an asap resolution. I am open to suggestions.

@arokem
Contributor

arokem commented Sep 12, 2015

Yes - fine for a merge on my end.


@samuelstjean
Contributor Author

I can correct the doc when I get access to a computer.

@arnaudbore

Ready for merging ;) ?

Garyfallidis added a commit that referenced this pull request Sep 21, 2015
NF : nlmeans now support arrays of noise std
@Garyfallidis Garyfallidis merged commit f0d46ce into dipy:master Sep 21, 2015
@Garyfallidis
Contributor

Done!
