Non-deterministic result of antsAffineInitializer #444

oesteban · 2017-05-11T20:03:26Z

I've run antsAffineInitializer 10 times on the same inputs, and the 10 resulting transform.mat files are all different (judging from their md5 sums). I can imagine the 10 transforms are very close to one another, but they are not exactly the same. Does transform.mat encode some variable metadata (like date and time) that would invalidate a check based on md5 sums?

If checking the md5 is ok, is there a way to get deterministic results from this utility?

Thanks very much
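The md5 check described above can be sketched as follows. This is a minimal illustration only: the helper names `md5_of` and `distinct_digests` and the file paths in the usage note are hypothetical, and the sketch says nothing about what transform.mat actually contains.

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Return the hex md5 digest of a file's bytes."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def distinct_digests(paths):
    """Return the set of distinct md5 digests among the given files."""
    return {md5_of(p) for p in paths}
```

With ten outputs stored under hypothetical paths such as `run01/transform.mat` to `run10/transform.mat`, `len(distinct_digests(paths)) == 1` would mean all runs are byte-identical. Any other count only says the bytes differ, not by how much the transforms differ.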

spinicist · 2017-05-11T20:04:56Z

How many threads are you using? I've been having issues today with antsRegistration and the number of threads (I haven't got a minimal test case working so was holding off on reporting it). Toby

oesteban · 2017-05-11T20:13:19Z

@spinicist that's hilarious: I had a very long issue already written, ready to click on the submit button about the reproducibility of antsRegistration w.r.t. the number of threads. I have that test case, I can post it if you want. Bit-to-bit reproducibility of parallel computing is not trivial, so I thought I would need to set some tolerance and held off posting the issue.

While investigating that issue, when I was checking the inputs to antsRegistration, I realized that antsAffineInitializer gives you different results regardless of the number of threads. And that is more surprising.

That said, the test case for antsRegistration that I mentioned before is built without antsAffineInitializer (so all inputs to antsRegistration are exactly the same)

spinicist · 2017-05-11T20:16:57Z

Ha! Good to know I'm not crazy. It's @stnava or @ntustison who would have to decide if such a test case was useful to them. I think I have additional problems with masks that I'm trying to isolate.

But - maybe we should keep this thread about antsAffineInitializer? It sounds like your issue there is totally different, and threads was my only guess.

cookpa · 2017-05-11T20:17:15Z

I don't know if antsAffineInitializer uses random sampling. Have you tried antsAI? The random seed for that is hard coded.

The number of threads definitely has to be constant or the results will differ.
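One common reason a thread count can change a numerical result is that floating-point addition is not associative, so splitting a sum into a different number of partial sums can change the total. The sketch below illustrates only that general effect. It is not ANTs or ITK code; `naive_sum` and `chunked_sum` are hypothetical helpers, and the "threads" are simulated by contiguous chunks.

```python
import math


def naive_sum(values):
    """Left-to-right accumulation in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum contiguous chunks separately, then add the partial sums.

    This mimics a reduction split across n_chunks workers.
    """
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


values = [1e16, 1.0, -1e16, 1.0]
one_worker = chunked_sum(values, 1)   # 1.0
two_workers = chunked_sum(values, 2)  # 0.0
exact = math.fsum(values)             # 2.0, correctly rounded
```

The same four numbers give 1.0 or 0.0 depending on how the work is split, and neither matches the correctly rounded 2.0 from `math.fsum`.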

stnava · 2017-05-11T20:25:45Z

https://www.nist.gov/sites/default/files/documents/itl/ssd/is/NRE-2015-07-Nguyen_slides.pdf

There are some deep computational issues at hand here that combine floating-point error, randomization, and resource availability. See also https://itk.org/pipermail/insight-developers/2014-March/023731.html

We tried compensated summation and "cheap" rounding tricks. In my own experiments, the latter provided the easiest solution, but the approach was really hacky and probably would not truly fix the problem across platforms, so we abandoned it.

In the end, we know there are reproducibility issues. The reasons go all the way back to the ITK pipeline. Surely there is a better solution than what we have, but no convergence yet.

brian
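For readers unfamiliar with the term, compensated summation can be sketched with the Neumaier variant of Kahan's algorithm. This is a generic textbook illustration, not the code that was tried in ANTs or ITK, and `neumaier_sum` is a hypothetical name.

```python
def neumaier_sum(values):
    """Compensated (Neumaier) summation in double precision.

    A running correction term collects the low-order bits that
    plain accumulation would lose at each step.
    """
    total = 0.0
    correction = 0.0
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            # Low-order bits of x were lost in the addition.
            correction += (total - t) + x
        else:
            # Low-order bits of total were lost in the addition.
            correction += (x - t) + total
        total = t
    return total + correction


values = [1.0, 1e100, 1.0, -1e100]
compensated = neumaier_sum(values)  # 2.0

plain = 0.0
for v in values:
    plain += v                      # ends at 0.0
```

Compensation shrinks the rounding error a great deal, but the result can still depend on the order of the inputs in harder cases, so on its own it does not guarantee bit-identical output across thread counts.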


oesteban · 2017-05-11T20:36:44Z

@stnava thanks, that is why I held off posting anything about the reproducibility of antsRegistration w.r.t. the number of threads. Is antsAffineInitializer affected by the very same issue? I'm under the impression that here we are facing a different problem.

@cookpa antsAI is not in the previous 2.1.0 release, and since there are no binaries (yet) for the latest release, I will have to wait before trying it. I don't see any random sampling in the antsAffineInitializer code.

oesteban · 2017-05-11T21:55:39Z

I just checked that setting ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=1 yields a deterministic result for antsAffineInitializer. So this is the same reproducibility issue as in antsRegistration and, more generally, in the parallelization of floating-point calculations. Thank you all for your answers.
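A minimal sketch of applying that setting from Python, assuming the tool is launched as a subprocess. The environment variable name comes from the comment above; `single_thread_env` and `run_single_threaded` are hypothetical helpers, and the command line itself is left to the caller rather than guessed here.

```python
import os
import subprocess


def single_thread_env(base_env=None):
    """Return a copy of the environment with ITK limited to one thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS"] = "1"
    return env


def run_single_threaded(cmd):
    """Run a command (list of arguments) with the single-thread setting.

    cmd is supplied by the caller, for example the full
    antsAffineInitializer invocation as a list of strings.
    """
    return subprocess.run(cmd, env=single_thread_env(), check=True)
```

Building a copy of the environment keeps the setting local to that one call instead of changing it for the whole Python process.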


spinicist · 2017-05-22T13:22:36Z

For posterity, the issues I was having were at least partly to do with specifying --float in my antsRegistration call. Using double precision appears to make my metric values stable across runs with a limited number of threads (4). This is completely unsurprising! I should never have got into the habit of specifying --float in the first place.
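The gap between single and double precision can be illustrated with a small sketch. It assumes nothing about how ANTs accumulates its metric; `to_float32` and `add_float32` are hypothetical helpers that emulate single-precision arithmetic by rounding through a 4-byte float.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add_float32(a, b):
    """Add two numbers with the result rounded to single precision."""
    return to_float32(to_float32(a) + to_float32(b))


big = 16777216.0                       # 2**24
in_single = add_float32(big, 1.0)      # 16777216.0, the 1.0 is lost
in_double = big + 1.0                  # 16777217.0
```

Single precision starts dropping small addends at far smaller magnitudes than double precision, so order-dependent rounding differences become visible much sooner.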

This makes the output deterministic, at the cost of running unnecessarily slowly. The floating-point sums used internally are numerically unstable with respect to the order of addition. See:

* https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues
* ANTsX/ANTs#444 (comment)
* ANTsX/ANTsR#210 (comment)


This is also to make the output deterministic, at the cost of running slowly. It turned out that using dense sampling wasn't enough; there was still some numerical instability that came from the order of addition:

* https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues#variance-due-to-floating-point-precision-errors

For some reason it only appeared on OS X, and only about 10% of the time, and never on Linux. I [showed](#2642 (comment)) that the instability in isct_antsSliceRegularizedRegistration did exist on Linux, so something still unknown about how we call it was hiding it there.

This was actually supposed to be in place already but the code had atrophied, so all this does is fix it up.

See:

* https://github.com/ANTsX/ANTs/wiki/antsRegistration-reproducibility-issues
* ANTsX/ANTs#444 (comment)
* ANTsX/ANTsR#210 (comment)

oesteban closed this as completed May 11, 2017

oesteban added a commit to oesteban/nipype that referenced this issue May 11, 2017

[ENH] Enable num_threads for antsAffineInitializer

61fa9f1

This PR just enables the general control over the number of threads for this tool. I just learned that the tool behaves differently depending on the number of threads: ANTsX/ANTs#444

oesteban mentioned this issue May 11, 2017

[ENH] Enable num_threads for antsAffineInitializer nipy/nipype#2010

Merged

oesteban mentioned this issue Jul 28, 2017

[ENH] Use double resolution in ANTs by default nipreps/mriqc#608

Merged

ntustison mentioned this issue Mar 29, 2018

antsRegistration does not produce equivalent results ANTsX/ANTsR#210

Open

This was referenced Jul 12, 2018

Randomness in calculation of FD/Motion Parameters ANTsX/ANTsR#226

Closed

Reproducible kmeans segmentation ANTsX/ANTsRCore#51

Closed
