
New and improved Cross correlator #494

Merged
merged 12 commits into from Mar 24, 2017

Conversation

@jrmlhermitte (Contributor) commented Mar 16, 2017

I met with Mark Sutton and we discussed the CrossCorrelator code. He had a lot of input and useful advice, and he wrote some modifications that made the code 40% faster. I went through them and added some comments to make them easier to understand for a future contributor.

Changes
1. Used more efficient indexing tricks to index into images
2. Removed the need to expand the image (fftconvolve does this for free); got rid of _crop_from_mask
3. Removed the wrap parameter (since fftconvolve expands the image to speed up computations,
wrapping is not possible). This parameter was never used, so it is now simply ignored.
4. Modified tests for the 2D and 1D CrossCorrelator. The new code's results are identical to the old,
except that the new code wraps the image more efficiently.
5. Removed the memory-intensive allocation of arrays during object initialization.
Instead, the arrays are allocated and de-allocated per computation (per bin id).
This is likely comparable in cost to zeroing the arrays, which happened regardless.
6. Removed self.ids, which was not very useful.

The testing

I went through the changes and updated the tests so that they pass. They didn't at first, but the reason is minor: my old version's array was 1 or 2 pixels larger, because it was less efficient and included zero-padded regions. I also plotted the test results and compared the differences; again, they're minor. I also tested in my own code and in the [notebook example](https://github.com/scikit-beam/scikit-beam-examples/blob/master/demos/spatial_correlation/Spatial_Correlations.ipynb).

The future

I plan on also adding a feature that forces the output of the correlations to be regular. For example, right now 2D correlations return lists of 2D arrays of varying shapes, but normally all we're interested in is, say, a 20x20 subregion of each array. However, given the amount of code here, I figured I would stop at this point for this PR. If you think we should wait for that feature, I'm also fine with that.
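A hypothetical sketch of what that "regular output" feature could look like (the function name and windowing logic here are illustrative, not part of this PR):

```python
import numpy as np

def central_crop(corr, shape=(20, 20)):
    """Return the central `shape` subregion of a 2D correlation array."""
    sy, sx = shape
    cy, cx = corr.shape[0] // 2, corr.shape[1] // 2
    y0, x0 = cy - sy // 2, cx - sx // 2
    return corr[y0:y0 + sy, x0:x0 + sx]

# correlations of varying shapes all reduce to a common 20x20 window
corrs = [np.ones((31, 33)), np.ones((25, 29))]
regular = [central_crop(c) for c in corrs]
```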

Thanks for all the support!


# now handle the normalizations
# Note, in this code, non-overlapping regions will now get np.nan
Contributor Author (jrmlhermitte):

Note: I made a mistake here. Non-overlapping regions will get np.inf (from 1/np.array(0)), not np.nan. I need to think about this later; just leaving a quick note. (Suggestions welcome.)
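A quick illustration of the point (the counts are made up): floating-point division by zero yields inf, not nan, so zero-count regions come out as np.inf after normalization:

```python
import numpy as np

# hypothetical overlap counts; 0 means a non-overlapping region
maskcorr = np.array([4.0, 2.0, 0.0])

# 1/0 in IEEE floating point is inf, not nan
with np.errstate(divide='ignore'):
    norm = 1 / maskcorr
```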

@licode (Contributor) commented Mar 16, 2017

Thanks for all the explanation. I will start reviewing this very soon.

@licode (Contributor) commented Mar 17, 2017

Any paper or reference from Mark on how to make cross correlation faster? Or is this purely code optimization? I just want to get the big picture before diving into it.

@licode (Contributor) commented Mar 17, 2017

Another question, related to forward analysis: once you get this correlation curve, do you need to run a statistical test, for example to test whether it is stationary or not? Or do you mainly fit it with a physics function?

@jrmlhermitte (Contributor Author) commented Mar 17, 2017

@licode there is no paper. The speedup is mainly from some better indexing (explained in the comment around line 930). There is also less memory usage, in that large arrays are not allocated in the initialization; instead, arrays are allocated and de-allocated on the fly.

The use case for these correlation curves is just to measure some shift.
It is similar to this:
http://scikit-image.org/docs/dev/auto_examples/transform/plot_register_translation.html#sphx-glr-auto-examples-transform-plot-register-translation-py

except that we want to see the cross-correlation itself, and we want it to handle large numbers of bins. Oh, and we fit the correlations to a Gaussian to measure the shift. We don't perform any statistical tests. We don't test stationarity because this is not a time correlation; it is a spatial correlation for one specific time interval.

On top of that, I just noticed the linked paper on that page:
https://www.osapublishing.org/ol/abstract.cfm?uri=ol-33-2-156

They introduce three algorithms there. I don't have time now, but I may take a look as well. We could potentially test these in the future. However, we've used the current version and obtained some nice results; we're writing up a paper with physically meaningful results that we hope to submit soon.

So I would say this code is meant to speed up the original code; in the future, I'll review that linked paper as well and see how it performs.
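The Gaussian-fit shift measurement mentioned above can be sketched roughly as follows (a synthetic 1D curve standing in for a correlation; the actual analysis code is not part of this PR):

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, x0, sigma, offset):
    return amp * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2)) + offset

# synthetic 1D correlation curve, peaked at a shift of 23.4 pixels
x = np.arange(50, dtype=float)
corr = gaussian(x, 10.0, 23.4, 3.0, 1.0)

# initial guess from the data, then fit; the peak position is the shift
p0 = [corr.max() - corr.min(), x[np.argmax(corr)], 2.0, corr.min()]
popt, _ = curve_fit(gaussian, x, corr, p0=p0)
shift = popt[1]
```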

@licode (Contributor) commented Mar 17, 2017

I am quite familiar with that paper. I used that code from skimage for image registration. We can talk more about that later.

@danielballan (Member):
Just chiming in to say this is exciting!

bind = mask[pi, pj].astype(np.uint32)

# sort them so that we can easily index into subregions with one number
order = np.argsort(bind)
Contributor:

any reason to order the data here?

Contributor Author (jrmlhermitte):

Yes, it's actually the most important step for allowing quicker indexing. When the data is ordered, you can make lots of assumptions; for example, the nonzero points of np.diff(bind[order]) are exactly where the identifier changes. Note that np.diff(bind) (without ordering) doesn't give any information. (I assume bind and everything else is flattened, of course.)
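The ordering trick can be illustrated with a small sketch (the bin values are made up; the real bind comes from the mask):

```python
import numpy as np

# hypothetical bin identifiers, one per (flattened) pixel
bind = np.array([2, 0, 1, 1, 0, 2, 1])

order = np.argsort(bind)       # sort pixels by bin id
sorted_bind = bind[order]      # -> [0, 0, 1, 1, 1, 2, 2]

# nonzero points of np.diff mark where the bin id changes, so each
# bin's pixels occupy one contiguous slice of `order`
boundaries = np.nonzero(np.diff(sorted_bind))[0] + 1
starts = np.concatenate(([0], boundaries))
ends = np.concatenate((boundaries, [len(bind)]))

# pixel indices belonging to each bin id, without any per-bin search
bins = {int(sorted_bind[s]): order[s:e] for s, e in zip(starts, ends)}
```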

Contributor:

got it, smart idea.

Contributor Author (jrmlhermitte):

You can credit Mark Sutton for that (he doesn't have a GitHub account). That's the integral piece he added here. Yeah, pretty clever :-)

maskcorr = _cross_corr(submask)
# quick fix: finite values should be integers, so
# choose some small value to threshold
maskcorr *= maskcorr > .5
Contributor:

Does this threshold need to be changeable, or should it stay fixed?

Contributor Author (jrmlhermitte):

I would leave it. The FFT introduces small finite values even where the result should be zero (as opposed to brute-force convolution), but they're very small, and we know nonzero results should be at least 1 (since the submask is essentially 1's and 0's), so thresholding at 0.5 is safe.

However, perhaps a better way to do this is to round to the nearest integer. You can't just typecast to int, though, I think; the reason is that maskcorr.astype(int) would turn values like 2.99999 into 2 when they should be 3, etc. Maybe it's something to try in a later pull request?

(I should mention this trick just ensures close-to-zero values become zero, since they're used to figure out which regions contain information. Perhaps we should also round values that are close to their intended integers, so that the FFT convolution becomes (nearly) equal to the brute-force convolution.)
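A small illustration of why truncation fails but rounding to the nearest integer works (the values are made-up examples of FFT round-off):

```python
import numpy as np

# fftconvolve of 0/1 masks should give integer overlap counts,
# but FFT round-off leaves values like 2.99999999 or 1e-12
maskcorr = np.array([2.99999999, 1.0e-12, 1.00000001, 7.00000002])

truncated = maskcorr.astype(int)   # 2.99999999 -> 2: loses a count
rounded = np.rint(maskcorr)        # nearest integer: recovers counts
```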

Contributor:

I see. I thought the convolution results might be complicated, but actually they're either 0 or at least 1, so 0.5 works fine as a threshold.

if self_correlation:
Icorr2 = _cross_corr(self.submasks[i], self.tmpimgs[i] *
Icorr2 = _cross_corr(self.submasks[i], tmpimg *
Contributor:

It looks to me like Icorr and Icorr2 are the same? When you do a cross correlation between A and B, does it matter which one goes first?

Contributor Author (jrmlhermitte):

Not exactly the same; one is the reverse of the other, which is very important for the symmetric averaging. However, instead of running the convolution again, we could do:

Icorr2 = np.copy(Icorr[::-1, ::-1])

(I'm not sure the copy is necessary; I'll have to check.) I have to think about it to be certain and test just in case... (but I'm fairly sure this is equivalent). Maybe that's cleaner? Thanks, I'll check later :)
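The reversal relation can be checked with a quick sketch, assuming the cross correlation is computed as in the quoted _cross_corr (fftconvolve of the first image with the reversed second image):

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
A = rng.random((5, 5))
B = rng.random((5, 5))

# cross-correlation of A with B, via convolution with the reversed image
Icorr = fftconvolve(A, B[::-1, ::-1], mode='full')
# ... and of B with A
Icorr2 = fftconvolve(B, A[::-1, ::-1], mode='full')

# Icorr2 equals Icorr reversed along every axis, so the second
# convolution could be replaced by a reversed view of the first
```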

Contributor:

Right, I forgot the inverse relation. I think there's no need to copy here.

Contributor Author (jrmlhermitte):

Thanks for the feedback; I'll change it to a reversed index a little later (need to be careful with the 1D and 2D cases).

Contributor:

I vote we leave it as it is for now. We may need to deal with it differently in 1D and 2D.

@@ -1113,77 +1167,6 @@ def _cross_corr(img1, img2=None):
# fftconvolve(A,B) = FFT^(-1)(FFT(A)*FFT(B))
# but need FFT^(-1)(FFT(A(x))*conj(FFT(B(x)))) = FFT^(-1)(A(x)*B(-x))
reverse_index = [slice(None, None, -1) for i in range(ndim)]
imgc = fftconvolve(img1, img2[reverse_index], mode='same')
imgc = fftconvolve(img1, img2[reverse_index], mode='full')
Contributor:

The size will not be the same when using 'full'.

Contributor Author (jrmlhermitte):

It's true, but it works out here; I should explain. Previously, the way I ran the computation (ifft(fft(a)*conj(fft(b)))), I had to double the image size and then run the convolution. The reason for doubling is to ensure that boundary points are not correlated together.

When I switched to fftconvolve, as you suggested previously, I expanded the image and used mode='same' to ensure the output was the same size as this doubled array.

However, what I did not appreciate (and Mark Sutton pointed out) is that fftconvolve does this magic for you: it doubles your image (plus a few extra pixels to reach a fast FFT size) to prevent this wraparound. Thus, expanding the image and using mode='same' was more or less doing the same thing twice. All that's needed is to not expand the image, let fftconvolve do it for you, and set mode='full'.

In light of that, after dropping the image doubling, I had to select mode='full' to ensure the output is roughly the same size. In this case fftconvolve is smarter than me, and the array is actually consistently slightly smaller (i.e. it wraps the boundaries as tightly as possible).

I did testing and saw the array sizes were off by one or two pixels (but never more). I plotted both the 1D and 2D cases, compared them to previous data, and they're almost the same.

I hope I explained that correctly? It is easier to explain in person.
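A sketch of the equivalence described above, assuming the cross correlation is computed as in the quoted _cross_corr (fftconvolve with a reversed second argument); the manual-padding path reproduces the old behavior, the unpadded mode='full' path the new one:

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
img = rng.random((8, 8))

# old approach: zero-pad to double size, then crop with mode='same'
padded = np.zeros((16, 16))
padded[:8, :8] = img
old = fftconvolve(padded, padded[::-1, ::-1], mode='same')

# new approach: no manual padding; fftconvolve pads internally
new = fftconvolve(img, img[::-1, ::-1], mode='full')

# results agree, but `new` is one pixel smaller per side:
# (15, 15) instead of (16, 16), with the padded zero row/column dropped
```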

Contributor:

good to know. thanks for the explanation.

ravel()[self.subpxlsts[i]])**2

if self_correlation:
ccorr /= self.maskcorrs[i] * \
Contributor Author (jrmlhermitte):

non-overlapping regions will be np.inf instead of 0 now. Maybe we should change this?

Contributor Author (jrmlhermitte):

the code has been changed so that non-overlapping regions are np.nan, for the record

# choose some small value to threshold
maskcorr *= maskcorr > .5
maskcorr[np.where(maskcorr == 0)] = np.nan
Contributor Author (jrmlhermitte):

I added this line so that zero regions (basically where there is no correlation information) lead to np.nan. The previous convention set them to zero, but I think np.nan makes more sense. This is a slight convention change, but I don't think many people have used it yet; I could revert to zero if that's better.
attn: @licode
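A minimal illustration of the new convention (the maskcorr values are made up): zero counts become nan before normalization, so non-overlapping regions come out as nan rather than inf or 0.

```python
import numpy as np

# hypothetical overlap counts; 0 means no correlation information
maskcorr = np.array([[4.0, 2.0, 0.0],
                     [2.0, 1.0, 0.0]])

# mark zero-count regions as nan before dividing
maskcorr[np.where(maskcorr == 0)] = np.nan

ccorr = np.ones((2, 3))
normalized = ccorr / maskcorr   # nan where there was no overlap
```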

@licode merged commit 24aad63 into scikit-beam:master on Mar 24, 2017
@jrmlhermitte (Contributor Author):
great thanks! :-)

@licode (Contributor) commented Mar 24, 2017

sure. good work!
