Implement SEC-C style internal chunking of frequency domain correlations #285
Conversation
There is an indexing error with this now that needs to be resolved. This is readily apparent in the
So it looks like chunking is good: faster, more efficient at loading CPUs, and less memory intensive. It also passes tests and gives the same results. The gist here shows some of the profiling I ran. Across a range of other dataset sizes I found that an fft-length of 2**13 was always fastest on my machine. I am tempted to auto-set the fft-len to this.
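The kind of comparison behind that finding can be sketched with a toy micro-benchmark (illustrative only; this is not the profiling code from the gist, and `many_short` is a hypothetical helper name):

```python
# Illustrative micro-benchmark: time one long FFT against many chunked
# 2**13-point FFTs over the same total number of samples.
import timeit

import numpy as np

n_total = 2 ** 20
data = np.random.randn(n_total)

def many_short(fft_len):
    # FFT the data in consecutive chunks of fft_len samples.
    for start in range(0, n_total, fft_len):
        np.fft.rfft(data[start:start + fft_len])

t_long = timeit.timeit(lambda: np.fft.rfft(data), number=3)
t_short = timeit.timeit(lambda: many_short(2 ** 13), number=3)
print(f"one {n_total}-point FFT: {t_long:.4f} s; "
      f"chunked 2**13-point FFTs: {t_short:.4f} s")
```

Which side wins will vary with machine, FFT library and data size, which is why profiling across dataset sizes (as above) matters.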
I tested this on a cluster running Python 3.6 on Ubuntu 16.04 with gcc 5.4.0 and ran into libgomp errors for
What does this PR do?
This PR is inspired by the SEC-C paper. It implements internal chunking of the FFTs of the data. Initial testing shows this to be faster than running really large FFTs. It should also be much more memory efficient because the size of the FFTs can be reduced. At the moment this is controlled with an `fft_len` kwarg on the fftw correlation functions, which can be passed through from the matched-filter functions.
Why was it initiated? Any relevant Issues?
Senobari et al. (2018) make the point well that EQcorrscan can be very costly in memory. Currently the way around this is to use either shorter processing lengths, or to group templates; neither option is very efficient. They present a simple alternative whereby many shorter FFTs can be computed for the correlations. This does not affect accuracy (as using an incorrect/different processing length between template and data would), and allows the template FFTs to be cached while looping through chunks of continuous data. A side effect of this is that longer streams of data could be worked on efficiently.
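The chunked frequency-domain correlation described above can be sketched in NumPy (an illustrative sketch, not EQcorrscan's fftw/C implementation; the function names are hypothetical). The template spectrum is computed once and reused for every chunk of continuous data:

```python
import numpy as np

def xcorr_full(template, data):
    # Reference approach: one large FFT covering the whole correlation.
    n = len(data) + len(template) - 1
    nfft = 1 << (n - 1).bit_length()
    spec = np.fft.rfft(data, nfft) * np.conj(np.fft.rfft(template, nfft))
    return np.fft.irfft(spec, nfft)[:len(data) - len(template) + 1]

def xcorr_chunked(template, data, fft_len=2 ** 13):
    # SEC-C style: many short FFTs. The template spectrum is cached and the
    # data are processed in overlapping chunks of fft_len samples.
    assert fft_len >= 2 * len(template), "chunk must comfortably fit template"
    step = fft_len - len(template) + 1  # valid correlation lags per chunk
    t_fft = np.conj(np.fft.rfft(template, fft_len))  # computed once, reused
    out = np.empty(len(data) - len(template) + 1)
    for start in range(0, len(out), step):
        chunk = data[start:start + fft_len]  # rfft zero-pads a short tail
        cc = np.fft.irfft(np.fft.rfft(chunk, fft_len) * t_fft, fft_len)
        n_valid = min(step, len(out) - start)
        out[start:start + n_valid] = cc[:n_valid]
    return out
```

Both return the same un-normalised cross-correlation sums; the real correlation functions also normalise, which is omitted here for brevity.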
To do:
- [ ] Document this behaviour;
- [ ] Estimate the most efficient `fft_len` on the fly given the memory restrictions of the system;
- [ ] Further parallelism could be enabled, e.g. the current `outer_core` parallelism could be changed to work on the loop over chunks of continuous data;
- [ ] Testing using Add correlation speed-test #180 would be good; some graphs demonstrating the different memory and time requirements would be nice;
- [ ] Wait until Speed-up clustering #266 is merged, which makes changes to the C functions that will require some tweaking to merge with this.
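One possible shape for the "estimate `fft_len` on the fly" item is a memory-budget heuristic (entirely hypothetical, not code from this PR; the cost model and `estimate_fft_len` name are assumptions):

```python
def estimate_fft_len(template_len, n_templates, n_channels,
                     available_bytes, dtype_bytes=4, max_exp=22):
    """Pick the largest power-of-two FFT length that fits a memory budget.

    Rough cost model: (n_templates + 1) cached spectra per channel (the
    template spectra plus one data-chunk spectrum), each complex spectrum
    costing ~2 * dtype_bytes per FFT point.
    """
    # Smallest workable length: leave room for at least one valid lag
    # without circular wrap-around.
    min_len = 1 << (2 * template_len - 1).bit_length()
    for exp in range(max_exp, 0, -1):
        fft_len = 1 << exp
        if fft_len < min_len:
            break
        spectra = (n_templates + 1) * n_channels
        bytes_needed = spectra * fft_len * 2 * dtype_bytes
        if bytes_needed <= available_bytes:
            return fft_len
    # Nothing fits the budget: fall back to the minimum workable length.
    return min_len
```

This only bounds memory; since profiling suggests a sweet spot around 2**13 for speed, a real implementation would probably take the smaller of the memory-feasible length and the empirically fastest one.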
Long-term, this could be memory efficient enough to be ported to the GPU, which could allow for some serious desktop speed-ups.
PR Checklist
- [ ] `develop` base branch selected?
- [ ] Changes are documented in `CHANGES.md`.
- [ ] First time contributors have added your name to `CONTRIBUTORS.md`.