
Implement SEC-C style internal chunking of frequency domain correlations #285

Merged
merged 11 commits into from Jan 23, 2019

Conversation

calum-chamberlain
Member

@calum-chamberlain calum-chamberlain commented Dec 24, 2018

What does this PR do?

This PR is inspired by the SEC-C paper. It implements internal chunking of the FFTs of the data. Initial testing shows this to be faster than running really large FFTs. It should also be much more memory efficient because the size of FFTs can be reduced. At the moment this is controlled with an fft_len kwarg on the fftw correlation functions. This can be passed through from the matched-filter functions.

Why was it initiated? Any relevant Issues?

Senobari et al. (2018) make the point well that EQcorrscan can be very costly in memory. Currently the ways around this are either to use shorter processing lengths or to group templates, neither of which is very efficient. They present a simple alternative whereby many shorter FFTs can be computed for the correlations. This does not affect accuracy (as using an incorrect/different processing length between template and data would), and allows the template FFTs to be cached while looping through chunks of continuous data. A side effect is that longer streams of data could be worked on efficiently.
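The scheme described above can be sketched in plain NumPy as an overlap-save loop with the template FFT computed once and reused for every chunk. This is an illustrative sketch, not the PR's implementation: the function name `chunked_correlate` and its signature are hypothetical, and EQcorrscan's actual code uses FFTW via C, not `np.fft`.

```python
import numpy as np

def chunked_correlate(template, data, fft_len):
    """Cross-correlate `template` with `data` in the frequency domain,
    chunk by chunk (overlap-save), caching the template FFT.

    Equivalent to np.correlate(data, template, mode="valid"), but never
    computes an FFT longer than fft_len (which must exceed the
    template length).
    """
    n = len(template)
    step = fft_len - n + 1  # valid correlation samples produced per chunk
    # Cache the template spectrum once; conjugated for correlation.
    t_fft = np.conj(np.fft.rfft(template, fft_len))
    out = np.empty(len(data) - n + 1)
    for start in range(0, len(out), step):
        chunk = data[start:start + fft_len]  # rfft zero-pads short chunks
        seg = np.fft.irfft(np.fft.rfft(chunk, fft_len) * t_fft, fft_len)
        n_valid = min(step, len(out) - start)
        out[start:start + n_valid] = seg[:n_valid]
    return out
```

Each chunk of `fft_len` samples yields `fft_len - n + 1` uncontaminated correlation values; the circular-wraparound tail is discarded, which is why the chunks overlap by `n - 1` samples.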

To do:

  • Document this behaviour;
    - [ ] Estimate the most efficient fft_len on the fly given the memory restrictions of the system;
    - [ ] Further parallelism could be enabled, e.g. the current outer_core parallelism could be changed to work on the loop over chunks of continuous data;

  • Testing using Add correlation speed-test #180 would be good; some graphs demonstrating the different memory and time requirements would be nice.

  • Wait until Speed-up clustering #266 is merged, which makes changes to the C functions which will require some tweaking to merge with this.

  • Long-term, this could be memory efficient enough to be ported to the GPU, which could allow for some serious desktop speed-ups.
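On the first to-do (estimating fft_len from memory restrictions), one back-of-the-envelope approach is: each chunk needs one data spectrum plus the cached template spectra, all complex, so take the largest power of two that fits. The helper below is purely illustrative; the name, the byte accounting, and the neglect of FFTW workspace are all assumptions, not what EQcorrscan will implement.

```python
def max_fft_len(n_templates, available_bytes, bytes_per_sample=16):
    """Largest power-of-two FFT length such that one data spectrum plus
    n_templates cached template spectra (complex128, ~16 bytes per
    sample) fit within available_bytes.

    Rough sketch only: ignores FFTW plan workspace, real-FFT symmetry
    (which roughly halves spectrum storage), and time-domain buffers.
    """
    per_sample = (n_templates + 1) * bytes_per_sample
    fft_len = 1
    while fft_len * 2 * per_sample <= available_bytes:
        fft_len *= 2
    return fft_len
```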

PR Checklist

  • develop base branch selected?
  • This PR is not directly related to an existing issue (which has no PR yet).
  • All tests still pass.
  • Any new features or fixed regressions are covered by new tests.
  • Any new or changed features are fully documented.
  • Significant changes have been added to CHANGES.md.
    - [ ] First-time contributors have added their name to CONTRIBUTORS.md.

@calum-chamberlain calum-chamberlain self-assigned this Dec 24, 2018
@calum-chamberlain calum-chamberlain added this to In progress in 0.4.0 via automation Dec 24, 2018
@calum-chamberlain calum-chamberlain added this to the 0.4.0 milestone Dec 24, 2018
@calum-chamberlain
Member Author

There is an indexing error now with this that needs to be resolved. This is readily apparent in the correlate_test.py::TestArrayCorrelateFunctions::test_single_channel_similar test.

@calum-chamberlain
Member Author

So it looks like chunking is a win: faster, better at loading the CPUs, and less memory intensive. It also passes tests and gives the same results. The gist here shows some of the profiling I ran. I ran a range of other dataset sizes and found that an fft-length of 2**13 was always fastest on my machine. I am tempted to auto-set the fft-len to this.
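The kind of sweep behind that 2**13 figure could look something like the sketch below; the helper `time_fft_lengths` and the data sizes are hypothetical illustrations, not the profiling code from the gist.

```python
import timeit

import numpy as np

def time_fft_lengths(data, exponents, repeats=3):
    """Time chunked forward FFTs of `data` for a range of power-of-two
    lengths; returns mean seconds per full pass for each length."""
    timings = {}
    for exp in exponents:
        fft_len = 2 ** exp
        n_chunks = -(-len(data) // fft_len)  # ceiling division
        run = lambda: [
            np.fft.rfft(data[i * fft_len:(i + 1) * fft_len], fft_len)
            for i in range(n_chunks)]
        timings[fft_len] = timeit.timeit(run, number=repeats) / repeats
    return timings

# Example: pick the fastest length for ~1 hour of 100 Hz data.
data = np.random.randn(360_000)
timings = time_fft_lengths(data, exponents=range(11, 16))
best_fft_len = min(timings, key=timings.get)
```

The fastest length is machine-dependent (cache sizes, FFTW wisdom), which is an argument for running a sweep like this once per system rather than hard-coding a value.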

@calum-chamberlain
Member Author

I tested this on a cluster running Python 3.6 and Ubuntu 16.04 with gcc 5.4.0, and ran into libgomp "resource unavailable" errors. These go away when outer-loop threading is disabled. I was struggling to maintain that feature as it was, so I'm happy to remove it; its limited benefit is more than outweighed by this change.

@calum-chamberlain calum-chamberlain merged commit 5d6a146 into develop Jan 23, 2019
0.4.0 automation moved this from In progress to Done Jan 23, 2019
@calum-chamberlain calum-chamberlain mentioned this pull request Sep 3, 2019
14 tasks
@calum-chamberlain calum-chamberlain deleted the chunk-xcorr branch March 19, 2023 20:08