Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add xcorr_detector function #2315

Open
wants to merge 21 commits into
base: master
from

Conversation

@trichter
Copy link
Member

trichter commented Feb 14, 2019

I added a new xcorr_detector function which is based on correlate_template and more flexible than templates_max_similarity.

  • return list of all detections
  • Streams dont have to be aligned
  • correct normalization by using correlate_template

PR Checklist

  • Correct base branch selected? master for new features, maintenance_... for bug fixes
  • All tests still pass.
  • Any new features or fixed regressions are be covered via new tests.
  • Any new or changed features have are fully documented.
  • Significant changes have been added to CHANGELOG.txt .
  • Add tutorial
  • Add figure to gallery

@trichter trichter added this to the 1.2.0 milestone Feb 14, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Feb 14, 2019

+DOCS

@krischer krischer added this to Waiting for Review in Release 1.2.0 Feb 14, 2019

@d-chambers
Copy link
Member

d-chambers left a comment

This looks cool @trichter. I like the idea of having a basic xcorr detector in obspy. Of course, if users outgrow it they can switch over to an external package like EQCorrScan, so it doesn't need to be wildly efficient or comprehensive.

A few suggestions:

  1. The xcorr_detector function is very long and doing a lot of things. Maybe it could be broken up into smaller functions? I am thinking of _prep_streams function to do all the trimming/seed_id filtering, _correlate_streams to preform the actual correlations, then a distinct plotting function (similar to plot_trigger). It might also be better to have a function that returns the ccs dictionary and a different function that picks out the detections from it, in the same spirit as the other detectors in obspy. If the output could somehow be used with coincidence_trigger that would be wonderful.

  2. Filling all the gaps with 0s on the input stream may not be what the user should do in all cases. Might be better to just require there are no gaps in the input stream to force the user to address data quality issues rather than returning potentially surprising results.

  3. This great addition could really use a tutorial page to add visibility. Maybe it belongs as a section in the Tirgger/Picker Tutorial, or perhaps it would merit a page of its own.

stream.merge(fill_value=0)
if len({tr.stats.sampling_rate for tr in stream + template}) > 1:
raise ValueError('Traces have different sampling rate')
ids = {tr.id for tr in stream} | {tr.id for tr in template}

This comment has been minimized.

@d-chambers

d-chambers Feb 14, 2019

Member

I think you mean ids = {tr.id for tr in stream} & {tr.id for tr in template}

This comment has been minimized.

@trichter

trichter Feb 14, 2019

Author Member

Yea, you are right

if len(ids) == 0:
raise ValueError('No traces with matching ids in template and stream')
stream.traces = [tr for tr in stream if tr.id in ids]
template.traces = [tr for tr in template if tr.id in ids]

This comment has been minimized.

@d-chambers

d-chambers Feb 14, 2019

Member

These change the stream instances in place, perhaps worth copying first?

This comment has been minimized.

@megies

megies Feb 14, 2019

Member

A shallow copy might even suffice (without havin looked through all code, depends on whether that stream is just throwaway inside a block or returned actually)

@megies

This comment has been minimized.

Copy link
Member

megies commented Feb 14, 2019

@trichter would be good to have a docs Tutorial page with an example use case. If you need data for it we can put it in http://examples.obspy.org/ via https://github.com/obspy/examples

@megies megies added the .signal label Feb 14, 2019

@megies megies force-pushed the xcorr_detector branch from 44db214 to 30fbc3f Feb 14, 2019

@megies

This comment has been minimized.

Copy link
Member

megies commented Feb 14, 2019

Rebased on current master and force-pushed so that we have fresh CI results tomorrow.

@megies megies force-pushed the xcorr_detector branch from 30fbc3f to ab967d9 Feb 14, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Feb 14, 2019

This looks cool @trichter. I like the idea of having a basic xcorr detector in obspy. Of course, if users outgrow it they can switch over to an external package like EQCorrScan, so it doesn't need to be wildly efficient or comprehensive.

Thanks, that was exactly my idea. Also it is just natural now that the correlate_template function is available.

1. The `xcorr_detector` function  is very long and doing a lot of things. Maybe it could be broken up into smaller functions? I am thinking of `_prep_streams` function to do all the trimming/seed_id filtering, `_correlate_streams` to preform the actual correlations, then a distinct plotting function (similar to `plot_trigger`). It might also be better to have a function that returns the ccs dictionary and a different function that picks out the detections from it, in the same spirit as the other detectors in obspy.  If the output could somehow be used with `coincidence_trigger` that would be wonderful.

The preparation is not that much. It is also a lot about setting default values for optional argument. If this is moved to a separate function the docs would be split up, too.
But I would be OK with spliting up the function into two: correlate_stream_template and xcorr_detector. The first function could also return a stream with correlations, which could be used with the coincidence_trigger, too. I don't know how useful this is. Still, I like the approach of a single function a bit more. Other opinions?

2. Filling all the gaps with 0s on the input stream may not be what the user should do in all cases. Might be better to just require there are no gaps in the input stream to force the user to address data quality issues rather than returning potentially surprising results.

That sounds quiet reasonable.

3. This great addition could really use a tutorial page to add visibility. Maybe it belongs as a section in the Tirgger/Picker Tutorial, or perhaps it would merit a page of its own.

Good idea, I will create a short tutorial, just wanted to get some opinions first.

@megies

This comment has been minimized.

Copy link
Member

megies commented Feb 14, 2019

  1. The xcorr_detector function ....

The preparation is not that much. It is also a lot about setting default values for optional argument. If this is moved to a separate function the docs would be split up, too.
But I would be OK with spliting up the function into two: correlate_stream_template and xcorr_detector. The first function could also return a stream with correlations, which could be used with the coincidence_trigger, too. I don't know how useful this is. Still, I like the approach of a single function a bit more. Other opinions?

Time is tight, kind of. So I'd be OK with leaving it inside a god function for now. Refactoring can still be done afterwards if anybody feels a need to reuse part of the funtionality.

  1. Filling all the gaps with 0s ..

That sounds quiet reasonable.

I agree, usually I'd prefer raising an exception on strange/unexpected input. Users should be forced to take care of their input rather than loads of obscure automagic.

  1. This great addition could really use a tutorial page

Good idea, I will create a short tutorial, just wanted to get some opinions first.

🚀

@megies

This comment has been minimized.

Copy link
Member

megies commented Feb 14, 2019

Circle CI is failing, yay! That means rebasing (all the open PRs) was already useful for something (to see this issue over all the test failure noise). 🚀

@trichter trichter moved this from Waiting for Review to In Progress in Release 1.2.0 Feb 15, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Feb 15, 2019

Can somebody have a look? Later today I will create the tutorial.

@trichter trichter moved this from In Progress to Waiting for Review in Release 1.2.0 Feb 15, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Feb 15, 2019

@ThomasLecocq

This comment has been minimized.

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Feb 15, 2019

It's working now, the doc buildbot had not yet finished. I still have to figure out how to make the plots appear. Consider the tutorial as WIP.

@trichter trichter force-pushed the xcorr_detector branch from 0efe992 to aefb6e6 Feb 18, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Feb 18, 2019

I rebased and force-pushed


from obspy import read, UTCDateTime as UTC
from obspy.signal.cross_correlation import correlate_stream_template, similarity_detector

template = read('https://examples.obspy.org/IC.MDJ.2013.043.mseed')
template.filter('bandpass', freqmin=0.5, freqmax=2)
template.plot()

.. plot::
:context:

This comment has been minimized.

@megies

megies Feb 18, 2019

Member

from the Sphinx docs it's quite unclear how long this context stays active..

This comment has been minimized.

@trichter

trichter Feb 19, 2019

Author Member

I get your point. If I use :context: reset in the first code cell that should be OK.

@trichter trichter moved this from Waiting for Review to In Progress in Release 1.2.0 Feb 19, 2019

@trichter trichter moved this from In Progress to Waiting for Review in Release 1.2.0 Feb 20, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Mar 13, 2019

@d-chambers I think this PR is ready for review. Could you have a look again?

@calum-chamberlain
Copy link
Contributor

calum-chamberlain left a comment

I think this looks really good, and agree that it is great to have this in obspy - I'm looking forward to using it for teaching next year! The tutorial is nice too, although it might be nice to use named arguments in the tutorial so people don't have to flick back and forth between the tutorial and the docs to remind themselves of what each argument means?

Show resolved Hide resolved obspy/signal/cross_correlation.py
Show resolved Hide resolved obspy/signal/cross_correlation.py Outdated
@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Mar 14, 2019

I think this looks really good, and agree that it is great to have this in obspy - I'm looking forward to using it for teaching next year! The tutorial is nice too, although it might be nice to use named arguments in the tutorial so people don't have to flick back and forth between the tutorial and the docs to remind themselves of what each argument means?

Thanks for the review. I agree with the named arguments.

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Mar 14, 2019

Looks good, test failures are unrelated.

@trichter trichter moved this from Waiting for Review to Waiting for final manual validation by Core Dev in Release 1.2.0 Mar 14, 2019

@d-chambers

This comment has been minimized.

Copy link
Member

d-chambers commented Mar 14, 2019

@trichter if @calum-chamberlain is happy I am happy 👍

@megies megies self-assigned this Mar 15, 2019

@trichter

This comment has been minimized.

Copy link
Member Author

trichter commented Mar 18, 2019

I had a second thought about this PR. Here are my conclusions:

  • Maybe it is better to merge the three functions again into one. They need to a certain degree the same arguments and it is a bit cumbersome to call all three functions in a row. And maybe it is a bit discouraging for a beginner. The stream_correlate_template() function could stay separate, but could be called from the merged function.
  • It would be a nice extension to accept not only one template but also a list or dictionary of templates and return the detections with the index or key of the corresponding template.
  • I think templates_max_similarity can be deprecated, it also does not return the information from which template the best correlation coefficient originates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.