Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream.rotate: enable rotating multiple groups of traces simultaneously #3155

Merged
merged 11 commits into from
Oct 13, 2022

Conversation

megies
Copy link
Member

@megies megies commented Sep 29, 2022

What does this PR do?

Fix Stream.rotate() with multiple-station Streams

Why was it initiated? Any relevant Issues?

Fixes #2623 with an alternative approach to #3124

PR Checklist

  • Correct base branch selected? master for new features, maintenance_... for bug fixes
  • This PR is not directly related to an existing issue (which has no PR yet).
  • If the PR is making changes to documentation, docs pages can be built automatically.
    Just add the "build_docs" tag to this PR.
    Docs will be served at docs.obspy.org/pr/{branch_name} (do not use master branch).
    Please post a link to the relevant piece of documentation.
  • If all tests including network modules (e.g. clients.fdsn) should be tested for the PR,
    just add the "test_network" tag to this PR.
  • All tests still pass.
  • Any new features or fixed regressions are covered via new tests.
  • Any new or changed features are fully documented.
  • Significant changes have been added to CHANGELOG.txt .
  • First time contributors have added your name to CONTRIBUTORS.txt .
  • If the changes affect any plotting functions you have checked that the plots
    from all the CI builds look correct. Add the "upload_plots" tag so that plotting
    outputs are attached as artifacts.
  • Add the "ready for review" tag when you are ready for the PR to be reviewed.

@trichter
Copy link
Member

trichter commented Sep 29, 2022

I am unsure here. The grouping will not work e.g. for multiple event waveforms at the same station which is very common for receiver function processing. Should we not better rely on the user sorting the stream correctly and mention this in the docs (if not already done). We could even state a sort expression working in most cases. (I think stream.sort(reverse=True) should do the trick.)

By using sets the current implementation may change the order of the groups.

Edit: Adding to that, we could simply warn if the seed ids of the traces in one rotation do not fit together, similar as an error is raised if times do not match.

@megies
Copy link
Member Author

megies commented Oct 4, 2022

Hmm.. might need to think about this more. I still think this might be the better approach and it can be done cleanly (I havent looked into the CI fails yet xD). Maybe similar to Stream._trim_common_channels or expanding that a bit, so it would not only group together by SEED ID but also in a second stage by common time windows. I don't think doing that would be much more work on top and I think it might be more convenient than relying on user sorting correctly. Hmm, not sure..

@trichter
Copy link
Member

trichter commented Oct 4, 2022

Yes, that would be more convenient if working correctly. Please also use the updated test case from the other PR.

Using trim_common_channels would also loosen the very restrictive time span condition.
Maybe even make this "iteration over 3-component streams" a public method or function? See #1918.

@megies
Copy link
Member Author

megies commented Oct 12, 2022

@trichter the problem with this was just that the order of traces changed in some cases. I added a self.sort() after any and all of the rotation operations now, so it will stay consistent in the future (even though that means now results come back ENZ etc). This changes behavior a tiny wee little bit, but I think thats acceptable here for the bugfix release.
I think relying on order of traces and using indexing to access traces from a stream is a bad idea in all cases anyway, if not with a rigorous select() call before it.

I think this recursion approach is cleaner, you OK with this change?

@@ -2011,18 +2011,22 @@ def test_rotate(self):
st = read()
st += st.copy()
st[3:].normalize()
for tr in st[3:]:
tr.stats.location = '01'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are cheating here ;)

IMHO this test should pass without these lines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, I think it's a weird test case, why would you process a fully duplicated exact same waveform set of 2x the same three Traces?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is completely nonsense. For me, it conceptually represents the rotation of waveforms of two different events (with different baz) registered at the same station. Changing the starttime instead of the location code would be more in line with this idea. I do not know if this was the original intention.

@trichter
Copy link
Member

I think this recursion approach is cleaner, you OK with this change?

I agree the recursion approach is better. But I think we should still handle streams of more than 3 traces in one seed id group (see inline comment). More than a single event at the same station inside one stream is a perfectly valid use case.

I think the sorting is not really necessary? Why not use a dictionary instead of a set which keeps the order?

        seed_id_groups = {}     
        for tr in self:
            ...
            seed_id_groups[(net, sta, loc, cha_prefix)] = None

@megies
Copy link
Member Author

megies commented Oct 13, 2022

I think the sorting is not really necessary? Why not use a dictionary instead of a set which keeps the order?

I'm not sure why the sort would hurt, tbh. Any code that is strictly relying on the order of traces in a stream mixed with multiple stations, channels and even same stations for different times and expecting to pick some trace by an index number like stream[6] is a catastrophe waiting to happen. Any code that is not using a rigorous stream.select(network=.., station=.., location=.., ..) before grabbin a trace is a banana peel on a sidewalk.

With proper automatic selecting/grouping traces out of an arbitrary mix of traces in a stream, order of traces fortunately becomes irrelevant, so ordering in output should not matter either, I would argue.

In any case, maybe bump this to

But I think we should still handle streams of more than 3 traces in one seed id group

I already do in e.g. ->ZNE option, it's just that the other methods are super hacky and I think should get either a recursion leaving the old working code alone that allegedly works for one set of traces or should get a full rewrite with proper code. I'm just not sure if further tweaking of that old hacky part of the code is the right way forward.

Anyway @trichter, I think I'll bump this to 1.4.0 or even 1.4.1 even. 1.3.1 is months overdue with lots of important fixes that break installations with new versions, so it can't be held up any further.

@megies megies added this to the 1.4.1 milestone Oct 13, 2022
@trichter
Copy link
Member

Any code that is not using a rigorous stream.select(network=.., station=.., location=.., ..) before grabbin a trace is a banana peel on a sidewalk.

Yea, you are correct, but then why sort inside rotate at all?

I already do in e.g. ->ZNE option,
->ZNE is different because ZNE is well defined, but RT and LQT are only defined together with baz and inc.

I pushed some more commits. Order of the seed id groups is now preserved. And I added the fix from #3124. If tests pass, this can be merged in 1.3.1 IMHO. Would this be OK with you?

@trichter trichter mentioned this pull request Oct 13, 2022
17 tasks
@megies
Copy link
Member Author

megies commented Oct 13, 2022

Yea, you are correct, but then why sort inside rotate at all?

Mainly, I did not like the notion of order of traces coming out in a different order after changing internal implementation. So I thought, might be good to just sort everything, so that on future implementation changes the order would not change anymore. But I don't really care, can stay without sort for all I care.

I already do in e.g. ->ZNE option,

->ZNE is different because ZNE is well defined, but RT and LQT are only defined together with baz and inc.

If baz/inc is specified in the rotate call, it doesn't make sense to mix multiple stations anyway. If a global baz/inc is valid for all or it instead is attached to each Stats, then I don't see why the same approach as for ZNE shouldnt work.

Aaaaaanyway, I'll have a look at the changes now ;)

@megies
Copy link
Member Author

megies commented Oct 13, 2022

Ah, I think I didn't ever understand what your PR was doing, to be honest, the commit message in this one helped. Sorry about the confusion.. 🙈

I also wasn't aware that dictionaries keep order of items stored in them these days..

@trichter
Copy link
Member

If baz/inc is specified in the rotate call, it doesn't make sense to mix multiple stations anyway. If a global baz/inc is valid for all or it instead is attached to each Stats, then I don't see why the same approach as for ZNE shouldnt work.

For ->ZNE, different stations have different rotations this is caught by the seed id grouping, for ->LQT different events have different rotations, this is not caught by the seed id grouping

Aaaaaanyway, I'll have a look at the changes now ;)

:)

@megies
Copy link
Member Author

megies commented Oct 13, 2022

for ->LQT different events have different rotations, this is not caught by the seed id grouping

Ah, yeah, good point. I just didn't latch on to the fact that the code just used to grab baz/inc from first trace no matter what 🤦

@megies
Copy link
Member Author

megies commented Oct 13, 2022

Had to rebase n force push

@megies megies merged commit 4a4f45a into maintenance_1.3.x Oct 13, 2022
@megies megies deleted the rotate_multiple_seed_ids branch October 13, 2022 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants