New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PPSD Fix exact trace cutting; includes test #2040

Merged
merged 6 commits into from Jan 17, 2018

Conversation

Projects
None yet
3 participants
@Jollyfant
Contributor

Jollyfant commented Jan 9, 2018

Fixes #2039

@Jollyfant Jollyfant requested a review from megies Jan 9, 2018

@Jollyfant Jollyfant added the .signal label Jan 9, 2018

@megies megies added this to the 1.1.1 milestone Jan 9, 2018

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Jan 9, 2018

Member

Would be cool if you added a test that checks the length of the time series going into the individual PSD calculations.

Member

megies commented Jan 9, 2018

Would be cool if you added a test that checks the length of the time series going into the individual PSD calculations.

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 9, 2018

Contributor

Is there a private keeping track of this..? I can't find it.

Contributor

Jollyfant commented Jan 9, 2018

Is there a private keeping track of this..? I can't find it.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Jan 9, 2018

Member

Probably not..

Actually, there's a property that keeps the expected number of samples in a slice though..

 348         # Trace length for one psd segment.
 349         self._len = int(self.sampling_rate * self.ppsd_length)
 350         # make an initial dummy psd and to get the array of periods
 351         _, freq = mlab.psd(np.ones(self.len), self.nfft,
 352                            self.sampling_rate, noverlap=self.nlap)

..and it's also checked when the individual PSDs are calculated:

 746     def __process(self, tr):
 747         """
 748         Processes a segment of data and save the psd information.
 749         Whether `Trace` is compatible (station, channel, ...) has to
 750         checked beforehand.
 751        
 752         :type tr: :class:`~obspy.core.trace.Trace`
 753         :param tr: Compatible Trace with data of one PPSD segment
 754         :returns: `True` if segment was successfully processed,
 755             `False` otherwise.
 756         """
 757         # XXX DIRTY HACK!!
 758         if len(tr) == self.len + 1:
 759             tr.data = tr.data[:-1]
 760         # one last check..
 761         if len(tr) != self.len:
 762             msg = "Got a piece of data with wrong length. Skipping"
 763             warnings.warn(msg)
 764             print(len(tr), self.len)
 765             return False

..and there's also some code that was probably used to handle such 1-sample-too-long slices..

Member

megies commented Jan 9, 2018

Probably not..

Actually, there's a property that keeps the expected number of samples in a slice though..

 348         # Trace length for one psd segment.
 349         self._len = int(self.sampling_rate * self.ppsd_length)
 350         # make an initial dummy psd and to get the array of periods
 351         _, freq = mlab.psd(np.ones(self.len), self.nfft,
 352                            self.sampling_rate, noverlap=self.nlap)

..and it's also checked when the individual PSDs are calculated:

 746     def __process(self, tr):
 747         """
 748         Processes a segment of data and save the psd information.
 749         Whether `Trace` is compatible (station, channel, ...) has to
 750         checked beforehand.
 751        
 752         :type tr: :class:`~obspy.core.trace.Trace`
 753         :param tr: Compatible Trace with data of one PPSD segment
 754         :returns: `True` if segment was successfully processed,
 755             `False` otherwise.
 756         """
 757         # XXX DIRTY HACK!!
 758         if len(tr) == self.len + 1:
 759             tr.data = tr.data[:-1]
 760         # one last check..
 761         if len(tr) != self.len:
 762             msg = "Got a piece of data with wrong length. Skipping"
 763             warnings.warn(msg)
 764             print(len(tr), self.len)
 765             return False

..and there's also some code that was probably used to handle such 1-sample-too-long slices..

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 9, 2018

Contributor

So we can check if ppsd.len is equal 3600 in my test case? That should be OK. (fyi it is 👍)

Also that print statement below the dirty hack should probably go too. During processing it keeps printing numbers without any meaning to my terminal (37 35). I can do some clean up but I'm sure there were reasons this code was put in.

Contributor

Jollyfant commented Jan 9, 2018

So we can check if ppsd.len is equal 3600 in my test case? That should be OK. (fyi it is 👍)

Also that print statement below the dirty hack should probably go too. During processing it keeps printing numbers without any meaning to my terminal (37 35). I can do some clean up but I'm sure there were reasons this code was put in.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Jan 9, 2018

Member

Ok, so the problem is reduced to later slices (e.g. at the end of daily stream object) having slightly later starttime (by a few samples) due to the malformed slicing.. and thus the last slice being excluded since it's starting some milliseconds later then expected? Nice catch!

Member

megies commented Jan 9, 2018

Ok, so the problem is reduced to later slices (e.g. at the end of daily stream object) having slightly later starttime (by a few samples) due to the malformed slicing.. and thus the last slice being excluded since it's starting some milliseconds later then expected? Nice catch!

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 9, 2018

Contributor

Yeah that is basically just it, nothing worrisome.

Contributor

Jollyfant commented Jan 9, 2018

Yeah that is basically just it, nothing worrisome.

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 9, 2018

Contributor

Can this go in maintenance_1.1.x since this is essentially a bug fix? Or maybe that's later than 1.1.1.. idk!

Contributor

Jollyfant commented Jan 9, 2018

Can this go in maintenance_1.1.x since this is essentially a bug fix? Or maybe that's later than 1.1.1.. idk!

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Jan 9, 2018

Member

Personally, I'm OK if this goes into maintenance.

Member

megies commented Jan 9, 2018

Personally, I'm OK if this goes into maintenance.

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 10, 2018

Contributor

What going on with CI? It's bailing on unrelated floating point accuracies.

Contributor

Jollyfant commented Jan 10, 2018

What going on with CI? It's bailing on unrelated floating point accuracies.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Jan 10, 2018

Member

What going on with CI? It's bailing on unrelated floating point accuracies.

Oh this is nasty :( They changed the string representation of arrays with numpy 1.14.0.

Member

krischer commented Jan 10, 2018

What going on with CI? It's bailing on unrelated floating point accuracies.

Oh this is nasty :( They changed the string representation of arrays with numpy 1.14.0.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Jan 10, 2018

Member

After looking at it in a bit more detail its maybe not that bad: The printing style can be chosen and has support for a "legacy" mode. I'll open a new PR and then we can rebase the others on it.

Member

krischer commented Jan 10, 2018

After looking at it in a bit more detail its maybe not that bad: The printing style can be chosen and has support for a "legacy" mode. I'll open a new PR and then we can rebase the others on it.

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 11, 2018

Contributor

After looking at it in a bit more detail its maybe not that bad: The printing style can be chosen and has support for a "legacy" mode. I'll open a new PR and then we can rebase the others on it.

Can we stick that in one central place or do all scripts need updating?

Contributor

Jollyfant commented Jan 11, 2018

After looking at it in a bit more detail its maybe not that bad: The printing style can be chosen and has support for a "legacy" mode. I'll open a new PR and then we can rebase the others on it.

Can we stick that in one central place or do all scripts need updating?

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Jan 11, 2018

Member

I'll push later today - it took a bit longer because there are also a ton of new warnings in numpy 1.14. I put it in the central obspy-runtests script and also added configuration files for pytest so it works for those without a lot of effort.

Member

krischer commented Jan 11, 2018

I'll push later today - it took a bit longer because there are also a ton of new warnings in numpy 1.14. I put it in the central obspy-runtests script and also added configuration files for pytest so it works for those without a lot of effort.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Jan 12, 2018

Member

Done in #2044.

Member

krischer commented Jan 12, 2018

Done in #2044.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Jan 16, 2018

Member

The failures should be gone now if you rebase.

Member

krischer commented Jan 16, 2018

The failures should be gone now if you rebase.

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 16, 2018

Contributor

Okay! Rebased and force pushed with a changelog entry. We pick maintenance_1.1.x or master? I put this under 1.1.x in the changelog.

Contributor

Jollyfant commented Jan 16, 2018

Okay! Rebased and force pushed with a changelog entry. We pick maintenance_1.1.x or master? I put this under 1.1.x in the changelog.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Jan 16, 2018

Member

I would tend to do this in master as it does change the calculations (however slightly) but I'll leave it up to @megies .

Member

krischer commented Jan 16, 2018

I would tend to do this in master as it does change the calculations (however slightly) but I'll leave it up to @megies .

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Jan 16, 2018

Member

@Jollyfant do you have a strong need to have this in 1.1.x? I don't care too much either way, as it will only bother expert/power users most likely, but in principle @krischer is right that since it changes things (however slightly) it might be better suited for master.
On the other hand you might be able to argue that it's fixing a bug, so.. dunno

Member

megies commented Jan 16, 2018

@Jollyfant do you have a strong need to have this in 1.1.x? I don't care too much either way, as it will only bother expert/power users most likely, but in principle @krischer is right that since it changes things (however slightly) it might be better suited for master.
On the other hand you might be able to argue that it's fixing a bug, so.. dunno

@Jollyfant

This comment has been minimized.

Show comment
Hide comment
@Jollyfant

Jollyfant Jan 16, 2018

Contributor

Nah master is fine if we go for biannual releases 🐇

Contributor

Jollyfant commented Jan 16, 2018

Nah master is fine if we go for biannual releases 🐇

@megies megies modified the milestones: 1.1.1, 1.2.0 Jan 16, 2018

for i, time in enumerate(ppsd._times_processed):
current = UTCDateTime("2017-01-01T00:00:00") + (i * 30 * 60)
self.assertTrue(UTCDateTime(time) == current)

This comment has been minimized.

@megies

megies Jan 17, 2018

Member

interesting.. so I guess start times of individual PSD slices were never tested so far.. :-/

@megies

megies Jan 17, 2018

Member

interesting.. so I guess start times of individual PSD slices were never tested so far.. :-/

@megies

megies approved these changes Jan 17, 2018

Looks good to me.

@krischer krischer merged commit 1fbdebb into obspy:master Jan 17, 2018

2 of 4 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
docker-testbot docker testbot results not available yet
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment