Reftek mass position channels #2678

MaevaP · 2020-07-08T17:42:41Z

What does this PR do?

Reftek mass position channel data are '16' encoded and have a sampling rate of 0.1 Hz.
This PR modifies section of the obspy.io.reftek.core routines that deal with upacking data with '16' and '32' encodings to account for number of samples in each data packet.
This PR also changes the type of EH_PAYLOAD['sampling_rate'] from int to float in the obspy.io.reftek.packet module.

Why was it initiated? Any relevant Issues?

This PR was initiated to provide read support for Reftek 130 mass position channel data.

PR Checklist

Correct base branch selected? master for new features, maintenance_... for bug fixes
This PR is not directly related to an existing issue (which has no PR yet).
If the PR is making changes to documentation, docs pages can be built automatically.
Just remove the space in the following string after the + sign: "+ DOCS"
If any network modules should be tested for the PR, add them as a comma separated list
(e.g. clients.fdsn,clients.arclink) after the colon in the following magic string: "+TESTS:"
(you can also add "ALL" to just simply run all tests across all modules)
All tests still pass.
Any new features or fixed regressions are be covered via new tests.
Any new or changed features have are fully documented.
Significant changes have been added to CHANGELOG.txt .
First time contributors have added your name to CONTRIBUTORS.txt .

obspy/io/reftek/packet.py

obspy/io/reftek/core.py

megies · 2020-07-10T14:16:05Z

So, if I understand correctly this actually does two very different things

store samples per second as a float. the official manual states that field is ASCII integer but I guess since reftek itself is writing those floating point ASCII values, we'll have to adapt. I'm always feeling slightly uneasy storing integers as float, but as far as I understand, there should be no floating point accuracy issues for any integer in the range +/-2**24 when storing as a 32 bit float, so I guess we're good
allow for packets that are not filling the fixed amount of data space in the packet. I never thought of that case, since the reftek plugin it started out with STEIM encodings unpacked by libmseed which takes care of any empty space at the end of packets, I guess

Otherwise this looks good to me, especially since existing tests pass and you have a test for the new case.

todos:

one liner in changelog
add yourself to contributors (unless you dont want to)
would be good if we could make the string representation have a guaranteed fixed width again (maybe just use str(...)[:4], since the original value's string representation also used to fit in 4 ASCII chars?)
that other inline comment about setting the itemsize in bytes

The part about taking into account the number of samples in each packet / empty space at end could be considered a bug fix, I guess. So we could think about putting it into maintenance_1.2.x. On the other hand not sure if we will have a 1.2.3 release even, so , I guess it can go into master as proposed and if need be it can be backported.

MaevaP · 2020-07-10T16:54:44Z

Hi Tobias,
You are correct the updates do two different things.
Regarding the sample rate, the manual is unfortunately a bit misleading (or out-of-date) because it implies that the sample rate is always an integer which is not the case for the auxiliary data streams. Most PASSCAL experiments end up configuring the sample rate of the auxiliary channels to 0.1 Hz.
I just addressed all the items in your to-do list. It looks like, however, that I can't update the CHANGELOG.txt file without creating a conflict and making the checks fail.
Thank you for all the feedback and positively considering my PR.

dsentinel · 2020-07-10T18:52:33Z

Thanks @MaevaP !
This looks good to me as well, with Tobias's suggestions.

I just addressed all the items in your to-do list. It looks like, however, that I can't update the CHANGELOG.txt file without creating a conflict and making the checks fail.

Rebase from the current master and push.... I could have sworn there used to be a button for this
OR
hit the resolve conflicts button above

@megies ;

there should be no floating point accuracy issues for any integer in the range +/-2**24 when storing as a 32 bit float, so I guess we're good

Ugh, yes, floats are such a pain. Sometimes decimal comes to the rescue.
I think this shouldn't be an issue either given the parameters from rt's such as as SR's are not values like 20.1.

Not a clean way to solve the printing problem as str(SR)[:4] -> 500. with a dangling dot, but only a little hard on the eyes.

MaevaP · 2020-07-10T20:06:00Z

Thank you for the feedback @dsentinel.
The conflict has been resolved.

megies · 2020-07-11T12:52:15Z

Not a clean way to solve the printing problem as str(SR)[:4] -> 500. with a dangling dot, but only a little hard on the eyes.

if we wanna be super clean, we could check int(samprate) == samprate then %d (like before) otherwise the construct we have right now. but i dont care, you can leave as is as far as i'm concerned, its just a print representation that will only very rarely be used anyway.

obspy/CONTRIBUTORS.txt

megies · 2020-07-11T13:11:17Z

Just the ordering in contributors, and if you're comfortable with that please git rebase -i and just go through with it without edits to get rid of that merge commit. I can do it if its unclear

MaevaP · 2020-07-13T16:44:23Z

Hi Tobias,
I changed the ordering in the CONTRIBUTORS file and pushed the updates after running git pull --rebase origin reftek_soh. I am not sure whether this is what you were asking for. I am not very familiar with git rebase.
Thank you for the feedback and help.

obspy/io/reftek/core.py

CHANGELOG.txt

obspy/io/reftek/core.py

megies · 2020-07-14T08:09:51Z

I am not very familiar with git rebase.
Thank you for the feedback and help.

For git rebase you want to specify the base branch, so e.g. git rebase origin/master (unless you git fetch origin before it might not do anything though). It basically is an alternative to git merge origin/master. Rebasing will disrupt history, so other people working on the same branch will have to rewrite their history too, if they want to keep working on the same PR. But rebasing gives a clean history, that could even be merged into the base branch with just a fast-forward. Merging will keep current history intact, but on the other hand it introduces ugly merge commits and make history have lots of branching and merging. In obspy we clean up PRs by doing rebases, which is usually fine since mostly one person is usually in charge of a PR.

…(sampling_rate=0.1Hz and 16 encoding)

…ling_rate is now a float

…ng representations of Reftek object and Packets

…sentations, and changed bytes_ to be static

Update name ordering in contributors file

Co-authored-by: dsentinel <lloyd@passcal.nmt.edu>

add some comments, easier to read code

megies · 2020-07-15T09:51:48Z

I took over for the (hopefully) last iteration of changes, since I don't want this to stay dangling for too long, as it's hard to come back to stuff like this after a while.

If CI goes green I'd appreciate a final review @dsentinel

MaevaP · 2020-07-15T15:14:00Z

Awesome! Thanks for taking over and addressing that quickly. All these changes make sense and are an improvement.
Thanks again to both for the help and feedback.

dsentinel

Looks good to me. Merge as is, or minor changes.

dsentinel · 2020-07-15T17:47:24Z

obspy/io/reftek/core.py

@@ -314,15 +310,38 @@ def to_stream(self, network="", location="", component_codes=None,
                            sample_data = _unpack_C0_C2_data(packets_,
                                                             encoding)
                        elif encoding in ('16', '32'):
-                            dtype = {'16': np.int16, '32': np.int32}[encoding]
+                            # rt130 stores in big endian
+                            dtype = {'16': '>i2', '32': '>i4'}[encoding]


dsentinel · 2020-07-15T17:49:16Z

obspy/io/reftek/core.py

-                            sample_data = sample_data.flatten().view(dtype)
-                            # switch endianness, rt130 stores in big endian
-                            sample_data = sample_data.byteswap()
+                            sample_data = sample_data.view(dtype)


A bit strange to take a view of a variable that is being overwritten by itself. I think astype would be cleaner, but there's a chance of a copy, the conditions which are vague from the docs so this is fine with me.

obspy/io/reftek/core.py

Co-authored-by: dsentinel <lloyd@passcal.nmt.edu>

megies · 2020-07-16T14:19:42Z

thanks for the PR @MaevaP good stuff! 🎉

megies · 2020-07-16T14:24:31Z

@MaevaP if you can be asked, it would be nice to test if it also works if the half-full packet is in the middle of the contiguous packet list. You happen to have real data with a half-full packet that is contiguously followed by some other packets? I noticed in the test, that the half-full packet is at the end, so the test case is lacking a bit

MaevaP · 2020-07-16T23:36:02Z

@megies That's a good point.
With all the test data sets I have worked with so far, I have never seen an incomplete packet in the middle of the contiguous packet list. But that being said, I have only recently started working with rt130 data.
I can manipulate the packets in the test data set I provided to create a data set that would fit the bill if you think that it would still be useful for testing purposes.

megies · 2020-07-17T08:14:59Z

if you think that it would still be useful for testing purposes.

I think it'd be worth it, yes. I'm not using these numpy function on a daily basis, and we are cutting out some parts of a contiguous array and just return a view, so I'm wondering if that cuts it, especially if we run something that is working on the C pointers from numpy afterwards.. or if we need to make a copy after the cutting out middle parts to be safe.. I have things in mind like #192, #193 or #1704

megies reviewed Jul 9, 2020

View reviewed changes

obspy/io/reftek/packet.py Outdated Show resolved Hide resolved

megies reviewed Jul 10, 2020

View reviewed changes

obspy/io/reftek/core.py Outdated Show resolved Hide resolved

megies self-assigned this Jul 10, 2020

megies added this to the 2.0.0 milestone Jul 10, 2020

megies added the .io issues generally related to our read/write plugins label Jul 10, 2020

megies reviewed Jul 11, 2020

View reviewed changes

obspy/CONTRIBUTORS.txt Outdated Show resolved Hide resolved

dsentinel reviewed Jul 13, 2020

View reviewed changes

obspy/io/reftek/core.py Outdated Show resolved Hide resolved

CHANGELOG.txt Outdated Show resolved Hide resolved

obspy/io/reftek/core.py Outdated Show resolved Hide resolved

obspy/io/reftek/core.py Outdated Show resolved Hide resolved

MaevaP and others added 7 commits July 15, 2020 10:35

updates to core.py and packet.py to deal with mass position channels …

e14834e

…(sampling_rate=0.1Hz and 16 encoding)

Unit test and test data sets for mass position channel related updates

d5c6758

Updates to string representations of Reftek object and Packets - samp…

42b94dd

…ling_rate is now a float

Updates to __str__() method in EHPacket class to account for new stri…

441232a

…ng representations of Reftek object and Packets

Updated changelog and contributors files, fixed width of string repre…

63635d8

…sentations, and changed bytes_ to be static

Updates to changelog and contibutors files

dfb9a3a

Update name ordering in contributors file

reword changelog a bit

8464072

megies force-pushed the reftek_soh branch from 98de7d9 to 8464072 Compare July 15, 2020 08:38

megies and others added 5 commits July 15, 2020 10:41

reftek: make parts a bit easier to read

d798789

Co-authored-by: dsentinel <lloyd@passcal.nmt.edu>

reftek: simplify code as proposed by @dsentinel

6feba2b

reftek: refactoring to be nice to our future selves

0a1cd7e

add some comments, easier to read code

reftek: refactoring, use slice instead of arange

01fa2cb

reftek: should fix reading on big endian systems, untested though..

7cb87a6

dsentinel approved these changes Jul 15, 2020

View reviewed changes

Update obspy/io/reftek/core.py

2776601

Co-authored-by: dsentinel <lloyd@passcal.nmt.edu>

megies merged commit 18013bf into obspy:master Jul 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reftek mass position channels #2678

Reftek mass position channels #2678

MaevaP commented Jul 8, 2020 •

edited

megies commented Jul 10, 2020 •

edited

MaevaP commented Jul 10, 2020

dsentinel commented Jul 10, 2020 •

edited

MaevaP commented Jul 10, 2020

megies commented Jul 11, 2020

megies commented Jul 11, 2020

MaevaP commented Jul 13, 2020

megies commented Jul 14, 2020

megies commented Jul 15, 2020

MaevaP commented Jul 15, 2020

dsentinel left a comment

dsentinel Jul 15, 2020

dsentinel Jul 15, 2020

megies commented Jul 16, 2020

megies commented Jul 16, 2020

MaevaP commented Jul 16, 2020

megies commented Jul 17, 2020

Reftek mass position channels #2678

Reftek mass position channels #2678

Conversation

MaevaP commented Jul 8, 2020 • edited

What does this PR do?

Why was it initiated? Any relevant Issues?

PR Checklist

megies commented Jul 10, 2020 • edited

MaevaP commented Jul 10, 2020

dsentinel commented Jul 10, 2020 • edited

MaevaP commented Jul 10, 2020

megies commented Jul 11, 2020

megies commented Jul 11, 2020

MaevaP commented Jul 13, 2020

megies commented Jul 14, 2020

megies commented Jul 15, 2020

MaevaP commented Jul 15, 2020

dsentinel left a comment

Choose a reason for hiding this comment

dsentinel Jul 15, 2020

Choose a reason for hiding this comment

dsentinel Jul 15, 2020

Choose a reason for hiding this comment

megies commented Jul 16, 2020

megies commented Jul 16, 2020

MaevaP commented Jul 16, 2020

megies commented Jul 17, 2020

MaevaP commented Jul 8, 2020 •

edited

megies commented Jul 10, 2020 •

edited

dsentinel commented Jul 10, 2020 •

edited