New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ndk bugfix #2016

Merged
merged 2 commits into from Apr 26, 2018

Conversation

Projects
None yet
3 participants
@jsaul
Contributor

jsaul commented Nov 23, 2017

Line 3 must be treated as fixed-width fields because otherwise parsing fails for a few ndk's where there is no space between the centroid time and its error, like e.g. in event C110397G.

To test this:

wget 'http://www.ldeo.columbia.edu/~gcmt/projects/CMT/catalog/jan76_dec13.ndk'
python -c 'import obspy; obspy.read_events("jan76_dec13.ndk")'

This produced several errors but now works.

@jsaul jsaul added .io.ndk bug labels Nov 23, 2017

@jsaul jsaul requested a review from krischer Nov 23, 2017

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Nov 23, 2017

Member

Can you please rebase on the maintenance_1.1.x branch and force push to get rid of all the extra commits?

Member

krischer commented Nov 23, 2017

Can you please rebase on the maintenance_1.1.x branch and force push to get rid of all the extra commits?

Fixed wrong assumption of space separated fields in line 3
Line 3 must be treated as fixed width fields because otherwise parsing fails for a few ndk's where there is no space between the centroid time and its error, like e.g. in event C110397G.

@megies megies added this to the 1.1.1 milestone Nov 23, 2017

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Nov 27, 2017

Member

Rebased + force pushed.

Member

krischer commented Nov 27, 2017

Rebased + force pushed.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Nov 27, 2017

Member

Can you fix the formatting issues and add a test case?

I remember encountering this before but somehow came to the conclusion that there was some problem with just hard coding the offset but I cannot recall why. Maybe I was mistaken.

Member

krischer commented Nov 27, 2017

Can you fix the formatting issues and add a test case?

I remember encountering this before but somehow came to the conclusion that there was some problem with just hard coding the offset but I cannot recall why. Maybe I was mistaken.

@krischer

This comment has been minimized.

Show comment
Hide comment
@krischer

krischer Nov 27, 2017

Member

So the offending events look like this:

PDE  1996/11/20 19:42:56.1  10.30  127.43  33.0 5.6 5.3 PHILIPPINE ISLANDS REGIO
B112096E         B: 35   58  45 S:  0    0   0 M:  0    0   0 CMT: 1 BOXHD: 22.0
CENTROID:      1.060.0  10.47 0.03  127.42 0.03  15.0  0.0 BDY  O-00000000000000
24 -4.983 0.133 -0.092 0.153  5.075 0.173 -0.433 0.443  0.715 0.469 -2.898 0.141
V10   6.430  4 246  -1.390  2 156  -5.050 86  45   5.740 338 41  -88 154 49  -92

The third lines merges the first and second values (being the time and the time error). This is problematic if the time error is larger than 9.9 which occurs in only 2 events in the GCMT catalogue. But in this case they could be interpreted as 1.0 and 60.0 or 1.06 and 0.0. I personally would favor the first interpretation which this PR implements but its not really defined and IMHO just a faulty file, especially as these are the only non-whitespace separated values in ndk files.

I'm fine with this PR once it has tests but would be happy to hear other opinions as well.

Member

krischer commented Nov 27, 2017

So the offending events look like this:

PDE  1996/11/20 19:42:56.1  10.30  127.43  33.0 5.6 5.3 PHILIPPINE ISLANDS REGIO
B112096E         B: 35   58  45 S:  0    0   0 M:  0    0   0 CMT: 1 BOXHD: 22.0
CENTROID:      1.060.0  10.47 0.03  127.42 0.03  15.0  0.0 BDY  O-00000000000000
24 -4.983 0.133 -0.092 0.153  5.075 0.173 -0.433 0.443  0.715 0.469 -2.898 0.141
V10   6.430  4 246  -1.390  2 156  -5.050 86  45   5.740 338 41  -88 154 49  -92

The third lines merges the first and second values (being the time and the time error). This is problematic if the time error is larger than 9.9 which occurs in only 2 events in the GCMT catalogue. But in this case they could be interpreted as 1.0 and 60.0 or 1.06 and 0.0. I personally would favor the first interpretation which this PR implements but its not really defined and IMHO just a faulty file, especially as these are the only non-whitespace separated values in ndk files.

I'm fine with this PR once it has tests but would be happy to hear other opinions as well.

@jsaul

This comment has been minimized.

Show comment
Hide comment
@jsaul

jsaul Nov 28, 2017

Contributor

If there are still any doubts left then please try

awk 'NR%5==3' jan76_dec13.ndk
awk 'NR%5==3 && substr($0,19,1)!=" "'  jan76_dec13.ndk

It is clear that in the example the error is 60.0. Such a large value is almost certainly an error in the original file. But do you want the parser to fail here? In the other example (C110397G) the error is 10.0 and while certainly very rare it is not a totally unrealistic value.

With the PR you are on the safe side.

Contributor

jsaul commented Nov 28, 2017

If there are still any doubts left then please try

awk 'NR%5==3' jan76_dec13.ndk
awk 'NR%5==3 && substr($0,19,1)!=" "'  jan76_dec13.ndk

It is clear that in the example the error is 60.0. Such a large value is almost certainly an error in the original file. But do you want the parser to fail here? In the other example (C110397G) the error is 10.0 and while certainly very rare it is not a totally unrealistic value.

With the PR you are on the safe side.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Nov 29, 2017

Member

Would be good to ask the GCMT guys about this?

Member

megies commented Nov 29, 2017

Would be good to ask the GCMT guys about this?

@jsaul

This comment has been minimized.

Show comment
Hide comment
@jsaul

jsaul Nov 30, 2017

Contributor

It's not a question about semantics. As I wrote above, the 60 s uncertainty is most probably a mistake and it would be good to correct it upstream. Not so sure about the 10 s in the other example, though. In any case the examples show that uncertainty values >= 10 s do occur. They must not cause the parser to fail. Just like there are many time strings in the NDK files with the seconds field being 60.0, which is rejected by the UTCDateTime constructor but which the ObsPy NDK parser fortunately does accept via a workaround.

Contributor

jsaul commented Nov 30, 2017

It's not a question about semantics. As I wrote above, the 60 s uncertainty is most probably a mistake and it would be good to correct it upstream. Not so sure about the 10 s in the other example, though. In any case the examples show that uncertainty values >= 10 s do occur. They must not cause the parser to fail. Just like there are many time strings in the NDK files with the seconds field being 60.0, which is rejected by the UTCDateTime constructor but which the ObsPy NDK parser fortunately does accept via a workaround.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Dec 4, 2017

Member

Ideally a regression test should be added with one or two of these offending examples, otherwise fine for me.

Member

megies commented Dec 4, 2017

Ideally a regression test should be added with one or two of these offending examples, otherwise fine for me.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Dec 4, 2017

Member

I think this should fix PEP8.

Member

megies commented Dec 4, 2017

I think this should fix PEP8.

@megies

This comment has been minimized.

Show comment
Hide comment
@megies

megies Apr 26, 2018

Member

Alright.. regression test still lacking but I'm gonna merge this now, otherwise it'll stay pending forever..

Member

megies commented Apr 26, 2018

Alright.. regression test still lacking but I'm gonna merge this now, otherwise it'll stay pending forever..

@megies megies merged commit 98429d2 into maintenance_1.1.x Apr 26, 2018

3 of 4 checks passed

docker-testbot docker testbot results not available yet
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@megies megies deleted the ndk-bugfix branch Apr 26, 2018

@krischer krischer referenced this pull request May 2, 2018

Closed

Fixing maintenance branch #2130

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment