mseed: Avoid invalid warning when guessing endianness of timestamps / UTCDateTime: proper exceptions on invalid 'julday' #1988

Merged
merged 7 commits into from May 9, 2018
4 changes: 4 additions & 0 deletions CHANGELOG.txt
@@ -2,6 +2,8 @@
- General:
* Tests pass with numpy 1.14 (see #2044).
- obspy.core:
+ * UTCDateTime now raises a meaningful exceptions when passing
+ invalid or out-of-bounds 'julday' during initialization (see #1988)
* Fix pickling of traces with a sampling rate of 0 (see #1990)
* read_inventory() used with non-existing file path (e.g. typo in filename)
now shows a proper "No such file or directory" error message (see #2062)
@@ -28,6 +30,8 @@
(see #1981, #2057).
* Fix util.get_start_and_end_time returning sample rate = 0 when
sample rate = 1 (see #2069)
+ * Avoid showing invalid warnings when guessing endian during parsing
+ timestamps (see #1988)
- obspy.io.nordic
* Bug-fix for amplitudes without magnitude_hint (see #2021)
* Bug-fix for wavefiles with full path stripping (see #2021)
2 changes: 1 addition & 1 deletion obspy/core/tests/test_utcdatetime.py
@@ -463,7 +463,7 @@ def test_invalid_dates(self):
self.assertRaises(ValueError, UTCDateTime, 2010, 9, 31)
self.assertRaises(ValueError, UTCDateTime, '2010-09-31')
# invalid julday
- self.assertRaises(TypeError, UTCDateTime, year=2010, julday=999)
+ self.assertRaises(ValueError, UTCDateTime, year=2010, julday=999)
# testing some strange patterns
self.assertRaises(TypeError, UTCDateTime, "ABC")
self.assertRaises(TypeError, UTCDateTime, "12X3T")
9 changes: 9 additions & 0 deletions obspy/core/utcdatetime.py
@@ -339,6 +339,15 @@ def __init__(self, *args, **kwargs):
return
# check for ordinal/julian date kwargs
if 'julday' in kwargs:
+ try:
+ int(kwargs['julday'])
+ except (ValueError, TypeError):
+ msg = "Failed to convert 'julday' to int: {!s}".format(
+ kwargs['julday'])
+ raise TypeError(msg)
+ if not (1 <= int(kwargs['julday']) <= 366):
+ msg = "'julday' out of bounds: {!s}".format(kwargs['julday'])
+ raise ValueError(msg)
if 'year' in kwargs:
# year given as kwargs
year = kwargs['year']
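
The two checks added in this hunk can be tried out in isolation. The following is a minimal sketch, not ObsPy code: `check_julday` is a hypothetical helper that mirrors the logic shown above, raising `TypeError` when the value cannot be converted to an integer and `ValueError` when it falls outside 1..366. Like the diff, the range check accepts 366 regardless of whether the year is a leap year.

```python
def check_julday(julday):
    """Hypothetical helper mirroring the 'julday' checks in the hunk above."""
    try:
        value = int(julday)
    except (ValueError, TypeError):
        # Not convertible to an integer at all: report as a type problem.
        raise TypeError(
            "Failed to convert 'julday' to int: {!s}".format(julday))
    if not (1 <= value <= 366):
        # Convertible, but not a valid day of the year.
        raise ValueError("'julday' out of bounds: {!s}".format(julday))
    return value


for candidate in (10, "366", 999, "abc", None):
    try:
        print(candidate, "->", check_julday(candidate))
    except (TypeError, ValueError) as exc:
        print(candidate, "->", type(exc).__name__, exc)
```

With this split, an out-of-range day such as 999 is reported as a `ValueError`, which matches the updated assertion in `test_invalid_dates`.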
4 changes: 4 additions & 0 deletions obspy/io/mseed/__init__.py
@@ -160,6 +160,10 @@ class InternalMSEEDError(ObsPyMSEEDError):
pass


+ class InternalMSEEDParseTimeError(InternalMSEEDError):
+ pass


class InternalMSEEDWarning(UserWarning):
pass

22 changes: 16 additions & 6 deletions obspy/io/mseed/util.py
@@ -20,6 +20,7 @@
from obspy import UTCDateTime
from obspy.core.compatibility import from_buffer
from obspy.core.util.decorator import ObsPyDeprecationWarning
+ from . import InternalMSEEDParseTimeError
from .headers import (ENCODINGS, ENDIAN, FIXED_HEADER_ACTIVITY_FLAGS,
FIXED_HEADER_DATA_QUAL_FLAGS,
FIXED_HEADER_IO_CLOCK_FLAGS, HPTMODULUS,
@@ -641,6 +642,10 @@ def fmt(s):
return native_str('%sHHBBBxHHhhBBBxlxxH' % s)

def _parse_time(values):
+ if not (1 <= values[1] <= 366):
+ msg = 'julday out of bounds (wrong endian?): {!s}'.format(
+ values[1])
+ raise InternalMSEEDParseTimeError(msg)
# The spec says values[5] (.0001 seconds) must be between 0-9999 but
# we've encountered files which have a value of 10000. We interpret
# this as an additional second. The approach here is general enough
@@ -653,25 +658,30 @@ def _parse_time(values):
"the maximum strictly allowed value is 9999. It will be "
"interpreted as one or more additional seconds." % values[5],
category=UserWarning)
- return UTCDateTime(
- year=values[0], julday=values[1],
- hour=values[2], minute=values[3], second=values[4],
- microsecond=msec % 1000000) + offset
+ try:
+ t = UTCDateTime(
+ year=values[0], julday=values[1],
+ hour=values[2], minute=values[3], second=values[4],
+ microsecond=msec % 1000000) + offset
+ except TypeError:
+ msg = 'Problem decoding time (wrong endian?)'
+ raise InternalMSEEDParseTimeError(msg)
Member

Maybe pass on the message of the exception which would ease debugging and improve understanding of whatever problem is encountered?

Member Author

I don't think the message would be very helpful, at least not in the case that is normally encountered (trying the wrong byte order):

$ python -c 'from obspy import UTCDateTime; UTCDateTime(year=444444, julday=10)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/megies/git/obspy/obspy/core/utcdatetime.py", line 377, in __init__
    dt = datetime.datetime(*args, **kwargs)
TypeError: Required argument 'month' (pos 2) not found

I'm not sure it would help to see that message to be honest.

Member Author

In any case this PR is not making things worse. Before it there was a bare try/except, now at least I'm only catching an exception that is explicitly raised inside the subroutine that parses the timestamp.
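
For readers weighing the reviewer's suggestion, here is one way the original message could be passed on without losing the dedicated exception type. This is a sketch under assumptions, not what the PR does: `ParseTimeError` and `build_time` are hypothetical names, plain `datetime.datetime` stands in for `UTCDateTime`, and the `raise ... from` syntax requires Python 3.

```python
import datetime


class ParseTimeError(Exception):
    """Hypothetical stand-in for InternalMSEEDParseTimeError."""


def build_time(year, month, day):
    """Build a datetime, re-raising failures as ParseTimeError."""
    try:
        return datetime.datetime(year, month, day)
    except (TypeError, ValueError) as exc:
        # Append the underlying message and keep the original exception
        # attached as __cause__ so the full traceback stays available.
        msg = "Problem decoding time (wrong endian?): {!s}".format(exc)
        raise ParseTimeError(msg) from exc


try:
    build_time(444444, 1, 1)
except ParseTimeError as err:
    print(err)
    print(type(err.__cause__).__name__)
```

Whether the extra text helps is the judgment call debated in this thread; chaining at least keeps the underlying error reachable for debugging.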

+ return t

if endian is None:
try:
endian = ">"
values = unpack(fmt(endian), data)
starttime = _parse_time(values)
- except Exception:
+ except InternalMSEEDParseTimeError:
endian = "<"
values = unpack(fmt(endian), data)
starttime = _parse_time(values)
else:
values = unpack(fmt(endian), data)
try:
starttime = _parse_time(values)
- except Exception:
+ except InternalMSEEDParseTimeError:
msg = ("Invalid starttime found. The passed byte order is likely "
"wrong.")
raise ValueError(msg)
Member

Again, pass on the exception message.
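
The control flow of this hunk, where only the dedicated parse error triggers the fallback to the other byte order, can be sketched in isolation. This is a simplified illustration rather than ObsPy's implementation: `ParseTimeError`, `parse_year_julday`, and `guess_endian` are hypothetical names, and only the year and day-of-year fields are unpacked instead of the full fixed header.

```python
import struct


class ParseTimeError(Exception):
    """Hypothetical stand-in for InternalMSEEDParseTimeError."""


def parse_year_julday(data, endian):
    """Unpack two unsigned 16-bit integers and sanity-check the day."""
    year, julday = struct.unpack(endian + "HH", data)
    if not (1 <= julday <= 366):
        raise ParseTimeError(
            "julday out of bounds (wrong endian?): {!s}".format(julday))
    return year, julday


def guess_endian(data):
    """Try big endian first and fall back to little endian."""
    try:
        return ">", parse_year_julday(data, ">")
    except ParseTimeError:
        # Only the explicitly raised parse error triggers the fallback;
        # any other exception propagates instead of being swallowed.
        return "<", parse_year_julday(data, "<")


little = struct.pack("<HH", 2011, 249)
print(guess_endian(little))
```

A plausibility check like this is a heuristic: a value whose two bytes are identical (for example 257) passes the range check in either byte order.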

10 changes: 2 additions & 8 deletions obspy/io/seisan/tests/test_core.py
@@ -8,7 +8,6 @@

import os
import unittest
- import warnings

import numpy as np

@@ -195,13 +194,8 @@ def test_read_obspy(self):
# 1 - little endian, 32 bit, version 7
st1 = read(os.path.join(self.path,
'2011-09-06-1311-36S.A1032_001BH_Z'))
- # raises "UserWarning: Record contains a fractional seconds" - ignore
- with warnings.catch_warnings(record=True) as w:
- warnings.simplefilter('always', UserWarning)
- st2 = read(os.path.join(self.path,
- '2011-09-06-1311-36S.A1032_001BH_Z.mseed'))
- self.assertEqual(len(w), 1)
- self.assertEqual(w[0].category, UserWarning)
Member

Why is this no longer necessary?

Member Author

Reading this file used to trigger the invalid warning that I got rid of in mseed reading (i.e. the warning was emitted while trying the wrong endianness on the first attempt).

The only thing that changes in here is that the warning-catching was removed, nothing else.

+ st2 = read(os.path.join(self.path,
+ '2011-09-06-1311-36S.A1032_001BH_Z.mseed'))
self.assertEqual(len(st1), len(st2))
self.assertTrue(np.allclose(st1[0].data, st2[0].data))
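
The updated test simply drops the warning-catching. If one wanted to assert explicitly that a call no longer emits warnings, a pattern along the following lines could be used. This is a generic sketch with hypothetical functions (`quiet_reader`, `noisy_reader`, `call_and_collect_warnings`) and is not part of the PR.

```python
import warnings


def quiet_reader():
    """Hypothetical reader that emits no warning."""
    return [1, 2, 3]


def noisy_reader():
    """Hypothetical reader that still emits a UserWarning."""
    warnings.warn("Record contains a fractional seconds", UserWarning)
    return [1, 2, 3]


def call_and_collect_warnings(func):
    """Call func and return its result plus all warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no warning is suppressed by an earlier filter.
        warnings.simplefilter("always")
        result = func()
    return result, caught


result, caught = call_and_collect_warnings(quiet_reader)
assert len(caught) == 0
```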
