MSEED: Segfault reading truncated file #1728
While trying to work around a problem when reading truncated files (in the SDS client, while reading files that are currently being appended to by a different program), I came across a segfault when reading truncated MiniSEED files:
    import copy
    from io import BytesIO

    from obspy import read
    from obspy.core.util import get_example_file

    file_ = get_example_file('BW.BGLD.__.EHE.D.2008.001.first_10_records')
    with open(file_, 'rb') as fh:
        data = fh.read()

    # for i in range(1, 1000):
    for i in :
        print(i)
        bio = BytesIO(copy.deepcopy(data[:-i]))
        read(bio, format='MSEED')
Seems to be crashing in our code, not libmseed:
    0x00007fffdda816aa in readMSEEDBuffer (mseed=0x18ffc10 "763445D BGLD EHEBW", <incomplete sequence \330>, buflen=4863, selections=0x0, unpack_data=1 '\001', reclen=-1, verbose=0 '\000', details=0 '\000', header_byteorder=-1, allocData=0x7ffff7fae048, diag_print=0x7ffff7fae080, log_print=0x7ffff7fae0b8) at obspy/io/mseed/src/obspy-readbuffer.c:472
    472 if ((unpack_data != 0) && (msr->fsdh->data_offset >= 48) &&
This branch contains a fix: https://github.com/obspy/obspy/tree/mseed-fix-segfault-truncated-file
Not sure why I cannot convert this issue to a PR right now but I'll try again later tonight or tomorrow. Or maybe somebody else can try?
Some other types of record corruption were already caught by libmseed and correctly bubble up to the Python warnings. I'm not entirely sure why this one does not, but maybe it's just because it's truncated fairly late in the file?
In any case, it now works as expected and raises a nice warning (but still reads all previous records).
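For callers that cannot pick up the fix yet, a similar outcome (read only the complete records) might be approximated on the Python side by dropping a partial trailing record before handing the bytes to `read()`. This is only a minimal sketch: `trim_to_whole_records` is a hypothetical helper, not part of ObsPy, and it assumes a fixed, known record length (512 bytes is assumed here for the example file).

```python
def trim_to_whole_records(data: bytes, reclen: int = 512) -> bytes:
    """Drop any partial record at the end of a MiniSEED byte string.

    Hypothetical helper, not part of ObsPy. Assumes every record in the
    buffer has the same, known length ``reclen``.
    """
    if reclen <= 0:
        raise ValueError("reclen must be positive")
    # Number of bytes that belong to complete records.
    usable = len(data) - (len(data) % reclen)
    return data[:usable]
```

The trimmed bytes could then be wrapped in a `BytesIO` and passed to `read(..., format='MSEED')` as in the snippet at the top of this issue. Files with mixed record lengths would need a different approach.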
Thanks for the fix @krischer. Checking again, there are still some truncation scenarios that end in segfaults, though.
Can you maybe have a look at these two byte offsets:
These seem to be different issues. The latter one I've seen in real life when reading MiniSEED files that are concurrently being appended to in other threads (checking data latency).
    import copy
    from io import BytesIO

    from obspy import read
    from obspy.core.util import get_example_file

    file_ = get_example_file('BW.BGLD.__.EHE.D.2008.001.first_10_records')
    with open(file_, 'rb') as fh:
        data = fh.read()

    for i in range(1, 10000):
        # this seems to be a different issue than the already covered one:
        if i == 256:
            continue
        # these seem to be the same issue as with 256, as they're just offset
        # by 512 bytes..
        if i % 512 == 256:
            continue
        # this is finally the issue I was looking for: :-)
        if i == 5066:
            continue
        print(i)
        bio = BytesIO(copy.deepcopy(data[:-i]))
        read(bio, format='MSEED')
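Since a segfault takes the whole interpreter down, a loop like this stops at the first bad offset and known offenders have to be skipped by hand. One way to survey all offsets in a single run might be to execute each attempt in a child process and check whether it was killed by a signal. A rough sketch, assuming a POSIX system (where `subprocess` reports death by signal as a negative return code); `crashes_interpreter` is a made-up helper name:

```python
import subprocess
import sys


def crashes_interpreter(code: str, timeout: float = 60.0) -> bool:
    """Run ``code`` in a fresh Python process and report a hard crash.

    Hypothetical helper. On POSIX, a negative return code means the child
    was terminated by a signal (e.g. -11 for SIGSEGV).
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return result.returncode < 0
```

The `code` string would contain the body of the loop above for a single value of `i`, so every offset from 1 to 9999 can be probed and the crashing ones collected in a list instead of being hard-coded as `continue` cases.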