segy: struct.error: 'h' format requires -32768 <= number <= 32767 #1393

josephjamesfarrugia · 2016-05-04T16:33:18Z

ObsPy version 1.0.1, Python version 2.7.6, OS X.
ObsPy was installed via pip, and Python from its website.

Essentially, I'd like to use ObsPy to merge all the traces in a Stream object and then write the merged stream to a .sgy file. Writing the merged stream to a .txt file works, but writing the .sgy file fails.

Below is a copy of my code:

#!/usr/bin/env python

# IMPORT RELEVANT SCRIPT PACKAGES
import os  # Operating system dependent functionality
from obspy.io.segy.core import _read_segy
import sys

filename = ['200', '201']  # File name

for i in range(0, len(filename), 1):
    original_segy = os.path.join('/Users/josephfarrugia/Dropbox/Masters_Work/Ontario Site Response Field Campaign/' + filename[i] + '.sgy')
    st = _read_segy(original_segy)

    merged_st = {}
    for x in range(1, 13, 1):  # x = 1..12 maps to stream indices 12..23 (channels 13 to 24)
        print('Writing Channel %d to .txt File' % (x + 12))  # Indicating what channel is being merged/written to .txt
        merged_st[x] = st[((x - 1) + 12):len(st):24].merge(method=1, fill_value=None, interpolation_samples=2)  # See:
        # https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.merge.html for more information
        # x = 1 selects index 12 (the 13th channel); x = 12 selects index 23 (the 24th channel)
        merged_st[x].write('%s_channel_%d.txt' % (filename[i], x), format='TSPAIR')  # Write the new .txt file

        print('Writing Channel %d to .sgy File' % (x + 12))  # Indicating what channel is being merged/written to .sgy
        merged_st[x].write('%s_channel_%d.sgy' % (filename[i], x), format="SEGY")  # Write the new .sgy file

I receive the following error:

Traceback (most recent call last):
  File "/Users/josephfarrugia/Dropbox/Masters_Work/Python/segyconcat.py", line 32, in <module>
    merged_st[x].write('%s_channel_%d.sgy' % (filename[i], x), format="SEGY")  # Write the new .sgy file
  File "/usr/local/lib/python2.7/site-packages/obspy/core/stream.py", line 1444, in write
    write_format(self, filename, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/obspy/io/segy/core.py", line 435, in _write_segy
    segy_file.write(filename, data_encoding=data_encoding, endian=byteorder)
  File "/usr/local/lib/python2.7/site-packages/obspy/io/segy/segy.py", line 232, in write
    self._write(file, data_encoding=data_encoding, endian=endian)
  File "/usr/local/lib/python2.7/site-packages/obspy/io/segy/segy.py", line 272, in _write
    self.binary_file_header.write(file, endian=endian)
  File "/usr/local/lib/python2.7/site-packages/obspy/io/segy/segy.py", line 415, in write
    file.write(pack(format, getattr(self, name)))
struct.error: 'h' format requires -32768 <= number <= 32767

I'm fairly new to Python, but have experience with Matlab. From my research, I don't think this is a Python error. Hoping someone knows how to rectify the problem!

You can download the corresponding files to run with my script here to replicate the error: https://www.dropbox.com/s/7m5wbb1hkmmhok8/200.sgy?dl=0
https://www.dropbox.com/s/sdttwiurxhjwgcg/201.sgy?dl=0


claudiodsf · 2016-05-04T17:06:27Z

Hi, here's a more pythonic way of writing your code, which does not resolve the issue, but it's a bit clearer for others, I hope.

A few notes:

You should use the generic ObsPy function read() instead of _read_segy().
It looks like there's no need to put your merged stream into a dictionary (merged_st = {}).
You need to copy the original stream before manipulating it: merged_st = st[((x - 1) + 12)::24].copy()

#!/usr/bin/env python
from obspy import read

filename = ['200', '201']

for fname in filename:
    st = read(fname + '.sgy')

    for x in range(1, 13, 1):
        print('Writing Channel %d to .txt File' % (x + 12))
        merged_st = st[((x - 1) + 12)::24].copy()
        merged_st.merge(method=1, fill_value=None, interpolation_samples=2)
        merged_st.write('%s_channel_%d.txt' % (fname, x), format='TSPAIR')

        print('Writing Channel %d to .sgy File' % (x + 12))
        merged_st.write('%s_channel_%d.sgy' % (fname, x), format="SEGY")

As I said before, this code fails with the same error. I guess it's related to some invalid value in your stream...

josephjamesfarrugia · 2016-05-04T17:11:36Z

Thanks!

Joseph Farrugia
M.Sc. Candidate, Geophysics
Engineering Seismology
Department of Earth Sciences (BGS 1033)
Western University

bsmithyman · 2016-05-04T22:44:51Z

I suspect it's the sample rate setting; for SEG-Y it is stored as a signed short int in microseconds, so it has to be between 1 us and 32.767 ms. This is commonly a problem when converting to or from data that don't fit the assumptions in SEG-Y. You can divide the dt header by 1000, in which case anything in Hz will be kHz in your SEG-Y workflow.
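
To make that range concrete, here is a minimal sketch using only Python's standard struct module. It is not ObsPy's writer code: it simply assumes the interval is packed as a big-endian signed 16-bit integer in microseconds, as described above, and the helper name pack_sample_interval_us is made up for illustration.

```python
import struct


def pack_sample_interval_us(sampling_rate_hz):
    """Hypothetical helper: pack the sample interval (in microseconds) as a signed short."""
    interval_us = int(round(1e6 / sampling_rate_hz))
    # struct.pack raises struct.error if interval_us is outside -32768..32767
    return interval_us, struct.pack('>h', interval_us)


# 500 Hz -> 2000 us, which fits in a signed short
interval_500hz, packed = pack_sample_interval_us(500.0)

# 20 Hz -> 50000 us, which does not fit
try:
    pack_sample_interval_us(20.0)
    fits_20hz = True
except struct.error:
    fits_20hz = False
```

Under that assumption, a 500 Hz stream is well inside the limit, while anything sampled more slowly than roughly 30.5 Hz would overflow the field.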

josephjamesfarrugia · 2016-05-05T00:21:19Z

I'll give that a shot! Thanks Brendan!

Joseph Farrugia
M.Sc. Candidate
Geophysics and Seismology
Department of Earth Sciences
Western University

josephjamesfarrugia · 2016-05-05T03:54:21Z

Hey Brendan,

I did as you suggested, and set the dt header to dt/1000 (2000/1000). I still get the same error!

#!/usr/bin/env python
from obspy import read

filename = ['201']

for fname in filename:
    st = read(fname + '.sgy')

    for x in range(1, 13, 1):
        print('Writing Channel %d to .txt File' % (x + 12))
        merged_st = st[((x - 1) + 12)::24].copy()
        merged_st.merge(method=1, fill_value=None, interpolation_samples=2)
        merged_st[0].stats.segy.trace_header['sample_interval_in_ms_for_this_trace'] = 2
        merged_st.write('%s_channel_%d_copy.txt' % (fname, x), format='TSPAIR')

        print('Writing Channel %d to .sgy File' % (x + 12))
        merged_st.write('%s_channel_%d_copy.sgy' % (fname, x), format="SEGY")

Here's all the header information from the traces:

AttribDict({'receiver_group_elevation': 0, 'ensemble_number': 0, 'unassigned': '\x00\x00\x00\x00\x00\x00\x00\x00', 'sweep_length_in_ms': 0, 'data_use': 0, 'original_field_record_number': 202, 'year_data_recorded': 16, 'datum_elevation_at_receiver_group': 0, 'day_of_year': 119, 'hour_of_day': 12, 'sample_interval_in_ms_for_this_trace': 2000, 'number_of_samples_in_this_trace': 16000, 'taper_type': 0, 'x_coordinate_of_ensemble_position_of_this_trace': 0, 'gap_size': 0, 'geophone_group_number_of_trace_number_one': 0, 'low_cut_slope': 0, 'coordinate_units': 1, 'group_coordinate_y': 0, 'group_coordinate_x': 23, 'source_measurement_exponent': 0, 'instrument_early_or_initial_gain': 0, 'for_3d_poststack_data_this_field_is_for_in_line_number': 0, 'source_energy_direction_exponent': 0, 'distance_from_center_of_the_source_point_to_the_center_of_the_receiver_group': 0, 'trace_weighting_factor': 0, 'mute_time_end_time_in_ms': 0, 'trace_number_within_the_ensemble': 0, 'number_of_horizontally_stacked_traces_yielding_this_trace': 1, 'geophone_group_number_of_last_trace': 0, 'over_travel_associated_with_taper': 0, 'number_of_vertically_summed_traces_yielding_this_trace': 1, 'unpacked_header': None, 'scalar_to_be_applied_to_times': 0, 'scalar_to_be_applied_to_all_elevations_and_depths': 0, 'energy_source_point_number': 0, 'water_depth_at_source': 0, 'for_3d_poststack_data_this_field_is_for_cross_line_number': 0, 'notch_filter_frequency': 0, 'source_static_correction_in_ms': 0, 'instrument_gain_constant': 0, 'sweep_trace_taper_length_at_start_in_ms': 0, 'high_cut_frequency': 0, 'lag_time_B': 0, 'lag_time_A': 0, 'high_cut_slope': 0, 'minute_of_hour': 42, 'uphole_time_at_source_in_ms': 0, 'scalar_to_be_applied_to_all_coordinates': 1, 'shotpoint_number': 0, 'device_trace_identifier': 0, 'subweathering_velocity': 0, 'source_depth_below_surface': 0, 'trace_sequence_number_within_line': 0, 'sweep_trace_taper_length_at_end_in_ms': 0, 'delay_recording_time': -32000, 'weathering_velocity': 0, 
'source_coordinate_x': 16, 'source_coordinate_y': 0, 'source_type_orientation': 0, 'mute_time_start_time_in_ms': 0, 'sweep_frequency_at_end': 0, 'total_static_applied_in_ms': 0, 'time_basis_code': 1, 'group_static_correction_in_ms': 0, 'sweep_type': 0, 'surface_elevation_at_source': 0, 'alias_filter_frequency': 0, 'low_cut_frequency': 0, 'endian': u'>', 'trace_identification_code': 1, 'source_measurement_mantissa': 0, 'scalar_to_be_applied_to_the_shotpoint_number': 0, 'source_measurement_unit': 0, 'source_energy_direction_mantissa': 0, 'second_of_minute': 48, 'trace_sequence_number_within_segy_file': 0, 'transduction_constant_exponent': 0, 'alias_filter_slope': 0, 'sweep_frequency_at_start': 0, 'uphole_time_at_group_in_ms': 0, 'gain_type_of_field_instruments': 0, 'trace_value_measurement_unit': 0, 'trace_number_within_the_original_field_record': 24, 'transduction_units': 0, 'y_coordinate_of_ensemble_position_of_this_trace': 0, 'notch_filter_slope': 0, 'geophone_group_number_of_roll_switch_position_one': 0, 'correlated': 0, 'datum_elevation_at_source': 0, 'water_depth_at_group': 0, 'transduction_constant_mantissa': 0})

I think I changed the right attribute.
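
As a quick sanity check on that dump, the sketch below tests three of the 2-byte trace header fields against the signed short range. The values are copied from the dump above; the helper fits_signed_short is hypothetical and uses only the standard struct module rather than anything from ObsPy.

```python
import struct

# Values copied from the header dump above; each is a 2-byte field in a SEG-Y trace header.
dumped_fields = {
    'sample_interval_in_ms_for_this_trace': 2000,
    'number_of_samples_in_this_trace': 16000,
    'delay_recording_time': -32000,
}


def fits_signed_short(value):
    """Hypothetical helper: True if value can be packed as a signed 16-bit integer."""
    try:
        struct.pack('>h', value)
        return True
    except struct.error:
        return False


# Collect the names of any fields that would overflow a signed short.
too_large = [name for name, value in sorted(dumped_fields.items())
             if not fits_signed_short(value)]
print(too_large)
```

All three dumped values fit, so this dump on its own does not explain the error; the offending number may be one that only appears after merging.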

megies · 2016-05-05T09:54:37Z

@josephjamesfarrugia, download links to files are not working for me.

Also, my feeling is that this is the same problem as in #1385.

josephjamesfarrugia · 2016-05-05T14:30:23Z

@megies I've updated the links so they should work now!

josephjamesfarrugia · 2016-05-05T15:22:07Z

@megies @bsmithyman
Following what I could understand from the example in #1385 I attempted to trim the stream object before writing to SEGY. Again, I'm able to write the TXT file, but the write to SEGY crashes with the same error:

struct.error: 'h' format requires -32768 <= number <= 32767

Copy of the current code is below:
segyconcat.txt

josephjamesfarrugia · 2016-05-05T16:37:35Z

Update: I changed the last line of my code to:

merged_st.write('%s_channel_%d_copy.sgy' % (fname, x), format="SEGY", data_encoding=1, byteorder=sys.byteorder)

And now I receive the following error (same as #1385):

struct.error: short format requires SHRT_MIN <= number <= SHRT_MAX

josephjamesfarrugia · 2016-05-05T17:13:19Z

Also, the length of each trace is long...approximately 45 minutes. Might that be producing the error?

Update: I was ABLE to write a 30 second trace to SEGY. I'm thinking there must be a file size limit.

josephjamesfarrugia · 2016-05-05T17:47:18Z

Update: It's the number of samples.

1 Trace(s) in Stream:
Seq. No. in line:    0 | 2016-04-28T11:30:57.000000Z - 2016-04-28T11:31:57.998000Z | 500.0 Hz, 30500 samples
Writing Channel 13 to .txt File
Writing Channel 13 to .sgy File

If I set the record length to just a minute (roughly two traces), the number of samples (30500) stays below the maximum allowed by the header field (32767, per the original error: struct.error: 'h' format requires -32768 <= number <= 32767), and the write succeeds.

I gradually increased the record length until the number of samples exceeded 32767, and the write to SEGY failed as expected.
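
The arithmetic behind this can be sketched with the standard struct module. This assumes the sample count is packed as a signed short (which is what the error message implies) and uses the 500 Hz rate from the stream printout above; can_write_sample_count is a hypothetical helper, not an ObsPy function.

```python
import struct

SAMPLING_RATE_HZ = 500.0  # from the stream printout above


def can_write_sample_count(npts):
    """Hypothetical helper: True if npts fits in a signed 16-bit header field."""
    try:
        struct.pack('>h', npts)
        return True
    except struct.error:
        return False


one_minute = 30500                               # the trace that was written successfully
full_record = int(45 * 60 * SAMPLING_RATE_HZ)    # roughly 45 minutes of data
max_seconds = 32767 / SAMPLING_RATE_HZ           # longest trace that still fits, in seconds

print(can_write_sample_count(one_minute), can_write_sample_count(full_record))
```

At 500 Hz the limit works out to about 65.5 seconds per trace, so a 45-minute merged trace exceeds it by a wide margin.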

So I think it's a matter of resampling the stream object. Therefore, I'll close this issue. However, I get a new error when trying to write the resampled stream object.

Thanks everyone for your time and input.

krischer · 2016-05-06T09:11:58Z

Reopening as this definitely requires a better error message.

Also: We currently write the number of samples as a signed short (thus the range from -32768 to 32767). We could store it as an unsigned short which would double the effective range of allowed number of samples. The SEG-Y manual does not appear to specify whether to use signed or unsigned integers (it only says to use two's complement integers) so we might risk compatibility with other SEG-Y tools. Any SEG-Y experts around here that know the best way to handle this?
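
The compatibility risk can be illustrated with a small sketch using the standard struct module (the value 40000 is an arbitrary example, not taken from the files in this issue). A sample count written as an unsigned short comes back negative if another tool reads the same two bytes as a signed short.

```python
import struct

npts = 40000  # arbitrary example: more samples than a signed short can hold

# Packing as a signed short fails, as in the original error.
try:
    struct.pack('>h', npts)
    signed_ok = True
except struct.error:
    signed_ok = False

raw = struct.pack('>H', npts)              # unsigned short: range 0..65535
as_unsigned = struct.unpack('>H', raw)[0]  # a reader that expects unsigned sees 40000
as_signed = struct.unpack('>h', raw)[0]    # a reader that expects signed sees -25536
```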

bsmithyman · 2016-05-06T11:03:22Z

One of the clearer quick references I use is here, which is useful for the old formats. My mostly-compatible SEG-Y rev. 0 / rev. 1 library uses unsigned ints for dt and ns, now that I look at it. I based that off of the Seismic Unix headers and convention, specifically SU/src/su/include/[tapebhdr.h,tapehdr.h]. So, I think that would probably be my go-to open source reference; also, SU is BSD License, so it's a safe place to look w/o worrying about license violations. This is contrary to what I remembered off of the top of my head, so I guess my comments above should say 65535 us (though, of course, it turns out it was ns that was the issue in this case).

krischer · 2016-05-06T11:42:55Z

Thanks for the hint to look at the SU source code! I did not actually run any tests but from looking at the code I don't think SU can deal with unsigned values for the following reasons:

The SU/src/su/include/[tapesegy.h,tapebhdr.h] files define everything as unsigned, but when SU/src/su/lib/hdrpkge.c actually reads and writes the header its values are cast to and from signed (pointer) values.
Type definitions in Seismic Unix are for some reason available in lots of places but I think the authoritative location is SU/src/su/include/hdr.h which is used by the gethval() function which only results in signed header values.

I guess we should just check if Seismic Unix can deal with unsigned values. Signed and unsigned values are identical if one does not leave the positive range of the signed version.
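
That last point is easy to confirm with the standard struct module; the sketch below compares the big-endian byte representation of every value from 0 to 32767 under both format codes.

```python
import struct

# Every value in the non-negative signed range packs to the same two bytes either way.
same_bytes = all(
    struct.pack('>h', n) == struct.pack('>H', n)
    for n in range(0, 32768)
)
```

So the choice between signed and unsigned only matters for values above 32767.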

Cheers!

LKueperkoch · 2016-05-10T12:27:52Z

Everything looks fine in the code, but two things might be missing (that's what I additionally did to the data): trim all traces to get equal start and end times, fill gaps with zeros.

krischer · 2017-03-27T08:31:44Z

In a recent discussion on the mailing list (http://lists.swapbytes.de/archives/obspy-users/2017-March/002358.html) it was pointed out that two's complement numbers are by definition signed numbers and that the SEG-Y spec states that all integers are two's complement integers.

ObsPy thus does the correct (and maximally compatible) thing. Users who want unsigned values will have to monkey-patch ObsPy to get that behaviour.

We still need a better error message though (also for #1396).

ThomasLecocq · 2017-03-27T08:36:23Z

"struct.error: short format requires SHRT_MIN <= number <= SHRT_MAX"

"This error might occur because the traces within your stream are longer than SHRT_MAX, try to slice the stream in traces smaller than SHRT_MAX before saving to SEGY"
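
One way to follow that advice is sketched below in plain Python: compute index ranges that split a long trace into chunks of at most 32767 samples. The function chunk_bounds is hypothetical and not part of ObsPy; the 1350000-sample figure assumes about 45 minutes of data at 500 Hz, as reported earlier in this thread.

```python
MAX_SAMPLES = 32767  # largest sample count a signed 16-bit header field can hold


def chunk_bounds(npts, max_samples=MAX_SAMPLES):
    """Return (start, stop) index pairs splitting npts samples into chunks of at most max_samples."""
    return [(start, min(start + max_samples, npts))
            for start in range(0, npts, max_samples)]


bounds = chunk_bounds(1350000)  # about 45 minutes at 500 Hz
print(len(bounds), bounds[0], bounds[-1])
```

Each (start, stop) pair could then be used to cut the sample array of a merged trace before writing, with the start time of each piece adjusted accordingly; how best to do that with ObsPy's own slicing tools is left open here.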

megies · 2019-02-20T12:26:21Z

Was this fixed by #2196, same as #2194? Can we close this ticket?

megies · 2019-09-12T12:57:41Z

This has been worked on and improved in recent PRs. Closing.

megies added the .io.segy label May 5, 2016

josephjamesfarrugia mentioned this issue May 5, 2016

Strange error when writing merged streams to SEGY #1385

Closed

josephjamesfarrugia closed this as completed May 5, 2016

josephjamesfarrugia mentioned this issue May 5, 2016

Stream to SEGY: dtype of the data and the chosen data_encoding do not match #1396

Closed

krischer reopened this May 6, 2016

megies added this to the 1.1.1 milestone Mar 27, 2017

This was referenced Feb 13, 2018

Miniseed to SEG-Y numpy/numpy#10580

Closed

miniSEED to SEG-Y #2073

Closed

megies changed the title ~~struct.error: 'h' format requires -32768 <= number <= 32767~~ segy: struct.error: 'h' format requires -32768 <= number <= 32767 Feb 13, 2018

megies modified the milestones: 1.1.1, 1.2.1 Apr 19, 2018

ghost mentioned this issue Jul 4, 2018

problem writing segy file: maximum number of traces (too large number in header field) #2194

Closed

megies modified the milestones: 1.2.1, 1.2.0 Feb 20, 2019

megies added this to Waiting for Review in Release 1.2.0 Feb 20, 2019

megies mentioned this issue Mar 15, 2019

Segy nice error message when trying to write traces with too many samples #2358

Merged

9 tasks

megies moved this from Waiting for Review to Waiting on CI in Release 1.2.0 Mar 15, 2019

megies moved this from Waiting on CI to In Progress in Release 1.2.0 Mar 15, 2019

megies self-assigned this Mar 15, 2019

megies closed this as completed Sep 12, 2019

megies moved this from In Progress to Done in Release 1.2.0 Sep 12, 2019
