zgy_writer #117

ksurf1 · 2023-05-22T23:57:55Z

Hi, great job on the segysak package !

It might not be the best venue to write this since it is not a real issue with the existing code.

I would like to take advantage of the ZGY compression options and write 16-bit ZGY files.
Having worked in the past with ZGY files on the development side, I know there are variables such as bias and factor that must be set in the metadata in order to write data as int16 or int8 (float32 = int16*factor+bias). There is also a "brick" size, [64,64,64] by default, used to compress the data: if all the data values in one brick are identical, only one value is written to disk instead of 64^3. This reduces the file size considerably when a large portion of the data is 0.0, for instance.
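To make the factor/bias relation concrete, here is a minimal sketch in plain Python. It assumes that the two ends of a (min, max) data range map onto the lowest and highest int16 codes; the function names are hypothetical and are not part of segysak or openzgy.

```python
# Sketch only: assumes the data range endpoints map onto the int16 limits.
INT16_MIN, INT16_MAX = -2**15, 2**15 - 1


def scale_from_datarange(datarange, int_min=INT16_MIN, int_max=INT16_MAX):
    """Return (factor, bias) such that float_value = int_code * factor + bias."""
    lo, hi = datarange
    factor = (hi - lo) / (int_max - int_min)
    bias = lo - int_min * factor
    return factor, bias


def quantize(value, factor, bias, int_min=INT16_MIN, int_max=INT16_MAX):
    """Convert a float sample to the nearest integer code, clipped to the storage range."""
    code = round((value - bias) / factor)
    return max(int_min, min(int_max, code))


def dequantize(code, factor, bias):
    """Recover the float sample from an integer code."""
    return code * factor + bias


factor, bias = scale_from_datarange((-1500.0, 1500.0))
```

With this mapping the round-trip error of any in-range sample is at most half of factor, and samples outside the range are clipped to the nearest limit.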

Right now your code writes data as uncompressed float32, if I am not mistaken.

Can you give me some pointers on modifying your code to achieve the best possible compression?

Much obliged.

Would this work in your code ?

def zgy_writer(seisnc_dataset, filename, datatype="float", datarange=(-2**15, 2**15 - 1), dimension=None):
    """Write a seisnc dataset to ZGY file format.

    Args:
        seisnc_dataset (xr.Dataset): This should be a seisnc dataset.
        filename (pathlib.Path/str): The filename to write to.
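        datatype (str): Sample storage type: "float", "int16" or "int8".
        datarange (tuple of float): (min, max) sample values for integer
            storage, e.g. (-1500.0, 1500.0). Replaced by (-inf, inf) when
            datatype is "float".
        dimension (str, optional): The dimension to write out, if
            None uses available dimensions twt first.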
    """
    assert seisnc_dataset.seis.is_3d()

    seisnc_dataset.seis.calc_corner_points()
    if seisnc_dataset.corner_points_xy:
        corners = tuple(seisnc_dataset.corner_points_xy[i] for i in [0, 3, 1, 2])
    else:
        corners=[(0, 0), (0, 0), (0, 0), (0, 0)]

    dimension = _check_dimension(seisnc_dataset, dimension)
    if dimension == "twt":
        zunitdim = UnitDimension(2001)
    elif dimension == "depth":
        zunitdim = UnitDimension(2002)
    else:
        zunitdim = UnitDimension(2000)
    try:
        coord_scalar_mult = seisnc_dataset.attrs["coord_scalar_mult"]
    except KeyError:
        coord_scalar_mult = 1.0

    # dimensions
    ni, nj, nk = (
        seisnc_dataset.dims[CoordKeyField.iline],
        seisnc_dataset.dims[CoordKeyField.xline],
        seisnc_dataset.dims[dimension],
    )

    # vertical
    z0 = int(seisnc_dataset[dimension].values[0])
    dz = int(seisnc_dataset.sample_rate)

    # annotation
    il0 = seisnc_dataset[CoordKeyField.iline].values[0]
    xl0 = seisnc_dataset[CoordKeyField.xline].values[0]
    dil = int(np.median(np.diff(seisnc_dataset[CoordKeyField.iline].values)))
    dxl = int(np.median(np.diff(seisnc_dataset[CoordKeyField.xline].values)))

    dtype_float = SampleDataType[datatype]

    # For float storage the supplied datarange is replaced by an unbounded range.
    if datatype == "float":
        datarange = (-float("inf"), float("inf"))

    with ZgyWriter(
        str(filename),
        size=(ni, nj, nk),
        datatype=dtype_float,
        datarange=datarange,
        zunitdim=zunitdim,
        zstart=z0,
        zinc=dz,
        annotstart=(il0, xl0),
        annotinc=(dil, dxl),
        corners=corners,
    ) as writer:

        order = (CoordKeyField.iline, CoordKeyField.xline, dimension)

        writer.write(
            (0, 0, 0),
            seisnc_dataset[VariableKeyField.data]
            .transpose(*order)
            .values.astype(np.float32),
        )


ksurf1 · 2023-05-23T23:46:49Z

This code works. I would like to be able to set a compressor and lodcompressor.

Does anyone know the syntax for passing the compressor arguments to the Writer?

trhallam · 2023-05-25T08:32:15Z

Thanks for your feedback.

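Ideally this code needs to be migrated to use pyzgy on the backend and to work as a natural xarray backend <https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html>, but that is low priority because the current focus is SEG-Y.

You can see the openzgy code in the pyzgy library, which is a fork of the code from the OSDU implementation <https://community.opengroup.org/osdu/platform/domain-data-mgmt-services/seismic/open-zgy/-/tree/master/python/openzgy>. It also uses the Python API reference implementation, which does not natively support compression. Additional instructions must be followed on their website to install the ZFP compressor.

Otherwise:

- blocks are [64,64,64] by default.
- references to output type are given here <https://github.com/equinor/pyzgy/blob/36c8f4d640dd03510ca47569b8b36eaca0104334/openzgy/api.py#L138>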

To change the output datatype, in the segysak code you could add additional checks for the datatype to select the correct format to output to ZgyWriter.
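As a rough illustration of the kind of datatype check suggested above, the following sketch validates the requested type and picks a data range before anything is handed to the writer. The helper name resolve_sample_format and its defaults are assumptions made for this example; it is not part of segysak or openzgy, and it only mirrors the "float", "int16" and "int8" names used earlier in this thread.

```python
import math

# Integer storage limits for the non-float sample types named in this thread.
_INT_LIMITS = {"int8": (-2**7, 2**7 - 1), "int16": (-2**15, 2**15 - 1)}


def resolve_sample_format(datatype="float", datarange=None):
    """Hypothetical helper: validate datatype and return (datatype, datarange)."""
    if datatype == "float":
        # Float samples are stored as they are, so no finite range is needed.
        return datatype, (-math.inf, math.inf)
    if datatype not in _INT_LIMITS:
        raise ValueError(f"unsupported datatype: {datatype!r}")
    if datarange is None:
        # Fall back to the raw integer limits (factor 1, bias 0).
        datarange = _INT_LIMITS[datatype]
    lo, hi = datarange
    if not (math.isfinite(lo) and math.isfinite(hi) and lo < hi):
        raise ValueError("integer storage needs a finite datarange with min < max")
    return datatype, (float(lo), float(hi))
```

The returned name could then be used to look up the sample type enum, and the returned range passed as the datarange argument, in the same way as the modified zgy_writer above.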

ksurf1 · 2023-05-25T11:47:59Z

Hi Tony, Thanks for the info. I modified your code for the datatype and datarange and it works. I was wondering how to go even further, but it looks like compression is only available for the float datatype. Anyway, I shared my modification on GitHub. Note: I am using Octave and multithreading to read SEG-Y files, with netCDF as storage. Writing goes much faster when the headers are first cast to uint32 with the bytes swapped correctly. Thanks again. Good work.


da-wad · 2023-05-28T06:35:00Z

@ksurf1 depending on your aims here, you may find that converting your SEG-Y files to either the fully open-source seismic-zfp format or Bluware's nearly-open compressed VDS format helps you further with disk space. Either of these could be read through segysak (writing would take effort), since segyio-like reading syntax is available for both (pyvds for VDS). However, if more than half of the volume is a constant value, you would gain more from ZGY/VDS than from seismic-zfp: the latter favours regularity in the data over identifying constant-value bricks, so that brick offsets can be computed rather than looked up on disk.
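One way to judge which side of that trade-off a volume falls on is to count how many bricks hold a single value. The sketch below does this in plain Python on a nested list indexed [i][j][k]. The function name is hypothetical, and partial bricks at the volume edges are simply clipped, which may differ from how a real ZGY or VDS writer pads them.

```python
from itertools import product


def constant_brick_fraction(volume, brick=(64, 64, 64)):
    """Return the fraction of bricks in which every sample has the same value."""
    ni, nj, nk = len(volume), len(volume[0]), len(volume[0][0])
    bi, bj, bk = brick
    total = 0
    constant = 0
    # Visit the origin of every brick; edge bricks are clipped to the volume.
    for i0, j0, k0 in product(range(0, ni, bi), range(0, nj, bj), range(0, nk, bk)):
        values = {
            volume[i][j][k]
            for i in range(i0, min(i0 + bi, ni))
            for j in range(j0, min(j0 + bj, nj))
            for k in range(k0, min(k0 + bk, nk))
        }
        total += 1
        if len(values) == 1:
            constant += 1
    return constant / total


# Toy 4x4x4 volume of zeros with two non-zero samples, split into 2x2x2 bricks.
toy = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
toy[0][0][0] = 1.0
toy[3][3][3] = 2.0
fraction = constant_brick_fraction(toy, brick=(2, 2, 2))
```

In the toy case six of the eight bricks are constant, so fraction is 0.75. For a real survey the same count would be faster with NumPy on the transposed data array.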

ksurf1 · 2023-05-28T11:31:20Z

Hi Tony, OpenVDS is a possibility, but I don't get requests for it yet. I am aware of the tools to create such files. SEG-Y is the go-to for archiving anyway. ZGY is for Petrel users. Thanks for your time. How do I follow future developments?


ksurf1 · 2023-05-30T13:44:15Z

Hi Tony, I have a SEG-Y file with a group scalar of -100 in byte 71. This means the XYs must be divided by 100 to get a float from uint32. When I check the corners, they are all NaN. I tried the defaults for cdpx and cdpy, with the same result. I know my SEG-Y file is correct and the byte positions are correct. Is there a solution? Python code:

from segysak.segy import segy_loader, well_known_byte_locs, segy_writer

fn = "path2segy.segy"
ds = segy_loader(fn, iline=189, xline=193, cdpx=73, cdpy=77, offset=None)

ds.seis.calc_corner_points()

corners = tuple(ds.corner_points_xy[i] for i in [0, 3, 1, 2])
print(corners)

((nan, nan), (nan, nan), (nan, nan), (nan, nan))
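For reference, the scalar convention described in this comment can be sketched as follows: in the SEG-Y trace header a negative coordinate scalar is a divisor and a positive one is a multiplier. The function name is hypothetical. This only illustrates the arithmetic and does not reproduce or diagnose the NaN corners.

```python
def apply_coord_scalar(raw, scalar):
    """Scale a raw integer header coordinate using the SEG-Y coordinate scalar."""
    if scalar < 0:
        # Negative scalar: divide by its absolute value.
        return raw / abs(scalar)
    if scalar > 0:
        # Positive scalar: multiply.
        return float(raw * scalar)
    # A scalar of 0 is treated here as "no scaling".
    return float(raw)
```

For example, a raw cdpx of 43512345 with a scalar of -100 gives 435123.45.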


ksurf1 · 2023-05-30T13:58:36Z

The ds variable seems OK in terms of il/xl dimensions, but what about the coordinates?

Type: Dataset
String form:
<xarray.Dataset>
Dimensions: (iline: 1234, xline: 1875, twt: 1001)
Coordinates:
  * iline (il <...> None
percentiles: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
coord_scalar: -100.0


trhallam · 2023-05-30T16:03:27Z

I've moved your second issue to #119. Please put separate questions in separate issues.

trhallam mentioned this issue May 30, 2023

Get corners not working for segy_loader #119

Closed

trhallam closed this as completed May 30, 2023
