Cloud native, scalable storage engine for various types of energy data.

TGSAI/mdio-python

MDIO

Read the documentation at https://mdio-python.readthedocs.io/

MDIO is a library for working with large multidimensional energy datasets. The primary motivation behind MDIO is to represent multidimensional time series data in a format that makes it easier to use in resource assessment, machine learning, and data processing workflows.

See the documentation for more information.

This is not an official TGS product.

Features

Shared Features

  • Abstractions for common energy data types (see below).
  • Cloud native chunked storage based on Zarr and fsspec.
  • Lossy and lossless data compression using Blosc and ZFP.
  • Distributed reads and writes using Dask.
  • Powerful command-line interface (CLI) based on Click.
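
To illustrate why chunked storage matters for multidimensional data, here is a minimal sketch in plain Python that counts how many chunks a given slice has to read from a regular chunk grid. It does not use MDIO or Zarr; the array shape, the chunk shape, and the helper names `chunk_grid` and `chunks_touched` are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import ceil


def chunk_grid(shape, chunks):
    """Number of chunks along each axis of a regular chunk grid."""
    return tuple(ceil(n / c) for n, c in zip(shape, chunks))


def chunks_touched(shape, chunks, selection):
    """Grid indices of every chunk that a tuple of unit-step slices intersects."""
    ranges = []
    for n, c, sel in zip(shape, chunks, selection):
        start, stop, step = sel.indices(n)
        if step != 1:
            raise ValueError("only unit-step slices are handled in this sketch")
        if stop <= start:
            return []  # empty selection touches no chunks
        ranges.append(range(start // c, (stop - 1) // c + 1))
    return list(product(*ranges))


# Hypothetical 3D volume (inline, crossline, sample) with cubic chunks.
shape = (512, 512, 1024)
chunks = (64, 64, 64)

grid = chunk_grid(shape, chunks)  # (8, 8, 16), i.e. 1024 chunks in total

# One inline section: a single index on axis 0, everything on the other axes.
inline_chunks = chunks_touched(shape, chunks, (slice(100, 101), slice(None), slice(None)))

# One time slice: everything on axes 0 and 1, a single index on axis 2.
time_chunks = chunks_touched(shape, chunks, (slice(None), slice(None), slice(500, 501)))

print(len(inline_chunks), len(time_chunks))  # 128 64
```

With this chunk shape, either access pattern reads only a fraction of the 1024 chunks, and each chunk can be fetched and decompressed independently, which is what makes parallel and cloud object store access practical.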

Domain Specific Features

  • Oil & Gas Data
    • Import and export 2D - 5D seismic data types stored in SEG-Y.
    • Import seismic interpretation (horizon) data. FUTURE
    • Optimized chunking logic for various seismic types. FUTURE
  • Wind Resource Assessment
    • Numerical weather prediction models with arbitrary metadata. FUTURE
    • Optimized chunking logic for time-series analysis and mapping. FUTURE
    • Xarray interface. FUTURE

The features marked as FUTURE will be open-sourced at a later date.

Installing MDIO

The simplest way to install MDIO is via pip from PyPI:

$ pip install multidimio

or install MDIO via conda from conda-forge:

$ conda install -c conda-forge multidimio

Extras must be installed separately in Conda environments.

For details, please see the installation instructions in the documentation.

Using MDIO

Please see the Command-line Usage for details.

For the Python API, please see the API Reference.

Requirements

Minimal

Chunked storage and parallelization: zarr, dask, numba, and psutil.
SEG-Y Parsing: segyio.
CLI and Progress Bars: click, click-params, and tqdm.

Optional

Distributed computing [distributed]: distributed and bokeh.
Cloud Object Store I/O [cloud]: s3fs, gcsfs, and adlfs.
Lossy Compression [lossy]: zfpy.
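
The bracketed names above are pip extras, so they can be requested with the usual pip syntax, for example `pip install "multidimio[cloud]"`. In a Conda environment, where extras have to be installed by hand, a small check like the following sketch can report which optional packages are still missing. The helper name `missing_optional_packages` is hypothetical and not part of MDIO; the package lists are copied from the list above, and the check assumes each package's import name matches its listed name.

```python
from importlib.util import find_spec

# Extra names and their packages, as listed under "Optional" above.
EXTRAS = {
    "distributed": ("distributed", "bokeh"),
    "cloud": ("s3fs", "gcsfs", "adlfs"),
    "lossy": ("zfpy",),
}


def _is_installed(name):
    """True if a top-level module with this name can be found for import."""
    return find_spec(name) is not None


def missing_optional_packages(is_installed=_is_installed):
    """Map each extra to the list of its packages that cannot be imported."""
    return {
        extra: [pkg for pkg in packages if not is_installed(pkg)]
        for extra, packages in EXTRAS.items()
    }
```

Calling `missing_optional_packages()` returns a dictionary keyed by extra name; an empty list means every package for that extra is importable in the current environment.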

Contributing to MDIO

Contributions are very welcome. To learn more, see the Contributor Guide.

Licensing

Distributed under the terms of the Apache 2.0 license, MDIO is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was established at TGS. The current maintainer is Altay Sansal, with the support of many more great colleagues.

This project template is based on @cjolowicz's Hypermodern Python Cookiecutter template.