feat: support protobuf in dsis client #39

Merged
Qif-Equinor merged 24 commits into main from feat/qif/protobuf on Dec 10, 2025

Conversation

@Qif-Equinor
Collaborator

  1. Add protobuf support
  2. Support streaming

qfu3 and others added 12 commits November 28, 2025 19:56
- Add dsis-schemas>=0.0.2 to project dependencies
- Required for QueryBuilder casting functionality and model validation
- Package provides Pydantic models for DSIS data structures (1160+ models)
- Add get_bulk_data() method to DSISClient for fetching binary protobuf data
- Add _request_binary() internal method to BaseClient for binary requests
- Update README with comprehensive protobuf documentation:
  - Installation instructions for protobuf support
  - Supported bulk data types (Horizon, LogCurve, Seismic)
  - Examples showing two approaches: include data in query vs fetch separately
  - API reference for get_bulk_data() method
- Add example_protobuf.py demonstrating complete workflow:
  - Horizon data decoding and NumPy conversion
  - Log curve data decoding and analysis
  - Seismic data decoding and trace extraction
- Update execute_query() documentation to reflect generator pattern

The DSIS API serves data in two formats:
- Metadata: Via OData (JSON)
- Bulk Data: Via Protocol Buffers (binary)

The new get_bulk_data() method provides efficient access to binary data
via the OData endpoint: /{schema}('{native_uid}')/data/$value

This is more efficient than including the 'data' field in OData queries,
especially for large datasets like seismic volumes.
- Change endpoint from /{Schema}('{native_uid}')/data/$value to /{Schema}('{native_uid}')/data
- Change Accept header from application/octet-stream to application/json
- Update documentation to reflect correct endpoint format and Accept header
- Add notes about API behavior in README

The DSIS API returns binary protobuf data with Accept: application/json header,
not application/octet-stream, and the endpoint does not include the /$value suffix.
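
As a minimal sketch of the corrected request shape described in this commit, the helper below builds the relative path and the Accept header. The function name `bulk_data_request` is hypothetical and not part of the client; the real path may also carry district and field segments, which are left out here.

```python
from typing import Dict, Tuple


def bulk_data_request(schema: str, native_uid: str) -> Tuple[str, Dict[str, str]]:
    """Return (relative_path, headers) for a bulk data request.

    Hypothetical helper for illustration; not the client's actual code.
    """
    # The endpoint ends in /data, with no /$value suffix.
    path = f"/{schema}('{native_uid}')/data"
    # Per the commit message, binary protobuf is returned with this Accept header.
    headers = {"Accept": "application/json"}
    return path, headers


path, headers = bulk_data_request("LogCurve", "example-uid")
print(path)  # /LogCurve('example-uid')/data
```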
…data

- Add get_entity_data() method to DSISClient that extracts native_uid from entity
- Support passing query object to reuse district_id and field context
- Update README with new convenience method documentation
- Update log curve example to use the simpler API

This provides a more intuitive API:
  binary_data = client.get_entity_data(log_curve, schema='LogCurve', query=query)

Instead of:
  binary_data = client.get_bulk_data(
      schema='LogCurve',
      native_uid=log_curve['native_uid'],
      district_id=district_id,
      field=field
  )
- Remove HAS_DSIS_SCHEMAS checks since dsis-schemas>=0.0.2 is now required
- Simplify schema_helper.py by removing try/except for dsis_model_sdk import
- Simplify base_client.py schema validation logic
- Add dsis-schemas[protobuf] to ensure protobuf version compatibility
- Add dev dependencies (pytest, pytest-cov, pytest-mock) to pyproject.toml
- All tests pass (9/9)

Since dsis-schemas is now a required dependency in pyproject.toml, we don't
need defensive imports anymore. Including protobuf extras ensures version
compatibility between the generated protobuf code and runtime.
- Remove try/except blocks for dsis_model_sdk imports
- Update all examples to use get_entity_data() instead of get_bulk_data()
- Simplify code since dsis-schemas[protobuf] is now required
- All three examples (Horizon, LogCurve, Seismic) now use the new API
- Add requests>=2.28.0 to dependencies
- While requests is a transitive dependency of msal, it's best practice
  to explicitly declare direct dependencies that are imported in the code
- Modified _request_binary() to return None on 404 (entity has no bulk data)
- Updated get_bulk_data() and get_entity_data() return types to Optional[bytes]
- Updated docstrings with proper behavior documentation
- Added None checks in example_protobuf.py and sample.py
- This provides a more Pythonic API - users can check 'if binary_data:' instead
  of catching exceptions for missing data
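
The 404 handling described in this commit can be sketched roughly as follows. This is not the client's `_request_binary()`; the `fetch` callable and the `(status, body)` tuple are stand-ins invented here so the example runs without a network.

```python
from typing import Callable, Optional, Tuple

# Assumed shape of a response for this sketch: (HTTP status code, raw body).
Response = Tuple[int, bytes]


def request_binary(fetch: Callable[[str], Response], endpoint: str) -> Optional[bytes]:
    """Return the body, or None when the entity has no bulk data (HTTP 404)."""
    status, body = fetch(endpoint)
    if status == 404:
        return None  # missing bulk data is an expected outcome, not an error
    if status >= 400:
        raise RuntimeError(f"request to {endpoint} failed with HTTP {status}")
    return body


def fake_fetch(endpoint: str) -> Response:
    """Canned transport: only one endpoint has data."""
    if endpoint == "/LogCurve('has-data')/data":
        return 200, b"\x08\x01"
    return 404, b""


binary_data = request_binary(fake_fetch, "/LogCurve('missing')/data")
if binary_data is None:
    print("no bulk data for this entity")
```

Checking `is not None` rather than plain truthiness also distinguishes a missing entity from an empty payload, since `b""` is falsy in Python.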
- Add _request_binary_stream() method for chunked binary data streaming
- Add get_bulk_data_stream() for streaming binary protobuf data in chunks
- Add get_entity_data_stream() convenience method for entity-based streaming
- Update all bulk data methods to accept schema as class or string (type-safe)
- Add example_streaming.py demonstrating memory-efficient streaming patterns
- Update examples to use schema classes instead of strings

Benefits:
- Memory efficient: stream large datasets without loading everything into memory
- Type-safe: use schema classes with IDE autocomplete and type checking
- Progress tracking: monitor download progress in real-time
- Early termination: stop streaming if data exceeds size limits
- Flexible: supports direct-to-file streaming and conditional processing
- Changed default chunk_size from 8192 bytes (8KB) to 10MB (10*1024*1024)
- Updated _request_binary_stream() in BaseClient
- Updated get_bulk_data_stream() in DSISClient
- Updated get_entity_data_stream() in DSISClient
- Updated all examples in example_streaming.py to use 10MB chunks
- 10MB is the recommended chunk size by DSIS for optimal performance
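
To illustrate the chunked streaming pattern with early termination, here is a small sketch that reads from an in-memory buffer instead of an HTTP response. The names `iter_chunks` and `write_with_limit` are hypothetical and do not correspond to methods of the client; only the 10MB default comes from the commit above.

```python
import io
from typing import BinaryIO, Iterable, Iterator

DEFAULT_CHUNK_SIZE = 10 * 1024 * 1024  # 10MB default chunk size


def iter_chunks(source: BinaryIO, chunk_size: int = DEFAULT_CHUNK_SIZE) -> Iterator[bytes]:
    """Yield chunks of at most chunk_size bytes until the source is exhausted."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            return
        yield chunk


def write_with_limit(chunks: Iterable[bytes], sink: BinaryIO, max_bytes: int) -> int:
    """Write chunks to sink and stop early if the total exceeds max_bytes."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            raise ValueError(f"stream exceeded limit of {max_bytes} bytes")
        sink.write(chunk)
    return total


# Stand-in for a response body; a small chunk size keeps the demo tiny.
payload = io.BytesIO(b"x" * 25)
sink = io.BytesIO()
written = write_with_limit(iter_chunks(payload, chunk_size=10), sink, max_bytes=100)
print(written)  # 25
```

Because only one chunk is held at a time, peak memory stays near the chunk size regardless of how large the full payload is.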
@Qif-Equinor changed the title from "Feat/qif/protobuf" to "feat: support protobuf in dsis client" on Dec 8, 2025
- Consolidate get_bulk_data() and get_bulk_data_stream() to accept entity objects or strings
- Add query parameter to auto-extract district_id and field
- Deprecate get_entity_data() methods (warnings added, removal in v1.0.0)
- Add comprehensive binary data guide
- Update README and API docs with simplified examples
- Update pyproject.toml to require dsis-schemas>=0.0.4 and protobuf>=6.33.0
- Fix example_protobuf.py: move imports to top, update to use get_bulk_data()
- Fix example_streaming.py: move imports to top, update to use get_bulk_data_stream()
- Remove f-strings without placeholders
- Update all examples to use consolidated API (native_uid parameter accepts entities)
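
The deprecation step mentioned above (warnings added, removal planned for v1.0.0) might look something like the sketch below. `SketchClient` and its method bodies are invented for illustration; only the method names and the v1.0.0 target come from the commit message.

```python
import warnings
from typing import Any, Dict


class SketchClient:
    """Illustrative stand-in; the real client performs HTTP requests."""

    def get_bulk_data(self, schema: str, native_uid: str) -> bytes:
        # Placeholder payload so the example runs offline.
        return f"{schema}:{native_uid}".encode()

    def get_entity_data(self, entity: Dict[str, Any], schema: str) -> bytes:
        warnings.warn(
            "get_entity_data() is deprecated and will be removed in v1.0.0; "
            "use get_bulk_data() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.get_bulk_data(schema, entity["native_uid"])


client = SketchClient()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = client.get_entity_data({"native_uid": "example-uid"}, schema="LogCurve")

print(data, caught[0].category.__name__)
```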
@jucc linked an issue on Dec 8, 2025 that may be closed by this pull request
Qif-Equinor and others added 9 commits December 8, 2025 12:09
dsis-schemas 0.0.5 fixes the protobuf dependency issue:
- Now requires protobuf>=5.28.3 (was protobuf<6.0.0 in 0.0.4)
- Correctly installs protobuf 6.33.2 which matches gencode version
- All tests passing with correct protobuf runtime/gencode versions
…ort, and query execution for DSIS API

- Added BulkDataMixin for fetching and streaming binary protobuf data.
- Introduced HTTPTransportMixin for making authenticated HTTP requests.
- Created PaginationMixin for handling OData nextLink pagination.
- Developed QueryExecutionMixin for executing QueryBuilder queries and casting results.
- Each mixin requires specific configurations and methods to be implemented in subclasses.
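
A rough sketch of the mixin pattern described here, using pagination as the example: the mixin supplies the loop, and the host class supplies the transport method it depends on. The class names, `_get_json()`, and the `@odata.nextLink` key (the OData v4 JSON convention) are assumptions for illustration; the actual mixins and the key DSIS returns may differ.

```python
from typing import Any, Dict, Iterator, Optional


class PaginationSketch:
    """Mixin that follows next links; the host class must implement _get_json()."""

    def _get_json(self, url: str) -> Dict[str, Any]:
        raise NotImplementedError("host class must provide _get_json()")

    def iter_items(self, url: str) -> Iterator[Any]:
        next_url: Optional[str] = url
        while next_url:
            page = self._get_json(next_url)
            yield from page.get("value", [])
            # Assumption: OData v4 JSON key; the server's key may differ.
            next_url = page.get("@odata.nextLink")


class FakeClient(PaginationSketch):
    """Serves canned pages instead of making HTTP requests."""

    def __init__(self, pages: Dict[str, Dict[str, Any]]) -> None:
        self._pages = pages

    def _get_json(self, url: str) -> Dict[str, Any]:
        return self._pages[url]


client = FakeClient({
    "page1": {"value": [1, 2], "@odata.nextLink": "page2"},
    "page2": {"value": [3]},
})
print(list(client.iter_items("page1")))  # [1, 2, 3]
```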
…lk_data.py

- Add _extract_native_uid() helper to extract native_uid from string/dict/object
- Add _build_bulk_data_endpoint() helper to build endpoint paths
- Simplify get_bulk_data() from 40 to 16 lines (60% reduction)
- Simplify get_bulk_data_stream() from 39 to 16 lines (59% reduction)
- Improve error messages with context (keys, type names)
- Eliminate ~33 lines of duplicated code
- All tests passing (9/9)
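
For readers unfamiliar with the pattern, extracting an identifier from a string, dict, or object could be sketched as below. This is not the code of `_extract_native_uid()`; the function name, exception types, and message wording are assumptions that only mirror the behavior the commit describes (contextual errors listing keys and type names).

```python
from types import SimpleNamespace
from typing import Any


def extract_native_uid(entity: Any) -> str:
    """Return native_uid from a plain string, a dict, or an object attribute."""
    if isinstance(entity, str):
        return entity  # already an identifier
    if isinstance(entity, dict):
        if "native_uid" not in entity:
            raise ValueError(
                f"entity dict has no 'native_uid' key; available keys: {list(entity)}"
            )
        return entity["native_uid"]
    uid = getattr(entity, "native_uid", None)
    if uid is None:
        raise TypeError(
            f"cannot extract native_uid from object of type {type(entity).__name__}"
        )
    return uid


print(extract_native_uid("uid-1"))
print(extract_native_uid({"native_uid": "uid-2"}))
print(extract_native_uid(SimpleNamespace(native_uid="uid-3")))
```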
- Remove get_entity_data() and get_entity_data_stream() methods
- Remove unused warnings import
- Update README.md to remove deprecated method documentation
- Update working-with-binary-data.md migration guide
- All functionality now available via get_bulk_data() and get_bulk_data_stream()
- All tests passing (9/9)
@Qif-Equinor merged commit 1be337a into main on Dec 10, 2025
6 checks passed
@Qif-Equinor deleted the feat/qif/protobuf branch on December 10, 2025 08:00
TordAreStromsnes pushed a commit that referenced this pull request Dec 10, 2025
🤖 I have created a release *beep* *boop*
---


## [0.5.0](v0.4.1...v0.5.0) (2025-12-10)


### Features

* support protobuf in dsis client ([#39](#39)) ([1be337a](1be337a))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Support for streaming surface grids and horizons

2 participants