feat: support protobuf in dsis client #39
Merged
Qif-Equinor merged 24 commits into main on Dec 10, 2025
Conversation
Qif-Equinor (Collaborator) commented on Dec 2, 2025
- Add protobuf support
- Support streaming
- Add dsis-schemas>=0.0.2 to project dependencies
  - Required for QueryBuilder casting functionality and model validation
  - Package provides Pydantic models for DSIS data structures (1160+ models)
- Add get_bulk_data() method to DSISClient for fetching binary protobuf data
- Add _request_binary() internal method to BaseClient for binary requests
- Update README with comprehensive protobuf documentation:
  - Installation instructions for protobuf support
  - Supported bulk data types (Horizon, LogCurve, Seismic)
  - Examples showing two approaches: include data in query vs fetch separately
  - API reference for get_bulk_data() method
- Add example_protobuf.py demonstrating complete workflow:
  - Horizon data decoding and NumPy conversion
  - Log curve data decoding and analysis
  - Seismic data decoding and trace extraction
- Update execute_query() documentation to reflect generator pattern
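The generator pattern mentioned for execute_query() can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation; the function signature and the page structure are stand-ins:

```python
from typing import Any, Dict, Iterator, List

# Hypothetical sketch: a generator-based execute_query() yields entities
# one page at a time instead of materializing the full result list.
def execute_query(pages: List[List[Dict[str, Any]]]) -> Iterator[Dict[str, Any]]:
    """Yield entities lazily, page by page (as OData nextLink pagination would)."""
    for page in pages:        # each page would come from one HTTP request
        for entity in page:   # entities are yielded as they arrive
            yield entity

# Usage: nothing is fetched until iteration starts.
pages = [[{"native_uid": "A"}], [{"native_uid": "B"}, {"native_uid": "C"}]]
uids = [e["native_uid"] for e in execute_query(pages)]
```

The caller iterates the results instead of indexing into a list, which keeps memory use flat for large result sets.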
The DSIS API serves data in two formats:
- Metadata: Via OData (JSON)
- Bulk Data: Via Protocol Buffers (binary)
The new get_bulk_data() method provides efficient access to binary data
via the OData endpoint: /{schema}('{native_uid}')/data/$value
This is more efficient than including the 'data' field in OData queries,
especially for large datasets like seismic volumes.
- Change endpoint from /{Schema}('{native_uid}')/data/$value to /{Schema}('{native_uid}')/data
- Change Accept header from application/octet-stream to application/json
- Update documentation to reflect correct endpoint format and Accept header
- Add notes about API behavior in README
The DSIS API returns binary protobuf data with an Accept: application/json header,
not application/octet-stream, and the endpoint does not include the /$value suffix.
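The corrected request shape can be sketched like this, using the stdlib urllib for illustration (the real client may build requests differently; the base URL and helper name are hypothetical):

```python
from urllib.request import Request

# Hypothetical sketch of the corrected bulk-data request: the endpoint has
# no /$value suffix, and Accept is application/json even though the
# response body is binary protobuf.
def build_bulk_data_request(base_url: str, schema: str, native_uid: str) -> Request:
    endpoint = f"{base_url}/{schema}('{native_uid}')/data"
    return Request(endpoint, headers={"Accept": "application/json"})

req = build_bulk_data_request("https://dsis.example/api", "LogCurve", "uid-123")
```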
…data
- Add get_entity_data() method to DSISClient that extracts native_uid from entity
- Support passing query object to reuse district_id and field context
- Update README with new convenience method documentation
- Update log curve example to use the simpler API
This provides a more intuitive API:

```python
binary_data = client.get_entity_data(log_curve, schema='LogCurve', query=query)
```

instead of:

```python
binary_data = client.get_bulk_data(
    schema='LogCurve',
    native_uid=log_curve['native_uid'],
    district_id=district_id,
    field=field,
)
```
- Remove HAS_DSIS_SCHEMAS checks since dsis-schemas>=0.0.2 is now required
- Simplify schema_helper.py by removing try/except for dsis_model_sdk import
- Simplify base_client.py schema validation logic
- Add dsis-schemas[protobuf] to ensure protobuf version compatibility
- Add dev dependencies (pytest, pytest-cov, pytest-mock) to pyproject.toml
- All tests pass (9/9)

Since dsis-schemas is now a required dependency in pyproject.toml, we don't need defensive imports anymore. Including the protobuf extras ensures version compatibility between the generated protobuf code and the runtime.
- Remove try/except blocks for dsis_model_sdk imports
- Update all examples to use get_entity_data() instead of get_bulk_data()
- Simplify code since dsis-schemas[protobuf] is now required
- All three examples (Horizon, LogCurve, Seismic) now use the new API
- Add requests>=2.28.0 to dependencies
- While requests is a transitive dependency of msal, it's best practice to explicitly declare direct dependencies that are imported in the code
- Modified _request_binary() to return None on 404 (entity has no bulk data)
- Updated get_bulk_data() and get_entity_data() return types to Optional[bytes]
- Updated docstrings with proper behavior documentation
- Added None checks in example_protobuf.py and sample.py
- This provides a more Pythonic API: users can check 'if binary_data:' instead of catching exceptions for missing data
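The 404-as-None pattern described above can be sketched as follows; the function name and parameters are stand-ins for the real _request_binary() internals:

```python
from typing import Optional

# Hypothetical sketch: a 404 means the entity has no bulk data, so the
# caller gets None instead of an exception and can use a truthiness check.
def request_binary(status: int, body: bytes) -> Optional[bytes]:
    if status == 404:
        return None          # entity exists but carries no bulk data
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    return body

binary_data = request_binary(404, b"")
if binary_data:
    size = len(binary_data)  # decode / process the protobuf payload here
else:
    size = 0                 # no bulk data for this entity
```

Note that the truthiness check also treats a zero-length payload as "no data", which is usually the desired behavior for bulk downloads; use `is not None` if empty payloads must be distinguished.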
- Add _request_binary_stream() method for chunked binary data streaming
- Add get_bulk_data_stream() for streaming binary protobuf data in chunks
- Add get_entity_data_stream() convenience method for entity-based streaming
- Update all bulk data methods to accept schema as class or string (type-safe)
- Add example_streaming.py demonstrating memory-efficient streaming patterns
- Update examples to use schema classes instead of strings

Benefits:
- Memory efficient: stream large datasets without loading everything into memory
- Type-safe: use schema classes with IDE autocomplete and type checking
- Progress tracking: monitor download progress in real-time
- Early termination: stop streaming if data exceeds size limits
- Flexible: supports direct-to-file streaming and conditional processing
- Changed default chunk_size from 8192 bytes (8KB) to 10MB (10*1024*1024)
- Updated _request_binary_stream() in BaseClient
- Updated get_bulk_data_stream() in DSISClient
- Updated get_entity_data_stream() in DSISClient
- Updated all examples in example_streaming.py to use 10MB chunks
- 10MB is the chunk size recommended by DSIS for optimal performance
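The chunked streaming pattern (with the progress tracking and early termination mentioned above) can be sketched like this; the generator operates on an in-memory payload here purely for illustration, where the real method would iterate over an HTTP response:

```python
from typing import Iterator

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, the chunk size recommended by DSIS

# Hypothetical sketch of get_bulk_data_stream(): yield the payload in
# fixed-size chunks so large datasets never sit in memory all at once.
def stream_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset : offset + chunk_size]

# Usage: track progress and terminate early past a size limit.
downloaded = 0
for chunk in stream_chunks(b"x" * 25, chunk_size=10):
    downloaded += len(chunk)   # progress tracking
    if downloaded > 100:       # early termination on oversized data
        break
```

The same loop works for direct-to-file streaming: write each chunk to an open file handle instead of accumulating it.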
- Consolidate get_bulk_data() and get_bulk_data_stream() to accept entity objects or strings
- Add query parameter to auto-extract district_id and field
- Deprecate get_entity_data() methods (warnings added, removal in v1.0.0)
- Add comprehensive binary data guide
- Update README and API docs with simplified examples
- Update pyproject.toml to require dsis-schemas>=0.0.4 and protobuf>=6.33.0
- Fix example_protobuf.py: move imports to top, update to use get_bulk_data()
- Fix example_streaming.py: move imports to top, update to use get_bulk_data_stream()
- Remove f-strings without placeholders
- Update all examples to use consolidated API (native_uid parameter accepts entities)
dsis-schemas 0.0.5 fixes the protobuf dependency issue:
- Now requires protobuf>=5.28.3 (was protobuf<6.0.0 in 0.0.4)
- Correctly installs protobuf 6.33.2, which matches the gencode version
- All tests passing with correct protobuf runtime/gencode versions
…ort, and query execution for DSIS API
- Added BulkDataMixin for fetching and streaming binary protobuf data.
- Introduced HTTPTransportMixin for making authenticated HTTP requests.
- Created PaginationMixin for handling OData nextLink pagination.
- Developed QueryExecutionMixin for executing QueryBuilder queries and casting results.
- Each mixin requires specific configurations and methods to be implemented in subclasses.
…lk_data.py
- Add _extract_native_uid() helper to extract native_uid from string/dict/object
- Add _build_bulk_data_endpoint() helper to build endpoint paths
- Simplify get_bulk_data() from 40 to 16 lines (60% reduction)
- Simplify get_bulk_data_stream() from 39 to 16 lines (59% reduction)
- Improve error messages with context (keys, type names)
- Eliminate ~33 lines of duplicated code
- All tests passing (9/9)
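A helper of the kind described above might look like the following sketch. The function name and error-message wording are assumptions; the real _extract_native_uid() may differ:

```python
from typing import Any

# Hypothetical sketch of a native_uid extraction helper: accept a raw
# string, an entity dict, or an object with a native_uid attribute,
# and fail with a message that includes context (keys, type name).
def extract_native_uid(entity: Any) -> str:
    if isinstance(entity, str):
        return entity
    if isinstance(entity, dict):
        try:
            return entity["native_uid"]
        except KeyError:
            raise ValueError(
                f"entity dict has no 'native_uid' key; keys: {sorted(entity)}"
            ) from None
    try:
        return entity.native_uid
    except AttributeError:
        raise ValueError(
            f"cannot extract native_uid from {type(entity).__name__}"
        ) from None
```

Centralizing this logic is what lets get_bulk_data() and get_bulk_data_stream() accept entities or strings interchangeably without duplicating the dispatch in each method.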
- Remove get_entity_data() and get_entity_data_stream() methods
- Remove unused warnings import
- Update README.md to remove deprecated method documentation
- Update working-with-binary-data.md migration guide
- All functionality now available via get_bulk_data() and get_bulk_data_stream()
- All tests passing (9/9)
TordAreStromsnes approved these changes on Dec 10, 2025
TordAreStromsnes pushed a commit that referenced this pull request on Dec 10, 2025
🤖 I have created a release *beep* *boop*

---

## [0.5.0](v0.4.1...v0.5.0) (2025-12-10)

### Features

* support protobuf in dsis client ([#39](#39)) ([1be337a](1be337a))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>