feat: support protobuf in dsis client #39
Merged
Qif-Equinor merged 24 commits into main on Dec 10, 2025
Conversation
Qif-Equinor (Collaborator) commented on Dec 2, 2025
- Add protobuf support
- Support streaming
- Add dsis-schemas>=0.0.2 to project dependencies
  - Required for QueryBuilder casting functionality and model validation
  - Package provides Pydantic models for DSIS data structures (1160+ models)
- Add get_bulk_data() method to DSISClient for fetching binary protobuf data
- Add _request_binary() internal method to BaseClient for binary requests
- Update README with comprehensive protobuf documentation:
  - Installation instructions for protobuf support
  - Supported bulk data types (Horizon, LogCurve, Seismic)
  - Examples showing two approaches: include data in query vs fetch separately
  - API reference for get_bulk_data() method
- Add example_protobuf.py demonstrating complete workflow:
  - Horizon data decoding and NumPy conversion
  - Log curve data decoding and analysis
  - Seismic data decoding and trace extraction
- Update execute_query() documentation to reflect generator pattern
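The generator pattern mentioned for execute_query() can be sketched as follows. This is a minimal illustration, not the SDK's actual implementation; the function signature and the page structure are stand-ins:

```python
from typing import Any, Dict, Iterator, List

# Hypothetical sketch: a generator-based execute_query() yields entities
# one page at a time instead of materializing the full result list.
def execute_query(pages: List[List[Dict[str, Any]]]) -> Iterator[Dict[str, Any]]:
    """Yield entities lazily, page by page (as OData nextLink pagination would)."""
    for page in pages:        # each page would come from one HTTP request
        for entity in page:   # entities are yielded as they arrive
            yield entity

# Usage: nothing is fetched until iteration starts.
pages = [[{"native_uid": "A"}], [{"native_uid": "B"}, {"native_uid": "C"}]]
uids = [e["native_uid"] for e in execute_query(pages)]
```

The caller iterates the results instead of indexing into a list, which keeps memory use flat for large result sets.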
The DSIS API serves data in two formats:
- Metadata: Via OData (JSON)
- Bulk Data: Via Protocol Buffers (binary)
The new get_bulk_data() method provides efficient access to binary data
via the OData endpoint: /{schema}('{native_uid}')/data/$value
This is more efficient than including the 'data' field in OData queries,
especially for large datasets like seismic volumes.
- Change endpoint from /{Schema}('{native_uid}')/data/$value to /{Schema}('{native_uid}')/data
- Change Accept header from application/octet-stream to application/json
- Update documentation to reflect correct endpoint format and Accept header
- Add notes about API behavior in README
The DSIS API returns binary protobuf data with an Accept: application/json header,
not application/octet-stream, and the endpoint does not include the /$value suffix.
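The corrected request shape can be sketched like this, using the stdlib urllib for illustration (the real client may build requests differently; the base URL and helper name are hypothetical):

```python
from urllib.request import Request

# Hypothetical sketch of the corrected bulk-data request: the endpoint has
# no /$value suffix, and Accept is application/json even though the
# response body is binary protobuf.
def build_bulk_data_request(base_url: str, schema: str, native_uid: str) -> Request:
    endpoint = f"{base_url}/{schema}('{native_uid}')/data"
    return Request(endpoint, headers={"Accept": "application/json"})

req = build_bulk_data_request("https://dsis.example/api", "LogCurve", "uid-123")
```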
…data
- Add get_entity_data() method to DSISClient that extracts native_uid from entity
- Support passing query object to reuse district_id and field context
- Update README with new convenience method documentation
- Update log curve example to use the simpler API
This provides a more intuitive API:

```python
binary_data = client.get_entity_data(log_curve, schema='LogCurve', query=query)
```

instead of:

```python
binary_data = client.get_bulk_data(
    schema='LogCurve',
    native_uid=log_curve['native_uid'],
    district_id=district_id,
    field=field,
)
```
- Remove HAS_DSIS_SCHEMAS checks since dsis-schemas>=0.0.2 is now required
- Simplify schema_helper.py by removing try/except for dsis_model_sdk import
- Simplify base_client.py schema validation logic
- Add dsis-schemas[protobuf] to ensure protobuf version compatibility
- Add dev dependencies (pytest, pytest-cov, pytest-mock) to pyproject.toml
- All tests pass (9/9)

Since dsis-schemas is now a required dependency in pyproject.toml, we don't need defensive imports anymore. Including the protobuf extras ensures version compatibility between the generated protobuf code and the runtime.
- Remove try/except blocks for dsis_model_sdk imports
- Update all examples to use get_entity_data() instead of get_bulk_data()
- Simplify code since dsis-schemas[protobuf] is now required
- All three examples (Horizon, LogCurve, Seismic) now use the new API
- Add requests>=2.28.0 to dependencies
- While requests is a transitive dependency of msal, it's best practice to explicitly declare direct dependencies that are imported in the code
- Modified _request_binary() to return None on 404 (entity has no bulk data)
- Updated get_bulk_data() and get_entity_data() return types to Optional[bytes]
- Updated docstrings with proper behavior documentation
- Added None checks in example_protobuf.py and sample.py
- This provides a more Pythonic API: users can check 'if binary_data:' instead of catching exceptions for missing data
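The 404-as-None pattern described above can be sketched as follows; the function name and parameters are stand-ins for the real _request_binary() internals:

```python
from typing import Optional

# Hypothetical sketch: a 404 means the entity has no bulk data, so the
# caller gets None instead of an exception and can use a truthiness check.
def request_binary(status: int, body: bytes) -> Optional[bytes]:
    if status == 404:
        return None          # entity exists but carries no bulk data
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    return body

binary_data = request_binary(404, b"")
if binary_data:
    size = len(binary_data)  # decode / process the protobuf payload here
else:
    size = 0                 # no bulk data for this entity
```

Note that the truthiness check also treats a zero-length payload as "no data", which is usually the desired behavior for bulk downloads; use `is not None` if empty payloads must be distinguished.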
- Add _request_binary_stream() method for chunked binary data streaming
- Add get_bulk_data_stream() for streaming binary protobuf data in chunks
- Add get_entity_data_stream() convenience method for entity-based streaming
- Update all bulk data methods to accept schema as class or string (type-safe)
- Add example_streaming.py demonstrating memory-efficient streaming patterns
- Update examples to use schema classes instead of strings

Benefits:
- Memory efficient: stream large datasets without loading everything into memory
- Type-safe: use schema classes with IDE autocomplete and type checking
- Progress tracking: monitor download progress in real-time
- Early termination: stop streaming if data exceeds size limits
- Flexible: supports direct-to-file streaming and conditional processing
- Changed default chunk_size from 8192 bytes (8KB) to 10MB (10*1024*1024)
- Updated _request_binary_stream() in BaseClient
- Updated get_bulk_data_stream() in DSISClient
- Updated get_entity_data_stream() in DSISClient
- Updated all examples in example_streaming.py to use 10MB chunks
- 10MB is the chunk size recommended by DSIS for optimal performance
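The chunked streaming pattern (with the progress tracking and early termination mentioned above) can be sketched like this; the generator operates on an in-memory payload here purely for illustration, where the real method would iterate over an HTTP response:

```python
from typing import Iterator

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, the chunk size recommended by DSIS

# Hypothetical sketch of get_bulk_data_stream(): yield the payload in
# fixed-size chunks so large datasets never sit in memory all at once.
def stream_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    for offset in range(0, len(payload), chunk_size):
        yield payload[offset : offset + chunk_size]

# Usage: track progress and terminate early past a size limit.
downloaded = 0
for chunk in stream_chunks(b"x" * 25, chunk_size=10):
    downloaded += len(chunk)   # progress tracking
    if downloaded > 100:       # early termination on oversized data
        break
```

The same loop works for direct-to-file streaming: write each chunk to an open file handle instead of accumulating it.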
- Consolidate get_bulk_data() and get_bulk_data_stream() to accept entity objects or strings
- Add query parameter to auto-extract district_id and field
- Deprecate get_entity_data() methods (warnings added, removal in v1.0.0)
- Add comprehensive binary data guide
- Update README and API docs with simplified examples
- Update pyproject.toml to require dsis-schemas>=0.0.4 and protobuf>=6.33.0
- Fix example_protobuf.py: move imports to top, update to use get_bulk_data()
- Fix example_streaming.py: move imports to top, update to use get_bulk_data_stream()
- Remove f-strings without placeholders
- Update all examples to use consolidated API (native_uid parameter accepts entities)
dsis-schemas 0.0.5 fixes the protobuf dependency issue:
- Now requires protobuf>=5.28.3 (was protobuf<6.0.0 in 0.0.4)
- Correctly installs protobuf 6.33.2, which matches the gencode version
- All tests passing with correct protobuf runtime/gencode versions
…ort, and query execution for DSIS API
- Added BulkDataMixin for fetching and streaming binary protobuf data.
- Introduced HTTPTransportMixin for making authenticated HTTP requests.
- Created PaginationMixin for handling OData nextLink pagination.
- Developed QueryExecutionMixin for executing QueryBuilder queries and casting results.
- Each mixin requires specific configurations and methods to be implemented in subclasses.
…lk_data.py
- Add _extract_native_uid() helper to extract native_uid from string/dict/object
- Add _build_bulk_data_endpoint() helper to build endpoint paths
- Simplify get_bulk_data() from 40 to 16 lines (60% reduction)
- Simplify get_bulk_data_stream() from 39 to 16 lines (59% reduction)
- Improve error messages with context (keys, type names)
- Eliminate ~33 lines of duplicated code
- All tests passing (9/9)
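A helper of the kind described above might look like the following sketch. The function name and error-message wording are assumptions; the real _extract_native_uid() may differ:

```python
from typing import Any

# Hypothetical sketch of a native_uid extraction helper: accept a raw
# string, an entity dict, or an object with a native_uid attribute,
# and fail with a message that includes context (keys, type name).
def extract_native_uid(entity: Any) -> str:
    if isinstance(entity, str):
        return entity
    if isinstance(entity, dict):
        try:
            return entity["native_uid"]
        except KeyError:
            raise ValueError(
                f"entity dict has no 'native_uid' key; keys: {sorted(entity)}"
            ) from None
    try:
        return entity.native_uid
    except AttributeError:
        raise ValueError(
            f"cannot extract native_uid from {type(entity).__name__}"
        ) from None
```

Centralizing this logic is what lets get_bulk_data() and get_bulk_data_stream() accept entities or strings interchangeably without duplicating the dispatch in each method.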
- Remove get_entity_data() and get_entity_data_stream() methods
- Remove unused warnings import
- Update README.md to remove deprecated method documentation
- Update working-with-binary-data.md migration guide
- All functionality now available via get_bulk_data() and get_bulk_data_stream()
- All tests passing (9/9)
TordAreStromsnes approved these changes on Dec 10, 2025
TordAreStromsnes pushed a commit that referenced this pull request on Dec 10, 2025
🤖 I have created a release *beep* *boop*

---

## [0.5.0](v0.4.1...v0.5.0) (2025-12-10)

### Features

* support protobuf in dsis client ([#39](#39)) ([1be337a](1be337a))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>