# Fetch Horizon 3D bulk data from DSIS

This notebook demonstrates how to fetch and decode **HorizonData3D** binary (protobuf) data from DSIS using the `dsis-client` library.

The following steps are covered:

1. Authenticate to DSIS using an `.env` file with the required configuration and credentials.
2. Construct and execute a query requesting horizon metadata.
3. Fetch binary bulk data for a specific horizon using `get_bulk_data()` and `get_bulk_data_stream()`.
4. Decode the protobuf response and convert it to a NumPy array for analysis.

For details on what the `.env` file must contain, contact the SDD-SID team or the DSIS team at Equinor.
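The configuration cell further down reads several settings with `os.getenv`, which silently returns `None` for anything that is missing. As a minimal sketch, a check like the following could be run after `load_dotenv` to fail early. The helper name `missing_env_vars` is hypothetical and not part of `dsis-client`; the variable list simply mirrors the names used in this notebook.

```python
import os

# Names read via os.getenv in this notebook's DSISConfig cell.
REQUIRED_VARS = (
    "tenant_id",
    "client_id",
    "client_secret",
    "resource_id",
    "dsis_function_key",
    "dsis_password",
    "subscription_key_dsauth",
    "subscription_key_dsdata",
    "dsis_site",
)


def missing_env_vars(names=REQUIRED_VARS, environ=None) -> list[str]:
    """Return the names that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing settings:", ", ".join(missing))
```

An empty result means every name has a non-empty value; it does not confirm that the values themselves are valid.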

### Authenticate and connect to DSIS

In [77]:
from dsis_client import DSISClient, DSISConfig, QueryBuilder, Environment
from dsis_model_sdk.protobuf import decode_horizon_data
from dsis_model_sdk.utils.protobuf_decoders import horizon_samples_to_dict

from dotenv import load_dotenv
import os

In [78]:
MODEL_NAME = "OpenWorksCommonModel"

In [None]:
DISTRICT = "BG4FROST"
PROJECT = "VOLVE_PUBLIC"

In [80]:
load_dotenv(".env_dsis")

True

In [81]:
config = DSISConfig(
    environment=Environment.DEV,
    tenant_id=os.getenv("tenant_id"),
    client_id=os.getenv("client_id"),
    client_secret=os.getenv("client_secret"),
    access_app_id=os.getenv("resource_id"),
    dsis_username=os.getenv("dsis_function_key"),
    dsis_password=os.getenv("dsis_password"),
    subscription_key_dsauth=os.getenv("subscription_key_dsauth"),
    subscription_key_dsdata=os.getenv("subscription_key_dsdata"),
    dsis_site=os.getenv("dsis_site"),
)

In [82]:
dsis_client = DSISClient(config)
if dsis_client.test_connection():
    print("✓ Connected to DSIS API")

✓ Connected to DSIS API


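The OpenWorks (OW) database (district) and project were set above in `DISTRICT` and `PROJECT`. DSIS expects the database name wrapped in a model-specific district ID, which the helper below builds.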

In [83]:
def build_district_id(database: str, *, model_name: str) -> str:
    """Build DSIS district_id from database name.

    DSIS uses different district-id conventions for different models.

    Examples:
    - OpenWorksCommonModel: OpenWorksCommonModel_OW_<DB>-OW_<DB>
    - OpenWorks native models (e.g., OW5000): OpenWorks_OW_<DB>_SingleSource-OW_<DB>
    """
    if model_name == "OpenWorksCommonModel":
        return f"OpenWorksCommonModel_OW_{database}-OW_{database}"
    return f"OpenWorks_OW_{database}_SingleSource-OW_{database}"

In [84]:
query = QueryBuilder(
    model_name=MODEL_NAME,
    district_id=build_district_id(DISTRICT, model_name=MODEL_NAME),
    project=PROJECT,
).schema("HorizonData3D")

In [85]:
horizons = list(dsis_client.execute_query(query, max_pages=1))
print(f"Found {len(horizons)} horizons")

Found 1000 horizons


In [86]:
native_uids = [h["native_uid"] for h in horizons]

### Fetch bulk (binary) data

The `dsis-client` provides two methods for downloading binary protobuf data:

- **`get_bulk_data()`** – loads the whole payload into memory in one call (best for payloads under 100 MB)
- **`get_bulk_data_stream()`** – yields the payload in chunks (best for payloads over 100 MB)

Use `query.entity(native_uid)` to target a specific entity's binary data, then pass the query to the bulk-data method.

For `HorizonData3D` the default `data_field="data"` is used, so we don’t need to override it.

In [87]:
horizon_data: dict[str, list[dict]] = {}
for uid in native_uids[:2]:  # remove [:2] to fetch all horizons
    # Target the entity's binary data field via query.entity()
    bulk_query = query.entity(uid)

    # Fetch all bulk data at once
    binary_data = dsis_client.get_bulk_data(bulk_query)

    if binary_data:
        print(f"Downloaded {len(binary_data):,} bytes")
        decoded = decode_horizon_data(binary_data, skip_length_prefix=True)
        horizon_data[uid] = horizon_samples_to_dict(decoded)
    else:
        print("No bulk data available for this horizon")

Downloaded 3,344,259 bytes
Downloaded 48,567,594 bytes
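
The introduction mentions converting the decoded data to a NumPy array, which the cell above stops short of. The sketch below shows one way to do it, assuming each horizon maps to a list of flat dictionaries with numeric values (as the `dict[str, list[dict]]` annotation suggests). The key names `inline`, `crossline` and `value` are placeholders, not the confirmed output of `horizon_samples_to_dict`; inspect one sample first and substitute the real keys.

```python
import numpy as np


def samples_to_array(samples: list[dict], keys: tuple[str, ...]) -> np.ndarray:
    """Stack the chosen keys of each sample into a (n_samples, n_keys) float array.

    Keys that are absent from a sample become NaN instead of raising.
    """
    rows = [[sample.get(key, np.nan) for key in keys] for sample in samples]
    return np.array(rows, dtype=float).reshape(len(samples), len(keys))


# Placeholder data standing in for horizon_data[uid]; the keys are assumed.
example_samples = [
    {"inline": 100, "crossline": 200, "value": 2510.5},
    {"inline": 100, "crossline": 201, "value": 2511.0},
    {"inline": 101, "crossline": 200},  # no "value" key -> NaN
]
array = samples_to_array(example_samples, ("inline", "crossline", "value"))
print(array.shape)
```

In the notebook this would be called as `samples_to_array(horizon_data[uid], keys)` once the actual key names are known.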


In [88]:
def stream_horizon(uid: str, chunk_size: int = 10 * 1024 * 1024) -> bytes | None:
    """Stream bulk data for a single horizon and return the reassembled bytes."""
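    # Unlike the previous cell, this passes data_field="$value" explicitly
    # instead of relying on the default "data".
    bulk_query = query.entity(uid, data_field="$value")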
    chunks: list[bytes] = []

    for chunk in dsis_client.get_bulk_data_stream(bulk_query, chunk_size=chunk_size):
        chunks.append(chunk)

    if not chunks:
        print(f"✗ {uid}: no data")
        return None

    binary = b"".join(chunks)
    print(f"✓ {uid}: {len(binary):,} bytes ({len(chunks)} chunk(s))")
    return binary


# ── Stream all horizons ───────────────────────────────────────────────────
streamed_decoded: dict[str, list[dict]] = {}

for uid in native_uids[:2]:  # remove [:2] to stream all horizons
    binary_data = stream_horizon(uid)
    if binary_data is not None:
        decoded = decode_horizon_data(binary_data, skip_length_prefix=True)
        streamed_decoded[uid] = horizon_samples_to_dict(decoded)



✓ 1000: 3,344,259 bytes (409 chunk(s))
✓ 84749: 48,567,594 bytes (5930 chunk(s))
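
`stream_horizon` above collects every chunk in a list before joining them, so peak memory still scales with the size of the horizon. For very large horizons, one alternative is to write each chunk to disk as it arrives. The sketch below assumes only that the stream yields `bytes` objects, as in the loop above. `write_chunks` is a hypothetical helper, and a small generator stands in for `dsis_client.get_bulk_data_stream(...)` so the example runs without a DSIS connection.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: Path) -> int:
    """Write chunks to `path` one at a time and return the total byte count."""
    total = 0
    with path.open("wb") as fh:
        for chunk in chunks:
            fh.write(chunk)
            total += len(chunk)
    return total


def fake_stream():
    """Stand-in for dsis_client.get_bulk_data_stream(bulk_query, ...)."""
    yield b"abc"
    yield b"defg"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "horizon.bin"
    written = write_chunks(fake_stream(), target)
    size_on_disk = target.stat().st_size

print(f"Wrote {written} bytes")
```

In the notebook, `fake_stream()` would be replaced by the real stream for a given `bulk_query`, and the file could be read back with `Path.read_bytes()` when it is time to decode.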
