In [1]:
%pip install -U odp-sdk

Collecting odp-sdk
  Using cached odp_sdk-0.4.3-py3-none-any.whl (20 kB)
  Using cached odp_sdk-0.4.2-py3-none-any.whl (20 kB)
Note: you may need to restart the kernel to use updated packages.


# SDK - Raw Roundtrip

In this example we will do the following:
 1. Create a raw dataset
 2. Create and upload a file to the dataset
 3. Download the file
 4. Delete the file
 5. Delete the dataset

In [2]:
from odp_sdk.client import OdpClient # The SDK
from odp_sdk.dto import ResourceDto # Resource Data Transfer Object
from odp_sdk.dto.file_dto import FileMetadataDto # File Metadata Data Transfer Object

## Initiate the client
This is where we set up the client for our environment.
When we initiate a client within workspaces, it automatically authenticates requests to the platform.
When using the SDK on your own computer, you will need to authenticate, either with environment variables or with our interactive login.

In [3]:
client = OdpClient()

## Create a resource data transfer object
This object is what is sent back and forth to the API to reference a specific resource.

In [4]:
my_dataset = ResourceDto(
    **{
        "kind": "catalog.hubocean.io/dataset",
        "version": "v1alpha3",
        "metadata": {
            "name": "seahorses", # Add your name to the dataset
        },
        "spec": {
            "storage_controller": "registry.hubocean.io/storageController/storage-raw-cdffs", # Raw storage controller
            "storage_class": "registry.hubocean.io/storageClass/raw",                         # Raw storage
            "maintainer": {"contact": "Just Me <raw_client_example@hubocean.earth>"},         # <-- strict syntax here
        },
    }
)
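
The comment in the cell above notes that the maintainer contact has a strict syntax. The example value has the shape `Name <email>`, so a quick local check before creating the dataset might look like the sketch below. The regular expression and the helper name `looks_like_contact` are assumptions made for illustration; the platform's actual validation rule may differ.

```python
import re

# Assumed pattern: a display name, a space, then an e-mail address in angle brackets.
# It mirrors the example value above; the platform's real rule may be stricter or looser.
CONTACT_PATTERN = re.compile(r"^[^<>]+ <[^<>@\s]+@[^<>@\s]+>$")


def looks_like_contact(value: str) -> bool:
    """Return True if value has the 'Name <email>' shape (hypothetical helper)."""
    return CONTACT_PATTERN.fullmatch(value) is not None


print(looks_like_contact("Just Me <raw_client_example@hubocean.earth>"))  # True
print(looks_like_contact("raw_client_example@hubocean.earth"))            # False
```

A check like this only catches obvious formatting slips early; the platform remains the authority on what it accepts.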


## Create the dataset
Managing resources like datasets and collections happens in the catalog part of the platform, which is why we use the catalog client of the SDK.

In [5]:
# The dataset is created in the catalog.
my_dataset = client.catalog.create(my_dataset)

## Response

When creating a dataset, the platform adds some extra data to the resource DTO. The returned object is the same type as the one we sent, but with additional fields set, such as the UUID, which is now the unique identifier of the dataset.

In [6]:
print(my_dataset)

kind='catalog.hubocean.io/dataset' version='v1alpha3' metadata=MetadataDto(name='seahorses', display_name=None, description=None, uuid=UUID('40fdd05d-8ee4-400f-8a00-d82abebc1339'), labels={}, owner=UUID('2883db88-4205-488a-aea6-6273ccbaad87')) status=ResourceStatusDto(num_updates=0, created_time=datetime.datetime(2024, 1, 8, 12, 49, 19, 620282), created_by=UUID('2883db88-4205-488a-aea6-6273ccbaad87'), updated_time=datetime.datetime(2024, 1, 8, 12, 49, 19, 620282), updated_by=UUID('2883db88-4205-488a-aea6-6273ccbaad87'), deleted_time=None, deleted_by=None) spec={'storage_class': 'registry.hubocean.io/storageClass/raw', 'storage_controller': 'registry.hubocean.io/storageController/storage-raw-cdffs', 'data_collection': None, 'maintainer': {'contact': 'Just Me <raw_client_example@hubocean.earth>', 'organisation': None}, 'citation': None, 'documentation': [], 'attributes': [], 'facets': None, 'tags': []}


## Create and upload a file

Creating and uploading a file are two separate actions, but the SDK's create_file function combines them for ease of use.

Because we are now handling raw data, we use the raw client of the SDK.
We also create a data transfer object for the file, which describes its name, its type, and so on.
The create_file function also accepts the file data itself. Here the raw bytes are passed to the contents parameter, but you can stream any file you like. Make sure you set a fitting mime_type for the file: we are uploading text, so we use "text/plain", but there are other MIME types for zip, NetCDF, GeoJSON, and so forth.

In [7]:
# Creating and uploading a file.
file_dto = client.raw.create_file(    # Use the create_file function in the raw client.
    resource_dto=my_dataset,    # We want to put the file in the dataset we already created.
    file_metadata_dto=FileMetadataDto(**{"name": "test.txt", "mime_type": "text/plain"}), # Initial metadata for the file
    contents=b"Hello, World!", # Actual file data
)
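
Picking a fitting mime_type by hand gets tedious with many files. As a minimal sketch, Python's standard `mimetypes` module can guess one from the file extension. The helper name `guess_mime_type` and the fallback value are assumptions for illustration and are not part of the SDK.

```python
import mimetypes


def guess_mime_type(filename: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to a generic default."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or default


print(guess_mime_type("test.txt"))
print(guess_mime_type("archive.zip"))
```

The returned string could then be used as the "mime_type" value when building the FileMetadataDto. Less common scientific formats may not be in the local MIME table, in which case the fallback is returned and it is worth setting the type explicitly.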

## File creation response

The create_file function returns an updated file_dto, like the one we passed as file_metadata_dto. Let's check it out.

In [8]:
print(file_dto)

external_id=None name='test.txt' source=None mime_type='text/plain' metadata={'hubocean.io/app': 'odcat', 'hubocean.io/dataset': '40fdd05d-8ee4-400f-8a00-d82abebc1339'} directory=None asset_ids=None data_set_id=None labels=None geo_location=None source_created_time=None source_modified_time=None security_categories=None id=None uploaded=None uploaded_time=None created_time='2024-01-08T12:49:20.296000' last_updated_time=None


## List files

Listing files is straightforward. We use the list function in the raw client and pass the dataset object that references the dataset whose files we want. We have only created one file, so the output should match the file metadata we received in the previous step.

In [9]:
for file in client.raw.list(my_dataset):
    print(file)

external_id=None name='test.txt' source=None mime_type='text/plain' metadata={'hubocean.io/app': 'odcat', 'hubocean.io/dataset': '40fdd05d-8ee4-400f-8a00-d82abebc1339'} directory=None asset_ids=None data_set_id=None labels=None geo_location=None source_created_time=None source_modified_time=None security_categories=None id=None uploaded=None uploaded_time=None created_time='2024-01-08T12:49:20.296000' last_updated_time=None


## Download file

Let's download the file again to see that it worked.
We use the download_file function in the raw client and pass the reference objects for the dataset and the file.
If we pass "test.txt" as the third argument, the file is downloaded and saved at that path with that name. Otherwise the function returns a stream.

In [10]:
client.raw.download_file(my_dataset, file_dto, "test.txt")

## File downloaded

The file should now be downloaded to the current directory and contain "Hello, World!"
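
To confirm the roundtrip rather than assume it, the downloaded bytes can be compared with what was uploaded. This is a small sketch using only the standard library; the helper name `file_matches` is hypothetical.

```python
from pathlib import Path


def file_matches(path: str, expected: bytes) -> bool:
    """Return True if the file at path exists and holds exactly the expected bytes."""
    file_path = Path(path)
    return file_path.is_file() and file_path.read_bytes() == expected


# In the notebook, run this after the download cell has finished.
print(file_matches("test.txt", b"Hello, World!"))
```

Comparing bytes rather than decoded text keeps the check independent of any encoding assumptions.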

## Clean up

To run these examples again, we may have to delete the resources we created, so here are the functions to delete the file and the dataset.

In [11]:
client.raw.delete_file(my_dataset, file_dto)

In [12]:
client.catalog.delete(my_dataset)
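
If a cell fails midway, the cleanup cells above may never run, and the next attempt can collide with leftover resources. One possible pattern, sketched here with stand-in functions instead of real SDK calls, is to register the deletions with `contextlib.ExitStack` so they run even when an error occurs. The stand-ins `delete_dataset` and `delete_file` are hypothetical placeholders for the two delete calls shown above.

```python
from contextlib import ExitStack

deleted = []  # records the order in which the stand-in deletions run


def delete_dataset() -> None:
    """Stand-in for client.catalog.delete(my_dataset)."""
    deleted.append("dataset")


def delete_file() -> None:
    """Stand-in for client.raw.delete_file(my_dataset, file_dto)."""
    deleted.append("file")


with ExitStack() as cleanup:
    # Callbacks run in reverse registration order, so the dataset is registered first
    # and the file is deleted before it, mirroring the order of the cells above.
    cleanup.callback(delete_dataset)
    cleanup.callback(delete_file)
    # ... work with the dataset and file here; cleanup runs even if this raises ...

print(deleted)  # ['file', 'dataset']
```

In a real run, each callback would be registered right after the corresponding resource is created, so that only resources that actually exist get deleted.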