# SpectrumX Data System | SDK Demo

## Links

+ [PyPI](https://pypi.org/project/spectrumx/)
+ [GitHub Repo](https://github.com/spectrumx/sds-code/blob/master/sdk/)
+ [SDS Key Generation](https://sds.crc.nd.edu/users/generate-api-key/)


## Basic Usage


### Client initialization

In [21]:
import spectrumx

spectrumx.__version__


'0.1.10'

In [2]:
%%bash
if ! grep -q '^SDS_SECRET_TOKEN=' .env 2>/dev/null; then
    echo "SDS_SECRET_TOKEN=" >> .env
fi

echo "Now edit the .env file to set your SDS_SECRET_TOKEN from:"
echo "https://sds.crc.nd.edu/users/generate-api-key/."


Now edit the .env file to set your SDS_SECRET_TOKEN from:
https://sds.crc.nd.edu/users/generate-api-key/.
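
Before creating the client, it can help to confirm that the token line in `.env` is actually filled in. The sketch below is a plain-Python check, not part of the SDK: `read_env_value` is a hypothetical helper that assumes a simple `KEY=VALUE` file like the one created above.

```python
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, key: str) -> Optional[str]:
    """Returns the value for `key` in a simple KEY=VALUE file, or None if absent.

    Hypothetical helper for illustration; the SDK reads the .env file on its own.
    """
    if not env_path.is_file():
        return None
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # skip blank lines, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("'\"")
    return None


token = read_env_value(Path(".env"), "SDS_SECRET_TOKEN")
if not token:
    print("SDS_SECRET_TOKEN is missing or empty; edit .env before creating the client.")
else:
    print(f"SDS_SECRET_TOKEN is set ({len(token)} characters).")
```

Printing only the token length avoids leaking the secret into the notebook output.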


In [3]:
from spectrumx import Client
from pathlib import Path

SDS_HOST = "sds.crc.nd.edu"

sds = Client(
    host=SDS_HOST,
    env_file=Path(".env"),  # default
)


This version of the SDK is in early development. Expect breaking changes in the future.


### File uploads


In [4]:
# when in dry-run (default), no changes are made to the SDS or the local filesystem
# to enable the changes, set dry_run to False, as in:
sds.dry_run = False


Dry-run DISABLED: modifications are now possible.


In [5]:
# authenticate using either the token from
# the .env file or in the config passed in

# if you see errors here, check that your .env file has the correct token, then
# recreate the SDS client by re-running the cell that creates the `sds` object

sds.authenticate()


In [6]:
# this local_dir has some sample data we can upload to SDS
local_dir: Path = Path("./data-samples/drf/westford-vpol")

!tree "{local_dir}"


data-samples/drf/westford-vpol
└── cap-2024-06-27T14-00-00
    ├── 2024-06-27T14-00-00
    │   ├── index.html.tmp
    │   ├── rf@1719499740.000.h5
    │   ├── rf@1719499740.125.h5
    │   ├── rf@1719499740.250.h5
    │   ├── rf@1719499740.375.h5
    │   ├── rf@1719499740.500.h5
    │   ├── rf@1719499740.625.h5
    │   ├── rf@1719499740.750.h5
    │   ├── rf@1719499740.875.h5
    │   ├── rf@1719499741.000.h5
    │   ├── rf@1719499741.125.h5
    │   ├── rf@1719499741.250.h5
    │   ├── rf@1719499741.375.h5
    │   ├── rf@1719499741.500.h5
    │   ├── rf@1719499741.625.h5
    │   ├── rf@1719499741.750.h5
    │   └── rf@1719499741.875.h5
    ├── drf_properties.h5
    ├── index.html.tmp
    └── metadata
        ├── 2024-06-27T14-00-00
        │   ├── index.html.tmp
        │   └── metadata@1719499588.h5
        ├── dmd_properties.h5
        └── index.html.tmp

5 directories, 23 files


#### Upload capture files to SDS

In [7]:
# this is the path under your user directory in SDS, where those files will be uploaded
sds_reference_path = "/westford-vpol"

# we can optionally increase the logging to see what's happening
#   - note this might cause progress bars to break visually.
# spectrumx.enable_logging()

# make sure dry-run is disabled
sds.dry_run = False


Dry-run DISABLED: modifications are now possible.


In [8]:
# upload the local files
upload_results = sds.upload(
    local_path=local_dir,  # may be a single file or a directory
    sds_path=sds_reference_path,  # files will be created under this virtual directory
    verbose=True,  # shows a progress bar (default)
)


Uploading: 23.0files [00:03, 6.35files/s]


#### Checking upload results

The upload operation will NOT raise an exception if some files fail to upload,
so it's important to check its results for any errors returned.

This gives you more control over how you handle errors, such as retrying failed uploads or logging them for later review.


In [9]:
from spectrumx.models.files import File
from spectrumx.errors import Result

# successful results, in this context, wrap a `File` object
# let's filter the successful results by:
#   1. Evaluating their 'truthiness' (truth-y <=> success); and
#   2. "Unwrapping" them with `res()` to get the `File` object:
successful_files: list[File] = [res() for res in upload_results if res]

# failed results wrap an exception.
# NOTE: unwrapping failed results (`res()`) will raise their exception, so we just filter them:
failed_results: list[Result] = [res for res in upload_results if not res]

# optionally take action for failed uploads
if failed_results:
    print(f"{len(failed_results)} files failed to upload:")
    for res in failed_results:
        print(f"\t- {res.error_info}")

print(f"{len(successful_files)} files uploaded successfully:")

for file_uploaded in successful_files:
    print(f"\t- {file_uploaded.name:>30} of {file_uploaded.size} B")


23 files uploaded successfully:
	-              drf_properties.h5 of 2232 B
	-                 index.html.tmp of 624 B
	-                 index.html.tmp of 670877 B
	-           rf@1719499740.000.h5 of 1257608 B
	-           rf@1719499740.125.h5 of 1257608 B
	-           rf@1719499740.250.h5 of 1257608 B
	-           rf@1719499740.375.h5 of 1257608 B
	-           rf@1719499740.500.h5 of 1257608 B
	-           rf@1719499740.625.h5 of 1257608 B
	-           rf@1719499740.750.h5 of 1257608 B
	-           rf@1719499740.875.h5 of 1257608 B
	-           rf@1719499741.000.h5 of 1257608 B
	-           rf@1719499741.125.h5 of 1257608 B
	-           rf@1719499741.250.h5 of 1257608 B
	-           rf@1719499741.375.h5 of 1257608 B
	-           rf@1719499741.500.h5 of 1257608 B
	-           rf@1719499741.625.h5 of 1257608 B
	-           rf@1719499741.750.h5 of 1257608 B
	-           rf@1719499741.875.h5 of 1257608 B
	-              dmd_properties.h5 of 4056 B
	-                 index.html.tmp of 52
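
Since failed uploads come back as falsy results instead of raising, one way to act on them is a bounded retry loop. The sketch below shows the control flow only: `DemoResult` is a stand-in that mimics the behaviour described above (truthy on success, call to unwrap, raises on failure), and `flaky_upload` and `retry_until_done` are hypothetical names, not SDK functions.

```python
from typing import Any, Callable, Optional


class DemoResult:
    """Stand-in for illustration (NOT the SDK's Result class)."""

    def __init__(self, value: Any = None, error: Optional[Exception] = None) -> None:
        self._value = value
        self._error = error

    def __bool__(self) -> bool:
        return self._error is None  # truthy means success

    def __call__(self) -> Any:
        if self._error is not None:
            raise self._error  # unwrapping a failure raises its exception
        return self._value


def retry_until_done(
    items: list, operation: Callable[[Any], DemoResult], max_attempts: int = 3
) -> tuple:
    """Runs `operation` per item, retrying only the items that failed."""
    done: list = []
    pending = list(items)
    for _ in range(max_attempts):
        if not pending:
            break
        still_failing = []
        for item in pending:
            result = operation(item)
            if result:
                done.append(result())
            else:
                still_failing.append(item)
        pending = still_failing
    return done, pending


attempts: dict = {}


def flaky_upload(name: str) -> DemoResult:
    """Simulated upload: 'b.h5' fails on its first attempt only."""
    attempts[name] = attempts.get(name, 0) + 1
    if name == "b.h5" and attempts[name] == 1:
        return DemoResult(error=ConnectionError("simulated network failure"))
    return DemoResult(value=name)


uploaded, gave_up = retry_until_done(["a.h5", "b.h5", "c.h5"], flaky_upload)
print(f"uploaded={uploaded} gave_up={gave_up}")
```

With the real client, `operation` would wrap whatever call re-uploads a single path; capping the attempts keeps a persistent failure from looping forever.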

#### Where are the uploaded files?

In [10]:
for file_uploaded in successful_files:
    print(f"\t- {file_uploaded.path!s:>30}")


	- /westford-vpol/cap-2024-06-27T14-00-00/drf_properties.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/index.html.tmp
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/index.html.tmp
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.000.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.125.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.250.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.375.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.500.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.625.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.750.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499740.875.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00/rf@1719499741.000.h5
	- /westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-

Notice they're all under the `/westford-vpol/` directory: the `sds_reference_path` we set earlier.
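
The mapping visible in the output above can be reproduced with `pathlib`: each file's path relative to the local directory is appended to the SDS reference path. This is a sketch of the observed layout, not the SDK's implementation, and `expected_sds_path` is a hypothetical helper.

```python
from pathlib import Path, PurePosixPath


def expected_sds_path(local_file: Path, local_root: Path, sds_root: str) -> PurePosixPath:
    """Joins the file's path relative to `local_root` onto `sds_root`."""
    relative = local_file.relative_to(local_root)
    # SDS paths are shown POSIX-style, so build the result as a PurePosixPath
    return PurePosixPath(sds_root).joinpath(*relative.parts)


local_root = Path("data-samples/drf/westford-vpol")
local_file = local_root / "cap-2024-06-27T14-00-00" / "drf_properties.h5"
sds_path = expected_sds_path(local_file, local_root, "/westford-vpol")
print(sds_path)
```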

### File Downloads

Maybe you'd like to download the files to a different machine. Here we'll go through
this process:


In [11]:
# download the files from an SDS directory
local_downloads = Path("sds-downloads")
download_results = sds.download(
    from_sds_path=sds_reference_path,  # files will be downloaded from this virtual dir
    to_local_path=local_downloads,  # download to this location (it may be created)
    overwrite=True,  # overwrite existing local files (by default they are not overwritten)
    verbose=True,  # shows a progress bar (default)
)

!tree "sds-downloads"


Downloading: 'index.html.tmp': 100%|██████████| 23.0/23.0 [00:16<00:00, 1.37files/s]

sds-downloads
└── files
    └── sds@parza.dev
        └── westford-vpol
            └── cap-2024-06-27T14-00-00
                ├── 2024-06-27T14-00-00
                │   ├── index.html.tmp
                │   ├── rf@1719499740.000.h5
                │   ├── rf@1719499740.125.h5
                │   ├── rf@1719499740.250.h5
                │   ├── rf@1719499740.375.h5
                │   ├── rf@1719499740.500.h5
                │   ├── rf@1719499740.625.h5
                │   ├── rf@1719499740.750.h5
                │   ├── rf@1719499740.875.h5
                │   ├── rf@1719499741.000.h5
                │   ├── rf@1719499741.125.h5
                │   ├── rf@1719499741.250.h5
                │   ├── rf@1719499741.375.h5
                │   ├── rf@1719499741.500.h5
                │   ├── rf@1719499741.625.h5
                │   ├── rf@1719499741.750.h5
                │   └── rf@1719499741.875.h5
     




In [12]:
from spectrumx.models.files import File

# Once again, we have a list of results that may either wrap
#   a `File` object (success) or an exception (failure)

# let's filter the successful results and unwrap them:
successful_files: list[File] = [res() for res in download_results if res]

# then just filter the failed results:
failed_results: list[Result] = [res for res in download_results if not res]

# optionally take action for failed downloads
if failed_results:
    print(f"{len(failed_results)} files failed to download:")
    for res in failed_results:
        print(f"\t- {res.error_info}")

print(f"{len(successful_files)} files downloaded successfully:")

for file_downloaded in successful_files:
    print(
        f"\t- {file_downloaded.uuid}: {file_downloaded.name} of {file_downloaded.size} B"
    )


23 files downloaded successfully:
	- 7ade2030-4f21-44e5-8f6d-7053c4d519d4: index.html.tmp of 670877 B
	- 2aa82596-9b02-4cbd-933b-a9597990bdcd: rf@1719499740.000.h5 of 1257608 B
	- ac2bbd82-786b-4173-a918-34869c897284: rf@1719499740.125.h5 of 1257608 B
	- e2fb3490-f15d-4ccd-8511-90316e3425c1: rf@1719499740.250.h5 of 1257608 B
	- e38ecb28-2855-42f7-b2ab-28ced3691914: rf@1719499740.375.h5 of 1257608 B
	- 967ffd61-097b-4491-8dbf-6d915152fd32: rf@1719499740.500.h5 of 1257608 B
	- e427952c-a402-4d83-bd39-979a720eea65: rf@1719499740.625.h5 of 1257608 B
	- f03b3477-3b1e-40f1-bc5c-d2e0e381dfa2: rf@1719499740.750.h5 of 1257608 B
	- 0a574fbd-a857-4511-b411-e54af1782698: rf@1719499740.875.h5 of 1257608 B
	- 51ccbc22-c8fa-4bd7-9e1c-e53c76bfece5: rf@1719499741.000.h5 of 1257608 B
	- cc104469-3b7c-4eea-b5bc-4c4f0da838c5: rf@1719499741.125.h5 of 1257608 B
	- 9696974f-21b8-4ee4-826e-2673dd6eb372: rf@1719499741.250.h5 of 1257608 B
	- ec391eab-ceac-4887-971f-910436098b08: rf@1719499741.375.h5 of 1257608 

Notice that file paths (directory + name) may change and are not guaranteed to be unique, but UUIDs
persist. Use a file's UUID to reference a specific file version in SDS.
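
To illustrate why the UUID is the safer key, the sketch below indexes two file records that share a path. `DemoFile` is a stand-in dataclass for illustration, not the SDK's `File` model, and the two records are made-up examples.

```python
from collections import defaultdict
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class DemoFile:
    """Stand-in for illustration only; not the SDK's File model."""

    uuid: UUID
    path: str


# two hypothetical versions of a file that ended up at the same path
shared_path = "/westford-vpol/cap-2024-06-27T14-00-00/drf_properties.h5"
files = [DemoFile(uuid=uuid4(), path=shared_path), DemoFile(uuid=uuid4(), path=shared_path)]

# keyed by UUID, every version stays individually addressable
by_uuid = {file.uuid: file for file in files}

# keyed by path, one key may map to several versions
by_path = defaultdict(list)
for file in files:
    by_path[file.path].append(file.uuid)

print(f"{len(by_uuid)} UUID keys, {len(by_path)} path key(s)")
```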


### File Listing

Listing lets you inspect the files under an SDS directory, recursively, without downloading them.


In [13]:
# gets a list of SDS files in a directory, without downloading them
from spectrumx.ops.pagination import Paginator

files_generator: Paginator[File] = sds.list_files(sds_path=sds_reference_path)


In [14]:
print("Iterating over the file generator:")
for file_index, file_entry in enumerate(files_generator):
    print(
        f"\t'dir={file_entry.directory}' | name='{file_entry.name}' | created_at={file_entry.created_at}"
    )
    # do_something_with_the_file(file_entry)
    if file_index > 5:
        print("\t>> Stopping early <<")
        break


Iterating over the file generator:
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00' | name='index.html.tmp' | created_at=2025-06-12 08:23:04
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00' | name='rf@1719499740.000.h5' | created_at=2025-06-12 08:23:04
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00' | name='rf@1719499740.125.h5' | created_at=2025-06-12 08:23:05
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00' | name='rf@1719499740.250.h5' | created_at=2025-06-12 08:23:05
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00' | name='rf@1719499740.375.h5' | created_at=2025-06-12 08:23:06
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00' | name='rf@1719499740.500.h5' | created_at=2025-06-12 08:23:07
	'dir=/files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T1

Note that the returned object is a (lazy) **generator** rather than a list, so it is
exhausted after a single pass over it.

This has some benefits:
  1. Avoids loading all file metadata in **memory** at once;
  2. Reduces the **time** for the first page of files; and
  3. Avoids making more server **requests** than necessary, while spacing them out.
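
The points above can be seen in a small, self-contained simulation. `fake_paginated_listing` is a hypothetical stand-in for a paginated server listing (it does not use the SDK); it counts one simulated request per page.

```python
from typing import Iterator

stats = {"pages_fetched": 0}


def fake_paginated_listing(total: int, page_size: int) -> Iterator[str]:
    """Yields file names page by page, counting one simulated request per page."""
    for start in range(0, total, page_size):
        stats["pages_fetched"] += 1  # a real client would request a page here
        for index in range(start, min(start + page_size, total)):
            yield f"file-{index}"


listing = fake_paginated_listing(total=10, page_size=3)

first_two = [next(listing), next(listing)]
requests_after_two = stats["pages_fetched"]  # only the first page was needed

remaining = list(listing)  # drains the generator, fetching the other pages
second_pass = list(listing)  # the generator is now exhausted

print(first_two, requests_after_two, len(remaining), second_pass)
```

If several passes are needed, collecting the items into a list once (at the cost of holding them all in memory) is the usual workaround.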


In [15]:
# If you need to iterate over the files multiple times, you
# can get the next N files from the generator into a list:
num_files: int = 3
up_to_three_files: list[File] = []
for _ in range(num_files):
    try:
        up_to_three_files.append(next(files_generator))
    except StopIteration:
        break  # no more to list; we have fewer than `num_files` files

print("The next files:")
for file_entry in up_to_three_files:
    print(f"\tProcessing {file_entry.name} of size {file_entry.size} B...")
print()


The next files:
	Processing rf@1719499740.750.h5 of size 1257608 B...
	Processing rf@1719499740.875.h5 of size 1257608 B...
	Processing rf@1719499741.000.h5 of size 1257608 B...
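
The standard library's `itertools.islice` is a compact alternative to the `try`/`next` loop for taking up to N items from an iterator. The sketch below uses a plain range iterator; it assumes only that the listing behaves like an ordinary iterator, which the `next(files_generator)` call above suggests. `take` is a hypothetical helper name.

```python
from itertools import islice
from typing import Iterator


def take(iterator: Iterator, count: int) -> list:
    """Returns up to `count` items, stopping quietly if the iterator runs out."""
    return list(islice(iterator, count))


numbers = iter(range(5))
first_three = take(numbers, 3)
next_three = take(numbers, 3)  # only two items are left
print(first_three, next_three)
```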



## Operating on RF Captures

Captures are groups of radio-frequency files that make sense together. For example, a
group of Digital RF files that belong to the same channel may be part of a capture.


In [16]:
capture_dir = Path("./data-samples/drf/westford-vpol/").resolve()
channel_name = "cap-2024-06-27T14-00-00"

!tree "{capture_dir}"


/home/lucas/demos/crc/spectrumx/spx-events/demos/data_system/data-samples/drf/westford-vpol
└── cap-2024-06-27T14-00-00
    ├── 2024-06-27T14-00-00
    │   ├── index.html.tmp
    │   ├── rf@1719499740.000.h5
    │   ├── rf@1719499740.125.h5
    │   ├── rf@1719499740.250.h5
    │   ├── rf@1719499740.375.h5
    │   ├── rf@1719499740.500.h5
    │   ├── rf@1719499740.625.h5
    │   ├── rf@1719499740.750.h5
    │   ├── rf@1719499740.875.h5
    │   ├── rf@1719499741.000.h5
    │   ├── rf@1719499741.125.h5
    │   ├── rf@1719499741.250.h5
    │   ├── rf@1719499741.375.h5
    │   ├── rf@1719499741.500.h5
    │   ├── rf@1719499741.625.h5
    │   ├── rf@1719499741.750.h5
    │   └── rf@1719499741.875.h5
    ├── drf_properties.h5
    ├── index.html.tmp
    └── metadata
        ├── 2024-06-27T14-00-00
        │   ├── index.html.tmp
        │   └── metadata@1719499588.h5
        ├── dmd_properties.h5
   

Upload these files to SDS, if you haven't already:


In [17]:
# (if not already done)
# _ = sds.upload(
#     local_path=capture_dir,
#     sds_path=sds_reference_path,
#     verbose=True,
# )


### Creating a Capture


In [None]:
from spectrumx.models.captures import Capture
from spectrumx.models.captures import CaptureType
from pathlib import PurePosixPath

drf_capture: Capture = sds.captures.create(
    capture_type=CaptureType.DigitalRF,
    channel=channel_name,
    top_level_dir=PurePosixPath(sds_reference_path),
)


In [None]:
from typing import Any

print(f"Digital-RF capture created with ID {drf_capture.uuid}")
print(drf_capture)


def pretty_print_dict(data: dict[str, Any], indent: int = 0) -> None:
    """Recursively pretty-prints a dict with indentation."""
    for key, value in data.items():
        prefix = "    " * indent
        if isinstance(value, dict):
            print(f"{prefix}{key}:")
            pretty_print_dict(value, indent=indent + 1)
        else:
            print(f"{prefix}{key!s:<37}: {value}")
    print()


pretty_print_dict(data=drf_capture.capture_props, indent=1)


Digital-RF capture created with ID 21ef8605-28d9-4266-beab-2429d327d965
Capture(uuid=21ef8605-28d9-4266-beab-2429d327d965, type=drf, files=19, created_at=2025-06-12 08:50:56.038266-04:00)
    samples_per_second                   : 2500000
    start_bound                          : 1719499740
    end_bound                            : 1719499741
    is_complex                           : True
    is_continuous                        : True
    epoch                                : 1970-01-01T00:00:00Z
    digital_rf_time_description          : All times in this format are in number of samples since the epoch in the epoch attribute.  The first sample time will be sample_rate * UTC time at first sample.  Attribute init_utc_timestamp records this init UTC time so that a conversion to any other time is possible given the number of leapseconds difference at init_utc_timestamp.  Leapseconds that occur during data recording are included in the data.
    digital_rf_version                   : 
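
In this sample, the `start_bound` and `end_bound` values line up with the whole seconds in the `rf@...h5` file names. The sketch below extracts those seconds locally as a quick sanity check. It assumes the `rf@<seconds>.<milliseconds>.h5` naming seen in this capture, says nothing about how SDS itself derives the bounds, and `rf_second_bounds` is a hypothetical helper.

```python
import re
from typing import Iterable, Optional, Tuple

# matches names such as "rf@1719499740.125.h5" (pattern assumed from this sample)
RF_NAME = re.compile(r"^rf@(\d+)\.(\d{3})\.h5$")


def rf_second_bounds(names: Iterable[str]) -> Optional[Tuple[int, int]]:
    """Returns (min, max) of the whole-second stamps found in rf@ file names."""
    seconds = []
    for name in names:
        match = RF_NAME.match(name)
        if match:
            seconds.append(int(match.group(1)))
    if not seconds:
        return None
    return min(seconds), max(seconds)


sample_names = [
    "rf@1719499740.000.h5",
    "rf@1719499741.875.h5",
    "drf_properties.h5",  # ignored: not an rf@ data file
    "index.html.tmp",  # ignored
]
bounds = rf_second_bounds(sample_names)
print(bounds)
```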

### Listing user's Captures


In [19]:
cap_list: list[Capture] = sds.captures.listing(capture_type=CaptureType.DigitalRF)
if not cap_list:
    print("No captures found.")
for capture in cap_list:
    print(f"Capture {capture.uuid}: {capture.channel} ({capture.capture_props})")


Capture 21ef8605-28d9-4266-beab-2429d327d965: cap-2024-06-27T14-00-00 ({'samples_per_second': 2500000, 'start_bound': 1719499740, 'end_bound': 1719499741, 'is_complex': True, 'is_continuous': True, 'epoch': '1970-01-01T00:00:00Z', 'digital_rf_time_description': 'All times in this format are in number of samples since the epoch in the epoch attribute.  The first sample time will be sample_rate * UTC time at first sample.  Attribute init_utc_timestamp records this init UTC time so that a conversion to any other time is possible given the number of leapseconds difference at init_utc_timestamp.  Leapseconds that occur during data recording are included in the data.', 'digital_rf_version': '2.6.0', 'uuid_str': '15f2dcff1f2a43278a0f5175a356df3a', 'center_frequencies': [1024000000.4842604], 'custom_attrs': {'num_subchannels': 1, 'index': 4298748970000000, 'processing/channelizer_filter_taps': [], 'processing/decimation': 1, 'processing/interpolation': 1, 'processing/resampling_filter_taps': [

"Reading" a specific capture will give us the files associated with it:


In [20]:
from uuid import UUID

if cap_list:
    selected_capture: Capture = cap_list[0]
    assert isinstance(selected_capture.uuid, UUID), (
        "Capture UUID should be a UUID object"
    )
    selected_capture = sds.captures.read(capture_uuid=selected_capture.uuid)
    for cap_file in selected_capture.files:
        # print each file's UUID, name, and SDS directory
        print(f"\t{cap_file.uuid}: {cap_file.name} in {cap_file.directory}")
else:
    print("No captures available to read.")


	f8aea719-7b8e-4adb-a122-30b8ad71ed5b: metadata@1719499588.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/metadata/2024-06-27T14-00-00
	23d257c5-e784-4264-8951-c02cf6bc3237: drf_properties.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00
	2aa82596-9b02-4cbd-933b-a9597990bdcd: rf@1719499740.000.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00
	ac2bbd82-786b-4173-a918-34869c897284: rf@1719499740.125.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00
	e2fb3490-f15d-4ccd-8511-90316e3425c1: rf@1719499740.250.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00
	e38ecb28-2855-42f7-b2ab-28ced3691914: rf@1719499740.375.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00
	967ffd61-097b-4491-8dbf-6d915152fd32: rf@1719499740.500.h5 in /files/sds@parza.dev/westford-vpol/cap-2024-06-27T14-00-00/2024-06-27T14-00-00
	e427952c-a402-4d8