# Roman Supplementary & Telemetry Data Retrieval

------------------

## Learning Goals

By the end of this tutorial, you will:

- Understand how to programmatically access (count, list, or download) Roman Supplementary and Telemetry data from the Calibration and Supplemental Search Interface (CaSSI), by ingestion date, filetype, or file name.


## Table of Contents

* [Introduction](#Introduction)
* [Imports](#Imports)
* [Helper Scripts](#Helper-Scripts)
* [Querying CaSSI](#Querying-CaSSI)
    * [Querying with start & end times](#Querying-with-start-&-end-times)
    * [Querying with filetype(s)](#Querying-with-filetype(s))
    * [Querying by filename(s)](#Querying-by-filename(s))
* [Downloading Data](#Downloading-Data)
* [Example companion script usage](#Example-companion-script-usage)
* [Additional resources](#Additional-Resources)


## Introduction

This tutorial demonstrates how to retrieve Roman Supplementary & Telemetry data and use it in the context of a Python session.

The [Roman Supplementary & Telemetry Data](https://outerspace.stsci.edu/spaces/RAPD/pages/308839208/Obtain+Supplemental+and+Telemetry+Data) tutorial in the Roman Pre-Launch documentation (restricted access pre-launch) describes how to access these supplementary & telemetry data available from MAST through the Calibration and Supplemental Search Interface (CaSSI). These data may be searched by ingestion date, file type, or filename.

The supplementary and telemetry data available from CaSSI cover a variety of data products, including Level 0 products for WFI/CGI, housekeeping ancillary telemetry data, observatory predicted/observed ephemeris, and message logs. Because of this variety, this tutorial focuses on how to query CaSSI for file counts or lists and how to download data from CaSSI. This notebook assumes that if you are searching CaSSI, you know (generally) what you are looking for; it does not demonstrate how to work with the retrieved data.
 

Note that this folder includes a companion script. Once you have completed the tutorial, the script offers a compact, customizable way to query, count, or download supplementary data. Examples using its various options are given at the end of this notebook.

<div class="alert alert-info" style="color:black; border-color:teal;">
Please note that pre-launch, <b>the MAST Roman CaSSI Search API requires authorization to search and download Roman data products.</b> Before we get started, please ensure that:
    
- ***you are authorized to search and download Roman engineering data from MAST.*** If you are not authorized but you think you should be, email the helpdesk at archive@stsci.edu.
- ***you have a [MAST token](https://auth.mast.stsci.edu/token) set to the environment variable*** **`MAST_API_TOKEN`**
</div>

    
<div class="alert alert-warning" style="color:black; background-color:#ffc5c5; border-color:red;">
<b>Note:</b> At this time, Roman data are not accessible from the cloud. Downloads will come from MAST servers, and <b>may be large</b>. Download with caution.
</div>
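
The token requirement noted above can be checked before any request is sent. The following is a minimal sketch; the helper name `get_mast_token` is hypothetical and is not part of the notebook or the companion script.

```python
import os


def get_mast_token(env=os.environ, name="MAST_API_TOKEN"):
    """Return the MAST token stored in ``name``, or raise if it is unset or empty."""
    # ``env`` is any mapping of variable names to values (os.environ by default).
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"{name} is not set; create a token at https://auth.mast.stsci.edu/token "
            "and export it before running the queries."
        )
    return token
```

Calling `get_mast_token()` at the top of a session fails early with a readable message, instead of surfacing later as an authorization error from the web service.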


## Imports

This notebook uses the following packages to retrieve data:

* `os` to get the MAST API token environment variable
* `requests` to complete the web requests for search & download
* `datetime` for manipulating datetime strings
* `pathlib` to create a directory for the downloaded files
* `numpy` and `pandas` for convenient data manipulation

In [None]:
import os
import argparse
import requests
from pathlib import Path
from datetime import datetime
import numpy as np
from pandas import DataFrame

## Helper Scripts

Below are some functions to connect to the CaSSI web service and query/retrieve data files. These will be used later in this tutorial.

In [None]:
cassi_url_search = "https://mast.stsci.edu/cassi/api/v0.1/roman/search/Eng"
cassi_url_download = "https://mast.stsci.edu/cassi/api/v0.1/download/roman/eng"

def query_cassi(start_date, end_date, limit, filetype, filename, token):
    """
    Query Roman CASSI API for supplemental and telemetry data and return the response.
    
    Parameters
    ----------
    start_date : str
        Ingestion start date in YYYY-MM-DD[THH:MM:SS] format (24hr)
        
    end_date : str or None
        Optional ingestion end date in YYYY-MM-DD[THH:MM:SS] format (24hr)
        Note that omitting a time will default to YYYY-MM-DDT00:00:00.

    limit : int
        Limit for number of results

    filetype : str or None
        Optional filetype filter constraints (quoted, with commas separating multiple values).
        Defaults to None (no filetype filtering).

    filename : str or None
        Optional filename (e.g., "CGI_00011850") filter constraint 
        (multiple values can be included, but the list must be comma separated and quoted).
        Defaults to None (no filename filtering).
        
    token: str
        MAST API token
        
    Returns
    -------
    results : pandas.DataFrame
       Metadata for the matching files, with duplicate rows removed
    """

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"token {token}"
    }
    if end_date:
        date_range = f">={start_date},<={end_date}"
    else:
        date_range = f">={start_date}"
    payload = {
        "conditions": [
            {"source": "Eng"},
            {"dataGroup": "Eng"},
            {"ingestCompletionDate": date_range}
        ],
        "limit": limit,
        "select_cols": [
            "fileType", "archiveFileName", "startTime", "endTime", "ingestCompletionDate"
        ]
    }

    if filetype is not None:
        payload["conditions"].append({"fileType": filetype})
        
    if filename is not None:
        payload["conditions"].append({"archiveFileName": filename})

    response = requests.post(cassi_url_search, headers=headers, json=payload)
    response.raise_for_status()

    # Get data results from the request response
    data = response.json()
    results = DataFrame(data.get("results", []))

    # Presently there are duplicates in the responses; drop these.
    # An empty result set has no columns, so skip it to avoid a KeyError.
    if not results.empty:
        results.drop_duplicates(
            subset=[
                'checksum', 'fileType',
                'ingestCompletionDate', 'archiveFileName',
                'search_key'
            ],
            inplace=True, ignore_index=True
        )

    return results

    
def download_cassi_files(results, folder, token):
    """
    Download files to directory
    
    Parameters
    ----------
    results : DataFrame
        Metadata query results
    folder: str
        Directory (relative to cwd) in which to write output files
    token: str
        MAST API token
        
    Returns
    -------
    int
       Overall status: 0 if every file downloaded successfully,
       1 if any URL was missing or any download failed
    """
    
    Path(folder).mkdir(exist_ok=True)
    
    headers = {
        "Content-Type": "application/json",
        "Authorization": f'token {token}'
    }

    status = 0

    for ind in range(len(results)):
        row = results.iloc[ind]
        url = row['distribution'].get('url', {}).get('mast', None)
        if url is None:
            print("***Missing URL***")
            status = 1
        else:
            
            payload = {"product_url": url}

            try:
                response = requests.post(cassi_url_download, headers=headers, json=payload)
                response.raise_for_status()

                with open(f"{folder}/{row['archiveFileName']}", "wb") as f:
                    f.write(response.content)

            except requests.exceptions.HTTPError:
                print("  ***Error downloading file***")
                status = 1
         
    return status  

## Querying CaSSI

To begin, we will query CaSSI to find files matching specific search criteria.

Available search criteria include the ingestion start and end date, the file type, and the file name.
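
To make the mapping from search criteria to request body concrete, the sketch below isolates the `conditions` list that `query_cassi()` builds, without contacting the service. The function name `build_conditions` is hypothetical; the keys and the `>=`/`<=` range syntax are copied from the helper above.

```python
def build_conditions(start_date, end_date=None, filetype=None, filename=None):
    """Return the list of search conditions, mirroring the query helper above."""
    # The ingestion date range is a single string: ">=start" or ">=start,<=end".
    date_range = f">={start_date}"
    if end_date:
        date_range += f",<={end_date}"
    conditions = [
        {"source": "Eng"},
        {"dataGroup": "Eng"},
        {"ingestCompletionDate": date_range},
    ]
    # Optional filters are appended only when provided.
    if filetype is not None:
        conditions.append({"fileType": filetype})
    if filename is not None:
        conditions.append({"archiveFileName": filename})
    return conditions


print(build_conditions("2025-10-04T00:00:00", "2025-10-04T23:59:59",
                       filetype="Science CGI Level 0"))
```

Printing the conditions before sending a query can be a quick way to confirm that the filters are what you intended.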

### Querying with start & end times

Using the above helper function `query_cassi()`, we will start by searching for all supplementary & telemetry data ingested during a specified time frame.

The start and end ingestion times are in UTC, and must be specified in `YYYY-MM-DD[THH:MM:SS]` format, with **T** as a literal character.
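
As an illustration of this format, the following sketch builds start and end strings that span one full day, using only the standard library. The helper name `day_window` is hypothetical.

```python
from datetime import date, datetime, time


def day_window(day):
    """Return (start, end) strings covering one full day, given 'YYYY-MM-DD'."""
    d = date.fromisoformat(day)
    start = datetime.combine(d, time.min)        # 00:00:00
    end = datetime.combine(d, time(23, 59, 59))  # 23:59:59
    fmt = "%Y-%m-%dT%H:%M:%S"
    return start.strftime(fmt), end.strftime(fmt)


start, end = day_window("2025-10-04")
print(start, end)  # 2025-10-04T00:00:00 2025-10-04T23:59:59
```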

Here, we will select start and end times of 00:00:00 on 2025 Oct 4 to 23:59:59 on 2025 Oct 4.

This query is constructed and executed as follows:

*(Note that here we explicitly provide the time in both values, as omitting the time component defaults to `00:00:00`.)*

In [None]:
start_time = "2025-10-04T00:00:00"
end_time =   "2025-10-04T23:59:59"

mast_token = os.getenv("MAST_API_TOKEN")  # Fetch MAST token

results = query_cassi(
    start_time,
    end_time,
    50000,       # limit to number of results
    None,        # filetype
    None,        # filename
    mast_token
)

In [None]:
results

This query returns metadata for 39 files, with a variety of file types.
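
To summarize how many files of each type a query returned, one option is a simple tally of the `fileType` values. The sketch below uses only the standard library and toy input; the helper name `count_by_filetype` is hypothetical.

```python
from collections import Counter


def count_by_filetype(filetypes):
    """Count occurrences of each file type in an iterable of strings."""
    # most_common() orders the tally from most to least frequent.
    return dict(Counter(filetypes).most_common())


# Toy values for illustration only; not actual query output.
toy_filetypes = [
    "Science CGI Level 0",
    "Science WFI Level 0",
    "Science CGI Level 0",
]
print(count_by_filetype(toy_filetypes))
# {'Science CGI Level 0': 2, 'Science WFI Level 0': 1}
```

With the query results in hand, the same tally could be obtained with `count_by_filetype(results["fileType"].tolist())`.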


### Querying with filetype(s)

Now, let's rerun the query above, this time restricting the file types to Science WFI/CGI Level 0 files only.

File types must be specified exactly, with multiple types separated by commas and the entire list wrapped in quotes. *(It is convenient to use a first search to obtain the exact, correctly formatted file type strings, as above.)*

In [None]:
start_time = "2025-10-04T00:00:00"
end_time =   "2025-10-04T23:59:59"
filetype = "Science CGI Level 0,Science WFI Level 0"

mast_token = os.getenv("MAST_API_TOKEN")  # Fetch MAST token

results = query_cassi(
    start_time,
    end_time,
    50000,       # limit to number of results
    filetype,    # filetype
    None,        # filename
    mast_token
)

In [None]:
results

With this added restriction, CaSSI returns only 10 file matches.
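
The comma-separated filter string used in this query can also be assembled from a Python list, which helps avoid stray spaces or separators. This is a minimal sketch; the helper name `join_filter_values` is hypothetical, and it assumes that individual values never contain a comma themselves.

```python
def join_filter_values(values):
    """Join individual filter values into one comma-separated string."""
    cleaned = [v.strip() for v in values]
    for v in cleaned:
        # Empty values or embedded commas would change the meaning of the list.
        if not v or "," in v:
            raise ValueError(f"Invalid filter value: {v!r}")
    return ",".join(cleaned)


filetype_filter = join_filter_values(["Science CGI Level 0", "Science WFI Level 0"])
print(filetype_filter)  # Science CGI Level 0,Science WFI Level 0
```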


### Querying by filename(s)

Finally, we will query by file name to obtain only the metadata on specific files.

In [None]:
start_time = "2025-10-04T00:00:00"
end_time =   None
filetype =   None
filename =   "CGI_00011850,CGI_00011851"

mast_token = os.getenv("MAST_API_TOKEN")  # Fetch MAST token

results = query_cassi(
    start_time,
    end_time,
    50000,       # limit to number of results
    filetype,    # filetype
    filename,    # filename
    mast_token
)

In [None]:
results

Note: Wildcard, exclusion, exact matches, and multiple selections are also possible (as described in the [MAST documentation](https://outerspace.stsci.edu/spaces/MASTDfOCS/pages/217350619/Core+Search+Parameters#CoreSearchParameters-param_strings)).

## Downloading Data

To download Roman supplementary & telemetry data, we will use the second helper function, `download_cassi_files()`.

This function takes the following parameters as input:
- `results`: the metadata query results as returned by the `query_cassi()` helper function.
- `folder`: the folder into which the file(s) should be downloaded.
- `token`: the MAST API token
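
The helper writes each file from `response.content`, which holds the whole response in memory. For large files, an alternative is to write the body in chunks. The sketch below shows only the file-writing half of that approach; the name `write_chunks` is hypothetical, and this is a variation on the helper, not what it currently does.

```python
from pathlib import Path


def write_chunks(chunks, destination):
    """Write an iterable of byte chunks to ``destination``; return bytes written."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with open(destination, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written
```

With `requests`, the chunks could come from `response.iter_content(chunk_size=1024 * 1024)` after passing `stream=True` to `requests.post(...)`. Whether this helps with the CaSSI download endpoint in practice has not been tested here.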


Here we will download the files from the last metadata query, restricting to just the filenames `CGI_00011850`, `CGI_00011851`.

*(Note these files total ~297 MB.)* 

In [None]:
status = download_cassi_files(results, "cassi-data", mast_token)

If all file(s) download successfully, status will be 0 (with 1 denoting errors in one or more downloads).

In [None]:
print(status)

In [None]:
ls -lhtr cassi-data

The download folder shows both files downloaded successfully.
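
The same check can be made from Python, which is convenient on systems without `ls`. The sketch below reports any expected files that are absent or empty; the helper name `missing_files` is hypothetical.

```python
from pathlib import Path


def missing_files(expected_names, folder):
    """Return the expected file names that are absent or empty in ``folder``."""
    folder = Path(folder)
    missing = []
    for name in expected_names:
        path = folder / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


# An empty list means every expected file is present and non-empty.
print(missing_files(["CGI_00011850", "CGI_00011851"], "cassi-data"))
```

Because `download_cassi_files()` names each output after `archiveFileName`, the expected names could also be taken from the query results, for example `missing_files(results["archiveFileName"], "cassi-data")`.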

## Example companion script usage

The companion script provides the functionality demonstrated above, as well as some additional features to aid in examining the metadata queries directly from the command line.

These usage patterns, with example output in certain cases, are detailed below (with code rendering, which will not be executed as part of this notebook).

**List all command line options**
```console
$ python cassi_supptelem_query_download_script.py --help
```

*(output omitted)*

**Perform a first query and list filetypes**

```console
$ python cassi_supptelem_query_download_script.py -s 2025-10-04 --list-filetypes

fileTypes: ['Calibrated Engineering Data', 'FSW Event Message Logs', 'Guide Window Level 0', 'Housekeeping Ancillary Telemetry Data', 'Observatory Predicted Ephemeris', 'Science CGI Level 0', 'Science WFI Level 0']
```

**Search for files by start/end ingestion date and list metadata**
```console
$ python cassi_supptelem_query_download_script.py -s 2025-10-04 -e 2025-10-04
```

*(output omitted)*

**Search for files by start/end ingestion date & file type and list metadata**
```console
$ python cassi_supptelem_query_download_script.py -s 2025-10-04 -e 2025-10-04 --filetype "Science CGI Level 0,Science WFI Level 0"

   archiveFileName             fileType         ingestCompletionDate
0     CGI_00011850  Science CGI Level 0  2025-10-04T00:01:27.8970000
1     CGI_00011851  Science CGI Level 0  2025-10-04T00:01:30.9830000
2  WFISCI_00016592  Science WFI Level 0  2025-10-04T00:03:35.2380000
3  WFISCI_00016593  Science WFI Level 0  2025-10-04T00:03:53.7030000
4     CGI_00011852  Science CGI Level 0  2025-10-04T00:31:31.2340000
5  WFISCI_00016596  Science WFI Level 0  2025-10-04T00:33:00.1050000
6  WFISCI_00016594  Science WFI Level 0  2025-10-04T00:33:00.4000000
7  WFISCI_00016595  Science WFI Level 0  2025-10-04T00:33:42.3820000
8     CGI_00011853  Science CGI Level 0  2025-10-04T01:31:49.1250000
9     CGI_00011854  Science CGI Level 0  2025-10-04T03:01:30.7890000
```

**Search for files by start/end ingestion date & file name and list metadata**
```console
$ python cassi_supptelem_query_download_script.py -s 2025-10-04 -e 2025-10-04 --filename "CGI_00011850,CGI_00011851"

  archiveFileName             fileType         ingestCompletionDate
0    CGI_00011850  Science CGI Level 0  2025-10-04T00:01:27.8970000
1    CGI_00011851  Science CGI Level 0  2025-10-04T00:01:30.9830000
```


**Count files selected by start/end ingestion date and file type**
```console
$ python cassi_supptelem_query_download_script.py -s 2025-10-04 -e 2025-10-04 --filetype "Science CGI Level 0,Science WFI Level 0" --count

Total files:               10
Science CGI Level 0:        5
Science WFI Level 0:        5
```


**Download files selected by start/end ingestion date & file name**
```console
$ python cassi_supptelem_query_download_script.py -s 2025-10-04 -e 2025-10-04 --filename "CGI_00011850,CGI_00011851" --download --folder "cassi-data"
```

## Additional Resources
* The [Roman CaSSI interface](https://mast.stsci.edu/cassi/#/roman), with restricted access pre-launch.
* The restricted-access [Roman Supplementary & Telemetry Data](https://outerspace.stsci.edu/spaces/RAPD/pages/308839208/Obtain+Supplemental+and+Telemetry+Data) tutorial in the Roman Pre-Launch documentation (on innerspace). This information will be made public at a later date.

## About this Notebook

**Author(s):** Sedona Price and Zach Claytor, loosely adapted from the [JWST EDB retrieval notebook](https://spacetelescope.github.io/mast_notebooks/notebooks/JWST/Engineering_Database_Retreival/EDB_Retrieval.html) by MAST staff (chiefly Dick Shaw, Peter Forshay, and Bernie Shiao, with additional editing by Thomas Dutkiewicz). <br>
**Keyword(s):** Roman

***
<img style="float: right;" src="https://raw.githubusercontent.com/spacetelescope/style-guides/master/guides/images/stsci-logo.png" alt="Space Telescope Logo" width="200px"/> 

[Return to top of page](#Roman-Supplementary-&-Telemetry-Data-Retrieval)