# Roman Engineering Data Retrieval

------------------

## Learning Goals

By the end of this tutorial, you will:

- Understand how to programmatically access Roman Engineering Database (EDB) products (by mnemonic and date).


## Table of Contents

* [Introduction](#Introduction)
* [Imports](#Imports)
* [Helper Script](#Helper-Script)
* [Downloading Data](#Downloading-Data)
    * [Define the attributes for the mnemonics of interest](#Define-Mnemonic-Parameters) 
    * [Construct the filenames to contain the mnemonic timeseries](#Construct-File-Names)
    * [Call the web service to fetch the data and return files containing the timeseries](#Call-the-Webservice)
    * [Prepare the data for analysis](#Prepare-the-Data-for-Analysis)
* [Visualize the data](#Visualize-the-Data-Tuples)
    * [Split the data into mini-series at time boundaries](#Identify-Subseries-in-the-Data)
    * [Plot the timeseries](#Plot-the-Segmented-Timeseries)
* [Additional resources](#Additional-Resources)


## Introduction

This tutorial demonstrates how to retrieve Roman engineering data and use it in the context of a Python session.

The [Roman Engineering Data](https://outerspace.stsci.edu/spaces/RAPD/pages/301172598/Search+for+Calibrated+Engineering+Data) tutorial in the Roman Pre-Launch documentation (restricted access pre-launch) describes how to access engineering telemetry points stored in the Roman Engineering Database in the form of timeseries. These data may be searched by means of an identifier, or **mnemonic**.

Some quantities of interest require more than one mnemonic (a *tuple*) for meaningful analysis. This tutorial illustrates how to retrieve a tuple of mnemonics and visualize the result. In the following example, timeseries will be retrieved for the mnemonics `SCF_AC_SDR_QBJ_`**n**, with **n** values of 1 through 4, which are the spacecraft orientation quaternion parameters.

For more details on constructing a mnemonic, see the [Roman Engineering Data](https://outerspace.stsci.edu/spaces/RAPD/pages/301172598/Search+for+Calibrated+Engineering+Data) page (and also the [JWST Engineering Data](https://outerspace.stsci.edu/display/MASTDOCS/Engineering+Data) page, for general information on engineering database contents and mnemonic conventions, as similar layouts and conventions are used for both JWST and Roman).

Note that this folder includes a companion script. Once you have completed the tutorial, the script offers a compact, customizable way to download the data.

<div class="alert alert-info" style="color:black; border-color:teal;">
Please note that pre-launch, <b>the MAST Roman EDP Search API requires authorization to search and download Roman data products.</b> Before we get started, please ensure that:
    
- ***you are authorized to search and download Roman engineering data from MAST.*** If you are not authorized but you think you should be, email the helpdesk at archive@stsci.edu
- ***you have a [MAST token](https://auth.mast.stsci.edu/token) set to the environment variable*** **`MAST_API_TOKEN`**
</div>

    
<div class="alert alert-warning" style="color:black; background-color:#ffc5c5; border-color:red;">
<b>Note:</b> At this time, Roman data are not accessible from the cloud. Downloads will come from MAST servers and <b>may be large</b>. Download with caution.
</div>


## Imports

This notebook uses the following packages to retrieve data: 

* `os` to get the MAST API token environment variable, and for handling file separators, i.e. "/" on Unix-like machines and "\\" on Windows
* `urllib` to complete the web request
* `datetime` for manipulating datetime strings
* `pathlib` to create a directory for the downloaded files
* `numpy` and `pandas` for convenient data manipulation

Additional packages are used for data visualization:

* `bokeh` (`output_notebook`, `plotting`, `ColorBar`, `FixedTicker`, `Span`, `palettes.Spectral10`, `linear_cmap`, `gridplot`) for plotting
* `astropy.time.Time` to obtain MJD from ISO time formatted strings

In [None]:
import os
import urllib.error
import urllib.request
from datetime import datetime
from pathlib import Path
import numpy as np
import pandas as pd

from bokeh.io import output_notebook
import bokeh.plotting as bp
from bokeh.models import ColorBar, FixedTicker, Span
from bokeh.palettes import Spectral10 as cm
from bokeh.transform import linear_cmap
from bokeh.layouts import gridplot

from astropy.time import Time  # To get datetimes in MJD

## Helper Script

Below are two functions: the first parses a file name into its mnemonic and start/end times, and the second connects to the EDB web service and retrieves the data files. They will be used later in this tutorial. 

In [None]:
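def get_mnemonic_datetimes_from_filename(fname):
    '''Parse "<mnemonic>_<t_start>_<t_end>.csv" into its query parameters.'''
    # The mnemonic itself contains underscores, so the times are the last two fields.
    splt = fname.split(".")[0].split('_')
    mnemonic = '_'.join(splt[:-2])
    # Note: datetime.fromisoformat accepts the compact yyyymmddThhmmss form only in Python 3.11+.
    s_time = datetime.fromisoformat(splt[-2]).isoformat()
    e_time = datetime.fromisoformat(splt[-1]).isoformat()

    return mnemonic, s_time, e_time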


def download_edb_datafiles(filenames, folder):
    '''
    Download filenames to directory
    
    Parameters
    ----------
    filenames : iterable
        List of string-valued file names to contain the desired mnemonic timeseries
    folder : str
        Directory (relative to cwd) in which to write output files
        
    Returns
    -------
    int
       0 if every file was downloaded, 1 if any download failed
    '''
            
    Path(folder).mkdir(exist_ok=True)
    
    mast_token = os.getenv("MAST_API_TOKEN")
    headers = {
        "Authorization": f'token {mast_token}'
    }

    urlStr = 'https://mast.stsci.edu/edp/api/v0.1/mnemonics/spa/roman/data?mnemonic={}&s_time={}&e_time={}&result_format=csv' 
    status = 0

    for fname in filenames:
        print(
            f"Downloading File: mast:romanedb/{fname}\n",
            f" To: {folder}/{fname}",
        )
        
        mnemonic, s_time, e_time = get_mnemonic_datetimes_from_filename(fname)
        url = urlStr.format(mnemonic, s_time, e_time)
        req = urllib.request.Request(url, headers=headers)
        
        try:
            # Open the URL with the request object and save to file
            with urllib.request.urlopen(req) as response:
                data = response.read().decode('utf-8')
                with open(f"{folder}/{fname}", "w", encoding='utf-8') as f:
                    f.write(data)
        except urllib.error.URLError:
            print("  ***Error downloading file***")
            status = 1
    
    return status  
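
The download function above reads the token with `os.getenv`, which returns `None` when `MAST_API_TOKEN` is unset; in that case the request would be sent with an unusable `Authorization` header. The following is a minimal sketch of an up-front check. The helper name `build_auth_headers` is hypothetical (it is not part of this notebook or of any MAST library); only the header format is taken from the function above.

```python
import os


def build_auth_headers(env_var="MAST_API_TOKEN"):
    """Return the Authorization header dict, or raise if the token is missing.

    Hypothetical helper for illustration; the header format mirrors the one
    used in download_edb_datafiles.
    """
    token = os.getenv(env_var)
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "create a token at https://auth.mast.stsci.edu/token"
        )
    return {"Authorization": f"token {token}"}
```

Failing early in this way may be easier to diagnose than a generic download error for every file.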

## Downloading Data
To download data, you'll need to format the request correctly. That requires defining mnemonics, naming files to match, and then calling the webservice to begin the download.

### Define Mnemonic Parameters

Next, define the parameters of each mnemonic of interest. The parameters are:
* The mnemonic name
* Start time
* End time

The start and end times are in UTC and have a "compact" ISO-8601 formatting: `yyyymmddThhmmss`, where the **T** is a literal character. The definitions can be stored in multiple ways: here they will be stored in a Python dictionary, which could be stored in an external `.yaml` file. In the companion script they are stored in an external `.csv` file.

Since the mnemonics of interest are a tuple, the start/end times are the same: from 00:00:00 on 2027 March 14 to 23:59:59 on 2027 March 16. Define these times first, followed by the full parameter dictionary.

In [None]:
times = { 
         't_start': '20270314T000000',
         't_end':   '20270316T235959'
        }
mnemonics = {
            'SCF_AC_SDR_QBJ_1': times,
            'SCF_AC_SDR_QBJ_2': times,
            'SCF_AC_SDR_QBJ_3': times,
            'SCF_AC_SDR_QBJ_4': times
           }
for m, v in mnemonics.items():
    print(m, v)
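
Because the start and end times are plain strings, a typo would only surface once the web request fails. As a hedged sketch, the compact `yyyymmddThhmmss` format described above can be checked locally with `datetime.strptime`. The names `COMPACT_FMT`, `parse_compact_time`, and `check_time_range` are illustrative and are not used elsewhere in this notebook.

```python
from datetime import datetime

COMPACT_FMT = "%Y%m%dT%H%M%S"  # yyyymmddThhmmss, with a literal "T"


def parse_compact_time(stamp):
    """Parse a compact ISO-8601 string; raises ValueError if malformed."""
    return datetime.strptime(stamp, COMPACT_FMT)


def check_time_range(t_start, t_end):
    """Return the duration between two compact timestamps, requiring start < end."""
    start, end = parse_compact_time(t_start), parse_compact_time(t_end)
    if end <= start:
        raise ValueError(f"t_end ({t_end}) must be later than t_start ({t_start})")
    return end - start


print(check_time_range("20270314T000000", "20270316T235959"))  # prints: 2 days, 23:59:59
```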

### Construct File Names

The key to fetching data from the web service is to construct file names to contain the data for each mnemonic. The helper function defined above parses each file name to determine how to query the engineering database and retrieve the timeseries of interest.

The file names have the form: 

    `<mnemonic_name>_<t_start>_<t_end>.csv`
    
Use a list comprehension to construct a list of file names; these will be passed to the webservice calling function.

In [None]:
fnames = ['_'.join([m, v['t_start'], v['t_end']]) + '.csv' for m, v in mnemonics.items()]
print(fnames)
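
To make the link between a file name and the resulting request explicit, here is a small sketch that turns one file name back into a query URL. The endpoint and parameter names are copied from the helper function earlier in this notebook. The function name `build_url` is hypothetical, and `strptime` is used in place of `fromisoformat` so that the sketch does not depend on the Python version.

```python
from datetime import datetime
from urllib.parse import urlencode

# Endpoint copied from download_edb_datafiles earlier in this notebook.
BASE_URL = "https://mast.stsci.edu/edp/api/v0.1/mnemonics/spa/roman/data"


def build_url(fname):
    """Map '<mnemonic>_<t_start>_<t_end>.csv' to the corresponding query URL."""
    stem = fname.rsplit(".", 1)[0]
    # The mnemonic contains underscores, so split the two times off from the right.
    mnemonic, t_start, t_end = stem.rsplit("_", 2)
    # Convert yyyymmddThhmmss to the extended form yyyy-mm-ddThh:mm:ss.
    to_iso = lambda s: datetime.strptime(s, "%Y%m%dT%H%M%S").isoformat()
    params = {
        "mnemonic": mnemonic,
        "s_time": to_iso(t_start),
        "e_time": to_iso(t_end),
        "result_format": "csv",
    }
    # safe=":" keeps the colons in the timestamps unescaped.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


print(build_url("SCF_AC_SDR_QBJ_1_20270314T000000_20270316T235959.csv"))
```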

### Call the Webservice

Set the (optional) output folder name prior to the webservice call. 

In [None]:
# Sub-directory where the data files will be written:
subdir = 'edb-data'

Now call the EDB web service. The files containing data will be written to your local storage, in the specified subdirectory. 

<div class="alert alert-block alert-info">

<span style="color:black">
    The web service may take a long time (or time out), depending on the quantity of data in the timeseries within the chosen date range.
    
</span>
</div>

In [None]:
status = download_edb_datafiles(fnames, folder=subdir)
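
The returned `status` only indicates whether any download failed, not which one. A minimal sketch for checking the files on disk follows; `find_missing_or_empty` is a hypothetical helper name, and treating a zero-byte file as a failure is an assumption made for illustration.

```python
from pathlib import Path


def find_missing_or_empty(filenames, folder):
    """Return the names of expected files that are absent or have zero size."""
    problems = []
    for fname in filenames:
        path = Path(folder) / fname
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(fname)
    return problems
```

Calling `find_missing_or_empty(fnames, subdir)` after the download cell would list any files that may be worth requesting again.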

### Prepare the Data for Analysis

Create a list of Pandas dataframes from the mnemonics data that were just written to disk. 

In [None]:
df = [pd.read_csv(os.path.join(subdir, f)) for f in fnames]

Make sure the sizes of the dataframes are equal, and take a look at the first dataframe.

In [None]:
# Compare the lengths of all dataframes, not just the first two.
print('Dataframes have the same size? {}'.format(len({len(d) for d in df}) == 1))
df[0]

## Visualize the Data Tuples

Create a series of joint plots (of each quaternion parameter against the others) for analysis. This is easy to do by plotting the Pandas dataframes. It is more interesting to add color to indicate changes in the spacecraft quaternion parameters.

In [None]:
# The following method is needed for bokeh display in a Notebook.
# Note that it does not activate the display. This happens in the 'Plot Timeseries' section.
output_notebook()

### Identify Subseries in the Data

Engineering data may contain periods of sampling between observations where the returned values do not change. The following function attempts to break up the timeseries by looking for these stretches of unchanging values.

In [None]:
def find_breaks(data_series, max_flats=5):
    """
    Parameters
    ----------
    data_series : list of pandas.DataFrame
        List of timeseries data.
    max_flats : int, default=5
        After this many data points with unchanging values, timeseries data will be broken up.
        
    Returns
    -------
    list of pandas.DataFrame
        Each DataFrame contains a continuous set of changing EDB timeseries data with grouped tuples of values from the inputs.
    """

    vals_list = []
    dates_list = []
    for ds in data_series:
        # Get the MJD and parameter values out of the DataFrames.
        # ObsTime needs to be converted to a string and then converted to MJD
        vals_list.append(ds['EUValue'].values)
        dates_list.append(Time(np.array(ds['ObsTime'].values, dtype=str), format='isot').mjd)
    
    # Combine the individual series into a single DataFrame.
    combo_frame = pd.DataFrame(data=dates_list[0], columns=['MJD'])
    combo_frame['timestamp'] = np.array(data_series[0]['ObsTime'].values, dtype=str)
    for i, val_list in enumerate(vals_list):
        combo_frame[f'value_{i+1}'] = val_list

    # If only 1 timepoint, skip sampling breaks checking.
    if len(vals_list[0]) == 1:
        return [combo_frame]
    
    # Scan the timeseries data to look for flat periods of no reading change.
    results = []
    m = 0
    flat = 0
    recording = True
    
    for n in range(1, len(vals_list[0])):

        # Make sure timestamps match: 
        dates_match = []
        for j in range(len(dates_list)):
            if j == 0:
                date_ref = dates_list[j][n]
            dates_match.append(dates_list[j][n] == date_ref)
        if np.all(dates_match):
            # Calculate the distance from the current positions to the following.
            val_diffs = []
            for j in range(len(dates_list)):
                val_diffs.append(np.abs(vals_list[j][n-1] - vals_list[j][n]))

            # Multiple points with no change will stop recording and store the current series.
            if np.all(np.array(val_diffs) == 0):
                flat += 1
                if not recording:
                    continue
                elif flat >= max_flats:
                    size = (n-max_flats) - m
                    if size > 1:
                        results.append(combo_frame[m:n-(max_flats)])
                    recording = False
                    
            # Start recording if changes detected.
            # val_diffs is a list, so convert it to an array before comparing.
            elif np.any(np.array(val_diffs) > 0) and not recording:
                flat = 0
                m = n
                recording = True
    
    # Capture the final series if still recording.
    if recording and (n - m) > 1:
        results.append(combo_frame[m:])
    
    print("returning {} timeseries".format(len(results)))
    
    return results
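
To see the splitting idea in isolation, here is a simplified pure-Python sketch that works on a single list of values rather than on a list of DataFrames. It is not a drop-in replacement for `find_breaks`: the name `split_on_flat_runs` is hypothetical, it ignores timestamps, and it counts only *consecutive* unchanged samples (the counter is reset whenever the value changes).

```python
def split_on_flat_runs(values, max_flats=5):
    """Split values into segments separated by runs of unchanged samples.

    Returns a list of (start, stop) index pairs usable as values[start:stop].
    """
    segments = []
    start = 0          # first index of the segment being recorded
    flat = 0           # consecutive unchanged samples seen so far
    recording = True

    for n in range(1, len(values)):
        if values[n] == values[n - 1]:
            flat += 1
            if recording and flat >= max_flats:
                stop = n - max_flats  # end the segment before the plateau
                if stop - start > 1:
                    segments.append((start, stop))
                recording = False
        else:
            flat = 0
            if not recording:
                # A change after a plateau starts a new segment.
                start = n
                recording = True

    # Capture the final segment if still recording.
    if recording and len(values) - start > 1:
        segments.append((start, len(values)))
    return segments


print(split_on_flat_runs([0, 1, 2, 2, 2, 2, 5, 6, 7], max_flats=3))  # prints: [(0, 2), (6, 9)]
```

In this toy input the plateau of repeated `2` values separates the samples into two segments, `[0, 1]` and `[5, 6, 7]`.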

Report the start/end times of each identified subseries. Since there are many of them, we print only the last result as a sample.

In [None]:
split_series = find_breaks(df, max_flats=5)
for ss in split_series:
    v = ss['timestamp'].values

# Only the last subseries is printed here; move this print statement
# into the for loop to print every subseries.
print("    {0} - {1}".format(v[0], v[-1]))

### Plot the Segmented Timeseries

The following function plots a single subseries of the quaternion tuple data and applies a color gradient based on the associated time stamps. 

In [None]:
def make_single_bokeh_panel(data, keyx, keyy, mapper=None):
    
    # Create a bokeh.plotting figure object.
    n = bp.figure(height=400, width=400, match_aspect=True)
    
    # Add lines to make 0 axis a bit more obvious.
    lw = 1.3
    vline = Span(location=0, dimension='height', line_color='black', line_width=lw)
    hline = Span(location=0, dimension='width', line_color='black', line_width=lw)
    n.renderers.extend([vline, hline])
    
    # Add a circle plot of parameters with the color map applied.
    # Standardize the radius of points to 1% of the x-axis data range (with a lower limit).
    radius = np.max([0.0025, (data[keyx].max() - data[keyx].min()) / 100])
    n.circle(source=data, x=keyx, y=keyy, fill_alpha=0.6, fill_color=mapper, line_color=None, radius=radius)

    return n


def plot_quaternion_color(data):
    """
    Plot quaternion-vs-quaternion timeseries data with color mapping based on the timing.
    
    Parameters
    ----------
    data : pandas.DataFrame
        A combined quaternion timeseries data set.
    """
    
    mjd = data['MJD']
    n_ticks = 10

    # Set up a linear color map based on the MJD data.
    mapper = linear_cmap(field_name='MJD', palette=cm, low=min(mjd), high=max(mjd))
    
    # Create the bokeh.plotting figures:
    fig_grid = []
    for i in range(4):
        fig_list = []
        for j in range(4):
            if i > j:
                n = make_single_bokeh_panel(data, f"value_{j+1}", f"value_{i+1}", mapper=mapper)
                
                # Add some labels to our axes
                n.xaxis.axis_label = f"SCF_AC_SDR_QBJ_{j+1}"
                n.yaxis.axis_label = f"SCF_AC_SDR_QBJ_{i+1}"
                fig_list.append(n)
            else:
                fig_list.append(None)
        fig_grid.append(fig_list)

    # Link ranges:
    for i in range(4):
        for j in range(4):
            if (fig_grid[i][j] is not None):
                if j > 0:
                    fig_grid[i][j].y_range = fig_grid[i][0].y_range
                if i < 3:
                    fig_grid[i][j].x_range = fig_grid[3][j].x_range
        
    p = gridplot(fig_grid, toolbar_location='left')
    
    # Translate legend values from MJD to time stamps.
    indices = list(range(0, len(mjd), np.max([int(len(mjd)/n_ticks), 1])))
    tick_dict = {mjd.values[x]: data['timestamp'].values[x] for x in indices}
    ticks = FixedTicker(ticks=list(tick_dict.keys()))
    
    # Add a color bar legend for the MJD data.
    color_bar = ColorBar(color_mapper=mapper['transform'], 
                         width=12,
                         ticker=ticks,
                         major_label_overrides=tick_dict,
                         location=(0, 0), 
                         label_standoff=45,
                         )
    fig_grid[1][0].add_layout(color_bar, 'right')
    
    # Display the figure.
    bp.show(p)

In the following command you can update the index to change which split timeseries you are plotting. Once the plot renders, use the plot control tools in the toolbar on the left to pan, zoom, and save the plot. 

In [None]:
plot_quaternion_color(split_series[0])

In [None]:
len(split_series)

## Additional Resources
* The [Roman Engineering Database Portal](https://mast.stsci.edu/edp/#/roman), with restricted access pre-launch.
* The restricted-access [Roman Engineering Data](https://outerspace.stsci.edu/spaces/RAPD/pages/301172598/Search+for+Calibrated+Engineering+Data) tutorial in the Roman Pre-Launch documentation (on innerspace).  This information will be made public at a later date.

## About this Notebook
This notebook was developed by MAST staff, chiefly Sedona Price and Zach Claytor, based on the equivalent JWST EDB notebook.

**Author(s):** Sedona Price and Zach Claytor, adapted from the [JWST EDB retrieval notebook](https://github.com/zclaytor/mast_notebooks/blob/main/notebooks/JWST/Engineering_Database_Retreival/EDB_Retrieval.ipynb) by MAST staff (chiefly Dick Shaw, Peter Forshay, and Bernie Shiao, with additional editing by Thomas Dutkiewicz). <br>
**Keyword(s):** Tutorial, Roman <br>
**First published:** Feb 2026 <br>
**Last updated:** Feb 2026 

***
<img style="float: right;" src="https://raw.githubusercontent.com/spacetelescope/style-guides/master/guides/images/stsci-logo.png" alt="Space Telescope Logo" width="200px"/> 

[Return to top of page](#Roman-Engineering-Data-Retrieval)