# Uploading Clinical Scores and Similar Data to the Rune Platform

Various types of data are ingested into the Rune platform in different ways. Neural signals from deep brain stimulation devices are automatically uploaded via our Box.com integration. Wearable data and patient reported outcomes are automatically uploaded via our iPhone/Apple Watch integration. However, research moves quickly, and not all emerging types of data are immediately compatible with our automated ingestion tools.

For many such data types, our event ingestion API provides the flexibility to upload and ingest data into the platform. While it is not appropriate for dense streams of data (such as local field potentials or accelerometry), it can accommodate data that occur as individual events, including clinical assessments, symptom metrics, behavioral task data, etc.

This notebook serves as a tutorial and template for upload and ingestion of such data, using the [Movement Disorders Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS)](https://www.movementdisorders.org/MDS/MDS-Rating-Scales/MDS-Unified-Parkinsons-Disease-Rating-Scale-MDS-UPDRS.htm) as a common example. UPDRS scores can be placed in a CSV file in the format of [the provided example](https://rune-labs.quip.com/ooCbAOWCeWBT/UPDRSuploadformat), and uploaded using this notebook.

This template is intended to serve as a basis for extensions that handle a variety of data types, depending on the specific needs of research groups and therapy developers. We are happy to help with your specific application.

**Note:** Because no method is available for deletion of individual segments of data from the Rune platform, it is vital to check and confirm the format of the data before upload/ingestion. Mistakes could cause a patient's data record to become polluted with incorrect events. Functions for format checking of data are included for this purpose, and a separate "clinical score" device should be created for each patient. It is also recommended to create a test patient on the Rune platform, and upload to that patient first to confirm your code/data format.

**Contents:**
* [Set up](#Set-up)
* [1. Configure API client](#1.-Configure-API-client)
* [2. Define functions to create Events and Spans](#2.-Define-functions-to-create-Events-and-Spans)
* [3. Read a clinical assessment from a CSV file](#3.-Read-a-clinical-assessment-from-a-CSV-file)
* [4. Check the formatting of the data](#4.-Check-the-formatting-of-the-data)
* [5. Ingest the items (assessment scores) as events](#5.-Ingest-the-items-(assessment-scores)-as-events)
* [6. Pull the uploaded events from the Rune platform](#6.-Pull-the-uploaded-events-from-the-Rune-platform)

---

# Set up

## Access credentials

Pulling data from the Rune API requires a local configuration (.yaml) file that includes your Rune user access tokens (as discussed in previous example notebooks). Uploading data to the platform currently requires a second local configuration file, which includes client credentials for the specific patient for which you will upload data.

### Your Rune user access tokens

If you have not previously done so, you will need to set up your access tokens for read access to all patients within your organization. See our [API doc](https://docs.runelabs.io/stream/#section/Overview/Authentication) for instructions on how to set this up.

Next, set up a .yaml file with your token ID and secret. This is text file that will store your credentials. See our [`runeq` quickstart](https://runeq.readthedocs.io/en/latest/pages/quickstart.html#configuration) for how to set this up.

Once this is complete, you will have the following in `~/.rune/config.yaml`:

```yaml
access_token_id: 1234567890abcdef
access_token_secret: 1234567890abcdef
```

### Client keys for the specific patient

Next, you'll need a **client key pair** for the specific patient. Creation of these credentials requires admin access, and can be accomplished by the following steps: 

1. Open the settings menu for the patient.
    * From the patient list, click the gear icon to the right of the "browse" and "analyze" buttons for the patient.
    * **OR**
    * While viewing the patient's data, click the gear icon next to the patient code name at the upper left of the screen.
2. Click "Clients" on the left.
3. Click "Create Client" at the top.
4. Name the client something descriptive (e.g., Gavin's Macbook).
5. Click "Create."
6. Copy and paste the given "Client Key ID" and "Client Key Secret" into a .yaml file with the below format.
    * When finished, click "Done." These credentials will never be shown again. You can create new credentials at any time using the "New Key +" button.

Save these keys in a separate .yaml file (e.g., `~/.rune/config-patient01.yaml`:

```yaml
auth_method: client_keys
client_key_id: 1234567890abcdef
client_access_key: 1234567890abcdef
stream_url: https://stream.runelabs.io
```

## Create a clinical assessment "device" for the patient

Data on the Rune platform is associated with devices. LFP data from a Medtronic Percept device is associated with a Percept "device" on the platform. Symptom data from an Apple Watch is similarly associated with an Apple Watch "device." To upload clinical assessment data, create a *clinical assessment* "device" for that patient on in the research portal. This requires admin level access. Create the device by following these steps:

1. Open the settings menu for the patient.
    * From the patient list, click the gear icon to the right of the "browse" and "analyze" buttons for the patient.
    * **OR**
    * While viewing the patient's data, click the gear icon next to the patient code name at the upper left of the screen.
2. Click "Devices" on the left.
3. Click "Register Device" at the top.
4. Select "Clinical Assessment" as the device type.
5. Name the device as desired in the "Alias" field (e.g., "UPDRS scores").
6. Click "Create."

The `device_id` of this device will used in the API parameters below.

## Import necessary packages/modules

Most of the packages/modules used here are included in the Python/Anaconda standard libraries. Others can be installed  using pip by opening a terminal (Apple/Linux) or Anaconda command prompt (Windows), and entering `pip install packagename`.

Additional packages/modules:
* `tzlocal.get_localzone`: Get local time zone in a usable format (i.e., not three-character designation).
* `requests`: Handle HTTP requests to send data to the Rune platform.
* `ipyfilechooser.FileChooser`: Browse for files on the local computer.

In [1]:
import pandas as pd
import datetime as dt
import requests
from pprint import pprint
from tzlocal import get_localzone
from ipyfilechooser import FileChooser
from runeq import Config, stream

---

# 1. Configure API client

Use the configuration files created above (in the set up instructions) to create an API client, and prepare accessors for the Event (and Span if desired) endpoint for the desired patient.

In [2]:
# Create an API client with the user access token config file.
#  (If the file is in the default location, ~/.rune/config.yaml,
#   it is not necessary to pass the path to Config().)

cfg = Config()
client = stream.V1Client(cfg)

data_url = 'https://data.runelabs.io'

In [3]:
# Prepare parameters for the Event/Span endpoints for the desired patient,
#  using the client config file.

patient_cfg = Config('~/.rune/config_patient01.yaml')

patient_id = 'c118dbfff9644fbb83e5fe1982d4534c'
device_id = 'M4S95rKL'

In [4]:
# Prepare accessors for the Event and Span API endpoints.

event = client.Event(patient_id=patient_id, device_id=device_id)
span = client.Span(patient_id=patient_id, device_id=device_id)

---

# 2. Define functions to create Events and Spans

"Events" and "spans" are slightly different structures in the Rune platform.

* Event: Events that occur at a single point in time
* Span: Events that have duration, including "fuzzy" start and end times

In this example, we will only be creating "events," as we are representing assessments that are considered to have been recorded at a single point in time. To represent a task or assessment that has duration information, we would use "spans."

Functions are defined here for creating both events and spans on the Rune platform, but only the `create_event()` function will be used in this example.

In [5]:
def create_event(
    event_namespace,
    event_type,
    event_enum,
    event_time,
    event_description='',
    add_payload=None,
    device_id=device_id,
    cfg=patient_cfg,
    dry_run=False
):
    """
    Create an "event"

    The event_description field is optional, and will be an empty string in the payload if not provided.
    The add_payload field is optional, and the payload will include only the body below if not provided.
    If dry_run is True, nothing will be posted to the Rune platform. This is for testing purposes.

    """
    
    body = {
        'classification': f'{event_namespace}.{event_type}.{event_enum}',
        'created_time': dt.datetime.utcnow().timestamp(),
        'device_id': device_id,
        'device_version': 'jupyter-notebook',
        'payload': {
            'category': event_type.upper(),  # Upper case for consistency
            'name': event_enum,
            'description': event_description
        },
        'time': event_time
    }

    if add_payload is not None:
        body['payload'].update(add_payload)

    if not dry_run:
        r = requests.post(
            f'{data_url}/rest/v1/event',
            json=body,
            headers=cfg.auth_headers,
        )
    else:
        pprint(body)
        r = None

    return r

In [6]:
def create_span(
    event_namespace,
    event_type,
    event_enum,
    start_time,
    end_time,
    event_description='',
    add_payload=None,
    device_id=device_id,
    cfg=patient_cfg,
    dry_run=False
):
    """
    Create a "span"

    The event_time parameter from the create_event() function is replaced here by start_time and end_time.
    
    The event_description field is optional, and will be an empty string in the payload if not provided.
    The add_payload field is optional, and the payload will include only the body below if not provided.
    If dry_run is True, nothing will be posted to the Rune platform. This is for testing purposes.

    """
    
    body = {
        'classification': f'{event_namespace}.{event_type}.{event_enum}',
        'created_time': dt.datetime.utcnow().timestamp(),
        'device_id': device_id,
        'device_version': 'jupyter-notebook',
        'payload': {
            'category': event_type.upper(),  # Upper case for consistency
            'name': event_enum,
            'description': event_description
        },
        'start_time': start_time,
        'end_time': end_time,
    }
    
    if add_payload is not None:
        body['payload'].update(add_payload)

    if not dry_run:
        r = requests.post(
            f'{data_url}/rest/v1/span',
            json=body,
            headers=cfg.auth_headers,
        )
    else:
        pprint(body)
        r = None
        
    return r

---

# 3. Read a clinical assessment from a CSV file

The API is flexible, allowing ingestion of events in a variety of formats. However, to ease analysis of uploaded data and collaboration across institutions, consistency is ideal. This example focuses on UPDRS scores, and an example CSV file is provided with a suggested format [here](https://rune-labs.quip.com/ooCbAOWCeWBT/UPDRSuploadformat). In this file, each row corresponds to an item of the assessment. The code in this example is designed to handle this format, and it is recommended that UPDRS data files conform to it, including the item designations (e.g., `1_a`, `2_3`, `3_4a`). Section 4 below demonstrates how to define these item designations and test data for consistency.

This code will also handle partial assessments, utilizing the same file format, but removing rows that correspond to missing items.

This example can be modified as desired, and is intended to serve as a basis for other types of data (e.g., other clinical assessments, symptom metrics, behavioral task data). The code may require some modification to handle new formats.

The steps below will:

* Open a file chooser to select the desired file to upload
* Read the file into a dataframe
* Localize timestamps to the desired time zone and convert to UTC

## Select the desired file

In [7]:
fc = FileChooser()
display(fc)

FileChooser(path='/Users/gavin/Documents/GitHub/jupyter-notebook-templates', filename='', title='', show_hidde…

In [8]:
file_path = fc.selected

In [9]:
print(file_path)

/Users/gavin/Documents/data/patient01/UPDRS_test.csv


The file chooser above is used for convenience, but it is also possible to define a file path manually/programatically if desired. For example:

`file_path = '/Users/name/Documents/assessments/patient_001/patient001_UPDRS_2021-11-24.csv'`

## Read the file into a dataframe

In [10]:
data = pd.read_csv(file_path, index_col=['Item'], parse_dates=[['Date','Time']])

In [11]:
data

Unnamed: 0_level_0,Date_Time,Assessment,Description,Value
Item,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1_A,2021-07-28 10:52:00,UPDRS,Source of Information,patient
1_1,2021-07-28 10:52:00,UPDRS,Cognitive impairment,4
1_2,2021-07-28 10:52:00,UPDRS,Hallucinations and psychosis,4
1_3,2021-07-28 10:52:00,UPDRS,Depressed mood,4
1_4,2021-07-28 10:52:00,UPDRS,Anxious mood,4
...,...,...,...,...
4_2,2021-07-28 10:52:00,UPDRS,Functional impact of dyskinesias,4
4_3,2021-07-28 10:52:00,UPDRS,Time spent in the OFF state,4
4_4,2021-07-28 10:52:00,UPDRS,Functional impact of fluctuations,4
4_5,2021-07-28 10:52:00,UPDRS,Complexity of motor fluctuations,4


## Localize to the desired time zone and convert to UTC

All timestamps on the Rune platform are stored in UTC, which can be converted to any desired time zone.

This method of localizing and converting time zones with pandas will automatically handle daylight savings time.

See all possible time zones by importing the pytz package and using `pytz.all_timezones`. A few examples:

* 'UTC'
* 'GMT'
* 'US/Eastern'
* 'US/Central'
* 'US/Mountain'
* 'US/Pacific'
* 'Japan'
* 'Asia/Shanghai'
* 'Europe/Zurich'

The desired time zone (in which the assessment was performed or data was recorded) can be set in a variety of ways:

* Manually enter the designation: `timezone = 'US/Eastern'`
* If the data is being uploaded in the same time zone, use the `get_localzone()` function of the `tzlocal` package as demonstrated below.
* If data from multiple time zones is being uploaded, it might be helpful to add a time zone column to the CSV file format, and include the appropriate time zone designation. These could then be read from the data frame and utilized in the localization steps below.

In [12]:
# Get the local time zone.

timezone = get_localzone().key
timezone

'America/Denver'

In [13]:
# Localize to the desired time zone, and convert to UTC.

data.Date_Time = data.Date_Time.dt.tz_localize(timezone)
data.insert(1,'Date_Time_UTC',data.Date_Time.dt.tz_convert('UTC'))

data

Unnamed: 0_level_0,Date_Time,Date_Time_UTC,Assessment,Description,Value
Item,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1_A,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Source of Information,patient
1_1,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Cognitive impairment,4
1_2,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Hallucinations and psychosis,4
1_3,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Depressed mood,4
1_4,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Anxious mood,4
...,...,...,...,...,...
4_2,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Functional impact of dyskinesias,4
4_3,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Time spent in the OFF state,4
4_4,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Functional impact of fluctuations,4
4_5,2021-07-28 10:52:00-06:00,2021-07-28 16:52:00+00:00,UPDRS,Complexity of motor fluctuations,4


Note that the format of the `Date_Time` column now includes an offset value from UTC (which is `+00:00` after the timestamps have been converted to UTC).

---

# 4. Check the formatting of the data

**Because no method is available for deletion of individual segments of data from the Rune platform, it is vital to check and confirm the format of the data before upload/ingestion. Mistakes could cause a patient's data record to become polluted with incorrect events.**

Formatting errors or typos in the CSV file could cause a variety of issues in the ingested data. Check the extracted dataframe and enforce these conditions:

* The desired columns are present, with the correct labels.
    * In this UPDRS example, these include Item, Date_Time_UTC, Assessment, Description, and Value.
* The item labels are correct.
    * In this UPDRS example, "correct" means they are included in the desired item set.
        * Not all items are required, as partial assessments are allowable.
* No dots are used in the category strings (`event_namespace`, `event_type`, `event_enum`).
* The category strings are all lowercase.
    * Namespace is hard coded, and is not included in the data file.
    * `event_type` is forced to lowercase in the code below.
    * `event_enum` is compared to a list of acceptable items.
* Any other strings have the desired case.
* The timestamps are in a valid format, and match the expected date/time.
    * Pandas should parse most date formats, but a human check is recommended.

## Define a function to perform automated format checking

The function below checks the dataframe for several of the conditions in the above list. It can be modified to include additional conditions, or to handle different assessments and/or file formats.

Note that it requests a human check of the first date/time in the file, to confirm that is as expected. Pandas parses most date formats successfully, but a quick human check is advised to catch any unexpected issues with parsing, time zone correction, or data entry errors.

This function returns a boolean variable called `data_pass`, which is subsequently used to prevent ingestion before all detected issues are corrected.

In [14]:
def data_format_check(data, expected_columns, expected_items):
    """
    Check formatting of a dataframe before ingestion

    Inputs:
        data - Pandas dataframe of data read from file
        expected_columns - List of expected column labels
        expected_items - List of acceptable items
    """
    
    if data.index.name != 'Item':
        raise ValueError('The index label is not "Item".')
        
    if (data.keys() != expected_columns).any():
        raise ValueError('The column labels do not match the expected labels.')
        
    if not all(item in expected_items for item in data.index):
        raise ValueError('Items in the file that are not found in the list of acceptable items:')
        for item in data.index:
            if item not in expected_items:
                print(f'{item}\n')
    
    if any('.' in item for item in data.index):
        raise ValueError('At least one "." was found in the item labels.')
        
    if any('.' in assmt for assmt in data.Assessment):
        raise ValueError('At least one "." was found in the Assessment column.')
        
    print(f'Manually confirm that this date/time is expected: {data.iloc[0].Date_Time}\n')

## Create lists of expected columns/items and check the data

The expected column item labels can be hard coded here (as shown below), or pulled from a master file.

(Note the use of set {} rather than list [], which is optimized for lookup operations such as that used in the `data_format_check()` function.)

In [15]:
columns = ['Date_Time', 'Date_Time_UTC', 'Assessment', 'Description', 'Value']
items = {'1_A', '1_1', '1_2', '1_3', '1_4', '1_5', '1_6', '1_6a', '1_7', '1_8', '1_9', 
         '1_10', '1_11', '1_12', '1_13', '2_1', '2_2', '2_3', '2_4', '2_5', '2_6', '2_7',
         '2_8', '2_9', '2_10', '2_11', '2_12', '2_13', '3a', '3b', '3c', '3_c1', '3_1',
         '3_2', '3_3a', '3_3b', '3_3c', '3_3d', '3_3e', '3_4a', '3_4b', '3_5a', '3_5b',
         '3_6a', '3_6b', '3_7a', '3_7b', '3_8a', '3_8b', '3_9', '3_10', '3_11', '3_12',
         '3_13', '3_14', '3_15a', '3_15b', '3_16a', '3_16b', '3_17a', '3_17b', '3_17c',
         '3_17d', '3_17e', '3_18', '3_dyskA', '3_dyskB', '3_hy', '4_1', '4_2', '4_3',
         '4_4', '4_5', '4_6'
        }

data_format_check(data, columns, items)

# If a ValueError is raised in data_format_check(), it won't reach the code below.
print('Data from file passes all automated checks.')
data_pass = True

Manually confirm that this date/time is expected: 2021-07-28 10:52:00-06:00

Data from file passes all automated checks.


Once the data passes all automated checks, and the date/time is confirmed to be correct, the data can be uploaded/ingested into the Rune platform.

---

# 5. Ingest the items (assessment scores) as events

In the UPDRS example, each score will have its own event. Additionally, a comprehensive "all" event will be created that includes all of the other scores in its payload. This will ease future analysis of the data by allowing access to entire assessments or individual items as desired.

## Upload comprehensive "all" event with all items in payload

This event is created to represent the overall assessment, including all available items (i.e., scores). The scores are packed into a text payload, so they aren't as easy to access, but they are all in a single event.

A user could then query for all assessments of this type (i.e., `clinical.updrs.all`), and the API would return all of them, including partial assessments.

In [16]:
# Convert the relevant columns of the dataframe to a dictionary.

payload_fields = data[['Description','Value']].to_dict('index')

Set the `dry_run` parameter of the `create_event` (or `create_span`) function to `True` to test and validate your code, so that you avoid ingesting improperly formatted events to the platform. During a dry run, the payload will be printed to the screen for validation.

When your code is validated, change the `dry_run` parameter to `False`.

In [17]:
# If the data passed automated format checks above:
#
# Post the event to the Rune platform, using the dictionary for the text payload.
# Uses the 'Assessment' and 'Date-Time' values from the first row, assuming that 
#  all values in this CSV file were from the same assessment.

if data_pass:
    response = create_event(
        'clinical',
        data.iloc[0].Assessment.lower(),
        'all',
        data.iloc[0].Date_Time_UTC.timestamp(),
        event_description='All scores',
        add_payload=payload_fields,
        dry_run=True
    )

In [18]:
# Check the API response.

print(response)

<Response [200]>


When posting data to the API, the response will indicate success or reasons for failure with a code. Possible responses include:

* `200`: Success
* `400`: Invalid Request. The validation error will be indicated as a message in the response body. This will include cases where the device ID does not exist, is disabled, or does not belong to the patient.
* `401`: Missing or invalid authentication
* `413`: Payload too large
* `500`: Server error. The message body will indicate whether to retry.

## Upload individual events for each separate item

Each individual item will be ingested as a separate event. These enable querying of individual items from the platform.

A user could then query specifically for subsets or individual items (e.g., `clinical.updrs.3_11`).

Again, set the `dry_run` parameter of the `create_event` or `create_span` function to `True` to test and validate you code, so that you avoid ingesting improperly formatted events to the platform. During a dry run, the payload will be printed to the screen for validation.

When your code is validated, change the `dry_run` parameter to `False`.

In [19]:
# Define a function to handle each row of the dataframe as a separate event.

def event_from_row(index, row):
    """
    Ingest an event shaped from the data in one row of a dataframe.

    """
    event_enum = index
    event_type = row.Assessment.lower()
    timestamp = row.Date_Time_UTC.timestamp()
    event_description = row.Description
    value = {'value': row.Value}

    response = create_event(
        'clinical',
        event_type,
        event_enum,
        timestamp,
        event_description=event_description,
        add_payload=value,
        dry_run=True
    )

    return response

Iterating over the rows of a pandas dataframe with the `iterrows()` function is not very efficient, and typically should be avoided. However, the difference in processing time here is negligible.

In [20]:
# If the data passed automated format checks above:
#
# Iterate over the rows of the dataframe, creating an event for each.
            
if data_pass:
    for index, row in data.iterrows():
        response = event_from_row(index, row)

        if response.ok:
            print(f'SUCCESS: {index}')
        else:
            print(f'FAILURE: {index}')
            # If any request fails, raise an exception to break out of the loop.
            response.raise_for_status()

print('Done.')

SUCCESS: 1_A
SUCCESS: 1_1
SUCCESS: 1_2
SUCCESS: 1_3
SUCCESS: 1_4
SUCCESS: 1_5
SUCCESS: 1_6
SUCCESS: 1_6a
SUCCESS: 1_7
SUCCESS: 1_8
SUCCESS: 1_9
SUCCESS: 1_10
SUCCESS: 1_11
SUCCESS: 1_12
SUCCESS: 1_13
SUCCESS: 2_1
SUCCESS: 2_2
SUCCESS: 2_3
SUCCESS: 2_4
SUCCESS: 2_5
SUCCESS: 2_6
SUCCESS: 2_7
SUCCESS: 2_8
SUCCESS: 2_9
SUCCESS: 2_10
SUCCESS: 2_11
SUCCESS: 2_12
SUCCESS: 2_13
SUCCESS: 3a
SUCCESS: 3b
SUCCESS: 3c
SUCCESS: 3_c1
SUCCESS: 3_1
SUCCESS: 3_2
SUCCESS: 3_3a
SUCCESS: 3_3b
SUCCESS: 3_3c
SUCCESS: 3_3d
SUCCESS: 3_3e
SUCCESS: 3_4a
SUCCESS: 3_4b
SUCCESS: 3_5a
SUCCESS: 3_5b
SUCCESS: 3_6a
SUCCESS: 3_6b
SUCCESS: 3_7a
SUCCESS: 3_7b
SUCCESS: 3_8a
SUCCESS: 3_8b
SUCCESS: 3_9
SUCCESS: 3_10
SUCCESS: 3_11
SUCCESS: 3_12
SUCCESS: 3_13
SUCCESS: 3_14
SUCCESS: 3_15a
SUCCESS: 3_15b
SUCCESS: 3_16a
SUCCESS: 3_16b
SUCCESS: 3_17a
SUCCESS: 3_17b
SUCCESS: 3_17c
SUCCESS: 3_17d
SUCCESS: 3_17e
SUCCESS: 3_18
SUCCESS: 3_dyskA
SUCCESS: 3_dyskB
SUCCESS: 3_hy
SUCCESS: 4_1
SUCCESS: 4_2
SUCCESS: 4_3
SUCCESS: 4_4
SUCCESS: 

The API responses are explained in the previous section.

---

# 6. Pull the uploaded events from the Rune platform

Once the data has been uploaded to the Rune platform, it will be available via the Rune API (along with all other data streams) for analysis. Upload success can be confirmed by pulling the events from the API.

**NOTE:** After upload, there will be a delay (often ~15 minutes) before the events are available to be queried.

Exploration of Events and Spans is covered in one of our tutorial notebooks: https://github.com/rune-labs/opensource/blob/master/jupyter-notebook-templates/07_explore_patient_events.ipynb

In [21]:
# Define function for pulling events.

def get_events(client, params):
    """Makes API calls for events, outputs dataframe"""

    accessor = client.Event(**params)

    df = pd.DataFrame()
    for page in accessor.iter_json_data():
        df_page = pd.DataFrame(page['event'])
        df = df.append(df_page, ignore_index=True)

    return df

In [22]:
# Set parameters for pulling data from the API.
#  Starting 1 second before the first timestamp in the file.
#  Ending 1 second after the last timestamp in the file.

params = {
    'patient_id': patient_id,
    'device_id': device_id,
    'start_time': data.iloc[0].Date_Time_UTC.timestamp()-1,
    'end_time': data.iloc[-1].Date_Time_UTC.timestamp()+2
}

In [23]:
# Retrieve data.

events = get_events(client, params)

In [24]:
# Examine the dataframe of pulled events.

events

Unnamed: 0,time,created_time,device_id,id,event_namespace,event_type,event_enum,payload,display_name
0,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-0009bb509ed10a95c560b1e9afc6d710...,clinical,updrs,3_3e,"{'category': 'UPDRS', 'description': 'Rigidity...",clinical.updrs.3_3e
1,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-012198512f883f6df37e759db90097f1...,clinical,updrs,3_8b,"{'category': 'UPDRS', 'description': 'Leg agil...",clinical.updrs.3_8b
2,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-06dd403662735e13acdc7ce8621a7b2a...,clinical,updrs,3_2,"{'category': 'UPDRS', 'description': 'Facial e...",clinical.updrs.3_2
3,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-08b5574f9a551e79edfdb2368635bb98...,clinical,updrs,2_4,"{'category': 'UPDRS', 'description': 'Eating t...",clinical.updrs.2_4
4,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-0a8033f51ca609890af214ad4c820fad...,clinical,updrs,3_6a,"{'category': 'UPDRS', 'description': 'Pronatio...",clinical.updrs.3_6a
...,...,...,...,...,...,...,...,...,...
70,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-ee989d41c6212698732596fcb7ad3397...,clinical,updrs,2_13,"{'category': 'UPDRS', 'description': 'Freezing...",clinical.updrs.2_13
71,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-f337a09795b35c44555e05ab9e412643...,clinical,updrs,2_12,"{'category': 'UPDRS', 'description': 'Walking ...",clinical.updrs.2_12
72,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-f356d4bb844b72735d9817a9b1e8fa94...,clinical,updrs,3_7a,"{'category': 'UPDRS', 'description': 'Toe tapp...",clinical.updrs.3_7a
73,1627491120,1.641546e+09,M4S95rKL,event-2d4534c-fb4f38b195c5b342f1e299c441724227...,clinical,updrs,3_c1,"{'category': 'UPDRS', 'description': 'If yes, ...",clinical.updrs.3_c1


The returned data looks correct. There were 74 items in the CSV file, and there are 75 rows in the data frame (including the "all" item event). The `display_name` column shows the combined event namespace, type, and enum for each item. Each also includes the expected text payload.

## See which items exist on the platform

To confirm that a complete assessment was uploaded, or explore which items exist in a partial assessment, extract all events of the correct type (here UPDRS) from the data frame and print their enums.

In [25]:
# Get the event type from the file (should be 'updrs' in this example).
event_type = data.iloc[0].Assessment.lower()

# Extract all events of the correct type and list their enums.
enums = events.loc[events['event_type'] == event_type].event_enum.unique()
enums.sort()
print(f'* {event_type} items:')
for enum in enums:
    print(f'    * {enum}')

* updrs items:
    * 1_1
    * 1_10
    * 1_11
    * 1_12
    * 1_13
    * 1_2
    * 1_3
    * 1_4
    * 1_5
    * 1_6
    * 1_6a
    * 1_7
    * 1_8
    * 1_9
    * 1_A
    * 2_1
    * 2_10
    * 2_11
    * 2_12
    * 2_13
    * 2_2
    * 2_3
    * 2_4
    * 2_5
    * 2_6
    * 2_7
    * 2_8
    * 2_9
    * 3_1
    * 3_10
    * 3_11
    * 3_12
    * 3_13
    * 3_14
    * 3_15a
    * 3_15b
    * 3_16a
    * 3_16b
    * 3_17a
    * 3_17b
    * 3_17c
    * 3_17d
    * 3_17e
    * 3_18
    * 3_2
    * 3_3a
    * 3_3b
    * 3_3c
    * 3_3d
    * 3_3e
    * 3_4a
    * 3_4b
    * 3_5a
    * 3_5b
    * 3_6a
    * 3_6b
    * 3_7a
    * 3_7b
    * 3_8a
    * 3_8b
    * 3_9
    * 3_c1
    * 3_dyskA
    * 3_dyskB
    * 3_hy
    * 3a
    * 3b
    * 3c
    * 4_1
    * 4_2
    * 4_3
    * 4_4
    * 4_5
    * 4_6
    * all


## Pull a specific item from the API

A specific item can be accessed in two ways:
1. Pull all events of the desired type (here UPDRS) from the API, and extract the desired item from the full data frame.
2. Pull just the desired item from the API using the `event` parameter.

For example, to access just item 3-11:

In [26]:
# Copy the previously use parameters.
params_3_11 = params.copy()

# Add the desired "event" parameter to the new set of parameters.
params_3_11['event'] = 'clinical.updrs.3_11'

# Pull from the API with the specific parameter set.
updrs_3_11_events = get_events(client, params_3_11)

In [27]:
updrs_3_11_events

Unnamed: 0,time,created_time,device_id,id,event_namespace,event_type,event_enum,payload,display_name
0,1627491120,1641546000.0,M4S95rKL,event-2d4534c-eca860bd7c17ccfa048596bb77810ed5...,clinical,updrs,3_11,"{'category': 'UPDRS', 'description': 'Freezing...",clinical.updrs.3_11


In [28]:
updrs_3_11_events.iloc[0].payload

{'category': 'UPDRS',
 'description': 'Freezing of gait',
 'name': '3_11',
 'value': '4'}

## Examine the payload of an event

Any information can be placed in the payload of an event. In this example, we have included the type of assessment, item designation, item description, and recorded value in the payload of each individual item event. We have also placed all of the items in the payload of an "all" event.

Pull the "all" event from the API and examine its payload:

In [29]:
# Now pull the comprehensive "all" event.

# Copy the previously use parameters.
params_all = params.copy()

# Add the desired "event" parameter to the new set of parameters.
params_all['event'] = 'clinical.updrs.all'

# Pull from the API with the specific parameter set.
updrs_all_events = get_events(client, params_all)

In [30]:
updrs_all_events.iloc[0].payload

{'1_1': {'Description': 'Cognitive impairment', 'Value': '4'},
 '1_10': {'Description': 'Urinary problems', 'Value': '4'},
 '1_11': {'Description': 'Constipation problems', 'Value': '4'},
 '1_12': {'Description': 'Lightheadedness on standing', 'Value': '4'},
 '1_13': {'Description': 'Fatigue', 'Value': '4'},
 '1_2': {'Description': 'Hallucinations and psychosis', 'Value': '4'},
 '1_3': {'Description': 'Depressed mood', 'Value': '4'},
 '1_4': {'Description': 'Anxious mood', 'Value': '4'},
 '1_5': {'Description': 'Apathy', 'Value': '4'},
 '1_6': {'Description': 'Features of DDS', 'Value': '4'},
 '1_6a': {'Description': 'Who is filling out questionnaire',
  'Value': 'patient'},
 '1_7': {'Description': 'Sleep problems', 'Value': '4'},
 '1_8': {'Description': 'Daytime sleepiness', 'Value': '4'},
 '1_9': {'Description': 'Pain and other sensations', 'Value': '4'},
 '1_A': {'Description': 'Source of Information', 'Value': 'patient'},
 '2_1': {'Description': 'Speech', 'Value': '4'},
 '2_10': {'

If desired, we can extract the information for a particular item from this payload:

In [31]:
# Using item 4-6 for example:
item = '4_6'
item_info = updrs_all_events.iloc[0].payload[item]

print(f"Item {item}:")
print(f"    Description: {item_info['Description']}")
print(f"    Value: {item_info['Value']}")

Item 4_6:
    Description: Painful OFF-state dystonia
    Value: 4


---

# Summary

This notebook has utilized the UPDRS assessment as an example to demonstrate how to upload clinical scores and other such data to the Rune platform. It provides a recommended format as a CSV file. The example code and file can be modified to handle other data and formats. We are happy to help with your specific application.