## Understanding IOBP2 Dataset
This notebook describes the structure of the IOBP2 dataset and suggests how to process the data.

## The IOBP2 study

**Title**: The Insulin-Only Bionic Pancreas Pivotal Trial: Testing the iLet in Adults and Children with Type 1 Diabetes


**Description**: This multi-center randomized control trial (RCT) will compare efficacy and safety endpoints using the insulin-only configuration of the iLet Bionic Pancreas (BP) System versus a control group using CGM during a 13-week study period.
    
**Devices**: iLet and Dexcom G6 system

**Study Population**: People with T1D ages 6+

# Data
The study data folder is named **IOBP2 RCT Public Dataset**

Based on the DataGlossary.rtf file, the following relevant files were identified. Both are stored in the **Data Tables** subfolder.

* **IOBP2DeviceiLet.txt**: All events logged on the pump, including CGM and insulin delivery
* **PtRoster.txt**: Patient Roster

These are CSV files ("|" separator) with columns related to the iLet pump events and the Dexcom CGM readings. The glossary provides information about each column. Each file contains far fewer columns than the FLAIR data. The most relevant columns are listed below.

## IOBP2DeviceiLet
* **PtID**: Patient ID
* **DeviceDtTm**: Local date and time on the device
* **CGMVal**: CGM glucose value
* **BGTarget**: Current target glucose level in mg/dl
* **InsDelivPrev**: Delivered insulin dose (U) of the prior executed step

## Questions


In [2]:
import os, sys, time, random
import pandas as pd
from datetime import datetime, timedelta
import numpy as np
from matplotlib import pyplot as plt

In [3]:
#get the file path
current_dir = os.getcwd()
original_data_path = os.path.join(current_dir, '..', 'data/raw')
cleaned_data_path = os.path.join(current_dir,  '..', 'data/cleaned')
path = os.path.join(original_data_path, 'IOBP2 RCT Public Dataset', 'Data Tables', 'IOBP2DeviceiLet.txt')

In [11]:
df_all_events = pd.read_csv(path, sep="|", low_memory=False,
                           usecols=['PtID', 'DeviceDtTm', 'CGMVal', 'BGTarget', 'InsDelivPrev', 'BasalDelivPrev',
                                    'BolusDelivPrev'])

## Check for DateTimes without Time part

In [12]:
print('Datetimes without time: ', len(df_all_events[df_all_events['DeviceDtTm'].str.len() <= 10]))

Datetimes without time:  147
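
The count above shows that some `DeviceDtTm` strings carry only a date. One possible way to parse both variants is sketched below. It rests on an assumption the source data does not confirm: that a date-only string stands for midnight (the time part was dropped on export). The helper name `parse_device_datetime` and the sample values are hypothetical and only illustrate the format seen in the file.

```python
import pandas as pd

# Hypothetical sample in the same format as DeviceDtTm (not real study data)
sample = pd.Series(['8/14/2020 12:01:23 AM', '8/15/2020', '8/15/2020 1:06:23 PM'])

def parse_device_datetime(series):
    """Parse 'M/D/YYYY h:mm:ss AM/PM' strings.

    ASSUMPTION: strings of length <= 10 (date only) are treated as midnight.
    """
    has_time = series.str.len() > 10
    # Parse each variant with its own explicit format; the other rows become NaT
    with_time = pd.to_datetime(series.where(has_time), format='%m/%d/%Y %I:%M:%S %p')
    date_only = pd.to_datetime(series.where(~has_time), format='%m/%d/%Y')
    # Combine both results into a single datetime column
    return with_time.fillna(date_only)

parsed = parse_device_datetime(sample)
print(parsed)
```

Using explicit formats rather than letting pandas infer them keeps the parsing predictable and raises an error if an unexpected format shows up. If the midnight assumption turns out to be wrong, the date-only rows could instead be dropped.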


## Inspecting the event counts

In [15]:
df_all_events.head()

| | PtID | DeviceDtTm | CGMVal | BGTarget | InsDelivPrev | BasalDelivPrev | BolusDelivPrev |
|---|---|---|---|---|---|---|---|
| 0 | 183 | 8/14/2020 12:01:23 AM | 91.0 | 120 | 0.0 | 0.0 | 0.0 |
| 1 | 183 | 8/14/2020 12:06:23 AM | 102.0 | 120 | 0.0 | 0.0 | 0.0 |
| 2 | 183 | 8/14/2020 12:11:23 AM | 105.0 | 120 | 0.0 | 0.0 | 0.0 |
| 3 | 183 | 8/14/2020 12:16:23 AM | 103.0 | 120 | 0.0 | 0.0 | 0.0 |
| 4 | 183 | 8/14/2020 12:21:23 AM | 98.0 | 120 | 0.0 | 0.0 | 0.0 |
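
The five rows shown above are spaced five minutes apart. Whether that spacing holds across the whole file is not established here. A minimal sketch of how the step size could be checked per patient follows. It rebuilds the five displayed rows so that it runs on its own; the column name `step` is introduced only for illustration.

```python
import pandas as pd

# The five rows displayed by head(), rebuilt so the example is self-contained
head = pd.DataFrame({
    'PtID': [183] * 5,
    'DeviceDtTm': ['8/14/2020 12:01:23 AM', '8/14/2020 12:06:23 AM',
                   '8/14/2020 12:11:23 AM', '8/14/2020 12:16:23 AM',
                   '8/14/2020 12:21:23 AM'],
    'CGMVal': [91.0, 102.0, 105.0, 103.0, 98.0],
})

# Parse the timestamps with the format seen in the file
head['DeviceDtTm'] = pd.to_datetime(head['DeviceDtTm'], format='%m/%d/%Y %I:%M:%S %p')

# Sort within each patient, then take the difference between consecutive events
head = head.sort_values(['PtID', 'DeviceDtTm'])
head['step'] = head.groupby('PtID')['DeviceDtTm'].diff()

# Count how often each step size occurs (the first row per patient has no step)
step_counts = head['step'].value_counts()
print(step_counts)
```

Applied to the full `df_all_events` frame, the same steps would show how often the step size deviates from five minutes, for example because of gaps or duplicated events.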
