# 3. Metrics of glycemic control

In [1]:
# Import modules
from src.diametrics import transform, metrics, preprocessing

In [2]:
# Upload and transform data
dexcom_data = transform.transform_directory(directory='tests/test_data/dexcom/', device='dexcom')
libre_data = transform.transform_directory(directory='tests/test_data/libre/', device='libre')
libre1 = transform.open_file('tests/test_data/libre/libre_amer_01.csv')
libre1_transformed = transform.convert_libre(libre1)
dxcm2 = transform.open_file('tests/test_data/dexcom/dexcom_eur_02.xlsx')
dxcm2_transformed = transform.convert_dexcom(dxcm2)

In [3]:
# Replace the lo/hi cutoff values
dexcom_data = preprocessing.replace_cutoffs(dexcom_data)
libre_data = preprocessing.replace_cutoffs(libre_data)
libre1_transformed = preprocessing.replace_cutoffs(libre1_transformed, lo_cutoff=2.1, hi_cutoff=27.8)
dxcm2_transformed = preprocessing.replace_cutoffs(dxcm2_transformed)

## 3.1. Individual metrics

### 3.1.1. Data sufficiency

The data sufficiency is calculated over the period between the first and last reading of the CGM trace you uploaded, unless the start and end times are edited in the ‘data overview’ section, in which case those datetimes are used. The number of expected readings during this period is calculated from the recording interval. The data sufficiency is then 100 * non-null readings / expected readings.

Example:
* 48 hrs and 17 mins of CGM data is 2897 mins
* For a FreeStyle Libre, we divide this by the interval of 15 mins to give the expected readings (193.1333…)
* The number of non-null readings is counted for the period (186)
* The data sufficiency is 100 * 186 / 193.1333 rounded to 2 decimal places (96.31%)
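
The arithmetic in the example above can be sketched in plain Python. The helper name `data_sufficiency_pct` and its parameters are hypothetical and are not part of the diametrics API. The sketch only assumes that the expected readings equal the elapsed minutes divided by the recording interval.

```python
from datetime import datetime


def data_sufficiency_pct(start, end, n_readings, interval_mins):
    """Percentage of expected readings that are present (hypothetical helper)."""
    total_mins = (end - start).total_seconds() / 60
    expected = total_mins / interval_mins  # e.g. 2897 / 15 for a Libre
    return round(100 * n_readings / expected, 2)


# 48 h 17 min of 15-minute data with 186 non-null readings
start = datetime(2023, 1, 1, 0, 0)
end = datetime(2023, 1, 3, 0, 17)
pct = data_sufficiency_pct(start, end, n_readings=186, interval_mins=15)
print(pct)
```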


Docstring:

In [4]:
help(metrics.data_sufficiency)

Help on function data_sufficiency in module src.diametrics.metrics:

data_sufficiency(df, start_time=None, end_time=None)
    Calculate the data sufficiency percentage based on the provided DataFrame, gap size, and time range.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings and a 'time' column with timestamps.
        gap_size (int): The size of the gap in minutes to check for data sufficiency.
        start_time (datetime.datetime, optional): The start time of the time range. If not provided, it will be determined from the DataFrame. Default is None.
        end_time (datetime.datetime, optional): The end time of the time range. If not provided, it will be determined from the DataFrame. Default is None.
    
    Returns:
        dict: A dictionary containing the start and end datetimes of the time range and the data sufficiency percentage.
    
    Raises:
        ValueError: If the gap size is not 5 or 15.
    
    Note:
   

Calling the function

In [5]:
metrics.data_sufficiency(dxcm2_transformed)

{'Start DateTime': '2023-03-08 00:01:00',
 'End DateTime': '2023-03-21 15:31:00',
 'Data Sufficiency (%)': 97.4}

In [6]:
metrics.data_sufficiency(libre_data)

| | ID | Start DateTime | End DateTime | Data Sufficiency (%) |
|---|---|---|---|---|
| 0 | libre_amer_01 | 2021-03-20 17:38:00 | 2021-04-03 16:08:00 | 100.0 |
| 1 | libre_amer_02 | 2021-03-23 01:11:00 | 2021-04-06 12:56:00 | 48.8 |


### 3.1.2. Average glucose
The average glucose is the mean of all glucose readings, calculated with a pandas function.

In [7]:
help(metrics.average_glc)

Help on function average_glc in module src.diametrics.metrics:

average_glc(df)
    Calculate the average glucose reading from the 'glc' column in the DataFrame.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings.
    
    Returns:
        float: The average glucose reading.
    
    Note:
        - The function uses the 'mean' method from pandas.DataFrame to calculate the average.
        - It returns the average glucose reading as a float value.



In [8]:
metrics.average_glc(libre1_transformed)

{'Average glucose (mg/dL)': 126.06721433905899}

In [9]:
metrics.average_glc(dexcom_data)

| | ID | Average glucose (mmol/L) |
|---|---|---|
| 0 | dexcom_eur_01 | 9.151752 |
| 1 | dexcom_eur_02 | 7.866179 |
| 2 | dexcom_eur_03 | 8.212228 |


### 3.1.3. Glycemic variability
Standard deviation (SD) is the standard deviation of all the glucose readings, once again calculated with a pandas function.

Coefficient of variation (CV) is 100 * SD / average glucose.
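
As a minimal sketch of these two definitions, the snippet below uses only the standard library. The function name `sd_and_cv` is hypothetical, and the sketch assumes the sample standard deviation (n - 1 in the denominator), which is the default of the pandas `std` method.

```python
import statistics


def sd_and_cv(readings):
    """Return (SD, CV %) for a list of glucose readings (hypothetical helper)."""
    mean = statistics.fmean(readings)
    sd = statistics.stdev(readings)  # sample SD, assumed to match pandas' default
    cv = 100 * sd / mean
    return sd, cv


sd, cv = sd_and_cv([5.0, 7.0, 9.0])
print(sd, cv)
```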

In [10]:
help(metrics.glycemic_variability)

Help on function glycemic_variability in module src.diametrics.metrics:

glycemic_variability(df)
    Calculate the glycemic variability metrics for glucose readings in the DataFrame.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings.
    
    Returns:
        dict: A dictionary containing the calculated glycemic variability metrics.
    
    Note:
        - The function uses the 'average_glc' function to calculate the average glucose reading.
        - It then calculates the standard deviation (SD) of glucose readings using the 'std' method from pandas.Series.
        - The coefficient of variation (CV) is calculated as (SD * 100 / average glucose).
        - The calculated SD and CV values are returned as a dictionary with corresponding labels.



In [11]:
metrics.glycemic_variability(dxcm2_transformed)

{'SD (mmol/L)': 3.0850394980568345, 'CV (%)': 39.21903424368059}

In [12]:
metrics.glycemic_variability(libre_data)

| | ID | SD (mg/dL) | CV (%) |
|---|---|---|---|
| 0 | libre_amer_01 | 36.505584 | 28.957239 |
| 1 | libre_amer_02 | 25.72951 | 20.255761 |


### 3.1.4. Time in range
By default, the percentage time in range is calculated for the 5 ranges specified in the international consensus. More ranges can be added in the ‘analysis options’ section of the dashboard.

The ranges are:
* Time in normal range (3.9-10.0 mmol/L)
* Time in level 1 hypoglycemia (3.0-3.9 mmol/L)
* Time in level 2 hypoglycemia (<3.0 mmol/L)
* Time in level 1 hyperglycemia (10.0-13.9 mmol/L)
* Time in level 2 hyperglycemia (>13.9 mmol/L)

The time in range is calculated as 100 * readings in the specified range / total number of readings.
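
The calculation can be sketched as follows for readings in mmol/L. The function name `tir_percentages` is hypothetical, and how readings that fall exactly on a boundary (3.0, 3.9, 10.0, 13.9) are assigned is an assumption of this sketch, not something the text above specifies.

```python
def tir_percentages(readings):
    """Percentage of readings in each consensus range, mmol/L (hypothetical helper)."""
    ranges = {
        "TIR normal (%)": lambda g: 3.9 <= g <= 10.0,
        "TIR level 1 hypoglycemia (%)": lambda g: 3.0 <= g < 3.9,
        "TIR level 2 hypoglycemia (%)": lambda g: g < 3.0,
        "TIR level 1 hyperglycemia (%)": lambda g: 10.0 < g <= 13.9,
        "TIR level 2 hyperglycemia (%)": lambda g: g > 13.9,
    }
    total = len(readings)
    return {
        label: round(100 * sum(1 for g in readings if in_range(g)) / total, 2)
        for label, in_range in ranges.items()
    }


tir = tir_percentages([2.8, 3.5, 5.0, 6.2, 7.1, 9.9, 10.5, 12.0, 14.2, 8.0])
print(tir)
```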


In [13]:
help(metrics.time_in_range)

Help on function time_in_range in module src.diametrics.metrics:

time_in_range(df)
    Helper function for time in range calculation with normal thresholds. Calculates the percentage of readings within
    each threshold by dividing the number of readings within range by total length of the series.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings.
    
    Returns:
        dict: A dictionary containing the percentages of readings within each threshold range.
    
    Note:
        - The function calculates the percentage of readings within different threshold ranges for time in range (TIR) analysis.
        - TIR normal represents the percentage of readings within the range [3.9, 10].
        - TIR normal 1 represents the percentage of readings within the range [3.9, 7.8].
        - TIR normal 2 represents the percentage of readings within the range [7.8, 10].
        - TIR level 1 hypoglycemia represents the percentage of rea

In [14]:
metrics.time_in_range(libre1_transformed)

{'TIR normal (%)': 90.44,
 'TIR normal 1 (%)': 67.66,
 'TIR normal 2 (%)': 22.78,
 'TIR level 1 hypoglycemia (%)': 0.3,
 'TIR level 2 hypoglycemia (%)': 0.0,
 'TIR level 1 hyperglycemia (%)': 9.26,
 'TIR level 2 hyperglycemia (%)': 0.0}

In [15]:
metrics.time_in_range(dexcom_data)

| | ID | TIR normal (%) | TIR normal 1 (%) | TIR normal 2 (%) | TIR level 1 hypoglycemia (%) | TIR level 2 hypoglycemia (%) | TIR level 1 hyperglycemia (%) | TIR level 2 hyperglycemia (%) |
|---|---|---|---|---|---|---|---|---|
| 0 | dexcom_eur_01 | 64.83 | 37.47 | 27.35 | 0.71 | 0.03 | 26.54 | 7.9 |
| 1 | dexcom_eur_02 | 67.01 | 46.49 | 20.53 | 5.9 | 1.7 | 22.67 | 2.72 |
| 2 | dexcom_eur_03 | 67.85 | 44.77 | 23.08 | 4.11 | 1.49 | 21.09 | 5.46 |


### 3.1.5. Area under the curve

Area under the curve (AUC) is calculated with a scikit-learn function that applies the trapezoidal rule (drawing a straight line between each pair of adjacent points and calculating the area underneath).

To get the data in the right format, the datetime needs to be rewritten as number of hours from the first reading (which is 0). E.g., for 15 minute Libre readings, the 1st reading would be 0, the next would be 0.25 etc.

The trapezoidal rule is then used to give the average hourly AUC in either mmol h/L or mg h/dL.
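
The sketch below illustrates the two steps described above: converting timestamps to hours since the first reading, then applying the trapezoidal rule. It is a simplification that divides the total area by the elapsed hours, whereas the library's docstring describes averaging per-hour AUC values, so results may differ slightly. The function name `hourly_average_auc` is hypothetical.

```python
from datetime import datetime, timedelta


def hourly_average_auc(times, glucose):
    """Trapezoidal AUC divided by elapsed hours (hypothetical, simplified helper)."""
    t0 = times[0]
    # Rewrite each timestamp as hours from the first reading (0, 0.25, 0.5, ...)
    hours = [(t - t0).total_seconds() / 3600 for t in times]
    area = sum(
        (hours[i + 1] - hours[i]) * (glucose[i] + glucose[i + 1]) / 2
        for i in range(len(hours) - 1)
    )
    return area / (hours[-1] - hours[0])


# Five 15-minute Libre-style readings covering one hour
first = datetime(2021, 3, 20, 17, 0)
times = [first + timedelta(minutes=15 * i) for i in range(5)]
auc_value = hourly_average_auc(times, [6.0, 7.0, 8.0, 7.0, 6.0])
print(auc_value)
```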


In [16]:
help(metrics.auc)

Help on function auc in module src.diametrics.metrics:

auc(df)
    Calculate the area under the curve (AUC) for glucose readings in the DataFrame.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings and a 'time' column with timestamps.
    
    Returns:
        dict: A dictionary containing the hourly average AUC, daily AUC breakdown, and hourly AUC breakdown.
    
    Note:
        - The function calculates the AUC by breaking down the DataFrame into hourly and daily intervals.
        - It uses the 'calculate_auc' function to calculate the AUC for each group.
        - The hourly AUC breakdown is a DataFrame with columns 'date', 'hour', and 'auc'.
        - The daily AUC breakdown is a Series with dates as the index and average AUC values as the values.
        - The hourly average AUC is the mean of the AUC values in the hourly breakdown.



In [17]:
metrics.auc(dxcm2_transformed)

{'AUC (mmol h/L)': 7.104192692307693}

In [18]:
metrics.auc(dexcom_data)

| | ID | AUC (mmol h/L) |
|---|---|---|
| 0 | dexcom_eur_01 | 8.291685 |
| 1 | dexcom_eur_02 | 7.104193 |
| 2 | dexcom_eur_03 | 7.369449 |


In [19]:
#metrics.auc(libre_data)

### 3.1.6. eA1c
eA1c is calculated as (avg. glucose + 2.59) / 1.59 for mmol/L and (avg. glucose + 46.7) / 28.7 for mg/dL.
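
Both formulas can be written as one small function. The name `ea1c_from_average` and its `units` argument are hypothetical; the library detects units itself. The example values are the average glucose figures reported earlier in this notebook for `libre1_transformed` (mg/dL) and `dexcom_eur_02` (mmol/L).

```python
def ea1c_from_average(avg_glucose, units="mmol/L"):
    """Estimated HbA1c (%) from average glucose (hypothetical helper)."""
    if units == "mmol/L":
        return (avg_glucose + 2.59) / 1.59
    return (avg_glucose + 46.7) / 28.7  # mg/dL


libre_ea1c = ea1c_from_average(126.06721433905899, units="mg/dL")
dexcom_ea1c = ea1c_from_average(7.866179, units="mmol/L")
print(round(libre_ea1c, 6), round(dexcom_ea1c, 6))
```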

In [20]:
help(metrics.ea1c)

Help on function ea1c in module src.diametrics.metrics:

ea1c(df)
    Calculate estimated average HbA1c (eA1c) based on the average glucose readings in the DataFrame.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings.
    
    Returns:
        float: The estimated average HbA1c (eA1c) value.
    
    Note:
        - The function calculates the average glucose reading from the 'glc' column in the DataFrame.
        - It determines the units of the glucose readings using the 'detect_units' function from the 'preprocessing' module.
        - If the units are 'mmol/l', the eA1c is calculated using the formula: (average glucose + 2.59) / 1.59.
        - If the units are not 'mmol/l', the eA1c is calculated using the formula: (average glucose + 46.7) / 28.7.
        - The calculated eA1c value is returned as a float.



In [21]:
metrics.ea1c(libre1_transformed)

{'eA1c (%)': 6.019763565820871}

In [22]:
metrics.ea1c(dexcom_data)

| | ID | eA1c (%) |
|---|---|---|
| 0 | dexcom_eur_01 | 7.38475 |
| 1 | dexcom_eur_02 | 6.576213 |
| 2 | dexcom_eur_03 | 6.793854 |


### 3.1.7. Hypoglycemic and hyperglycemic episodes
Hypoglycemic and hyperglycemic events are defined as 15 minutes or more below or above the relevant thresholds, respectively.

Since this metric depends on the progression of time, it is more strongly affected by missing data and by discrepancies in recording interval. The international consensus also gives little detail on how this metric should be calculated.

Hypoglycemic events are calculated as follows. Hyperglycemic events are calculated in the same way, but working above the thresholds rather than below.
* Identify all times the glucose dips below 3.9 mmol/L for at least 15 mins (2 readings for FreeStyle Libre, 4 for Dexcom and Medtronic)
* If there is another dip below the threshold within 15 mins of the glucose coming back up, it is part of the same event
* If 15 consecutive minutes of readings go below the level 2 hypoglycemia threshold (3.0 mmol/L), the event is considered a level 2 event. If not, it is a level 1 event
* The event is only over once the glucose has risen above the threshold for 15 mins
* The start time and end time of each episode are used to calculate the total time spent in hypoglycemia and the average length of an event
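
The rules above can be sketched as a small state machine. This is a simplified illustration, not the library's implementation: it assumes complete, time-ordered readings given as `(minutes, glucose)` pairs, ignores the level 2 classification and missing data, and takes the last reading below the threshold as the end of the episode. The function name `find_hypo_episodes` is hypothetical.

```python
def find_hypo_episodes(readings, threshold=3.9, mins=15):
    """Return (start, end) minutes of hypoglycemic episodes (hypothetical sketch)."""
    episodes = []
    start = run_start = last_below = above_since = None
    qualified = False  # True once a run of >= `mins` below threshold is seen
    for t, glc in readings:
        if glc < threshold:
            if start is None:
                start = t
            if run_start is None:
                run_start = t
            last_below = t
            above_since = None  # a new dip keeps the same event open
            if t - run_start >= mins:
                qualified = True
        elif start is not None:
            run_start = None
            if above_since is None:
                above_since = t
            if t - above_since >= mins:  # above threshold for 15 mins: event over
                if qualified:
                    episodes.append((start, last_below))
                start = last_below = above_since = None
                qualified = False
    if start is not None and qualified:
        episodes.append((start, last_below))
    return episodes


# 5-minute readings: a 15-minute dip, a brief recovery, then another dip
trace = [(0, 5.0), (5, 3.8), (10, 3.7), (15, 3.6), (20, 3.5), (25, 4.2),
         (30, 3.8), (35, 4.5), (40, 4.6), (45, 4.7), (50, 4.8), (55, 5.0)]
episodes = find_hypo_episodes(trace)
print(episodes)
```

In this trace the second dip at minute 30 falls within 15 minutes of the recovery at minute 25, so it is merged into the same event.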

Things get more complicated if there is missing data. The following assumptions are made in order to calculate the events, and they may result in very high time spent in events and very long average event lengths.
* If the glucose readings drop out during an event and the first reading after the dropout is also in hypoglycemia, this is considered part of the same episode. This can produce very long episodes if there is a lot of missing data and the trace cuts out and back in while in hypoglycemia
* However, if the first reading after the dropout is not below the hypoglycemia threshold, the episode is considered to have ended with the last glucose reading before the dropout
* If there are, for example, 10 mins of readings in hypoglycemia, then the readings cut out for 15 mins, then come back with 10 mins of hypoglycemia, this is not considered an episode because there aren’t 15 mins of consecutive readings

In [23]:
help(metrics.glycemic_episodes)

Help on function glycemic_episodes in module src.diametrics.metrics:

glycemic_episodes(df, hypo_lv1_thresh=None, hypo_lv2_thresh=None, hyper_lv1_thresh=None, hyper_lv2_thresh=None, mins=15, long_mins=120)
    Calculate the statistics of glycemic episodes (hypoglycemic and hyperglycemic events) based on glucose readings.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings and a 'time' column with timestamps.
        hypo_lv1_thresh (float, optional): Level 1 hypoglycemic threshold. If not provided, it will be determined based on the units detected. Default is None.
        hypo_lv2_thresh (float, optional): Level 2 hypoglycemic threshold. If not provided, it will be determined based on the units detected. Default is None.
        hyper_lv1_thresh (float, optional): Level 1 hyperglycemic threshold. If not provided, it will be determined based on the units detected. Default is None.
        hyper_lv2_thresh (float, optional): Level 

In [24]:
metrics.glycemic_episodes(libre1_transformed)

{'Total number hypoglycemic events': 1,
 'Number LV1 hypoglycemic events': 1,
 'Number LV2 hypoglycemic events': 0,
 'Number prolonged hypoglycemic events': 0,
 'Avg. length of hypoglycemic events': '0 days 01:00:00',
 'Total time spent in hypoglycemic events': '0 days 01:00:00',
 'Total number hyperglycemic events': 16,
 'Number LV1 hyperglycemic events': 16,
 'Number LV2 hyperglycemic events': 0,
 'Number prolonged hyperglycemic events': 0,
 'Avg. length of hyperglycemic events': '0 days 01:56:15',
 'Total time spent in hyperglycemic events': '1 days 07:00:00'}

In [25]:
metrics.glycemic_episodes(dexcom_data)

| | ID | Total number hypoglycemic events | Number LV1 hypoglycemic events | Number LV2 hypoglycemic events | Number prolonged hypoglycemic events | Avg. length of hypoglycemic events | Total time spent in hypoglycemic events | Total number hyperglycemic events | Number LV1 hyperglycemic events | Number LV2 hyperglycemic events | Number prolonged hyperglycemic events | Avg. length of hyperglycemic events | Total time spent in hyperglycemic events |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | dexcom_eur_01 | 4 | 4 | 0 | 0 | 0 days 00:31:16 | 0 days 02:05:02 | 33 | 20 | 13 | 5 | 0 days 03:30:37 | 4 days 19:50:13 |
| 1 | dexcom_eur_02 | 26 | 17 | 9 | 0 | 0 days 01:00:58 | 1 days 02:25:06 | 25 | 19 | 6 | 1 | 0 days 03:20:48 | 3 days 11:40:04 |
| 2 | dexcom_eur_03 | 16 | 9 | 7 | 0 | 0 days 01:02:11 | 0 days 16:35:03 | 40 | 28 | 12 | 3 | 0 days 02:11:45 | 3 days 15:49:59 |


### 3.1.8. LBGI and HBGI

Blood glucose (BG) readings are transformed first, using one of the following two formulas, where ln is the natural logarithm:

    mmol/L: Transformed BG = 1.794 * ((ln(BG))^1.026 - 1.861)

    mg/dL: Transformed BG = 1.509 * ((ln(BG))^1.084 - 5.381)

This makes the transformed BG symmetric around zero, ranging from -√10 to √10 (about -3.16 to 3.16). Then, a risk value is assigned to each BG reading as follows.

For the low blood glucose index (LBGI):

    If Transformed BG is < 0, Risk(BG) = 10 * (Transformed BG)^2, otherwise Risk(BG) = 0

LBGI is the mean of these risk values.

For the high blood glucose index (HBGI):

    If Transformed BG is > 0, Risk(BG) = 10 * (Transformed BG)^2, otherwise Risk(BG) = 0

HBGI is the mean of these risk values.

The paper can be found [here](https://diabetesjournals.org/care/article/21/11/1870/23103/Assessment-of-risk-for-severe-hypoglycemia-among) 
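
The transformation and risk functions can be sketched as below. The function name `bgi_sketch` and its `units` argument are hypothetical, the logarithm is taken to be the natural log, and the sketch assumes every reading is greater than 1 in its own units so that the fractional power is defined.

```python
import math


def bgi_sketch(readings, units="mmol/L"):
    """Return (LBGI, HBGI) for a list of glucose readings (hypothetical helper)."""
    if units == "mmol/L":
        scale, power, offset = 1.794, 1.026, 1.861
    else:  # mg/dL
        scale, power, offset = 1.509, 1.084, 5.381
    low_risk, high_risk = [], []
    for bg in readings:
        transformed = scale * (math.log(bg) ** power - offset)
        risk = 10 * transformed ** 2
        # Each reading contributes to the low side or the high side, never both
        low_risk.append(risk if transformed < 0 else 0.0)
        high_risk.append(risk if transformed > 0 else 0.0)
    n = len(readings)
    return sum(low_risk) / n, sum(high_risk) / n


lbgi, hbgi = bgi_sketch([3.0, 6.0, 12.0])
print(lbgi, hbgi)
```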


In [26]:
help(metrics.bgi)

Help on function bgi in module src.diametrics.metrics:

bgi(df)
    Calculate the Blood Glucose Index (BGI) metrics for a DataFrame of glucose readings.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings.
    
    Returns:
        dict: A dictionary containing the Low Blood Glucose Index (LBGI) and High Blood Glucose Index (HBGI) values.
    
    Note:
        - The function calculates the LBGI and HBGI based on the glucose readings and detects the units of measurement.
        - The LBGI and HBGI are average values calculated from individual readings using the 'lbgi' and 'hbgi' functions.



In [27]:
metrics.bgi(dxcm2_transformed)

{'LBGI': 1.7291700433451427, 'HBGI': 4.962552708817989}

In [28]:
metrics.bgi(libre_data)

| | ID | LBGI | HBGI |
|---|---|---|---|
| 0 | libre_amer_01 | 0.675633 | 2.203471 |
| 1 | libre_amer_02 | 0.138239 | 1.498385 |


### 3.1.9. Mean amplitude of glycemic excursions (MAGE)
The mean amplitude of glycemic excursions (MAGE) is calculated using SciPy’s `signal` module. The peaks and troughs with a prominence greater than the standard deviation are selected.

The differences between the peaks and troughs are then calculated separately for the positive and the negative glucose differences. The final MAGE is the mean of the positive MAGE and the absolute value of the negative MAGE.
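
A minimal sketch of this approach is shown below. It requires NumPy and SciPy and uses `scipy.signal.find_peaks` with its `prominence` argument. The function name `mage_sketch` is hypothetical, and the way turning points are paired (differences between consecutive peaks and troughs in time order) is an assumption of this sketch rather than a confirmed detail of the library.

```python
import numpy as np
from scipy.signal import find_peaks


def mage_sketch(glucose):
    """Mean amplitude of glycemic excursions (hypothetical, simplified helper)."""
    glc = np.asarray(glucose, dtype=float)
    sd = glc.std(ddof=1)
    # Keep only peaks and troughs whose prominence is at least one SD
    peaks, _ = find_peaks(glc, prominence=sd)
    troughs, _ = find_peaks(-glc, prominence=sd)
    turning = np.sort(np.concatenate([peaks, troughs]))
    diffs = np.diff(glc[turning])
    rises = diffs[diffs > 0]    # trough-to-peak excursions
    falls = -diffs[diffs < 0]   # peak-to-trough excursions, made positive
    if rises.size == 0 or falls.size == 0:
        return float("nan")
    return float((rises.mean() + falls.mean()) / 2)


mage_value = mage_sketch([5, 6, 10, 6, 2, 6, 10, 6, 2, 6, 5])
print(mage_value)
```

In this example the small bump near the end of the trace has a prominence below one SD, so it is not counted as an excursion.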


In [29]:
help(metrics.mage)

Help on function mage in module src.diametrics.metrics:

mage(df)
    Calculate the mean amplitude of glycemic excursions (MAGE) using scipy's signal class.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings and a 'time' column with timestamps.
    
    Returns:
        dict: A dictionary containing the MAGE value.
    
    Note:
        - The function uses scipy's signal.find_peaks function to find peaks and troughs in the glucose readings.
        - It then calculates the positive and negative MAGE and returns their mean.



In [30]:
metrics.mage(dxcm2_transformed)

{'MAGE (mmol/L)': 7.927027027027027}

In [31]:
metrics.mage(libre_data)

| | ID | MAGE (mg/dL) |
|---|---|---|
| 0 | libre_amer_01 | 93.276316 |
| 1 | libre_amer_02 | 41.27085 |


### 3.1.10. Calculate percentiles

This function returns the percentiles used in the ambulatory glucose profile (AGP): the 10th, 25th, 50th, 75th and 90th percentiles. The output also includes the minimum and maximum glucose readings.
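
For illustration, the percentile calculation can be sketched in plain Python using linear interpolation between the two nearest ranks, which is the default behaviour of `np.percentile`. The helper names `percentile` and `agp_percentiles` are hypothetical.

```python
def percentile(values, q):
    """q-th percentile with linear interpolation (hypothetical helper)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def agp_percentiles(values):
    """Minimum, AGP percentiles and maximum of the readings."""
    result = {"Min. glucose": min(values)}
    for q in (10, 25, 50, 75, 90):
        result[f"{q}th percentile"] = percentile(values, q)
    result["Max. glucose"] = max(values)
    return result


summary = agp_percentiles([float(v) for v in range(1, 12)])
print(summary)
```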

In [32]:
help(metrics.percentiles)

Help on function percentiles in module src.diametrics.metrics:

percentiles(df)
    Calculate various percentiles of glucose readings in the DataFrame.
    
    Args:
        df (pandas.DataFrame): The DataFrame containing a 'glc' column with glucose readings.
    
    Returns:
        dict: A dictionary containing the calculated percentiles of glucose readings.
    
    Note:
        - The function uses the numpy function np.percentile to calculate the specified percentiles.
        - The percentiles calculated are: 0th, 10th, 25th, 50th (median), 75th, 90th, and 100th.
        - The values are returned as a dictionary with keys representing the percentile labels and values representing the corresponding percentile values.



In [33]:
metrics.percentiles(dxcm2_transformed)

{'Min. glucose': 2.2,
 '10th percentile': 4.2,
 '25th percentile': 5.4,
 '50th percentile': 7.3,
 '75th percentile': 10.2,
 '90th percentile': 12.0,
 'Max. glucose': 22.1}

In [34]:
metrics.percentiles(dexcom_data)

| | ID | Min. glucose | 10th percentile | 25th percentile | 50th percentile | 75th percentile | 90th percentile | Max. glucose |
|---|---|---|---|---|---|---|---|---|
| 0 | dexcom_eur_01 | 2.4 | 5.7 | 6.9 | 8.5 | 11.0 | 13.2 | 22.1 |
| 1 | dexcom_eur_02 | 2.2 | 4.2 | 5.4 | 7.3 | 10.2 | 12.0 | 22.1 |
| 2 | dexcom_eur_03 | 2.2 | 4.3 | 5.8 | 7.7 | 10.3 | 12.71 | 19.9 |


## 3.2. All standard metrics

In [35]:
help(metrics.all_standard_metrics)

Help on function all_standard_metrics in module src.diametrics.metrics:

all_standard_metrics(df, return_df=True, lv1_hypo=None, lv2_hypo=None, lv1_hyper=None, lv2_hyper=None, additional_tirs=None, event_mins=15, event_long_mins=120)
    Calculate standard metrics of glycemic control for glucose data.
    
    Args:
        df (DataFrame): Input DataFrame containing glucose data.
        return_df (bool, optional): Flag indicating whether to return the results as a DataFrame. Defaults to True.
        lv1_hypo (float, optional): Level 1 hypoglycemia threshold. Defaults to None.
        lv2_hypo (float, optional): Level 2 hypoglycemia threshold. Defaults to None.
        lv1_hyper (float, optional): Level 1 hyperglycemia threshold. Defaults to None.
        lv2_hyper (float, optional): Level 2 hyperglycemia threshold. Defaults to None.
        additional_tirs (list, optional): Additional time in range thresholds. Defaults to None.
        event_mins (int, optional): Duration in minutes 

In [36]:
metrics.all_standard_metrics(libre1_transformed)

{'Start DateTime': '2021-03-20 17:38:00',
 'End DateTime': '2021-04-03 16:08:00',
 'Data Sufficiency (%)': 100,
 'Days': '13 days 22:30:00',
 'Average glucose (mg/dL)': 126.06721433905899,
 'eA1c (%)': 6.019763565820871,
 'SD (mg/dL)': 36.505583973110895,
 'CV (%)': 28.95723853699882,
 'AUC (mg h/dL)': 94.38283582089552,
 'LBGI': 0.6756332098838412,
 'HBGI': 2.203470544756186,
 'MAGE (mg/dL)': 93.27631578947368,
 'TIR normal (%)': 90.44,
 'TIR normal 1 (%)': 67.66,
 'TIR normal 2 (%)': 22.78,
 'TIR level 1 hypoglycemia (%)': 0.3,
 'TIR level 2 hypoglycemia (%)': 0.0,
 'TIR level 1 hyperglycemia (%)': 9.26,
 'TIR level 2 hyperglycemia (%)': 0.0,
 'Total number hypoglycemic events': 1,
 'Number LV1 hypoglycemic events': 1,
 'Number LV2 hypoglycemic events': 0,
 'Number prolonged hypoglycemic events': 0,
 'Avg. length of hypoglycemic events': '0 days 01:00:00',
 'Total time spent in hypoglycemic events': '0 days 01:00:00',
 'Total number hyperglycemic events': 16,
 'Number LV1 hyperglycem

In [37]:
metrics.all_standard_metrics(dexcom_data)

| | ID | Start DateTime | End DateTime | Data Sufficiency (%) | Days | Average glucose (mmol/L) | eA1c (%) | SD (mmol/L) | CV (%) | AUC (mmol h/L) | ... | Number LV2 hypoglycemic events | Number prolonged hypoglycemic events | Avg. length of hypoglycemic events | Total time spent in hypoglycemic events | Total number hyperglycemic events | Number LV1 hyperglycemic events | Number LV2 hyperglycemic events | Number prolonged hyperglycemic events | Avg. length of hyperglycemic events | Total time spent in hyperglycemic events |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | dexcom_eur_01 | 2023-03-08 00:04:00 | 2023-03-21 15:29:00 | 97.3 | 13 days 15:25:00 | 9.151752 | 7.38475 | 3.121628 | 34.109626 | 8.291685 | ... | 0 | 0 | 0 days 00:31:16 | 0 days 02:05:02 | 33 | 20 | 13 | 5 | 0 days 03:30:37 | 4 days 19:50:13 |
| 1 | dexcom_eur_02 | 2023-03-08 00:01:00 | 2023-03-21 15:31:00 | 97.4 | 13 days 15:30:00 | 7.866179 | 6.576213 | 3.085039 | 39.219034 | 7.104193 | ... | 9 | 0 | 0 days 01:00:58 | 1 days 02:25:06 | 25 | 19 | 6 | 1 | 0 days 03:20:48 | 3 days 11:40:04 |
| 2 | dexcom_eur_03 | 2023-03-08 00:01:00 | 2023-03-21 15:40:00 | 95.9 | 13 days 15:39:00 | 8.212228 | 6.793854 | 3.256385 | 39.652884 | 7.369449 | ... | 7 | 0 | 0 days 01:02:11 | 0 days 16:35:03 | 40 | 28 | 12 | 3 | 0 days 02:11:45 | 3 days 15:49:59 |
