USER GUIDE

RunsData edited this page Dec 14, 2020 · 5 revisions

USER GUIDE for the Python Package wearablecompute.

wearablecompute is a package for Feature Engineering of Wearable Sensors.

Installation

wearablecompute requires the pandas, numpy, and datetime packages.

Recommended: Install via pip:

$ pip install wearablecompute

Install via git:

$ pip install git+https://github.com/brinnaebent/wearablecompute.git
$ git clone

Metrics computed:

We have developed 50+ data- and domain-driven engineered features that are currently in use in digital biomarker discovery projects. Metrics currently available in wearablecompute are listed below:

  • Mean Heart Rate Variability
  • Median Heart Rate Variability
  • Maximum Heart Rate Variability
  • Minimum Heart Rate Variability
  • SDNN (HRV)
  • RMSSD (HRV)
  • NNx (HRV)
  • pNNx (HRV)
  • HRV Frequency Domain Metrics:
      • PowerVLF
      • PowerLF
      • PowerHF
      • PowerTotal
      • LF/HF
      • PeakVLF
      • PeakLF
      • PeakHF
      • FractionLF
      • FractionHF
  • EDA Peaks
  • Activity Bouts
  • Interday Summary:
      • Interday Mean
      • Interday Median
      • Interday Maximum
      • Interday Minimum
      • Interday Quartile 1
      • Interday Quartile 3
      • Interday Standard Deviation
  • Interday Coefficient of Variation
  • Intraday Standard Deviation (mean, median, standard deviation)
  • Intraday Coefficient of Variation (mean, median, standard deviation)
  • Intraday Mean (mean, median, standard deviation)
  • Daily Mean
  • Intraday Summary:
      • Intraday Mean
      • Intraday Median
      • Intraday Minimum
      • Intraday Maximum
      • Intraday Quartile 1
      • Intraday Quartile 3
  • TIR (Time in Range of default 1 SD)
  • TOR (Time outside Range of default 1 SD)
  • POR (Percent outside Range of default 1 SD)
  • MASE (Mean Amplitude of Sensor Excursions, default 1 SD)
  • Hours from Midnight (circadian rhythm feature)
  • Minutes from Midnight (circadian rhythm feature)

Functions:

The functions listed below calculate the metrics listed above. Each entry gives the function's description, arguments, and return values. If your question is not answered below, please create an Issue describing it.

e4import()
Imports a compiled Empatica file (**this is not raw Empatica data**)
         Args:
             filepath (String): path to file
             sensortype (String): Options: 'EDA', 'HR', 'ACC', 'TEMP', 'BVP'
             Starttime (String): (optional, default arg = 'NaN') format '%Y-%m-%d %H:%M:%S.%f', if you want to only look at data after a specific time
             Endtime (String): (optional, default arg = 'NaN') format '%Y-%m-%d %H:%M:%S.%f', if you want to only look at data before a specific time
             window (String): default '5min'; this is the window your data will be resampled on.
         Returns:
             (pd.DataFrame): dataframe of data with Time, Mean, Std columns
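The `window` argument describes a resampling step. As a rough illustration of what resampling to a 5-minute window with Mean and Std columns looks like in plain pandas, here is a minimal sketch. It does not call `e4import()`; the sample data and the `Value` column name are hypothetical, and the package's internal handling may differ.

```python
import pandas as pd

# Hypothetical raw samples: one reading per minute for 10 minutes.
times = pd.date_range("2020-12-14 00:00:00", periods=10, freq="1min")
raw = pd.DataFrame({"Time": times, "Value": range(10)})

window = "5min"  # same string format as the documented default

# Aggregate each window into its mean and standard deviation.
resampled = (
    raw.set_index("Time")["Value"]
    .resample(window)
    .agg(["mean", "std"])
    .rename(columns={"mean": "Mean", "std": "Std"})
    .reset_index()
)
print(resampled)
```

Each row of `resampled` summarises one 5-minute window, giving a Time, Mean, Std layout like the one described in the Returns section.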
HRV()
Computes Heart Rate Variability metrics
        Args:
            time (pandas.DataFrame column or pandas series): time column
            IBI (pandas.DataFrame column or pandas series): column with inter beat intervals
            ibimultiplier (IntegerType): default = 1000; transforms IBI to milliseconds. If data is already in ms, set as 1
        Returns:
            maxHRV (FloatType): maximum HRV
            minHRV (FloatType): minimum HRV
            meanHRV (FloatType): mean HRV
            medianHRV (FloatType): median HRV
SDNN()
Computes Heart Rate Variability metric SDNN
        Args:
            time (pandas.DataFrame column or pandas series): time column
            IBI (pandas.DataFrame column or pandas series): column with inter beat intervals
            ibimultiplier (IntegerType): default = 1000; transforms IBI to milliseconds. If data is already in ms, set as 1
        Returns:
            SDNN (FloatType): standard deviation of NN intervals 
RMSSD()
Computes Heart Rate Variability metric RMSSD
        Args:
            time (pandas.DataFrame column or pandas series): time column
            IBI (pandas.DataFrame column or pandas series): column with inter beat intervals
            ibimultiplier (IntegerType): default = 1000; transforms IBI to milliseconds. If data is already in ms, set as 1
        Returns:
            RMSSD (FloatType): root mean square of successive differences
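To make the SDNN and RMSSD definitions concrete, here is a minimal standalone sketch using only the Python standard library. It is not the package's implementation: the helper name `sdnn_rmssd` is hypothetical, and the use of the sample standard deviation for SDNN is an assumption.

```python
import math
import statistics


def sdnn_rmssd(ibi_seconds, ibimultiplier=1000):
    """Return (SDNN, RMSSD) in ms for a sequence of inter-beat intervals."""
    # Convert to milliseconds, mirroring the documented ibimultiplier argument.
    ibi_ms = [value * ibimultiplier for value in ibi_seconds]
    # SDNN: standard deviation of the intervals (sample SD is an assumption).
    sdnn = statistics.stdev(ibi_ms)
    # RMSSD: root mean square of the differences between successive intervals.
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return sdnn, rmssd


ibi = [0.80, 0.82, 0.78, 0.85, 0.81]  # hypothetical IBI values in seconds
sdnn, rmssd = sdnn_rmssd(ibi)
print(round(sdnn, 2), round(rmssd, 2))
```

If your IBI values are already in milliseconds, pass `ibimultiplier=1`, as with the package functions.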
NNx()
Computes Heart Rate Variability metrics NNx and pNNx
        Args:
            time (pandas.DataFrame column or pandas series): time column
            IBI (pandas.DataFrame column or pandas series): column with inter beat intervals
            ibimultiplier (IntegerType): default = 1000; transforms IBI to milliseconds. If data is already in ms, set as 1
            x (IntegerType): default = 50; threshold in ms that the difference between successive heartbeat intervals must exceed to be counted
        Returns:
            NNx (FloatType): the number of times successive heartbeat intervals differ by more than x ms
            pNNx (FloatType): the proportion of NNx divided by the total number of NN (R-R) intervals. 
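The counting behind NNx and pNNx can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: the helper name `nnx_pnnx` is hypothetical, and pNNx is divided by the total number of intervals as the description above says (some definitions divide by the number of successive differences instead).

```python
def nnx_pnnx(ibi_seconds, ibimultiplier=1000, x=50):
    """Return (NNx, pNNx) for a sequence of inter-beat intervals."""
    ibi_ms = [value * ibimultiplier for value in ibi_seconds]
    # Absolute differences between successive intervals, in ms.
    diffs = [abs(b - a) for a, b in zip(ibi_ms, ibi_ms[1:])]
    # NNx: how many successive differences exceed the threshold x.
    nnx = sum(1 for d in diffs if d > x)
    # pNNx: NNx as a proportion of the total number of intervals (assumption).
    pnnx = nnx / len(ibi_ms)
    return nnx, pnnx


ibi = [0.80, 0.86, 0.79, 0.80, 0.81]  # hypothetical IBI values in seconds
nnx, pnnx = nnx_pnnx(ibi)
print(nnx, pnnx)
```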
FrequencyHRV()
Computes Heart Rate Variability frequency domain metrics
        Args:
            IBI (pandas.DataFrame column or pandas series): column with inter beat intervals
            ibimultiplier (IntegerType): default = 1000; transforms IBI to milliseconds. If data is already in ms, set as 1
            fs (IntegerType): Optional sampling frequency for frequency interpolation (default=1)
        Returns:
            (dictionary): dictionary of frequency domain HRV metrics with keys:
                PowerVLF (FloatType): Power of the Very Low Frequency (VLF): 0-0.04Hz band
                PowerLF (FloatType): Power of the Low Frequency (LF): 0.04-0.15Hz band
                PowerHF (FloatType): Power of the High Frequency (HF): 0.15-0.4Hz band
                PowerTotal (FloatType): Total power over all frequency bands
                LF/HF (FloatType): Ratio of low and high power
                Peak VLF (FloatType): Peak of the Very Low Frequency (VLF): 0-0.04Hz band
                Peak LF (FloatType): Peak of the Low Frequency (LF): 0.04-0.15Hz band
                Peak HF (FloatType): Peak of the High Frequency (HF): 0.15-0.4Hz band
                FractionLF (FloatType): Fraction that is low frequency
                FractionHF (FloatType): Fraction that is high frequency
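For intuition about the band powers, the sketch below interpolates the IBI series onto an even time grid at `fs` Hz and sums a plain periodogram over the VLF, LF, and HF bands listed above. It is an assumption-laden illustration only: the helper name `band_powers` is hypothetical, the package may use a different spectral estimator, and the scaling is simplified (one-sided doubling is omitted), so treat the values as relative rather than absolute.

```python
import numpy as np


def band_powers(ibi_seconds, fs=1.0):
    """Relative VLF/LF/HF power from a simple periodogram (illustrative only)."""
    ibi = np.asarray(ibi_seconds, dtype=float)
    beat_times = np.cumsum(ibi)  # time of each beat, in seconds
    # Resample the unevenly spaced IBI series onto an even grid at fs Hz.
    grid = np.arange(beat_times[0], beat_times[-1], 1.0 / fs)
    even = np.interp(grid, beat_times, ibi * 1000.0)  # IBI in ms
    even = even - even.mean()  # remove the DC component
    freqs = np.fft.rfftfreq(even.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(even)) ** 2 / (fs * even.size)
    df = freqs[1] - freqs[0]

    def power(lo, hi):
        mask = (freqs >= lo) & (freqs < hi)
        return float(psd[mask].sum() * df)

    vlf, lf, hf = power(0.0, 0.04), power(0.04, 0.15), power(0.15, 0.4)
    return {
        "PowerVLF": vlf,
        "PowerLF": lf,
        "PowerHF": hf,
        "PowerTotal": vlf + lf + hf,
        "LF/HF": lf / hf if hf > 0 else float("nan"),
    }


# Synthetic IBI series whose length oscillates at 0.1 Hz (inside the LF band).
t, ibi = 0.0, []
for _ in range(300):
    interval = 0.8 + 0.05 * np.sin(2 * np.pi * 0.1 * t)
    ibi.append(interval)
    t += interval

bands = band_powers(ibi)
print(bands)
```

Because the synthetic oscillation sits at 0.1 Hz, most of the power lands in the LF band, which is a quick sanity check on the band edges.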
PeaksEDA()
Calculates peaks in the EDA signal
        Args:
            eda (pandas.DataFrame column or pandas series): eda column
            time (pandas.DataFrame column or pandas series): time column
        Returns:
            countpeaks (IntegerType): the number of peaks total 
            peakdf (pandas.DataFrame): a pandas dataframe with time and peaks to easily integrate with your data workflow
exercisepts()
Calculates activity bouts using accelerometry and heart rate
        Args:
            acc (pandas.DataFrame column or pandas series): accelerometry column
            hr (pandas.DataFrame column or pandas series): heart rate column
            time (pandas.DataFrame column or pandas series): time column
        Returns:
            countbouts (IntegerType): the number of activity bouts total
            returndf (pandas.DataFrame): a pandas dataframe with time and activity bouts (designated as a '1') to easily integrate with your data workflow
interdaycv()
Computes the interday coefficient of variation on pandas dataframe Sensor column
        Args:
            column (pandas.DataFrame column or pandas series): column that you want to calculate over
        Returns:
            cvx (IntegerType): interday coefficient of variation 
interdaysd()
Computes the interday standard deviation of pandas dataframe Sensor column
        Args:
            column (pandas.DataFrame column or pandas series): column that you want to calculate over
        Returns:
            interdaysd (IntegerType): interday standard deviation 
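As a quick illustration of the two interday metrics, the sketch below computes a standard deviation and a coefficient of variation over a whole column of values. It is not the package's code: expressing the CV as a percentage and using the population standard deviation are both assumptions, and the helper names are hypothetical.

```python
import math
import statistics


def interday_sd_sketch(values):
    # Standard deviation over all values (population SD is an assumption).
    return statistics.pstdev(values)


def interday_cv_sketch(values):
    # Coefficient of variation: SD relative to the mean, as a percentage (assumption).
    return statistics.pstdev(values) / statistics.mean(values) * 100


sensor = [60, 70, 80]  # hypothetical sensor readings
sd_value = interday_sd_sketch(sensor)
cv_value = interday_cv_sketch(sensor)
print(round(sd_value, 3), round(cv_value, 3))
```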
intradaycv()
Computes the intraday coefficient of variation of the Sensor column in a pandas dataframe and returns its mean, median, and SD
        Args:
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             time (pandas.DataFrame): time column
             timeformat (String): default = '%Y-%m-%d %H:%M:%S.%f'; format of timestamp in time column
        Returns:
            intradaycv_mean (IntegerType): Mean of intraday coefficient of variation 
            intradaycv_median (IntegerType): Median of intraday coefficient of variation 
            intradaycv_sd (IntegerType): SD of intraday coefficient of variation 
        Requires:
            interdaycv() function
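The intraday metrics follow a common pattern: compute a statistic separately for each calendar day, then summarise those per-day values. A minimal pandas sketch of that pattern for the coefficient of variation is shown here. The column names `Time` and `Sensor`, the helper `cv`, and the percentage scaling are assumptions for illustration, not the package's internals.

```python
import pandas as pd

# Hypothetical two-day recording using the documented default timestamp format.
df = pd.DataFrame({
    "Time": pd.to_datetime(
        [
            "2020-12-14 08:00:00.000000",
            "2020-12-14 20:00:00.000000",
            "2020-12-15 08:00:00.000000",
            "2020-12-15 20:00:00.000000",
        ],
        format="%Y-%m-%d %H:%M:%S.%f",
    ),
    "Sensor": [60.0, 80.0, 70.0, 70.0],
})


def cv(values):
    # Coefficient of variation as a percentage (population SD is an assumption).
    return values.std(ddof=0) / values.mean() * 100


# One CV per calendar day, then summary statistics over those daily values.
per_day_cv = df.groupby(df["Time"].dt.date)["Sensor"].apply(cv)
intradaycv_mean = per_day_cv.mean()
intradaycv_median = per_day_cv.median()
intradaycv_sd = per_day_cv.std()
print(intradaycv_mean, intradaycv_median, intradaycv_sd)
```

Swapping `cv` for a standard deviation or a mean gives the same per-day-then-summarise shape described for `intradaysd()` and `intradaymean()`.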
intradaysd()
Computes the intraday standard deviation of the Sensor column in a pandas dataframe and returns its mean, median, and SD
        Args:
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             time (pandas.DataFrame): time column
             timeformat (String): default = '%Y-%m-%d %H:%M:%S.%f'; format of timestamp in time column
        Returns:
            intradaysd_mean (IntegerType): Mean of intraday standard deviation 
            intradaysd_median (IntegerType): Median of intraday standard deviation 
            intradaysd_sd (IntegerType): SD of intraday standard deviation 
intradaymean()
Computes the intraday mean of the Sensor data and returns the mean, median, and SD of that intraday mean
        Args:
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             time (pandas.DataFrame): time column
             timeformat (String): default = '%Y-%m-%d %H:%M:%S.%f'; format of timestamp in time column
        Returns:
            intradaymean_mean (IntegerType): Mean of intraday mean 
            intradaymean_median (IntegerType): Median of intraday mean 
            intradaymean_sd (IntegerType): SD of intraday mean 
dailymean()
Computes the mean of each day
        Args:
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             time (pandas.DataFrame): time column
             timeformat (String): default = '%Y-%m-%d %H:%M:%S.%f'; format of timestamp in time column
        Returns:
            pandas.DataFrame with days and means as columns
dailysummary()
Computes the summary of each day (mean, median, std, max, min, Q1G, Q3G)
        Args:
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             time (pandas.DataFrame): time column
             timeformat (String): default = '%Y-%m-%d %H:%M:%S.%f'; format of timestamp in time column
        Returns:
            pandas.DataFrame with days and summary metrics as columns
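A per-day summary table of this kind can be approximated directly with a pandas groupby. The sketch below is illustrative only: the column names `Time` and `Sensor`, the output column labels, and the helpers `q1` and `q3` are hypothetical and may not match what `dailysummary()` returns.

```python
import pandas as pd

df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2020-12-14 00:00:00",
        "2020-12-14 12:00:00",
        "2020-12-14 18:00:00",
        "2020-12-15 06:00:00",
        "2020-12-15 12:00:00",
    ]),
    "Sensor": [1.0, 2.0, 6.0, 10.0, 20.0],
})


def q1(values):
    return values.quantile(0.25)  # first quartile


def q3(values):
    return values.quantile(0.75)  # third quartile


# One row per calendar day, one column per summary metric.
summary = df.groupby(df["Time"].dt.date)["Sensor"].agg(
    Mean="mean", Median="median", Std="std", Max="max", Min="min", Q1=q1, Q3=q3
)
print(summary)
```

Selecting only the `Mean` column of `summary` gives a days-and-means table similar in shape to the `dailymean()` output.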
interdaysummary()
Computes interday mean, median, minimum and maximum, and first and third quartile over a column
        Args:
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             dataframe (True/False): default=True; whether you want a pandas DataFrame as an output or each of the summary metrics as IntegerTypes
        Returns:
            pandas.DataFrame with columns: Mean, Median, Standard Deviation, Minimum, Maximum, Quartile 1, Quartile 3
            
            or
            
            interdaymean (FloatType): mean 
            interdaymedian (FloatType): median 
            interdaysd (FloatType): standard deviation
            interdaymin (FloatType): minimum 
            interdaymax (FloatType): maximum 
            interdayQ1 (FloatType): first quartile 
            interdayQ3 (FloatType): third quartile 
TIR()
Computes time in the range of (default=1 sd from the mean) column in pandas dataframe
        Args:
             df (pandas.DataFrame):
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             sd (IntegerType): standard deviation from mean for range calculation (default = 1 SD)
             sr (IntegerType): sampling rate of sensor
        Returns:
            TIR (IntegerType): Time in range set by sd (note: time is relative to your sampling rate)
TOR()
Computes time outside the range of (default=1 sd from the mean) column in pandas dataframe
        Args:
             df (pandas.DataFrame):
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             sd (IntegerType): standard deviation from mean for range calculation (default = 1 SD)
             sr (IntegerType): sampling rate of sensor
        Returns:
            TOR (IntegerType): Time outside of range set by sd (note: time is relative to your sampling rate)
POR()
Computes percent time outside the range of (default=1 sd from the mean) column in pandas dataframe
        Args:
             df (pandas.DataFrame):
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             sd (IntegerType): standard deviation from mean for range calculation (default = 1 SD)
             sr (IntegerType): sampling rate of sensor
        Returns:
            POR (IntegerType): percent of time spent outside range set by sd
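The three range metrics are closely related, so one sketch can cover them. Here the range is the mean plus or minus `sd` standard deviations, samples are counted inside and outside it, and counts are scaled by `sr`. This is a sketch under assumptions, not the package's implementation: treating `sr` as the time represented by each sample, using the population standard deviation, and the helper name `range_metrics` are all hypothetical choices.

```python
import statistics


def range_metrics(values, sd=1, sr=1):
    """Return (TIR, TOR, POR) for a sequence of sensor values."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)  # population SD (assumption)
    lower, upper = mean - sd * spread, mean + sd * spread
    inside = sum(1 for v in values if lower <= v <= upper)
    outside = len(values) - inside
    tir = inside * sr                    # time in range, in units of sr
    tor = outside * sr                   # time outside range, in units of sr
    por = outside / len(values) * 100    # percent of samples outside range
    return tir, tor, por


sensor = [70, 72, 71, 69, 90, 70, 68, 50]  # hypothetical readings
tir, tor, por = range_metrics(sensor, sd=1, sr=5)
print(tir, tor, por)
```

With `sr=5` (for example, one sample every 5 minutes), six in-range samples correspond to 30 time units in range and two out-of-range samples to 10 time units outside it.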
MASE()
Computes the mean amplitude of sensor excursions (default = 1 sd from the mean)
        Args:
             df (pandas.DataFrame):
             column (pandas.DataFrame column or pandas series): column that you want to calculate over
             sd (IntegerType): standard deviation from mean to set as a sensor excursion (default = 1 SD)
        Returns:
           MASE (IntegerType): Mean Amplitude of sensor excursions
crhythm()
Computes 'minutes from midnight' and 'hours from midnight'; these features allow you to account for circadian rhythm effects
        Args:
             time (pandas.DataFrame): time column
             timeformat (String): default = '%Y-%m-%d %H:%M:%S.%f'; format of timestamp in time column
        Returns:
            hourfrommid (ListType): Hours from midnight, the same length as your time column
            minfrommid (ListType): Minutes from midnight, the same length as your time column
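To show what these two features look like, here is a minimal sketch that parses timestamps in the documented default format and measures the time elapsed since the preceding midnight. The helper name `from_midnight` is hypothetical, and reading "from midnight" as elapsed time since the start of the day (rather than distance to the nearest midnight) is an assumption.

```python
from datetime import datetime


def from_midnight(timestamps, timeformat="%Y-%m-%d %H:%M:%S.%f"):
    """Return (hours, minutes) elapsed since midnight for each timestamp."""
    hours, minutes = [], []
    for stamp in timestamps:
        parsed = datetime.strptime(stamp, timeformat)
        hours.append(parsed.hour)                        # whole hours since midnight
        minutes.append(parsed.hour * 60 + parsed.minute)  # whole minutes since midnight
    return hours, minutes


stamps = ["2020-12-14 00:30:00.000000", "2020-12-14 13:45:00.000000"]
hourfrommid, minfrommid = from_midnight(stamps)
print(hourfrommid, minfrommid)
```

Both lists have the same length as the input, so they can be added to your dataframe as new feature columns.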