This notebook shows how to calculate band wear compliance.

Band wear compliance is inferred from the band's heart-rate data, because a fitness band reports heart rate while it is worn. We estimate overall band wear compliance as the percentage of patients who wear the band for ≥ **a minimum threshold time** within a **specified time range** each day, on ≥ **a minimum overall band wear percentage** of the days in the activity-tracking period.

For each subject, we also output the daily band wear percentage over the activity-tracking period. This makes it possible to filter the band data down to the compliant dates.
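
The definition above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the subject IDs, the daily percentages, and the names `DAILY_MIN`, `OVERALL_MIN`, and `is_subject_compliant` are hypothetical and do not appear in the notebook's own code.

```python
# Hypothetical daily band-wear percentages per subject (illustrative values only).
daily_wear = {
    "subject_a": [90.0, 85.0, 40.0, 95.0, 88.0],
    "subject_b": [30.0, 82.0, 10.0, 55.0, 91.0],
}

DAILY_MIN = 80    # minimum daily wear percentage for a day to count as compliant
OVERALL_MIN = 60  # minimum percentage of compliant days for a compliant subject


def is_subject_compliant(percentages, daily_min=DAILY_MIN, overall_min=OVERALL_MIN):
    """Return True if enough of the subject's days meet the daily threshold."""
    if not percentages:
        return False
    compliant_days = sum(p >= daily_min for p in percentages)
    return compliant_days / len(percentages) * 100 >= overall_min


flags = {subject: is_subject_compliant(p) for subject, p in daily_wear.items()}

# Cohort-level compliance: percentage of subjects flagged as compliant.
cohort_compliance = sum(flags.values()) / len(flags) * 100
print(flags)
print(cohort_compliance)
```

Here `subject_a` meets the daily threshold on four of five days and is compliant, while `subject_b` meets it on only two of five days and is not, so half of this made-up cohort is compliant.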

**<h3>How to Use</h3>**
**Using with default data**

To use this notebook with the default sample data, go to the Configure Compliance section, set the variables to suit your requirements, and then run the complete notebook.

**Using with data present in a CSV file on a remote GitHub repository**

To use this notebook with your own time-series heart-rate data stored in a remote GitHub repository, follow these steps:
1. Go to the repository where the CSV file is saved and open the file in the GitHub interface.
2. Click the Raw button to open the raw content.
3. Copy the link from the address bar and use it to replace the current URL in the Data fetch section.
4. Go to the Configure Compliance section and change the variables to suit your requirements.
5. Run the complete notebook.
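
As an alternative to clicking through steps 1 to 3, the raw link can usually be derived from the regular file link. The sketch below assumes the common `github.com/<owner>/<repo>/blob/<branch>/<path>` layout. The helper name `to_raw_url` is hypothetical and not part of this notebook, and the blob-style link is assumed to correspond to the sample file.

```python
from urllib.parse import urlparse


def to_raw_url(blob_url):
    """Convert a github.com "blob" URL to its raw.githubusercontent.com form."""
    parts = urlparse(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<branch>/<path to file>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"Not a GitHub blob URL: {blob_url}")
    owner, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


raw_url = to_raw_url(
    "https://github.com/USC-InfoLab/w4h-datasets/blob/main/sampleHeartRateData.csv"
)
print(raw_url)
```

If the link does not follow this layout, the helper raises a `ValueError` rather than guessing, and the manual steps above remain the safer route.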

**<h3>Code Formatter</h3>**

In [49]:
%load_ext blackcellmagic


**<h3>Necessary Imports</h3>**

In [50]:
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import requests
from io import StringIO

**<h3>Data fetch</h3>**

If you use a different input CSV file stored in a remote GitHub repository, change the URL in the cell below as described in the **How to Use** section.

In [51]:
url = "https://raw.githubusercontent.com/USC-InfoLab/w4h-datasets/main/sampleHeartRateData.csv"
response = requests.get(url)
response.raise_for_status()  # Stop early if the download failed
data = StringIO(response.text)
df = pd.read_csv(data)
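
When you swap in your own CSV file, it may help to confirm that it has the columns that `calculate_compliance` reads (`user_id`, `date`, `time`, and `heart_rate`) before running the rest of the notebook. A minimal sketch, where `REQUIRED_COLUMNS`, `check_columns`, and the one-row `sample` frame are hypothetical names introduced only for illustration:

```python
import pandas as pd

# Columns read by calculate_compliance in this notebook.
REQUIRED_COLUMNS = {"user_id", "date", "time", "heart_rate"}


def check_columns(frame):
    """Raise a ValueError if any column used by calculate_compliance is missing."""
    missing = REQUIRED_COLUMNS - set(frame.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")
    return frame


# Tiny stand-in for the fetched data (illustrative values only).
sample = pd.DataFrame(
    {
        "user_id": ["u1"],
        "date": ["2018-07-13"],
        "time": ["00:00:00"],
        "heart_rate": [72],
    }
)
checked = check_columns(sample)
print(list(checked.columns))
```

In the notebook itself, the same check could be applied to `df` right after it is loaded.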

**<h3>Compliance Analyzer Class</h3>**

The ComplianceAnalyzer class has two methods:
1. **calculate_total_time_span:** Calculates the total number of hours covered by daily_band_wear_hours. Because daily_band_wear_hours can consist of several windows (e.g. 8:00 to 12:00 and 16:00 to 22:00), this method sums the length of each window.

2. **calculate_compliance:** Uses the heart-rate data to calculate the number of hours for which the subject wore the band during daily_band_wear_hours. To do so, it counts the per-minute heart-rate measurements with a non-zero value, treating each one as a minute in which the heart-rate monitor was active and the subject wore the band.<br>
The method returns a compliance report per user that includes the daily compliance percentage, the data from the compliant days, the overall compliance percentage, and whether the subject was compliant.
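
The two calculations described above can be illustrated on a small scale. The sketch below is not the class itself: the `readings` frame and its values are hypothetical, and `Series.between` (inclusive on both ends by default) stands in for the class's explicit `>=` and `<=` comparisons.

```python
from datetime import datetime

import pandas as pd

windows = [("08:00", "12:00"), ("16:00", "22:00")]

# Total required hours: the summed length of all windows.
total_seconds = sum(
    (datetime.strptime(end, "%H:%M") - datetime.strptime(start, "%H:%M")).total_seconds()
    for start, end in windows
)
total_hours = total_seconds / 3600
print(total_hours)

# Hypothetical per-minute readings; a heart rate of 0 is treated as "band not worn".
readings = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2018-07-13 07:59",  # outside every window
                "2018-07-13 08:00",  # inside the first window
                "2018-07-13 08:01",  # inside, but heart rate is 0
                "2018-07-13 08:02",  # inside the first window
                "2018-07-13 16:30",  # inside the second window
            ]
        ),
        "heart_rate": [70, 72, 0, 75, 80],
    }
)

active_minutes = 0
for start, end in windows:
    start_ts = pd.Timestamp("2018-07-13 " + start)
    end_ts = pd.Timestamp("2018-07-13 " + end)
    in_window = readings["timestamp"].between(start_ts, end_ts)
    active_minutes += int((in_window & (readings["heart_rate"] != 0)).sum())
print(active_minutes)
```

The two windows add up to 10 hours, and three of the five readings count as active minutes: one falls outside both windows and one has a heart rate of 0.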

In [52]:
class ComplianceAnalyzer:
    def __init__(
        self,
        daily_band_wear_percentage_min=80,
        overall_band_wear_percentage_min=80,
        daily_band_wear_hours=None,
    ):
        self.daily_band_wear_percentage_min = daily_band_wear_percentage_min
        self.overall_band_wear_percentage_min = overall_band_wear_percentage_min
        self.daily_band_wear_hours = (
            daily_band_wear_hours
            if daily_band_wear_hours is not None
            else [("08:00", "12:00"), ("14:00", "18:00")]
        )

    def calculate_total_time_span(self):
        """Return the total length, in hours, of all daily band wear windows."""
        total_time = timedelta()
        for from_time, to_time in self.daily_band_wear_hours:
            from_time = datetime.strptime(from_time, "%H:%M")
            to_time = datetime.strptime(to_time, "%H:%M")
            total_time += to_time - from_time
        return total_time.total_seconds() / 3600

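    def calculate_compliance(self, data):
        # Work on a copy so the caller's DataFrame is not modified and the
        # cell can be re-run without "date" already having been converted.
        data = data.copy()
        data["timestamp"] = pd.to_datetime(data["date"] + " " + data["time"])
        data["date"] = data["timestamp"].dt.date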

        user_compliance = {}
        total_time_span = (
            self.calculate_total_time_span() * 60
        )  # Convert hours to minutes
        daily_compliance_minutes = total_time_span

        user_groups = data.groupby("user_id")
        for user_id, user_data in user_groups:
            daily_compliance = []
            compliant_days = []
            compliant_users_data = []

            for date, group in user_data.groupby("date"):
                total_active_minutes = 0
                for from_time, to_time in self.daily_band_wear_hours:
                    from_time = datetime.combine(
                        date, datetime.strptime(from_time, "%H:%M").time()
                    )
                    to_time = datetime.combine(
                        date, datetime.strptime(to_time, "%H:%M").time()
                    )

                    day_data = group[
                        (group["timestamp"] >= from_time)
                        & (group["timestamp"] <= to_time)
                        & (group["heart_rate"] != 0)
                    ]
                    total_active_minutes += len(day_data)

                daily_compliance_percentage = (
                    total_active_minutes / daily_compliance_minutes
                ) * 100
                daily_compliance.append(
                    {
                        "date": date,
                        "compliance_hours": total_active_minutes / 60,
                        "compliance_percentage": daily_compliance_percentage,
                    }
                )

                if daily_compliance_percentage >= self.daily_band_wear_percentage_min:
                    compliant_days.append(date)
                    compliant_users_data.append(group[group["date"] == date])

            results_df = pd.DataFrame(daily_compliance)
            total_days = len(results_df)
            compliant_days_count = len(compliant_days)
            overall_compliance_percentage_calc = (
                (compliant_days_count / total_days) * 100 if total_days > 0 else 0
            )

            user_compliance[user_id] = {
                "compliance_report": {
                    "daily_compliance": results_df.to_dict(orient="records"),
                    "overall_compliance_percentage": overall_compliance_percentage_calc,
                    "is_compliant": overall_compliance_percentage_calc
                    >= self.overall_band_wear_percentage_min,
                },
                "compliant_data": pd.concat(compliant_users_data).to_dict(
                    orient="records"
                )
                if compliant_users_data
                else [],
            }

        return user_compliance

**<h3>Configure Compliance</h3>**

The three main parameters required to calculate band wear compliance are:
1. daily_band_wear_percentage_min: The minimum percentage of the daily band wear hours for which a subject must wear the band for that day to count as compliant.
2. daily_band_wear_hours: The times of day during which each subject is required to wear the band (e.g. the subjects may be required to wear the band from 8:00 to 16:00 each day).
3. overall_band_wear_percentage_min: The minimum percentage of tracked days in the activity-tracking period on which a subject must be compliant.
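
To see how the three parameters interact, the arithmetic can be worked through for one configuration. The sketch assumes a single 12-hour window (08:00 to 20:00), both thresholds set to 60, and a hypothetical subject with 67 tracked days. The variable names are illustrative only.

```python
import math

window_hours = 12            # daily_band_wear_hours=[("08:00", "20:00")]
daily_min_percentage = 60    # daily_band_wear_percentage_min
overall_min_percentage = 60  # overall_band_wear_percentage_min
tracked_days = 67            # hypothetical number of days with data for one subject

# Minutes of non-zero heart-rate data needed for a day to count as compliant.
required_minutes_per_day = window_hours * 60 * daily_min_percentage / 100

# Smallest number of compliant days that reaches the overall threshold.
required_compliant_days = math.ceil(tracked_days * overall_min_percentage / 100)

print(required_minutes_per_day)
print(required_compliant_days)
```

Under these assumptions, a day counts as compliant with at least 432 active minutes (7.2 hours), and the subject is compliant overall with at least 41 compliant days out of 67.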

In [53]:
compliance_analyzer = ComplianceAnalyzer(
    daily_band_wear_percentage_min=60,
    daily_band_wear_hours=[("08:00", "20:00")],
    overall_band_wear_percentage_min=60,
)

**<h3>Calculate Compliance</h3>**

In [54]:
results = compliance_analyzer.calculate_compliance(df)

**<h3>Output Results</h3>**

The results have three components per subject:
1. Compliance Report: The compliance hours for each day, i.e. the number of hours for which the subject wore the band during the specified band wear hours, and the percentage of the total band wear hours that this represents.
2. Compliant Data: The data from the compliant days for each subject.
3. Overall compliance percentage and compliance status: The percentage of days on which the subject was compliant and whether the subject meets the overall compliance requirements.
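
For a cohort-level view, the per-subject entries can also be collapsed into one summary table. The sketch below uses a hypothetical stand-in, `example_results`, that mirrors the nested structure returned by `calculate_compliance`. The user IDs and values are made up, and the `compliant_data` rows are left out for brevity.

```python
import pandas as pd

# Hypothetical stand-in with the same nesting as the `results` dictionary.
example_results = {
    "u1": {
        "compliance_report": {
            "daily_compliance": [
                {"date": "2018-07-12", "compliance_hours": 3.0, "compliance_percentage": 25.0},
                {"date": "2018-07-13", "compliance_hours": 9.0, "compliance_percentage": 75.0},
            ],
            "overall_compliance_percentage": 50.0,
            "is_compliant": False,
        },
        "compliant_data": [],  # rows omitted for brevity
    },
    "u2": {
        "compliance_report": {
            "daily_compliance": [
                {"date": "2018-07-12", "compliance_hours": 10.8, "compliance_percentage": 90.0},
            ],
            "overall_compliance_percentage": 100.0,
            "is_compliant": True,
        },
        "compliant_data": [],  # rows omitted for brevity
    },
}

# One row per subject: days tracked, overall percentage, and compliance flag.
summary = pd.DataFrame(
    [
        {
            "user_id": user_id,
            "days_tracked": len(entry["compliance_report"]["daily_compliance"]),
            "overall_compliance_percentage": entry["compliance_report"][
                "overall_compliance_percentage"
            ],
            "is_compliant": entry["compliance_report"]["is_compliant"],
        }
        for user_id, entry in example_results.items()
    ]
)
print(summary)
```

Passing the real `results` dictionary in place of `example_results` should give the same kind of table for the actual subjects.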

In [55]:
for user_id, user_data in results.items():
    print(f"\nUser ID: {user_id}")

    # Print compliance report as DataFrame
    compliance_report_df = pd.DataFrame(
        user_data["compliance_report"]["daily_compliance"]
    )
    print("\nCompliance Report:")
    print(compliance_report_df)

    # Print compliant data as DataFrame
    compliant_data_df = pd.DataFrame(user_data["compliant_data"])
    print("\nCompliant Data:")
    print(compliant_data_df)

    # Print overall compliance percentage and compliance status
    overall_compliance_percentage = user_data["compliance_report"][
        "overall_compliance_percentage"
    ]
    is_compliant = user_data["compliance_report"]["is_compliant"]
    print(f"\nOverall Compliance Percentage: {overall_compliance_percentage:.2f}%")
    print(f"Is Compliant: {is_compliant}")


User ID: 02f77d2

Compliance Report:
          date  compliance_hours  compliance_percentage
0   2018-07-12          3.883333              32.361111
1   2018-07-13          9.616667              80.138889
2   2018-07-14          4.983333              41.527778
3   2018-07-20          0.033333               0.277778
4   2018-07-23          0.166667               1.388889
..         ...               ...                    ...
62  2018-12-04         11.566667              96.388889
63  2018-12-05         11.416667              95.138889
64  2018-12-06          5.683333              47.361111
65  2018-12-07          7.766667              64.722222
66  2018-12-08          1.583333              13.194444

[67 rows x 3 columns]

Compliant Data:
       user_id        date      time  heart_rate           timestamp
0      02f77d2  2018-07-13  00:00:00         170 2018-07-13 00:00:00
1      02f77d2  2018-07-13  00:01:00         171 2018-07-13 00:01:00
2      02f77d2  2018-07-13  00:02:00       