This project focuses on analyzing a unique dataset that integrates Continuous Glucose Monitoring (CGM) data, insulin administration, carb intake, physical activity, heart rate, and sleep quality/quantity from 25 individuals with Type 1 Diabetes Mellitus (T1DM).
The data was collected using:
- FreeStyle Libre 2 CGMs for glucose monitoring.
- Fitbit Ionic smartwatches for steps, calories burned, heart rate, and sleep tracking.
Each participant was monitored for a minimum of 14 consecutive days.
This rich, multi-dimensional dataset allows researchers and developers to:
- Build glucose prediction models.
- Develop hypoglycemia and hyperglycemia detection algorithms.
- Explore the impact of sleep, activity, and lifestyle factors on glucose variability.
- Use either the raw data for customized pipelines or the preprocessed version for immediate model development.
By combining physiological, behavioral, and clinical variables, this project provides a foundation for advancing data-driven diabetes management, personalized healthcare solutions, and predictive modeling in T1DM research.
Source: https://data.mendeley.com/datasets/3hbcscwz44/1
- Preprocessed/ and raw data/ folders, containing demographics and 25 patient files in CSV format.
- Each patient file contains that patient's Continuous Glucose Monitoring (CGM) data, steps, calories burned, heart rate, and sleep tracking.
Each CSV file contains entries for:
- Continuous Glucose Monitoring (CGM): Blood glucose level measurements every five minutes
- Insulin: Bolus and basal insulin doses, with details on the type and amount administered
- Carb intake: Recorded carbohydrate intake
- Steps: Step counts recorded by Fitbit Ionic smartwatches
- Heart Rate: Heart rate recorded by Fitbit Ionic smartwatches every 5 minutes
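As a minimal sketch of how one patient file could be read, the snippet below parses a small in-memory sample with the standard library. The column names (`time`, `glucose`, `bolus`, `basal`, `carbs`, `steps`, `heart_rate`), the comma delimiter, and the timestamp format are assumptions made for illustration; check the header of an actual file before reusing it.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample mimicking one patient file. The real column names,
# delimiter, and timestamp format may differ from what is assumed here.
SAMPLE = """time,glucose,bolus,basal,carbs,steps,heart_rate
2020-01-01T00:00:00,110,0.0,0.05,0,0,62
2020-01-01T00:05:00,112,0.0,0.05,0,12,64
2020-01-01T00:10:00,115,1.5,0.05,20,30,70
"""


def load_patient_rows(handle):
    """Read rows from an open CSV handle and convert each field to a typed value."""
    rows = []
    for raw in csv.DictReader(handle):
        # Timestamps become datetime objects, every other column a float.
        row = {"time": datetime.fromisoformat(raw["time"])}
        for key, value in raw.items():
            if key != "time":
                row[key] = float(value)
        rows.append(row)
    return rows


rows = load_patient_rows(io.StringIO(SAMPLE))

# Minutes between consecutive readings, to confirm the sampling interval.
intervals = [
    (later["time"] - earlier["time"]).total_seconds() / 60
    for earlier, later in zip(rows, rows[1:])
]
print(intervals)
```

For a real file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")`, where `path` points at one of the patient CSVs.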
The project's code is organized into the following directories and files:
- Diabetes_dataset: Contains HUPA-UCM Diabetes Dataset
- Notebooks: Jupyter notebooks used for data exploration and model development
  - Team2_NumPioneers_Data Cleaning.ipynb: Notebook for data cleaning and
  - Team2_NumPioneers_Descriptive_Analysis II.ipynb: Notebook for visualizations and summary statistics.
  - Team2_NumPioneers_Prescriptive_Analysis III.ipynb: Notebook for implementation of recommendations and optimization.
  - Team2_NumPioneers_Predective_Analysis IV.ipynb: Notebook for model training, validation, and evaluation.
- requirements.txt: Lists all Python dependencies required to run the project.
- Set clinical ranges and check for outliers.
- Patient information (names, emails) was removed and replaced with unique identifiers.
- Identify rows in the dataset where values fall outside the defined acceptable ranges.
- Cap outliers at the defined clinical ranges.
- Convert time values to a proper datetime format.
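The cleaning steps listed here (flagging out-of-range rows, capping them, and parsing timestamps) can be sketched roughly as follows. The bounds in `RANGES`, the column names, and the timestamp format are hypothetical placeholders, not the ranges used in the project's cleaning notebook.

```python
from datetime import datetime

# Hypothetical bounds for illustration only; the project's actual clinical
# ranges are whatever the cleaning notebook defines.
RANGES = {"glucose": (40.0, 400.0), "heart_rate": (30.0, 220.0)}


def flag_outliers(rows, ranges):
    """Return the indices of rows where any checked column is out of range."""
    flagged = []
    for index, row in enumerate(rows):
        if any(not (low <= row[col] <= high) for col, (low, high) in ranges.items()):
            flagged.append(index)
    return flagged


def cap_outliers(rows, ranges):
    """Return copies of the rows with each checked column clamped to its range."""
    capped = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        for col, (low, high) in ranges.items():
            new_row[col] = min(max(row[col], low), high)
        capped.append(new_row)
    return capped


def parse_time(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a timestamp string to a datetime (the format is an assumption)."""
    return datetime.strptime(text, fmt)


rows = [
    {"time": "2020-01-01 00:00:00", "glucose": 110.0, "heart_rate": 62.0},
    {"time": "2020-01-01 00:05:00", "glucose": 520.0, "heart_rate": 64.0},
    {"time": "2020-01-01 00:10:00", "glucose": 95.0, "heart_rate": 10.0},
]

flagged = flag_outliers(rows, RANGES)
cleaned = cap_outliers(rows, RANGES)
for row in cleaned:
    row["time"] = parse_time(row["time"])
```

Flagging before capping keeps a record of which rows were modified, which is useful when checking how much the capping step changes the data.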
- Objective: To understand the basic characteristics of the T1D patient population and the dynamics of their disease over time.
- Key Activities:
  - Exploratory Data Analysis (EDA): Visualizations and statistical summaries of patient data.
  - Trend Analysis: Identifying patterns in glucose levels, insulin usage, and activity levels over time.
  - Patient Segmentation: Grouping patients based on their demographic information, health metrics, or lifestyle factors.
- Outputs:
  - Notebooks detailing the data exploration.
  - Visualizations (charts, graphs) showing trends and distributions.
  - Summary statistics of key health indicators.
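Two summary statistics of the kind this stage produces are sketched below: the percentage of readings inside a target range, and the mean glucose per hour of day as a simple trend view. The 70 to 180 mg/dL default is the commonly cited time-in-range target and assumes glucose is recorded in mg/dL; the function names and sample values are illustrative, not taken from the notebooks.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def time_in_range(glucose, low=70.0, high=180.0):
    """Percentage of readings that fall within [low, high] (mg/dL assumed)."""
    inside = sum(1 for value in glucose if low <= value <= high)
    return 100.0 * inside / len(glucose)


def hourly_mean(times, glucose):
    """Mean glucose grouped by hour of day, a basic daily-pattern summary."""
    buckets = defaultdict(list)
    for stamp, value in zip(times, glucose):
        buckets[stamp.hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Tiny made-up series: four readings spaced 30 minutes apart.
start = datetime(2020, 1, 1, 0, 0)
times = [start + timedelta(minutes=30 * i) for i in range(4)]
glucose = [100.0, 120.0, 190.0, 60.0]

tir = time_in_range(glucose)
by_hour = hourly_mean(times, glucose)
```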
- Objective: To provide actionable, data-driven recommendations for T1D management.
- Key Activities:
  - Rule-Based Systems: Creating clear, interpretable rules based on predictive insights (e.g., "If glucose is predicted to drop below X, and the patient has been active, reduce insulin dose by Y units").
  - Optimization Algorithms: Developing algorithms to suggest optimal treatment plans, such as adjusting insulin dosage based on predicted glucose response.
  - Treatment Pathway Simulation: Using a combination of descriptive and predictive sources to simulate different treatment scenarios and recommend the best course of action.
- Outputs:
  - Documentation of the rules and logic for actionable insights.
  - A report detailing the effectiveness of recommended strategies against actual outcomes.
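The example rule quoted above ("if glucose is predicted to drop below X, and the patient has been active, reduce insulin dose by Y units") might be encoded along these lines. `RuleConfig`, `recommend`, and every numeric value are hypothetical placeholders chosen to show the structure; X, Y, and the activity threshold are left as parameters because the source does not specify them.

```python
from dataclasses import dataclass


@dataclass
class RuleConfig:
    """Parameters of the rule; all values are supplied by the caller."""
    low_glucose_threshold: float   # "X" in the rule text
    active_steps_threshold: int    # steps that count as "has been active" (assumed definition)
    dose_reduction_units: float    # "Y" in the rule text


def recommend(predicted_glucose, recent_steps, config):
    """Apply the single interpretable rule and return a text recommendation."""
    is_low = predicted_glucose < config.low_glucose_threshold
    is_active = recent_steps >= config.active_steps_threshold
    if is_low and is_active:
        return f"reduce insulin dose by {config.dose_reduction_units} units"
    return "no change"


# Placeholder numbers purely to exercise the rule, not project settings.
config = RuleConfig(
    low_glucose_threshold=70.0,
    active_steps_threshold=1000,
    dose_reduction_units=0.5,
)

print(recommend(predicted_glucose=65.0, recent_steps=1500, config=config))
print(recommend(predicted_glucose=65.0, recent_steps=100, config=config))
```

Keeping the thresholds in a config object rather than in the function body makes each rule easy to document and to compare against actual outcomes.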
- Objective: To build models that can forecast future events and patient outcomes.
- Key Activities:
  - Feature Engineering: Creating new variables from the raw data to improve model performance.
  - Model Training: Developing and training machine learning models such as:
    - Time-Series Models: To predict future blood glucose levels based on historical CGM data.
    - Classification Models: To forecast the risk of complications (e.g., hypoglycemia, diabetic ketoacidosis).
  - Model Evaluation: Assessing model accuracy using appropriate metrics like F1-score or RMSE.
- Outputs:
  - Jupyter notebooks with the predictive modeling workflow.
  - Pickled model files for future use.
  - Performance metrics and validation results.
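To make the feature engineering and evaluation steps concrete, the sketch below builds lagged glucose windows and scores a persistence baseline (predicting that the next reading equals the last observed one) with RMSE. This is a reference baseline, not the project's trained model; the function names, window size, and sample values are illustrative assumptions.

```python
import math


def make_lag_features(series, n_lags, horizon):
    """Pair each window of n_lags readings with the value `horizon` steps ahead."""
    features, targets = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        features.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + horizon])
    return features, targets


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up readings rising by 4 units per step.
glucose = [100.0, 104.0, 108.0, 112.0, 116.0, 120.0]

X, y = make_lag_features(glucose, n_lags=3, horizon=1)

# Persistence baseline: forecast = most recent reading in each window.
baseline = [window[-1] for window in X]
error = rmse(y, baseline)
```

Any trained time-series model should beat this baseline's RMSE on held-out data; if it does not, the added features are not helping.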
- Clone the repository:
- Navigate to the project directory:
  - cd Team2_NumPioneers_PythonHackathon_August25
- Install the required dependencies:
  - pip install -r requirements.txt
- Explore the Jupyter notebooks:
  - jupyter notebook
- Follow the notebooks in numerical order to walk through the complete analytical process.