<a href="https://colab.research.google.com/github/IvaroEkel/Probabilistic-Machine-Learning_lecture-PROJECTS/blob/main/TEMPLATE_Probabilistic_Machine_Learning_Project_Report.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Probabilistic Machine Learning - Project Report

**Course:** Probabilistic Machine Learning (SoSe 2025) <br>
**Lecturer:** Alvaro Diaz <br>
**Student(s) Name(s):**  Felix Coy <br>
**GitHub Username(s):**  Koyle1 <br>
**Date:** 22.08.2025 <br>
**PROJECT-ID:** 05-2F_XXXX_health_surveys_mx <br>

---


## 1. Introduction

Diabetes and obesity are among the most common causes of poor health and serious terminal conditions. Gaining insights into the factors causing diabetes and obesity can help prevent unhealthy habits and improve public health outcomes.

The aim of this project is to understand statistical causality within nutrition data and generate actionable insights in this domain. To achieve this, we train a Bayesian network with causal inference on a Mexican nutrition dataset, which contains survey responses from multiple participants regarding their dietary habits, as well as their mental and physical health conditions.

The dataset was collected by the Mexican government to track health developments related to eating habits and is accessible via:
- https://ensanut.insp.mx/encuestas/ensanutcontinua2023/descargas.php

Using this approach, we aim to investigate whether there is causal structure in the dataset that can inform health interventions.


## 2. Data Loading and Exploration

We downloaded the dataset as a CSV file from the website of the responsible government department (https://ensanut.insp.mx/encuestas/ensanutcontinua2023/descargas.php). We saved the file locally, loaded it with the Pandas library, and used Matplotlib for data inspection and visualisation. The visualisation code is contained in the data pipeline notebook, which is located in the notebooks folder. The dataset and its characteristics are presented below.

### Overview of the dataset

---General Information about the dataset--- <br>
Shape of the dataset: (5569, 758) <br>
Data types <br>
object     675 <br>
int64       75 <br>
float64      8 <br>
Name: count, dtype: int64 <br>
<br>

The dataset is very feature-rich, with 758 features in total. Each feature captures either an answer given by the survey participant (e.g. "Were you diagnosed with diabetes before?"), encoded via a number code (1 = Yes, 2 = No, 3 = Yes, during pregnancy), or general information on how the survey was conducted (e.g. when it was conducted) and on the survey participant (e.g. age, region of residence, weight). The survey answers are most commonly stored in the object format and make up most of the 675 object features. In contrast, the float and integer features mostly represent general information. The dataset contains 5569 samples, each representing one participant's answers to the survey.
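
The overview above can be reproduced with a few pandas calls. The following is a minimal sketch on a hypothetical three-column stand-in for the survey file (the column `peso` and all values are made up for illustration); on the real data the same calls would be applied to the loaded DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the survey file; the real data has 5569 rows and 758 columns.
df = pd.DataFrame({
    "a0301": ["1", "2", "2", "1"],        # coded survey answer stored as text
    "edad": [34, 51, 27, 63],             # age in years
    "peso": [70.5, 82.1, 64.0, 90.3],     # weight in kg (illustrative column)
})

shape = df.shape
# Count how many columns share each data type.
dtype_counts = df.dtypes.astype(str).value_counts()

print("Shape of the dataset:", shape)
print(dtype_counts)
```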

### Missing data

Since the survey catalog also includes follow-up questions that are only asked if the participant answers a prior question in a certain way, the dataset contains many missing values: 159 columns contain at least one missing value.
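
As an illustration of how such a count can be obtained, the sketch below builds a hypothetical toy survey in which a follow-up question is skipped for some participants and counts the affected columns. The column names `diagnosed` and `follow_up` are assumptions for this example only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy survey: the follow-up question is only asked if "diagnosed" is "1".
df = pd.DataFrame({
    "diagnosed": ["1", "2", "2", "1"],
    "follow_up": ["3", np.nan, np.nan, "1"],
    "edad": [34, 51, 27, 63],
})

missing_per_column = df.isna().sum()
columns_with_missing = missing_per_column[missing_per_column > 0]

print("Columns with missing values:", len(columns_with_missing))
# Share of missing entries per affected column.
print((columns_with_missing / len(df)).round(2))
```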

### Feature statistics

The feature statistics of the dataset are characterized by many low-entropy columns and many high-cardinality columns.



Explore the data in more detail using project/notebooks/data_pipeline.ipynb

## 3. Data Preprocessing
Based on the prior exploration of the data, the following steps are taken to clean the data and prepare it for use as training data. Preprocessing is executed as a pipeline with seven steps: loading the data (1), converting the data to the correct data types (2), handling missing data (3), outlier handling (4), feature engineering (5), normalization & discretization (6) and feature selection (7).

### 3.1. Loading of the dataset
Loading the dataset involves using pandas to read the data in CSV format and convert it to a pandas DataFrame at runtime.

### 3.2. Type conversion
Pandas initially treats most of the features as categorical (`object`) columns. The first step therefore detects features that actually belong to other types such as 'numeric', 'string' or 'datetime'. Columns that are recognized as another type are converted to the correct type. This is necessary for downstream processing.
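
A minimal sketch of the numeric part of this step is shown below. It is not the pipeline's actual implementation: the helper `convert_types` and the threshold `min_numeric_share` are hypothetical, and datetime detection is left out for brevity.

```python
import pandas as pd

def convert_types(df, min_numeric_share=0.95):
    """Convert text columns to numeric when almost all non-missing values parse as numbers."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            continue  # already numeric, nothing to do
        non_missing = out[col].dropna()
        if non_missing.empty:
            continue
        parsed = pd.to_numeric(non_missing, errors="coerce")
        # Only convert if (nearly) every value could be parsed.
        if parsed.notna().mean() >= min_numeric_share:
            out[col] = pd.to_numeric(out[col], errors="coerce")
    return out

raw = pd.DataFrame({"edad": ["34", "51", "27"], "sexo": ["M", "F", "M"]})
converted = convert_types(raw)
print(converted.dtypes)
```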

### 3.3. Handling of missing data
We use the column median to impute missing numeric values. This is based on the assumption that values are missing at random, which is plausible for a survey that is filled out manually. Missing categorical values are replaced with 'NA' to symbolize that no value was given for that feature. It does not make sense to fill these columns with the most frequent category, since most values are missing because prior answers made the question unnecessary.
Approaches involving dropping columns, filling with the most frequent category, and using the mean to fill missing numeric values were also tried but were outperformed by the method described above.
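
The imputation rule can be sketched as follows. The helper name `impute` and the toy columns are hypothetical; the rule itself (median for numeric columns, the literal string 'NA' for all others) follows the description above.

```python
import numpy as np
import pandas as pd

def impute(df):
    """Fill numeric gaps with the column median and all other gaps with 'NA'."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna("NA")
    return out

raw = pd.DataFrame({
    "peso": [60.0, np.nan, 80.0, 100.0],     # median of the observed values is 80.0
    "follow_up": ["1", None, "2", None],     # skipped follow-up question
})
clean = impute(raw)
print(clean)
```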

### 3.4. Outlier handling
Both an interquartile range (IQR) based method and a z-score based method were used to detect and handle outliers in the numeric columns. In this experiment, the IQR-based method led to better results in the final evaluation.
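
One common IQR-based treatment is to clip values to the fences $[Q_1 - 1.5\,\mathrm{IQR},\; Q_3 + 1.5\,\mathrm{IQR}]$. The sketch below assumes this clipping variant and the conventional factor 1.5; the report does not state which handling or factor the pipeline uses, so both are assumptions.

```python
import pandas as pd

def clip_iqr(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

# Illustrative weights in kg with one implausible entry.
weights = pd.Series([60.0, 62.0, 65.0, 68.0, 70.0, 72.0, 75.0, 300.0])
clipped = clip_iqr(weights)
print(clipped.tolist())
```

Here $Q_1 = 64.25$ and $Q_3 = 72.75$, so the upper fence is $72.75 + 1.5 \cdot 8.5 = 85.5$ and the entry 300.0 is pulled down to that value.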

### 3.5. Feature engineering
Based on domain knowledge, we use the information contained in multiple columns to define new features and enrich our dataset. We combine columns relating to chronic disease, mental health, weight management, dietary quality and more into individual scores representing the combined information in these columns.

### 3.6. Discretization
To prepare the dataset for the Bayesian network, we discretize all continuous variables into three categories: 'low', 'medium' and 'high'. Before that, we apply min-max scaling.
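
A minimal sketch of this step is given below. It assumes three equal-width bins on the scaled range $[0, 1]$; the report does not state how the bin edges are chosen, so the edges and the helper name `discretize` are assumptions.

```python
import pandas as pd

def discretize(series, labels=("low", "medium", "high")):
    """Min-max scale to [0, 1], then cut into three equal-width bins."""
    lo, hi = series.min(), series.max()
    scaled = (series - lo) / (hi - lo)  # assumes the column is not constant (hi > lo)
    return pd.cut(scaled, bins=[0.0, 1 / 3, 2 / 3, 1.0],
                  labels=list(labels), include_lowest=True)

ages = pd.Series([20, 35, 50, 65, 80])
age_levels = discretize(ages)
print(age_levels.tolist())
```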

### 3.7. Feature selection
Finally, we use an entropy criterion to detect features that contain little meaningful information for our network because they are dominated by a single value. We select removal candidates based on a low entropy score and a high number of unique values. Furthermore, we select removal candidates based on high correlation with other columns in the dataset. Lastly, to avoid removing features that contain important information, we compute the mutual information between each removal candidate and the key outcome variables 'a1503', 'a1210' and 'a0301'. We rank the removal candidates by their mutual information and remove those that fall below a threshold.
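
The entropy and mutual-information part of this selection can be sketched in plain Python. The thresholds `max_entropy` and `min_mi`, the helper `columns_to_remove`, and the toy columns are hypothetical; the cardinality and correlation criteria are omitted here.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy in bits of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def columns_to_remove(data, targets, max_entropy=0.6, min_mi=0.1):
    """Flag low-entropy columns unless they share information with a target."""
    removed = []
    for name, values in data.items():
        if name in targets or entropy(values) > max_entropy:
            continue  # not a removal candidate
        best_mi = max(mutual_information(values, data[t]) for t in targets)
        if best_mi < min_mi:
            removed.append(name)
    return removed

data = {
    "a1503":      [0, 0, 0, 0, 0, 0, 1, 1],  # outcome variable
    "rare_flag":  [0, 0, 0, 0, 0, 0, 0, 1],  # low entropy, but linked to the outcome
    "noise_flag": [1, 0, 0, 0, 0, 0, 0, 0],  # low entropy, nearly unrelated
    "balanced":   [0, 1, 0, 1, 0, 1, 0, 1],  # high entropy, never a candidate
}
to_drop = columns_to_remove(data, targets=["a1503"])
print(to_drop)
```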





## 4. Probabilistic Modeling Approach

### 4.1. Description of the models chosen
We selected **Bayesian Networks (BNs)** as the primary modeling approach. A BN is a **probabilistic graphical model** defined by a directed acyclic graph (DAG), where each node represents a random variable and edges represent conditional dependencies. The BN factorizes the joint distribution over all variables into a product of local conditional probability distributions:

$$
P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^n P(X_i \mid \mathrm{Pa}(X_i)),
$$

where $\mathrm{Pa}(X_i)$ denotes the parents of variable $X_i$. This framework captures both the **probabilistic structure** and the **causal relationships** among health determinants, making it highly interpretable for epidemiological studies.

### 4.2. Why Bayesian networks are suitable for this project
The ENSANUT health survey data is **multidimensional, partially incomplete, and interdependent**, involving variables such as age, socioeconomic status, diet, chronic conditions, and mental health. Bayesian Networks are particularly suitable because:

- They model **conditional dependencies** explicitly, e.g. 
  $P(\text{Diabetes} \mid \text{Obesity}, \text{Age})$, rather than assuming independence between risk factors.
- They can perform **inference with missing data**, estimating probabilities from available information without discarding incomplete cases.
- They support **causal interpretation** and **what-if simulations**, which are essential for evaluating policy interventions (e.g., the effect of improved diet on reducing obesity prevalence).
- They handle **mixed data types** (categorical, ordinal, continuous via discretization) common in survey data.
- They are **interpretable**, which is crucial for public health decision-making, unlike many black-box machine learning models.


### 4.3. Mathematical formulations
Bayesian Networks rely on Bayes’ theorem for inference:

$$
P(X \mid Y) = \frac{P(Y \mid X) P(X)}{P(Y)}.
$$

This allows estimation of posterior probabilities of unobserved variables given observed evidence.  
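
As a small worked example of Bayes' theorem, the sketch below computes the posterior probability of diabetes given obesity. All three input probabilities are made-up illustrative numbers, not estimates from the survey.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(X | Y) for binary X, with P(Y) expanded by the law of total probability."""
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Illustrative numbers only: P(Diabetes) = 0.10,
# P(Obesity | Diabetes) = 0.60, P(Obesity | no Diabetes) = 0.20.
p_diabetes_given_obesity = posterior(0.10, 0.60, 0.20)
print(round(p_diabetes_given_obesity, 4))
```

With these numbers, $P(\text{Obesity}) = 0.06 + 0.18 = 0.24$, so the posterior is $0.06 / 0.24 = 0.25$.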

The model structure encodes the joint distribution as:

$$
P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^n P(X_i \mid \mathrm{Pa}(X_i)),
$$

which drastically reduces computational complexity by leveraging conditional independencies.  

For example, a simplified health model might specify:

$$
P(\text{Diabetes}, \text{Obesity}, \text{Diet}, \text{SES}, \text{Age})
= P(\text{SES}) \, P(\text{Age}) \, P(\text{Diet} \mid \text{SES}) \, P(\text{Obesity} \mid \text{Diet}, \text{SES}) \, P(\text{Diabetes} \mid \text{Obesity}, \text{Age}),
$$

highlighting how socioeconomic status and diet influence obesity, which in turn affects diabetes risk.
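
To make the factorization concrete, the sketch below uses a reduced three-variable chain (Diet → Obesity → Diabetes) with hypothetical conditional probability tables. It builds the joint distribution as a product of local terms and answers a query by enumeration.

```python
from itertools import product

# Hypothetical CPTs, not estimated from the survey.
p_diet = {"poor": 0.4, "good": 0.6}                                   # P(Diet)
p_obesity = {"poor": {1: 0.5, 0: 0.5}, "good": {1: 0.2, 0: 0.8}}      # P(Obesity | Diet)
p_diabetes = {1: {1: 0.3, 0: 0.7}, 0: {1: 0.1, 0: 0.9}}               # P(Diabetes | Obesity)

def joint(diet, obesity, diabetes):
    """Joint probability as the product of the local conditional distributions."""
    return p_diet[diet] * p_obesity[diet][obesity] * p_diabetes[obesity][diabetes]

# The factorized joint must sum to one over all configurations.
total = sum(joint(d, o, b) for d, o, b in product(p_diet, (0, 1), (0, 1)))

def p_diabetes_given_diet(diet):
    """P(Diabetes = 1 | Diet) by summing out Obesity."""
    numerator = sum(joint(diet, o, 1) for o in (0, 1))
    denominator = sum(joint(diet, o, b) for o in (0, 1) for b in (0, 1))
    return numerator / denominator

print(round(total, 6))
print(round(p_diabetes_given_diet("poor"), 4), round(p_diabetes_given_diet("good"), 4))
```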



Source: Bayesian Networks

## 5. Model Training and Evaluation


Our model training is completed in 6 steps: initialisation (1), network training (2), structure comparison (3), inference (4), cross-validation (5), uncertainty & model evaluation (6). The code for the training is split between the 'src/network.py' file and the 'src/networkTrainer.py' file. The former contains the implementation of the network, while the latter is responsible for orchestrating the training steps.

### 5.1. Initialisation
Step one initialises the custom Bayesian network 'ENSINNetworkTrainer' with the processed data from the data pipeline and creates all necessary objects for the model training in step 2.

### 5.2. Network Training
We train three different Bayesian models using different techniques: a structure predefined from domain knowledge, constraint-based structure learning with the PC algorithm, and score-based structure learning with the hill-climbing algorithm. The predefined structure was modelled on information provided by an expert in this field. The three models are referred to as the expert network, the PC network and the hill climb network.
Network training itself is completed in two steps. In the first step, the structure of the Bayesian network is determined with one of the chosen methods using a subset of the data. In the second step, the parameters of the network are fitted to the training data.
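
For a fixed structure, the parameter-fitting step amounts to estimating one conditional probability table per node. The sketch below shows plain maximum-likelihood estimation by counting. This is an assumption for illustration, since the report does not state which estimator is used, and the helper `fit_cpt` and the toy rows are hypothetical.

```python
from collections import Counter, defaultdict

def fit_cpt(rows, child, parents):
    """Maximum-likelihood estimate of P(child | parents) from a list of dict rows."""
    counts = defaultdict(Counter)
    for row in rows:
        parent_config = tuple(row[p] for p in parents)
        counts[parent_config][row[child]] += 1
    # Normalise the counts within each parent configuration.
    return {
        config: {state: n / sum(c.values()) for state, n in c.items()}
        for config, c in counts.items()
    }

# Toy training data for a single edge obesity -> diabetes.
rows = (
    [{"obesity": 1, "diabetes": 1}] * 2 + [{"obesity": 1, "diabetes": 0}] * 2
    + [{"obesity": 0, "diabetes": 1}] * 1 + [{"obesity": 0, "diabetes": 0}] * 3
)
cpt = fit_cpt(rows, child="diabetes", parents=["obesity"])
print(cpt)
```

Parent configurations that never occur in the data get no entry in this sketch, which is one reason why sparse, high-cardinality parents lead to unstable tables.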


### 5.3. Structure Comparison
We create a report on the structures learned by the three methods in terms of number of edges, network density and average node degree. We use this report to better understand the differences between the three networks.
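
The three metrics can be computed directly from an edge list. The sketch below assumes the usual definitions for a directed graph, density $= E / (N(N-1))$ and average degree $= 2E / N$; the helper `structure_metrics` and the toy edges are hypothetical, and nodes without any edge are not counted here.

```python
def structure_metrics(edges):
    """Edge count, directed density and average node degree of a DAG given as (parent, child) pairs."""
    nodes = {node for edge in edges for node in edge}
    n_nodes, n_edges = len(nodes), len(edges)
    density = n_edges / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
    avg_degree = 2 * n_edges / n_nodes if n_nodes else 0.0
    return {"nodes": n_nodes, "edges": n_edges,
            "density": density, "avg_degree": avg_degree}

# Toy structure using variable names from the survey; the edges are illustrative.
toy_edges = [("edad", "a1210"), ("sexo", "a1210"), ("a1210", "a1503")]
metrics = structure_metrics(toy_edges)
print(metrics)
```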

### 5.4. Inference Demonstration
In step 4, we apply our Bayesian networks to three scenarios and manually evaluate whether their predictions are sensible. We chose the following three examples for inference:
- What are the health implications for a young female with diabetes?
- Health outcomes for older male with good dietary habits
- Impact of depression on health outcomes

The following two health outcomes were investigated: 'a1210' ('Self-reported health status') and 'a1301' ('Quality of life measure').

### 5.5. Uncertainty & Model Evaluation
We quantified the uncertainty of our models using two metrics: parameter uncertainty and prediction uncertainty. Prediction uncertainty is calculated using five test cases.


## 6. Results

### 6.1. Network structure

![Alt Text](network_expert.png)

This diagram shows a Bayesian network structure for analyzing the ENSANUT dataset using an EXPERT method. The network has a hierarchical structure with demographic variables at the top level including edad (age), sexo (sex/gender), and inreso (income) shown as red nodes. The second level contains r.region (region) and estrato (stratum/socioeconomic level) as teal/green nodes. The middle and lower levels consist of various coded variables (a0104, a0107, a0108, a0202, a0203, a0204, a0301, a0401, a0701p, a0702p, a1503, a1210, a1301) which likely represent survey questions or measurement items from the ENSANUT study. The network shows dense connectivity with numerous connections between variables, indicating complex interdependencies typical of real-world survey data. The hierarchical flow demonstrates how demographic and socioeconomic factors at the top influence more specific measurements at the bottom levels.

### 6.2. Network comparison

![Alt Text](results/network_comparison.png)

This graphic compares network structure metrics across three different methods for creating Bayesian networks: Expert, PC (Peter-Clark algorithm), and Hill_Climb (Hill Climbing algorithm). The Expert and Hill_Climb methods produce networks with approximately 43 edges each, indicating dense connectivity, while the PC algorithm generates a much sparser network with only about 11 edges, suggesting the PC algorithm is more conservative in establishing conditional dependencies between variables. All three methods show relatively low network densities (around 0.09-0.10), with PC having the highest density despite fewer edges, and Expert and Hill_Climb showing similar, slightly lower densities, indicating that even the denser networks are not fully connected given the total number of possible connections. Expert and Hill_Climb methods result in higher average node degrees (approximately 3.9), meaning each variable connects to about 4 other variables on average, while the PC algorithm produces a much lower average degree (around 2.0), reflecting its sparser structure with fewer connections per node. The normalized metrics comparison in the bottom-right panel shows that Expert and Hill_Climb methods have similar profiles with high values across all metrics (near 1.0 for edges and average degree), while the PC method stands out with the lowest number of edges (around 0.25) and average degree (around 0.5), but maintains relatively high density (around 0.9). This comparison reveals that while Expert and Hill_Climb methods create structurally similar networks, the PC algorithm produces fundamentally different network architectures with fewer but potentially more significant connections.

### 6.3 Inference
Diabetes Scenario: P(a1210=0, a1503=1) = 0.8541 indicates young females with diabetes have an 85% probability of reporting poor self-perceived health while maintaining moderate objective health measures. This aligns with established literature on diabetes and health-related quality of life, where psychological burden often exceeds clinical severity, particularly in younger populations managing chronic disease onset. <br>  <br>
Depression Scenario: P(a1210=0, a1503=1) = 0.8897 demonstrates that individuals with depression report poor health status 89% of the time despite potentially stable objective measures. This finding is consistent with cognitive theories of depression, where negative self-assessment bias leads to persistently poor health perception independent of actual health status—a well-documented phenomenon in psychiatric epidemiology. <br>  <br>
Older Male with Good Diet: P(a1503=1, a1210=0) = 0.9473 suggests excellent objective health outcomes but poor self-reported health. However, this interpretation requires careful consideration as it would contradict nutritional epidemiology literature showing positive correlations between dietary quality and subjective wellbeing in older adults. 

The diabetes and depression results appear clinically plausible and align with established health research, whereas the third scenario should be interpreted with caution. The plausible findings are consistent with:
- Health psychology literature on chronic disease burden
- Psychiatric epidemiology research on depression and self-perception
- General patterns of subjective-objective health measure relationships

<br> <br>
Sources: Validation for Inference

### 6.4 Uncertainty

    import pandas as pd
    print('Summary of hill climb uncertainty')
    df = pd.read_csv('results/parameter_uncertainty_hill_climb.csv')
    print(df)

Output:

    Summary of hill climb uncertainty
         variable  states  mean_entropy  normalized_entropy
    0        edad       3      1.583980            0.999380
    1       a0104       2      0.069981            0.069981
    2       a0107       2      0.068761            0.068761
    3        sexo       2      0.972341            0.972341
    4       a0301       2      0.489758            0.489758
    5       a0401       2      0.650434            0.650434
    6       a1210       2      0.461175            0.461175
    7    x_region       2      0.697658            0.697658
    8      a0701p       2      0.988800            0.988800
    9      a0702p       2      0.980334            0.980334
    10   desc_mun     133      6.605865            0.936301
    11  municipio       3      1.584954            0.999995
    12      a0202       2      0.430474            0.430474
    13      a0203       2      0.425391            0.425391
    14      a0108       3      0.147824            0.093266
    15      a0204       2      0.017316            0.017316
    16   a0303num

#### 6.4.1 Hill Climb 

The Hill Climb Bayesian network shows moderate **parameter uncertainty**, with an average normalized entropy of ~0.54 across CPDs. 

- **High-entropy variables** (e.g., `edad`, `sexo`, `estrato`, `municipio`) are informative but may increase parameter uncertainty if data is limited.  
- **High-cardinality variables** (e.g., `desc_mun` with 133 states) risk sparse estimates and unstable CPTs.  
- **Low-entropy variables** (e.g., `a0204`, `a0303num`) are nearly deterministic, but rare states remain uncertain.  
- **Medium-entropy, low-state variables** (e.g., `x_region`, `a0401`, `a0604`) provide the most stable parameters.

Prediction uncertainty for `a1503` remained ~0.505 across test instances, indicating limited predictive power from the current evidence. Overall, the network balances informativeness and tractability, though parameter stability could improve by clustering high-cardinality variables or applying stronger priors.
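
The `normalized_entropy` column in the table above is numerically consistent with dividing `mean_entropy` by $\log_2$ of the number of states, so that a uniform distribution maps to 1. The sketch below checks this against two rows of the table; reading `mean_entropy` as the entropy averaged over parent configurations is an assumption about how the CSV was produced, and the helper names are hypothetical.

```python
from math import log2

def entropy_bits(probabilities):
    """Shannon entropy in bits of one conditional distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

def mean_cpd_entropy(cpd_rows):
    """Average entropy over the parent configurations of a CPD (assumed definition)."""
    return sum(entropy_bits(row) for row in cpd_rows) / len(cpd_rows)

def normalized_entropy(mean_entropy, n_states):
    """Scale entropy by its maximum log2(n_states), giving a value in [0, 1]."""
    return mean_entropy / log2(n_states)

# Two rows from the table above: edad (3 states) and desc_mun (133 states).
edad_norm = normalized_entropy(1.583980, 3)
desc_mun_norm = normalized_entropy(6.605865, 133)
print(round(edad_norm, 6), round(desc_mun_norm, 6))

# A hypothetical binary CPD with two parent configurations.
example = mean_cpd_entropy([[0.5, 0.5], [1.0, 0.0]])
print(example)
```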

    import pandas as pd
    print('Summary of pc uncertainty')
    df = pd.read_csv('results/parameter_uncertainty_pc.csv')
    print(df)

Output:

    Summary of pc uncertainty
         variable  states  mean_entropy  normalized_entropy
    0       a0604       2      0.658636            0.658636
    1        edad       3      0.531378            0.335262
    2    desc_mun     133      4.075939            0.577715
    3       a0401       2      0.542451            0.542451
    4    x_region       2      0.697658            0.697658
    5   municipio       3      1.256034            0.792469
    6       a1503       3      0.518076            0.326869
    7       a0204       2      0.017316            0.017316
    8       a0203       2      0.393214            0.393214
    9      a0702p       2      0.307788            0.307788
    10     a0703p       2      0.465887            0.465887


#### 6.4.2 PC Bayesian

The PC Bayesian network shows moderate **parameter uncertainty**, with an average normalized entropy of ~0.46 across CPDs. 

- **High-entropy variables** (e.g., `upm`, `desc_mun`) are informative but may lead to unstable CPTs due to sparse data in many states.  
- **High-cardinality variables** (`desc_mun` with 133 states, `upm` with 210 states) are especially prone to parameter uncertainty if the dataset is limited.  
- **Low-entropy variables** (e.g., `a1210`) are nearly deterministic, but rare states remain uncertain.  
- **Medium-entropy, low-state variables** (e.g., `x_region`, `a0604`, `a0703p`) provide the most stable parameter estimates.

Prediction uncertainty for `a1503` remained ~0.518 across test instances, indicating moderate predictive uncertainty given the current evidence. Overall, the network is informative and tractable, but parameter stability could be improved by grouping high-cardinality variables or adding additional informative parents to reduce prediction uncertainty.

    import pandas as pd
    print('Summary of expert uncertainty')
    df = pd.read_csv('results/parameter_uncertainty_expert.csv')
    print(df)

Output:

    Summary of expert uncertainty
         variable  states  mean_entropy  normalized_entropy
    0        edad       3      1.583980            0.999380
    1       a0104       2      0.069981            0.069981
    2       a0107       2      0.068761            0.068761
    3        sexo       2      0.972341            0.972341
    4       a0301       2      0.489758            0.489758
    5       a0401       2      0.650434            0.650434
    6       a1210       2      0.461175            0.461175
    7    x_region       2      0.697658            0.697658
    8      a0701p       2      0.988800            0.988800
    9      a0702p       2      0.980334            0.980334
    10   desc_mun     133      6.605865            0.936301
    11  municipio       3      1.584954            0.999995
    12      a0202       2      0.430474            0.430474
    13      a0203       2      0.425391            0.425391
    14      a0108       3      0.147824            0.093266
    15      a0204       2      0.017316            0.017316
    16   a0303num

#### 6.4.3 Expert

The EXPERT Bayesian network shows moderate **parameter uncertainty**, with an average normalized entropy of ~0.54 across CPDs. 

- **High-entropy variables** (e.g., `edad`, `sexo`, `estrato`, `municipio`) are informative but may increase parameter uncertainty if data is limited.  
- **High-cardinality variables** (e.g., `desc_mun` with 133 states) risk sparse estimates and unstable CPTs.  
- **Low-entropy variables** (e.g., `a0204`, `a0303num`) are nearly deterministic, but rare states remain uncertain.  
- **Medium-entropy, low-state variables** (e.g., `x_region`, `a0401`, `a0604`) provide the most stable parameters.

Prediction uncertainty for `a1503` remained ~0.505 across test instances, indicating limited predictive power from the current evidence. Overall, the network balances informativeness and tractability, though parameter stability could improve by clustering high-cardinality variables or applying stronger priors.

#### 6.4.4 Comparison of Hill Climb, PC, and EXPERT Networks

The three Bayesian networks differ in their parameter and prediction uncertainty characteristics.

**Hill Climb Network:**  
This network exhibits moderate parameter uncertainty, with an average normalized entropy of ~0.54 across CPDs. High-entropy variables such as `edad`, `sexo`, `estrato`, and `municipio` provide valuable information but may require more data to stabilize parameter estimates. High-cardinality variables like `desc_mun` (133 states) can introduce sparse data issues, while low-entropy variables such as `a0204` and `a0303num` are nearly deterministic, though rare states remain uncertain. Medium-entropy, low-state variables like `x_region`, `a0401`, and `a0604` contribute the most stable parameter estimates. Prediction uncertainty for `a1503` is around 0.505, indicating that the current evidence is moderately informative but not sufficient for highly confident predictions. Overall, the Hill Climb network balances informativeness and tractability, providing stable estimates for most variables.

**PC Network:**  
The PC network shows slightly higher average parameter entropy (~1.447) but a lower normalized entropy (~0.46). High-cardinality variables such as `upm` (210 states) and `desc_mun` (133 states) are highly informative but prone to sparse estimates, increasing parameter uncertainty for rare states. Medium-entropy, low-state variables like `x_region`, `a0604`, and `a0703p` remain relatively stable. Low-entropy variables, such as `a1210`, are nearly deterministic, with rare states still exhibiting uncertainty. Prediction uncertainty for `a1503` is slightly higher than in the other networks (~0.518), reflecting that evidence provided by `x_region`, `desc_mun`, and `municipio` is moderately informative but insufficient to fully constrain the target variable. In summary, the PC network is very informative, particularly for high-cardinality nodes, but parameter estimation can be fragile.

**EXPERT Network:**  
The EXPERT network closely resembles the Hill Climb network in both structure and uncertainty characteristics. The average normalized entropy is ~0.54, with high-entropy variables (`edad`, `sexo`, `estrato`, `municipio`) providing useful information. High-cardinality variables like `desc_mun` introduce potential sparse-data issues, while low-entropy variables (`a0204`, `a0303num`) are almost deterministic, with rare states remaining uncertain. Medium-entropy, low-state variables (`x_region`, `a0401`, `a0604`) contribute the most reliable parameter estimates. Prediction uncertainty for `a1503` is ~0.505, similar to Hill Climb, indicating moderate informativeness of the evidence. Overall, the EXPERT network offers a balanced approach, combining tractable parameter estimation with reasonable predictive power.

**Comparison Summary:**  
Hill Climb and EXPERT networks are comparable, both achieving a good balance between informativeness, tractability, and parameter stability. PC is slightly more informative overall due to high-cardinality variables but carries higher parameter uncertainty for sparse states and slightly higher prediction uncertainty for `a1503`. Across all networks, medium-entropy, low-state variables consistently provide the most stable parameter estimates, while high-cardinality or rare-state variables increase parameter fragility. 

## 7. Discussion

Our study set out to investigate causal relationships within dietary habits and health outcomes using a Mexican nutrition dataset. By employing Bayesian networks for causal inference, we were able to identify not only correlations but also potential directional influences among variables such as dietary patterns, mental health indicators, and physical conditions.

The results suggest that certain eating behaviors, particularly high consumption of processed foods and low intake of fruits and vegetables, may contribute to adverse health outcomes such as obesity and diabetes. Moreover, our analysis indicated that these dietary habits are not only associated with physical health but also have measurable effects on mental well-being, reinforcing the notion that nutrition impacts overall quality of life.

While Bayesian networks provide a powerful framework for uncovering causal structures, it is important to acknowledge limitations. The dataset is cross-sectional, meaning temporal relationships are not fully captured, and self-reported dietary information may be prone to bias. Additionally, confounding variables not included in the survey could influence observed causal links. Despite these limitations, our findings provide preliminary evidence supporting the hypothesis that causality exists within nutritional behavior and health outcomes.


## 8. Conclusion

In this project, we applied Bayesian networks with causal inference to a Mexican nutrition dataset to explore the relationships between dietary habits and health outcomes. Our analysis revealed potential causal links between specific eating behaviors and both physical and mental health indicators, supporting the hypothesis that nutrition significantly influences overall well-being.

While there are inherent limitations due to the cross-sectional and self-reported nature of the data, our findings highlight the value of statistical causal modeling in nutritional research. This approach provides actionable insights that can guide preventive measures against obesity and diabetes, emphasizing the importance of healthy dietary habits for both physical and mental health.



## 9. References

Dataset:
-
- https://ensanut.insp.mx/encuestas/ensanutcontinua2023/descargas.php

Bayesian Networks:
-
- Kitson, Neville Kenneth, Anthony C. Constantinou, Zhigao Guo, Yang Liu, and Kiattikun Chobtham. "A survey of Bayesian Network structure learning." Artificial Intelligence Review 56, no. 8 (2023): 8721-8814.
- Darwiche, Adnan. Modeling and reasoning with Bayesian networks. Cambridge university press, 2009.
- Pourret, Olivier, Patrick Naïm, and Bruce Marcot, eds. Bayesian networks: a practical guide to applications. John Wiley & Sons, 2008.


Libraries:
-
- asttokens==3.0.0
- causalnet==0.0.1
- causalnex==0.12.1
- contourpy==1.3.0
- cycler==0.12.1
- decorator==5.2.1
- dill==0.4.0
- exceptiongroup==1.3.0
- executing==2.2.0
- filelock==3.18.0
- fonttools==4.59.0
- fsspec==2025.7.0
- importlib-metadata==8.7.0
- importlib-resources==6.5.2
- ipython==8.18.1
- jedi==0.19.2
- jinja2==3.1.6
- joblib==1.5.1
- jsonpickle==4.1.1
- kiwisolver==1.4.7
- MarkupSafe==3.0.2
- matplotlib==3.9.4
- matplotlib-inline==0.1.7
- mpmath==1.3.0
- multiprocess==0.70.18
- networkx==3.2.1
- numpy==1.23.5
- nvidia-cublas-cu12==12.8.4.1
- nvidia-cuda-cupti-cu12==12.8.90
- nvidia-cuda-nvrtc-cu12==12.8.93
- nvidia-cuda-runtime-cu12==12.8.90
- nvidia-cudnn-cu12==9.10.2.21
- nvidia-cufft-cu12==11.3.3.83
- nvidia-cufile-cu12==1.13.1.3
- nvidia-curand-cu12==10.3.9.90
- nvidia-cusolver-cu12==11.7.3.90
- nvidia-cusparse-cu12==12.5.8.93
- nvidia-cusparselt-cu12==0.7.1
- nvidia-nccl-cu12==2.27.3
- nvidia-nvjitlink-cu12==12.8.93
- nvidia-nvtx-cu12==12.8.90
- packaging==25.0
- pandas==1.5.3
- parso==0.8.4
- pathos==0.3.4
- patsy==1.0.1
- pexpect==4.9.0
- pgmpy==0.1.19
- pillow==11.3.0
- pox==0.3.6
- ppft==1.7.7
- prompt-toolkit==3.0.51
- ptyprocess==0.7.0
- pure-eval==0.2.3
- pygments==2.19.2
- pyparsing==3.2.3
- python-dateutil==2.9.0.post0
- pytz==2025.2
- pyvis==0.3.2
- scikit-learn==1.6.1
- scipy==1.13.1
- seaborn==0.13.2
- six==1.17.0
- stack-data==0.6.3
- statsmodels==0.14.5
- sympy==1.14.0
- threadpoolctl==3.6.0
- torch==2.8.0
- tqdm==4.67.1
- traitlets==5.14.3
- triton==3.4.0
- typing-extensions==4.14.1
- wcwidth==0.2.13
- zipp==3.23.0

Validation for Inference:
-
- Graue M, Wentzel-Larsen T, Hanestad BR, Båtsvik B, Søvik O. Measuring self-reported, health-related, quality of life in adolescents with type 1 diabetes using both generic and disease-specific instruments. Acta Paediatr. 2003 Oct;92(10):1190-6. PMID: 14632337.
- Hanna KM, Weaver MT, Slaven JE, Fortenberry JD, DiMeglio LA. Diabetes-related quality of life and the demands and burdens of diabetes care among emerging adults with type 1 diabetes in the year after high school graduation. Res Nurs Health. 2014 Oct;37(5):399-408. doi: 10.1002/nur.21620. Epub 2014 Aug 27. PMID: 25164122; PMCID: PMC4167564.
- Kent DA, Quinn L. Factors That Affect Quality of Life in Young Adults With Type 1 Diabetes. The Diabetes Educator. 2018;44(6):501-509. doi:10.1177/0145721718808733


