# Import Libraries & Set Up
---

In [1]:
import warnings
import logging
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import bnlearn as bn
from pgmpy.global_vars import logger
from utils import *  # Helper functions for Gaussian cross-validation

warnings.filterwarnings('ignore')
logger.setLevel(logging.ERROR)
logging.getLogger('seaborn').setLevel(logging.ERROR)
logging.getLogger('matplotlib').setLevel(logging.ERROR)
logging.getLogger('pandas').setLevel(logging.ERROR)

palette = ['#800080', '#8A2BE2', '#FF69B4', '#DA70D6', '#9370DB', '#DDA0DD', '#BA55D3']
gradient_palette = sns.light_palette('#620080', as_cmap=True)
plt.rcParams['axes.prop_cycle'] = plt.cycler(color=palette)
sns.set_theme(style="whitegrid", palette=palette)

# Dementia
---

Let's import the processed dataset.

In [2]:
dementia_df = pd.read_csv('data/dementia_data_processed.csv')

## Learning Structure & Parameters with Cross-Validation
---

We now learn both the structure and the parameters of Bayesian Networks (BNs) from the dementia dataset. We will evaluate multiple structure learning methods, perform parameter learning, and assess the results using cross-validation.

Our goals are:
1. To understand the structure of the data using various structure learning algorithms.
2. To learn the parameters of the best structure.
3. To evaluate the performance of each method through cross-validation.

The methods we will use:
1. Hill Climbing (HC) with BDeu scoring
2. PC Algorithm

We will then evaluate each method's performance and compare the results.

### Cross-Validation
---


The `utils` module defines a cross-validation function, `gaussian_cross_validation`. We run it once for each structure learning method we want to test, to see which one gets the best results.
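To make the idea concrete, here is a minimal sketch of how k-fold splitting works in general. It is not the implementation in `utils`; the function names `k_fold_indices` and `train_test_splits`, the default of 5 folds, and the fixed seed are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Split row positions 0..n_rows-1 into k shuffled, near-equal folds."""
    positions = list(range(n_rows))
    random.Random(seed).shuffle(positions)
    # Fold i takes every k-th position starting at i, so fold sizes differ by at most 1.
    return [positions[i::k] for i in range(k)]


def train_test_splits(n_rows, k=5, seed=0):
    """Yield (train_positions, test_positions) once per fold."""
    folds = k_fold_indices(n_rows, k, seed)
    for i, test in enumerate(folds):
        # Training rows are all rows that are not in the held-out fold.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


# Hypothetical example: 10 rows split into 5 folds.
splits = list(train_test_splits(n_rows=10, k=5))
```

With a pandas DataFrame, `df.iloc[train]` and `df.iloc[test]` would select the corresponding rows, so that structure and parameters are learned on the training rows and predictions are scored on the held-out rows.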


In [3]:
evaluation = {}

#### BDeu Hill Climbing
---

In [4]:
structure_kwargs_hc = {'methodtype': 'hc'}

In [None]:
evaluation_hc = gaussian_cross_validation(dementia_df, 'Group', structure_kwargs=structure_kwargs_hc)

In [6]:
evaluation['BDeu Hill Climbing'] = evaluation_hc

#### PC Algorithm
---

In [7]:
structure_kwargs_pc = {'methodtype': 'pc'}

In [None]:
evaluation_pc = gaussian_cross_validation(dementia_df, 'Group', structure_kwargs=structure_kwargs_pc)

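As the output shows, on this dataset the PC Algorithm doesn't seem to learn any connection to the target variable 'Group' on any fold. Without such a connection the network cannot predict the target, so we cannot use this algorithm and leave it out of the evaluation.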
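A target node with no edges cannot be predicted from the other variables, which is why a structure like this is unusable for classification. The check can be sketched as follows, assuming the learned structure is available as a list of `(source, destination)` edge pairs. The helper name `target_neighbours`, the node names, and the example edge lists are hypothetical and are not taken from the learned networks.

```python
def target_neighbours(edges, target):
    """Return the set of nodes that share an edge with `target`, in either direction."""
    neighbours = set()
    for source, dest in edges:
        if source == target:
            neighbours.add(dest)
        elif dest == target:
            neighbours.add(source)
    return neighbours


# Hypothetical edge lists, for illustration only.
edges_with_target = [("A", "Target"), ("Target", "B"), ("B", "C")]
edges_without_target = [("A", "B"), ("B", "C")]

connected = target_neighbours(edges_with_target, "Target")
isolated = target_neighbours(edges_without_target, "Target")

# An empty set means the target is isolated and the structure cannot be used to predict it.
print(sorted(connected), sorted(isolated))
```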

### Evaluation
---

In [None]:
plot_metrics_from_evaluation(evaluation)

In [None]:
display_evaluation_results(evaluation)

In [None]:
plot_confusion_matrices_from_evaluation(evaluation, cmap=gradient_palette)

# Parkinson's Disease
---

In [12]:
parkinsons_df = pd.read_csv('data/parkinsons_data_processed.csv')

## Learning Structure & Parameters with Cross-Validation
---

We now apply the same process of structure learning and parameter learning for Bayesian Networks (BNs) to the Parkinson's dataset. We will evaluate multiple structure learning methods, perform parameter learning, and assess the results using cross-validation.

Our goals are:
1. To understand the structure of the data using various structure learning algorithms.
2. To learn the parameters of the best structure.
3. To evaluate the performance of each method through cross-validation.

The methods we will use:
1. Hill Climbing (HC) with BDeu scoring
2. PC Algorithm

We will then evaluate each method's performance and compare the results.
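One simple way to compare methods is to average a metric over the folds and rank the methods by that mean. The sketch below assumes per-fold scores are available as a plain dictionary of lists. That layout, the function name `rank_methods`, the method names, and the numbers are all made up for illustration and do not reflect the structure returned by the helpers in `utils`.

```python
from statistics import mean


def rank_methods(fold_scores):
    """Return (method, mean score) pairs sorted from best to worst mean score."""
    means = {method: mean(scores) for method, scores in fold_scores.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Made-up per-fold accuracies, for illustration only.
example_scores = {
    "Method A": [0.80, 0.90, 0.70],
    "Method B": [0.60, 0.70, 0.80],
}

ranking = rank_methods(example_scores)
best_method = ranking[0][0]
```

A mean alone hides the variation between folds, so it is worth looking at the per-fold values as well before preferring one method.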

### Cross-Validation
---

In [13]:
evaluation = {}

#### BDeu Hill Climbing
---

In [14]:
structure_kwargs_hc = {'methodtype': 'hc'}

In [None]:
evaluation_hc = gaussian_cross_validation(parkinsons_df, 'Status', structure_kwargs=structure_kwargs_hc)

In [16]:
evaluation['BDeu Hill Climbing'] = evaluation_hc

#### PC Algorithm
---

In [17]:
structure_kwargs_pc = {'methodtype': 'pc'}

In [None]:
evaluation_pc = gaussian_cross_validation(parkinsons_df, 'Status', structure_kwargs=structure_kwargs_pc)

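As the output shows, on this dataset the PC Algorithm doesn't seem to learn any connection to the target variable 'Status' on any fold. Without such a connection the network cannot predict the target, so we cannot use this algorithm and leave it out of the evaluation.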

### Evaluation
---

In [None]:
plot_metrics_from_evaluation(evaluation)

In [None]:
display_evaluation_results(evaluation)

In [None]:
plot_confusion_matrices_from_evaluation(evaluation, cmap=gradient_palette)