# Notebook 06 - Results And Interpretation

In [None]:
## Model Comparison – Decision Tree vs Random Forest

In [None]:
# Imports used throughout this notebook
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from utils.save_tools import save_plot  # project helper for saving figures

# Metrics for each model (updated); list order matches `models`
models = ['Decision Tree', 'Random Forest']
accuracy = [0.76, 0.82]
recall_ad = [0.65, 0.59]
precision_ad = [0.67, 0.86]

# Bar width and positions
bar_width = 0.25
index = np.arange(len(models))

# Create grouped bar chart; keep a handle to the figure, because
# plt.show() can close it and plt.gcf() would then return a new, empty figure
fig = plt.figure(figsize=(10, 6))
plt.bar(index, accuracy, bar_width, label='Accuracy', color='#4C72B0')
plt.bar(index + bar_width, recall_ad, bar_width, label='Recall (AD Class)', color='#55A868')
plt.bar(index + 2 * bar_width, precision_ad, bar_width, label='Precision (AD Class)', color='#C44E52')

# Add labels and formatting
plt.xlabel('Model')
plt.ylabel('Score')
plt.title('Model Performance Comparison – Decision Tree vs Random Forest')
plt.xticks(index + bar_width, models)  # center each tick under its group of three bars
plt.ylim(0, 1)
plt.legend()
plt.grid(axis='y')
plt.tight_layout()
plt.show()

# Save plot
save_plot(
    fig,
    filename="model_comparison_random_forest_vs_decision_tree.png",
    caption="Bar chart comparing accuracy, recall, and precision between Decision Tree and Random Forest models.",
    folder_path="../plots"
)

In [None]:
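We use a grouped bar chart to compare accuracy, recall, and precision for the two models. It makes the trade-off visible: the Random Forest is more accurate (0.82 vs. 0.76) and much more precise for the AD class (0.86 vs. 0.67), but its recall is lower (0.59 vs. 0.65). In other words, it raises fewer false alarms but misses more Alzheimer’s cases.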
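
To make the trade-off between recall and precision concrete, here is a minimal sketch of how both metrics follow from confusion-matrix counts. The counts and the helper name `precision_recall` are hypothetical illustration values, not results from our models.

```python
# Hypothetical confusion-matrix counts for the AD class (illustration only,
# not taken from our models).
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for the positive class."""
    precision = tp / (tp + fp)  # share of predicted AD cases that are real
    recall = tp / (tp + fn)     # share of real AD cases that were caught
    return precision, recall

# A cautious model: few false alarms, but it misses more patients.
cautious = precision_recall(tp=60, fp=10, fn=40)
# A sensitive model: catches more patients, at the cost of more false alarms.
sensitive = precision_recall(tp=80, fp=40, fn=20)

print(f"Cautious:  precision={cautious[0]:.2f}, recall={cautious[1]:.2f}")
print(f"Sensitive: precision={sensitive[0]:.2f}, recall={sensitive[1]:.2f}")
```

In this made-up example the cautious model wins on precision and the sensitive model wins on recall, which is the same kind of pattern the chart lets us spot.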

In [None]:
### Feature Importance - Random Forest

In [None]:
# Get feature importances from the trained model
importances = rf_model.feature_importances_
features = X.columns

In [None]:
# Create DataFrame for sorting
importance_df = pd.DataFrame({
    'Feature': features,
    'Importance': importances
}).sort_values(by='Importance', ascending=True)


In [None]:
# Plot; keep a handle to the figure, because plt.show() can close it
# and plt.gcf() would then return a new, empty figure
fig = plt.figure(figsize=(10, 6))
plt.barh(importance_df['Feature'], importance_df['Importance'], color="#4C72B0")
plt.title("Feature Importance – Random Forest")
plt.xlabel("Relative Importance")
plt.ylabel("Features")
plt.tight_layout()
plt.show()

# Save figure
save_plot(
    fig,
    filename="random_forest_feature_importance.png",
    caption="Bar chart showing the most important features used by the Random Forest model to predict Alzheimer's diagnosis.",
    folder_path="../plots"
)

In [None]:
This plot shows which features had the greatest influence on the Random Forest model's predictions.

The top features (such as MMSE, ADL, FunctionalAssessment) were the most important in classifying whether a patient was likely to have Alzheimer’s.

This helps answer our second research question:  
**Which health and lifestyle features are most predictive of an Alzheimer’s diagnosis?**

It also helps build trust in the model by showing which variables matter most.
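
For readers who want to see what `feature_importances_` contains, here is a minimal sketch on synthetic data. The dataset, the feature names, and the variable names (`X_demo`, `rf_demo`, `demo_importance`) are assumptions for illustration only; they are not our Alzheimer’s data or our trained model. In scikit-learn, a Random Forest's importances are impurity-based and normalized so that they sum to 1.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the feature names are hypothetical.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=42
)
X_demo = pd.DataFrame(X_demo, columns=[f"feature_{i}" for i in range(5)])

rf_demo = RandomForestClassifier(n_estimators=50, random_state=42)
rf_demo.fit(X_demo, y_demo)

# One importance value per feature, sorted from most to least important.
demo_importance = pd.Series(
    rf_demo.feature_importances_, index=X_demo.columns
).sort_values(ascending=False)

print(demo_importance)
print("Sum of importances:", round(demo_importance.sum(), 6))
```

Because the values are relative shares, a feature's importance should be read in comparison with the other features, not as an absolute effect size.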


In [None]:
### Model Performance Comparison – Accuracy vs ROC AUC
In this section, we compare the performance of our three supervised models:
- **Decision Tree**
- **Random Forest**
- **Logistic Regression**

We visualize the results using a bar chart, showing:
- **Accuracy**: Overall percentage of correct predictions.
- **ROC AUC** (Area Under the Curve): How well the model separates the two classes (Alzheimer's / No Alzheimer's).

This helps us identify which model performs best for our prediction task.
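
The difference between the two metrics can be sketched as follows: accuracy is computed from hard class labels, while ROC AUC is computed from predicted probabilities. The example uses synthetic data and a freshly fitted Logistic Regression; all names (`X_demo`, `clf`, `demo_accuracy`, `demo_auc`) are assumptions for illustration and are unrelated to the scores reported in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustration only).
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Accuracy uses the predicted class labels.
demo_accuracy = accuracy_score(y_te, clf.predict(X_te))
# ROC AUC uses the predicted probability of the positive class (column 1).
demo_auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

print(f"Accuracy: {demo_accuracy:.2f}")
print(f"ROC AUC:  {demo_auc:.2f}")
```

Because ROC AUC looks at the ranking of probabilities across all thresholds, two models with similar accuracy can still differ clearly in AUC.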

In [None]:
dt_accuracy = 0.75
rf_accuracy = 0.80
log_accuracy = 0.82

dt_auc = 0.84
rf_auc = 0.91
log_auc = 0.88

In [None]:
results = pd.DataFrame({
    'Model': ['Decision Tree', 'Random Forest', 'Logistic Regression'],
    'Accuracy': [dt_accuracy, rf_accuracy, log_accuracy],
    'ROC AUC': [dt_auc, rf_auc, log_auc]
})

# Bar chart; DataFrame.plot returns the Axes, so we keep the figure handle
# for saving (plt.gcf() in a later cell would return a new, empty figure)
ax = results.set_index("Model")[['Accuracy', 'ROC AUC']].plot(kind='bar', figsize=(8, 6))
fig = ax.get_figure()
plt.title("Model Comparison – Accuracy and ROC AUC")
plt.ylabel("Score")
plt.ylim(0.7, 1.0)
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()

In [None]:
# Save the figure created in the previous cell
save_plot(
    fig,
    filename="model_comparison_bar_chart.png",
    caption="Comparison of Accuracy and ROC AUC scores for Decision Tree, Random Forest, and Logistic Regression.",
    folder_path="../plots"
)


In [None]:
### What does the model comparison chart show?
We evaluate each model using two metrics:
- **Accuracy**: Percentage of correct predictions on the test set.
- **ROC AUC** (Receiver Operating Characteristic – Area Under Curve): How well the model separates Alzheimer’s vs. non-Alzheimer’s cases based on predicted probability.

### What we observe:
- **Random Forest** has the highest ROC AUC (0.91), while **Logistic Regression** has the highest accuracy (0.82), slightly ahead of Random Forest (0.80).  
- **Decision Tree** scores lowest on both metrics (accuracy 0.75, ROC AUC 0.84).  
- All models achieve AUC scores above 0.80, which indicates a strong ability to distinguish between the two classes.

This helps us choose the best model for prediction:  
**Random Forest** separates the classes best (highest ROC AUC) with accuracy close to the top score, so it shows the most balanced performance on these two metrics.


In [None]:
### Model Comparison – Precision, Recall, F1-Score

To compare how well our models identify Alzheimer’s patients (class 1), we extract key metrics from each classification report:

- **Precision** tells us how many of the predicted positives were actually correct.
- **Recall** tells us how many of the actual positives the model was able to detect.
- **F1-score** is the harmonic mean of precision and recall, so it is only high when both are high.

This allows us to see **not just accuracy**, but how safely the model can help with early diagnosis in a real-world setting.
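
As a small worked example, the F1-score can be computed by hand from precision and recall and checked against scikit-learn. The labels below are toy values chosen for illustration (1 = Alzheimer’s, 0 = no Alzheimer’s), not our test set.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Toy labels for illustration only.
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_demo = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

p = precision_score(y_true_demo, y_pred_demo)  # TP / (TP + FP) = 3 / 5
r = recall_score(y_true_demo, y_pred_demo)     # TP / (TP + FN) = 3 / 4

# F1 is the harmonic mean of precision and recall.
f1_manual = 2 * p * r / (p + r)
f1_sklearn = f1_score(y_true_demo, y_pred_demo)

print(f"precision={p:.2f}, recall={r:.2f}")
print(f"F1 (manual)={f1_manual:.4f}, F1 (sklearn)={f1_sklearn:.4f}")
```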


In [None]:
### Get Classification Reports for Each Model

In [None]:
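from sklearn.metrics import classification_report
import pandas as pd

# output_dict=True returns nested dictionaries instead of a formatted string.
# dt_pred, rf_pred, and y_pred are test-set predictions computed earlier;
# y_pred holds the Logistic Regression predictions.
report_dt = classification_report(y_test, dt_pred, output_dict=True)
report_rf = classification_report(y_test, rf_pred, output_dict=True)
report_log = classification_report(y_test, y_pred, output_dict=True)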


In [None]:
We generate the classification reports for each model (Decision Tree, Random Forest, and Logistic Regression) and convert them into dictionaries so we can extract specific values.
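
To show what such a dictionary looks like, here is a minimal sketch with toy labels (illustration only, not our test data; the names `y_true_demo`, `y_pred_demo`, and `report_demo` are assumptions). Note that integer class labels become string keys such as `"1"`.

```python
from sklearn.metrics import classification_report

# Toy labels for illustration only (1 = Alzheimer's, 0 = no Alzheimer's).
y_true_demo = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred_demo = [0, 0, 1, 1, 1, 0, 1, 0]

report_demo = classification_report(y_true_demo, y_pred_demo, output_dict=True)

# Top-level keys: one per class label (as strings) plus summary entries.
print(list(report_demo.keys()))
# Per-class metrics are stored under the label's string key.
print(report_demo["1"])
print("Recall for class 1:", report_demo["1"]["recall"])
```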

In [None]:
### Create Comparison DataFrame for Class 1 (Alzheimer's)

In [None]:
# Create comparison table for class 1
comparison_df = pd.DataFrame({
    "Decision Tree": {
        "Precision": report_dt["1"]["precision"],
        "Recall": report_dt["1"]["recall"],
        "F1-score": report_dt["1"]["f1-score"]
    },
    "Random Forest": {
        "Precision": report_rf["1"]["precision"],
        "Recall": report_rf["1"]["recall"],
        "F1-score": report_rf["1"]["f1-score"]
    },
    "Logistic Regression": {
        "Precision": report_log["1"]["precision"],
        "Recall": report_log["1"]["recall"],
        "F1-score": report_log["1"]["f1-score"]
    }
})


In [None]:
We focus on class 1, which represents patients diagnosed with Alzheimer’s disease.
This allows us to evaluate how good each model is at detecting the group that matters most for early intervention.

In [None]:
### Format and Display the Table

In [None]:
comparison_df = comparison_df.T.round(2)
display(comparison_df)

In [None]:
We transpose the table to make it easier to read and round the scores to two decimals for a cleaner output.

In [None]:
### Visualize as Bar Chart

In [None]:
# DataFrame.plot returns the Axes; keep the figure handle so the chart can be
# saved in a later cell (plt.gcf() there would return a new, empty figure)
ax = comparison_df.plot(kind="bar", figsize=(8, 6))
fig_class1 = ax.get_figure()
plt.title("Precision, Recall and F1-score for Alzheimer's Class (1)")
plt.ylabel("Score")
plt.ylim(0, 1)
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()


In [None]:
This bar chart compares how well each model detects Alzheimer's patients (class 1) on three metrics:

- Precision: the share of predicted AD cases that were correct
- Recall: the share of real AD cases that were detected
- F1-score: the harmonic mean of precision and recall

In [None]:
# Save the figure created in the plotting cell above
save_plot(
    fig_class1,
    filename="class1_comparison_bar_chart.png",
    caption="Comparison of precision, recall and F1-score for class 1 (Alzheimer's) across three models.",
    folder_path="../plots"
)


In [None]:
### Bar Chart - Zoomed Y-Axis (0.5 to 1.0)

In [None]:
import matplotlib.pyplot as plt

from utils.save_tools import save_plot

# Plot grouped bar chart with a zoomed y-axis (0.5 to 1.0).
# DataFrame.plot creates its own figure, so no separate plt.figure() call is needed.
ax = comparison_df.plot(kind="bar", ylim=(0.5, 1.0), figsize=(8, 6))
fig = ax.get_figure()  # keep the handle; plt.gcf() in a later cell would be empty
plt.title("Precision, Recall, and F1-score for Class 1 – Alzheimer's")
plt.ylabel("Score")
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()


In [None]:
# Save the figure created in the previous cell
save_plot(
    fig,
    filename="model_precision_recall_f1_bar_chart.png",
    caption="Comparison of precision, recall, and F1-score for predicting Alzheimer's (class 1) across all trained models.",
    folder_path="../plots"
)


In [None]:
### What Do We See?
In the chart above, the Decision Tree performs best on all three metrics for Alzheimer’s detection.

Random Forest and Logistic Regression score lower for this class, especially on recall, which means they miss more true Alzheimer’s cases.

This shows why precision and recall, not only accuracy, are essential when predicting a serious diagnosis: the models with the higher accuracy and ROC AUC are not the ones that detect the most Alzheimer’s cases.