# SHAP UTILIZATION TUTORIAL – Regression Decision Tree Model
## Explainable AI for Climate–Health Forecasting

## 1. Introduction

This tutorial demonstrates how to use SHAP (SHapley Additive exPlanations) to interpret a regression model built for climate–health predictions.
The workflow follows the CHAP-core structure and walks through each step of the SHAP process with runnable code.

### By the end of this tutorial, you will understand:

- What SHAP values represent

- How to compute SHAP values for a trained model

- How to generate global and local explanations

- How to visualize and interpret climate variable contributions

## 2. Import Dependencies

In this step, we import the Python libraries used in the tutorial.

In [None]:
import pandas as pd
import numpy as np
import joblib
import shap
import matplotlib.pyplot as plt
import os

## 3. Load the Trained Model

We use a DecisionTreeRegressor that was previously trained on climate–health data and saved as ../output/brazil_model.bin.

In [None]:
model_path = "../output/brazil_model.bin"

print("Loading model:", model_path)
model = joblib.load(model_path)

**Why this matters:**

SHAP explains the behavior of an existing model. We don't train here; we only interpret.

## 4. Load the Dataset (CHAP-Compatible)

We load the historic dataset used for training.
Typical CHAP-style fields include:

- rainfall

- mean_temperature

- disease_cases

- time_period

- location

In [None]:
data_path = "../data/historic_brazil.csv"

# Read the historic data and drop every row that has a missing value in any column
df = pd.read_csv(data_path)
df = df.dropna()

features = ["rainfall", "mean_temperature"]
target = "disease_cases"

X = df[features]
y = df[target]

X.head()

**Why this matters:**

SHAP must be given the same feature columns, in the same order, that the model was trained on.
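One practical detail: a plain dropna() removes a row when any column is missing, including columns the model never sees. The sketch below shows one way to validate the columns and drop only rows with gaps in the features or the target. The helper name `build_feature_matrix` and the toy data are hypothetical, and whether this matches the rows used for training depends on the training script, which is not shown here.

```python
import numpy as np
import pandas as pd

def build_feature_matrix(df, features, target):
    """Return (X, y) after checking columns and dropping incomplete rows.

    Hypothetical helper for illustration; not part of CHAP or SHAP.
    """
    needed = list(features) + [target]
    missing = [c for c in needed if c not in df.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    # Only rows with gaps in the columns we actually use are dropped
    clean = df.dropna(subset=needed)
    # Selecting with the list keeps the columns in the order given
    return clean[list(features)], clean[target]

# Toy data standing in for the historic dataset (values are made up)
toy = pd.DataFrame({
    "rainfall": [10.0, np.nan, 30.0, 5.0],
    "mean_temperature": [25.0, 26.0, 27.0, 24.0],
    "disease_cases": [3.0, 4.0, np.nan, 1.0],
    "location": ["a", "b", "c", None],
})

X_toy, y_toy = build_feature_matrix(toy, ["rainfall", "mean_temperature"], "disease_cases")
print(X_toy.shape)
```

The last toy row is kept even though its location is missing, because location is not a model input.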

## 5. Understanding SHAP (Short Explanation)

SHAP values answer:

*“How much did each feature contribute to the model’s prediction for this instance?”*

***Key ideas:***

- SHAP is based on cooperative game theory

- A positive SHAP value means the feature increases the prediction

- A negative SHAP value means the feature decreases the prediction

- The magnitude shows how strong the impact is

- SHAP provides a dedicated, efficient method for tree-based models (TreeExplainer)
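To make the game-theory idea concrete, here is a brute-force Shapley calculation on a tiny two-feature model. Everything in it is an assumption made for illustration: `toy_model` is a made-up function, and a feature counts as "absent" when it is set to a chosen baseline value. This is not how TreeExplainer works internally; it only shows what the resulting numbers mean.

```python
from itertools import combinations
from math import factorial

def toy_model(rainfall, mean_temperature):
    """Made-up prediction function with an interaction term."""
    return 2.0 * rainfall + 1.0 * mean_temperature + 0.5 * rainfall * mean_temperature

def shapley_values(model, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Features in the coalition use the instance value, the rest use the baseline
        args = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(**args)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        phi = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                # Standard Shapley weight for a coalition of this size
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi += weight * (value(set(subset) | {name}) - value(set(subset)))
        result[name] = phi
    return result

instance = {"rainfall": 4.0, "mean_temperature": 2.0}
baseline = {"rainfall": 0.0, "mean_temperature": 0.0}
phi = shapley_values(toy_model, instance, baseline)
base_value = toy_model(**baseline)
prediction = toy_model(**instance)
print(phi, base_value, prediction)
```

The contributions add up to the gap between the baseline output and the prediction for this instance, which is the additive property the later plots rely on.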

## 6. Initialize the SHAP Explainer

Because our model is a DecisionTreeRegressor, we use TreeExplainer.

In [None]:
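# TreeExplainer computes SHAP values efficiently from the tree structure
explainer = shap.TreeExplainer(model)
# Result: one row per row of X, one column per feature
shap_values = explainer.shap_values(X)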

**What happens here:**

SHAP computes a contribution score for every feature in every row of the dataset.

## 7. Global Explanations

Global explanations help answer:

***“Which features are most important overall?”***

#### 7.1 SHAP Summary Plot

This plot shows:

- Feature importance

- Direction of influence

- Distribution of SHAP values

In [None]:
plt.figure()
shap.summary_plot(shap_values, X, show=False)
plt.tight_layout()
plt.show()

#### 7.2 SHAP Bar Plot (Mean Absolute Contribution)

This shows the mean absolute SHAP value of each feature, that is, its average impact magnitude across all rows regardless of direction.

In [None]:
plt.figure()
shap.summary_plot(shap_values, X, plot_type="bar", show=False)
plt.tight_layout()
plt.show()
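The ranking behind the bar plot can also be computed directly, which is handy for reports or tests. The sketch below uses a made-up SHAP matrix and a hypothetical helper `mean_abs_shap`. With the real data you would pass `shap_values` and `X.columns` instead.

```python
import numpy as np
import pandas as pd

def mean_abs_shap(shap_matrix, feature_names):
    """Mean absolute SHAP value per feature, sorted from most to least important."""
    importance = np.abs(np.asarray(shap_matrix)).mean(axis=0)
    return pd.Series(importance, index=feature_names).sort_values(ascending=False)

# Made-up SHAP matrix: 4 rows, 2 features
example_shap = np.array([
    [1.0, -4.0],
    [-2.0, 2.0],
    [3.0, -2.0],
    [-2.0, 4.0],
])
ranking = mean_abs_shap(example_shap, ["rainfall", "mean_temperature"])
print(ranking)
```

In this toy matrix the signed mean of each column is zero, so a plain average would hide both features. Taking absolute values first is what makes the bar plot a measure of magnitude.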

## 8. Local Explanations

A local explanation interprets one prediction in detail.

Pick an example index:

In [None]:
index = 100
sample = X.iloc[index:index+1]
sample

**Compute SHAP values for this sample**

In [None]:
sample_shap = explainer.shap_values(sample)

#### 8.1 Force Plot

A force plot shows how each feature pushes the prediction up or down relative to the baseline (explainer.expected_value, the model's average prediction).

In [None]:
shap.initjs()
shap.force_plot(
    explainer.expected_value,
    sample_shap,
    sample
)
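The force plot needs a JavaScript-capable notebook to render. As a plain-text alternative, the sketch below prints the same information for one row. The helper `explain_row` and all numbers are hypothetical. With the real objects you would pass `explainer.expected_value`, `sample_shap`, and `X.columns`.

```python
import numpy as np

def explain_row(base_value, shap_row, feature_names):
    """Return (contributions sorted by magnitude, reconstructed prediction)."""
    # Accepts shape (n_features,) or (1, n_features)
    shap_row = np.ravel(shap_row)
    pairs = sorted(
        zip(feature_names, shap_row.tolist()),
        key=lambda p: abs(p[1]),
        reverse=True,
    )
    prediction = float(base_value + shap_row.sum())
    return pairs, prediction

# Made-up numbers for illustration only
pairs, row_prediction = explain_row(
    base_value=50.0,
    shap_row=[[12.5, -4.0]],
    feature_names=["rainfall", "mean_temperature"],
)
for name, contribution in pairs:
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{name} {direction} the prediction by {abs(contribution):.1f}")
print(f"baseline 50.0 -> prediction {row_prediction:.1f}")
```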

## 9. Dependence Plots

Dependence plots show how one feature affects predictions across the dataset.

**Rainfall effect**

In [None]:
shap.dependence_plot("rainfall", shap_values, X)

**Interpretation:**

These plots help reveal nonlinear patterns, for example how the model's predicted vector-borne disease risk increases within specific rainfall and temperature ranges.
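To put numbers on what a dependence plot shows, one option is to average the SHAP values of a feature within ranges of that feature. The sketch below does this with made-up rainfall values and bin edges. The helper `binned_shap` is hypothetical, and with the real data you would pass `X["rainfall"]` and the rainfall column of `shap_values`.

```python
import numpy as np
import pandas as pd

def binned_shap(feature_values, feature_shap, bins):
    """Average SHAP value of one feature within each range of its own values."""
    table = pd.DataFrame({
        "value": np.asarray(feature_values, dtype=float),
        "shap": np.asarray(feature_shap, dtype=float),
    })
    table["range"] = pd.cut(table["value"], bins=bins)
    return table.groupby("range", observed=True)["shap"].mean()

# Made-up rainfall values and SHAP values: the effect peaks in the middle range
rainfall = [10, 40, 80, 120, 150, 190, 220, 260]
rainfall_shap = [-3.0, -3.0, -1.0, 4.0, 6.0, 5.0, 1.0, -1.0]
summary = binned_shap(rainfall, rainfall_shap, bins=[0, 100, 200, 300])
print(summary)
```

A table like this makes it easier to state which ranges push predictions up or down, but the bin edges are a modelling choice and can hide structure that the scatter plot shows.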

## 10. Summary & Learning Outcomes

In this tutorial, we:

- Loaded a trained regression decision tree model

- Imported climate–health data

- Initialized a SHAP explainer

- Computed SHAP values

- Produced global explanation plots

- Generated local explanation plots

- Interpreted how rainfall and temperature drive disease predictions

- Followed a CHAP-style folder structure for outputs

## Explainability Pipeline