<a href="https://colab.research.google.com/github/samiha-mahin/A-Machine-Learning-Models-Repo/blob/main/Explainable_Ai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Model-Based Explainable AI (XAI)**

## 🔍 What is Model-Based Explainable AI?

**Model-Based Explainable AI** refers to techniques where **the model itself is inherently interpretable or designed to be explainable**. These models are either:

1. **Intrinsically interpretable**: like Decision Trees, Linear Regression.
2. **Modified to provide explanations**: for example, using attention mechanisms in neural networks or building inherently explainable deep models.

So, in **Model-Based XAI**, the **focus is on building a model that can explain its decisions as part of its design**, instead of using post-hoc tools like SHAP or LIME.

---

## 💡 Key Characteristics

* Interpretability is built into the model.
* No need for external explanation tools.
* Often involves a tradeoff between accuracy and interpretability.

---

## ✅ Examples of Model-Based XAI

### 1. **Decision Trees (e.g., CART, ID3)**

* The tree structure shows how decisions are made based on features.
* At each node, a rule is applied (e.g., "Age < 30?"), making it easy to follow and understand.

📌 **Use case**: Predicting if a loan should be approved.

```plaintext
If income > 50K:
   ├── If credit score > 700 → Approve
   └── Else → Deny
Else:
   └── Deny
```

Here, you can clearly see **why** the loan was approved or denied.
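As a rough sketch of the same idea in code, the snippet below fits a small tree with scikit-learn and prints its rules. The eight applicants, the feature names `income_k` and `credit_score`, and the labels are made up for illustration; the learned thresholds depend entirely on this toy data.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy, made-up applicants: [income in $K, credit score]
X = [[30, 720], [40, 650], [45, 780], [60, 720],
     [80, 750], [70, 710], [65, 600], [75, 640]]
# 1 = approve, 0 = deny (approved only when income > 50K and score > 700)
y = [0, 0, 0, 1, 1, 1, 0, 0]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# The fitted tree can be printed as human-readable if/else rules
rules = export_text(clf, feature_names=["income_k", "credit_score"])
print(rules)

# For a new applicant, the path through the printed rules is the explanation
print(clf.predict([[65, 730]]))
```

Reading the printed rules from top to bottom gives the same kind of "if income ... and credit score ..." story as the hand-written tree above.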

---

### 2. **Generalized Additive Models (GAMs)**

* Predicts outcome as a sum of functions of individual features.
* You can visualize each feature's contribution.

📌 **Use case**: Predicting diabetes risk.

The model might say:

```plaintext
Risk score = f1(age) + f2(BMI) + f3(blood pressure)
```

Each `f` is a smooth curve showing how that feature affects the prediction. Easy to interpret.
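To make the additive structure concrete, here is a minimal sketch. The three shape functions, the intercept, and the logistic link are hypothetical stand-ins; a real GAM would learn the curves from data.

```python
import math

# Hypothetical shape functions; a fitted GAM would learn these curves from data
def f_age(age):
    return 0.02 * (age - 40)          # risk rises slowly with age

def f_bmi(bmi):
    return 0.15 * max(0.0, bmi - 25)  # flat until BMI 25, then increasing

def f_bp(systolic):
    return 0.01 * (systolic - 120)    # roughly linear in blood pressure

def explain(age, bmi, systolic, intercept=-2.0):
    """Return per-feature contributions, the additive score, and a probability."""
    contributions = {
        "age": f_age(age),
        "BMI": f_bmi(bmi),
        "blood pressure": f_bp(systolic),
    }
    score = intercept + sum(contributions.values())
    probability = 1 / (1 + math.exp(-score))  # logistic link (an assumption)
    return contributions, score, probability

contributions, score, probability = explain(age=60, bmi=31, systolic=140)
for name, value in contributions.items():
    print(f"{name:>15}: {value:+.2f}")
print(f"score = {score:+.2f}, risk = {probability:.2f}")
```

Because the score is a plain sum, each printed number is that feature's full contribution, with no hidden interactions to account for.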

---

### 3. **Explainable Boosting Machines (EBMs)**

* An extension of GAMs in which each feature function is learned with boosted shallow decision trees.
* Often close to black-box boosted trees in accuracy while staying interpretable.

📌 **Use case**: Hospital readmission prediction.

You can see:

* Feature: "No. of past visits" → contributes +10 to risk.
* Feature: "Age > 70" → contributes +15 to risk.

Each feature’s contribution is shown clearly.

---

### 4. **Attention-based models (in NLP or Vision)**

* Highlight **which part of the input** the model focused on.

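📌 **Use case**: Sentiment analysis

* Sentence: "The movie was surprisingly good"
* Attention weights suggest that "surprisingly good" received the most weight. Treat this as a hint rather than proof: a high attention weight does not always mean a large effect on the prediction.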
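A small sketch of where such weights come from: raw attention scores are passed through a softmax so they sum to 1. The scores below are invented for illustration and do not come from a trained model.

```python
import math

tokens = ["The", "movie", "was", "surprisingly", "good"]
# Hypothetical raw attention scores (e.g. query-key dot products) for one head
scores = [0.1, 0.8, 0.1, 2.0, 2.3]

def softmax(values):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

attention = softmax(scores)
# Print tokens from most to least attended
for token, weight in sorted(zip(tokens, attention), key=lambda pair: -pair[1]):
    print(f"{token:>12}: {weight:.2f}")
```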

---

### 5. **ProtoPNet (Prototype Networks)**

* Used in image classification.
* Model classifies based on **prototype images** from training set.

📌 **Use case**: Classifying birds

* The model says: "I classify this as a Sparrow because this part of the bird looks like this part in a known Sparrow image."

So it's like saying **"This new sample is similar to this known prototype"**, which helps in human-level understanding.

---

## 🟢 Summary

| Model-Based XAI | Description                            | Example               |
| --------------- | -------------------------------------- | --------------------- |
| Decision Trees  | Rule-based structure                   | Loan approval         |
| GAMs            | Sum of interpretable feature functions | Diabetes prediction   |
| EBMs            | Boosted interpretable model            | Hospital readmission  |
| Attention       | Highlights input regions               | Sentiment/translation |
| ProtoPNet       | Uses prototypes for decision           | Image classification  |

---

## 📌 When to Use Model-Based XAI?

* When you **need trust and clarity** (e.g., healthcare, law, finance).
* When regulatory compliance requires **explainability**.
* When end-users need to understand the **why** behind a prediction.

---



# **Post-Hoc Explainable AI (XAI)**

## 🔍 What is Post-Hoc Explainable AI?

**Post-Hoc XAI** means explaining a model **after it has already been trained**, **without changing the model itself**.

Think of it like this:

> “The model is already made (like a black box). Now let’s try to understand **why it gave a certain output**, using tools and methods.”

This approach is useful when the model is too complex (e.g., deep neural networks) or when we want to keep accuracy while adding interpretability.

---

## 🧠 Categories of Post-Hoc Explanations

| Type                    | Explanation                                                |
| ----------------------- | ---------------------------------------------------------- |
| **Global**              | Explains the overall behavior of the model.                |
| **Local**               | Explains why the model made a specific prediction.         |
| **Feature-based**       | Shows feature importance.                                  |
| **Example-based**       | Uses similar past examples to explain.                     |
| **Visualization-based** | Uses heatmaps or attention maps to show influential parts. |

---

## 🟥 Post-Hoc XAI for Black-Box Models

### ✅ Common Black-Box Models

* Deep Neural Networks (CNNs, LSTMs)
* Random Forests
* Gradient Boosted Trees
* SVMs with nonlinear kernels

### 📌 Tools & Examples

#### 1. **SHAP (SHapley Additive exPlanations)**

* Based on game theory.
* Tells you how much each feature contributed to a specific prediction.

📍 **Example**: In credit scoring

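* Model predicts 85% chance of loan default.
* SHAP explains the gap between the average prediction (the base value) and this prediction. With illustrative numbers:

  * Base value: 62%
  * Age: -12%
  * Low income: +20%
  * High debt: +15%
  * Total = 62 - 12 + 20 + 15 = 85%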

> "Because of your low income and high debt, the model predicted a high risk."

---

#### 2. **LIME (Local Interpretable Model-agnostic Explanations)**

* Creates a simple interpretable model (like a linear model) **around a single prediction**.
* Helps understand one prediction at a time.

📍 **Example**: Text sentiment

> Sentence: “The movie was **not good**.”

* LIME shows:

  * “not” = -0.6
  * “good” = +0.4

So, the negative weight of “not” outweighs the positive weight of “good”, and the prediction is negative.

---

#### 3. **Saliency Maps / Grad-CAM (for images)**

* Used for CNNs.
* Highlights which part of an image the model used to make its prediction.

📍 **Example**: Dog vs. cat classifier

* Grad-CAM highlights the ears and nose of the cat in red — the parts that helped it decide it’s a cat.

---

## 🟨 Post-Hoc XAI for White-Box Models

Even though white-box models are already interpretable, we sometimes use post-hoc tools for:

* Better **visualization**
* Feature ranking
* Comparing **real behavior vs. expected logic**

### ✅ Common White-Box Models

* Decision Trees
* Linear Regression
* Logistic Regression
* Rule-based models

### 📌 Tools & Examples

#### 1. **Feature Importance Plots**

* Ranks features by how much they influence the output.
* Can be based on:

  * Coefficients (in regression)
  * Gini importance (in trees)

📍 **Example**: House price prediction

* Top features:

  * Area: 45%
  * Location: 30%
  * Age of house: 25%
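A hedged sketch of both ranking styles on synthetic data (the features, coefficients, and noise level are invented). Coefficient magnitudes are only comparable here because all three features share the same scale; for regression trees, scikit-learn's `feature_importances_` is impurity-based (variance reduction), the regression counterpart of Gini importance.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 300
# Synthetic, standardized features: area, location score, age of house
X = rng.normal(size=(n, 3))
price = 3.0 * X[:, 0] + 2.0 * X[:, 1] - 1.0 * X[:, 2] + rng.normal(scale=0.1, size=n)
names = ["area", "location", "age"]

# Coefficient-based ranking: share of total absolute coefficient size
linear = LinearRegression().fit(X, price)
coef_share = np.abs(linear.coef_) / np.abs(linear.coef_).sum()

# Impurity-based importance from a tree; scikit-learn normalizes it to sum to 1
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, price)
tree_share = tree.feature_importances_

for name, c, t in zip(names, coef_share, tree_share):
    print(f"{name:>8}: coefficients {c:.0%}, tree {t:.0%}")
```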

---

#### 2. **Partial Dependence Plots (PDPs)**

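* Shows how one feature affects the prediction on average: the feature is varied while the other features keep their observed values, and the predictions are averaged over the dataset.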

📍 **Example**: Predicting diabetes risk

* Plot for "BMI" shows that risk increases rapidly after BMI > 28.

---

#### 3. **ICE (Individual Conditional Expectation) plots**

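* Like a PDP, but draws **one curve per individual instance** instead of a single averaged curve.
* Reveals differences between individuals or subgroups that the PDP average hides, which helps in fairness analysis.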
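The relationship between ICE and PDP can be sketched in a few lines. The `predict_risk` function and the three patients are hypothetical; the point is only that the PDP is the point-wise average of the ICE curves.

```python
# Hypothetical model: BMI matters more for older patients (an interaction)
def predict_risk(age, bmi):
    slope = 0.01 if age < 50 else 0.04
    return min(1.0, 0.1 + slope * max(0, bmi - 22))

patients = [{"age": 30, "bmi": 24}, {"age": 45, "bmi": 29}, {"age": 62, "bmi": 27}]
bmi_grid = [20, 24, 28, 32, 36]

# ICE: one curve per patient, varying BMI and keeping that patient's age
ice_curves = [[predict_risk(p["age"], bmi) for bmi in bmi_grid] for p in patients]

# PDP: the point-wise average of the ICE curves
pdp_curve = [sum(column) / len(column) for column in zip(*ice_curves)]

for p, curve in zip(patients, ice_curves):
    print(f"age {p['age']}:", [round(v, 2) for v in curve])
print("PDP   :", [round(v, 2) for v in pdp_curve])
```

In this toy setup the oldest patient's curve rises much faster than the others, a difference the averaged PDP curve would smooth over.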

---

## 🧾 Summary Table

| Model Type    | Post-Hoc XAI Tools                                           | Purpose                                            |
| ------------- | ------------------------------------------------------------ | -------------------------------------------------- |
| **Black-Box** | SHAP, LIME, Grad-CAM, Saliency Maps                          | Explain opaque models like DNNs, Random Forests    |
| **White-Box** | Feature importance, PDP, ICE, SHAP (for extra visualization) | Add depth or visualization to interpretable models |

---

## 📌 When to Use Post-Hoc XAI?

* When the model is **already trained** and **changing it is not practical**.
* When you need to **justify** model decisions to users, clients, or regulators.
* When the model is complex and not understandable as-is (black-box).

---


# **Explainable AI (XAI) by its scope**

## 🎯 XAI by Scope: Overview

There are **three main scopes** or levels in Explainable AI:

| Scope                            | Description                                      | Example Question                                      |
| -------------------------------- | ------------------------------------------------ | ----------------------------------------------------- |
| **Global Explainability**        | Understand the **overall behavior** of the model | "How does this model generally make decisions?"       |
| **Local Explainability**         | Understand a **single prediction** or decision   | "Why did the model predict **this specific result**?" |
| **Example-Based Explainability** | Understand using **reference examples**          | "What similar past cases influenced this prediction?" |

---

## 🔍 1. Global Explainability

### 📌 What it means:

* Gives a **broad view** of how the model behaves overall.
* Explains **which features matter most** and **how** they influence predictions.

### ✅ Tools:

* **Feature Importance (e.g., SHAP summary plots)**
* **Partial Dependence Plots (PDP)**
* **Model summary visualization**

### 🧠 Example:

Predicting loan approval:

* Global explanation shows:

  * Income → most important
  * Age → moderately important
  * Education → less important

You understand **how the model behaves for all inputs.**

---

## 🔍 2. Local Explainability

### 📌 What it means:

* Focuses on **why the model made one specific prediction**.
* Helpful for **trust and accountability** in individual decisions.

### ✅ Tools:

* **LIME**
* **SHAP (for a single instance)**
* **Counterfactual explanations** (what would change the result)

### 🧠 Example:

A customer’s loan was denied.

* Local explanation:

  * Income = +15%
  * Credit score = -40%
  * Job type = -10%

→ Final prediction: Denied

You now know **why this one person got denied**.

---

## 🔍 3. Example-Based Explainability

### 📌 What it means:

* Uses **similar examples** from the training set to explain a new decision.
* Based on **“this is similar to that”** logic.

### ✅ Tools:

* **Case-based reasoning**
* **K-Nearest Neighbors**
* **Prototype and criticism methods (e.g., ProtoPNet)**

### 🧠 Example:

* Image classifier says: "This is a tiger."
* It shows 3 images from the training set that were also tigers and looked similar.
* Human says: "Okay, now I see why!"
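A nearest-neighbour version of this idea can be sketched as follows. The 2-D vectors stand in for image embeddings and, like the example names, are invented for illustration.

```python
import math

# Hypothetical training set: (example id, label, 2-D feature vector)
training_set = [
    ("tiger_01", "tiger", (0.9, 0.8)),
    ("tiger_02", "tiger", (0.8, 0.9)),
    ("tiger_03", "tiger", (1.0, 0.7)),
    ("cat_01",   "cat",   (0.2, 0.3)),
    ("cat_02",   "cat",   (0.1, 0.4)),
    ("zebra_01", "zebra", (0.5, 0.1)),
]

def explain_by_examples(query, k=3):
    """Return the majority label of the k nearest examples, plus their ids."""
    ranked = sorted(training_set, key=lambda item: math.dist(query, item[2]))
    neighbours = ranked[:k]
    labels = [label for _, label, _ in neighbours]
    prediction = max(set(labels), key=labels.count)
    return prediction, [name for name, _, _ in neighbours]

prediction, evidence = explain_by_examples((0.85, 0.85))
print(prediction, evidence)
```

The returned ids are the "similar past cases" that can be shown to the user alongside the prediction.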

---

## 🧾 Summary Table

| Scope             | Goal                   | Tool Example              | Best For                               |
| ----------------- | ---------------------- | ------------------------- | -------------------------------------- |
| **Global**        | Understand whole model | SHAP summary, PDP         | Model debugging, feature understanding |
| **Local**         | Explain one prediction | LIME, SHAP (single point) | High-stakes decisions                  |
| **Example-Based** | Use similar examples   | ProtoPNet, KNN            | Interpretability through resemblance   |

---

## 🧠 Final Thought

> XAI by scope helps you **zoom in or out**:

* **Zoom out** = Global view of model behavior
* **Zoom in** = Local view of one decision
* **Compare** = Example-based explanations using known cases




# **SHAP** and **LIME**

## 🌟 What Are SHAP and LIME?

| Tool     | Full Form                                       | Purpose                                                        |
| -------- | ----------------------------------------------- | -------------------------------------------------------------- |
| **SHAP** | SHapley Additive exPlanations                   | Explains model output using feature contribution scores        |
| **LIME** | Local Interpretable Model-agnostic Explanations | Builds a simple model locally to explain a specific prediction |

Both are used to explain **black-box models** (e.g., neural nets, random forests, XGBoost).

---

## 🔍 1. SHAP (SHapley Additive exPlanations)

### 💡 Idea:

SHAP uses **game theory** to fairly distribute credit to each feature in a prediction.

Imagine each feature (age, income, gender, etc.) is like a player in a game. SHAP asks:

> "How much did each player (feature) contribute to the final score (prediction)?"

### 📌 How It Works:

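* Considers **every possible subset (coalition)** of the other features.
* Measures how the prediction changes when a feature is **added to each subset**.
* Takes a weighted average of these changes → **SHAP value**. The number of subsets grows exponentially with the number of features, so practical explainers approximate this or exploit the model's structure.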
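The steps above can be sketched as a brute-force computation for a tiny model. The `model` function, the all-zeros `baseline`, and the feature names are hypothetical; "removing" a feature is approximated here by resetting it to its baseline value, which is one common convention and not the only one.

```python
from itertools import combinations
from math import factorial

# Hypothetical model with an interaction between credit score and income
def model(x):
    return (0.3 * x["low_credit"] + 0.1 * x["low_income"]
            + 0.2 * x["low_credit"] * x["low_income"] + 0.05 * x["young"])

baseline = {"low_credit": 0, "low_income": 0, "young": 0}  # reference input
instance = {"low_credit": 1, "low_income": 1, "young": 1}  # input to explain
features = list(instance)

def value(subset):
    """Model output when only the features in `subset` take the instance's values."""
    x = {f: (instance[f] if f in subset else baseline[f]) for f in features}
    return model(x)

def shapley_values():
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (value(set(subset) | {f}) - value(set(subset)))
        phi[f] = total
    return phi

phi = shapley_values()
print(phi)
# The contributions add up to the gap between this prediction and the baseline
print(sum(phi.values()), model(instance) - model(baseline))
```

Note how the interaction term is split evenly between `low_credit` and `low_income`; this even split is what "fair distribution of credit" means in practice.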

### 📊 Output:

You get a **feature contribution score** — positive values push the prediction up, negative values push it down.

### 🧠 Example (Loan Default Prediction):

Model prediction: 80% chance of loan default.

| Feature            | SHAP Value | Contribution                               |
| ------------------ | ---------- | ------------------------------------------ |
| Credit score (low) | +20%       | Increased risk                             |
| Income (low)       | +10%       | Increased risk                             |
| Age (young)        | +5%        | Increased risk                             |
| Total SHAP         | +35%       | Base value = 45%, so final = 45 + 35 = 80% |

You now know **how much each feature pushed** the prediction above the base value.
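The last row of the table is SHAP's additivity property written out. In symbols (the notation is introduced here for illustration: $\phi_0$ is the base value, $\phi_i$ the SHAP value of feature $i$, and $M$ the number of features):

$$
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i
\qquad\Longrightarrow\qquad
0.80 = 0.45 + (0.20 + 0.10 + 0.05)
$$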

---

## 🔍 2. LIME (Local Interpretable Model-agnostic Explanations)

### 💡 Idea:

LIME builds a simple, interpretable model (like a **linear model**) around a **specific prediction** to explain what mattered most **locally**.

Think of it like:

> "Let's zoom in to just one point and approximate the model’s behavior near it."

### 📌 How It Works:

1. Pick a data point you want to explain.
2. Slightly change its input values (perturb).
3. Get model predictions on these new points.
4. Fit a simple model (e.g., linear regression) on these to **mimic the black-box** near that point.
5. Use that model to explain feature importance.
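The five steps can be sketched from scratch with NumPy. This is a simplified illustration, not the `lime` library's implementation: the `black_box` function is hypothetical, every keep/drop mask is enumerated instead of sampled at random, and proximity uses an exponential kernel on the number of removed words.

```python
import itertools
import numpy as np

words = ["The", "movie", "was", "not", "good"]

def black_box(mask):
    """Hypothetical sentiment model: score in (-1, 1) from which words are present."""
    present = dict(zip(words, mask))
    return np.tanh(0.4 * present["good"] - 0.8 * present["not"])

# Steps 1-2: perturb the sentence by keeping (1) or dropping (0) each word
Z = np.array(list(itertools.product([0, 1], repeat=len(words))), dtype=float)

# Step 3: query the black box on every perturbed sentence
y = np.array([black_box(z) for z in Z])

# Weight perturbed sentences by closeness to the original (all words kept)
removed = (1 - Z).sum(axis=1)
sample_weight = np.exp(-removed / 2.0)

# Step 4: weighted least squares for a linear surrogate with an intercept
A = np.hstack([np.ones((len(Z), 1)), Z])
root_w = np.sqrt(sample_weight)
coef, *_ = np.linalg.lstsq(A * root_w[:, None], y * root_w, rcond=None)

# Step 5: the surrogate's coefficients are the explanation
lime_weights = dict(zip(words, coef[1:]))
for word, weight in lime_weights.items():
    print(f"{word:>6}: {weight:+.2f}")
```

Words the black box ignores end up with weights near zero, while "not" gets a negative weight and "good" a smaller positive one.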

### 🧠 Example (Sentiment Analysis):

Text: **“The movie was not good”**

LIME perturbs it by randomly removing words:

* “The movie was good” (“not” removed)
* “The movie was not” (“good” removed)
* “movie not good” (“The” and “was” removed)

Then it fits a linear model and finds:

| Word   | Weight |
| ------ | ------ |
| “not”  | -0.8   |
| “good” | +0.4   |

Prediction: Negative sentiment, because the negative weight of “not” outweighs the positive weight of “good”.

---

## 📌 Summary Table: SHAP vs LIME

| Feature        | **SHAP**                         | **LIME**                 |
| -------------- | -------------------------------- | ------------------------ |
| **Based on**   | Game theory                      | Local surrogate modeling |
| **Scope**      | Local + Global                   | Only Local               |
| **Accuracy**   | Consistent, mathematically fair  | Approximate              |
| **Output**     | Contribution values per feature  | Local feature weights    |
| **Works with** | Any model (tree-based, NN, etc.) | Any model                |
| **Complexity** | Often slower (fast for tree models) | Usually faster, results vary with sampling |

---

## 🖼️ Visual Example (If You're Imagining a Plot)

For SHAP:

* A **bar plot** showing positive/negative contributions of each feature to a prediction.

For LIME:

* A **bar chart of local feature weights**, taken from the linear surrogate fitted around the prediction point, showing how nearby inputs affect the output.

---

## 🧠 When to Use Which?

| Use Case                               | Recommendation                                                   |
| -------------------------------------- | ---------------------------------------------------------------- |
| Need both global + local explanation   | **SHAP**                                                         |
| Fast local explanation on one instance | **LIME**                                                         |
| Tree-based models (e.g., XGBoost)      | **SHAP TreeExplainer** (fast and exact for tree ensembles)       |
| Text/image data                        | Both SHAP and LIME work, but **LIME** is often used for **text** |

---



# **PDP (Partial Dependence Plot)**

## 🔍 What is a Partial Dependence Plot (PDP)?

A **Partial Dependence Plot (PDP)** shows how **a single feature** (or two) affects the **predicted output** of a machine learning model, **on average**.

> It tells you:
> "**How does the model's average prediction change as one feature changes, with all other features left at their observed values?**"

---

## 🧠 Why Use PDP?

PDP helps you:

* Understand **global behavior** of the model
* See whether a feature has a **positive or negative** effect
* Detect **non-linear relationships** (e.g., U-shapes, thresholds)
* Build trust in model predictions

---

## 📊 PDP Example (Real-Life: House Price Prediction)

### 📌 Problem:

You’ve trained a model to **predict house prices**.
Features include:

* `Area (sq ft)`
* `Location`
* `No. of bedrooms`
* `Age of the house`

You want to know:

> “How does `Area` affect predicted price, regardless of other features?”

---

## ✅ How PDP Works:

1. Pick a feature, e.g., `Area`.
2. For a range of area values (say 500 to 3000 sq ft):

   * Fix area to that value across **all** data points.
   * Keep other features unchanged.
   * Predict the price for each.
   * Take the **average prediction**.
3. Plot `Area` on the X-axis and **average predicted price** on Y-axis.
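The procedure above can be sketched in plain Python. The `predict_price` function and the five houses are made up; in practice `predict_price` would be the trained model's prediction function.

```python
# Hypothetical price model standing in for any trained regressor
def predict_price(area, bedrooms, age):
    return 40_000 + 100 * area + 8_000 * bedrooms - 500 * age

# Made-up dataset: (area in sq ft, bedrooms, age in years)
houses = [(800, 2, 30), (1200, 3, 12), (1500, 3, 5), (2200, 4, 20), (2900, 5, 2)]

def pdp_for_area(area_grid):
    """Average prediction when every house is given the same area value."""
    curve = []
    for area in area_grid:
        # Overwrite area for every house, keep each house's other features
        predictions = [predict_price(area, beds, age) for _, beds, age in houses]
        curve.append(sum(predictions) / len(predictions))
    return curve

area_grid = [500, 1000, 1500, 2000, 2500, 3000]
pdp_curve = pdp_for_area(area_grid)
for area, avg_price in zip(area_grid, pdp_curve):
    print(f"{area:>5} sq ft -> ${avg_price:,.0f}")
```

Plotting `area_grid` against `pdp_curve` gives the partial dependence plot for `Area`.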

---

## 📈 PDP Plot (What You’d See)

| Area (sq ft) | Avg Predicted Price |
| ------------ | ------------------- |
| 500          | \$80,000            |
| 1000         | \$120,000           |
| 1500         | \$180,000           |
| 2000         | \$250,000           |
| 2500         | \$310,000           |
| 3000         | \$350,000           |

### 🔼 You'd get a graph like:

```
Price ↑
      |
350K  |                               *
310K  |                          *
250K  |                   *
180K  |            *
120K  |      *
 80K  | *
      +-------------------------------- Area →
        500   1000  1500  2000  2500  3000
```

### Interpretation:

> As `Area` increases, the predicted price increases — the model has learned a **positive relationship** between area and house price.

---

## 🧠 Another Insightful Example (Binary Classification)

### 📌 Problem: Predicting **Loan Approval**

* Model predicts `1 = approved`, `0 = rejected`

You plot PDP for `credit score`.

* X-axis: credit score from 400 to 850
* Y-axis: average predicted probability of approval

**If the curve is flat between 600–700 and jumps after 700**, it means:

> “Credit score only improves approval chance significantly after it crosses 700.”

---

## ✅ Advantages of PDP

| Benefit                         | Description                                                   |
| ------------------------------- | ------------------------------------------------------------- |
| **Global view**                 | Shows how a feature influences predictions across the dataset |
| **Works with black-box models** | Supports Random Forest, XGBoost, etc.                         |
| **Simple to interpret**         | Visual, continuous effect                                     |

---

## ⚠ Limitations

| Limitation                 | Explanation                                                              |
| -------------------------- | ------------------------------------------------------------------------ |
| **Assumes independence**   | PDP may mislead if features are correlated (e.g., area and no. of rooms) |
| **Averages over all data** | Can hide individual differences                                          |

---

## 🛠 Tools to Create PDP

* `sklearn.inspection.PartialDependenceDisplay.from_estimator` (Scikit-learn; it replaces `plot_partial_dependence`, which was removed in version 1.2)
* `pdpbox` library in Python
* `SHAP` also includes partial dependence capability
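As a hedged usage sketch with scikit-learn, the snippet below computes (without plotting) the partial dependence of a random forest on a synthetic `area` feature. The data-generating formula is invented for illustration, and only the `"average"` entry of the result is used because the name of the grid entry differs between scikit-learn versions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)
n = 200
area = rng.uniform(500, 3000, size=n)   # synthetic data, for illustration only
age = rng.uniform(0, 50, size=n)
price = 100 * area - 1_000 * age + rng.normal(scale=5_000, size=n)
X = np.column_stack([area, age])

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, price)

# Average prediction over a 20-point grid of feature 0 (area)
result = partial_dependence(model, X, features=[0], kind="average", grid_resolution=20)
pdp_area = result["average"][0]
print(pdp_area.round(-3))
```

To draw the curve directly, `PartialDependenceDisplay.from_estimator(model, X, [0])` produces the plot with matplotlib in recent scikit-learn versions.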

---

## 🎯 Summary

| Term           | Meaning                                                                              |
| -------------- | ------------------------------------------------------------------------------------ |
| **PDP**        | Shows how one feature affects the model output **on average** over the other features |
| **Use it for** | Global understanding of model behavior                                               |
| **Example**    | “How does area affect house price?” or “How does credit score affect loan approval?” |

---


