# Full Agent System Demo

This notebook exercises the end-to-end flow of the Explainable AI Agent:
1.  **Data Understanding**: Analyzes the dataset schema and statistics.
2.  **Router**: Decides which specialist agent to call.
3.  **Global Explainer**: Explains overall model behavior (SHAP).
4.  **Local Explainer**: Explains specific predictions (SHAP/LIME).
5.  **Fairness Explainer**: Audits sensitive attributes for bias (per-group metrics, SPD, EOD, DI).

In [1]:
import os
import sys
import pandas as pd
import matplotlib.pyplot as plt
import importlib
from IPython.display import Image, display
from dotenv import load_dotenv
from catboost import CatBoostClassifier
from langchain_core.messages import HumanMessage, AIMessage

# Add parent dir
sys.path.append('..')
load_dotenv('../.env')

# Reload to pick up changes
import agent.graph
import agent.nodes.data_understanding
import agent.nodes.global_explainer
import agent.nodes.local_explainer
importlib.reload(agent.graph)
importlib.reload(agent.nodes.data_understanding)
importlib.reload(agent.nodes.global_explainer)
importlib.reload(agent.nodes.local_explainer)

from agent.graph import app



In [2]:
# 1. Load Data
def load_arff_data(file_path):
    """Parse a simple ARFF file into a DataFrame of string columns."""
    data = []
    columns = []
    with open(file_path, 'r') as f:
        data_started = False
        for line in f:
            line = line.strip()
            if not line:
                continue
            if line.lower().startswith("@attribute"):
                # The second token of an @attribute line is the column name.
                parts = line.split()
                columns.append(parts[1])
            elif line.lower().startswith("@data"):
                data_started = True
                continue
            elif data_started:
                # Naive comma split: assumes no value contains a comma.
                row = [x.strip().strip("'").strip('"') for x in line.split(',')]
                data.append(row)
    return pd.DataFrame(data, columns=columns)

dataset_path = "../datasets/banking_deposit_subscription/dataset"
df = load_arff_data(dataset_path)

# Convert columns that parse cleanly as numbers; leave categorical columns as strings.
for col in df.columns:
    try:
        df[col] = pd.to_numeric(df[col])
    except ValueError:
        pass

# 2. Load Model
model = CatBoostClassifier()
model.load_model("../notebooks/models/catboost_model.cbm")

print("Loaded Data and Model.")

Loaded Data and Model.
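
The loader above splits each data row on bare commas, so a quoted value that itself contains a comma would be split in two. Below is a minimal sketch of a more tolerant parser. It assumes single-quoted values and attribute names without spaces; `parse_arff` and the sample text are hypothetical and not part of the agent code.

```python
import csv
import io

def parse_arff(text):
    """Return (columns, rows) from ARFF text; every value stays a string."""
    columns, rows = [], []
    in_data = False
    for raw in io.StringIO(text):
        line = raw.strip()
        # Skip blank lines and ARFF comment lines (which start with '%').
        if not line or line.startswith("%"):
            continue
        lower = line.lower()
        if not in_data and lower.startswith("@attribute"):
            # Second whitespace-separated token is the attribute name.
            columns.append(line.split(None, 2)[1].strip("'\""))
        elif lower.startswith("@data"):
            in_data = True
        elif in_data:
            # csv.reader keeps commas that sit inside single-quoted values.
            row = next(csv.reader([line], quotechar="'", skipinitialspace=True))
            rows.append([value.strip() for value in row])
    return columns, rows

sample = """@relation demo
@attribute age numeric
@attribute job {'admin.','blue-collar'}
@data
58,'admin.'
44,'blue-collar'
"""
columns, rows = parse_arff(sample)
print(columns)
print(rows)
```

The returned lists can be passed to `pd.DataFrame(rows, columns=columns)` in the same way as the original loader's output.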


In [3]:
# Helper to print output
def print_agent_response(result):
    print("\n--- Agent Response ---")
    messages = result['messages']
    last_msg = messages[-1]
    print(f"{last_msg.type.upper()}: {last_msg.content}")

## Turn 1: Data Understanding
The user hands over the dataset and the agent runs its initial data analysis.

In [4]:
state_t1 = {
    "messages": [HumanMessage(content="Here is the dataset. Please analyze it.")],
    "df": df,
    "model": model,
    # We do NOT set target_variable yet; let the agent find it.
}

print("Invoking Data Understanding...")
result_t1 = app.invoke(state_t1)
print_agent_response(result_t1)

Invoking Data Understanding...

--- Agent Response ---
AI: The dataset metadata has been successfully updated. If you have any further questions or need additional analysis, feel free to ask!


In [5]:
for m in result_t1['messages']:
    m.pretty_print()


Here is the dataset. Please analyze it.
Tool Calls:
  get_dataset_samples (call_F0MtntKibEVKtgGZ3z7iU7dV)
 Call ID: call_F0MtntKibEVKtgGZ3z7iU7dV
  Args:
Name: get_dataset_samples

--- Samples ---
   age           job   marital  education default  balance housing loan  contact  day month  duration  campaign  pdays  previous poutcome   y
0   58    management   married   tertiary      no     2143     yes   no  unknown    5   may       261         1     -1         0  unknown  no
1   44    technician    single  secondary      no       29     yes   no  unknown    5   may       151         1     -1         0  unknown  no
2   33  entrepreneur   married  secondary      no        2     yes  yes  unknown    5   may        76         1     -1         0  unknown  no
3   47   blue-collar   married    unknown      no     1506     yes   no  unknown    5   may        92         1     -1         0  unknown  no
4   33       unknown    single    unknown      no        1      no   no  unknown    5   may 

In [6]:
state_t2 = result_t1
state_t2['messages'].append(HumanMessage(content="What are the sensitive features that may lead to bias, discrimination or unfair treatment?"))

print("Invoking Fairness Explainer...")
result_t2 = app.invoke(state_t2)
print_agent_response(result_t2)

Invoking Fairness Explainer...

--- Agent Response ---
AI: Based on the dataset, the following features could be considered sensitive or protected attributes that may lead to bias, discrimination, or unfair treatment:

1. **Age**: Age discrimination could occur if the model unfairly favors or disfavors certain age groups.
2. **Job**: Certain job types might be unfairly favored or disfavored.
3. **Marital Status**: Discrimination based on whether someone is single, married, or divorced.
4. **Education**: Bias could occur if the model favors individuals with certain education levels.
5. **Default**: Whether the client has credit in default could lead to bias against those with financial difficulties.

Would you like to proceed with an analysis on one of these sensitive features? If so, please specify which one you are interested in investigating.
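
Each turn in this notebook reuses the previous result as the next state (`state_t2 = result_t1`) and appends to its `messages` list in place, so the earlier result object changes too. Below is a small sketch of that aliasing and a copy-based alternative. It uses plain dicts and strings as stand-ins for the real state and message objects, and `next_state` is a hypothetical helper.

```python
def next_state(prev_result, new_message):
    """Build the next turn's state without mutating the previous result."""
    state = dict(prev_result)  # shallow copy of the top-level state dict
    state["messages"] = [*prev_result["messages"], new_message]  # fresh list
    return state

# Stand-in for a result returned by the graph (strings instead of messages).
result_t1 = {"messages": ["Here is the dataset."], "df": "placeholder"}

# Pattern used in the notebook: both names point at the same dict and list.
aliased = result_t1
aliased["messages"].append("first follow-up")
print(len(result_t1["messages"]))  # the earlier result grew as well

# Copy-based alternative: result_t1 is left untouched.
fresh = next_state(result_t1, "second follow-up")
print(len(result_t1["messages"]), len(fresh["messages"]))
```

The shallow copy still shares heavy objects such as the DataFrame and the model, which is usually what is wanted; only the message list is duplicated.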


In [7]:
state_t3 = result_t2
state_t3['messages'].append(HumanMessage(content="I would like to know the fairness of the marital status"))

print("Invoking Fairness Explainer...")
result_t3 = app.invoke(state_t3)
print_agent_response(result_t3)

Invoking Fairness Explainer...

--- Agent Response ---
AI: The fairness analysis for the sensitive attribute "marital status" has been completed. Here are the key findings:

### Overall Metrics
- **Accuracy**: 92.83%
- **True Positive Rate (TPR)**: 57.63%
- **False Positive Rate (FPR)**: 2.50%
- **Selection Rate**: 8.95%

### By Group Metrics
- **Accuracy**:
  - Divorced: 93.30%
  - Married: 93.59%
  - Single: 91.03%
- **True Positive Rate (TPR)**:
  - Divorced: 62.86%
  - Married: 54.37%
  - Single: 60.62%
- **False Positive Rate (FPR)**:
  - Divorced: 2.57%
  - Married: 1.99%
  - Single: 3.62%
- **Selection Rate**:
  - Divorced: 9.78%
  - Married: 7.30%
  - Single: 12.14%

### Fairness Metrics
- **Statistical Parity Difference (SPD)**: 0.048
- **Equalized Odds Difference (EOD)**: 0.085
- **Disparate Impact (DI)**: 0.601

### Interpretation
- **Statistical Parity Difference (SPD)**: The difference in selection rates between the groups is 4.84%, which is relatively small, indicating a 

In [12]:
for m in result_t3['messages'][len(result_t2['messages'])-1:]:
    m.pretty_print()


I would like to know the fairness of the marital status
Tool Calls:
  run_ethic_analysis (call_eZOyIoKuVqXs1EieWFzn8xf0)
 Call ID: call_eZOyIoKuVqXs1EieWFzn8xf0
  Args:
    sensitive_attr: marital
Name: run_ethic_analysis

{"overall": {"accuracy": 0.9283139059078542, "true_positive_rate": 0.5762904140669314, "false_positive_rate": 0.025048845248234055, "selection_rate": 0.08953573245449116}, "by_group": {"accuracy": {"divorced": 0.9329748415594392, "married": 0.9358785919012274, "single": 0.9103205629397967}, "true_positive_rate": {"divorced": 0.6286173633440515, "married": 0.5437386569872958, "single": 0.6061715481171548}, "false_positive_rate": {"divorced": 0.025736095965103598, "married": 0.01995175599983646, "single": 0.036219893362750506}, "selection_rate": {"divorced": 0.09775302477434224, "married": 0.0729771441169986, "single": 0.1214229867083659}}, "fairness": {"SPD": 0.048445842591367305, "EOD": 0.08487870635675565, "DI": 0.601015887479982}, "sensitive_attr": "marital"}

The
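
The three fairness figures can be checked by hand from the per-group rates in the tool output. The sketch below assumes SPD is the largest gap in selection rate, EOD is the larger of the TPR gap and the FPR gap, and DI is the ratio of the smallest to the largest selection rate. These definitions are an assumption about how `run_ethic_analysis` works, chosen because they reproduce the reported values.

```python
# Per-group rates copied from the run_ethic_analysis output above.
selection_rate = {
    "divorced": 0.09775302477434224,
    "married": 0.0729771441169986,
    "single": 0.1214229867083659,
}
true_positive_rate = {
    "divorced": 0.6286173633440515,
    "married": 0.5437386569872958,
    "single": 0.6061715481171548,
}
false_positive_rate = {
    "divorced": 0.025736095965103598,
    "married": 0.01995175599983646,
    "single": 0.036219893362750506,
}

def spread(rates):
    """Gap between the best- and worst-off group for one metric."""
    return max(rates.values()) - min(rates.values())

spd = spread(selection_rate)
eod = max(spread(true_positive_rate), spread(false_positive_rate))
di = min(selection_rate.values()) / max(selection_rate.values())

print(f"SPD={spd:.3f}  EOD={eod:.3f}  DI={di:.3f}")
```

Under these definitions the SPD and DI are both driven by the married (lowest) and single (highest) selection rates, and the EOD comes from the TPR gap between divorced and married clients. A DI of about 0.60 falls below the 0.8 "four-fifths" threshold that is often used as a rule of thumb.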

In [15]:
state_t4 = result_t3
state_t4['messages'].append(HumanMessage(content="Yes, please, I would like to have the fairness visualization!"))

print("Invoking Fairness Explainer...")
result_t4 = app.invoke(state_t4)
print_agent_response(result_t4)

Invoking Fairness Explainer...

--- Agent Response ---
AI: The fairness analysis visualization has been generated. You can view the plot in the file named `fairness_analysis_plot.png`. This visualization will help you better understand the distribution of fairness metrics across different marital status groups. If you have any further questions or need additional insights, feel free to ask!


In [16]:
for m in result_t4['messages'][len(result_t3['messages'])-1:]:
    m.pretty_print()


Yes, please, I would like to have the fairness visualization!
Tool Calls:
  visualize_ethic_analysis (call_4eZ1CXSbKAfsSNDZQpewXoAe)
 Call ID: call_4eZ1CXSbKAfsSNDZQpewXoAe
  Args:
Name: visualize_ethic_analysis

Plot generated and saved to fairness_analysis_plot.png

The fairness analysis visualization has been generated. You can view the plot in the file named `fairness_analysis_plot.png`. This visualization will help you better understand the distribution of fairness metrics across different marital status groups. If you have any further questions or need additional insights, feel free to ask!


## Turn 2: Global Explanation
The user asks for global feature importance; the router should dispatch to the Global Explainer.

In [None]:
state_t2 = result_t1
state_t2['messages'].append(HumanMessage(content="What are the most important features driving the model?"))

print("Invoking Global Explainer...")
result_t2 = app.invoke(state_t2)
print_agent_response(result_t2)

Invoking Global Explainer...

--- Agent Response ---
AI: The Global SHAP Summary Plot has been generated, which provides insights into the most important features driving the model. You can view the plot [here](sandbox:/artifacts/shap_summary_96e9f291ce6346ef888a657cd87afb30.png).

### Key Insights from the SHAP Summary Plot:
- **Feature Importance**: The plot ranks features based on their importance in the model. The features at the top are the most influential in predicting the target variable.
- **Impact on Prediction**: Each point on the plot represents the impact of a feature on a single prediction. The color indicates the feature value (e.g., red for high values and blue for low values).

By analyzing the plot, you can identify which features have the most significant impact on the model's predictions and understand the overall model behavior. If you have any specific questions about the plot or need further analysis, feel free to ask!
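
The feature ranking in a SHAP summary plot is, by default and as far as I understand it, based on the aggregate absolute SHAP value per feature. Below is a minimal sketch of that ranking using made-up SHAP values. The numbers are purely illustrative and are not taken from this model, and `rank_features` is a hypothetical helper.

```python
# Hypothetical SHAP values: one dict per instance, one entry per feature.
shap_values = [
    {"duration": 0.40, "balance": -0.10, "age": 0.05},
    {"duration": -0.30, "balance": 0.20, "age": -0.05},
    {"duration": 0.50, "balance": -0.15, "age": 0.02},
]

def rank_features(rows):
    """Rank features by mean absolute SHAP value, largest first."""
    totals = {}
    for row in rows:
        for name, value in row.items():
            totals[name] = totals.get(name, 0.0) + abs(value)
    mean_abs = {name: total / len(rows) for name, total in totals.items()}
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)

ranking = rank_features(shap_values)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value first matters: a feature that pushes some predictions up and others down would otherwise average out to near zero and look unimportant.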


## Turn 3: Local Explanation
The user asks about a specific instance; the router should dispatch to the Local Explainer.

In [7]:
state_t3 = result_t2
state_t3['messages'].append(HumanMessage(content="Why was User 5 classified this way? Use SHAP."))
state_t3['user_id'] = 5 # In a real app, this would be extracted or set by UI context

print("Invoking Local Explainer...")
result_t3 = app.invoke(state_t3)
print_agent_response(result_t3)

Invoking Local Explainer...

--- Agent Response ---
AI: The SHAP Waterfall plot for User 5 has been generated. You can view the plot [here](sandbox:/artifacts/shap_waterfall_0_7a748eb2060941c0a76f594cbc393e55.png).

### Insights from the SHAP Waterfall Plot:
- **Feature Contributions**: The plot shows how each feature contributes to pushing the prediction from the base value (average model output) to the final prediction for User 5.
- **Positive and Negative Impacts**: Features that increase the prediction are shown in red, while those that decrease it are in blue.
- **Magnitude of Impact**: The length of each bar indicates the magnitude of the feature's impact on the prediction.

By examining this plot, you can understand which features were most influential in the model's decision for User 5. If you have any questions about specific features or need further clarification, feel free to ask!
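
The waterfall plot relies on SHAP's additivity property: the base value plus all feature contributions equals the model output for that instance. Below is a small sketch with made-up numbers. It assumes the contributions are in log-odds space, which is common for tree classifiers but not confirmed for this model. All values and the `waterfall_steps` helper are hypothetical.

```python
import math

base_value = -2.0  # hypothetical average model output (log-odds)
contributions = {"duration": 1.2, "poutcome": 0.5, "housing": -0.4}  # hypothetical

def waterfall_steps(base, contribs):
    """Accumulate contributions, largest magnitude first, tracking the running output."""
    running = base
    steps = []
    ordered = sorted(contribs.items(), key=lambda item: abs(item[1]), reverse=True)
    for name, value in ordered:
        running += value
        steps.append((name, value, running))
    return steps

steps = waterfall_steps(base_value, contributions)
final_output = steps[-1][2]
# Convert log-odds to a probability with the logistic function.
probability = 1.0 / (1.0 + math.exp(-final_output))

for name, value, running in steps:
    print(f"{name:>10}: {value:+.2f} -> {running:+.2f}")
print(f"probability = {probability:.3f}")
```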


In [8]:
state_t4 = result_t3
state_t4['messages'].append(HumanMessage(content="Why was User 5 classified this way? Use LIME."))
state_t4['user_id'] = 5 # In a real app, this would be extracted or set by UI context

print("Invoking Local Explainer...")
result_t4 = app.invoke(state_t4)

Invoking Local Explainer...

--- Agent Response ---
HUMAN: Why was User 5 classified this way? Use LIME.


In [10]:
print_agent_response(result_t4)


--- Agent Response ---
AI: The LIME explanation for User 5 has been generated. You can view the explanation [here](sandbox:/artifacts/lime_explanation_faadd1357d364b7ab4c1cfecc7cc7b2c.png).

### Insights from the LIME Explanation:
- **Local Feature Importance**: The LIME plot provides a local approximation of the model's decision for User 5, highlighting the most influential features for this specific prediction.
- **Positive and Negative Contributions**: Features that contribute positively to the prediction are shown in one color, while those that contribute negatively are shown in another.
- **Weight of Features**: The size of each bar represents the weight or importance of the feature in the local decision-making process.

This explanation helps to understand the specific reasons behind the model's classification for User 5. If you have any questions or need further details, feel free to ask!
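
The core idea behind LIME is to fit a simple surrogate model to the black-box model's predictions near one instance, weighting nearby perturbations more heavily. Below is a one-dimensional sketch of that idea. It uses evenly spaced perturbations and an exponential kernel, whereas the real library samples randomly and works on an interpretable feature representation. `local_slope` and the toy `black_box` function are hypothetical.

```python
import math

def local_slope(predict, x0, width=0.5, n=21, span=1.0):
    """Weighted least-squares slope of predict() around x0."""
    # Evenly spaced perturbations in [x0 - span, x0 + span].
    deltas = [span * (2 * i / (n - 1) - 1) for i in range(n)]
    xs = [x0 + d for d in deltas]
    ys = [predict(x) for x in xs]
    # Exponential kernel: perturbations close to x0 get the largest weight.
    ws = [math.exp(-(d * d) / (width * width)) for d in deltas]
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    return cov / var

def black_box(x):
    """Toy nonlinear model standing in for the classifier."""
    return x * x

slope = local_slope(black_box, 3.0)
print(f"local slope near x=3: {slope:.4f}")
```

The fitted slope plays the role of a LIME feature weight: it describes how the prediction responds to the feature near this one instance, not across the whole dataset.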


### Random Questions

In [12]:
state_t5 = result_t4
state_t5['messages'].append(HumanMessage(content="Can you tell me something about the dataset? Like what is the goal of the dataset, what is the target variables and so on?"))
result_t5 = app.invoke(state_t5)

In [13]:
print_agent_response(result_t5)


--- Agent Response ---
AI: Certainly! Here is a summary of the dataset:

1. **Goal of the Dataset**: The dataset is related to a marketing campaign for a bank. The primary goal is to predict whether a client will subscribe to a term deposit based on various attributes.

2. **Target Variable**: The target variable is `y`, which indicates whether the client has subscribed to a term deposit. It is a binary variable with values 'yes' or 'no'.

3. **Problem Type**: This is a classification problem because the target variable is categorical with two classes ('yes' and 'no').

4. **Dataset Format**: The dataset is in a tabular format, consisting of rows and columns with various features.

5. **Features**: The dataset includes features such as age, job, marital status, education, default status, balance, housing loan status, personal loan status, contact type, last contact day and month, duration of the last contact, number of contacts during the campaign, days since last contact from a previ

In [14]:
for m in result_t5['messages']:
    m.pretty_print()


Here is the dataset. Please analyze it.
Tool Calls:
  get_dataset_samples (call_RrFJHsyOfioftiMypAdejsK2)
 Call ID: call_RrFJHsyOfioftiMypAdejsK2
  Args:
Name: get_dataset_samples

--- Samples ---
   age           job   marital  education default  balance housing loan  contact  day month  duration  campaign  pdays  previous poutcome   y
0   58    management   married   tertiary      no     2143     yes   no  unknown    5   may       261         1     -1         0  unknown  no
1   44    technician    single  secondary      no       29     yes   no  unknown    5   may       151         1     -1         0  unknown  no
2   33  entrepreneur   married  secondary      no        2     yes  yes  unknown    5   may        76         1     -1         0  unknown  no
3   47   blue-collar   married    unknown      no     1506     yes   no  unknown    5   may        92         1     -1         0  unknown  no
4   33       unknown    single    unknown      no        1      no   no  unknown    5   may 