# DPGExplainer Saga Benchmarks — Episode 1: Iris

A practitioner-friendly walkthrough of Decision Predicate Graphs (DPG) using the classic Iris dataset. We train a small RandomForest, build a DPG, and interpret three key signals: Local Reaching Centrality (LRC), Betweenness Centrality (BC), and Communities.

## 1. Setup

In [None]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from dpg import DPGExplainer


## 2. Load Iris

In [None]:
iris = load_iris(as_frame=True)
X = iris.data
y = iris.target
X.head()


## 3. Train Model

In [None]:
model = RandomForestClassifier(n_estimators=10, random_state=27)
model.fit(X, y)


## 4. Build DPG

In [None]:
explainer = DPGExplainer(
    model=model,
    feature_names=X.columns,
    target_names=iris.target_names.tolist(),
    config_file="config.yaml",  # optional if present
)

explanation = explainer.explain_global(
    X.values,
    communities=True,
    community_threshold=0.2,
)


## 5. Inspect Node Metrics

In [None]:
explanation.node_metrics.head()


### Local Reaching Centrality (LRC)
Nodes with a high LRC can reach many other nodes downstream. These predicates often act early in the decision process and frame large portions of the model’s logic.

In [None]:
explanation.node_metrics.sort_values(
    "Local reaching centrality", ascending=False
).head(10)
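
To make the metric concrete, here is a minimal sketch of the unweighted form of local reaching centrality: the fraction of the other nodes that a given node can reach. The toy graph, its predicate labels, and its thresholds are hypothetical and are not output from DPGExplainer. The library may well compute a weighted or differently normalized variant.

```python
from collections import deque


def reachable_from(graph, source):
    """Return every node reachable from `source` by following directed edges."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    seen.discard(source)  # a node does not count as reaching itself
    return seen


def local_reaching_centrality(graph, source):
    """Fraction of the other nodes that `source` can reach (unweighted form)."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    return len(reachable_from(graph, source)) / (len(nodes) - 1)


# Hypothetical predicate graph; labels and thresholds are illustrative only.
toy_graph = {
    "petal length (cm) <= 2.45": ["Class setosa"],
    "petal length (cm) > 2.45": ["petal width (cm) <= 1.75", "petal width (cm) > 1.75"],
    "petal width (cm) <= 1.75": ["Class versicolor"],
    "petal width (cm) > 1.75": ["Class virginica"],
}

lrc = {node: local_reaching_centrality(toy_graph, node) for node in toy_graph}
for node, score in sorted(lrc.items(), key=lambda item: item[1], reverse=True):
    print(f"{score:.2f}  {node}")
```

In this toy graph the early split `petal length (cm) > 2.45` reaches four of the six other nodes, so it ranks first, which matches the intuition that high-LRC predicates sit upstream of much of the logic.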


### Betweenness Centrality (BC)
Nodes with a high BC lie on many of the shortest paths between other nodes. These predicates are bottlenecks that connect major decision flows.

In [None]:
explanation.node_metrics.sort_values(
    "Betweenness centrality", ascending=False
).head(10)
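
As an illustration of what "lying on many shortest paths" means, the sketch below computes unnormalized betweenness centrality on a small directed graph using Brandes' algorithm. The graph and its node names (`A`, `B`, `M`, `C1`, `C2`) are hypothetical. This is not necessarily how DPGExplainer computes or normalizes the `"Betweenness centrality"` column.

```python
from collections import deque


def betweenness(graph):
    """Unnormalized betweenness for an unweighted directed graph (Brandes)."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    scores = dict.fromkeys(nodes, 0.0)
    for source in nodes:
        # Breadth-first search that counts shortest paths from `source`.
        path_count = dict.fromkeys(nodes, 0)
        path_count[source] = 1
        distance = {source: 0}
        predecessors = {node: [] for node in nodes}
        visit_order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in graph.get(v, ()):
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = dict.fromkeys(nodes, 0.0)
        for w in reversed(visit_order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    return scores


# Hypothetical graph: two entry predicates funnel through M to two outcomes.
toy_graph = {"A": ["M"], "B": ["M"], "M": ["C1", "C2"], "C1": [], "C2": []}
bc = betweenness(toy_graph)
print(sorted(bc.items()))
```

Every path from an entry predicate to an outcome passes through `M`, so `M` carries all of the betweenness while the other nodes score zero. That is the bottleneck pattern to look for in the sorted table.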


## 6. Communities (Decision Themes)
Communities group predicates that are tightly connected. For Iris, the groups often align with short-petal rules (typically Setosa) and longer-petal rules (typically Versicolor and Virginica).

In [None]:
# Only the last expression in a cell is displayed, so print the keys explicitly.
print(explanation.communities.keys())
explanation.communities.get("Communities", [])[:3]
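
One way to turn a community into a "theme" is to count which features its predicates test. The sketch below does that for two hand-written communities. It assumes each community is a collection of label strings of the form `feature <op> threshold`, with class nodes labeled differently. That layout, the labels, and the helper `feature_counts` are assumptions for illustration and may not match what `explanation.communities` actually contains.

```python
import re
from collections import Counter

# Hypothetical communities; real labels and structure may differ.
communities = [
    {"petal length (cm) <= 2.45", "petal width (cm) <= 0.8", "Class setosa"},
    {
        "petal length (cm) > 4.95",
        "petal width (cm) > 1.75",
        "sepal length (cm) > 6.05",
        "Class virginica",
    },
]

# Matches "<feature> <op> <number>", e.g. "petal length (cm) <= 2.45".
PREDICATE = re.compile(
    r"^(?P<feature>.+?)\s*(?P<op><=|>=|<|>)\s*(?P<threshold>-?\d+(?:\.\d+)?)$"
)


def feature_counts(community):
    """Count how often each feature appears among a community's predicates."""
    counts = Counter()
    for label in community:
        match = PREDICATE.match(label)
        if match:  # labels that are not predicates (e.g. class nodes) are skipped
            counts[match.group("feature")] += 1
    return counts


summaries = [feature_counts(community) for community in communities]
for index, counts in enumerate(summaries):
    print(index, dict(counts))
```

A community dominated by petal features with low thresholds would support the short-petal reading above, while one that mixes in sepal features suggests a less clear-cut theme.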


## 7. Visualize the Story

In [None]:
run_name = "iris_dpg"
explainer.plot(run_name, explanation, save_dir="results", class_flag=True, export_pdf=True)
explainer.plot_communities(run_name, explanation, save_dir="results", class_flag=True, export_pdf=True)


## 8. Practitioner Summary
Use these three points to tell the story:
- LRC: Which predicate most strongly frames the model’s logic?
- BC: Which predicate acts as a bottleneck between key decision paths?
- Communities: Which predicate groups define the themes of each class?