# Session 3: Hands-on Exercises
# “Is my ML model wrong?” - Using Trustee in Practice

In this session, we will train a Random Forest classifier on the CIC-IDS-2017 dataset and evaluate its ability to detect real-world attacks. Then, we will use Trustee to analyze this black-box model and identify an instance of underspecification. More specifically, we will show that the black-box model can easily separate one attack class (Heartbleed) because of the way the dataset was generated.

# Install dependencies

* Use `pip` to install dependencies.
* Install `graphviz` to render decision trees.




In [None]:
!pip install --upgrade setuptools jedi 2> /dev/null > /dev/null
!pip install matplotlib numpy pandas scikit-learn==1.2.2 pdf2image scipy trustee==1.1.4 2> /dev/null > /dev/null
!apt -qqq install graphviz poppler-utils

---
# Traditional ML Pipeline

* Read CIC-IDS-2017 dataset
  * The CIC-IDS-2017 dataset is hosted in the project's GitHub repository.

* To read the data, we are using a helper function from the Trustee package.
  * This method automatically one-hot encodes the categorical variables of the dataset, based on the provided metadata.
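
As a rough illustration of what one-hot encoding does, the sketch below expands a categorical column into one binary column per category. The `protocol` column and its values are made up for this example, and `one_hot` is a hypothetical helper, not the code that `dataset.read` runs internally.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per observed category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Toy flow records; "protocol" is an illustrative categorical feature.
flows = [
    {"duration": 12, "protocol": "tcp"},
    {"duration": 3, "protocol": "udp"},
    {"duration": 7, "protocol": "tcp"},
]

for row in one_hot(flows, "protocol"):
    print(row)
```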

In [None]:
import numpy as np
from trustee.utils import dataset
from trustee.utils.const import CIC_IDS_2017_DATASET_META

import warnings
warnings.filterwarnings("ignore")

DF_PATH = "https://github.com/TrusteeML/emperor/raw/main/use_cases/heartbleed_case/res/dataset/CIC-IDS-2017_OverSampled_min.csv.zip"

# if using oversampled df
CIC_IDS_2017_DATASET_META["is_dir"] = False

# Step 1: Parse the dataset into features (X) and labels (y)
X, y, feature_names, _, _ = dataset.read(
    DF_PATH, metadata=CIC_IDS_2017_DATASET_META, as_df=True, verbose=True
)

SUBSAMPLE_RATIO = 0.1

# select subsample of the dataset
idx = np.random.permutation(X.index)
idx = idx[:int(len(idx) * SUBSAMPLE_RATIO)]

X = X.loc[idx]
y = y.loc[idx]

---
**Split dataset into train and test sets**
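
The `stratify=y` argument used in this step keeps the class proportions the same in both splits, which matters for rare attack classes. The idea can be sketched in pure Python as follows; `stratified_split` and the toy labels are assumptions for illustration, not scikit-learn's implementation.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_size=0.7, seed=0):
    """Split sample indexes so each class keeps roughly the same train share."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indexes in by_class.values():
        rng.shuffle(indexes)
        cut = round(len(indexes) * train_size)
        train_idx.extend(indexes[:cut])
        test_idx.extend(indexes[cut:])
    return sorted(train_idx), sorted(test_idx)


# 90 benign flows and 10 attack flows: the 70/30 split is applied per class.
labels = ["BENIGN"] * 90 + ["Heartbleed"] * 10
train_idx, test_idx = stratified_split(labels)

print(sum(labels[i] == "Heartbleed" for i in train_idx))  # attack samples in train
print(sum(labels[i] == "Heartbleed" for i in test_idx))   # attack samples in test
```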

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split

X_indexes = np.arange(0, X.shape[0])
X_train, X_test, y_train, y_test = train_test_split(X_indexes, y, train_size=0.7, stratify=y)
X_train = X.iloc[X_train]
X_test = X.iloc[X_test]

**Train Random Forest Classifier**

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

blackbox = RandomForestClassifier(n_jobs=4)
blackbox.fit(X_train, y_train)

y_pred = blackbox.predict(X_test)

print("Blackbox model classification report with IID:")
print(
    "\n{}".format(
        classification_report(
            y_test,
            y_pred,
            digits=3,
            target_names=CIC_IDS_2017_DATASET_META["classes"],
        )
    )
)


---
# Unlocking the Black Box with Trustee
Run `ClassificationTrustee` on the trained Random Forest classifier.

<font color="blue">`class trustee.main.ClassificationTrustee(expert, logger=None)`</font>



Bases: `Trustee`

Implements the Trust-oriented Decision Tree Extraction (Trustee) algorithm to train a student DecisionTreeClassifier based on observations from an Expert classification model.

**PARAMETERS:**

* **expert** *(object)* – The ML blackbox model to analyze. The expert model must implement a `predict` method for Trustee to work properly, unless a different method is explicitly named using the `predict_method_name` argument of the `fit()` method.

* **logger** *(Logger object, default=None)* – A logger object to log messages to. If none is given, the `print()` method will be used to log messages.
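
To make the expert/student relationship concrete, here is a minimal sketch under stated assumptions. It is not Trustee's algorithm: a toy expert exposes `predict`, and a one-split student (a decision stump) is fitted to the expert's predictions rather than to ground-truth labels. `ThresholdExpert`, `fit_stump`, and the data are hypothetical.

```python
class ThresholdExpert:
    """Stand-in black box: any object with a predict() method would do."""

    def predict(self, X):
        # Flags a sample as class 1 when its first feature exceeds 50.
        return [int(row[0] > 50) for row in X]


def fit_stump(X, expert_labels):
    """Find the (feature, threshold) split that best reproduces expert_labels."""
    best = (0, 0.0, -1.0)  # (feature index, threshold, fidelity)
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            predictions = [int(row[feature] > threshold) for row in X]
            matches = sum(p == e for p, e in zip(predictions, expert_labels))
            fidelity = matches / len(X)
            if fidelity > best[2]:
                best = (feature, threshold, fidelity)
    return best


X = [[10, 3], [40, 9], [55, 1], [60, 8], [90, 2], [20, 7]]
expert = ThresholdExpert()

# The student is trained on what the expert predicts, not on true labels.
feature, threshold, fidelity = fit_stump(X, expert.predict(X))
print(f"student rule: feature[{feature}] > {threshold} (fidelity={fidelity:.2f})")
```

Because the student only ever sees the expert's outputs, its rule describes how the expert behaves, whether or not that behavior is correct.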

In [None]:
import graphviz

from sklearn import tree
from sklearn import datasets

# Initialize Trustee

# TODO


# Train a Decision Tree to Imitate the Expert Model

<font color="blue">`fit(X, y, top_k=10, max_leaf_nodes=None, max_depth=None, ccp_alpha=0.0, train_size=0.7, num_iter=50, num_stability_iter=5, num_samples=2000, samples_size=None, use_features=None, predict_method_name='predict', optimization='fidelity', aggregate=True, verbose=False)`</font>




**PARAMETERS:**

* **X** *({array-like, sparse matrix} of shape (n_samples, n_features))* – The training input samples. Internally, it will be converted to a pandas DataFrame.
* **y** *(array-like of shape (n_samples,) or (n_samples, n_outputs))* – The target values for X (class labels in classification, real numbers in regression). Internally, it will be converted to a pandas Series.
* **top_k** *(int, default=10)* – Number of top-k branches, sorted by number of samples per branch, to keep after finding decision tree with highest fidelity.
* **max_leaf_nodes** *(int, default=None)* – Grow a tree with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes.
* **max_depth** *(int, default=None)* – The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure.
* **ccp_alpha** *(float, default=0.0)* – Complexity parameter used for Minimal Cost-Complexity Pruning. The subtree with the largest cost complexity that is smaller than ccp_alpha will be chosen. By default, no pruning is performed. See Minimal Cost-Complexity Pruning here for details: https://scikit-learn.org/stable/modules/tree.html#minimal-cost-complexity-pruning
* **train_size** *(float or int, default=0.7)* – If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples.
* **num_iter** *(int, default=50)* – Number of iterations to repeat Trustee inner-loop for.
* **num_stability_iter** *(int, default=5)* – Number of iterations to repeat the Trustee stabilization outer loop for.
* **num_samples** *(int, default=2000)* – The absolute number of samples to fetch from the training dataset split to train the student decision tree model. If the samples_size argument is provided, this arg is ignored.
* **samples_size** *(float, default=None)* – The fraction of the training dataset to use to train the student decision tree model. If None, the value is automatically set to the num_samples provided value.
* **use_features** *(array-like, default=None)* – Array-like of integers representing the indexes of features from the X training samples. If not None, only the features indicated by the provided indexes will be used to train the student decision tree model.
* **predict_method_name** *(str, default="predict")* – The method interface to use to get predictions from the expert model. If no value is passed, the default predict interface is used.
* **optimization** *({"fidelity", "accuracy"}, default="fidelity")* – The comparison criteria to optimize the decision tree students in Trustee inner-loop. Used for ablation study only.
* **aggregate** *(bool, default=True)* – Boolean indicating whether dataset aggregation should be used in Trustee inner-loop. Used for ablation study only.
* **verbose** *(bool, default=False)* – Boolean indicating whether to log messages.

For this tutorial, specify only the parameters **X, y, num_iter=30, num_stability_iter, samples_size, verbose=True**.
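
The interplay between `num_samples` and `samples_size` is easy to misread, so the following sketch spells out one reading of the parameter descriptions above: a fraction, when given, takes precedence over the absolute count. `resolve_sample_count` is a hypothetical helper, and capping the count at the training set size is an assumption of this sketch rather than documented Trustee behavior.

```python
def resolve_sample_count(n_train, num_samples=2000, samples_size=None):
    """Return how many training rows would be drawn for the student."""
    if samples_size is not None:
        # A fraction of the training split overrides the absolute count.
        return round(n_train * samples_size)
    # Assumption: never request more rows than the training split holds.
    return min(num_samples, n_train)


print(resolve_sample_count(10_000))                    # uses num_samples
print(resolve_sample_count(10_000, samples_size=0.3))  # uses the fraction
print(resolve_sample_count(500))                       # capped by n_train
```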

In [None]:
# Fit for classification models
# TODO



---
# Generate an Explanation

<font color="blue">`explain(top_k=10)`</font>

Returns the explainable model that best imitates the expert model, based on highest mean agreement and highest fidelity.


**RETURNS:**

* **top_student** – `(dt, pruned_dt, agreement, reward)`
  * **dt**: *{DecisionTreeClassifier, DecisionTreeRegressor}* - Unconstrained fitted student model.
  * **pruned_dt**: *{DecisionTreeClassifier, DecisionTreeRegressor}* - Top-k pruned fitted student model.
  * **agreement**: *float* - Mean agreement of pruned student model with respect to others.
  * **reward**: *float* - Fidelity of student model to the expert model.

**RETURN TYPE:**

* tuple
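
The top-k pruned student keeps only the branches that cover the most samples. A minimal sketch of that selection step, assuming each branch is summarized by a rule and a sample count (the `top_k_branches` helper, feature names `f1` to `f3`, and counts are all made up for illustration):

```python
def top_k_branches(branches, k):
    """Keep the k branches that cover the most samples."""
    return sorted(branches, key=lambda b: b["samples"], reverse=True)[:k]


# Hypothetical root-to-leaf branches of a student decision tree.
branches = [
    {"rule": "f1 <= 10 and f2 <= 3", "label": "BENIGN", "samples": 1200},
    {"rule": "f1 > 10 and f3 <= 0.5", "label": "Heartbleed", "samples": 40},
    {"rule": "f1 <= 10 and f2 > 3", "label": "DoS", "samples": 450},
    {"rule": "f1 > 10 and f3 > 0.5", "label": "PortScan", "samples": 310},
]

kept = top_k_branches(branches, k=2)
coverage = sum(b["samples"] for b in kept) / sum(b["samples"] for b in branches)

for branch in kept:
    print(f'{branch["rule"]} -> {branch["label"]} ({branch["samples"]} samples)')
print(f"share of samples covered by kept branches: {coverage:.3f}")
```

A small `top_k` gives a tree that is easier to read, at the cost of dropping rarely used branches such as the 40-sample one in this toy data.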

In [None]:
# Get the best explanation from Trustee
# TODO

print(f"Model explanation training (agreement, fidelity): ({agreement}, {reward})")
print(f"Model Explanation size: {dt.tree_.node_count}")
print(f"Top-k Pruned Model explanation size: {pruned_dt.tree_.node_count}")

---
**Evaluate explanations produced by Trustee**
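
The reports in this step measure two different things: fidelity compares the student's predictions with the expert's predictions, while accuracy compares them with the ground truth. A minimal sketch of the distinction, using made-up labels and a hypothetical `agreement_rate` helper:

```python
def agreement_rate(reference, predictions):
    """Fraction of positions where the two label sequences match."""
    matches = sum(r == p for r, p in zip(reference, predictions))
    return matches / len(reference)


# Toy labels for five flows (illustrative only).
y_true = ["BENIGN", "BENIGN", "Heartbleed", "DoS", "BENIGN"]
expert_pred = ["BENIGN", "BENIGN", "Heartbleed", "BENIGN", "BENIGN"]
student_pred = ["BENIGN", "BENIGN", "Heartbleed", "BENIGN", "DoS"]

fidelity = agreement_rate(expert_pred, student_pred)  # student vs. expert
accuracy = agreement_rate(y_true, student_pred)       # student vs. ground truth

print(f"fidelity={fidelity:.2f}, accuracy={accuracy:.2f}")
```

A student can have high fidelity and still be inaccurate when it faithfully reproduces the expert's mistakes, which is why both reports are printed.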

In [None]:
# Use explanations to make predictions
# TODO

# Evaluate accuracy and fidelity of explanations
print("Model explanation global fidelity report:")
print(classification_report(y_pred, dt_y_pred))
print("Top-k Model explanation global fidelity report:")
print(classification_report(y_pred, pruned_dt_y_pred))

print("Model explanation score report:")
print(classification_report(y_test, dt_y_pred))
print("Top-k Model explanation score report:")
print(classification_report(y_test, pruned_dt_y_pred))

---
**Render Decision Tree explanations**

In [None]:
# Output decision tree to pdf
dot_data = tree.export_graphviz(
    dt,
    class_names=CIC_IDS_2017_DATASET_META["classes"],
    feature_names=feature_names,
    filled=True,
    rounded=True,
    special_characters=True,
)
graph = graphviz.Source(dot_data)
graph.render("dt_explanation")

# Output pruned decision tree to pdf
dot_data = tree.export_graphviz(
    pruned_dt,
    class_names=CIC_IDS_2017_DATASET_META["classes"],
    feature_names=feature_names,
    filled=True,
    rounded=True,
    special_characters=True,
)
graph = graphviz.Source(dot_data)
graph.render("pruned_dt_explanation")

In [None]:
from pdf2image import convert_from_path

images = convert_from_path("pruned_dt_explanation.pdf")
images[0]

---
# Validation

**Read Validation Dataset with Out-of-Distribution Samples**

In [None]:
VALIDATION_DF_PATH = "https://raw.githubusercontent.com/TrusteeML/emperor/main/use_cases/heartbleed_case/res/dataset/validation/heartbleed.csv"

X_validate, y_validate, _, _, _ = dataset.read(VALIDATION_DF_PATH, metadata=CIC_IDS_2017_DATASET_META, as_df=True)


**Use Validation dataset to evaluate trained Random Forest Classifier**


In [None]:
y_val_pred = blackbox.predict(X_validate)

print("Blackbox model classification report with OOD:")
print(
    "\n{}".format(
        classification_report(
            y_validate,
            y_val_pred,
            digits=3,
            target_names=["BENIGN", "Heartbleed"],
        )
    )
)

---
# Trust Report

Run the Trust Report on the trained Random Forest classifier.

<font color="blue">`class trustee.report.trust.TrustReport(blackbox, X=None, y=None, X_train=None, X_test=None, y_train=None, y_test=None, max_iter=10, num_pruning_iter=10, train_size=0.7, predict_method_name='predict', trustee_num_iter=50, trustee_num_stability_iter=10, trustee_sample_size=0.5, trustee_max_leaf_nodes=None, trustee_max_depth=None, trustee_ccp_alpha=0.0, analyze_branches=False, analyze_stability=False, skip_retrain=False, top_k=10, logger=None, verbose=False, class_names=None, feature_names=None, is_classify=True)`</font>



Bases: `object`

Class to generate Trust Report.

Builds Trust Report for given blackbox model using the Trustee method to extract whitebox explanations as Decision Trees.

**PARAMETERS:**
* **blackbox** *(object)* – The ML blackbox model to analyze. The expert model must implement a `predict` method for Trustee to work properly, unless a different method is explicitly named using the `predict_method_name` argument.
* **X** *({array-like, sparse matrix} of shape (n_samples, n_features))* – The training input samples. Internally, it will be converted to a pandas DataFrame. Either (X, y) or (X_train, X_test, y_train, y_test) must be provided.
* **y** *(array-like of shape (n_samples,) or (n_samples, n_outputs))* – The target values for X (class labels in classification, real numbers in regression). Internally, it will be converted to a pandas Series. Either (X, y) or (X_train, X_test, y_train, y_test) must be provided.
* **X_train** *({array-like, sparse matrix} of shape (n_samples, n_features))* – The training input samples. Internally, it will be converted to a pandas DataFrame. Use this argument if a fixed train-test split is to be used. Either (X, y) or (X_train, X_test, y_train, y_test) must be provided.
* **X_test** *({array-like, sparse matrix} of shape (n_samples, n_features))* – The test input samples. Internally, it will be converted to a pandas DataFrame. Use this argument if a fixed train-test split is to be used. Either (X, y) or (X_train, X_test, y_train, y_test) must be provided.
* **y_train** *(array-like of shape (n_samples,) or (n_samples, n_outputs))* – The target values for X_train (class labels in classification, real numbers in regression). Internally, it will be converted to a pandas Series. Use this argument if a fixed train-test split is to be used. Either (X, y) or (X_train, X_test, y_train, y_test) must be provided.
* **y_test** *(array-like of shape (n_samples,) or (n_samples, n_outputs))* – The target values for X_test (class labels in classification, real numbers in regression). Internally, it will be converted to a pandas Series. Use this argument if a fixed train-test split is to be used. Either (X, y) or (X_train, X_test, y_train, y_test) must be provided.
* **max_iter** *(int, default=10)* – Number of iterations to repeat several analyses in the Trust Report, including feature removal and branch analysis.
* **num_pruning_iter** *(int, default=10)* – Number of iterations to repeat the pruning analysis.
* **train_size** *(float or int, default=0.7)* – If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the train split. If int, represents the absolute number of train samples.
* **predict_method_name** *(str, default="predict")* – The method interface to use to get predictions from the expert model. If no value is passed, the default predict interface is used.
* **trustee_num_iter** *(int, default=50)* – Number of iterations to repeat Trustee inner-loop for.
* **trustee_num_stability_iter** *(int, default=10)* – Number of iterations to repeat the Trustee stabilization outer loop for.
* **trustee_sample_size** *(float, default=0.5)* – The fraction of the training dataset to use to train the student decision tree model.
* **trustee_max_leaf_nodes** *(int, default=None)* – Grow a tree with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If None then unlimited number of leaf nodes.
* **trustee_max_depth** *(int, default=None)* – The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure.
* **trustee_ccp_alpha** *(float, default=0.0)* – Complexity parameter used for Minimal Cost-Complexity Pruning. The subtree with the largest cost complexity that is smaller than ccp_alpha will be chosen. By default, no pruning is performed. See Minimal Cost-Complexity Pruning here for details: https://scikit-learn.org/stable/modules/tree.html#minimal-cost-complexity-pruning
* **analyze_branches** *(bool, default=False)* – Boolean indicating whether to perform the Trust Report branch analysis of Trustee explanations.
* **analyze_stability** *(bool, default=False)* – Boolean indicating whether to perform the Trust Report stability analysis of Trustee explanations.
* **skip_retrain** *(bool, default=False)* – Boolean indicating whether the Trust Report should skip retraining the given blackbox model. Retraining is used to evaluate the impact of each feature in training by iteratively removing top features. It works well for scikit-learn models, but can be troublesome for other libraries (especially AutoGluon).
* **top_k** *(int, default=10)* – Number of top-k branches, sorted by number of samples per branch, to keep after finding decision tree with highest fidelity.
* **verbose** *(bool, default=False)* – Boolean indicating whether to log messages.
* **logger** *(Logger object, default=None)* – A logger object to log messages to. If none is given, the print() method will be used to log messages.
* **class_names** *(array-like of str, default=None)* – List of class names to use when plotting decision trees and graphs.
* **feature_names** *(array-like of str, default=None)* – List of feature names to use when plotting decision trees and graphs.
* **is_classify** *(bool, default=True)* – Whether given blackbox is a classifier or regressor. The outputted plots change depending on chosen value.

In [None]:
from trustee.report.trust import TrustReport

trust_report = TrustReport(
    blackbox,
    X=X,
    y=y,
    top_k=10,
    max_iter=5,
    trustee_num_iter=30,
    num_pruning_iter=10,
    trustee_num_stability_iter=5,
    trustee_sample_size=0.30,
    analyze_stability=True,
    skip_retrain=False,
    feature_names=feature_names,
    class_names=CIC_IDS_2017_DATASET_META["classes"],
    verbose=True,
)

print(trust_report)

**Save several graphs for further inspection**

In [None]:
import warnings
warnings.filterwarnings("ignore")

OUTPUT_PATH = "res/output"
trust_report.save(OUTPUT_PATH)

# warning: execution time ~ 30 minutes