<a href="https://colab.research.google.com/github/mille055/AIPI590-XAI/blob/main/Assignments/08_explainable_llm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a href='https://ai.meng.duke.edu'><img align="left" style="padding-top:10px;" src="https://storage.googleapis.com/aipi_datasets/Duke-AIPI-Logo.png"></a>

# AIPI 590 - XAI | Assignment 08

# Description: Interpretable LLM
This notebook explores the explainability of LLMs.


## Chad Miller


Acknowledgements: Class Repository code on GAM model, kaggle telco-customer-churn dataset

In [43]:
import os

# Remove Colab default sample_data
!rm -r /content/sample_data

# Clone GitHub files to colab workspace
repo_name = "AIPI590-XAI"
git_path = 'https://github.com/mille055/AIPI590-XAI.git'
!git clone "{git_path}"
# !git clone 'https://github.com/mille055/CT_Protocol.git'

# Install dependencies from requirements.txt file
!pip install -r "{os.path.join(repo_name,'requirements.txt')}"
# !pip install -r "{os.path.join(repo_name, 'requirements2.txt')}"

notebook_dir = 'Assignments'
path_to_notebook = os.path.join(repo_name,notebook_dir)


rm: cannot remove '/content/sample_data': No such file or directory
fatal: destination path 'AIPI590-XAI' already exists and is not an empty directory.
Collecting rulefit@ git+https://github.com/christophM/rulefit.git (from -r AIPI590-XAI/requirements.txt (line 14))
  Cloning https://github.com/christophM/rulefit.git to /tmp/pip-install-ad73onf5/rulefit_a5f61073db924d2eae104e9cf71159ae
  Running command git clone --filter=blob:none --quiet https://github.com/christophM/rulefit.git /tmp/pip-install-ad73onf5/rulefit_a5f61073db924d2eae104e9cf71159ae
  Resolved https://github.com/christophM/rulefit.git to commit 472b8574b4eb9e565caf1e05ed580998fe2c9a8e
  Preparing metadata (setup.py) ... done
Collecting alepython@ git+https://github.com/MaximeJumelle/ALEPython.git (from -r AIPI590-XAI/requirements.txt (line 15))
  Cloning https://github.com/MaximeJumelle/ALEPython.git to /tmp/pip-install-ad73onf5/alepython_69bb84ce5aec41a08b5134388ca71d92
  Running command git clone --filter=bl

In [66]:
## Standard libraries
import json
import math
import time
import numpy as np
import tabulate
import urllib.request
import zipfile
import pandas as pd
import re


## Imports for plotting
from IPython.display import Image
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import set_matplotlib_formats
#set_matplotlib_formats('svg', 'pdf') # For export
from matplotlib.colors import to_rgb
import matplotlib
matplotlib.rcParams['lines.linewidth'] = 2.0
import seaborn as sns
sns.set()


## Progress bar
from tqdm.notebook import tqdm

## PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as data
import torch.optim as optim

# Torchvision
import torchvision
from torchvision.datasets import CIFAR10
from torchvision import transforms, datasets
import torchvision.transforms.functional as TF
from torchvision.utils import make_grid
from torch.utils.data import DataLoader
from torchvision.datasets import ImageNet

# Other
from huggingface_hub import login
from datasets import load_dataset
from sklearn.metrics import confusion_matrix, classification_report
from openai import OpenAI
from getpass import getpass
from google.colab import userdata

In [67]:
api_key = None

# Try to load the API key from Colab secrets
try:
    api_key = userdata.get('OPENAI_API_KEY')
except Exception:
    # userdata.get raises if the secret is missing or notebook access is not granted
    api_key = None

# If the API key is not found, prompt for it
if not api_key:
    print('not in Colab secrets...')
    api_key = getpass("Enter your OpenAI API key: ")

if api_key:
    print('API key found')



client = OpenAI(api_key=api_key)



API key found


In [71]:

# Load the GSM8K grade-school math dataset from the Hugging Face Hub
dataset = load_dataset("gsm8k", 'main', split="train")

# making dataframe version
df = dataset.to_pandas()
# Dataset size
print('Downloaded dataset length is ', len(dataset))


# limiting to first 200
print('Limiting to the first 200 Q A pairs for this analysis.')
print('Some example question and answers:')
df = df[:200]
df.head()


Downloaded dataset length is  7473
Limiting to the first 200 Q A pairs for this analysis.
Some example question and answers:


Unnamed: 0,question,answer
0,Natalia sold clips to 48 of her friends in Apr...,Natalia sold 48/2 = <<48/2=24>>24 clips in May...
1,Weng earns $12 an hour for babysitting. Yester...,Weng earns 12/60 = $<<12/60=0.2>>0.2 per minut...
2,Betty is saving money for a new wallet which c...,"In the beginning, Betty has only 100 / 2 = $<<..."
3,"Julie is reading a 120-page book. Yesterday, s...",Maila read 12 x 2 = <<12*2=24>>24 pages today....
4,James writes a 3-page letter to 2 different fr...,He writes each friend 3*2=<<3*2=6>>6 pages a w...


In [72]:
# Function to parse the correct answer from the structured answer text
def parse_correct_answer(entry):
    answer_text = entry['answer']

    # Use regex to find the final answer after '####'
    # (allow a leading minus sign and thousands separators such as 1,000)
    match = re.search(r'####\s*(-?[\d,]*\d)', answer_text)
    if match:
        # Strip thousands separators and convert the answer to an integer
        return int(match.group(1).replace(',', ''))
    return None
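
The `answer` column shown above mixes worked reasoning, calculator annotations such as `<<48/2=24>>`, and a final line introduced by `####`. As a minimal sketch of that layout, the snippet below separates the readable rationale from the final answer. The helper name `split_gsm8k_answer` and the sample string are hypothetical, written only to mimic the format; they are not taken from the dataset.

```python
import re

def split_gsm8k_answer(answer_text):
    """Split a GSM8K-style answer into (rationale, final_answer_string)."""
    # Everything before '####' is reasoning; everything after is the final answer
    rationale, marker, final = answer_text.partition("####")
    # Drop calculator annotations such as <<48/2=24>>
    rationale = re.sub(r"<<[^<>]*>>", "", rationale).strip()
    final = final.strip() if marker else None
    return rationale, final

# Made-up sample in the same format as the dataset's answer column
sample = (
    "Half of 48 is 48/2 = <<48/2=24>>24.\n"
    "The total is 48+24 = <<48+24=72>>72.\n"
    "#### 72"
)
rationale, final = split_gsm8k_answer(sample)
print(rationale)
print("Final answer:", final)
```

Keeping the rationale and the final answer separate makes it easier to compare the model's reasoning with the reference reasoning, rather than only the final number.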



In [75]:
# Call the OpenAI API to get a response
def get_gpt3_response(prompt):

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant. After solving the problem, put the final numerical answer preceded by \n####"},
            {"role": "user", "content": prompt}
        ],
        max_tokens=150,
        temperature=0.3
    )
    # Extract and return the model's response
    return response.choices[0].message.content.strip()



In [76]:
# Example of function use
question = df.iloc[0]['question']
response = get_gpt3_response(question)
print("Response:", response)

Response: In April, Natalia sold 48 clips.
In May, she sold half as many clips as in April, which is 48 / 2 = 24 clips.

Altogether, Natalia sold 48 + 24 = 72 clips in April and May. 

#### 72


In [77]:
# Function to assess answers and get model responses
def assess_math_questions(df):
    results = []
    for index, row in df.iterrows():
        question = row['question']
        correct_answer = parse_correct_answer(row)
        model_response = get_gpt3_response(question)
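        # Prefer the number after the requested '####' marker; otherwise fall back to the last number in the response
        marked = re.search(r'####\s*\$?\s*(-?[\d,]*\d)', model_response)
        numbers = re.findall(r'\d+', model_response)
        raw_answer = marked.group(1) if marked else (numbers[-1] if numbers else None)
        model_numerical_answer = int(raw_answer.replace(',', '')) if raw_answer else None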

        # Store results in a dictionary for easier analysis later
        results.append({
            'question': question,
            'correct_answer': correct_answer,
            'model_response': model_response,
            'model_numerical_answer': model_numerical_answer,
            'is_correct': correct_answer == model_numerical_answer,
        })

    return pd.DataFrame(results)



In [None]:
# Run assessment and view results
results_df = assess_math_questions(df)
results_df.head()
print(results_df.head())

# saving results as csv
results_df.to_csv('math_results.csv', index=False)

In [None]:
print('Accuracy of the model:', results_df['is_correct'].mean())


In [None]:
# Get device
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
print('device is ', device)

In [None]:
login()



In [None]:
# Using a Huggingface dataset



ds = load_dataset("AI-MO/NuminaMath-CoT")

In [None]:
# # Access the training split
train_data = ds['train']


In [None]:
# print(train_data[0]['solution'])

In [None]:
df_dict = {}
for split in ds.keys():
  #print('splitting on ', split)
  df_dict[split] = ds[split].to_pandas()
#df_dict['train'].source.value_counts()
train = df_dict['train'].copy()
test = df_dict['test'].copy()

test.head()

In [None]:
# Find entries in the dataframe that contain double quotes (and so may need an escapechar on export)

rows_with_escapechar = []
for index, row in test.iterrows():
    if '"' in row['problem'] or '"' in row['solution']:
        rows_with_escapechar.append(index)

print(f"Rows needing escapechar: {rows_with_escapechar}")

# Alternatively, to print the actual rows that need the escape character
#for index in rows_with_escapechar:
#  print(test.loc[index])

In [None]:

# Export train and test DataFrames to CSV files
train.to_csv('math_train.csv', index=False, escapechar='/')
test.to_csv('math_test.csv', index=False, escapechar='/')

# trying another ...

In [None]:

file_path = 'CT_Protocol/data/results_df.csv'

try:
  results_df = pd.read_csv(file_path)
  print("DataFrame created successfully.")
except FileNotFoundError:
  print(f"File not found at: {file_path}")
except Exception as e:
  print(f"An error occurred: {e}")

In [None]:
results_df.head()

In [None]:

# Extract the ground truth labels and predicted labels
ground_truth = results_df['ft_protocol']
predictions = results_df['ft_predicted_protocol']

# Find the common protocols present in both ground_truth and predictions
common_protocols = np.intersect1d(ground_truth.unique(), predictions.unique())

# Filter the ground_truth and predictions to only include the common protocols
filtered_ground_truth = ground_truth[ground_truth.isin(common_protocols)]
filtered_predictions = predictions[ground_truth.isin(common_protocols)]

# Create the confusion matrix for filtered labels
cm = confusion_matrix(filtered_ground_truth, filtered_predictions, labels=common_protocols)

# Plot the confusion matrix with labels for common protocols
plt.figure(figsize=(10, 10))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=common_protocols, yticklabels=common_protocols)
plt.xlabel('Predicted Protocol')
plt.ylabel('Ground Truth Protocol')
plt.title('Confusion Matrix for Fine-tuned Protocol Prediction')

# Save the figure
plt.savefig('/content/confusion_matrix.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
report = classification_report(filtered_ground_truth, filtered_predictions, labels=common_protocols)

print(report)

In [None]:
# Extract the ground truth labels and predicted labels
results_bm_df = results_df[results_df['bm_predicted_protocol'].notna()]

ground_truth = results_bm_df['bm_protocol']
predictions = results_bm_df['bm_predicted_protocol']

# Find the common protocols present in both ground_truth and predictions
common_protocols = np.intersect1d(ground_truth.unique(), predictions.unique())

# Filter the ground_truth and predictions to only include the common protocols
filtered_ground_truth = ground_truth[ground_truth.isin(common_protocols)]
filtered_predictions = predictions[ground_truth.isin(common_protocols)]

# Create the confusion matrix for filtered labels
cm = confusion_matrix(filtered_ground_truth, filtered_predictions, labels=common_protocols)

# Plot the confusion matrix with labels for common protocols
plt.figure(figsize=(10, 10))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=common_protocols, yticklabels=common_protocols)
plt.xlabel('Predicted Protocol')
plt.ylabel('Ground Truth Protocol')
plt.title('Confusion Matrix for Base Model (bm) Protocol Prediction')

# Save the figure
plt.savefig('/content/bm_confusion_matrix.png', dpi=300, bbox_inches='tight')
plt.show()

In [None]:
bm_report = classification_report(filtered_ground_truth, filtered_predictions, labels=common_protocols)

print(bm_report)

In [None]:
# filename = 'content/CT_Protocol/data/dataset031524.xlsx'
# _, _, test_df = get_dataframes(filename)
# model_path = "mille055/auto_protocol2"


# tokenizer = AutoTokenizer.from_pretrained(model_path)
# model = AutoModelForCausalLM.from_pretrained(model_path)
# pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, framework="pt")

# ft_model_score = test_model2(test_df, pipe=pipe)
# print(ft_model_score)

In [None]:

# from google.colab import userdata
# from huggingface_hub import HfApi
# from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# import sys

# sys.path.insert(0, 'content/ct_protocol/scripts')
# from utilities import get_dataframes, test_model2


# filename = 'content/CT_Protocol/data/dataset031524.xlsx'
# _, _, test_df = get_dataframes(filename)
# model_path = "mille055/auto_protocol2"

# # Get the Hugging Face token from Colab secrets
# try:
#   token = userdata.get('huggingface_token')
#   if token is None:
#     raise ValueError("Hugging Face token not found in Colab secrets.")
# except:
#     print("Error getting Hugging Face token from secrets. Please ensure it's set up.")
#     token = input("Enter your Hugging Face token: ")  # Prompt for manual input if secrets are not set up

# api = HfApi(token=token)

# tokenizer = AutoTokenizer.from_pretrained(model_path)
# model = AutoModelForCausalLM.from_pretrained(model_path)
# pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, framework="pt")

# ft_model_score = test_model2(test_df, pipe=pipe)
# ft_model_score

In [None]:

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

# Load the adapter configuration, then the base model, then attach the fine-tuned adapter weights
config = PeftConfig.from_pretrained("mille055/auto_protocol2")
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = PeftModel.from_pretrained(base_model, "mille055/auto_protocol2")

## Exploratory Data Analysis

Dataset:


In [None]:
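# Check multicollinearity using Variance Inflation Factor (VIF)
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def check_multicollinearity(df):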
  X = df.select_dtypes(include=[float, int])  # Select only numeric features

  # Add constant for the intercept term
  X = sm.add_constant(X)

  # Calculate VIF
  vif_data = pd.DataFrame()
  vif_data['Feature'] = X.columns
  vif_data['VIF'] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]

  print(vif_data)


def EDA(df):
  df1= df.copy()

  # Display basic information about the dataset
  dataset_info = {
    'Shape': df1.shape,
    'Columns': df1.columns.tolist(),
    'Missing Values': df1.isnull().sum().sum(),
    'Data Types': df1.dtypes.value_counts().to_dict(),
    'Unique Values': df1.nunique().to_dict()
  }

  for key, value in dataset_info.items():
    print(f"{key}: {value}")


  print(df1.head())

  # Summary statistics
  summary_statistics = df1.describe()
  print(summary_statistics)

  # Visualize numerical features
  df1.hist(figsize=(12,10))
  plt.tight_layout()
  plt.show()

  # Correlation matrix for numerical features after dropping any non-numeric columns

  df1_numeric = df1.select_dtypes(include=['float64', 'int64'])
  corr_matrix = df1_numeric.corr()
  plt.figure(figsize=(10, 8))
  sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
  plt.show()


  # # Pairplot - but takes awhile so only doing the first time
  # print('\nPairplot:\n')
  # sns.pairplot(df1)
  # plt.show()

  # Check collinearity
  print('\nCollinearity Check:\n')
  check_multicollinearity(df1)


EDA(df)

## Learned from the EDA:
There are several points to take away from the EDA, including:

1. There are 569 examples with no missing values.

2. The mean, median, and standard deviation of each feature are given above.

3. Features like mean radius (3817.26), mean perimeter (3792.70), worst radius (815.95), and worst perimeter (405.15) have extremely high VIFs, indicating that they are highly collinear with other features.
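
For reference, the VIF values quoted above follow the standard definition, where $R_j^2$ is the coefficient of determination obtained by regressing feature $j$ on all the other features:

$$
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\qquad\Longleftrightarrow\qquad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}
$$

Plugging in the reported numbers gives a sense of scale:

$$
R_{\text{mean radius}}^2 = 1 - \frac{1}{3817.26} \approx 0.9997,
\qquad
R_{\text{worst perimeter}}^2 = 1 - \frac{1}{405.15} \approx 0.9975
$$

In other words, these features are almost perfectly predictable from the remaining ones. A commonly cited rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity, although the exact cutoff is a convention rather than a fixed rule.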

## Prepare training and testing datasets

In [None]:
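# Preparation of the train/test datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler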

X = df.drop(columns=['target'])
y = df['target']

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

## Rulefit

In [None]:
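# Train RuleFit model (RuleFitClassifier is assumed to come from the imodels package)
from imodels import RuleFitClassifier
from sklearn.metrics import accuracy_score

rulefit_model = RuleFitClassifier(random_state=42)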
rulefit_model.fit(X_train_scaled, y_train)

# Evaluate
y_pred = rulefit_model.predict(X_test_scaled)
print(f'Accuracy of RuleFit: {accuracy_score(y_test, y_pred):.2f}')

In [None]:
rule_df = rulefit_model.visualize()
rule_df

Repeating RuleFit using a dataset with a reduced number of features, based on multicollinearity.



In [None]:
## removing some of the collinear features
# List of features to remove based on prior analysis
features_to_remove = [
    'mean perimeter', 'mean area', 'perimeter error', 'area error',
    'mean compactness', 'mean concave points', 'worst compactness', 'worst concave points',
    'mean symmetry', 'symmetry error', 'worst symmetry',
    'mean fractal dimension', 'worst fractal dimension', 'fractal dimension error'
]

# reducing to 16 features

df_reduced = df.drop(columns=features_to_remove)
print(df_reduced.shape)
print(df_reduced.head())

X_reduced = df_reduced.drop(columns=['target'])
y = df_reduced['target']

# splitting the dataset into train/test
X_train_reduced, X_test_reduced, y_train, y_test = train_test_split(X_reduced, y, test_size=0.3, random_state=42)

# Standardize features
scaler = StandardScaler()
X_train_reduced_scaled = scaler.fit_transform(X_train_reduced)
X_test_reduced_scaled = scaler.transform(X_test_reduced)


In [None]:
# Train RuleFit model on the reduced dataset
rulefit_reduced_model = RuleFitClassifier(random_state=42)
rulefit_reduced_model.fit(X_train_reduced_scaled, y_train)

# Evaluate
y_pred = rulefit_reduced_model.predict(X_test_reduced_scaled)
print(f'Accuracy of RuleFit: {accuracy_score(y_test, y_pred):.2f}')

In [None]:
rule_reduced_df = rulefit_reduced_model.visualize()
rule_reduced_df

A few points about RuleFit for the reduced breast cancer dataset:

1. The accuracy was high at 0.98 (compared with 0.96 for the original dataset containing all features).
2. The rules produced by the RuleFit model are summarized below. The features were standardized before fitting, so the thresholds are in standard-deviation units. The direction of each effect depends on the target encoding: if the data follow scikit-learn's breast cancer dataset (0 = malignant, 1 = benign), a positive coefficient pushes the prediction toward benign and a negative coefficient toward malignant.

Individual Features (Linear Terms)

*   X2 (mean texture) with a coefficient of -0.21: higher values of mean texture are associated with a decrease in the target value (toward malignant under the encoding above).
*   X11 (worst radius) with a coefficient of -0.20: similarly, a larger worst radius slightly reduces the predicted outcome.

Decision rules with multiple features
* X12 <= 0.30526 and X3 <= 0.20041 and X4 <= 0.53781 has a coefficient of 1.50. This is the largest positive coefficient, so the rule strongly raises the predicted outcome when all of these conditions are true:
  * worst perimeter ≤ 0.30526
  * mean smoothness ≤ 0.20041
  * mean concavity ≤ 0.53781

  Low values on all three features point toward the benign class under the encoding above.

* X10 <= 0.10695 and X12 <= 0.31888 and X14 <= 2.02129 and X4 <= 2.11819 has a positive coefficient of 0.78, indicating a strong effect in the same direction when these conditions hold:
  * X10 (concave points error) <= 0.10695
  * X12 (worst perimeter) <= 0.31888
  * X14 (worst smoothness) <= 2.02129
  * X4 (mean concavity) <= 2.11819

* X11 > -0.93104 and X12 > -0.16385 and X15 > -0.35809 has a coefficient of -1.18, the largest negative coefficient. It applies when:
  * worst radius is higher than its threshold,
  * worst perimeter is higher than its threshold,
  * worst concavity is higher than its threshold.

  Large values on these features lower the predicted outcome, which points toward malignancy under the encoding above.
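
To make the coefficient listing more concrete, here is a minimal sketch of how a RuleFit-style score could be assembled from linear terms and rules like those above. Several things are assumptions: the intercept is not reported in this notebook, so `0.0` is a placeholder; only two of the rules are included; the two sample inputs are made up; and the final sigmoid assumes a logistic link. It is an illustration, not the library's implementation.

```python
import math

# Coefficients copied from the discussion above. The intercept is NOT reported
# in the notebook, so 0.0 is a placeholder assumption.
INTERCEPT = 0.0
LINEAR_TERMS = {"X2": -0.21, "X11": -0.20}

# Each rule is (list of (feature, operator, threshold), coefficient)
RULES = [
    ([("X12", "<=", 0.30526), ("X3", "<=", 0.20041), ("X4", "<=", 0.53781)], 1.50),
    ([("X11", ">", -0.93104), ("X12", ">", -0.16385), ("X15", ">", -0.35809)], -1.18),
]

def rule_fires(conditions, x):
    """A rule contributes only when every one of its conditions holds."""
    for feature, op, threshold in conditions:
        holds = x[feature] <= threshold if op == "<=" else x[feature] > threshold
        if not holds:
            return False
    return True

def rulefit_score(x):
    """Intercept + linear terms + coefficients of the rules that fire."""
    score = INTERCEPT + sum(coef * x[f] for f, coef in LINEAR_TERMS.items())
    score += sum(coef for conditions, coef in RULES if rule_fires(conditions, x))
    return score

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Two made-up standardized inputs: uniformly low and uniformly high feature values
low_values = {"X2": 0.0, "X3": -1.0, "X4": -1.0, "X11": -1.0, "X12": -1.0, "X15": -1.0}
high_values = {"X2": 0.0, "X3": 1.0, "X4": 1.0, "X11": 1.0, "X12": 1.0, "X15": 1.0}

for name, x in [("low", low_values), ("high", high_values)]:
    score = rulefit_score(x)
    print(f"{name}: score={score:.2f}, probability of class 1={sigmoid(score):.3f}")
```

For the low-valued input only the first rule fires, and for the high-valued input only the second one does, so the two scores land on opposite sides of zero. Which class a positive score favors depends on how the target is encoded.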






## Visualizations of rulefit

In [None]:

# Given rules and coefficients from your RuleFit model
rules_data = {
    'X3 <= 0.27437 and X4 <= 0.76102': 0.09,
    'X10 <= 0.10695 and X12 <= 0.31888 and X14 <= 2.02129 and X4 <= 2.11819': 0.78,
    'X12 <= 0.30526 and X3 <= 0.20041 and X4 <= 0.53781': 1.50,
    'X1 <= 0.48015 and X12 <= 0.21144': 0.27,
    'X11 > -0.93104 and X12 > -0.16385 and X15 > -0.35809': -1.18,
    'X2': -0.21,
    'X4': -0.00,
    'X7': 0.03,
    'X11': -0.20
}

# Map the feature indices to their actual names
feature_mapping = {
    'X1': 'mean radius',
    'X2': 'mean texture',
    'X3': 'mean smoothness',
    'X4': 'mean concavity',
    'X7': 'smoothness error',
    'X10': 'concave points error',
    'X11': 'worst radius',
    'X12': 'worst texture',
    'X14': 'worst smoothness',
    'X15': 'worst concavity'
}

# Rewrite rules with feature names to be more interpretable.
# Whole-token regex replacement keeps 'X1' from also matching the prefix of 'X10', 'X11', ...
token_pattern = re.compile(r'\bX\d+\b')

interpreted_rules = {}
for rule, coef in rules_data.items():
    interpreted_rule = token_pattern.sub(
        lambda m: feature_mapping.get(m.group(0), m.group(0)), rule)
    interpreted_rules[interpreted_rule] = coef

# Separate out the linear terms and rules
linear_terms = {k: v for k, v in interpreted_rules.items() if ' and ' not in k}
rules_only = {k: v for k, v in interpreted_rules.items() if ' and ' in k}

# Plot linear terms
plt.figure(figsize=(10, 6))
plt.barh(list(linear_terms.keys()), list(linear_terms.values()), color='skyblue')
plt.title('Linear Terms Contribution (Specific to Dataset)')
plt.xlabel('Coefficient Value')
plt.ylabel('Features')
plt.grid(True, linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

# Plot rules contribution
plt.figure(figsize=(10, 8))
plt.barh(list(rules_only.keys()), list(rules_only.values()), color='lightcoral')
plt.title('Decision Rules Contribution (Specific to Dataset)')
plt.xlabel('Coefficient Value')
plt.ylabel('Rules')
plt.grid(True, linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

In [None]:
rulefit_image_path = os.path.join('AIPI590-XAI', 'Assignments', 'Visualizations', 'rulefit.png')
Image(filename=rulefit_image_path)


## Boosted Rules Classifier (BRC)

The BoostedRulesClassifier is an interpretable machine learning model that combines the power of rule-based models with boosting techniques to create a model that is both accurate and understandable. It is in the family of ensemble models, specifically designed to handle complex datasets while maintaining transparency in its decision-making process.

The model starts by generating a set of simple decision rules from the features of the dataset. These rules are like "if-then" statements, e.g., "if mean radius > 15 and mean texture < 20, then predict malignant."
Each rule is created using decision tree splits. Boosting is an iterative process in which the model learns from its mistakes: it starts with a simple rule, makes predictions, and then adds further rules that concentrate on the examples the earlier rules got wrong.
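
The iterative idea described above can be sketched with a toy AdaBoost-style loop over one-condition rules (decision stumps) on a single feature. This is an illustration under stated assumptions, not the `BoostedRulesClassifier` implementation: the data are made up, labels are coded as +1/-1, and all function names are hypothetical.

```python
import math

def stump_predict(x, threshold, polarity):
    """A one-condition rule: 'if x > threshold predict polarity, else -polarity'."""
    return polarity if x > threshold else -polarity

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best single rule."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def fit_boosted_stumps(xs, ys, n_rounds=3):
    """Fit rules one at a time, re-weighting the examples after each round."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this rule
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples get larger weights, correct ones smaller
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all rules."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score > 0 else -1

# Made-up one-dimensional data that no single threshold can separate
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

for rounds in (1, 3):
    ensemble = fit_boosted_stumps(xs, ys, n_rounds=rounds)
    mistakes = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
    print(f"{rounds} round(s): {mistakes} training mistakes")
```

On this toy data a single rule misclassifies three of the ten points, while the weighted vote of three rules fits all ten. Each rule is still a readable "if x > threshold" statement with a vote weight attached.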

In [None]:
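# BoostedRulesClassifier is assumed to come from the imodels package
from imodels import BoostedRulesClassifier

brc_model = BoostedRulesClassifier(random_state=42)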
brc_model.fit(X_train_reduced_scaled, y_train)

# Make predictions
y_pred = brc_model.predict(X_test_reduced_scaled)

# Assess model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy of BoostedRulesClassifier: {accuracy:.2f}')

# Detailed classification report
print(classification_report(y_test, y_pred))



In [None]:
# Extract the rules from the BoostedRulesClassifier
from sklearn.tree import export_text

# Access the individual base estimators (decision trees) from the BoostedRulesClassifier
base_estimators = brc_model.estimators_

# Check how many base estimators (rules) are present
print(f"Number of base estimators (trees): {len(base_estimators)}")

# Extract text
for i, estimator in enumerate(base_estimators[:5]):  # Extract rules from the first 5 trees
    print(f"Rules from tree {i + 1}:\n")
    print(export_text(estimator, feature_names=X_reduced.columns.tolist()))
    print("\n" + "=" * 50 + "\n")


# Access feature importances from the BoostedRulesClassifier
feature_importances = brc_model.feature_importances_

# Create a DataFrame for better visualization
feature_importance_df = pd.DataFrame({
    'Feature': X_reduced.columns,
    'Importance': feature_importances
})

# Sort features by importance
feature_importance_df = feature_importance_df.sort_values(by='Importance', ascending=False)

# Plot the feature importances
plt.figure(figsize=(10, 6))
plt.barh(feature_importance_df['Feature'], feature_importance_df['Importance'], color='lightcoral')
plt.title('Feature Importance from BoostedRulesClassifier')
plt.xlabel('Importance')
plt.ylabel('Feature')
plt.gca().invert_yaxis()
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

In [None]:
## visualize schematic of Boosted Rules Classifier

brc_image_path = os.path.join('AIPI590-XAI', 'Assignments', 'Visualizations', 'brc.png')
Image(filename=brc_image_path)

## SLIM (Supersparse Linear Integer Model)
SLIM is an interpretable machine learning model that aims to create a linear model with integer coefficients. The model generates sparse models (many coefficients are zero), meaning it selects only a few important features, and uses integer weights (e.g., -2, 1, 3), making it very easy to understand. The model balances accuracy and interpretability by being both sparse and using simple, integer-based coefficients. While SLIM can be computationally intensive for larger datasets or datasets with many features because it searches for the optimal sparse model, this isn't a significant issue for the Breast Cancer dataset.

As stated in the paper [1], SLIM produces scoring systems, which "are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are used to assess the risk of numerous serious medical conditions since they allow physicians to make quick predictions, without extensive training, and without the use of a computer."

[1] https://arxiv.org/pdf/1502.04269
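
As a rough illustration of what such a scoring system looks like in practice, the sketch below scores two inputs with a handful of small integer weights. The weights, intercept, and feature values are hypothetical and chosen only for illustration; they are not the coefficients fitted by `slim_model` in this notebook.

```python
# Hypothetical integer weights for illustration only; they are NOT the fitted
# coefficients of slim_model in this notebook.
WEIGHTS = {"worst radius": -2, "mean texture": -1, "mean concavity": -3}
INTERCEPT = 1

def slim_score(features):
    """Add up integer points: intercept + weight * value for each selected feature."""
    return INTERCEPT + sum(w * features[name] for name, w in WEIGHTS.items())

def slim_predict(features):
    """Predict class 1 when the total score is positive, otherwise class 0."""
    return 1 if slim_score(features) > 0 else 0

# Two made-up standardized inputs
case_a = {"worst radius": -1.0, "mean texture": 0.5, "mean concavity": -0.5}
case_b = {"worst radius": 1.5, "mean texture": 0.0, "mean concavity": 1.0}

for name, case in [("case_a", case_a), ("case_b", case_b)]:
    print(name, "score:", slim_score(case), "prediction:", slim_predict(case))
```

Because only a few features carry non-zero integer weights, the whole prediction can be checked by hand, which is the property the quoted passage emphasizes.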

In [None]:
# Initialize and fit the SLIM model (SLIMClassifier is assumed to come from the imodels package)
from imodels import SLIMClassifier

slim_model = SLIMClassifier()
slim_model.fit(X_train_reduced_scaled, y_train)

# Make predictions
y_pred = slim_model.predict(X_test_reduced_scaled)

In [None]:
# Evaluate the model performance
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy of SLIM: {accuracy:.2f}')

# Detailed classification report
print(classification_report(y_test, y_pred))

# Access the internal model from slim_model
internal_model = slim_model.model_

# Extract the coefficients
coefficients = internal_model.coef_
print("SLIM Model Coefficients:", coefficients)

In [None]:
coef_list = zip(X_reduced.columns, coefficients[0])
coef_df = pd.DataFrame(coef_list, columns=['Feature', 'Coefficient'])
coef_df

In [None]:
slim_image_path = os.path.join('AIPI590-XAI', 'Assignments', 'Visualizations', 'slim.png')
Image(filename=slim_image_path)

The models have similar accuracy in predicting malignancy (98%). The relative importance of certain features (in terms of feature importance or their coefficients in a linear model) differs somewhat among the three models.

With regard to interpretability:
* **SLIM Model:**
The SLIM model provides high interpretability because it uses sparse linear equations with integer coefficients. Each feature's contribution is easily understandable, making it clear how predictions are made. The integer coefficients indicate the weight or impact of each feature on the prediction, which is ideal for explaining model behavior to non-experts. The model's sparsity means that only a few features are selected, making it easier to identify the most important factors influencing the prediction but possibly limiting accuracy (although not in the case of this dataset).

* **Boosted Rules Classifier and RuleFit:**
The Boosted Rules Classifier is also interpretable but combines multiple rules in an ensemble format, which can make it slightly harder to interpret compared to SLIM’s linear equation. It can possibly capture more complex relationships.

  RuleFit provides linear terms and decision rules, offering a good balance between accuracy and interpretability, but it is more complex than SLIM’s integer-based linear model. RuleFit captures feature importance through both linear terms and rules but may include more features, making it less sparse compared to SLIM.