# Introduction

In this lab session we give a quick introduction to the libraries we use most often in the labs. We will install the required modules and quickly verify that the installed packages work correctly.

At the end of the lab, we ask you to describe a case in your line of work where you would like to apply the techniques you will learn in this Explainable AI course. It could be a specific use case where you would like to explain how a model performs a specific task, a general overview of where you think explainability is especially important in your domain or workplace, your current view of explainable AI, and so on. This is an open-ended question, and you are free to write whatever you please.

Building on this last part of the lab, you have the option to carry out a project at the end of the course. The project is also open-ended and is not required for a grade in the course. Our intention is to give you a platform to discuss your idea during the interactive sessions and to try to build an explainable AI model. You are not required to share any models or datasets if there are privacy or NDA concerns; the project is only for your own benefit.

If you decide to do a project, try to explain the problem you want to solve and how explainability might help with it.

# Installation

The cell below installs the packages from within the notebook. If you want more control over where the packages are stored, there is a pip *requirements.txt* file next to this notebook on the course's Blackboard page. **We recommend you install via the requirements.txt file** if you are using your own machine and have Python 3.11 installed.

If you are using Colab, running the cell below should install everything necessary for the labs. We will reuse this installation cell in future labs as well. You might have to restart the session after installation by pressing **Runtime** -> **Restart Session** so that the newly installed versions are picked up.

In [None]:
%pip install -U scikit-learn==1.4.1.post1 shap==0.45.0 lime==0.2.0.1 tensorflow==2.16.1 tf-keras==2.16.0 graphviz==0.20.2 dtreeviz==2.2.2 eli5==0.13.0 xgboost==2.0.3 pandas==2.2.1 seaborn==0.13.2 tf-keras-vis==0.8.6 numpy==1.26.4
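
As a quick check that the installation matches the pinned versions, the sketch below compares what is installed against a few of the pins from the install cell. The helper names `installed_version` and `check_versions` are made up for this illustration, and only a subset of the packages is listed; it assumes nothing beyond the standard library.

```python
from importlib.metadata import PackageNotFoundError, version

# A subset of the pinned versions, copied from the install cell.
PINNED = {
    "scikit-learn": "1.4.1.post1",
    "shap": "0.45.0",
    "lime": "0.2.0.1",
    "tensorflow": "2.16.1",
    "xgboost": "2.0.3",
    "pandas": "2.2.1",
    "numpy": "1.26.4",
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_versions(pinned):
    """Map each package name to an (installed, expected, matches) tuple."""
    report = {}
    for package, expected in pinned.items():
        found = installed_version(package)
        report[package] = (found, expected, found == expected)
    return report


for package, (found, expected, ok) in check_versions(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{package:<14} installed={found!s:<14} expected={expected:<14} {status}")
```

A `MISMATCH` line does not necessarily mean the labs will fail, but it is the first thing to look at if a later cell raises an unexpected error.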

## SciKit Learn

Some of the labs use the well-known scikit-learn module.
It contains algorithms, models, datasets, evaluation methods, and many other tools from the machine learning domain.

We will stick to the library's basic functionality in the course labs. Still, it is worth going through the basic syntax that all of its models follow.

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

Creating a logistic regression model to train on the iris dataset.

In [None]:
lor = LogisticRegression()

Load the iris dataset and split into test and train subsets.

In [None]:
iris = load_iris()
X_iris_train, X_iris_test, y_iris_train, y_iris_test = train_test_split(iris['data'], iris['target'], random_state=42, test_size=0.4)

To train a model on a dataset, call the model's fit method.

In [None]:
lor.fit(X_iris_train, y_iris_train)

Predictions are made with the model's predict method.

In [None]:
y_pred = lor.predict(X_iris_test)

For a simple accuracy measure of the classification, we can use the accuracy_score function available in the **sklearn.metrics** module.

In [None]:
accuracy_score(y_iris_test, y_pred)
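
Accuracy is simply the fraction of predictions that equal the true label. The sketch below computes it by hand with NumPy on made-up toy labels; `manual_accuracy` is a hypothetical helper written for this illustration, not part of scikit-learn.

```python
import numpy as np


def manual_accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


# Toy labels (made up for illustration): 4 of the 5 predictions are correct.
print(manual_accuracy([0, 1, 2, 2, 1], [0, 1, 2, 1, 1]))  # 0.8
```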

As previously stated, all models follow the same fit and predict scheme, as this linear regression model on the diabetes dataset shows.

In [None]:
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_absolute_error

lir = LinearRegression()
diabetes = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(diabetes['data'], diabetes['target'], random_state=42, test_size=0.4)

lir.fit(X_train, y_train)
y_pred = lir.predict(X_test)
mean_absolute_error(y_test, y_pred)
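
Because every estimator exposes the same fit and predict methods, swapping models only changes the constructor call. The sketch below illustrates this by training three classifiers on the iris data in one loop. The choice of models and the `max_iter=1000` setting are assumptions made for this example, not something the labs prescribe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42, test_size=0.4)

# Three different model families, all driven through the same two calls.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=42),
    "k-nearest neighbours": KNeighborsClassifier(),
}

scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)                                      # train
    scores[name] = accuracy_score(y_te, model.predict(X_te))   # predict and score
    print(f"{name}: {scores[name]:.3f}")
```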

## SHAP
SHAP is a package for local model-agnostic explanation techniques, which explain individual predictions of a model. Below is a quick example using the linear regressor on the diabetes dataset.

In [None]:
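import shap

# Background data used to estimate each feature's baseline value
mask = shap.maskers.Independent(X_train)
lr_explainer = shap.LinearExplainer(lir, mask, feature_names=diabetes.feature_names)
# SHAP values as a plain array (used by the decision plot below)
lr_shap_values = lr_explainer.shap_values(X_test)
# The same explanations wrapped in a shap.Explanation object
lr_shaps = lr_explainer(X_test)

# Decision plot for three test rows (indices 6, 8 and 10)
shap.decision_plot(lr_explainer.expected_value, lr_shap_values[[6,8,10]], diabetes.feature_names)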
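
For a linear model with features treated as independent, the SHAP value of a feature reduces to its coefficient times the feature's deviation from the background mean, and the contributions add up to the prediction. The sketch below checks this additivity by hand with NumPy and scikit-learn only. It is an illustration of the idea rather than the shap package's implementation; in particular, the shap masker may subsample the background data, so its numbers need not match this sketch digit for digit.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42, test_size=0.4)
model = LinearRegression().fit(X_tr, y_tr)

# Baseline: the model's prediction at the mean of the background (training) data.
background_mean = X_tr.mean(axis=0)
base_value = model.predict(background_mean.reshape(1, -1))[0]

# Per-feature contribution: coefficient times deviation from the background mean.
contributions = model.coef_ * (X_te - background_mean)

# Additivity: baseline plus the summed contributions recovers each prediction.
reconstructed = base_value + contributions.sum(axis=1)
print(np.allclose(reconstructed, model.predict(X_te)))  # True
```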

## LIME
LIME is a package for local model-agnostic explanation techniques. Below is a quick example using the logistic regression model on the iris dataset.

In [None]:
import lime.lime_tabular  # import the submodule explicitly; a bare "import lime" is not guaranteed to load it

lime_explainer = lime.lime_tabular.LimeTabularExplainer(X_iris_train, feature_names = iris.feature_names, kernel_width=3, verbose=False)
exp = lime_explainer.explain_instance(X_iris_test[28], lor.predict_proba, num_features=5)
exp.show_in_notebook()
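
The idea behind a local surrogate explanation can be sketched in a few steps: perturb the instance, query the black-box model, weight the samples by proximity, and fit a weighted linear model. The code below is a deliberately simplified illustration of that idea using only NumPy and scikit-learn. It is not the lime package's algorithm, which differs in details such as sampling and feature handling, and `local_surrogate` with its parameters is a hypothetical helper.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression, Ridge

X, y = load_iris(return_X_y=True)
black_box = LogisticRegression(max_iter=1000).fit(X, y)


def local_surrogate(predict_proba, instance, scale, class_index,
                    n_samples=1000, kernel_width=3.0, seed=0):
    """Fit a proximity-weighted linear model around `instance`."""
    rng = np.random.default_rng(seed)
    # 1. Perturb: Gaussian noise around the instance, scaled per feature.
    noise = rng.normal(size=(n_samples, instance.shape[0]))
    samples = instance + noise * scale
    # 2. Query the black box for the probability of the chosen class.
    targets = predict_proba(samples)[:, class_index]
    # 3. Weight samples by proximity (distance measured in standardized units).
    distances = np.linalg.norm(noise, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Fit a weighted linear model; its coefficients are the local explanation.
    surrogate = Ridge(alpha=1.0).fit(samples, targets, sample_weight=weights)
    return surrogate.coef_


instance = X[100]
class_index = int(black_box.predict(instance.reshape(1, -1))[0])
coefs = local_surrogate(black_box.predict_proba, instance, X.std(axis=0), class_index)
for name, coef in zip(load_iris().feature_names, coefs):
    print(f"{name}: {coef:+.3f}")
```

A positive coefficient means that, near this instance, increasing the feature raises the predicted probability of the chosen class.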

## Pandas
A package commonly used in data science and machine learning for working with tabular data.

In [None]:
import pandas as pd

df = pd.DataFrame([[1,2,3],[6,5,4]], columns=['col1', 'col2','col3'])
df
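
In the labs, a DataFrame is mostly a convenient way to keep feature names attached to the data. As a small sketch, the iris arrays can be wrapped in a DataFrame and summarized per class. The column name `species` and the variable names are choices made for this example.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris_bunch = load_iris()

# One column per feature, plus a readable class label.
iris_df = pd.DataFrame(iris_bunch.data, columns=iris_bunch.feature_names)
iris_df["species"] = iris_bunch.target_names[iris_bunch.target]

print(iris_df.head())
# Mean of every feature within each class.
print(iris_df.groupby("species").mean())
```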

## xgboost
A library that implements gradient-boosted tree ensembles.

In [None]:
import xgboost
xgb = xgboost.XGBClassifier(n_estimators=500, max_depth=4)
xgb.fit(X_iris_train, y_iris_train)
print(accuracy_score(y_iris_test, xgb.predict(X_iris_test)))  # argument order is (y_true, y_pred)

## dtreeviz
A visualization tool for displaying decision trees.

In [None]:
import dtreeviz
viz = dtreeviz.model(xgb, X_iris_train, y_iris_train,
                     target_name='class', tree_index=1,
                     feature_names=iris.feature_names,
                     class_names=iris.target_names)
viz.view(fontname='DejaVu Sans')

## Tensorflow and Keras

For the last lab, we will venture into neural networks, and we have prepared the lab using Keras and TensorFlow. If you have experience with these libraries, or with any of the other libraries, the lab should be fairly straightforward to follow. We will provide pre-defined models, so you will not have to implement much related to the networks themselves.

If you do not have much experience with the framework, or with neural networks in general, you may not need to fully understand all of the code.
We provide most of the code, so it should require little hands-on work from you.

However, we still recommend going through some of the basic examples available on the [Keras homepage](https://keras.io/examples/).


In [None]:
import tensorflow
from tensorflow import keras

# Overview of the remaining labs

Lab 2 - In this lab, we will focus on interpretable models. Traditional statistical models, such as linear regression, are usually referred to as intrinsically interpretable models because it is easy to understand what they have learned and how they perform their actions.

Lab 3 - Here, we will venture through global model-agnostic methods.
We will identify the important features of datasets using permutation feature importance and go through the use of global surrogate models.

Lab 4 - Local model-agnostic methods are the topic of this lab. SHAP and LIME will be used to explain individual predictions to determine the effect and importance of each feature in the model.

Lab 5 - Neural networks are the final topic of the lab sessions.
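
As a small preview of the permutation feature importance mentioned for Lab 3, the sketch below uses scikit-learn's `permutation_importance` on a linear regression model trained on the diabetes dataset. The number of repeats and the random seed are arbitrary choices for this illustration, and the lab itself may set things up differently.

```python
from sklearn.datasets import load_diabetes
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = load_diabetes()
X_tr, X_te, y_tr, y_te = train_test_split(
    data.data, data.target, random_state=42, test_size=0.4
)
model = LinearRegression().fit(X_tr, y_tr)

# Shuffle one feature at a time and measure how much the test score drops.
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=42)

# Print the features from most to least important.
ranking = result.importances_mean.argsort()[::-1]
for i in ranking:
    print(f"{data.feature_names[i]:<4} "
          f"{result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```

A feature whose shuffling barely changes the score contributes little to this model's predictions on the test data.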

# Your view of XAI, project idea, applications of XAI, etc.

Describe a case in your line of work where you would like to apply the techniques you will learn in this Explainable AI course. It could be a specific use case where you would like to explain how a model performs a specific task, a general overview of where you think explainability is especially important in your domain or workplace, your current view of explainable AI, and so on. This is an open-ended question, and you are free to write whatever you please.

Optionally, you can also propose a project to carry out during the course. This project is also open-ended and is not required for a grade in the course. Our intention is to give you a platform to discuss your idea during the interactive sessions and to try to build an explainable AI model. You are not required to share any models or datasets if there are privacy or NDA concerns; the project is only for your own benefit. If you decide to do a project, try to explain the problem you want to solve and how explainability might help with it.