# Sparse Tensor Classifier Explainability

In this tutorial you'll learn how to explain the predictions made by `SparseTensorClassifier`. The algorithm natively supports both global and local explainability, making it a tool for exploratory data analysis as well as for interpreting individual predictions.


## Colab

This tutorial and the others in [this sequence](https://github.com/SparseTensorClassifier/tutorial) can be run in Google Colab. To open this notebook in Colab, click [here](https://colab.research.google.com/github/SparseTensorClassifier/tutorial/blob/main/Quickstart_Explainability.ipynb).

![](https://colab.research.google.com/assets/colab-badge.svg)

## Setup

Uncomment and run the following cell to install the packages. Then, import the modules.

In [1]:
# !pip install stc pandas

In [2]:
import pandas as pd
from stc import SparseTensorClassifier

## Read the dataset
The dataset consists of 101 animals from a zoo. There are 16 variables with various traits to describe the animals. The 7 Class Types are: Mammal, Bird, Reptile, Fish, Amphibian, Bug and Invertebrate. Let's read and shuffle the data.

In [3]:
zoo = pd.read_csv('https://raw.githubusercontent.com/SparseTensorClassifier/tutorial/main/data/zoo/zoo.csv')
zoo = zoo.sample(frac=1, random_state=42)
zoo

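| | animal_name | hair | feathers | eggs | milk | airborne | aquatic | predator | toothed | backbone | breathes | venomous | fins | legs | tail | domestic | catsize | class_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 84 | squirrel | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | Mammal |
| 55 | oryx | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 4 | 1 | 0 | 1 | Mammal |
| 66 | porpoise | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | Mammal |
| 67 | puma | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 4 | 1 | 0 | 1 | Mammal |
| 45 | lion | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 4 | 1 | 0 | 1 | Mammal |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 60 | pike | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | Fish |
| 71 | rhea | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | Bird |
| 14 | crab | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | Invertebrate |
| 92 | tuna | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | Fish |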
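
Since the first 70 shuffled rows are later used for training, it can be worth confirming that the shuffle is reproducible and checking how the classes are distributed. The sketch below is only an illustration: `toy` is a hypothetical hand-made frame, not the zoo data, and it relies only on standard pandas calls.

```python
import pandas as pd

# Hypothetical stand-in for the zoo table (not the real data).
toy = pd.DataFrame({
    "animal_name": ["lion", "tuna", "rhea", "crab", "pike", "puma"],
    "legs": [4, 0, 2, 4, 0, 4],
    "class_type": ["Mammal", "Fish", "Bird", "Invertebrate", "Fish", "Mammal"],
})

# frac=1 returns every row in random order; a fixed random_state makes it repeatable.
shuffled_a = toy.sample(frac=1, random_state=42)
shuffled_b = toy.sample(frac=1, random_state=42)
same_order = shuffled_a.index.equals(shuffled_b.index)

# Shuffling does not change the class counts, but the counts reveal label imbalance.
class_counts = shuffled_a["class_type"].value_counts()
print(same_order)
print(class_counts)
```

Running the same two checks on `zoo` would show whether rare classes are likely to end up in the training slice.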


## Single Dimension
Let's instruct STC to predict `class_type` from all the other attributes in the dataset except `animal_name`. In this example, we collapse all the features into a single dimension and explain the model accordingly. This behavior is controlled by the `collapse` argument, which defaults to `True` at initialization.

In [4]:
STC = SparseTensorClassifier(targets=['class_type'], features=zoo.columns[1:-1])

### Global Explanation

Fit the model on the training data (the first 70 shuffled rows):

In [5]:
STC.fit(zoo[0:70])



Let's now look at the global model structure using the function `explain()`. It returns a table with the score associated with each feature, for each target class label.

In [6]:
e = STC.explain()

For example, which features are most relevant for predicting `Mammal`? $\rightarrow$ The animal gives milk, has hair, and does not lay eggs.

In [7]:
e.loc["Mammal"][0:3]

| class_type | features | score |
|---|---|---|
| Mammal | milk: 1 | 0.25 |
| Mammal | hair: 1 | 0.16489 |
| Mammal | eggs: 0 | 0.143899 |
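
To look at the top features of every class at once rather than one class at a time, a group-wise selection can help. The following is a minimal sketch on a hypothetical frame `e_toy` shaped like the table above (indexed by `class_type`, with `features` and `score` columns). The Mammal rows are copied from that output; the Fish rows are invented for illustration.

```python
import pandas as pd

# Hypothetical scores shaped like the explain() output: one row per (class, feature).
# Mammal rows come from the output above; Fish rows are made up.
e_toy = pd.DataFrame(
    {
        "features": ["milk: 1", "hair: 1", "eggs: 0", "fins: 1", "legs: 0"],
        "score": [0.25, 0.16489, 0.143899, 0.30, 0.20],
    },
    index=pd.Index(["Mammal", "Mammal", "Mammal", "Fish", "Fish"], name="class_type"),
)

# Sort explicitly so the result does not rely on the incoming row order,
# keep the two highest-scoring features per class, then group the classes together.
top2 = (
    e_toy.sort_values("score", ascending=False)
    .groupby(level="class_type")
    .head(2)
    .sort_index(kind="stable")
)
print(top2)
```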


### Local Explanation

Let's now turn to local explainability and explain the prediction for each animal in the test set. The function `predict` returns a tuple of labels, probabilities, and explainability scores.

In [8]:
test = zoo[70:].reset_index()
labels, probability, explainability = STC.predict(test)
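
Before explaining individual predictions, it is often useful to check how many of them are correct. The sketch below assumes that the predicted labels form a DataFrame with a `class_type` column indexed by item position, as the `labels.loc[24]` output further down suggests. `labels_toy` and `test_toy` are hypothetical stand-ins, not STC output.

```python
import pandas as pd

# Hypothetical stand-ins: `test_toy` mimics the held-out rows with their true classes,
# `labels_toy` mimics the predicted labels on the same index.
test_toy = pd.DataFrame({
    "animal_name": ["sole", "rhea", "crab", "tuna"],
    "class_type": ["Fish", "Bird", "Invertebrate", "Fish"],
})
labels_toy = pd.DataFrame(
    {"class_type": ["Fish", "Bird", "Fish", "Fish"]},
    index=test_toy.index,
)

# Element-wise comparison aligned on the shared index, then the mean of the booleans.
correct = labels_toy["class_type"] == test_toy["class_type"]
accuracy = correct.mean()

# Rows where the prediction disagrees with the true class.
mistakes = test_toy.loc[~correct, ["animal_name", "class_type"]]
print(accuracy)
print(mistakes)
```

The misclassified rows are natural candidates for the local explanations shown next.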



Let's consider item 24: a [sole (fish)](https://en.wikipedia.org/wiki/Sole_(fish))

In [9]:
test.loc[24][['animal_name', 'class_type']]

animal_name    sole
class_type     Fish
Name: 24, dtype: object

What is the predicted class? $\rightarrow$ `Fish`

In [10]:
labels.loc[24]

class_type    Fish
Name: 24, dtype: object

What is the probability of the predicted class? $\rightarrow$ The probability of `Fish` is `0.51`, about three times that of the runner-up, `Reptile` (`0.17`).

In [11]:
probability.loc[24]

class_type
Amphibian       0.069069
Bird            0.043749
Bug             0.013218
Fish            0.513370
Invertebrate    0.097584
Mammal          0.093420
Reptile         0.169588
Name: 24, dtype: float64
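
The same reading can be done programmatically. This small sketch rebuilds the probability vector for item 24 from the output above as a plain pandas Series (the variable names are illustrative) and extracts the predicted class, its margin over the runner-up, and the total mass.

```python
import pandas as pd

# Probabilities copied from the output above for item 24.
p = pd.Series(
    {
        "Amphibian": 0.069069, "Bird": 0.043749, "Bug": 0.013218,
        "Fish": 0.513370, "Invertebrate": 0.097584,
        "Mammal": 0.093420, "Reptile": 0.169588,
    },
    name=24,
)

ranked = p.sort_values(ascending=False)
predicted = ranked.index[0]               # class with the highest probability
margin = ranked.iloc[0] - ranked.iloc[1]  # gap between the top class and the runner-up
total = p.sum()                           # should be close to 1

print(predicted, round(margin, 6), round(total, 6))
```

A small margin would flag a prediction worth inspecting more closely.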

Why is the model predicting `Fish`? Let's take the top 3 features $\rightarrow$ The animal has `fins`, does not `breathe`, and has no `legs`. As simple as that!

In [12]:
explainability.loc[(24,'Fish')][0:3]

| item | class_type | features | score |
|---|---|---|---|
| 24 | Fish | fins: 1 | 0.055465 |
| 24 | Fish | breathes: 0 | 0.029967 |
| 24 | Fish | legs: 0 | 0.025859 |
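
The lookup `explainability.loc[(24,'Fish')]` selects rows by both index levels. The sketch below reproduces that selection on a hypothetical frame `x` built from the three rows above, assuming an `(item, class_type)` index, and then splits each `name: value` string into separate columns, which can be handy for filtering or plotting.

```python
import pandas as pd

# Rows copied from the output above; the index levels are assumed to be (item, class_type).
index = pd.MultiIndex.from_tuples([(24, "Fish")] * 3, names=["item", "class_type"])
x = pd.DataFrame(
    {
        "features": ["fins: 1", "breathes: 0", "legs: 0"],
        "score": [0.055465, 0.029967, 0.025859],
    },
    index=index,
)

# Select every row for item 24 and class Fish.
rows = x.loc[(24, "Fish")]

# Split "name: value" into two columns; to_numpy() sidesteps index alignment.
parts = rows["features"].str.split(": ", expand=True)
rows = rows.assign(feature=parts[0].to_numpy(), value=parts[1].to_numpy())
print(rows[["feature", "value", "score"]])
```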


## Multiple Dimensions
Next, convert the data to a JSON-like list of dictionaries, a more flexible format, to illustrate explainability with multi-dimensional features. In this example, we define 2 dimensions: `d1`, containing the number of legs and whether the animal is aquatic, and `d2`, containing all the other attributes except `animal_name`.

In [13]:
items = []
for _, row in zoo.iterrows():
    item = {}
    # d1: number of legs and whether the animal is aquatic
    item['d1'] = [f+"="+str(row[f]) for f in ['legs','aquatic']]
    # d2: all the remaining attributes
    item['d2'] = [f+"="+str(row[f]) for f in zoo.columns[1:] if f not in ['legs','aquatic','class_type']]
    # target and identifier, each wrapped in a list
    item['class_type'] = [row['class_type']]
    item['animal_name'] = [row['animal_name']]
    items.append(item)

items[0]

{'d1': ['legs=2', 'aquatic=0'],
 'd2': ['hair=1',
  'feathers=0',
  'eggs=0',
  'milk=1',
  'airborne=0',
  'predator=0',
  'toothed=1',
  'backbone=1',
  'breathes=1',
  'venomous=0',
  'fins=0',
  'tail=1',
  'domestic=0',
  'catsize=0'],
 'class_type': ['Mammal'],
 'animal_name': ['squirrel']}
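
The same conversion can be wrapped in a small helper so that the split between dimensions is easy to change. This is a sketch under stated assumptions: `row_to_item`, `d1_cols`, `d2_cols`, and the two-row `toy` frame are hypothetical names introduced here, and the helper leaves out `animal_name` because it is not used as a feature.

```python
import pandas as pd

def row_to_item(row, d1_cols, d2_cols, target="class_type"):
    """Turn one record (a plain dict) into the item layout shown above."""
    return {
        "d1": [f"{c}={row[c]}" for c in d1_cols],
        "d2": [f"{c}={row[c]}" for c in d2_cols],
        target: [row[target]],
    }

# Hypothetical two-row frame standing in for the zoo table.
toy = pd.DataFrame({
    "animal_name": ["squirrel", "tuna"],
    "hair": [1, 0],
    "fins": [0, 1],
    "legs": [2, 0],
    "aquatic": [0, 1],
    "class_type": ["Mammal", "Fish"],
})

d1_cols = ["legs", "aquatic"]
# Everything except the identifier, the target, and the d1 columns goes to d2.
d2_cols = [c for c in toy.columns if c not in d1_cols + ["animal_name", "class_type"]]

# to_dict("records") yields one plain dict per row.
toy_items = [row_to_item(r, d1_cols, d2_cols) for r in toy.to_dict("records")]
print(toy_items[0])
```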

Let's instruct STC to predict `class_type` based on `d1` and `d2`. In this example, we keep `d1` and `d2` in separate dimensions of the sparse tensor. This is done by setting `collapse=False` at initialization.

In [14]:
STC = SparseTensorClassifier(targets=['class_type'], features=['d1', 'd2'], collapse=False)

### Global Explanation
Fit the model on the training data:

In [15]:
STC.fit(items[0:70])



Let's now look at the global model structure using the function `explain()`. It returns a table with the score associated with each feature, for each target class label. Note that each feature is now a pair of values, one from `d1` and one from `d2`, living in a 2-dimensional space.

In [16]:
e = STC.explain()

For example, which feature combinations are most relevant for predicting `Mammal`?
1. It is not `aquatic` and gives `milk`
2. It has 4 `legs` and gives `milk`
3. It has 4 `legs` and does not lay `eggs`

In [17]:
e.loc["Mammal"][0:3]

| class_type | d1 | d2 | score |
|---|---|---|---|
| Mammal | aquatic=0 | milk=1 | 0.183166 |
| Mammal | legs=4 | milk=1 | 0.170941 |
| Mammal | legs=4 | eggs=0 | 0.170941 |
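
Because every score now belongs to a pair of values, it can be laid out as a grid with one axis per dimension. The sketch below does this with a standard pandas pivot on the three Mammal rows copied from the output above; `pairs` and `grid` are illustrative names.

```python
import pandas as pd

# Rows copied from the Mammal output above.
pairs = pd.DataFrame({
    "d1": ["aquatic=0", "legs=4", "legs=4"],
    "d2": ["milk=1", "milk=1", "eggs=0"],
    "score": [0.183166, 0.170941, 0.170941],
})

# One row per d1 value, one column per d2 value; pairs that are not listed become NaN.
grid = pairs.pivot(index="d1", columns="d2", values="score")
print(grid)
```

With the full table, the same grid could feed a heatmap of the most relevant combinations.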


### Local Explanation
Let's now turn to local explainability and explain the prediction for each animal in the test set. The function `predict` returns a tuple of labels, probabilities, and explainability scores. Because the features are now multi-dimensional, pass a `policy` to `predict` to avoid missing predictions.

In [18]:
test = items[70:]
labels, probability, explainability = STC.predict(test, policy=[['d1','d2'], ['d2'], []])
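
One way to build intuition for why a list such as `[['d1','d2'], ['d2'], []]` helps avoid missing predictions is to read it as an ordered fallback: try the richest feature set first, then progressively simpler ones. That reading is an assumption, and the code below is a toy exact-match classifier written only to illustrate ordered fallback. It is not how STC computes predictions, and `fit_tables` and `predict_one` are hypothetical helpers.

```python
from collections import Counter, defaultdict

def fit_tables(items, policy, target="class_type"):
    """For each feature subset in the policy, count classes per observed value combination."""
    tables = []
    for dims in policy:
        table = defaultdict(Counter)
        for item in items:
            key = tuple(tuple(item[d]) for d in dims)
            table[key][item[target][0]] += 1
        tables.append(table)
    return tables

def predict_one(item, policy, tables):
    """Return (label, dims used) from the first policy level that has seen this combination."""
    for dims, table in zip(policy, tables):
        key = tuple(tuple(item[d]) for d in dims)
        if key in table:
            return table[key].most_common(1)[0][0], dims
    return None, None

policy = [["d1", "d2"], ["d2"], []]
train = [
    {"d1": ["legs=0"], "d2": ["fins=1"], "class_type": ["Fish"]},
    {"d1": ["legs=4"], "d2": ["fins=0"], "class_type": ["Mammal"]},
    {"d1": ["legs=4"], "d2": ["fins=0"], "class_type": ["Mammal"]},
]
tables = fit_tables(train, policy)

seen = predict_one({"d1": ["legs=0"], "d2": ["fins=1"]}, policy, tables)     # matched on d1 and d2
new_d1 = predict_one({"d1": ["legs=2"], "d2": ["fins=1"]}, policy, tables)   # falls back to d2 only
all_new = predict_one({"d1": ["legs=6"], "d2": ["fins=2"]}, policy, tables)  # falls back to the class prior
print(seen, new_d1, all_new)
```

In this toy version the empty list `[]` always matches, so every item receives a label, here the majority class of the training data.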



Let's consider item 24: a [sole (fish)](https://en.wikipedia.org/wiki/Sole_(fish))

In [19]:
print(test[24]['animal_name'], test[24]['class_type'])

['sole'] ['Fish']


What is the predicted class? $\rightarrow$ `Fish`

In [20]:
labels.loc[24]

class_type    Fish
Name: 24, dtype: object

What is the probability of the predicted class? $\rightarrow$ The probability of `Fish` is `0.67`, more than four times that of the runner-up, `Reptile` (`0.16`).

In [21]:
probability.loc[24]

class_type
Amphibian       0.043673
Bird            0.007436
Bug             0.000000
Fish            0.671736
Invertebrate    0.101347
Mammal          0.018671
Reptile         0.157138
Name: 24, dtype: float64

Why is the model predicting `Fish`? Let's take the top 3 features: 
1. The animal is `aquatic` and has `fins`
2. The animal has no `legs` and has `fins`
3. The animal has no `legs` and is not `venomous`

In [22]:
explainability.loc[(24,'Fish')][0:3]

| item | class_type | d1 | d2 | score |
|---|---|---|---|---|
| 24 | Fish | aquatic=1 | fins=1 | 0.031694 |
| 24 | Fish | legs=0 | fins=1 | 0.031694 |
| 24 | Fish | legs=0 | venomous=0 | 0.023061 |


# Congratulations! 

Congratulations on completing this tutorial notebook! If you enjoyed working through the tutorial, and want to continue working with Sparse Tensor Classifier, we encourage you to finish the rest of the tutorials in [this series](https://github.com/SparseTensorClassifier/tutorial). Don't forget to [star the repository](https://github.com/SparseTensorClassifier/stc)! 

![GitHub Repo stars](https://img.shields.io/github/stars/SparseTensorClassifier/stc?style=social)

<div>
    Thanks by <a href="https://sparsetensorclassifier.org">https://sparsetensorclassifier.org</a>  
    <span style="float:right">
        Questions? Open an <a href="https://github.com/SparseTensorClassifier/tutorial/issues">issue</a>
    </span> 
</div>