# Iris classification

This notebook walks through using ITEA for classification (``ITEA_classifier``) and interpreting the final expression with the ``itea.inspection`` tools.

In [1]:
import numpy  as np
import pandas as pd

# automatically differentiable implementation of numpy
import jax.numpy as jnp

from sklearn import datasets
from sklearn.metrics import classification_report

from sklearn.model_selection import train_test_split
from IPython.display         import display, Math, Latex

import matplotlib.pyplot as plt

from itea.classification import ITEA_classifier
from itea.inspection     import ITExpr_inspector, ITExpr_texifier, ITExpr_explainer

import warnings; warnings.filterwarnings('ignore')

We will use the Iris data set in this example.

This data set contains 150 samples from 3 classes of Iris flowers (setosa, versicolor, and virginica). Each sample has 4 features measured in centimeters: sepal length, sepal width, petal length, and petal width.

One example of each flower is illustrated in the figure below.

![One example of each Iris class](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Machine+Learning+R/iris-machinelearning.png)

## Creating and fitting an ``ITEA_classifier``

In [3]:
iris_data = datasets.load_iris()
X, y      = iris_data['data'], iris_data['target']
labels    = iris_data['feature_names']
targets   = iris_data['target_names']

# Map the integer targets to their class names
y_targets = [targets[yi] for yi in y]

X_train, X_test, y_train, y_test = train_test_split(
    X, y_targets, test_size=0.33, random_state=42)

# Creating transformation functions for ITEA using jax.numpy
# (so we don't need to analytically calculate its derivatives)
tfuncs = {
    'id'       : lambda x: x,
    'sqrt.abs' : lambda x: jnp.sqrt(jnp.abs(x)), 
    'log'      : jnp.log,
    'exp'      : jnp.exp
}

clf = ITEA_classifier(
    gens            = 50,
    popsize         = 50,
    max_terms       = 2,
    expolim         = (-1, 1),
    verbose         = 5,
    tfuncs          = tfuncs,
    labels          = labels,
    simplify_method = 'simplify_by_var',
    fit_kw          = {'max_iter':25},
    random_state    = 42,
).fit(X_train, y_train)

gen 	 min_fitness 	 mean_fitness 	 max_fitness 	 remaining (s)
0 	 0.56 	 0.8482 	 0.97 	 0min17seg
5 	 0.74 	 0.9617999999999999 	 0.98 	 0min11seg
10 	 0.79 	 0.9693999999999999 	 0.99 	 0min11seg
15 	 0.95 	 0.9822 	 0.99 	 0min9seg
20 	 0.89 	 0.9822 	 0.99 	 0min9seg
25 	 0.86 	 0.9824 	 0.99 	 0min8seg
30 	 0.93 	 0.9818000000000001 	 0.99 	 0min6seg
35 	 0.96 	 0.9826000000000001 	 0.99 	 0min4seg
40 	 0.94 	 0.9816000000000001 	 0.99 	 0min3seg
45 	 0.66 	 0.9738 	 0.99 	 0min1seg
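
To make the hyperparameters above more concrete, recall that an IT (Interaction-Transformation) expression is a weighted sum of terms, where each term applies one transformation function to a product of features raised to integer exponents. Under that reading, ``max_terms = 2`` bounds the number of terms and ``expolim = (-1, 1)`` bounds each exponent. The sketch below evaluates such an expression in plain Python. The terms, weights, and intercept are hypothetical values chosen for illustration, not the fitted model, and the helper names (``eval_it_term``, ``eval_it_expr``, ``tfuncs_plain``) are not part of the ``itea`` API.

```python
import math

# Transformation functions mirroring the notebook's `tfuncs`, written with
# the standard library so the sketch runs without jax.
tfuncs_plain = {
    "id": lambda z: z,
    "sqrt.abs": lambda z: math.sqrt(abs(z)),
    "log": math.log,
    "exp": math.exp,
}


def eval_it_term(x, tfunc_name, exponents):
    """Evaluate one IT term: transformation(prod_j x_j ** k_j)."""
    interaction = math.prod(xj ** kj for xj, kj in zip(x, exponents))
    return tfuncs_plain[tfunc_name](interaction)


def eval_it_expr(x, terms, weights, intercept):
    """Weighted sum of IT terms plus an intercept."""
    return intercept + sum(
        w * eval_it_term(x, name, ks) for w, (name, ks) in zip(weights, terms)
    )


# Hypothetical two-term expression (max_terms = 2), exponents within [-1, 1].
terms = [("log", (0, 0, 1, 1)), ("sqrt.abs", (1, -1, 0, 0))]
weights = [2.0, -0.5]
intercept = 1.0

sample = (5.1, 3.5, 1.4, 0.2)  # first Iris observation, in cm
value = eval_it_expr(sample, terms, weights, intercept)
print(round(value, 4))
```

Here the first term reads ``log(petal length * petal width)`` and the second ``sqrt(abs(sepal length / sepal width))``; an exponent of 0 removes a feature from the interaction.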


Since both the ``ITEA_classifier`` and the best solution it finds (an ``ITExpr_classifier``) are scikit-learn-like models, they expose base methods such as ``get_params``. In the run recorded below, however, this call raised an ``AttributeError``:

In [7]:
clf.bestsol_.get_params()

AttributeError: 'ITExpr_classifier' object has no attribute 'alpha'
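
The error above is consistent with the scikit-learn convention for ``get_params``: the method reads the argument names of ``__init__`` and looks up an instance attribute with the same name for each one. If a constructor argument is never stored on the instance, the lookup fails. The sketch below mimics that convention in plain Python. The classes ``GoodEstimator`` and ``BrokenEstimator`` and the standalone ``get_params`` function are hypothetical, and whether this is the actual cause inside ``ITExpr_classifier`` is an assumption.

```python
import inspect


def get_params(estimator):
    """Mimic the scikit-learn convention: one attribute per __init__ argument."""
    signature = inspect.signature(type(estimator).__init__)
    names = [name for name in signature.parameters if name != "self"]
    return {name: getattr(estimator, name) for name in names}


class GoodEstimator:  # hypothetical
    def __init__(self, alpha=0.0, max_iter=25):
        self.alpha = alpha
        self.max_iter = max_iter


class BrokenEstimator:  # hypothetical
    def __init__(self, alpha=0.0, max_iter=25):
        self.max_iter = max_iter  # `alpha` is accepted but never stored


print(get_params(GoodEstimator()))

try:
    get_params(BrokenEstimator())
except AttributeError as err:
    print("AttributeError:", err)
```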

## Inspecting the results from ``ITEA_classifier`` and ``ITExpr_classifier``

Now that we have fitted the ITEA, our ``clf`` contains the ``bestsol_`` attribute, which is a fitted instance of ``ITExpr_classifier`` ready to be used.

In [None]:
final_itexpr = clf.bestsol_

final_itexpr.to_str(term_separator='\n')

In [None]:
print(classification_report(
    y_test,
    final_itexpr.predict(X_test),
    target_names=targets
))
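
The report lists precision, recall, and F1-score for each class. As a reminder of what those columns mean, the sketch below recomputes them from raw counts in plain Python. The labels in ``y_true_demo`` and ``y_pred_demo`` are hypothetical and do not come from the fitted model, and ``per_class_scores`` is an illustrative helper, not a scikit-learn function.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed from raw counts."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical labels for illustration only (not the notebook's predictions).
y_true_demo = ["setosa", "setosa", "versicolor", "versicolor",
               "virginica", "virginica"]
y_pred_demo = ["setosa", "setosa", "versicolor", "virginica",
               "virginica", "virginica"]

demo_scores = per_class_scores(y_true_demo, y_pred_demo)
for label, (p, r, f1) in demo_scores.items():
    print(f"{label:<12} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```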

We can use the ``ITExpr_inspector`` to obtain metrics for the IT terms in the expression:

In [None]:
display(pd.DataFrame(
    ITExpr_inspector(
        itexpr=final_itexpr, tfuncs=tfuncs
    ).fit(X_train, y_train).terms_analysis()
))

By using the ``ITExpr_texifier``, we can create formatted LaTeX strings of the final expression and its derivatives.

In [None]:
# The final expression
display(Latex(
    '$ ITExpr = ' + ITExpr_texifier.to_latex(
        final_itexpr,
        term_wrapper=lambda i, term: r'\underbrace{' + term + r'}_{\text{term ' + str(i) + '}}'
    ) + '$'
))

In [None]:
# List containing the partial derivatives
derivatives_latex = ITExpr_texifier.derivatives_to_latex(
    final_itexpr,
    term_wrapper=lambda i, term: r'\underbrace{' + term + r'}_{\text{term ' + str(i) + r' partial derivative}}'
)

# displaying one of its derivatives
display(Latex(
    r'$ \frac{\partial}{\partial ' + labels[0] + '} ITExpr = ' + derivatives_latex[0] + '$'
))
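
For reference, the partial derivative of a single IT term follows from the chain rule. The notation here is an assumption introduced for illustration and does not appear in the notebook: $w_i$ is the weight of term $i$, $t_i$ its transformation function, $k_{ij}$ the exponent of feature $x_j$ in that term, and $p_i(x)$ the interaction (the product of features raised to their exponents).

$$
p_i(x) = \prod_{l=1}^{d} x_l^{k_{il}},
\qquad
\frac{\partial}{\partial x_j}\Big[ w_i \, t_i\big(p_i(x)\big) \Big]
= w_i \, t_i'\big(p_i(x)\big) \, k_{ij} \, \frac{p_i(x)}{x_j}
\quad (x_j \neq 0)
$$

A term in which $x_j$ has exponent $k_{ij} = 0$ therefore contributes nothing to the derivative with respect to $x_j$.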

## Explaining the ``ITExpr_classifier`` expression using Partial Effects

Now let's create an instance of ``ITExpr_explainer``.

We can calculate feature importances with Partial Effects (PE) or approximate the Shapley values using PE.

In [None]:
explainer = ITExpr_explainer(
    itexpr=final_itexpr,
    tfuncs=tfuncs
).fit(X_train, y_train)

explainer.plot_feature_importances(
    X = X_train,
    importance_method = 'pe', # or 'shapley' to approximate Shapley values with PE
    grouping_threshold = 0.0,
    target = None,
    barh_kw = {'edgecolor' : 'k'},
    show = True
)
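
The partial effect of a feature is the partial derivative of the model output with respect to that feature, and one simple importance summary is its mean absolute value over a data set. The sketch below illustrates that idea in plain Python. It swaps in central finite differences for the automatic differentiation used in the notebook, and ``toy_model``, ``X_demo``, and the helper functions are hypothetical. How ``ITExpr_explainer`` aggregates the effects per class is not shown here and may differ.

```python
import math


def toy_model(x):
    """Hypothetical smooth model of two features (not the fitted ITExpr)."""
    return 2.0 * math.log(x[0] * x[1]) - 0.5 * math.sqrt(abs(x[0] / x[1]))


def partial_effect(f, x, j, eps=1e-5):
    """Central finite-difference estimate of df/dx_j at point x."""
    upper = list(x)
    lower = list(x)
    upper[j] += eps
    lower[j] -= eps
    return (f(upper) - f(lower)) / (2 * eps)


def mean_abs_partial_effects(f, X):
    """Average |df/dx_j| over the rows of X, one value per feature."""
    n_features = len(X[0])
    return [
        sum(abs(partial_effect(f, row, j)) for row in X) / len(X)
        for j in range(n_features)
    ]


X_demo = [(1.4, 0.2), (4.7, 1.4), (6.0, 2.5)]  # hypothetical rows
importances = mean_abs_partial_effects(toy_model, X_demo)
print([round(v, 3) for v in importances])
```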

We can explain a single instance as well:

In [None]:
explainer.plot_feature_importances(
    X = X_test[0, :].reshape(1, -1),
    importance_method = 'pe', # or 'shapley' to approximate Shapley values with PE
    grouping_threshold = 0.0,
    target = None,
    barh_kw = {'edgecolor' : 'k'},
    show = True
)

Instead of looking at the average Partial Effects, we can plot the Partial Effects of each variable while the other variables are held fixed at their means.

In [None]:
fig, axs = plt.subplots(1, 4, figsize=(16, 4))

explainer.plot_partial_effects_at_means(
    X          = X_test, # Obtaining explanations for test data 
    ax         = axs,
    features   = final_itexpr.labels,
    target     = None,
    num_points = 100,
    share_y    = False,
    show_err   = False,
    show       = False,
)

plt.tight_layout()
plt.show()
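
The idea behind these curves can be sketched in plain Python: sweep one feature over a grid, hold every other feature at its column mean, and evaluate the partial derivative at each grid point. The example uses central finite differences rather than automatic differentiation, and ``toy_model``, ``X_demo``, and the helper functions are hypothetical. The grid spanning the observed minimum and maximum is also an assumption about how such a plot could be built, not a description of the library's internals.

```python
import math


def toy_model(x):
    """Hypothetical smooth model of two features (not the fitted ITExpr)."""
    return 2.0 * math.log(x[0] * x[1])


def partial_effect(f, x, j, eps=1e-5):
    """Central finite-difference estimate of df/dx_j at point x."""
    upper = list(x)
    lower = list(x)
    upper[j] += eps
    lower[j] -= eps
    return (f(upper) - f(lower)) / (2 * eps)


def partial_effects_at_means(f, X, j, num_points=5):
    """PE of feature j along a grid, other features held at their means."""
    n_rows = len(X)
    means = [sum(row[k] for row in X) / n_rows for k in range(len(X[0]))]
    lo = min(row[j] for row in X)
    hi = max(row[j] for row in X)
    grid = [lo + (hi - lo) * i / (num_points - 1) for i in range(num_points)]
    effects = []
    for value in grid:
        point = list(means)
        point[j] = value  # only feature j moves; the rest stay at the means
        effects.append(partial_effect(f, point, j))
    return grid, effects


X_demo = [(1.4, 0.2), (4.7, 1.4), (6.0, 2.5)]  # hypothetical rows
grid, effects = partial_effects_at_means(toy_model, X_demo, j=0)
for g, e in zip(grid, effects):
    print(f"x0={g:.2f}  PE={e:.4f}")
```

For this toy model the derivative with respect to the first feature is ``2 / x0``, so the curve decreases as the feature grows and does not depend on where the second feature is held.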

We can also share the y-axis and show the error hatches:

In [None]:
fig, axs = plt.subplots(1, 4, figsize=(16, 4))

explainer.plot_partial_effects_at_means(
    X          = X_test, # Obtaining explanations for test data 
    ax         = axs,
    features   = final_itexpr.labels,
    target     = None,
    num_points = 100,
    share_y    = True,
    show_err   = True,
    show       = False,
)

plt.tight_layout()
plt.show()