## <font color='darkblue'>Preface</font>
([source article](https://analyticsindiamag.com/omnixai-a-library-for-explainable-ai/)) <b><font size='3ptx'>Explainable AI refers to strategies and procedures that explain ML solutions.</font></b>

<b>Machine Learning models are frequently seen as black boxes that are impossible to decipher, because the learner is trained to answer “yes” or “no” type questions without explaining how the answer was obtained.</b> An explanation of how an answer was reached is critical in many applications for ensuring confidence and openness. Explainable AI refers to strategies and procedures in the use of artificial intelligence (AI) technology that allow human specialists to understand the solution’s findings. This article focuses on explaining a machine learning model using OmniXAI. The following topics are covered.

### <font color='darkgreen'>Table of contents</font>
1. <font size='3ptx'><b><a href='#sect1'>What is the objective of explainable AI (XAI)?</a></b></font>
2. <font size='3ptx'><b><a href='#sect2'>Classification of explainable AI</a></b></font>
3. <font size='3ptx'><b><a href='#sect3'>Explaining the machine learning model with OmniXAI</a></b></font>

“Explainability” is a need and expectation that increases the transparency of the intrinsic AI model’s “decision.” Let’s take a closer look at explainable AI objectives.

<a id='sect1'></a>
## <font color='darkblue'>What is the objective of explainable AI (XAI)?</font>
<b><font size='3ptx'>The primary goal of XAI is to answer “wh” (why, when, what, how, and so on) questions about an acquired response. XAI can deliver reliability, transparency, confidence, information and fairness.</font></b>

<a id='sect1_1'></a>
### <font color='darkgreen'>Transparency and Information </font>
<b>By presenting a rationale that a layperson can understand, XAI can improve transparency and fairness.</b>

The minimum requirement for a transparent AI model is that it be expressive enough to be intelligible to humans. Transparency is essential for evaluating the performance and rationale of the model. <b>Transparency can expose erroneous training that introduces weaknesses in prediction, which could otherwise result in a large personal loss to the end user</b>. False training may be used to alter the generalisation of any AI/ML model, resulting in unethical gains to some party unless it is made visible.

<a id='sect1_2'></a>
### <font color='darkgreen'>Reliability and confidence </font>
<b>One of the most significant aspects that cause humans to rely on any particular technology is trust.</b>

A logical and scientific rationale for every forecast or conclusion leads people to prefer AI/ML systems’ predictions or conclusions.

<a id='sect1_3'></a>
### <font color='darkgreen'>Fairness</font>
Because of the [bias-variance trade-off](https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff) in AI/ML models, XAI promotes [fairness](https://analyticsindiamag.com/top-8-initiatives-by-large-tech-firms-to-ensure-fairness-in-ai/) and assists in mitigating prediction bias during justification or interpretation.

<a id='sect2'></a>
## <font color='darkblue'>Classification of explainable AI</font>
<font size='3ptx'><b>Explainable AI (XAI) techniques are classified into two major categories: transparent methods and post-hoc methods. Post-hoc methods are further divided based on the data type.</b></font>

<a id='sect2_1'></a>
### <font color='darkgreen'>Post-hoc Methods </font>
<b>Post-hoc approaches are effective for interpreting model complexity when there is a nonlinear connection or increased data complexity. </b> In this scenario, the post-hoc technique is a handy tool for explaining what the model has learnt when the data and features do not follow a clear connection.

The statistical and visualisation-based display of feature summaries underpins result-oriented interpretability techniques. Statistical presentation denotes statistics for each characteristic, with the relevance of each feature measured based on its weight in prediction.

<b>A post-hoc XAI approach takes a trained and/or tested AI model as input and produces intelligible representations of the model’s inner workings and decision logic in the form of feature significance scores, rule sets, heat maps, or plain language</b>. Many post hoc approaches attempt to reveal correlations between feature values and prediction model outputs, regardless of the model’s internals. This assists users in <b>identifying the most relevant characteristics in an ML work, quantifying the value of features, replicating black-box model choices, and identifying biases in the model or data.</b>

Local Interpretable [Model-agnostic Explanations](https://analyticsindiamag.com/why-you-should-try-model-agnostic-solutions-when-the-data-is-vague/) (LIME), for example, <b>extracts feature importance scores by perturbing real samples, observing the change in the ML model’s output given the perturbed instances, and building a local simple model that approximates the original model’s behaviour in the neighbourhood of the original samples</b>. Post-hoc techniques come in two types: model-agnostic and model-specific. Model-specific strategies explain the learning method and internal structure of one particular kind of model, such as a deep learning model, and are limited to it. Model-agnostic approaches instead use pairwise analysis of model inputs and predictions to understand the learning mechanism and give explanations.
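The perturb, query, and fit loop described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the LIME library's implementation: the `black_box` function, the Gaussian perturbation scale, and the exponential kernel width are all assumptions chosen for the example.

```python
import numpy as np

def black_box(X):
    # Hypothetical opaque model (an assumption for this sketch):
    # nonlinear in feature 0, linear in feature 1, independent of feature 2.
    return np.sin(X[:, 0]) + 2.0 * X[:, 1]

def local_surrogate(predict_fn, x, n_samples=500, scale=0.1, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around the instance x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with small Gaussian noise.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Query the black box on the perturbed samples.
    y = predict_fn(Z)
    # 3. Weight each sample by its closeness to x (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares on [1, Z - x]; the slopes act as local importances.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]

x0 = np.array([0.0, 1.0, 5.0])
importance = local_surrogate(black_box, x0)
print(np.round(importance, 2))
```

Near `x0` the surrogate's slopes should come out close to 1, 2, and 0, matching the local behaviour of the toy model: the slope of `sin` at zero, the linear weight, and the unused feature.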

It has been noted that global techniques can explain all data sets, but local approaches are confined to certain types of data sets. Model-agnostic tools, on the other hand, may be utilised with any AI/ML model. In this case, paired examination of inputs and results is critical for interpretability. Model-specific strategies such as feature relevance, condition-based explanations, rule-based learning, and saliency maps are covered in the following sections.

<a id='sect2_2'></a>
### <font color='darkgreen'>Transparent Methods</font>
<b>Transparent methods such as logistic regression, support vector machines, Bayesian classifiers, and K-nearest neighbours offer rationale with feature weights that are local to the user.</b> This category includes models that meet three properties: algorithmic transparency, decomposability, and simulatability.
* **Simulatability** refers to the ability of a model to be simulated by a human. The complexity of the model is significant for human-enabled simulation. A sparse matrix model, for example, is easier to comprehend than a dense matrix model because it is easier for people to rationalise and perceive.
* **Decomposability** refers to the explainability of all aspects of the model, from data input to hyperparameters and intrinsic computations. These features establish a model’s behaviour and performance limits. Complex input characteristics are difficult to comprehend. As a result of these limits, such models do not fall within the category of transparent models.
* **Algorithmic transparency** specifies the interpretability of an algorithm from its input of supplied data to its final judgement or categorisation. The decision-making process should be transparent to users. The linear model, for example, is considered transparent since the error plot is simple to understand and interpret. The user may understand how the model reacts in different situations by using visualisation.

<br/>

The transparent model is realised with the following explainable AI techniques:
* **Linear/Logistic Regression** (LR) is a transparent model for predicting dependent variables that obey the binary variable characteristic. This strategy is based on the assumption of a flexible fit between predictors and predicted variables. The model requires users to be familiar with regression techniques and their working mechanism in order to comprehend [logistic regression](https://analyticsindiamag.com/is-logistic-regression-the-cobol-of-machine-learning/).
* **Decision Trees** are a transparent technique that meets transparency requirements to a large extent. A decision tree is a decision-making tool with a hierarchical structure. Smaller decision trees are simple to simulate. Increasing the number of layers in a tree increases its algorithmic transparency but decreases its simulatability. Ensembles of trained decision trees are used to overcome the poor generalisation of individual trees, but the resulting model is less transparent.
* **K-Nearest Neighbours** (KNN) is a vote-based method that predicts the class of a test sample by voting on the classes of the test sample’s nearest neighbours. KNN voting is based on the distance and similarity of instances. The transparency of KNN is determined by the features, the parameter K, and the distance function used to quantify similarity. A larger value of K makes the model harder for the user to simulate. A complicated distance function limits the model’s decomposability and the transparency of algorithmic execution.
* **A rule-based learning model** specifies rules that are used to train the model. A rule can be defined in simple conditional if-else form or in first-order predicate logic. The format of the rules is determined by the type of knowledge base. This sort of model has two benefits. First, because the rules are written in linguistic terms, a user may easily grasp them. Second, it is more capable of dealing with uncertainty than the traditional rule-based paradigm. The number of rules in the model enhances efficiency without sacrificing the model’s interpretability and transparency.
* **Bayesian models** are probabilistic models that incorporate the concept of conditional dependencies among a collection of dependent and independent variables. The Bayesian model is simple enough for end users who understand conditional probability. Bayesian models satisfy all three properties: decomposability, algorithmic transparency, and human simulatability. Complex variable dependencies may, however, reduce the transparency and simulatability of a Bayesian model.
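To make the idea of a transparent model concrete, here is a minimal KNN sketch in plain Python that returns, along with the prediction, the neighbours that cast the votes. The toy points, labels, and the helper name `knn_predict_with_evidence` are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math

def knn_predict_with_evidence(X_train, y_train, x, k=3):
    """Return the majority label and the neighbours that produced it."""
    # Euclidean distance from the query to every training point.
    dists = [math.dist(row, x) for row in X_train]
    # Indices of the k closest training points.
    nearest = sorted(range(len(X_train)), key=lambda i: dists[i])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(y_train[i] for i in nearest)
    label = votes.most_common(1)[0][0]
    evidence = [(i, y_train[i], round(dists[i], 3)) for i in nearest]
    return label, evidence

# Toy data: two well-separated clusters (illustrative values only).
X_train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_train = ["low", "low", "low", "high", "high", "high"]

label, evidence = knn_predict_with_evidence(X_train, y_train, (0.5, 0.5), k=3)
print(label)
for idx, neighbour_label, dist in evidence:
    print(f"neighbour {idx}: label={neighbour_label}, distance={dist}")
```

With a small K and a simple distance function, a person can check every step by hand, which is the sense in which the model is simulatable. As K grows or the distance function becomes more involved, that manual check quickly becomes impractical.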

<a id='sect3'></a>
## <font color='darkblue'>Explaining the machine learning model with OmniXAI</font>
<b><font size='3ptx'>[OmniXAI](https://github.com/salesforce/OmniXAI) is an open-source explainable AI package that provides omni-way explainability for a wide range of machine learning models.</font></b>

OmniXAI can assess feature correlations and data imbalance concerns in data analysis and exploration, assisting developers in swiftly removing duplicate features and identifying potential bias issues. <b>OmniXAI can find essential features in feature engineering by studying connections between features and targets, assisting users in understanding data aspects, and doing feature preprocessing</b>. OmniXAI provides multiple explanations, such as feature-attribution explanation, counterfactual explanation, and gradient-based explanation, in model training and assessment to completely examine the behaviour of a model created for tabular, vision, NLP, or time-series tasks.

This article will focus on the data analysis, feature selection and explaining the regression model with OmniXAI. <b>For this article the data used is related to music, the top 2000 songs listed by Spotify and the problem is to predict the popularity of songs.</b>

Let’s start by installing [**OmniXAI**](https://github.com/salesforce/OmniXAI).

In [2]:
#!pip install omnixai

Next, import the necessary libraries:

In [27]:
import pandas as pd
import numpy as np
from omnixai.data.tabular import Tabular
from omnixai.explainers.data import DataAnalyzer
from omnixai.preprocessing.base import Identity
from omnixai.preprocessing.encode import LabelEncoder
from omnixai.preprocessing.tabular import TabularTransform
from omnixai.explainers.tabular import TabularExplainer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error,r2_score


### <font color='darkgreen'>Load in data</font>
<b>The developers of omnixai recommend using Tabular to describe a tabular dataset that may be generated from a pandas dataframe or a NumPy array. For downloading the data used here, check [Kaggle - spotify-top-hits-eda](https://www.kaggle.com/code/shreydan/spotify-top-hits-eda/data)</b>

To construct a Tabular instance from a pandas dataframe, the dataframe, category feature names, and target/label column names must be specified. The “omnixai.preprocessing” package contains various helpful preprocessing routines for Tabular data.

In [8]:
raw_data_df = pd.read_csv('../../datas/kaggle_spotify-top-hits-eda/songs_normalize.csv')
refined_data_df = raw_data_df.drop(['artist','song'],axis=1)
refined_data_df['explicit'] = refined_data_df['explicit'].astype(str)
 
print(refined_data_df.shape)
refined_data_df.sample(n=3)

(2000, 16)


| | duration_ms | explicit | year | popularity | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | genre |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 247 | 269000 | False | 2002 | 80 | 0.618 | 0.938 | 9 | -3.442 | 1 | 0.0456 | 0.0179 | 0.0 | 0.167 | 0.875 | 91.455 | rock |
| 1160 | 220706 | False | 2011 | 74 | 0.578 | 0.926 | 6 | -3.689 | 0 | 0.0548 | 0.00472 | 0.0127 | 0.14 | 0.877 | 149.976 | pop |
| 216 | 213973 | False | 2002 | 66 | 0.607 | 0.923 | 1 | -6.777 | 1 | 0.0948 | 0.0193 | 1e-06 | 0.0924 | 0.868 | 184.819 | set() |


In [9]:
tabular_data = Tabular(
    refined_data_df,
    feature_columns=refined_data_df.columns,  # was the undefined name `data_utils`
    categorical_columns=[ 'genre','explicit'],
    target_column='popularity'
)

For data analysis, build an explainer called <b><font color='blue'>DataAnalyzer</font></b>. In <b><font color='blue'>DataAnalyzer</font></b>, the parameter `explainers` gives the names of the analyzers we wish to use, for example, “`correlation`” for feature correlation analysis. In the library, data analysis is classified as a “`global explanation`.” `explain_global` is invoked, with any extra parameters for the specified analyzers, to create explanations.

In [32]:
explainer = DataAnalyzer(
    explainers=["correlation", "mutual", "chi2"],
    data=tabular_data,
    mode="regression"
)
explanations = explainer.explain_global()

OmniXAI uses [**plotly**](https://plotly.com/python/) for plotting, so all the graphs are interactive. Here we plot the correlation plot and some plots related to feature importance.

In [33]:
explanations.keys()

odict_keys(['correlation', 'mutual', 'chi2'])

In [34]:
# Reuse the explanations already computed by explain_global() above
explanations['correlation'].ipython_plot()

In [35]:
explanations['mutual'].ipython_plot()

In [36]:
explanations['chi2'].ipython_plot()
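For intuition about what a feature correlation analysis reports, the sketch below computes a Pearson correlation matrix with pandas on a small synthetic table. The column names echo the Spotify features, but the values are randomly generated for illustration, and OmniXAI's own correlation analyzer may compute or present its scores differently.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
energy = rng.uniform(0, 1, n)

# Synthetic data (not the Spotify dataset): loudness is built to track energy,
# while tempo is drawn independently of both.
toy = pd.DataFrame({
    "energy": energy,
    "loudness": -10 + 8 * energy + rng.normal(0, 0.5, n),
    "tempo": rng.uniform(60, 200, n),
})

# Pairwise Pearson correlation between all numeric columns.
corr = toy.corr(method="pearson")
print(corr.round(2))
```

A pair with a coefficient near 1 or -1, such as `energy` and `loudness` here, carries largely redundant information, which is the kind of signal used to spot duplicate features before training.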

Next, let's build a regression model

In [37]:
transformer = TabularTransform(
    target_transform=Identity()
).fit(tabular_data)

<b><font color='blue'>TabularTransform</font></b> is a transform that is specifically built for tabular data. It transforms categorical features to one-hot encoding by default and retains continuous-valued features. <b><font color='blue'>TabularTransform</font></b>’s `transform` method will convert a Tabular instance into a NumPy array. If the Tabular instance contains a target column, the target will be the final column of the modified NumPy array.
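The effect of such a transform can be imitated with plain pandas to see where the extra columns come from. This is a rough stand-in under stated assumptions, not OmniXAI's implementation: the toy rows are made up, and the column ordering (one-hot columns, then continuous columns, then the target) is assumed for illustration.

```python
import pandas as pd

# Tiny made-up table with two categorical features, one continuous feature,
# and the target column.
toy = pd.DataFrame({
    "genre": ["pop", "rock", "pop"],
    "explicit": ["False", "True", "False"],
    "tempo": [91.5, 150.0, 184.8],
    "popularity": [80, 74, 66],
})
categorical = ["genre", "explicit"]
target = "popularity"

# One-hot encode the categorical columns; each distinct value becomes a column.
encoded = pd.get_dummies(toy[categorical]).astype(float)
# Continuous features are kept unchanged.
continuous = toy.drop(columns=categorical + [target])
# Place the target last, so X[:, :-1] are features and X[:, -1] is the label.
X = pd.concat([encoded, continuous, toy[[target]]], axis=1).to_numpy()

print(X.shape)  # (3, 6): 2 genre + 2 explicit + 1 continuous + 1 target
```

The same mechanism explains why a table with 16 columns can expand to many more after transformation: every distinct genre value gets its own column.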

This article uses the [Gradient Boosting Regressor model from sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html):

In [38]:
X = transformer.transform(tabular_data)

In [39]:
print(X.shape)

(2000, 75)


In [40]:
# The last column of X is the target; all other columns are features
X_train, X_test, y_train, y_test = train_test_split(X[:, :-1], X[:, -1], train_size=0.80)
print('Training data shape: {}'.format(X_train.shape))
print('Test data shape:     {}'.format(X_test.shape))

Training data shape: (1600, 74)
Test data shape:     (400, 74)


### <font color='darkgreen'>Train Model</font>

In [31]:
gb_r = GradientBoostingRegressor()
gb_r.fit(X_train, y_train)
pred=gb_r.predict(X_test)
print("RMSE = ",np.round(np.sqrt(mean_squared_error(y_test,pred)),3))
print("R2_score= ",r2_score(y_test,pred))

RMSE =  20.121
R2_score=  -0.004449769820205951
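When reading these two numbers, it helps to recall how they are defined. The sketch below computes RMSE and R² by hand with NumPy on made-up values (not the Spotify predictions): R² compares the model's squared error with that of always predicting the mean, so a score near zero means the model does about as well as that baseline, and a negative score means it does worse.

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root of the mean squared residual, in the same units as the target.
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    # 1 - (residual sum of squares) / (total sum of squares around the mean).
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only.
y_true = np.array([60.0, 70.0, 80.0, 90.0])
mean_baseline = np.full_like(y_true, y_true.mean())  # always predict the mean
reversed_pred = np.array([90.0, 80.0, 70.0, 60.0])   # systematically wrong

print(r2(y_true, mean_baseline))   # 0.0
print(r2(y_true, reversed_pred))   # -3.0
print(round(rmse(y_true, mean_baseline), 2))
```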


### <font color='darkgreen'>Explain Model</font>
To explain the model’s outcomes, initialise a TabularExplainer. The following parameters need to be defined when initialising it:
* **explainers**: The names of the explainers that will be used. This article makes use of [**lime**](https://github.com/marcotcr/lime), [**shap**](https://shap.readthedocs.io/en/latest/index.html), and [**PDP**](https://scikit-learn.org/stable/modules/partial_dependence.html).
* **data**: The data used to initialise the explainers, typically the training dataset used to train the machine learning model.
* **model**: The machine learning model to explain, in this case, a gradient boosting regressor.
* **preprocess**: The preprocessing function that transforms the Tabular instance into model inputs.
* **mode**: The task type, which is “regression” in this article.

In [41]:
preprocess = lambda z: transformer.transform(z)

explainers = TabularExplainer(
    explainers=["lime", "shap", "pdp"],
    mode="regression",
    data=tabular_data,
    model=gb_r,
    preprocess=preprocess,
    params={
        "lime": {"kernel_width": 3},
        "shap": {"nsamples": 100}
    }
)

Once the explainer is initialised, run it on test instances with the following code:

In [42]:
test_instances = transformer.invert(X_test[0:5])
local_explanations = explainers.explain(X=test_instances)
global_explanations = explainers.explain_global()


Plot the results for visualising the explainability:

In [43]:
index=0
print("LIME results:")
local_explanations["lime"].ipython_plot(index)

LIME results:


In [44]:
print("SHAP results:")
local_explanations["shap"].ipython_plot(index)

SHAP results:
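The attribution idea behind SHAP can be illustrated with exact Shapley values on a three-feature toy model. This sketch is a simplification and not the shap library's algorithm: it enumerates every feature ordering, and it treats a “missing” feature as one set to a fixed baseline value, whereas SHAP averages over background data. The `model` function and the inputs are hypothetical.

```python
from itertools import permutations

def model(x):
    # Hypothetical model: additive in features 0 and 1,
    # plus an interaction between features 1 and 2.
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[1] * x[2]

def shapley_values(f, x, baseline):
    """Average each feature's marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = f(current)
        for i in order:
            current[i] = x[i]      # switch feature i from baseline to its real value
            new = f(current)
            phi[i] += new - prev   # marginal contribution in this ordering
            prev = new
    return [p / len(orders) for p in phi]

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)
print(phi)  # [2.0, 2.5, 1.5]
```

The contributions sum to `model(x) - model(baseline)`, which is 6 here, and the interaction term is split evenly between features 1 and 2. Enumerating orderings grows factorially with the number of features, which is why practical tools rely on sampling or model-specific shortcuts.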


In [45]:
print("PDP results:")
global_explanations["pdp"].ipython_plot(
    features=['duration_ms', 'explicit', 'year', 'danceability',
       'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness',
       'instrumentalness', 'liveness', 'valence', 'tempo', 'genre'])

PDP results:
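A partial dependence curve can be computed by hand: force one feature to each value on a grid for every row, then average the model's predictions. The sketch below does this with NumPy for a hypothetical `model_predict` function on random data; it illustrates the general technique rather than OmniXAI's or scikit-learn's exact procedure.

```python
import numpy as np

def model_predict(X):
    # Hypothetical fitted model: quadratic in feature 0, linear in feature 1.
    return X[:, 0] ** 2 + 0.5 * X[:, 1]

def partial_dependence(predict_fn, X, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value          # overwrite the feature for every row
        curve.append(float(predict_fn(X_mod).mean()))
    return np.array(curve)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))              # illustrative random data
grid = np.linspace(-2, 2, 5)

pd_curve = partial_dependence(model_predict, X, feature=0, grid=grid)
for value, avg in zip(grid, pd_curve):
    print(f"feature0={value:+.1f} -> average prediction {avg:.3f}")
```

The resulting curve follows the quadratic shape of feature 0, shifted by the average effect of the other feature, which is what a PDP plot visualises for each selected feature.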


As observed in the LIME plot, five features (<font color='brown'>instrumentalness, duration, energy, acousticness, and genre</font>) are important and have a positive impact on explaining the result of the learner. Similarly, in the SHAP plot, five features (<font color='brown'>duration, loudness, acousticness, genre, and key</font>) have the most impact on the explanation.

## <font color='darkblue'>Conclusion</font>
The foundation for explainable AI is transparent ML models, which are only partially interpretable by themselves, and post-hoc explainability approaches, which make the model more interpretable. <b>In this article, we covered the objective and classification of explainable AI and applied it in practice with OmniXAI.</b>

### <font color='darkgreen'>References</font>
* [Link to the above code](https://colab.research.google.com/drive/1yuvGsISw9gO7GtcYqbdFEYFZPhRFDWl0?usp=sharing)
* [Read more about explainable AI](https://analyticsindiamag.com/category/developers_corner/?s=explainable+ai)