<a id="top"></a>
# AntakIA tutorial
***
### Using AntakIA - the example of the Student Performance dataset

AntakIA helps you understand and explain your _black-box_ machine-learning models by finding the most relevant way to split your dataset into regions and the best surrogate model to fit on each of these regions. In this notebook, we will show you how to use the automatic dyadic-clustering algorithm of AntakIA.

> For more complete tutorials, please refer to the [AntakIA __with GUI__ tutorial](antakia_CH_gui.ipynb) or the [AntakIA __without GUI__ tutorial](antakia_CH_no_gui.ipynb).
> 
> For more information about AntakIA, please refer to the [AntakIA documentation](https://ai-vidence.github.io/antakia/) or go to [AI-vidence's website](https://ai-vidence.com/).

## Context

__Let's pretend that you are a student who wants to adapt your habits to perform better at school.__ We have a dataset describing the characteristics of more than 10,000 students (e.g. hours of sleep, extracurricular activities, etc.). We also have the current performance of each student. We will first train a machine-learning model (in our case, a simple `GradientBoostingRegressor`) that predicts the performance of a student from their characteristics.

__The main issue is the following:__ we want to tell you how to organize your habits based on the known data. We can't just show you the machine-learning model, because it is a _black-box_ model. We need to find a way to explain the performance of a student from their characteristics. This is where AntakIA comes in handy!
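To make the idea of regions and surrogate models more concrete, here is a minimal pure-Python sketch. It is not AntakIA's algorithm: the `black_box` function, the threshold at 5 hours and the `fit_line` helper are all hypothetical and chosen only for illustration. It shows how a simple, readable model fitted separately on each region can describe an opaque model.

```python
def black_box(hours):
    # Hypothetical opaque model: its behaviour changes at 5 hours of study.
    return 10.0 * hours if hours < 5 else 50.0 + 2.0 * (hours - 5)


def fit_line(xs, ys):
    # Ordinary least squares for one feature: returns (slope, intercept).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Sample the black box on a grid of study hours (0.5, 1.0, ..., 9.0).
hours = [h / 2 for h in range(1, 19)]

# Two hand-picked regions (an assumption; AntakIA proposes regions for you).
regions = {
    "hours < 5": [h for h in hours if h < 5],
    "hours >= 5": [h for h in hours if h >= 5],
}

# Fit one linear surrogate per region on the black-box predictions.
surrogates = {
    name: fit_line(xs, [black_box(x) for x in xs]) for name, xs in regions.items()
}

for name, (slope, intercept) in surrogates.items():
    print(f"{name}: prediction ~ {slope:.2f} * hours + {intercept:.2f}")
```

Each surrogate is easy to read ("one more hour of study adds this many points in this region"), which the black-box model itself is not.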

In this tutorial we'll start from scratch, with no precomputed explanatory values, so that AntakIA computes them itself. The default explanatory values are the SHAP values.
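SHAP values are based on Shapley values, which share the difference between a prediction and a reference prediction among the features. The sketch below computes exact Shapley values by brute force for a tiny model. It is an illustration of the definition only, not how AntakIA or the SHAP library computes them. The `toy_model` function, the instance and the baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def toy_model(hours, previous, sleep):
    # Hypothetical scoring rule, NOT the fitted GradientBoostingRegressor.
    return 2.0 * hours + 0.5 * previous + 0.5 * hours * sleep


def shapley_values(f, instance, baseline):
    # Exact Shapley values: features outside a coalition take the baseline value.
    n = len(instance)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [
                    instance[j] if (j in subset or j == i) else baseline[j]
                    for j in range(n)
                ]
                without_i = [
                    instance[j] if j in subset else baseline[j] for j in range(n)
                ]
                phi[i] += weight * (f(*with_i) - f(*without_i))
    return phi


instance = (6, 80, 8)   # hypothetical student: hours studied, previous score, sleep
baseline = (4, 70, 6)   # hypothetical reference student

phi = shapley_values(toy_model, instance, baseline)

# The contributions add up to the gap between the two predictions.
gap = toy_model(*instance) - toy_model(*baseline)
print(phi, sum(phi), gap)
```

The enumeration grows exponentially with the number of features, which is why practical tools rely on faster, model-specific or approximate methods.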

We start by importing the necessary libraries and the dataset, which you can find in the `data` folder.

In [46]:
import pandas as pd

# Load the dataset and encode the Yes/No column as 1/0
df = pd.read_csv('../data/Student_Performance.csv')
df['extra'] = df['Extracurricular Activities'].map({'Yes': 1, 'No': 0})
df = df.drop('Extracurricular Activities', axis=1)

# Split features and target, then replace spaces in the feature names
X = df.drop(columns='Performance Index')
Y = df['Performance Index']
X.columns = [i.replace(' ', '_') for i in X.columns]
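One detail of the encoding step is worth checking: `Series.map` with a dictionary returns `NaN` for any value that is not a key of the dictionary. The sketch below uses a small made-up frame (the rows are hypothetical, not taken from the CSV) to show how to count such unmapped values before training.

```python
import pandas as pd

# Tiny stand-in for the real CSV; these rows are invented for illustration.
toy = pd.DataFrame({
    "Hours Studied": [2, 5, 8],
    "Extracurricular Activities": ["Yes", "No", "yes"],  # note the lowercase "yes"
})

# Same encoding as in the cell above: values missing from the dict become NaN.
encoded = toy["Extracurricular Activities"].map({"Yes": 1, "No": 0})
n_unmapped = int(encoded.isna().sum())
print(f"{n_unmapped} value(s) could not be encoded")

# Same renaming idea as above, written with the vectorised string accessor.
renamed = list(toy.columns.str.replace(" ", "_"))
print(renamed)
```

If the count is not zero on your own data, the labels may need cleaning (for example with `.str.strip()` or `.str.capitalize()`) before mapping.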

In [47]:
df.sample(5)

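| | Hours Studied | Previous Scores | Sleep Hours | Sample Question Papers Practiced | Performance Index | extra |
|---|---|---|---|---|---|---|
| 8296 | 2 | 71 | 9 | 3 | 47.0 | 0 |
| 840 | 2 | 51 | 6 | 9 | 27.0 | 0 |
| 7032 | 5 | 43 | 5 | 6 | 29.0 | 0 |
| 3162 | 8 | 75 | 8 | 9 | 72.0 | 1 |
| 5822 | 5 | 92 | 8 | 0 | 79.0 | 1 |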


#### **Let's now train the model**

In [48]:
from sklearn.ensemble import GradientBoostingRegressor
model = GradientBoostingRegressor(random_state = 9)
model.fit(X, Y)
print('model fitted')

model fitted
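The model above is fitted on the whole dataset, which is fine when the goal is to explain it. If you also want a rough idea of how well such a model generalizes, a hold-out evaluation is a common check. The sketch below runs on synthetic data (the `X_demo` and `y_demo` arrays are invented, not the student dataset) and only illustrates the workflow.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem: an assumption made for this illustration.
rng = np.random.default_rng(9)
X_demo = rng.uniform(0, 10, size=(500, 3))
y_demo = 3.0 * X_demo[:, 0] + 2.0 * X_demo[:, 1] + rng.normal(0, 1.0, size=500)

# Keep 20% of the rows aside to measure performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=9
)

demo_model = GradientBoostingRegressor(random_state=9).fit(X_train, y_train)
test_r2 = r2_score(y_test, demo_model.predict(X_test))
print(f"R^2 on held-out data: {test_r2:.3f}")
```

The same pattern would apply to `X` and `Y` from this notebook, by passing them to `train_test_split` in place of the synthetic arrays.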


We can now import __antakia__!
We will define all the antakia objects needed to use the user interface. To understand them better, see [this notebook](antakia_CH_gui.ipynb), another example with more details, or [this one](antakia_utils.ipynb), which presents the different objects of the package.

In [49]:
import antakia

In [50]:
dataset = antakia.Dataset(X, model = model, y=Y)
atk = antakia.AntakIA(dataset)

#### **Launching the GUI!**

We can now launch the GUI! We will use it to manually define regions, explore our data and model, or run the automatic dyadic-clustering algorithm.

Before using the GUI, we recommend that you do the following:
- Take a look at its documentation [here](https://ai-vidence.github.io/antakia/documentation/gui/). `GUI` is a specific AntakIA object!
- Read the [User guide](https://ai-vidence.github.io/antakia/usage/) section of the documentation.
- Watch the video tutorials on [AI-vidence's website](https://ai-vidence.com/) or on [YouTube](https://www.youtube.com/@AI-vidence).

In [53]:
atk.startGUI()


## List of useful links

- [AntakIA documentation](https://ai-vidence.github.io/antakia/) - The official documentation of AntakIA
- [AntakIA GitHub repository](https://github.com/AI-vidence/antakia/tree/main) - The GitHub repository of AntakIA. Do not forget to __star__ it if you like it!
- [AntakIA video tutorials](https://www.youtube.com/@AI-vidence) - The YouTube channel of AI-vidence, with video tutorials on AntakIA!
- [AI-vidence's website](https://ai-vidence.com/) - The website of AI-vidence, the company behind AntakIA

[Top of Page](#top)
<img style="float: right;" src="https://raw.githubusercontent.com/AI-vidence/antakia/main/docs/img/Logo-AI-vidence.png" alt="AI-vidence" width="200px"/> 

 ***