## __Statistical and Linguistic Insights for Model Explanation - SLIME__ 
### __Model explainability__
<font size=3>

In [1]:
from slime_nlp.dataset import ImportData
from slime_nlp.model import CustomModel
from slime_nlp.slime import ExplainModel

<font size=3>
    
For model explainability, we use the $\mathtt{ExplainModel}$ object, which provides:
* $\mathtt{explain()}$: computes the attribution scores for a text;
* $\mathtt{model\_prediction()}$: predicts from the fine-tuned $\mathtt{CustomModel}$'s weights;
* $\mathtt{visualize()}$: displays the text's tokens highlighted by their [Integrated Gradients](https://arxiv.org/abs/1703.01365) attributions;
* $\mathtt{attribution\_by\_token()}$: outputs a dataframe with the explainability scores by token.

See the $\mathtt{ExplainModel}$ docstring below.

In [2]:
exp = ExplainModel(model_name="../weights/best_model.pt", n_steps=10)  

print(exp.__doc__)


    # ExplainModel: model explanability tools for data processing and visualization.
    
    Input: (model_name=None, device='cpu', n_steps=50, pretrained_name="google-bert/bert-base-cased")
    -----
    - model_name (str): string with the path and model's name.
    - device (str): select CPU or GPU device for output tensors.
    - n_steps (int): number of steps for Integrated Gradient approximation.
    - pretained_name (str): pretrained model name from huggingface.co repository.
    
    
    Methods:
    -------
    - explain: (text)
      -- text (str): text as string format.
    
      Returns a dictionary with 
      > input_ids (Tensor[int]): sequence of special tokens IDs.
      > token_list (List[str]): of tokens.
      > attributions (Tensor[float]): Integrated Gradient's attribution score by token.
      > delta (Tensor[float]): Integrated Gradient's error metric.
    
    - model_prediction: (input_ids)
      -- input_ids (Tensor): sequence of special tokens IDs.
    
  

In [3]:
# Import dataset:
id = ImportData(path_name="../dataset/adress_all.csv", group_by=["id", "text", "group"], verbose=False)

text = id.train['text'][0]
text

"well the little girl is saying to be uiet to her brother . and her brother's in the cookie jar and he's falling off the chair . the mother's oblivious to all . she's washing her dishes and the water's coming outof her sink . i don't know what she's thinking of . <filler> let's see what we have . anything outside . i don't know . okay . nothing out there . these are okay . the dishes are okay . let's see . how are their shoes . their socks are alright . i don't know"

#### __1. $\mathtt{ExplainModel().explain()}$:__
<font size=3>

Computing the attributions from text.

In [4]:
exp_results = exp.explain(text)
exp_results.keys()

dict_keys(['input_ids', 'token_list', 'attributions', 'delta'])
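
To make the `attributions` and `delta` entries more concrete, here is a minimal sketch of the Integrated Gradients approximation on a toy model. It is not the `slime_nlp` implementation: the toy `model`, its analytic gradient, the all-zeros baseline, and the midpoint Riemann sum are all assumptions chosen for illustration. It shows how `n_steps` controls the approximation and how `delta` measures the gap between the summed attributions and the change in model output.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w):
    """Toy differentiable model (hypothetical): sigmoid of a weighted sum."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def model_grad(x, w):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, n_steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    avg_grad = [0.0] * len(x)
    for k in range(n_steps):
        # Interpolate between the baseline and the input.
        alpha = (k + 0.5) / n_steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_grad(point, w)
        avg_grad = [a + g / n_steps for a, g in zip(avg_grad, grad)]

    # Attribution per feature: (input - baseline) * average gradient.
    attributions = [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

    # Completeness check: attributions should sum to f(x) - f(baseline).
    delta = sum(attributions) - (model(x, w) - model(baseline, w))
    return attributions, delta


w = [0.8, -1.2, 0.5]          # illustrative weights
x = [1.0, 2.0, -1.0]          # illustrative input
baseline = [0.0, 0.0, 0.0]    # assumed all-zeros baseline

for n_steps in (10, 50):
    attributions, delta = integrated_gradients(x, baseline, w, n_steps)
    print(n_steps, [round(a, 4) for a in attributions], f"delta={delta:.2e}")
```

In this sketch, a larger `n_steps` brings `delta` closer to zero at the cost of more gradient evaluations, which is the trade-off behind choosing `n_steps=10` versus the default of 50.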

#### __2. $\mathtt{ExplainModel().model\_prediction()}$:__
<font size=3>

Predicting the condition with the fine-tuned $\mathtt{CustomModel}$.

In [5]:
pred = exp.model_prediction(exp_results['input_ids'])
pred.keys()

dict_keys(['prob', 'class'])
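
As a rough illustration of how a `prob` and `class` pair can arise, the sketch below assumes a binary classifier that emits a single logit, applies a sigmoid, and thresholds at 0.5. None of this is confirmed by the `slime_nlp` docstring: the function `predict_from_logit`, the threshold, and the label mapping are hypothetical, with the mapping inferred only from the control/condition labels printed in this notebook.

```python
import math

# Assumed mapping, inferred from the labels printed in this notebook.
LABELS = {0: "control", 1: "condition"}


def predict_from_logit(logit, threshold=0.5):
    """Hypothetical helper: turn one logit into a probability and a class."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return {"prob": prob, "class": int(prob >= threshold)}


# Illustrative logits only; they are not taken from the fine-tuned model.
for logit in (-6.0, 0.85):
    pred = predict_from_logit(logit)
    print(f"logit={logit:+.2f} -> prob={pred['prob']:.2f}, "
          f"class={LABELS[pred['class']]}")
```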

#### __3. $\mathtt{ExplainModel().visualize()}$:__
<font size=3>

Visualizing the attribution highlights by text's tokens.

In [6]:
data = id.train.iloc[0:3]
exp.visualize(data)

True Label,Predicted Label,Predicted probability,Attribution Score
control,control,0.00,-3.06


True Label,Predicted Label,Predicted probability,Attribution Score
condition,condition,0.70,-1.61


True Label,Predicted Label,Predicted probability,Attribution Score
control,control,0.00,-5.41
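
The token highlights themselves do not survive a plain-text export, so here is a minimal sketch of how per-token attributions could be rendered as coloured HTML. It is an assumption about the general technique, not the code behind $\mathtt{visualize()}$: the helper `highlight_tokens` is hypothetical, and the green-for-positive, red-for-negative colour scheme is a convention chosen for the sketch. The tokens and values are the first five rows shown for text S172 in section 4.

```python
import html


def highlight_tokens(tokens, attributions):
    """Return an HTML string colouring each token by its attribution."""
    # Normalise by the largest magnitude so opacity lies in [0, 1].
    max_abs = max((abs(a) for a in attributions), default=0.0) or 1.0
    spans = []
    for token, attr in zip(tokens, attributions):
        # Green for positive, red for negative (convention of this sketch).
        rgb = "0, 160, 0" if attr >= 0 else "220, 0, 0"
        alpha = abs(attr) / max_abs
        spans.append(
            f'<span style="background-color: rgba({rgb}, {alpha:.2f})">'
            f"{html.escape(token)}</span>"
        )
    return " ".join(spans)


tokens = ["[CLS]", "well", "the", "little", "girl"]
attributions = [0.0, 0.147523, -0.009417, -0.01798, 0.036564]
markup = highlight_tokens(tokens, attributions)
print(markup)
```

In a notebook, the resulting string can be displayed with `IPython.display.HTML(markup)`.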


#### __4. $\mathtt{ExplainModel().attribution\_by\_token()}$:__
<font size=3>

Create a dataframe with the attribution score of each token in the text.

In [7]:
df = exp.attribution_by_token(data, path_name="../dataset/explain_test.csv", return_results=True)
df.head()

Processing: 100.0%

id,condition,group,pred_label,score,attributions,token
S172,control,0,0,-3.061927,0.0,[CLS]
S172,control,0,0,-3.061927,0.147523,well
S172,control,0,0,-3.061927,-0.009417,the
S172,control,0,0,-3.061927,-0.01798,little
S172,control,0,0,-3.061927,0.036564,girl
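
A table like this is typically inspected by ranking tokens by the magnitude of their attribution. The sketch below does that with the standard library only, using the five rows printed above copied verbatim. Dropping special tokens such as `[CLS]` is an assumption about what is useful to inspect, not a step taken by $\mathtt{attribution\_by\_token()}$.

```python
import csv
import io

# The five rows shown in the df.head() output, copied verbatim.
SAMPLE = """id,condition,group,pred_label,score,attributions,token
S172,control,0,0,-3.061927,0.0,[CLS]
S172,control,0,0,-3.061927,0.147523,well
S172,control,0,0,-3.061927,-0.009417,the
S172,control,0,0,-3.061927,-0.01798,little
S172,control,0,0,-3.061927,0.036564,girl
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    row["attributions"] = float(row["attributions"])

# Special tokens carry no linguistic content, so leave them out (assumption).
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}
words = [r for r in rows if r["token"] not in SPECIAL_TOKENS]

# Rank by absolute attribution: the largest contributions come first.
ranked = sorted(words, key=lambda r: abs(r["attributions"]), reverse=True)
for r in ranked:
    print(f'{r["token"]:>8}  {r["attributions"]:+.6f}')
```

The same ranking can be done on the saved CSV at `../dataset/explain_test.csv` by reading it from disk instead of from the inline sample.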
