# Generate Model Interpreter Report with Breast Cancer dataset using Contextual AI

This notebook demonstrates how to generate an explanation report using the compiler implemented in the Contextual AI library.


## Motivation
Once the PoC is done (and you know where your data comes from, what it looks like, and what it can predict), the ideal next step is to put your model into production and make it useful for the rest of the business.

Does this sound familiar? Do you also need to answer the questions below before promoting your model into production?
1. _How sure are you that your model is ready for production?_
2. _How can you explain the model performance in a business context that non-technical management can understand?_
3. _How can you compare newly trained models with existing models when the comparison is done manually every iteration?_

In the Contextual AI project, our vision is simply to:
1. __Speed up data validation__
2. __Simplify model engineering__
3. __Build trust__  
  
For more details, please refer to our [whitepaper](https://sap.sharepoint.com/sites/100454/ML_Apps/Shared%20Documents/Reusable%20Components/Explainability/XAI_Whitepaper.pdf?csf=1&e=phIUNN&cid=771297d7-d488-441a-8a65-dab0305c3f04)

## Steps
1. Create a model to predict breast cancer, using the data provided in the [sklearn dataset](https://scikit-learn.org/stable/datasets/index.html)
2. Evaluate the model performance with Contextual AI report

---

### 1. Perform Model Training

In [1]:
import warnings

from pprint import pprint
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
warnings.filterwarnings("ignore")

#### 1.1 Load the dataset and prepare training and test sets

In [2]:
# Set seed for reproducibility
np.random.seed(123456)

# Load the dataset and prepare training and test sets
raw_data = datasets.load_breast_cancer()
X, y = raw_data['data'], raw_data['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
feature_names = raw_data['feature_names']
target_names_list = list(raw_data['target_names'])

X_train.shape

(455, 30)

#### 1.2 Train a RandomForestClassifier model

In [3]:
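# Instantiate a classifier and train it on the training set
clf = RandomForestClassifier()
clf.fit(X_train, y_train)

# Print the mean accuracy on the held-out test set
print(clf.score(X_test, y_test))

# Keep a handle to the probability-prediction function
clf_fn = clf.predict_proba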

---

### 2. Invoke the Contextual AI compiler

In [4]:
import os
import json
import sys
sys.path.append('../../../')
from xai.compiler.base import Configuration, Controller

#### 2.1 Specify config file

In [5]:
json_config = 'basic-model-interpreter.json'

#### 2.2 Load and Check config file (before rendering)

In [6]:
with open(json_config) as file:
    config = json.load(file)
pprint(config)

{'content_table': True,
 'contents': [{'desc': 'This section provides the Interpretation of model',
               'sections': [{'component': {'_comment': 'refer to document '
                                                       'section xxxx',
                                           'attr': {'feature_names': 'var:feature_names',
                                                    'method': 'default',
                                                    'train_data': 'var:X_train',
                                                    'trained_model': 'var:clf'},
                                           'class': 'FeatureImportanceRanking'},
                             'desc': 'This section provides the analysis on '
                                     'feature',
                             'title': 'Feature Importance Analysis'},
                            {'component': {'_comment': 'refer to document '
                                                       'section xxxx',
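
The config shown above refers to notebook variables through strings such as `'var:feature_names'`. Before handing the config to the compiler, it can help to list every variable it expects. The following is a minimal sketch, not part of the Contextual AI library: the helper name `find_var_references` is hypothetical, and `sample_config` only reproduces the first section visible in the output above.

```python
def find_var_references(node, prefix="var:"):
    """Recursively collect the names used in 'var:<name>' strings of a nested config."""
    found = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= find_var_references(value, prefix)
    elif isinstance(node, list):
        for item in node:
            found |= find_var_references(item, prefix)
    elif isinstance(node, str) and node.startswith(prefix):
        found.add(node[len(prefix):])
    return found


# Fragment copied from the config output above (first section only).
sample_config = {
    "content_table": True,
    "contents": [{
        "desc": "This section provides the Interpretation of model",
        "sections": [{
            "title": "Feature Importance Analysis",
            "desc": "This section provides the analysis on feature",
            "component": {
                "class": "FeatureImportanceRanking",
                "attr": {
                    "feature_names": "var:feature_names",
                    "method": "default",
                    "train_data": "var:X_train",
                    "trained_model": "var:clf",
                },
            },
        }],
    }],
}

referenced = find_var_references(sample_config)
print(sorted(referenced))  # ['X_train', 'clf', 'feature_names']
```

In the notebook, `find_var_references(config) - set(locals())` would then give the names the config references but that have not been defined yet.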
    

#### 2.3 Initialize the compiler controller with config and locals()

In [7]:
controller = Controller(config=Configuration(config, locals()))
pprint(controller.config)

{'content_table': True,
 'contents': [{'desc': 'This section provides the Interpretation of model',
               'sections': [{'component': {'_comment': 'refer to document '
                                                       'section xxxx',
                                           'attr': {'feature_names': array(['mean radius', 'mean texture', 'mean perimeter', 'mean area',
       'mean smoothness', 'mean compactness', 'mean concavity',
       'mean concave points', 'mean symmetry', 'mean fractal dimension',
       'radius error', 'texture error', 'perimeter error', 'area error',
       'smoothness error', 'compactness error', 'concavity error',
       'concave points error', 'symmetry error',
       'fractal dimension error', 'worst radius', 'worst texture',
       'worst perimeter', 'worst area', 'worst smoothness',
       'worst compactness', 'worst concavity', 'worst concave points',
       'worst symmetry', 'worst fractal dimension'], dtype='<U23'),
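
Comparing the two outputs, the string `'var:feature_names'` has been replaced by the actual `feature_names` array taken from `locals()`. The sketch below illustrates that kind of substitution on a small example. It is an assumption about the observable behaviour, not the library's actual implementation, and `resolve_vars` is a hypothetical helper name.

```python
def resolve_vars(node, namespace, prefix="var:"):
    """Return a copy of a nested config with 'var:<name>' strings replaced by namespace[name]."""
    if isinstance(node, dict):
        return {key: resolve_vars(value, namespace, prefix) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_vars(item, namespace, prefix) for item in node]
    if isinstance(node, str) and node.startswith(prefix):
        name = node[len(prefix):]
        if name not in namespace:
            raise KeyError("config references undefined variable: %s" % name)
        return namespace[name]
    # Plain values (e.g. 'default', True) are passed through unchanged.
    return node


# Illustrative stand-ins for the notebook's config attributes and locals().
attr = {"feature_names": "var:feature_names", "method": "default"}
namespace = {"feature_names": ["mean radius", "mean texture"]}

resolved = resolve_vars(attr, namespace)
print(resolved)
```

Passing `locals()` as the namespace is what makes variables defined in earlier cells, such as `X_train` and `clf`, visible to the config.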
                       

#### 2.4 Render report

In [8]:
controller.render()

### Results

In [9]:
pprint("report generated : %s/breastcancer-model-interpreter-report.pdf" % os.getcwd())

('report generated : '
 '/Users/i062308/Development/Explainable_AI/tutorials/compiler/breastcancer/breastcancer-model-interpreter-report.pdf')
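
The cell above prints the expected report path without checking that the file was actually written. A minimal sketch of such a check follows, assuming the report is written to the current working directory under the file name used above. `report_status` is a hypothetical helper, not a Contextual AI function.

```python
import os


def report_status(directory, filename="breastcancer-model-interpreter-report.pdf"):
    """Return a message saying whether the report file exists in the given directory."""
    path = os.path.join(directory, filename)
    if os.path.isfile(path):
        return "report generated : %s (%d bytes)" % (path, os.path.getsize(path))
    return "report not found : %s" % path


print(report_status(os.getcwd()))
```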
