# Explainer Dashboard Random Forest example
This notebook trains a [Random Forest regression model](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html) and displays its performance metrics, predictions, and explanations in the interactive dashboard [explainerdashboard](https://explainerdashboard.readthedocs.io/en/latest/).

- ExplainerDashboard source code: https://github.com/oegedijk/explainerdashboard
- Install it by running: `pip install explainerdashboard`

**>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ricardo M. Domingues [11/23/2020]**

In [1]:
from IPython.core.interactiveshell import InteractiveShell

# Only the names used in this notebook are imported
from sklearn.ensemble import RandomForestRegressor
from explainerdashboard import ExplainerDashboard, RegressionExplainer
from explainerdashboard.datasets import titanic_survive
import pandas as pd
import numpy as np

# Display the result of every expression in a cell, not only the last one
InteractiveShell.ast_node_interactivity = "all"

### Part-1: Importing the **titanic_survive** dataset
This dataset ships with the **explainerdashboard** package and describes individual Titanic passengers: fare paid, class, age, ship deck, etc.\
Most importantly, it records whether each passenger survived: `{0: "Did not survive", 1: "Survived"}`

In [2]:
X_train, y_train, X_test, y_test = titanic_survive()
X_train.head(3)

| Passenger | Fare | Age | PassengerClass | No_of_siblings_plus_spouses_on_board | No_of_parents_plus_children_on_board | Sex_female | Sex_male | Sex_nan | Deck_A | Deck_B | ... | Deck_D | Deck_E | Deck_F | Deck_G | Deck_T | Deck_Unkown | Embarked_Cherbourg | Embarked_Queenstown | Embarked_Southampton | Embarked_Unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Braund, Mr. Owen Harris | 7.25 | 22.0 | 3 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| Heikkinen, Miss. Laina | 7.925 | 26.0 | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| Allen, Mr. William Henry | 8.05 | 35.0 | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |


**In this notebook**, instead of predicting whether a passenger survived, we attempt to predict the **Fare** paid from the other input variables.

In [3]:
# Use Fare as the target variable and drop it from the input features
y_train = pd.Series(data=X_train['Fare'].values)
y_test  = pd.Series(data=X_test['Fare'].values)
X_train = X_train.drop(['Fare'],axis=1)
X_test = X_test.drop(['Fare'],axis=1)
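
The target swap above can be sketched on a toy frame. The values below are invented for illustration and are not taken from the Titanic data, and `split_target` is a hypothetical helper name, not part of the notebook or of any library.

```python
import pandas as pd

def split_target(X, target):
    """Return (features, target) with the target column removed from the features."""
    # Mirrors the notebook: .values drops the original index, so y gets a fresh 0..n-1 index
    y = pd.Series(data=X[target].values)
    return X.drop([target], axis=1), y

# Toy data, invented for illustration only
toy = pd.DataFrame(
    {"Fare": [7.25, 71.28, 8.05], "Age": [22.0, 38.0, 35.0], "PassengerClass": [3, 1, 3]},
    index=["a", "b", "c"],
)

X_toy, y_toy = split_target(toy, "Fare")
print(list(X_toy.columns))
print(y_toy.tolist())
```

Because `.values` discards the passenger-name index, the target is matched to the feature rows by position only, so the row order of the two objects must not change between the split and the fit.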

In [4]:
# Uncomment this block to predict passenger Age instead of Fare
# y_train = pd.Series(data=X_train['Age'].values)
# y_test  = pd.Series(data=X_test['Age'].values)
# X_train = X_train.drop(['Age'],axis=1)
# X_test = X_test.drop(['Age'],axis=1)

# y_train = y_train.replace(-999,value=np.nan)
# y_train = y_train.fillna(y_train.mean())
# y_test = y_test.replace(-999,value=np.nan)
# y_test = y_test.fillna(y_test.mean())
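
The last four commented-out lines replace `-999` and then fill the gaps with the mean. A minimal sketch of that step on toy values, assuming (as the commented code implies) that `-999` is a placeholder for a missing age:

```python
import numpy as np
import pandas as pd

# Toy ages, invented for illustration; -999 stands in for a missing value
ages = pd.Series([22.0, -999.0, 35.0, -999.0, 30.0])

ages = ages.replace(-999, np.nan)  # turn the placeholder into a real missing value
ages = ages.fillna(ages.mean())    # mean() skips NaN, so it averages 22, 35 and 30

print(ages.tolist())
```

Replacing the placeholder before computing the mean matters: averaging with `-999` still in the series would drag the fill value far below any real age.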

### Part-2: Creating and training a **Random Forest** regression model

In [5]:
model = RandomForestRegressor()
model.fit(X_train, y_train)

RandomForestRegressor(bootstrap=True, ccp_alpha=0.0, criterion='mse',
                      max_depth=None, max_features='auto', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=100, n_jobs=None, oob_score=False,
                      random_state=None, verbose=0, warm_start=False)
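
Before opening the dashboard, a quick numeric check of a fitted model on held-out data can be useful. The sketch below uses synthetic data so that it runs on its own; the data, the names `X_demo`, `y_demo` and `demo_model`, and the choice of metrics are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

# Synthetic regression problem: the target depends linearly on the first feature plus noise
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(0, 0.5, size=200)

# Simple positional split: first 150 rows to fit, last 50 rows held out
X_fit, X_hold = X_demo[:150], X_demo[150:]
y_fit, y_hold = y_demo[:150], y_demo[150:]

# random_state is fixed only to make the sketch repeatable
demo_model = RandomForestRegressor(n_estimators=100, random_state=0)
demo_model.fit(X_fit, y_fit)
pred = demo_model.predict(X_hold)

mae = mean_absolute_error(y_hold, pred)
r2 = r2_score(y_hold, pred)
print(f"MAE: {mae:.2f}, R^2: {r2:.2f}")
```

In the notebook itself, the same two metric calls would be applied to `model.predict(X_test)` and `y_test`.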

### Part-3: Launching the **ExplainerDashboard** 

In [6]:
explainer = RegressionExplainer(model, X_test, y_test, target='Fare')
db = ExplainerDashboard(explainer, title="Predicting Passenger Fare", mode='inline', width=1000, height=1000)
db.run(8053)  # run on port 8053

Note: shap=='guess' so guessing for RandomForestRegressor shap='tree'...
Generating self.shap_explainer = shap.TreeExplainer(model)
Changing class type to RandomForestRegressionExplainer...
Building ExplainerDashboard..
Generating ShadowDecTree for each individual decision tree...
Generating layout...
Calculating shap values...
Calculating predictions...
Calculating residuals...
Calculating absolute residuals...
Calculating dependencies...
Calculating importances...
Calculating shap interaction values...
Registering callbacks...


### Final Remarks

**note #1:** To open the dashboard in a separate browser tab, use the code below:

`db = ExplainerDashboard(explainer, title="Predicting Passenger Fare")`\
`db.run(8053) # running on a specific port 8053`

The dashboard will then be accessible at http://localhost:8053/

**note #2:** The dashboard may be launched using a single line of code, as below:

`ExplainerDashboard(RegressionExplainer(RandomForestRegressor().fit(X_train, y_train), X_test, y_test)).run()`