# Capturing Deep Learning Models with Scrybe
This notebook builds deep learning models using Keras. It builds on the previous tutorial to show the additional information Scrybe captures for deep learning models. In addition to what we saw in the previous tutorial, Scrybe automatically captures: 
* Model architecture
* Layer parameters/configuration
* Training and validation metrics (loss/accuracy) per epoch

In addition, this tutorial shows how to use Scrybe's custom loggers for logging metrics and variable importance. 

We are using data from the House Price Prediction challenge for this tutorial.

## Scrybe Installation

*Skip if Scrybe package is already installed*

The Scrybe Python package is hosted on a private pip server protected by a username and password. When you signed up with Scrybe, you should have received a username and password for installing the package. 

In the following cell, replace `username` and `password` with the provided username and password. 

----

> If an incorrect username or password is provided, the command will **wait/hang**, asking for a username. In that case, stop the execution from **Kernel &rarr; Interrupt**, fix the username/password, and rerun.

In [None]:
pip install --extra-index-url http://username:password@15.206.48.113:80/simple/ --trusted-host 15.206.48.113 --upgrade scrybe

## Scrybe Initialization

You need to `import scrybe` at the beginning of your notebook or Python script and initialize it using your access key. You can find the access key on the Scrybe dashboard.

In addition, as in the previous example, we use `scrybe.set_label` to tag this experiment. Here we use the same version string ("v2") with a different experiment identifier ("DeepNets").

> If you are using Scrybe on-premise, change `host_url` to point to your deployment. 

In [1]:
import scrybe
scrybe.init(project_name="Sample Project", user_access_key='aa0e0c5c-3138-45b8-9db5-1fb51b536836', host_url='3.6.105.91:5001')
scrybe.set_label(["v2", "DeepNets"])
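
Hard-coding the access key in a shared notebook exposes it to everyone who can read the file. A minimal sketch, assuming you export the key yourself under environment variables named `SCRYBE_ACCESS_KEY` and `SCRYBE_HOST_URL` (both names are hypothetical and not defined by Scrybe), builds the same keyword arguments that `scrybe.init` receives above. It also assumes `host_url` may be left out when you are not running on-premise, which the source does not confirm.

```python
import os


def scrybe_init_kwargs(project_name, env=None):
    """Build keyword arguments for scrybe.init from environment variables.

    SCRYBE_ACCESS_KEY and SCRYBE_HOST_URL are hypothetical names chosen for
    this sketch; Scrybe itself does not define them.
    """
    env = os.environ if env is None else env
    access_key = env.get("SCRYBE_ACCESS_KEY")
    if not access_key:
        raise RuntimeError("Set SCRYBE_ACCESS_KEY before initializing Scrybe")
    kwargs = {"project_name": project_name, "user_access_key": access_key}
    # Assumption: host_url is only needed for on-premise deployments.
    host_url = env.get("SCRYBE_HOST_URL")
    if host_url:
        kwargs["host_url"] = host_url
    return kwargs
```

In the notebook this would be used as `scrybe.init(**scrybe_init_kwargs("Sample Project"))`.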

## Model Training
You are now fully set up with Scrybe experiment tracking. Beyond this point, Scrybe will automatically: 

* Capture any models that are trained 
* Track model predictions and log metrics computed on them
* Print a URL for each model, which you can share with your team to view and comment on. 

We will use the same pre-transformed train/test datasets which were used for the traditional algorithms. 

In [2]:
import keras
import numpy as np
import pandas as pd
import shap
import tensorflow as tf

from eli5.permutation_importance import get_score_importances
from keras.models import Sequential
from keras.layers import Dense

Using TensorFlow backend.
The sklearn.metrics.scorer module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.metrics. Anything that cannot be imported from sklearn.metrics is now part of the private API.
The sklearn.feature_selection.base module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.feature_selection. Anything that cannot be imported from sklearn.feature_selection is now part of the private API.


In [3]:
train_set = pd.read_csv('https://raw.githubusercontent.com/scrybe-ml/tutorials/master/data/train_set.csv')
test_set = pd.read_csv('https://raw.githubusercontent.com/scrybe-ml/tutorials/master/data/test_set.csv')

y = train_set['target'].copy()
del train_set['target']
y_test = test_set['target']
del test_set['target']

In [4]:
activation = 'relu'

model = Sequential()
model.add(Dense(256, activation=activation, input_dim=len(train_set.columns)))
model.add(Dense(128, activation=activation))
model.add(Dense(64, activation=activation))
model.add(Dense(32, activation='tanh'))
model.add(Dense(1))

model.compile(loss=keras.losses.mean_squared_error,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['mse', 'mae'])
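# Note: validation_data reuses the training set here, so the validation
# metrics are not a held-out estimate of generalization.
model.fit(x=train_set, y=y, batch_size=64, epochs=10, validation_data=(train_set, y))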

Train on 1112 samples, validate on 1112 samples
Scrybe dashboard URL for model:NeuralNetwork: http://dashboard.scrybe.ml/#/dashboard/projects/61/models/a9cf5d0f-b1bc-48f7-96f2-15b73fc7330b?client_id=true
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.callbacks.History at 0x160339240>
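
`model.fit` returns a Keras `History` object whose `history` attribute is a dictionary mapping each metric name (for example `loss` and `val_loss`) to a list with one value per epoch. A small sketch for picking the best epoch from such a dictionary follows; `best_epoch` is a hypothetical helper, and the numbers are made up for illustration rather than taken from this notebook's run.

```python
def best_epoch(history, metric="val_loss"):
    """Return (1-based epoch, value) at which `metric` is lowest.

    `history` is a dict of per-epoch lists, the shape of `History.history`
    in Keras.
    """
    values = history[metric]
    if not values:
        raise ValueError("no epochs recorded for %r" % metric)
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Illustrative values only, not results from the model above.
example = {"loss": [0.9, 0.6, 0.5], "val_loss": [0.8, 0.55, 0.6]}
print(best_epoch(example))  # (2, 0.55)
```

To apply this to the model above, keep the return value of the training call, as in `history = model.fit(...)`, and pass `history.history` to the helper.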

In [5]:
def r2_keras(y_true, y_pred):
    SS_res = np.sum(np.square(y_true - y_pred))
    SS_tot = np.sum(np.square(y_true - np.mean(y_true)))
    return 1 - SS_res / (SS_tot + np.finfo(float).eps)


y_pred = model.predict(test_set)
scores = model.evaluate(test_set, y_test)
r2_score = r2_keras(y_test.values.reshape(len(y_test), 1), y_pred)

print("Evaluation scores: ", scores)
print("R2 Score: ", r2_score)

Evaluation scores:  [0.43402960810729924, 0.43402957916259766, 0.47858715057373047]
R2 Score:  0.5659703774822484
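
The `r2_keras` helper computes the coefficient of determination, 1 - SS_res / SS_tot. A dependency-free sketch of the same formula on toy data shows its two reference points: a perfect prediction scores 1, and always predicting the mean of the targets scores 0. The toy values are illustrative only.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
print(r2(y_true, y_true))                # 1.0
print(r2(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0
```

Unlike `r2_keras`, this sketch omits the epsilon guard in the denominator, so it requires targets that are not all equal. The reshape in the cell above matters for a related reason: `model.predict` returns an array of shape `(n, 1)`, and subtracting it from a target of shape `(n,)` would broadcast to an `(n, n)` matrix instead of an element-wise difference.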


## Scrybe Custom Loggers
Scrybe provides a logger API to log/attach any information to a model that has not been captured automatically. The following subsections give two such examples.

### scrybe.log_custom_model_evaluation_metric
As the name suggests, you can use this API to attach any custom metric to a model. In the following cell, we are using this to attach three metrics: `mean_squared_error`, `mean_absolute_error` and `r2_score` to our Keras model. 

See the full documentation here: [log_custom_model_evaluation_metric](https://scrybe-teams.readthedocs.io/en/latest/api.html#scrybe.log_custom_model_evaluation_metric). 

In [6]:
scrybe.log_custom_model_evaluation_metric(model=model, x_test=test_set, y_test=y_test,
                                          param_name="mean_squared_error", param_value=scores[1])
scrybe.log_custom_model_evaluation_metric(model=model, x_test=test_set, y_test=y_test,
                                          param_name="mean_absolute_error", param_value=scores[2])
scrybe.log_custom_model_evaluation_metric(model=model, x_test=test_set, y_test=y_test,
                                          param_name="r2_score",
                                          param_value=r2_score)
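
The three calls above differ only in the metric name and value, so they can be driven from a dictionary. The sketch below takes the logging function as an argument, a design choice of this example so that it can be tried without a Scrybe connection, and forwards the same keyword arguments used above. `log_metrics` is a hypothetical helper and not part of Scrybe.

```python
def log_metrics(log_fn, model, x_test, y_test, metrics):
    """Call `log_fn` once per entry in `metrics` (name -> value).

    `log_fn` is expected to accept the keyword arguments used in the cell
    above: model, x_test, y_test, param_name, param_value.
    """
    for name, value in metrics.items():
        log_fn(model=model, x_test=x_test, y_test=y_test,
               param_name=name, param_value=value)
    return list(metrics)  # names that were logged, in insertion order


# In this notebook the call would look like:
# log_metrics(scrybe.log_custom_model_evaluation_metric, model, test_set, y_test,
#             {"mean_squared_error": scores[1],
#              "mean_absolute_error": scores[2],
#              "r2_score": r2_score})
```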

### scrybe.log_feature_importances
For certain algorithm types that compute variable importance internally, Scrybe automatically logs these values and attaches them to the model. If you want to override the default variable importance, or attach variable importance computed by a different algorithm, you can use this API to do so. In the following example, we use [Permutation Importance from ELI5](https://eli5.readthedocs.io/en/latest/blackbox/permutation_importance.html#permutation-importance) to compute feature importance for the Keras model and log it using this API. 


See the full documentation here: [log_feature_importances](https://scrybe-teams.readthedocs.io/en/latest/api.html). 

In [7]:
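def score(X, y):
    # ELI5 expects a higher-is-better score, so return the negative loss (MSE).
    scores = model.evaluate(X, y)
    return -scores[0]

# score_decreases holds one array of per-feature score drops per shuffle iteration.
base_score, score_decreases = get_score_importances(score, test_set.values, y_test.values)
# Average over iterations to get a single importance value per feature.
feature_importance_values = np.mean(score_decreases, axis=0)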

feature_importances = dict(zip(train_set.columns, feature_importance_values))
scrybe.log_feature_importances(model=model, feature_importances=feature_importances)
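
Permutation importance measures how much a score drops when one feature column is shuffled, which breaks that feature's relationship with the target. The following dependency-free sketch illustrates the idea on a toy scorer. It is not ELI5's implementation, and `permutation_importances`, `neg_mse`, and the toy data are hypothetical names and values introduced here.

```python
import random


def permutation_importances(score_fn, X, y, n_iter=5, seed=0):
    """Mean drop in score_fn(X, y) after shuffling each column of X.

    X is a list of rows (lists); score_fn returns a higher-is-better score.
    """
    rng = random.Random(seed)
    base = score_fn(X, y)
    n_features = len(X[0])
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_iter):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Copy every row, replacing only feature j with shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(base - score_fn(X_perm, y))
        importances.append(sum(drops) / n_iter)
    return base, importances


# Toy "model": predicts the first feature and ignores the second.
def neg_mse(X, y):
    return -sum((row[0] - t) ** 2 for row, t in zip(X, y)) / len(y)


X = [[float(i), float(i % 3)] for i in range(20)]
y = [row[0] for row in X]
base, imps = permutation_importances(neg_mse, X, y)
```

The second feature's importance is exactly zero because the toy scorer never reads it, while the first feature's importance is positive. The sign convention matches the cell above, where `score` returns the negative loss so that a larger value is better.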

















