# Speed Insights Demo Notebook

This notebook demonstrates the basic usage of SpeedInsights.

The package currently provides the following functionality:
- Generate visualisations of the input data
- Generate metrics for all of the models
- Generate visualisations of the predictions for each model
- Select rows of data for which the predictions are outliers


In [28]:
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

from speed_insights.speed_insights import SpeedInsights

import pandas as pd

## Load data and fit models

In [2]:
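# Load the diabetes regression dataset as NumPy arrays (features, target)
x, y = load_diabetes(return_X_y=True)

# Wrap both in DataFrames; feature columns are labelled 0..9
x = pd.DataFrame(x)
y = pd.DataFrame(y)

# Hold out 20% of the rows for evaluation; no random_state is set,
# so the split (and every number below) changes from run to run
train_x, test_x, train_y, test_y = train_test_split(x, y, train_size=0.8)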

In [29]:
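# Two regressors to compare, both with default hyperparameters
model_linear = LinearRegression()
model_tree = DecisionTreeRegressor()

# Fit both on the training split only
model_linear.fit(train_x, train_y)
model_tree.fit(train_x, train_y)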

## Initialize and run SpeedInsights

In [21]:
sins = SpeedInsights(X=test_x, y=test_y, models={'linear': model_linear,
                                                'tree': model_tree})

In [22]:
sins.generate_feature_visualisations('images/notebook_test/')

In [23]:
sins.generate_metrics()

| model | mae | mse | r2 | rmse |
|---|---|---|---|---|
| linear | 40.397711 | 2515.201934 | 0.489565 | 7.081793 |
| tree | 62.741573 | 6165.280899 | -0.251181 | 8.861112 |
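
For reference, the four metric columns can be reproduced by hand. The sketch below uses only the standard library and the textbook definitions of MAE, MSE, R² and RMSE; `regression_metrics` is a hypothetical helper, not part of SpeedInsights, and the sample values are made up for illustration. Under these definitions `rmse` is the square root of `mse`. The `rmse` values in the table above do not satisfy that relation (7.081793² ≈ 50.15, which is the square root of 2515.20), so the package may compute that column differently.

```python
import math

def regression_metrics(y_true, y_pred):
    """Hypothetical helper: textbook regression metrics for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error

    # R^2 = 1 - SS_res / SS_tot (undefined when y_true is constant)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot

    return {"mae": mae, "mse": mse, "r2": r2, "rmse": math.sqrt(mse)}

# Illustrative values only, not taken from the diabetes dataset
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 11.0])
print(metrics)
```

A negative `r2`, as reported for the tree model, means the model's squared error on the test rows is larger than that of always predicting the mean of `test_y`.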


In [24]:
sins.generate_prediction_visualisations('images/notebook_test/')

In [27]:
sins.select_outlier_predictions(z_threshold=2)

[332, 321, 135]


| index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | reason |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 332 | 0.030811 | -0.044642 | 0.104809 | 0.076958 | -0.011201 | -0.011335 | -0.058127 | 0.034309 | 0.057108 | 0.036201 | [linear] |
| 321 | 0.096197 | -0.044642 | 0.051996 | 0.079265 | 0.054845 | 0.036577 | -0.076536 | 0.141322 | 0.098648 | 0.061054 | [linear] |
| 135 | -0.005515 | -0.044642 | 0.056307 | -0.036656 | -0.048351 | -0.042963 | -0.072854 | 0.037999 | 0.050782 | 0.056912 | [tree] |
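
The notebook does not show how `select_outlier_predictions` decides which rows to return. One plausible reading of `z_threshold` is a z-score cut-off on each model's residuals, with the `reason` column listing the models that flagged the row. The sketch below illustrates that idea under exactly this assumption; `zscore_outliers` and the sample values are hypothetical and may not match what the package does.

```python
import statistics

def zscore_outliers(y_true, predictions, z_threshold=2.0):
    """Hypothetical sketch: map row index -> names of models whose residual
    for that row lies more than z_threshold standard deviations from the
    mean residual of that model."""
    reasons = {}
    for name, y_pred in predictions.items():
        residuals = [t - p for t, p in zip(y_true, y_pred)]
        mean = statistics.fmean(residuals)
        std = statistics.pstdev(residuals)  # population standard deviation
        if std == 0:
            continue  # all residuals identical, so nothing stands out
        for i, r in enumerate(residuals):
            if abs(r - mean) / std > z_threshold:
                reasons.setdefault(i, []).append(name)
    return reasons

# Illustrative values only: each model is badly wrong on exactly one row
y_true = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0, 12.0, 9.0, 10.0, 50.0]
predictions = {
    "linear": [10.5, 11.5, 11.0, 9.5, 10.0, 10.5, 12.5, 9.0, 10.5, 20.0],
    "tree":   [10.0, 12.0, 11.0, 29.0, 10.0, 11.0, 12.0, 9.0, 10.0, 50.0],
}
outliers = zscore_outliers(y_true, predictions, z_threshold=2)
print(outliers)
```

Under this reading, lowering `z_threshold` flags more rows and raising it flags fewer.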
