Here, I demonstrate how to use predeval with a model producing continuous outputs. This example uses the [diabetes dataset from sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html#sklearn.datasets.load_diabetes). I start by loading this dataset and using the first 300 samples to train a linear regression model.

In [1]:
from sklearn.linear_model import LinearRegression
from sklearn import datasets
import numpy as np
from numpy.random import seed

# load data
diabetes = datasets.load_diabetes()
all_x = diabetes.data
all_y = diabetes.target

# shuffle data
indices = np.arange(all_x.shape[0])
seed(1234)
np.random.shuffle(indices)

# create training set
X = all_x[indices[:300], :]
Y = all_y[indices[:300]]

# train model
linreg = LinearRegression()
linreg.fit(X, Y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)
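
The training cell above takes the first 300 entries of a shuffled index array. As a small sketch of how such a permutation divides the dataset's 442 samples into disjoint training and held-out index sets, the snippet below uses only the standard library. `random.Random` stands in for `np.random.shuffle`, so the permutation will not match the notebook's, and the names `train_idx` and `held_out_idx` are illustrative rather than part of the original code.

```python
import random

N_SAMPLES = 442   # number of rows in the diabetes dataset
N_TRAIN = 300     # training-set size used above

# Shuffle the row indices with a fixed seed. random.Random is a
# standard-library stand-in for np.random.shuffle (an assumption made
# for portability), so the permutation differs from the notebook's.
indices = list(range(N_SAMPLES))
random.Random(1234).shuffle(indices)

train_idx = indices[:N_TRAIN]      # rows used for fitting
held_out_idx = indices[N_TRAIN:]   # rows the model never sees

# The two index sets are disjoint and together cover every row.
overlap = set(train_idx) & set(held_out_idx)
print(len(train_idx), len(held_out_idx), len(overlap))  # prints: 300 142 0
```

With NumPy arrays, the corresponding selections would be `all_x[indices[:300], :]` for training and `all_x[indices[300:], :]` for the held-out rows.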

Now that we have a trained model, we can use its output to create a `ContinuousEvaluator` object. We start by importing the `ContinuousEvaluator` class and instantiating an object, `ce`, with the model's predictions on the training data. Predeval uses these predictions to form expectations about what future model outputs should look like.

In [2]:
from predeval import ContinuousEvaluator

# give model output from training data to ContinuousEvaluator object
model_output = linreg.predict(X)
ce = ContinuousEvaluator(model_output)

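In a real setting, we would give the model completely new data and pass its new predictions into `ce`. Instead, we will pass rows 300 onward of the dataset (in its original order) to the model. Because the training rows were chosen after shuffling, some of these rows were also used for training, so this is only a stand-in for truly new data. We will then ask predeval to compare these new outputs to the old outputs.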

In [3]:
# give validation data to ContinuousEvaluator object
new_model_output = linreg.predict(all_x[300:, :])
ce.check_data(new_model_output)

Passed min check; min observed=52.1812
Passed max check; max observed=288.6014
Passed mean check; mean observed=159.6063 (Expected 152.7400 +- 104.0139)
Passed std check; std observed=53.4388 (Expected 52.0070 +- 26.0035)
Passed ks check; test statistic=0.0937, p=0.3497


[('min', True), ('max', True), ('mean', True), ('std', True), ('ks', True)]

As expected, predeval found no difference between the two sets of model outputs: both come from the same dataset, so all five checks pass.
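
To make the summary-statistic checks more concrete, here is a minimal sketch of comparisons that would produce results of this shape. The thresholds are assumptions inferred from the printed output (the mean tolerance of 104.0139 is twice the expected std of 52.0070, and the std tolerance of 26.0035 is half of it), and the function names `summarize` and `check_outputs` are hypothetical. This is not predeval's implementation, and it leaves out the KS check.

```python
import statistics

def summarize(reference):
    """Store the summary statistics that later checks compare against."""
    return {
        "min": min(reference),
        "max": max(reference),
        "mean": statistics.fmean(reference),
        # Population std is an assumption (it matches NumPy's default).
        "std": statistics.pstdev(reference),
    }

def check_outputs(expected, new):
    """Compare new outputs with stored expectations; return (name, passed) pairs."""
    new_mean = statistics.fmean(new)
    new_std = statistics.pstdev(new)
    return [
        # Assumed rule: new values must stay inside the reference range.
        ("min", min(new) >= expected["min"]),
        ("max", max(new) <= expected["max"]),
        # Assumed tolerances: 2 std for the mean, 0.5 std for the std.
        ("mean", abs(new_mean - expected["mean"]) <= 2 * expected["std"]),
        ("std", abs(new_std - expected["std"]) <= 0.5 * expected["std"]),
    ]

reference = [100.0, 120.0, 150.0, 180.0, 200.0]
similar = [110.0, 140.0, 150.0, 160.0, 190.0]
shifted = [value - 500 for value in reference]  # same spread, far lower level

expected = summarize(reference)
print(check_outputs(expected, similar))
# [('min', True), ('max', True), ('mean', True), ('std', True)]
print(check_outputs(expected, shifted))
# [('min', False), ('max', True), ('mean', False), ('std', True)]
```

Under these assumed rules, a large downward shift fails the min and mean checks while the max and std checks still pass, since the values stay below the reference maximum and their spread is unchanged.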

Now suppose that one of our features goes bad: the first feature becomes entirely populated by 100s. We don't catch this mistake and keep feeding the bad data to the model.

In [4]:
# corrupt the first feature of the validation data and give the resulting predictions to ContinuousEvaluator
all_x[300:, 0] = 100
new_model_output_bad = linreg.predict(all_x[300:, :])
ce.check_data(new_model_output_bad)

Failed min check; min observed=-2796.7910
Passed max check; max observed=-2557.8884
Failed mean check; mean observed=-2687.6767 (Expected 152.7400 +- 104.0139)
Passed std check; std observed=53.8577 (Expected 52.0070 +- 26.0035)
Failed ks check; test statistic=1.0000, p=0.0000


[('min', False), ('max', True), ('mean', False), ('std', True), ('ks', False)]
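
One way to read this output, offered as a back-of-the-envelope sketch rather than a statement about predeval: for a linear model, overwriting feature 0 with the constant 100 changes each prediction by the coefficient on that feature times the change in the feature. Here $\beta_0$ denotes that fitted coefficient and $x_{i0}$ the original value of feature 0 in row $i$; both symbols are introduced here for illustration.

$$
\hat{y}_i^{\text{bad}} = \hat{y}_i + \beta_0 \, (100 - x_{i0})
$$

Assuming the original values $x_{i0}$ are tiny compared with 100 (the diabetes features are mean-centered and scaled to a small range), the shift is nearly the same for every row, roughly $100\,\beta_0$. Comparing the two observed means then gives

$$
\beta_0 \approx \frac{-2687.68 - 159.61}{100} \approx -28.5
$$

A nearly constant shift moves the min, max, and mean together but leaves the spread almost untouched, which is consistent with the std check still passing (53.8577 against 53.4388 before the corruption).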

In this case, predeval detects a change in the model's output: the min, mean, and ks checks all fail. This is an overly dramatic example, but hopefully predeval can help you find more subtle changes.
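
The ks check above reports a test statistic of 1.0000 for the corrupted data. As a rough illustration of what a two-sample Kolmogorov-Smirnov statistic measures, the sketch below computes the largest gap between two empirical cumulative distribution functions using only the standard library. The function name `ks_statistic` is hypothetical, and whether predeval computes the statistic this way is an assumption. In practice, `scipy.stats.ks_2samp` returns both the statistic and a p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        # Step past every observation equal to the next smallest value,
        # so ties are consumed before the two CDFs are compared.
        value = min(a[i], b[j])
        while i < len(a) and a[i] == value:
            i += 1
        while j < len(b) and b[j] == value:
            j += 1
        gap = abs(i / len(a) - j / len(b))
        largest_gap = max(largest_gap, gap)
    return largest_gap

reference = list(range(100))                    # stand-in for training outputs
slight_drift = [x + 10 for x in reference]      # small shift, heavy overlap
no_overlap = [x - 3000 for x in reference]      # every value below the reference

print(round(ks_statistic(reference, slight_drift), 2))  # prints: 0.1
print(ks_statistic(reference, no_overlap))              # prints: 1.0
```

A statistic of 1.0 means the two samples do not overlap at all, as with the corrupted feature above. Smaller values indicate subtler drift, and whether they count as a failure depends on the sample sizes and the chosen significance level.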