## Scikit-Learn Linear Regression

The following example will:

* Set the input and output schemas.
* Upload a SKLearn Linear Regression model to Wallaroo.
* Deploy a pipeline with the uploaded SKLearn model as a pipeline step.
* Perform a test inference.
* Undeploy the pipeline.

In [1]:
import json
import os
import pickle

import wallaroo
from wallaroo.pipeline import Pipeline
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework

import pyarrow as pa
import numpy as np
import pandas as pd

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

wl = wallaroo.Client(auth_type="sso", interactive=True)

In [2]:
def get_workspace(name):
    # Return the workspace with the given name, creating it if it does not exist.
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

In [3]:
workspace = get_workspace("sklearn-linear-regression-jch")
wl.set_current_workspace(workspace)

{'name': 'sklearn-linear-regression-jch', 'id': 38, 'archived': False, 'created_by': '3cc9e92a-fa3c-4371-a7a7-487884df059e', 'created_at': '2023-06-15T20:29:41.875451+00:00', 'models': [{'name': 'logreg-test', 'versions': 2, 'owner_id': '""', 'last_update_time': datetime.datetime(2023, 6, 15, 20, 38, 36, 627471, tzinfo=tzutc()), 'created_at': datetime.datetime(2023, 6, 15, 20, 34, 36, 55449, tzinfo=tzutc())}], 'pipelines': [{'name': 'sklearn-linear-regression-pipeline', 'create_time': datetime.datetime(2023, 6, 15, 20, 35, 36, 302646, tzinfo=tzutc()), 'definition': '[]'}]}

## Data & Model Creation

In [4]:
input_schema = pa.schema([
    pa.field('age', pa.float64()),
    pa.field('sex', pa.float64()),
    pa.field('bmi', pa.float64()),
    pa.field('bp', pa.float64()),
    pa.field('s1', pa.float64()),
    pa.field('s2', pa.float64()),
    pa.field('s3', pa.float64()),
    pa.field('s4', pa.float64()),
    pa.field('s5', pa.float64()),
    pa.field('s6', pa.float64()),
])


output_schema = pa.schema([
    pa.field('output', pa.float64())
])
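
The notebook imports `load_diabetes` and `pickle` but does not show how the uploaded `.pkl` file was produced. The following is a minimal sketch of one way such a file could be created, assuming a plain `LinearRegression` fit on the full scikit-learn diabetes dataset. It is not necessarily how the uploaded model was trained, and the output path (a temporary directory) and the file name are hypothetical. The dataset's ten feature names match the fields of the input schema above.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Load the diabetes dataset: ten numeric features and one numeric target.
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

# Fit an ordinary least squares model on all ten features.
# Assumption: no train/test split or preprocessing is applied here.
linreg = LinearRegression()
linreg.fit(X, y)

# Hypothetical output location; the notebook uploads its model from 'models/'.
model_path = Path(tempfile.mkdtemp()) / "sklearn_linreg_diabetes.pkl"
with open(model_path, "wb") as f:
    pickle.dump(linreg, f)

# Reload the pickle to confirm that it round-trips.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print(diabetes.feature_names)
print(restored.coef_.shape)
```

The single numeric prediction per row is what the `output` field of the output schema describes.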

## Upload model

In [5]:
model = wl.upload_model(
    'sklearn-linear-regression',
    'models/model-auto-conversion_sklearn_linreg_diabetes.pkl',
    framework=Framework.SKLEARN,
    input_schema=input_schema,
    output_schema=output_schema,
)
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting.Pending conversion..Converting.......Ready.


{'name': 'sklearn-linear-regression', 'version': 'aa2e61de-37a0-4cef-80c0-4ebf84db4792', 'file_name': 'model-auto-conversion_sklearn_linreg_diabetes.pkl', 'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3367', 'last_update_time': datetime.datetime(2023, 6, 15, 21, 43, 54, 104374, tzinfo=tzutc())}

## Configure model and pipeline

In [6]:
deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()

In [7]:
pipeline_name = "sklearn-linear-regression-pipeline"
pipeline = wl.build_pipeline(pipeline_name)
pipeline.add_model_step(model)

pipeline.deploy(deployment_config=deployment_config)
pipeline.status()

Waiting for deployment - this will take up to 90s .......... ok


{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.0.226',
   'name': 'engine-7b49c46d6f-k4lbd',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'sklearn-linear-regression-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'sklearn-linear-regression',
      'version': 'aa2e61de-37a0-4cef-80c0-4ebf84db4792',
      'sha': '6a9085e2d65bf0379934651d2272d3c6c4e020e36030933d85df3a8d15135a45',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.0.225',
   'name': 'engine-lb-584f54c899-p4dg8',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': [{'ip': '10.244.1.50',
   'name': 'engine-sidekick-sklearn-linear-regression-34-f59dfbf54-h9fq9',
   'status': 'Running',
   'reason': None,
   'details': [],
   'statuses': '\n'}]}

## Inference

In [8]:
pipeline.infer_from_file('data/test_linear_regression_data.json')

| | time | in.age | in.bmi | in.bp | in.s1 | in.s2 | in.s3 | in.s4 | in.s5 | in.s6 | in.sex | out.output | check_failures |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2023-06-15 21:44:07.809 | 0.038076 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019907 | -0.017646 | 0.05068 | 134.307976 | 0 |
| 1 | 2023-06-15 21:44:07.809 | -0.001882 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.068332 | -0.092204 | -0.044642 | 110.346352 | 0 |
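
The contents of `data/test_linear_regression_data.json` are not shown. As a rough sketch, an input file with the same two rows could be built as follows, assuming the file is a records-oriented JSON list with one object per row, keyed by the input schema's field names. That layout is an assumption, the output path is hypothetical, and the values are copied from the `in.*` columns of the result above, so they are rounded to six decimal places.

```python
import json
import tempfile
from pathlib import Path

# Field names in the same order as the input schema.
FEATURES = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]

# Values taken from the in.* columns of the inference result, reordered
# to match FEATURES (the result table lists in.sex last).
rows = [
    [0.038076, 0.05068, 0.061696, 0.021872, -0.044223,
     -0.034821, -0.043401, -0.002592, 0.019907, -0.017646],
    [-0.001882, -0.044642, -0.051474, -0.026328, -0.008449,
     -0.019163, 0.074412, -0.039493, -0.068332, -0.092204],
]

# One JSON object per input row, keyed by field name (assumed layout).
records = [dict(zip(FEATURES, row)) for row in rows]

# Hypothetical location; the notebook reads its file from 'data/'.
input_path = Path(tempfile.mkdtemp()) / "test_linear_regression_data.json"
input_path.write_text(json.dumps(records, indent=2))

# Read the file back to confirm that it parses and has the expected shape.
loaded = json.loads(input_path.read_text())
print(len(loaded), list(loaded[0].keys()))
```

Each input row yields one row in the result, with the prediction in `out.output`.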


In [9]:
pipeline.undeploy()

Waiting for undeployment - this will take up to 45s ...................................... ok


| | |
|---|---|
| name | sklearn-linear-regression-pipeline |
| created | 2023-06-15 20:35:36.302646+00:00 |
| last_updated | 2023-06-15 21:43:57.437733+00:00 |
| deployed | False |
| tags | |
| versions | c003392e-c8ac-4fcc-a6f2-cc2bf18d5723, 47bf2265-66c2-49bb-86a2-a21616c3bb09, 60ee888a-7cc9-4840-ab75-435df2a0ac28, f0cd44a3-7c42-4859-a940-633f79bc5334, 75c36ec0-929c-43a8-abb7-d8faedaf03c4, 275399ea-dcc0-4639-ab89-24e0ef20b933 |
| steps | logreg-test |
