# Ebonite Tutorial

Ebonite is a framework for managing machine learning models and their lifecycle.
One of its main features is building services from ML models. Ebonite can also reliably persist models to a database of your choice.

## Installing requirements

In [46]:
! pip install -U ebonite flask flasgger==0.9.3 scikit-learn

Requirement already up-to-date: ebonite in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (0.5.0)
Requirement already up-to-date: flask in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (1.1.2)
Requirement already up-to-date: flasgger==0.9.3 in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (0.9.3)
Requirement already up-to-date: scikit-learn in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (0.22.2.post1)
Requirement not upgraded as not directly required: pyjackson==0.0.25 in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (from ebonite) (0.0.25)
Requirement not upgraded as not directly required: GitPython==3.0.3 in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (from ebonite) (3.0.3)
Requirement not upgraded as not directly required: Jinja2==2.10.1 in /Users/mike0sv/miniconda3/envs/py36/lib/python3.6/site-packages (from ebonite) (2.10.1)
Requirement not upgraded as not directly requir

## Train a model

This is the part where you train your model as you usually do. It can be any type of model from a supported framework ([list of supported frameworks](https://github.com/zyfra/ebonite#supported-libraries-and-repositories)).
If your framework is not supported, you can use any Python function as a model, or you can [implement](https://ebonite.readthedocs.io/en/latest/usage/04_adding_custom_analyzers.html) support for your framework.

In this example we will train a simple linear regression model from the sklearn library.

In [1]:
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

lr = LinearRegression()
lr.fit(X, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

Now we have an `lr` object holding the trained model.

## Create ebonite Model from model object

In [2]:
import ebonite

Let's use the `create_model` function to turn our `lr` object into an ebonite `Model` object.

In [3]:
model = ebonite.create_model(lr, X, model_name='diabetes_model_1')
model

Model(id=None,name=diabetes_model_1)

As you may have noticed, we also provide sample data when creating the `Model`.
Ebonite needs it to determine the input and output data types that the model consumes and produces.
Using this information, ebonite automatically provides valid interfaces and data serializers.
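
To make the idea concrete, here is a minimal sketch of inferring an input signature from sample data. It is not ebonite's implementation: `describe_sample` is a hypothetical helper that works on plain nested lists rather than numpy arrays. It records the number of columns and the element type, and leaves the number of rows open.

```python
def describe_sample(rows):
    """Infer a column count and element type from a non-empty list of rows.

    Hypothetical helper for illustration only; not part of ebonite.
    """
    if not rows:
        raise ValueError("sample data must contain at least one row")
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError("all rows must have the same length")
    kinds = {type(value).__name__ for row in rows for value in row}
    if len(kinds) != 1:
        raise ValueError("all values must share one type")
    # The row count is left as None: any number of rows is acceptable.
    return {"shape": (None, widths.pop()), "dtype": kinds.pop()}


signature = describe_sample([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(signature)
```

A signature of this kind is enough to validate incoming requests and to pick a serializer, which is why a small data sample has to accompany the model object.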

Now let's see what metadata ebonite got from the `lr` object.

In [4]:
from pprint import pprint
from pyjackson import serialize

pprint(serialize(model))

{'author': 'mike0sv',
 'creation_date': '2020-06-17 20:31:29.872267 ',
 'name': 'diabetes_model_1',
 'params': {'python_version': '3.6.8'},
 'requirements': {'requirements': [{'module': 'sklearn',
                                    'type': 'installable',
                                    'version': '0.22.2'},
                                   {'module': 'numpy',
                                    'type': 'installable',
                                    'version': '1.18.2'}]},
 'wrapper_meta': {'type': 'ebonite.ext.sklearn.model.SklearnModelWrapper'}}


We can see that ebonite determined the type of the model (an sklearn model) and its requirements: sklearn for the model and numpy for the data.
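
For a rough idea of how such a requirement entry could be assembled by hand, the sketch below looks up installed versions with the standard library's `importlib.metadata` (Python 3.8+). The helpers `installed_version` and `requirement_entry` are hypothetical and only mimic the shape of the output above; how ebonite actually discovers requirements is not shown here.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(distribution):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def requirement_entry(module, distribution=None):
    """Build a dict shaped like the requirement entries printed above."""
    return {
        "module": module,
        "type": "installable",
        "version": installed_version(distribution or module),
    }


# The import name and the distribution name can differ (sklearn vs scikit-learn).
print(requirement_entry("sklearn", distribution="scikit-learn"))
```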

## Use ebonite client to push Model to repository

Now we can save the `Model` to a repository. For this example we will use a local repository, which stores artifacts and metadata in the `.ebonite` directory on the local filesystem.
In production you can instead use different combinations of repositories for metadata and artifacts, for example a PostgreSQL database for metadata and an S3 bucket for artifacts.

In [5]:
ebnt = ebonite.Ebonite.local(clear=True)
task = ebnt.get_or_create_task('my_prj', 'diabetes_task')
task.push_model(model)

Model(id=0,name=diabetes_model_1)

We create a `Task` object named `diabetes_task` to store our model; the task itself is stored in a `Project` object named `my_prj`.
Projects and Tasks add structure to your repository, so you can use one ebonite instance for all the different problems you are working on.
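
The project, task and model hierarchy can be pictured with a toy in-memory registry. This is only an illustration of the structure, assuming that ids are handed out sequentially on push; `ToyModel`, `ToyTask` and `ToyRepository` are hypothetical names and do not reflect ebonite's internals.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyModel:
    name: str
    id: Optional[int] = None  # assigned when the model is pushed


@dataclass
class ToyTask:
    name: str
    models: Dict[str, ToyModel] = field(default_factory=dict)


@dataclass
class ToyRepository:
    # project name -> task name -> task
    projects: Dict[str, Dict[str, ToyTask]] = field(default_factory=dict)
    next_model_id: int = 0

    def get_or_create_task(self, project, task):
        """Return the existing task or create it (and its project) on demand."""
        tasks = self.projects.setdefault(project, {})
        return tasks.setdefault(task, ToyTask(task))

    def push_model(self, task, model):
        """Store the model under the task and give it the next free id."""
        model.id = self.next_model_id
        self.next_model_id += 1
        task.models[model.name] = model
        return model


repo = ToyRepository()
toy_task = repo.get_or_create_task("my_prj", "diabetes_task")
toy_model = repo.push_model(toy_task, ToyModel("diabetes_model_1"))
print(toy_model.id, sorted(toy_task.models))
```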

In [6]:
model.id

0

Now that we have pushed our model, it has an `id` attribute. It can be used to load the model from the repository, although the model name can also be used for this.

In [7]:
model = ebnt.get_model('diabetes_model_1', task)

In [8]:
model

Model(id=0,name=diabetes_model_1)

In [9]:
pprint(serialize(model))

{'artifact': {'blobs': {'methods.json': {'path': '/Users/mike0sv/PycharmProjects/zyfra/ebonite/ebonite/examples/notebook_tutorial/.ebonite/artifacts/0/methods.json',
                                         'type': 'local_file'},
                        'model.pkl': {'path': '/Users/mike0sv/PycharmProjects/zyfra/ebonite/ebonite/examples/notebook_tutorial/.ebonite/artifacts/0/model.pkl',
                                      'type': 'local_file'},
                        'requirements.json': {'path': '/Users/mike0sv/PycharmProjects/zyfra/ebonite/ebonite/examples/notebook_tutorial/.ebonite/artifacts/0/requirements.json',
                                              'type': 'local_file'}},
              'type': 'blobs'},
 'author': 'mike0sv',
 'creation_date': '2020-06-17 20:31:29.872267 ',
 'id': 0,
 'name': 'diabetes_model_1',
 'params': {'python_version': '3.6.8'},
 'requirements': {'requirements': [{'module': 'sklearn',
                                    'type': 'installable',
     

When we push a `Model` to a repository, we save not only its metadata but also the model's binary artifacts (i.e. the files that contain the actual model dump).
Those artifacts appear in the metadata as the `artifact` attribute of our model. Among the local files listed here are `model.pkl`, which is the pickled model, and `methods.json`, which holds metadata about the available methods (`predict` and/or `predict_proba`, for example).
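
The general idea of file artifacts can be sketched with the standard library alone. The example below writes a pickled stand-in model and a JSON description of its methods into a temporary directory and reads them back. The dictionary used as a "model" and the layout of `toy_methods` are assumptions made for illustration; the real contents of ebonite's `methods.json` are not shown in this tutorial.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model: just its learned parameters.
toy_model = {"coef": [0.5, -1.25], "intercept": 2.0}
toy_methods = {"predict": {"args": ["vector"]}}  # hypothetical layout

with tempfile.TemporaryDirectory() as tmp:
    artifact_dir = Path(tmp)
    (artifact_dir / "model.pkl").write_bytes(pickle.dumps(toy_model))
    (artifact_dir / "methods.json").write_text(json.dumps(toy_methods))

    # Loading the artifacts back yields objects equal to the originals.
    restored_model = pickle.loads((artifact_dir / "model.pkl").read_bytes())
    restored_methods = json.loads((artifact_dir / "methods.json").read_text())
    saved_files = sorted(p.name for p in artifact_dir.iterdir())

print(saved_files)
```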

## Serving model with Flask Server

We can also use a `Model` object to create services. For this example we will use the built-in Flask server, but you can implement any type of server your system needs.


N.B. Running a server in Jupyter is a bad idea; we do it here for demo purposes only.
The `run_model_server` function is mainly meant for debugging your server or model;
in production we encourage you to build docker images with your servers.


In [10]:
from ebonite.runtime import run_model_server

In [11]:
from ebonite.ext.flask.server import FlaskServer
run_model_server(model, FlaskServer())

2020-06-17 23:31:55,073 [INFO] ebonite_runtime: Starting Ebonite runtime with loader DummyLoader and server FlaskServer ...
2020-06-17 23:31:55,074 [INFO] ebonite_runtime: Running server <ebonite.ext.flask.server.FlaskServer object at 0x1240045c0>
 * Serving Flask app "ebonite.ext.flask.server" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off


 * Running on http://0.0.0.0:9000/ (Press CTRL+C to quit)
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET / HTTP/1.1" 302 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /apidocs HTTP/1.1" 308 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /apidocs/ HTTP/1.1" 200 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /flasgger_static/swagger-ui-standalone-preset.js HTTP/1.1" 200 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /flasgger_static/lib/jquery.min.js HTTP/1.1" 200 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /flasgger_static/swagger-ui.css HTTP/1.1" 200 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /flasgger_static/swagger-ui-bundle.js HTTP/1.1" 200 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /flasgger_static/favicon-32x32.png HTTP/1.1" 200 -
127.0.0.1 - - [17/Jun/2020 23:31:57] "GET /apispec_1.json HTTP/1.1" 200 -


Now you can check out the OpenAPI (formerly Swagger) UI [here](http://localhost:9000/apidocs).
You can even send some test requests there.
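
The UI is driven by a machine-readable spec; the server log above shows it being requested at `/apispec_1.json`. As a sketch, the spec can also be inspected from code. Because the server blocks the notebook kernel, `fetch_spec` would have to be called from a separate Python process. The helper names and the `example_spec` fragment are made up for illustration; the actual paths exposed by the server are not listed in this tutorial.

```python
import json
from urllib.request import urlopen


def fetch_spec(url="http://localhost:9000/apispec_1.json", timeout=5):
    """Download and parse the spec; requires the server to be running."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def list_operations(spec):
    """Return sorted (METHOD, path) pairs from a Swagger/OpenAPI document."""
    http_methods = {"get", "put", "post", "delete", "patch", "head", "options"}
    return sorted(
        (method.upper(), path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method in http_methods
    )


# Made-up spec fragment so the helper can be tried without a server.
example_spec = {"paths": {"/predict": {"post": {}}, "/health": {"get": {}}}}
print(list_operations(example_spec))
```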

Before you continue, don't forget to stop the server by interrupting the kernel (double-tap the 'I' key).

## Building and running docker container with Model

The best way to deploy your model, however, is to create a docker image with the model and run it on your infrastructure.
You'll need docker up and running on your machine for this to work.

In [12]:
image = ebnt.create_image(model, 'nb_example_diabetes', builder_args={'force_overwrite': True})

2020-06-17 23:33:20,228 [INFO] ebonite: Skipped building image zyfraai/flask:3.6.8: already exists
2020-06-17 23:34:17,003 [INFO] ebonite: Built docker image nb_example_diabetes:latest


In [13]:
image

Image(id=0,name=nb_example_diabetes)

Now we can run our docker image right from code.

In [14]:
instance = ebnt.create_instance(image, 'nb_example_diabetes', port_mapping={9000:80}).run(detach=True)

In [15]:
instance.is_running()

True

Here is the link to the same [OpenAPI UI](http://localhost:80/apidocs)
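
A freshly started container may need a moment before it accepts connections. One way to handle this, sketched below with the standard library only, is to poll the mapped port until a TCP connection succeeds. `wait_for_port` is a hypothetical helper and not part of ebonite.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (needs the container from the previous cells):
# wait_for_port("localhost", 80)
```

Note that an open port only shows that the process is listening; it does not guarantee that the model has finished loading.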

Data about images and instances is also persisted, so we can load and manage them later.

## Sending requests to service

The built-in Flask server also provides an [endpoint](http://localhost:80/interface.json) describing its interface, and we can create a client from it.

In [16]:
from ebonite.ext.flask.client import HTTPClient

In [17]:
client = HTTPClient('localhost', 80)

Now we can send requests to our service using the same data types the underlying model needs. The client will handle serialization for us.
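
To illustrate what "handling serialization" means, here is a minimal sketch that turns a 2-D list into a JSON request body and parses a JSON response. The wire format is an assumption made for this example (a JSON object keyed by the argument name `vector`); the format ebonite's client really uses is not documented in this tutorial, and `encode_request` and `decode_response` are hypothetical helpers.

```python
import json


def encode_request(vector):
    """Serialize a 2-D list of floats into a JSON request body."""
    return json.dumps({"vector": vector})


def decode_response(body):
    """Parse a JSON response body back into Python objects."""
    return json.loads(body)


payload = encode_request([[0.0, 1.0, 2.0]])
print(payload)
```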

In [18]:
client.methods['predict']

Method(name='predict', args=[Argument(name='vector', type=<class 'pyjackson.generics.NumpyNdarrayDatasetType[shape=(None, 10),dtype=float64]'>)], out_type=<class 'pyjackson.generics.NumpyNdarrayDatasetType[shape=(None,),dtype=float64]'>)

In [19]:
import numpy as np

client.predict(np.array([[0., 1., 2., 3., 4., 5., 6., .7, .8, .9]]))

array([2532.25644396])

Finally, we can stop the running instance through our ebonite client; this also deletes the metadata about it.

In [20]:
ebnt.delete_instance(instance)

In [21]:
instance.is_running()

False

## Python function example

Sometimes you need to do some pre- or postprocessing on the data. In this case you can create a Python function with your logic and use it as a model.
Or perhaps a plain Python function with a few ifs IS your model.

In [22]:
def is_bad(data):
    preds = lr.predict(data)
    return preds > 150

Let's repeat the same steps for this function.

In [23]:
model2 = ebonite.create_model(is_bad, X, model_name='diabetes_model_2')
task.push_model(model2)

Model(id=1,name=diabetes_model_2)

In [24]:
pprint(serialize(model2))

{'artifact': {'blobs': {'methods.json': {'path': '/Users/mike0sv/PycharmProjects/zyfra/ebonite/ebonite/examples/notebook_tutorial/.ebonite/artifacts/1/methods.json',
                                         'type': 'local_file'},
                        'model.pkl': {'path': '/Users/mike0sv/PycharmProjects/zyfra/ebonite/ebonite/examples/notebook_tutorial/.ebonite/artifacts/1/model.pkl',
                                      'type': 'local_file'},
                        'requirements.json': {'path': '/Users/mike0sv/PycharmProjects/zyfra/ebonite/ebonite/examples/notebook_tutorial/.ebonite/artifacts/1/requirements.json',
                                              'type': 'local_file'}},
              'type': 'blobs'},
 'author': 'mike0sv',
 'creation_date': '2020-06-17 20:39:02.825511 ',
 'id': 1,
 'name': 'diabetes_model_2',
 'params': {'python_version': '3.6.8'},
 'requirements': {'requirements': [{'module': 'numpy',
                                    'type': 'installable',
       

Ebonite still got all the requirements right.
Let's create and run a service. Note that you don't actually need to save the model to do this.

In [25]:
image2 = ebnt.create_image(model2, 'nb_example_diabetes2', builder_args={'force_overwrite': True})

2020-06-17 23:39:19,148 [INFO] ebonite: Skipped building image zyfraai/flask:3.6.8: already exists
2020-06-17 23:40:08,196 [INFO] ebonite: Built docker image nb_example_diabetes2:latest


In [26]:
instance2 = ebnt.create_instance(image2, 'nb_example_diabetes2', port_mapping={9000: 81}).run(detach=True)

And send some requests.

In [27]:
client2 = HTTPClient('localhost', 81)

In [28]:
client2.methods['predict']

Method(name='predict', args=[Argument(name='vector', type=<class 'pyjackson.generics.NumpyNdarrayDatasetType[shape=(None, 10),dtype=float64]'>)], out_type=<class 'pyjackson.generics.NumpyNdarrayDatasetType[shape=(None,),dtype=bool]'>)

Note that `out_type` changed to a numpy array of dtype `bool`.
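
The boolean output follows from the post-processing step: comparing each prediction with a threshold yields a truth value. A dependency-free sketch of the same wrapping pattern is shown below; `make_flagger` and `toy_predict` are hypothetical stand-ins for `is_bad` and `lr.predict`, working on plain lists instead of numpy arrays.

```python
def make_flagger(predict, threshold):
    """Wrap `predict` so that it returns one boolean per prediction."""
    def flag(rows):
        return [value > threshold for value in predict(rows)]
    return flag


def toy_predict(rows):
    """Stand-in regressor: the prediction is the sum of each row."""
    return [sum(row) for row in rows]


is_bad_toy = make_flagger(toy_predict, threshold=150)
print(is_bad_toy([[100, 60], [10, 20]]))
```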

In [29]:
client2.predict(np.array([[0., 1., 2., 3., 4., 5., 6., .7, .8, .9]]))

array([ True])

In [30]:
ebnt.delete_instance(instance2)

In [31]:
instance2.is_running()

False