# MLOps 

This notebook helps you get started managing machine learning models and deployments in the H2O AI Cloud using Python.

* **Product Documentation:** https://docs.h2o.ai/mlops/
* **Python Documentation:** https://docs.h2o.ai/mlops/py-client-installing/

## Prerequisites
This tutorial relies on the Steam SDK (1.8.11 at the time of writing), which can be installed into a Python environment as follows:

1. In the H2O AI Cloud, click `My AI Engines` and then `Python client` to download the wheel file.
2. Navigate to the location where the Python client was downloaded and install it using `pip install h2osteam-1.8.11-py2.py3-none-any.whl`.

We require the `h2o_authn` library for securely connecting to the H2O AI Cloud platform: `pip install h2o_authn`. The code below also imports `h2o_mlops_client`; see the Python documentation linked above for installation instructions.

We also set the following variables to connect to a specific H2O AI Cloud environment. They can be found by logging into the platform, clicking on your name, and choosing the `CLI & API Access` page. Then, copy values from the `Accessing H2O AI Cloud APIs` section.

In [2]:
CLIENT_ID = "q8s-internal-platform"
TOKEN_ENDPOINT = "https://auth.demo.h2o.ai/auth/realms/q8s-internal/protocol/openid-connect/token"
REFRESH_TOKEN = "https://cloud-internal.h2o.ai/auth/get-platform-token"

H2O_MLOPS_GATEWAY = "https://mlops-api.cloud-internal.h2o.ai"
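
Hard-coding these values is fine for a demo, but a notebook that runs against more than one environment may prefer to read them from outside the code. A minimal sketch, assuming the values are exported as environment variables. The variable names and the `load_settings` helper are hypothetical and are not defined by the platform:

```python
import os


def load_settings(env=None):
    """Read connection settings from a mapping (defaults to os.environ).

    The variable names below are illustrative assumptions,
    not conventions defined by H2O AI Cloud.
    """
    env = os.environ if env is None else env
    names = ("H2O_CLIENT_ID", "H2O_TOKEN_ENDPOINT", "H2O_MLOPS_GATEWAY")
    # Fail early with a clear message if anything is unset or empty
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError(f"Missing settings: {', '.join(missing)}")
    return {name: env[name] for name in names}
```

With this in place, `load_settings()["H2O_MLOPS_GATEWAY"]` could stand in for the hard-coded gateway URL.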

In [54]:
import requests
import json
from getpass import getpass

import h2o_authn
import h2o_mlops_client

import pandas as pd

## Securely connect to the platform
We first connect to the H2O AI Cloud using our personal access token to create a token provider object. We can then use this object to authenticate with MLOps and other H2O AI Cloud APIs.

In [3]:
print(f"Visit {REFRESH_TOKEN} to get your personal access token")
tp = h2o_authn.TokenProvider(
    refresh_token=getpass("Enter your access token: "),
    client_id=CLIENT_ID,
    token_endpoint_url=TOKEN_ENDPOINT
)

Visit https://cloud-internal.h2o.ai/auth/get-platform-token to get your personal access token
Enter your access token: ········


## Connect to MLOps

In [6]:
mlops = h2o_mlops_client.Client(
    gateway_url=H2O_MLOPS_GATEWAY,
    token_provider=tp,
)

## Projects

### Create a project

In [12]:
mlops.storage.project.create_project({
    'project': {
        'created_time': None,
        'description': "Demo project for a tutorial",
        'display_name': "my_test_project",
        'id': None,
        'last_modified_time': None,
        'owner_id': None
    }
})

{'project': {'created_time': datetime.datetime(2022, 4, 14, 0, 55, 28, 831332, tzinfo=tzutc()),
             'description': 'Demo project for a tutorial',
             'display_name': 'my_test_project',
             'id': 'a53460b8-ef51-4940-8d6f-2a9ee1559413',
             'last_modified_time': datetime.datetime(2022, 4, 14, 0, 55, 28, 831332, tzinfo=tzutc()),
             'owner_id': '8e1c1f5d-4f7c-4385-96e5-373a6c7f01b8'}}

### List all projects you have access to

In [14]:
my_projects = mlops.storage.project.list_projects(body={
    'filter': None, 
    'paging': None, 
    'sorting': None
}).project

for p in my_projects:
    print(p.id, p.display_name)

4f3e354f-2881-4205-90df-a14733f7b012 ddeloy - test - 3 - link exp
8da2ca70-5f56-41cf-a49a-299ce6656cca Customer Churn
96898320-c062-4c9b-8ffc-167f14daeab3 Demo Diabetes
a53460b8-ef51-4940-8d6f-2a9ee1559413 my_test_project
d3f8a6dd-ae5a-42a4-ac2a-98dcc064e0b5 Credit Card Risk
f4c890fb-a073-499e-9d17-806e7119b89b DAI 1.10.0
fed67454-a829-4cb8-84f1-600171e5115f DAI 1.10.1 Test


### Select a specific project to work with
We have previously created this project using Driverless AI and added models to it.

In [15]:
USE_CASE_PROJECT = mlops.storage.project.get_project(body={
    'project_id': '8da2ca70-5f56-41cf-a49a-299ce6656cca'
})
USE_CASE_PROJECT

{'project': {'created_time': datetime.datetime(2022, 2, 3, 14, 17, 32, 512774, tzinfo=tzutc()),
             'description': 'Is my customer going to leave for a competitor?',
             'display_name': 'Customer Churn',
             'id': '8da2ca70-5f56-41cf-a49a-299ce6656cca',
             'last_modified_time': datetime.datetime(2022, 2, 3, 14, 17, 32, 512774, tzinfo=tzutc()),
             'owner_id': '8e1c1f5d-4f7c-4385-96e5-373a6c7f01b8'}}
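
The project ID above was copied by hand from the listing. A small lookup by display name avoids that step. This is a sketch: `find_by_display_name` is a hypothetical helper that only assumes each item exposes a `display_name` attribute, as the listed projects do:

```python
def find_by_display_name(items, name):
    """Return the single item whose display_name equals `name`."""
    matches = [item for item in items if item.display_name == name]
    # Display names are not guaranteed to be unique, so refuse to guess
    if len(matches) != 1:
        raise LookupError(
            f"Expected one item named {name!r}, found {len(matches)}"
        )
    return matches[0]
```

For example, `find_by_display_name(my_projects, "Customer Churn").id` would yield the ID passed to `get_project`.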

## Experiments

### List experiments
Get a list of all experiments in our project.

In [16]:
my_project_experiments = mlops.storage.experiment.list_experiments({
    'filter': None,
    'paging': None,
    'project_id': USE_CASE_PROJECT.project.id,
    'response_metadata': None,
    'sorting': None
}).experiment

for e in my_project_experiments:
    print(e.id, e.display_name)

ef67f078-84f9-11ec-a666-763498f6ea98 maforudu


### Select an experiment

In [17]:
USE_CASE_EXPERIMENT = mlops.storage.experiment.get_experiment({
    'id': 'ef67f078-84f9-11ec-a666-763498f6ea98', 
    'response_metadata': None
})
USE_CASE_EXPERIMENT

{'experiment': {'created_time': datetime.datetime(2022, 2, 3, 14, 17, 40, 196245, tzinfo=tzutc()),
                'display_name': 'maforudu',
                'id': 'ef67f078-84f9-11ec-a666-763498f6ea98',
                'last_modified_time': datetime.datetime(2022, 2, 3, 14, 17, 40, 196245, tzinfo=tzutc()),
                'metadata': None,
                'owner_id': '8e1c1f5d-4f7c-4385-96e5-373a6c7f01b8',
                'parameters': {'fold_column': '',
                               'target_column': 'Churn',
                               'test_dataset_id': '',
                               'training_dataset_id': '',
                               'validation_dataset_id': '',
                               'weight_column': ''},
                'statistics': {'training_duration': '580s'},
                'status': 'EXPERIMENT_STATUS_UNSPECIFIED',
                'tag': []}}

## Deployments

### Deployment environment 
`Dev` and `Prod` are deployment environment tags that you can use for your model deployments.

In [19]:
my_project_deployment_environments = mlops.storage.deployment_environment.list_deployment_environments(body={
    'filter': None, 
    'paging': None, 
    'project_id': USE_CASE_PROJECT.project.id, 
    'sorting': None
}).deployment_environment

for de in my_project_deployment_environments:
    print(de.id, de.display_name)

0401dd3b-5804-478e-9e63-d4acbd3cee36 PROD
4ad75d1f-18fc-4b8b-acb2-6ecf053d29f3 DEV


In [20]:
USE_CASE_DEPLOYMENT_EVN = mlops.storage.deployment_environment.get_deployment_environment({
    'deployment_environment_id': '4ad75d1f-18fc-4b8b-acb2-6ecf053d29f3'
})
USE_CASE_DEPLOYMENT_EVN

{'deployment_environment': {'created_time': datetime.datetime(2022, 2, 3, 14, 17, 32, 519740, tzinfo=tzutc()),
                            'deployment_target_name': 'kubernetes',
                            'display_name': 'DEV',
                            'id': '4ad75d1f-18fc-4b8b-acb2-6ecf053d29f3',
                            'last_modified_time': datetime.datetime(2022, 2, 3, 14, 17, 32, 519740, tzinfo=tzutc()),
                            'project_id': '8da2ca70-5f56-41cf-a49a-299ce6656cca'}}

### Deployment types
Deployment types help MLOps understand how traffic should be routed when new data is sent for predictions.

In [22]:
h2o_mlops_client.StorageDeploymentType().allowable_values

['DEPLOYMENT_TYPE_UNSPECIFIED',
 'SINGLE_MODEL',
 'SHADOW_TRAFFIC',
 'SPLIT_TRAFFIC']

In [23]:
USE_CASE_DEPLOYMENT = mlops.storage.deployment_environment.deploy({
    'deployment_environment_id': USE_CASE_DEPLOYMENT_EVN.deployment_environment.id,
    'experiment_id': USE_CASE_EXPERIMENT.experiment.id,
    'metadata': None,
    'response_metadata': None,
    'secondary_scorer': None,
    'type': h2o_mlops_client.StorageDeploymentType().SINGLE_MODEL
})

In [24]:
USE_CASE_DEPLOYMENT

{'deployment': {'created_time': datetime.datetime(2022, 4, 14, 1, 1, 40, 158206, tzinfo=tzutc()),
                'deployer_data': '',
                'deployer_data_version': '',
                'deployment_environment_id': '4ad75d1f-18fc-4b8b-acb2-6ecf053d29f3',
                'experiment_id': 'ef67f078-84f9-11ec-a666-763498f6ea98',
                'id': 'bb4ae08e-1304-4245-a5a1-7837f9243d76',
                'last_modified_time': datetime.datetime(2022, 4, 14, 1, 1, 40, 158206, tzinfo=tzutc()),
                'metadata': None,
                'project_id': '8da2ca70-5f56-41cf-a49a-299ce6656cca',
                'secondary_scorer': [],
                'type': 'SINGLE_MODEL'}}

### Deployment health
Wait until our deployment has gone from launching to healthy.

In [26]:
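import time

# Poll the deployment state, pausing between checks instead of busy-waiting
while mlops.deployer.deployment_status.get_deployment_status({
    'deployment_id': USE_CASE_DEPLOYMENT.deployment.id
}).deployment_status.state == "LAUNCHING":
    time.sleep(5)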
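
A polling loop is safer with a pause between checks and an upper bound on the total wait. A minimal sketch: `wait_for_state` is a hypothetical helper that accepts any zero-argument callable returning the current state string, so it assumes nothing about the client beyond the `LAUNCHING` value used above:

```python
import time


def wait_for_state(get_state, pending="LAUNCHING", timeout=600.0, interval=5.0):
    """Poll get_state() until it stops returning `pending`, then return the state."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state != pending:
            return state
        # Give up once the time budget is spent rather than waiting forever
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {pending} after {timeout} seconds")
        time.sleep(interval)
```

Here `get_state` could be a lambda that calls `get_deployment_status` for this deployment and returns `.deployment_status.state`. The returned value is worth checking, since any state other than `LAUNCHING` ends the wait, including failure states.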

In [27]:
DEPLOYMENT_STATUS = mlops.deployer.deployment_status.get_deployment_status({
    'deployment_id': USE_CASE_DEPLOYMENT.deployment.id
}).deployment_status

print(DEPLOYMENT_STATUS.state)
print(DEPLOYMENT_STATUS.scorer.sample_request.url)
print(DEPLOYMENT_STATUS.scorer.score.url)

HEALTHY
https://model.cloud-internal.h2o.ai/bb4ae08e-1304-4245-a5a1-7837f9243d76/model/sample_request
https://model.cloud-internal.h2o.ai/bb4ae08e-1304-4245-a5a1-7837f9243d76/model/score


## Make predictions
Using the `requests` library, we make predictions on new data.

In [29]:
sample_request_as_text = requests.get(DEPLOYMENT_STATUS.scorer.sample_request.url).text
sample_request = json.loads(sample_request_as_text)
sample_request

{'fields': ['Account_Length',
  'International_Plan',
  'No_Vmail_Messages',
  'Total_Day_minutes',
  'Total_Day_charge',
  'Total_Eve_Minutes',
  'Total_Eve_Charge',
  'Total_Night_Minutes',
  'Total_Night_Charge',
  'Total_Intl_Calls',
  'Total_Intl_Charge',
  'No_CS_Calls'],
 'rows': [['0', 'text', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']]}

In [31]:
fields = sample_request["fields"]
fields

['Account_Length',
 'International_Plan',
 'No_Vmail_Messages',
 'Total_Day_minutes',
 'Total_Day_charge',
 'Total_Eve_Minutes',
 'Total_Eve_Charge',
 'Total_Night_Minutes',
 'Total_Night_Charge',
 'Total_Intl_Calls',
 'Total_Intl_Charge',
 'No_CS_Calls']

In [32]:
sample_row = sample_request["rows"][0]
sample_row

['0', 'text', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']

In [50]:
new_row = ['100', 'text', '100', '100', '100', '100', '100', '100', '100', '100', '100', '100']
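
Positional rows like the one above are easy to misalign with `fields`. A sketch of building the payload from records keyed by field name instead: `build_score_payload` is a hypothetical helper, and converting every value to a string simply mirrors the sample request format shown earlier:

```python
def build_score_payload(fields, records):
    """Build a {'fields', 'rows'} payload with each row ordered like `fields`."""
    rows = []
    for record in records:
        missing = [field for field in fields if field not in record]
        if missing:
            raise KeyError(f"Record is missing fields: {missing}")
        # Order values by `fields` and send them as strings, as in the sample row
        rows.append([str(record[field]) for field in fields])
    return {"fields": list(fields), "rows": rows}
```

The returned dictionary has the same shape as the `json=` argument passed to the scoring endpoint.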

In [51]:
new_predictions = requests.post(
    url=DEPLOYMENT_STATUS.scorer.score.url,
    json={
        'fields': fields,
        'rows': [sample_row, new_row]
    }
)

In [52]:
new_predictions

<Response [200]>

In [55]:
predictions_dict = json.loads(new_predictions.text)
predictions_dict

{'fields': ['Churn.0', 'Churn.1'],
 'id': 'ef67f078-84f9-11ec-a666-763498f6ea98',
 'score': [['0.9908681401159277', '0.009131859884072276'],
  ['0.40318780592960823', '0.5968121940703918']]}

In [57]:
pd.DataFrame(predictions_dict["score"], columns=predictions_dict["fields"])

| | Churn.0 | Churn.1 |
|---|---|---|
| 0 | 0.9908681401159276 | 0.0091318598840722 |
| 1 | 0.4031878059296082 | 0.5968121940703918 |
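
The scorer returns every probability as a string, so the DataFrame above holds text rather than numbers. A sketch of converting the response to floats and picking the most likely class per row: `parse_scores` is a hypothetical helper and assumes the `fields`/`score` layout of the response shown above:

```python
def parse_scores(response):
    """Convert a scoring response into per-row dicts of float probabilities."""
    results = []
    for row in response["score"]:
        # Pair each output column name with its probability, parsed as a float
        probs = {name: float(value) for name, value in zip(response["fields"], row)}
        results.append({
            "probabilities": probs,
            "predicted": max(probs, key=probs.get),
        })
    return results
```

Applied to `predictions_dict`, the first row resolves to `Churn.0` and the second to `Churn.1`.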
