# Deploying an H2O model on Verta

Within Verta, a "Model" can be any arbitrary function: a traditional ML model (e.g., sklearn, PyTorch, TensorFlow); a plain function (e.g., squaring a number or making a DB call); or a mixture of the two (e.g., pre-processing code, a DB call, and then a model application). See the [Verta registry concepts](https://docs.verta.ai/verta/registry/concepts) for more.

This notebook shows how to deploy an H2O model on Verta as a Verta Standard Model by extending [VertaModelBase](https://verta.readthedocs.io/en/master/_autogen/verta.registry.VertaModelBase.html?highlight=VertaModelBase#verta.registry.VertaModelBase).
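
To make the "any arbitrary function" point concrete, here is a minimal sketch of a wrapper around a plain squaring function. It follows the same `__init__(self, artifacts)` / `predict` shape used for the H2O wrapper later in this notebook. The class name `SquareModel` is hypothetical, and the `ImportError` fallback is an assumption added only so the sketch also runs where `verta` is not installed.

```python
try:
    from verta.registry import VertaModelBase, verify_io
except ImportError:
    # Fallback stand-ins (assumption): let the sketch run without verta installed.
    VertaModelBase = object

    def verify_io(func):
        return func


class SquareModel(VertaModelBase):
    """Hypothetical example: a 'model' that is just a function."""

    def __init__(self, artifacts=None):
        # A pure function needs no serialized artifacts.
        self.artifacts = artifacts

    @verify_io
    def predict(self, model_input):
        # Square the numeric input.
        return model_input ** 2


model = SquareModel()
print(model.predict(3))
```

The H2O wrapper below has the same structure; the only differences are that `__init__` loads a serialized model from `artifacts` and `predict` delegates to it.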

## 0. Imports

In [1]:
# restart your notebook if prompted on Colab
!python -m pip install verta h2o



In [2]:
import h2o
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_282"; OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_282-b08); OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.282-b08, mixed mode)
  Starting server from /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/f7/r9486pbd77j4bppmmcpnxw6c0000gp/T/tmpbdpfwiza
  JVM stdout: /var/folders/f7/r9486pbd77j4bppmmcpnxw6c0000gp/T/tmpbdpfwiza/h2o_hmacdonald_started_from_python.out
  JVM stderr: /var/folders/f7/r9486pbd77j4bppmmcpnxw6c0000gp/T/tmpbdpfwiza/h2o_hmacdonald_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.


H2O_cluster_uptime:          02 secs
H2O_cluster_timezone:        America/Los_Angeles
H2O_data_parsing_timezone:   UTC
H2O_cluster_version:         3.36.1.4
H2O_cluster_version_age:     1 month and 2 days
H2O_cluster_name:            H2O_from_python_hmacdonald_l74g03
H2O_cluster_total_nodes:     1
H2O_cluster_free_memory:     1.778 Gb
H2O_cluster_total_cores:     8
H2O_cluster_allowed_cores:   8


### 0.1 Verta import and setup

In [3]:
import os

# Ensure credentials are set up; if they are not, uncomment and fill in the lines below
# os.environ['VERTA_EMAIL'] = ""
# os.environ['VERTA_DEV_KEY'] = ""
# os.environ['VERTA_HOST'] = ""

from verta import Client
client = Client(os.environ['VERTA_HOST'])

got VERTA_EMAIL from environment
got VERTA_DEV_KEY from environment
connection successfully established
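
Before constructing the `Client`, it can be useful to confirm that the three variables named in the cell above are actually set. A minimal sketch follows; `missing_credentials` is a hypothetical helper and not part of the Verta client.

```python
import os

REQUIRED_VARS = ("VERTA_EMAIL", "VERTA_DEV_KEY", "VERTA_HOST")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"VERTA_EMAIL": "user@example.com", "VERTA_HOST": ""}
print(missing_credentials(example_env))
```

Calling `missing_credentials()` with no argument checks the real environment; an empty list means all three variables are present.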


## 1. Model Training

### 1.1 Load training data

In [4]:
h2o_df = h2o.load_dataset("prostate.csv")
h2o_df["CAPSULE"] = h2o_df["CAPSULE"].asfactor()
h2o_df["GLEASON"] = h2o_df["GLEASON"].asfactor()

Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%

### 1.2 Train and save the model


In [5]:
import time

# Train a small gradient boosting regressor that predicts AGE
h2o_model = H2OGradientBoostingEstimator(distribution="gaussian",
                                         ntrees=2,
                                         max_depth=2,
                                         learn_rate=1)
h2o_model.train(y="AGE",
                x=["CAPSULE", "RACE", "PSA", "GLEASON"],
                training_frame=h2o_df)

# Save into a fresh, timestamped directory so that it holds only this model
MODEL_PATH = "h2o_model_file" + str(time.time())
h2o.save_model(model=h2o_model, path=MODEL_PATH, force=True)
# The directory contains a single file: the serialized model
saved_model_path = os.path.join(MODEL_PATH, os.listdir(MODEL_PATH)[0])
saved_model_path

gbm Model Build progress: |██████████████████████████████████████████████████████| (done) 100%


'h2o_model_file1662509276.563005/GBM_model_python_1662509255783_1'
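
The cell above takes the first entry of `os.listdir(MODEL_PATH)`, which works because the timestamped directory is new and holds only the freshly saved model. A slightly more defensive sketch is shown below. It uses a hypothetical helper, `single_file_in`, and a temporary directory as a stand-in for the H2O output, and it fails loudly if the one-file assumption does not hold.

```python
import os
import tempfile


def single_file_in(directory):
    """Return the path of the only entry in `directory` (hypothetical helper)."""
    entries = sorted(os.listdir(directory))
    if len(entries) != 1:
        raise ValueError(
            "expected exactly one file in %r, found %d" % (directory, len(entries))
        )
    return os.path.join(directory, entries[0])


# Stand-in for the directory that h2o.save_model() writes into.
with tempfile.TemporaryDirectory() as model_dir:
    open(os.path.join(model_dir, "GBM_model_example"), "w").close()
    resolved = single_file_in(model_dir)
    print(os.path.basename(resolved))
```

`h2o.save_model` also returns the path of the saved model, so its return value can be used directly instead of listing the directory.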

In [None]:
h2o_df.head()

## 2. Register Model for deployment

In [6]:
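import h2o
from verta.registry import VertaModelBase, verify_io

class H2OModelWrapper(VertaModelBase):
    def __init__(self, artifacts):
        # These imports run when the wrapper is instantiated
        import h2o
        import jdk  # module provided by the install-jdk requirement
        h2o.init()
        # "serialized_model" is the artifact key supplied at registration
        self.model = h2o.load_model(artifacts["serialized_model"])

    @verify_io
    def predict(self, model_input):
        # model_input is a dict mapping feature names to values (one row)
        frame = h2o.H2OFrame(model_input)
        predictions = self.model.predict(frame)
        # Return the first prediction as a plain Python value
        return h2o.as_list(predictions)["predict"].to_list()[0]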
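
`predict` receives a single row as a dictionary keyed by feature name, so it can help to check a payload against the training columns before sending it. The sketch below is illustrative only: `FEATURES` copies the predictor names used in training above, and `check_payload` is a hypothetical helper that belongs to neither Verta nor H2O.

```python
FEATURES = ["CAPSULE", "RACE", "PSA", "GLEASON"]  # predictors used in training


def check_payload(payload, features=FEATURES):
    """Raise ValueError if the payload keys differ from the expected features."""
    missing = [name for name in features if name not in payload]
    extra = [name for name in payload if name not in features]
    if missing or extra:
        raise ValueError("missing=%s, unexpected=%s" % (missing, extra))
    return payload


row = {"CAPSULE": "0", "RACE": "2", "PSA": 51.9, "GLEASON": "6"}
check_payload(row)  # passes silently

try:
    check_payload({"CAPSULE": "0", "PSA": 51.9})
except ValueError as err:
    print(err)
```

A check like this runs on the client side, so a malformed request is caught before it reaches the endpoint.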

In [7]:
from verta.environment import Python

registered_model = client.get_or_create_registered_model(name="h2o-model-2")
model_version = registered_model.create_standard_model(
    model_cls=H2OModelWrapper,
    # Packages to install in the deployed environment
    environment=Python(requirements=['h2o', 'install-jdk==0.3.0']),
    # The key must match the one read in H2OModelWrapper.__init__
    artifacts={"serialized_model": saved_model_path},
    name="1"
)

created new RegisteredModel: h2o-model-2 in workspace: hmacdonald_verta_ai
created new ModelVersion: 1
uploading serialized_model to Registry
uploading part 1
upload complete
uploading model to Registry
uploading part 1
upload complete
uploading model_api.json to Registry
uploading part 1
upload complete
uploading custom_modules to Registry
uploading part 1
upload complete


## 3. Deploy model to endpoint

In [8]:
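# Create (or fetch) the endpoint, then deploy the model version to it;
# wait=True blocks until the update has finished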
h2o_endpoint = client.get_or_create_endpoint(path="/h2o-2")
h2o_endpoint.update(model_version, wait=True)

waiting for update...............................


{'components': [{'build_id': 4050, 'ratio': 1, 'status': 'running'}],
 'creator_request': {'enable_prediction_authz': False, 'name': 'production'},
 'date_created': '2022-09-07T00:09:18.000Z',
 'date_updated': '2022-09-07T00:10:41.000Z',
 'status': 'active',
 'stage_id': 4013}

In [10]:
deployed_model = h2o_endpoint.get_deployed_model()

In [11]:
data = {"CAPSULE": "0", "RACE": "2", "PSA": 51.9, "GLEASON": "6"}

In [12]:
deployed_model.predict(data)

70.88191142835115

In [None]:
# Clean up: delete the endpoint once it is no longer needed
h2o_endpoint.delete()

---