# End-to-end workflow in the H2O AI Cloud

Get started building and deploying machine learning models in the H2O AI Cloud.

* Create an AI engine for building models
* Use AutoML to build a machine learning model
* Deploy the model to production

## Create an AI engine
Create a Driverless AI engine to get access to automated machine learning (AutoML), which builds models on our data for us.

See the `2 Managing AI Engines` tutorial for more details on how to use and interact with **Engine Manager** for creating and managing your AI Engines.

In [None]:
import h2o_engine_manager

engine_manager = h2o_engine_manager.login()

Creating a new engine can take 2-20 minutes, depending on your environment and configuration settings. If start times are too long, ask your admin whether they can be reduced.

In [None]:
dai_engine = engine_manager.dai_engine_client.create_engine(
    display_name="My test engine",
)

dai_engine.wait()

In [None]:
dai = dai_engine.connect()

## Build a model

### Import data

In [None]:
telco_churn = dai.datasets.create(
    data="https://h2o-internal-release.s3-us-west-2.amazonaws.com/data/Splunk/churn.csv",  
    data_source="s3", 
    name="Telco_Churn",
    force=True
)

Complete 100.00% - [4/4] Computed stats for column Account Length


In [None]:
print(telco_churn.key, "|", telco_churn.name)
print("\nColumns:", telco_churn.columns)
print('\nShape:', telco_churn.shape)

c8d264dc-a01a-11ee-95af-425f7a7f3fbf | Telco_Churn

Columns: ['State', 'Account Length', 'Area Code', 'Phone', "Int'l Plan", 'VMail Plan', 'VMail Message', 'Day Mins', 'Day Calls', 'Day Charge', 'Eve Mins', 'Eve Calls', 'Eve Charge', 'Night Mins', 'Night Calls', 'Night Charge', 'Intl Mins', 'Intl Calls', 'Intl Charge', 'CustServ Calls', 'Churn?']

Shape: (3333, 21)


### Run an AutoML experiment

See the `3 AutoML` tutorial for more details on how to use and interact with Driverless AI for AutoML. 

In [None]:
default_baseline = dai.experiments.create(
    name='Default Baseline', 
    train_dataset=telco_churn, 
    target_column="Churn?", 
    task="classification",
    accuracy=1, time=1, interpretability=6  # a quick AutoML experiment to see a baseline    
)

INFO - Experiment launched at: https://enginemanager.internal.dedicated.h2o.ai/workspaces/default/daiEngines/my-test-engine-9859/#/experiment?key=ca85ba4a-a01a-11ee-95af-425f7a7f3fbf


Complete 100.00% - Status: Complete                                                


In [None]:
default_baseline.summary()

INFO - Status: Complete
Experiment: Default Baseline (ca85ba4a-a01a-11ee-95af-425f7a7f3fbf)
  Version: 1.10.6.1, 2023-12-21 16:08, Py client
  Settings: 1/1/6, seed=722131253, GPUs disabled
  Train data: Telco_Churn (3333, 21)
  Validation data: N/A
  Test data: N/A
  Target column: Churn? (binary, 14.491% target class)
System specs: Docker/Linux, 4 GB, 32 CPU cores, 0/0 GPU
  Max memory usage: 0.616 GB, 0 GB GPU, 0.0789 GB MOJO
Recipe: AutoDL (7 iterations, 2 individuals)
  Validation scheme: stratified, 9 internal holdouts (3-fold CV)
  Feature engineering: 74 features scored (22 selected)
Timing: MOJO latency 0.1699 millis (3.0MB), Python latency 123.6660 millis (22.6kB)
  Data preparation: 6.27 secs
  Shift/Leakage detection: 0.43 secs
  Model and feature tuning: 50.78 secs (64 models trained)
  Feature evolution: 1.08 secs (0 of 27 model trained)
  Final pipeline training: 48.62 secs (18 models trained)
  Python / MOJO scorer building: 33.20 secs / 19.69 secs
Validation score: LOG


## Deploy the model

### Create a project
We will create a project for our use case so that we can easily share our work with others and deploy the models.

In [None]:
churn_project = dai.projects.create(
    name="Telco Churn Predictions", 
    description="Which of our customers is likely to cancel their contract?",
    force=True  # boolean flag, not the string "True"
)

In [None]:
churn_project.link_experiment(default_baseline)

<driverlessai._projects.Project>

### Connect to MLOps
See the `5 Model Deployment` tutorial for more details on how to use and interact with MLOps for managing models and deployments.

In [None]:
import h2o_mlops
import h2o_authn
import h2o_discovery
import os

In [None]:
discovery = h2o_discovery.discover()

token_provider = h2o_authn.TokenProvider(
    refresh_token=os.getenv("H2O_CLOUD_CLIENT_PLATFORM_TOKEN"),
    issuer_url=discovery.environment.issuer_url,
    client_id=discovery.clients["platform"].oauth2_client_id,
)
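
`os.getenv` returns `None` when a variable is unset, so a missing `H2O_CLOUD_CLIENT_PLATFORM_TOKEN` would only surface later as an authentication error. A minimal sketch of a guard that fails early with a clear message is shown below; the helper name `require_env` is hypothetical and not part of any H2O library.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of the H2O clients.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


# Possible usage in the cell above (assumes the variable is exported):
# refresh_token=require_env("H2O_CLOUD_CLIENT_PLATFORM_TOKEN")
```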

In [None]:
mlops = h2o_mlops.Client(
    gateway_url=discovery.services['mlops-api'].uri,
    token_provider=token_provider
)

### Register experiment to a model

In [None]:
mlops_churn_project = mlops.projects.get(uid=churn_project.key)
mlops_churn_experiment = mlops_churn_project.experiments.get(uid=default_baseline.key)
mlops_churn_model = mlops_churn_project.models.create(name=mlops_churn_experiment.name)

In [None]:
mlops_churn_model.register(experiment=mlops_churn_experiment)

### Create a deployment

In [None]:
mlops_scoring_runtime = mlops.runtimes.scoring.list(
    artifact_type=mlops_churn_model.get_experiment().artifact_types[0],
    uid="dai_mojo_runtime"
)[0]

In [None]:
mlops_environment = mlops_churn_project.environments.list(name="DEV")[0]
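
Both lookups above take the first element of a list with `[0]`, which raises a bare `IndexError` if nothing matches (for example, if no environment named `DEV` exists). One possible way to get a more descriptive error is sketched below; `first_or_raise` is a hypothetical helper, not an MLOps client function.

```python
def first_or_raise(items, description):
    """Return the first element of items, or raise a descriptive LookupError.

    Hypothetical helper for illustration; works on any iterable.
    """
    items = list(items)
    if not items:
        raise LookupError(f"no {description} found")
    return items[0]


# Possible usage with the objects from this tutorial:
# mlops_environment = first_or_raise(
#     mlops_churn_project.environments.list(name="DEV"),
#     "environment named DEV",
# )
```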

In [None]:
mlops_churn_deployment = mlops_environment.deployments.create_single(
    name=mlops_churn_model.name,
    model=mlops_churn_model,
    scoring_runtime=mlops_scoring_runtime
)

### View information about our deployment

In [None]:
mlops_churn_deployment.status()

'LAUNCHING'

### Wait for deployment to become healthy

In [None]:
import time


while not mlops_churn_deployment.is_healthy():
    mlops_churn_deployment.raise_for_failure()
    time.sleep(5)

mlops_churn_deployment.status()

'HEALTHY'
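
The loop above polls indefinitely if the deployment never becomes healthy and never reports a failure. A minimal sketch of the same polling pattern with an upper time limit follows; `wait_until`, its parameters, and the default values are assumptions for illustration and take plain callables, so the helper does not depend on the MLOps client.

```python
import time


def wait_until(is_ready, check_failure=None, timeout=600.0, interval=5.0):
    """Poll is_ready() until it returns True or `timeout` seconds elapse.

    check_failure, if given, is called on every unsuccessful poll and is
    expected to raise when the awaited resource has failed.
    """
    deadline = time.monotonic() + timeout
    while not is_ready():
        if check_failure is not None:
            check_failure()
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout} seconds")
        time.sleep(interval)


# Possible usage with the deployment object from this tutorial:
# wait_until(
#     mlops_churn_deployment.is_healthy,
#     mlops_churn_deployment.raise_for_failure,
#     timeout=600,
# )
```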

### Make a prediction with new data

In [None]:
import requests
import json

In [None]:
mlops_churn_deployment.get_sample_request()

{'fields': ['State',
  'Account Length',
  'Area Code',
  "Int'l Plan",
  'VMail Plan',
  'VMail Message',
  'Day Mins',
  'Day Calls',
  'Day Charge',
  'Eve Mins',
  'Eve Calls',
  'Eve Charge',
  'Night Mins',
  'Night Calls',
  'Night Charge',
  'Intl Mins',
  'Intl Calls',
  'Intl Charge',
  'CustServ Calls'],
 'rows': [['text',
   '0',
   '0',
   'text',
   'text',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0',
   '0']]}
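
The sample request shows the expected payload shape: a `fields` list plus one list of string values per row, in the same order as `fields`. To score a real customer instead of the placeholder values, a payload could be built from a dictionary as in the sketch below. The helper `build_scoring_request` is hypothetical, the record values are made up, and only three fields are used to keep the example short; in practice the field list would come from `get_sample_request()["fields"]`.

```python
def build_scoring_request(fields, record):
    """Return a {'fields', 'rows'} payload with values ordered like `fields`.

    Hypothetical helper for illustration. Values are converted to strings
    because the sample request encodes every value as a string.
    """
    missing = [name for name in fields if name not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    return {
        "fields": list(fields),
        "rows": [[str(record[name]) for name in fields]],
    }


# Illustrative subset of the field list and a made-up customer record.
fields = ["State", "Account Length", "Int'l Plan"]
record = {"Account Length": 128, "Int'l Plan": "no", "State": "KS"}

payload = build_scoring_request(fields, record)
print(payload)
```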

In [None]:
# `requests` was already imported in an earlier cell.
predictions = requests.post(
    url=mlops_churn_deployment.url_for_scoring,
    json=mlops_churn_deployment.get_sample_request()
)

predictions.json()

{'fields': ['Churn?.False.', 'Churn?.True.'],
 'id': 'ca85ba4a-a01a-11ee-95af-425f7a7f3fbf',
 'score': [['0.1944492202717737', '0.8055507797282263']]}

In [None]:
print(f"Churn Percent: {round(float(predictions.json()['score'][0][1]) * 100, 1)}%")

Churn Percent: 80.6%
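
Indexing the response with `['score'][0][1]` relies on the position of the `Churn?.True.` column. As a small sketch, the scores can instead be paired with their field names so that each probability is looked up by label. The helper `labeled_scores` is hypothetical; the response dictionary is copied from the output above.

```python
def labeled_scores(response_json, row=0):
    """Map each output field name to its score (as a float) for one row.

    Hypothetical helper for illustration.
    """
    fields = response_json["fields"]
    scores = response_json["score"][row]
    return {name: float(value) for name, value in zip(fields, scores)}


# Response copied from the prediction output shown above.
response = {
    "fields": ["Churn?.False.", "Churn?.True."],
    "id": "ca85ba4a-a01a-11ee-95af-425f7a7f3fbf",
    "score": [["0.1944492202717737", "0.8055507797282263"]],
}

scores = labeled_scores(response)
print(f"Churn Percent: {scores['Churn?.True.'] * 100:.1f}%")  # Churn Percent: 80.6%
```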


## Clean up
Delete the Driverless AI engine once we are done with it. We can import the project into any new Driverless AI engine, and the model remains deployed in MLOps.

In [None]:
dai_engine.delete()