## Deploying the Model

### 🎯 Deploy the newest model version tracked by DVC.
If retrieving recent data and updating our dataset produces a worse-performing model, we will have rolled back the model version via DVC, so the worse model is not deployed automatically.

---

When this notebook is executed, we expect <br>**(1)** A serialized model tracked with DVC <br>**(2)** Knowledge of the model's location within DVCFileSystem (path within the Git repository used for DVC tracking)

Steps covered in this notebook:
1. Retrieve parameters
2. Retrieve DVC-tracked model from COS
3. Prepare Watson Machine Learning environment for Model Deployment
4. Retrieve DVC-tracked training data reference from COS
5. Deploy Model
6. Model Testing on the Serving Endpoint

### The following cell fetches the utility script required for this notebook.
Since IBM CPD SaaS doesn't have a filesystem, cloning the repository is the only reliable way to get scripts into the cloud environment.
```
!rm -rf MLOps-CPD && git clone --quiet -b master https://github.com/IBM/MLOps-CPD.git
```
⚠️ Run the following cells only if you are executing on IBM CPD SaaS.

In [None]:
#!rm -rf MLOps-CPD && git clone --quiet -b master https://github.com/IBM/MLOps-CPD.git

In [None]:
#!mv MLOps-CPD MLOps_CPD

In [None]:
!python3 -m pip install ibm_watson_machine_learning

In [None]:
from ibm_watson_studio_pipelines import WSPipelines
from ibm_watson_machine_learning import APIClient
import ibm_boto3

from botocore.client import Config
from sklearn.model_selection import train_test_split
from dataclasses import dataclass
import numpy as np
import pandas as pd

import pickle
import dvc.api
import io

import logging
import os, types
import warnings

warnings.filterwarnings("ignore")

## The following cell contains the credentials for MLOps COS
```
## PROJECT COS 
AUTH_ENDPOINT = "https://iam.cloud.ibm.com/oidc/token"
ENDPOINT_URL = "https://s3.private.us.cloud-object-storage.appdomain.cloud"
API_KEY_COS = "xxx"
BUCKET_PROJECT_COS = "mlops-donotdelete-pr-qxxcecxi1dtw94"

## MLOPS COS
ENDPOINT_URL_MLOPS = "https://s3.jp-tok.cloud-object-storage.appdomain.cloud"
API_KEY_MLOPS = "xxx"
CRN_MLOPS = "xxx"
BUCKET_MLOPS  = "mlops-asset"

## CATALOG
CATALOG_NAME = "MLOps-ns"
```

### 1. Retrieve Parameters

In [None]:
# For testing: Uncomment this cell and put your credentials in credentials.py to run locally.
# from credentials import set_env_variables_for_credentials
# set_env_variables_for_credentials()

#### Pipeline Environment

In [None]:
CLOUD_API_KEY = os.getenv("CLOUD_API_KEY")

# Model parameters
MODEL_NAME = os.getenv("model_name")
DEPLOYMENT_NAME = os.getenv("deployment_name")
SPACE_ID = os.getenv("space_id") # Deployment Space Id to deploy to
# "ff681eb5-f5aa-4bf9-9c26-a7fbef89853f"
# model_id = os.getenv('model_id')

#### Cloud Object Storage (COS) Credentials

In [None]:
GIT_REPOSITORY = os.getenv("GIT_REPOSITORY")
model_dvc_location = os.getenv("model_dvc_location")
train_package_dvc_location = os.getenv("train_package_dvc_location") 
test_package_dvc_location = os.getenv("test_package_dvc_location")

In [None]:
# For testing
# train_package_dvc_location = "data/train_package.pkl"
# test_package_dvc_location = "data/test_package.pkl"
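
Every parameter above is read with `os.getenv`, which returns `None` when a variable is unset, so a missing value only surfaces later as a confusing error. The following is a minimal sketch of an early check, assuming a hypothetical helper named `require_env` (it is not part of this notebook or of any IBM SDK) and made-up example values:

```python
import os


def require_env(names, environ=None):
    """Return {name: value} for every name, raising if any is unset or empty.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing pipeline parameters: {', '.join(missing)}")
    return {name: environ[name] for name in names}


# An in-memory mapping stands in for the real environment (values are made up).
example_env = {"GIT_REPOSITORY": "https://example.com/repo.git", "space_id": "abc"}
params = require_env(["GIT_REPOSITORY", "space_id"], environ=example_env)
```

In the pipeline, the same function could be called without `environ` to check the real process environment before any download or deployment starts.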

### 2. Retrieve DVC-tracked model from COS

In [None]:
def read_dvc_tracked_data_from_cos(dvc_path, repo, mode='rb'):
    """Read a DVC-tracked pickle file from the given Git repository and unpickle it."""
    return pickle.load(io.BytesIO(dvc.api.read(dvc_path, repo=repo, mode=mode)))

In [None]:
model = read_dvc_tracked_data_from_cos("model/xgbr.pkl", GIT_REPOSITORY)
model
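
The helper above wraps the raw bytes of the tracked file in an in-memory buffer and unpickles them. The sketch below isolates that deserialization step so it can be tried without DVC or COS access. The name `load_pickled_bytes` and the `fake_model` dictionary are illustrative assumptions; the dictionary merely stands in for the bytes a binary read would return.

```python
import io
import pickle


def load_pickled_bytes(raw: bytes):
    """Deserialize a pickled object from raw bytes via an in-memory buffer."""
    return pickle.load(io.BytesIO(raw))


# Stand-in for the bytes that a binary read of the tracked model file would return.
fake_model = {"kind": "regressor", "n_estimators": 100}
raw = pickle.dumps(fake_model)

restored = load_pickled_bytes(raw)
```

Unpickling executes code embedded in the file, so this pattern should only be applied to artifacts from a repository you trust.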

### 3. Prepare Watson Machine Learning environment for Model Deployment

#### Instantiate WML Client

In [None]:
url_frankfurt = "https://eu-de.ml.cloud.ibm.com"
url_dallas = "https://us-south.ml.cloud.ibm.com"

In [None]:
WML_CREDENTIALS = {
    "url": url_dallas,
    "apikey": CLOUD_API_KEY,
}


wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

In [None]:
wml_client.set.default_space(SPACE_ID)

In [None]:
software_spec_uid = wml_client.software_specifications.get_id_by_name("runtime-22.2-py3.10")
software_spec_uid

In [None]:
# wml_client.hardware_specifications.list()
hardware_spec_uid = wml_client.hardware_specifications.get_id_by_name('S')
hardware_spec_uid


### 4. Retrieve DVC-tracked training data reference from COS

In [None]:
# Load dvc-tracked training package from cos
train_package = read_dvc_tracked_data_from_cos("data/train_package.pkl", GIT_REPOSITORY)
train_package

In [None]:
X_train = train_package['X_train']
y_train = train_package['y_train']

In [None]:
# Submit only the last 100,000 training rows to save resources and time
X = X_train.tail(100000)
y = y_train.tail(100000)

### 5. Deploy Model

In [None]:
model_name = "flood-regression_model"
deployment_name = "flood-regression_deployment"
model_type = "scikit-learn_1.1"
target = "dis24" # predictand (target variable)

meta_props = {
            wml_client.repository.ModelMetaNames.NAME: model_name,
            wml_client.repository.ModelMetaNames.TYPE: model_type,
            wml_client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
            wml_client.repository.ModelMetaNames.LABEL_FIELD: target,
            # wml_client._models.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: train_data_ref,
            wml_client.repository.ModelMetaNames.INPUT_DATA_SCHEMA: [
                {
                    "id": "input_data_schema",
                    "type": "list",
                    "fields": [
                        {"name": index, "type": value}
                        for index, value in X.dtypes.astype(str).items()
                    ],
                },
            ],
        }
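
The `INPUT_DATA_SCHEMA` entry above is built from the column names and dtypes of `X`. As a rough illustration of the resulting structure, here is a sketch that uses a plain dictionary in place of `X.dtypes`. The function name `build_input_schema` and the two column names are hypothetical and do not come from the flood dataset.

```python
def build_input_schema(dtypes, schema_id="input_data_schema"):
    """Build a one-element schema list from a {column_name: dtype} mapping."""
    fields = [{"name": name, "type": str(dtype)} for name, dtype in dtypes.items()]
    return [{"id": schema_id, "type": "list", "fields": fields}]


# Hypothetical columns; in the notebook these come from X.dtypes.astype(str).items().
example_dtypes = {"rainfall": "float64", "station_id": "int64"}
schema = build_input_schema(example_dtypes)
print(schema)
```

Keeping the schema construction in a small function makes it easier to inspect the field list before the model is stored.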


In [None]:
model_details = wml_client.repository.store_model(
            model=model, meta_props=meta_props, training_data=X, training_target=y
)

In [None]:
model_uid = wml_client.repository.get_model_id(model_details)
model_uid

In [None]:
meta_props = {
    wml_client.deployments.ConfigurationMetaNames.NAME: deployment_name,
    wml_client.deployments.ConfigurationMetaNames.ONLINE: {},
}
deployment_details = wml_client.deployments.create(
    model_uid, meta_props=meta_props
)

In [None]:
deployment_uid = wml_client.deployments.get_uid(deployment_details)
deployment_uid

### 6. Model Testing on the Serving Endpoint



#### Load Test Data to Score against WML Endpoint

In [None]:
# Load dvc-tracked testing package from cos
test_package = read_dvc_tracked_data_from_cos(test_package_dvc_location, GIT_REPOSITORY)
test_package

In [None]:
# Take a few rows and score them against the deployed model/WML endpoint
a_few_rows = test_package['X_test'].head(5)
a_few_rows = a_few_rows.apply(pd.to_numeric, errors="coerce")
a_few_rows


#### Score the Endpoint

In [None]:
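# Build the scoring payload from the sampled rows before calling the endpoint
payload_scoring = {"input_data": [{"fields": list(a_few_rows.columns), "values": a_few_rows.values.tolist()}]}
predictions = wml_client.deployments.score(deployment_uid, payload_scoring)
predictions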

In [None]:
fields = list(test_package['X_test'].keys()) # feature cols

# For loop to score for each row in "a_few_rows"
for val in range(len(a_few_rows)):
    payload_scoring = {"input_data": [{"fields": fields, "values": [a_few_rows.iloc[val].tolist()]}]}
    predictions = wml_client.deployments.score(deployment_uid, payload_scoring)
    print(predictions)
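
The loop above sends one request per row. Because `values` in the payload is a list of rows, several rows can also be packed into a single payload. The sketch below shows that shape with plain Python lists. `build_scoring_payload` and the feature names are hypothetical, and the result would still need to be passed to `wml_client.deployments.score`.

```python
def build_scoring_payload(fields, rows):
    """Pack several feature rows into one scoring payload dictionary."""
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("Each row must have one value per field")
    return {"input_data": [{"fields": list(fields), "values": [list(row) for row in rows]}]}


# Hypothetical feature names and values, for illustration only.
example_fields = ["rainfall", "station_id"]
example_rows = [[1.5, 7], [0.0, 9]]
payload = build_scoring_payload(example_fields, example_rows)
print(payload)
```

Checking row length against the field list up front catches misaligned columns before a request is sent to the endpoint.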


## Save Params in WS Pipeline

In [None]:
deployment_done = {}
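# Assumption: a deployment ID returned by deployments.create() means the deployment completed
deployment_done['deployment_status'] = bool(deployment_uid)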
deployment_done['deployment_id'] = deployment_uid
deployment_done['model_id'] = model_uid
deployment_done['space_id'] = SPACE_ID

In [None]:
pipelines_client = WSPipelines.from_apikey(apikey=CLOUD_API_KEY)
pipelines_client.store_results(deployment_done)