# Standard Transformer Example

This notebook demonstrates how to deploy a PyFunc model together with a standard transformer. The PyFunc model is an echo model that simply returns the request it receives from the transformer. The transformer itself has preprocess and postprocess pipelines that include Feast feature retrieval, variable declaration, and table creation.

## Requirements

- Authenticated to gcloud (```gcloud auth application-default login```)

In [None]:
!pip install --upgrade -r requirements.txt > /dev/null

In [None]:
import warnings
warnings.filterwarnings('ignore')

## 1. Initialize Merlin

### 1.1 Set Merlin Server

In [None]:
import merlin
print(merlin.__version__)

MERLIN_URL = "<MERLIN_HOST>/api/merlin"

merlin.set_url(MERLIN_URL)

### 1.2 Set Active Project

`project` represents a real-life project. A project may contain multiple models.

`merlin.set_project(<project-name>)` sets the active project to the one whose name matches the argument. You can only set it to an existing project. To create a new project, use the MLP UI.

In [None]:
PROJECT_NAME = "sample"

merlin.set_project(PROJECT_NAME)

### 1.3 Set Active Model

`model` represents an abstract ML model. Conceptually, a `model` in Merlin is similar to a class in a programming language. To instantiate a `model`, you have to create a `model_version`.

Each `model` has a type. The model types currently supported by Merlin are sklearn, xgboost, tensorflow, pytorch, and user-defined models (i.e. PyFunc models).

`model_version` represents a snapshot of a particular `model` iteration. You can attach information such as metrics and tags to a given `model_version`, as well as deploy it as a model service.

`merlin.set_model(<model_name>, <model_type>)` sets the active model to the name given by the parameter. If no model with that name is found, a new model is created.

In [None]:
from merlin.model import ModelType

MODEL_NAME = "standard-transformer"

merlin.set_model(MODEL_NAME, ModelType.PYFUNC)

## 2. Create PyFunc Model

To create a PyFunc model, extend the `merlin.PyFuncModel` class and implement its `initialize` and `infer` methods.

`initialize` is called once during model initialization. Its argument is a dictionary that maps each artifact name to its URL. The keys are the same as those passed to `log_pyfunc_model`.

`infer` is the prediction method that you need to implement. It accepts a dictionary argument that represents the incoming request body, and it should return a dictionary that corresponds to the response body of the prediction result.
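To illustrate how `initialize` might consume the artifact dictionary, here is a minimal sketch. It uses a plain stand-in class rather than `PyFuncModel` so that it runs without the Merlin SDK. The class name `ArtifactEchoModel`, the `config` artifact key, and the `tag` field are all hypothetical, and the sketch assumes the artifact value can be opened as a local file path.

```python
import json
import os
import tempfile


class ArtifactEchoModel:
    """Hypothetical stand-in exposing the same two methods as a PyFunc model."""

    def initialize(self, artifacts):
        # Assumption: the artifact value can be opened as a local file path.
        with open(artifacts["config"], "r") as f:
            self.config = json.load(f)

    def infer(self, request, **kwargs):
        # Return the request plus a value that was loaded once in initialize().
        return {**request, "model_tag": self.config["tag"]}


# Write a throwaway artifact file so the sketch runs end to end.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w") as f:
        json.dump({"tag": "demo"}, f)

    model = ArtifactEchoModel()
    model.initialize({"config": config_path})

response = model.infer({"lat": -6.2335, "lon": 106.8022})
print(response)
```

Loading the artifact in `initialize` rather than in `infer` keeps the per-request path free of file I/O.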

In the following example we create a PyFunc model called `StandardModel`.
This model simply echoes the request body back to its sender and logs the request, including the `feast_features` field that the standard transformer populates.

In [None]:
import logging
from merlin.model import PyFuncModel

class StandardModel(PyFuncModel):
    def initialize(self, artifacts):
        # The echo model needs no artifacts, so there is nothing to load.
        pass

    def infer(self, request, **kwargs):
        # Log the incoming request, then return it unchanged.
        logging.info(request)
        return request

Now, let's test it locally.

In [None]:
m = StandardModel()
m.initialize({})
m.infer(
    {
        "lat": -6.2335,
        "lon": 106.8022,
        "details": "{\"merchant_id\": 542958066}"
    }
)
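Note that `details` in the request above is a JSON document encoded as a string, not a nested object. A minimal sketch of how a consumer could decode it with the standard library follows. The variable names are illustrative only.

```python
import json

request = {
    "lat": -6.2335,
    "lon": 106.8022,
    "details": "{\"merchant_id\": 542958066}",
}

# "details" is a string, so it has to be parsed before its fields can be read.
details = json.loads(request["details"])
merchant_id = details["merchant_id"]
print(merchant_id)
```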

Next, test whether the model accepts feature enrichment from the standard transformer. The `feast_features` JSON field is populated by the standard transformer, and each table in it follows the `pandas.DataFrame` `split` orientation (a `columns` list plus a `data` list of rows).

In [None]:
m.infer(
    {
        "lat": -6.2335,
        "lon": 106.8022,
        "details": "{\"merchant_id\": 542958066}",
        "feast_features": {
            "location_geohash": {
                "columns": [
                    "location_geohash",
                    "poi_geohash:total",
                    "poi_geohash:ppoi"
                ],
                "data": [["qqguw34zpxkkh", 3, 15]]
            },
            "merchant_id": {
                "columns": [
                    "merchant_id",
                    "merchant_ratings:average_rating",
                    "merchant_ratings:total_ratings"
                ],
                "data": [["542958066", 4, 4]]
            },
        }
    }
)
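Inside a real model, the `split`-oriented tables in `feast_features` usually need to be turned back into rows before use. The following is a minimal sketch that does this with plain Python. The helper name `split_to_records` is hypothetical and not part of Merlin.

```python
feast_features = {
    "location_geohash": {
        "columns": ["location_geohash", "poi_geohash:total", "poi_geohash:ppoi"],
        "data": [["qqguw34zpxkkh", 3, 15]],
    },
    "merchant_id": {
        "columns": [
            "merchant_id",
            "merchant_ratings:average_rating",
            "merchant_ratings:total_ratings",
        ],
        "data": [["542958066", 4, 4]],
    },
}


def split_to_records(table):
    """Turn a split-orientation table into one dict per row."""
    return [dict(zip(table["columns"], row)) for row in table["data"]]


# One list of row dicts per feature table, keyed by table name.
records = {name: split_to_records(table) for name, table in feast_features.items()}
print(records["merchant_id"][0]["merchant_ratings:average_rating"])
```

With pandas installed, `pd.DataFrame(table["data"], columns=table["columns"])` builds the equivalent DataFrame for each table.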

## 3. Deploy Model

To deploy the model, we have to create an iteration of the model (by creating a `model_version`), upload the serialized model to MLP, and then deploy it.

### 3.1 Create Model Version and Upload

`merlin.new_model_version()` is a convenient way to create a model version and start its development process. It is equivalent to the following code:

```
v = model.new_model_version()
v.start()
v.log_pyfunc_model(model_instance=EnsembleModel(), 
                conda_env="env.yaml", 
                artifacts={"xgb_model": model_1_path, "sklearn_model": model_2_path})
v.finish()
```

To upload a PyFunc model, provide the following arguments:
1. `model_instance` is the instance of the PyFunc model. The model has to extend `merlin.PyFuncModel`.
2. `conda_env` is the path to the conda environment YAML file. The file must contain all dependencies required by the PyFunc model.
3. (Optional) `artifacts` is a dictionary of additional artifacts that you want to include in the model.
4. (Optional) `code_dir` is a list of directories containing Python code that will be loaded during model initialization. This is required when `model_instance` depends on a local Python package.

In [None]:
with merlin.new_model_version(labels={"service_type": "GO-RIDE", "date": "2021-06-23"}) as v:    
    merlin.log_pyfunc_model(model_instance=StandardModel(),
                            conda_env="env.yaml",
                            artifacts={})

### 3.2 Deploy Model and Transformer

To deploy a model and its transformer, pass a `transformer` object to the `deploy()` function. Each deployed model version has its own generated URL. The `transformer` object is initialized from a YAML config file.

In [None]:
!cat "config.yaml"

In [None]:
from merlin.transformer import StandardTransformer
from merlin.logger import Logger, LoggerConfig, LoggerMode

# Create a standard transformer from its YAML config file
transformer_config_path = "config.yaml"
transformer = StandardTransformer(config_file=transformer_config_path,
                                  enabled=True)

# Enable model logging in ALL mode, then deploy the version with its transformer
log = Logger(model=LoggerConfig(enabled=True, mode=LoggerMode.ALL))
endpoint = merlin.deploy(v, transformer=transformer, logger=log)

### 3.3 Send Test Request

In [None]:
import json
import requests

# Load the sample request body and send it to the deployed endpoint
with open("request.json", "r") as f:
    req = json.load(f)

resp = requests.post(endpoint.url, json=req)
pretty_json = json.loads(resp.text)
print(json.dumps(pretty_json, indent=2))

## 4. Clean Up

### 4.1 Delete Deployment

In [None]:
merlin.undeploy(v)