# Deploy

This notebook deploys our trained models as serving endpoints and defines SQL functions that query them.

## ML Model Deployment
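We use MLflow's deployment client to simplify endpoint deployment. The cell below creates an endpoint named `shm_3w_lightgbm` that serves version 2 of the registered model `shm.3w.lightgbm` and routes all traffic to it.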

In [0]:
from mlflow.deployments import get_deploy_client

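# Client that targets Databricks Model Serving
client = get_deploy_client("databricks")
# Create the endpoint; the served entity name must match the traffic route below
endpoint = client.create_endpoint(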
    name="shm_3w_lightgbm",
    config={
        "served_entities": [
            {
                "name": "lightgbm", 
                "entity_name": "shm.3w.lightgbm",
                "entity_version": "2",
                "workload_size": "Small",
                "scale_to_zero_enabled": True
            }
        ],
        "traffic_config": {
            "routes": [
                {
                    "served_model_name": "lightgbm",
                    "traffic_percentage": 100
                }
            ]
        }
    }
)
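
Before calling `create_endpoint`, it can help to sanity-check the config locally. The sketch below is not part of the MLflow API: `validate_endpoint_config` is a hypothetical helper, and it assumes the usual serving constraints that every route names a served entity from the same config and that traffic percentages sum to 100.

```python
def validate_endpoint_config(config: dict) -> None:
    """Raise ValueError if the traffic routes are inconsistent with the served entities."""
    entity_names = {entity["name"] for entity in config["served_entities"]}
    routes = config["traffic_config"]["routes"]

    # Every route must point at a served entity declared in the same config.
    unknown = [r["served_model_name"] for r in routes
               if r["served_model_name"] not in entity_names]
    if unknown:
        raise ValueError(f"Routes reference unknown served entities: {unknown}")

    # Assumption: percentages across all routes must add up to exactly 100.
    total = sum(r["traffic_percentage"] for r in routes)
    if total != 100:
        raise ValueError(f"Traffic percentages sum to {total}, expected 100")


# Same config as in the deployment cell above.
config = {
    "served_entities": [
        {
            "name": "lightgbm",
            "entity_name": "shm.3w.lightgbm",
            "entity_version": "2",
            "workload_size": "Small",
            "scale_to_zero_enabled": True,
        }
    ],
    "traffic_config": {
        "routes": [{"served_model_name": "lightgbm", "traffic_percentage": 100}]
    },
}

validate_endpoint_config(config)  # passes silently for this config
```

A check like this catches a mistyped served entity name before the request reaches the workspace.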

## Tools

We could use something like Genie to talk with our data, but when the data structures are simple, a more lightweight approach makes sense. This is a simple example of exposing a SQL query as a tool.

Let's declare a parameterized function that queries the model for a specific well and a given number of recent observations. It builds on a helper function, `latest_n_obs`, which returns the most recent sensor readings for a well.

Many more tools for getting information about our wells are defined as SQL queries in ../fixtures

In [0]:
%sql
CREATE OR REPLACE FUNCTION shm.3w.latest_n_obs(
  well_number_param BIGINT DEFAULT -1 COMMENT "Well number between 1 and 50, must be an integer",
  n_obs_param INT DEFAULT 5 COMMENT "Number of observations to return")
RETURNS TABLE (
  well_number DOUBLE, 
  timestamp TIMESTAMP,
  `T-JUS-CKP` DOUBLE,
  `T-TPT` DOUBLE,
  `P-TPT` DOUBLE,
  `P-MON-CKP` DOUBLE,
  `P-PDG` DOUBLE,
  `QGL` DOUBLE
  )
COMMENT "Gives the latest n observations for the well. If well_number_param is -1, returns the latest n observations for each well, drawn from the 1000 most recent rows overall."
RETURN
SELECT well_number, timestamp, `T-JUS-CKP`, `T-TPT`, `P-TPT`, `P-MON-CKP`, `P-PDG`, `QGL`
FROM (
  SELECT 
    well_number,
    timestamp,
    `T-JUS-CKP`,
    `T-TPT`,
    `P-TPT`,
    `P-MON-CKP`,
    `P-PDG`,
    `QGL`,
    ROW_NUMBER() OVER (PARTITION BY well_number ORDER BY timestamp DESC) as row_num
  FROM shm.3w.well_data
  WHERE (well_number_param = -1 OR well_number = well_number_param)
  ORDER BY timestamp DESC
  LIMIT 1000 -- Large constant cap: only the 1000 most recent rows overall reach the row_num filter
) subquery
WHERE row_num <= n_obs_param
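
The latest-n-per-well pattern used above (`ROW_NUMBER()` partitioned by well, then a filter on the row number) can be tried locally. The following is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in for Databricks SQL; the table contents, the single `p_tpt` column, and the Python wrapper are made up for illustration, and the `LIMIT 1000` cap is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE well_data (well_number INTEGER, timestamp TEXT, p_tpt REAL)")
conn.executemany(
    "INSERT INTO well_data VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 00:00:00", 10.0),
        (1, "2024-01-01 00:01:00", 11.0),
        (1, "2024-01-01 00:02:00", 12.0),
        (2, "2024-01-01 00:00:00", 20.0),
        (2, "2024-01-01 00:01:00", 21.0),
    ],
)

# Rank rows within each well from newest to oldest, then keep the first n ranks.
QUERY = """
SELECT well_number, timestamp, p_tpt
FROM (
  SELECT well_number, timestamp, p_tpt,
         ROW_NUMBER() OVER (PARTITION BY well_number ORDER BY timestamp DESC) AS row_num
  FROM well_data
  WHERE (:well = -1 OR well_number = :well)
) AS ranked
WHERE row_num <= :n
ORDER BY well_number, timestamp DESC
"""


def latest_n_obs(well: int = -1, n: int = 5):
    """Return the n newest rows per well; well=-1 selects every well."""
    return conn.execute(QUERY, {"well": well, "n": n}).fetchall()


all_wells = latest_n_obs(well=-1, n=2)  # two newest rows for each of the two wells
one_well = latest_n_obs(well=2, n=1)    # single newest row for well 2
print(all_wells)
print(one_well)
```

Because the row number restarts for every well, passing `-1` yields up to `n` rows per well rather than `n` rows in total.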

In [0]:
%sql
CREATE OR REPLACE FUNCTION shm.3w.predict_state(
    well_number_param INT DEFAULT -1 COMMENT "Well number between 1 and 50, must be integer. Select -1 to get all wells", 
    n_obs_param INT DEFAULT 10 COMMENT "Number of observations to predict"
)
RETURNS TABLE (
    well_number INTEGER,
    timestamp TIMESTAMP,
    prediction INTEGER 
)
COMMENT "Generates hydrate predictions for the latest sensor readings"
RETURN
SELECT 
    well_number,
    timestamp,
    AI_QUERY(
        'shm_3w_lightgbm',
        request => NAMED_STRUCT(
            'P-PDG', `latest_n_obs`.`P-PDG`,
            'P-TPT', `latest_n_obs`.`P-TPT`,
            'T-TPT', `latest_n_obs`.`T-TPT`,
            'P-MON-CKP', `latest_n_obs`.`P-MON-CKP`,
            'T-JUS-CKP', `latest_n_obs`.`T-JUS-CKP`,
            'QGL', `latest_n_obs`.`QGL`,
            'well_number', `latest_n_obs`.`well_number`
        ),
        returnType => 'DOUBLE'
    ) AS prediction
FROM shm.3w.latest_n_obs(well_number_param, n_obs_param)
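
Once the function exists, it can be called like any table function. The sketch below shows one way to build the call from Python with basic argument checks. `build_predict_state_query` is a hypothetical helper, not something defined elsewhere in this project, and the 1 to 50 well range is taken from the parameter comment above.

```python
def build_predict_state_query(well_number: int = -1, n_obs: int = 10) -> str:
    """Return a SQL statement that calls shm.3w.predict_state with integer arguments."""
    # bool is a subclass of int in Python, so reject it explicitly.
    for name, value in (("well_number", well_number), ("n_obs", n_obs)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")
    if well_number != -1 and not 1 <= well_number <= 50:
        raise ValueError("well_number must be -1 (all wells) or between 1 and 50")
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    # Only validated integers are interpolated into the statement.
    return f"SELECT * FROM shm.3w.predict_state({well_number}, {n_obs})"


query = build_predict_state_query(well_number=3, n_obs=5)
print(query)
```

In a Databricks notebook, where `spark` is predefined, the resulting string could be run with `spark.sql(query)`.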