# MLOps with `vetiver`

## Build a model
Data scientists can still use the tools they are most comfortable with for the bulk of their workflow.

In [1]:
import pandas as pd
import numpy as np
from sklearn import model_selection, preprocessing, pipeline
from sklearn.ensemble import RandomForestRegressor
import rsconnect
import vetiver
from vetiver import vetiver_pin_write, vetiver_endpoint

import os
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

api_key = os.getenv("CONNECT_API_KEY")
rsc_url = os.getenv("CONNECT_SERVER")
np.random.seed(500)
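
`os.getenv` returns `None` when a variable is missing, so a misconfigured `.env` file only surfaces later as a confusing connection error. A minimal sketch of failing fast instead; `require_env` is a hypothetical helper written for this example, not part of vetiver or rsconnect:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file"
        )
    return value


# Usage, after load_dotenv(find_dotenv()) has run:
# api_key = require_env("CONNECT_API_KEY")
# rsc_url = require_env("CONNECT_SERVER")
```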

We can read in our data, and fit a pipeline that has both the preprocessing steps and the model.

In [2]:
# pd.read_csv already returns a DataFrame, so no extra conversion is needed
df = pd.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-03-02/youtube.csv')

In [3]:
# Keep the target plus six ad-attribute columns; drop rows with missing values
df = df[["like_count", "funny", "show_product_quickly", "patriotic",
    "celebrity", "danger", "animals"]].dropna()
X, y = df.iloc[:, 1:], df["like_count"]
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2)

# Encode the features as ordinal values, then fit the forest on the encoded training set
le = preprocessing.OrdinalEncoder().fit(X)
rf = RandomForestRegressor().fit(le.transform(X_train), y_train)

In [4]:
rf_pipe = pipeline.Pipeline([('label_encoder',le), ('random_forest', rf)])

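The cells above fit the encoder and the forest separately and then wrap the fitted steps in a `Pipeline`. An alternative is to let the pipeline fit both steps itself, so the encoder only ever sees training rows. The sketch below uses randomly generated toy data (an assumption, standing in for the Super Bowl ads data) so that it runs without a download:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Toy stand-in for the ads data: six boolean features and a numeric target.
rng = np.random.default_rng(500)
cols = ["funny", "show_product_quickly", "patriotic", "celebrity", "danger", "animals"]
toy_X = pd.DataFrame(rng.integers(0, 2, size=(200, len(cols))).astype(bool), columns=cols)
toy_y = pd.Series(rng.integers(0, 10_000, size=200), name="like_count")

X_tr, X_te, y_tr, y_te = train_test_split(toy_X, toy_y, test_size=0.2, random_state=500)

# Unfitted steps: calling .fit() on the pipeline fits the encoder, then the forest,
# using the training rows only.
toy_pipe = Pipeline([
    ("ordinal_encoder", OrdinalEncoder()),
    ("random_forest", RandomForestRegressor(n_estimators=50, random_state=500)),
])
toy_pipe.fit(X_tr, y_tr)

# The fitted pipeline accepts raw (unencoded) rows at prediction time.
toy_preds = toy_pipe.predict(X_te)
print(toy_preds.shape)
```

Either way, the result is a single object that takes raw feature rows and returns predictions, which is the shape of object the next section wraps for deployment.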
## Version a model

Users first create a deployable model object, `VetiverModel()`. This holds all the pieces necessary to deploy the model later.

*In R, you saw the equivalent, `vetiver_model()`.*

In [6]:
v = vetiver.VetiverModel(
    rf_pipe, 
    prototype_data=X_train, 
    model_name = "isabel.zimmerman/superbowl_rf"
)

In [7]:
import pins 
board = pins.board_connect(allow_pickle_read=True)

vetiver_pin_write(board, v)

Model Cards provide a framework for transparent, responsible reporting. 
 Use the vetiver `.qmd` Quarto template as a place to start, 
 with vetiver.model_card()
Writing pin:
Name: 'isabel.zimmerman/superbowl_rf'
Version: 20240116T191309Z-30745


In [16]:
vetiver.model_card()

'./model_card.qmd'

## Deploy a model
Next, initialize the API endpoint with `VetiverAPI()`. To run the API locally, use `.run()`.

*In R, you saw the equivalents, `vetiver_api()` and `pr_run()`.*

In [7]:
app = vetiver.VetiverAPI(v)
app.run()

INFO:     Started server process [32750]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [32750]


Running locally is a great way to debug the API, but the end goal is *not* to run the model on a personal machine. We can instead deploy to a remote server, such as RStudio Connect (now Posit Connect). This involves setting up a connection to the server and then deploying our pinned model to it.

The deployed model is tied to the specific version we just pinned above. Note: this model is already deployed, so there is no need to run this chunk again unless we want to update the model.

In [8]:
connect_server = rsconnect.api.RSConnectServer(url = rsc_url, api_key = api_key)

# vetiver.deploy_rsconnect(
#     connect_server = connect_server, 
#     board = board, 
#     pin_name = "isabel.zimmerman/superbowl_rf"
#     )

Consider creating a requirements.txt file instead.

Do you need to check your pinned model?
Using version 88405
Connect detected CLI commands and/or environment variables that overlap with stored credential.
Check your environment variables (e.g. CONNECT_API_KEY) to make sure you want them to be used.
Credential paremeters are taken with the following precedence: stored > CLI > environment.
To ignore an environment variable, override it in the CLI with an empty string (e.g. -k '').
Validating server... 	[OK]
Validating app mode... 	[OK]
Making bundle ... 	[OK]
Deploying bundle ... 	[OK]
Saving deployed information... 	[OK]
Building FastAPI application...
Bundle created with Python version 3.10.7 is compatible with environment Kubernetes::ghcr.io/rstudio/content-pro:r4.1.3-py3.10.11-ubuntu2204 with Python version 3.10.11 from /opt/python/3.10.11/bin/pyt

With the model deployed, we can interact with the API endpoint as if it were a model in memory.

In [15]:
connect_endpoint = vetiver_endpoint("https://colorado.posit.co/rsc/superbowl_vetiver_python/predict")

response = vetiver.predict(data = X_test.head(5), endpoint = connect_endpoint)
response

| | predict |
|---|---|
| 0 | 452.581548 |
| 1 | 15054.536775 |
| 2 | 8830.437135 |
| 3 | 9872.934486 |
| 4 | 181.150403 |
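
Under the hood this is an ordinary HTTP request, so the endpoint can also be called without vetiver installed on the client. The sketch below assumes the `/predict` route accepts a JSON array of records and answers with a JSON object like `{"predict": [...]}`; the helper names, the toy rows, and the local URL are placeholders for illustration. On Connect, an API key is passed in the `Authorization` header.

```python
import json
import urllib.request

import pandas as pd


def build_predict_request(url, data, api_key=None):
    """Build (but do not send) a POST request carrying `data` as JSON records."""
    body = data.to_json(orient="records").encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Connect expects API keys in the form "Key <value>".
        headers["Authorization"] = f"Key {api_key}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send_predict_request(request):
    """Send the request and return the predictions as a DataFrame (needs a live endpoint)."""
    with urllib.request.urlopen(request) as resp:
        return pd.DataFrame(json.load(resp))


# Toy rows standing in for X_test.head(5); nothing is sent over the network here.
toy_rows = pd.DataFrame({"funny": [True, False], "animals": [False, True]})
req = build_predict_request("http://127.0.0.1:8000/predict", toy_rows)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `send_predict_request(req)` against a running API would return the predictions; it is left out of the example so the block runs offline.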


## Monitoring

In both R and Python, vetiver offers the helper functions `compute_metrics`, `plot_metrics`, and `pin_metrics`, along with a Quarto template for a monitoring dashboard.

In [17]:
vetiver.monitoring_dashboard()

'./monitoring_dashboard.qmd'
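
The idea behind monitoring is to compare predictions against observed outcomes over successive time windows and watch for drift. The pandas-only sketch below illustrates that idea with made-up dates and values; `mae_by_period` is a hypothetical helper, not vetiver's implementation, which additionally handles multiple metrics, pinning, and plotting.

```python
import pandas as pd

# Toy monitoring data: hypothetical dates, observed like counts, and predictions.
toy_monitor = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-08", "2024-01-10"]),
    "like_count": [100.0, 200.0, 300.0, 400.0],
    "predict": [110.0, 180.0, 330.0, 360.0],
})


def mae_by_period(data, date_var, truth, estimate, freq="7D"):
    """Mean absolute error per time window, plus the number of rows in each window."""
    errors = (data[truth] - data[estimate]).abs()
    errors.index = pd.DatetimeIndex(data[date_var])
    out = errors.resample(freq).agg(["count", "mean"])
    return out.rename(columns={"count": "n", "mean": "mae"})


weekly = mae_by_period(toy_monitor, "date", "like_count", "predict")
print(weekly)
```

A rising error from one window to the next is the kind of signal a monitoring dashboard is meant to make visible.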