# Model Inferencing Endpoint
This notebook walks through creating a simple machine learning model, registering it with 
the Model Registry, and creating a publicly accessible model inferencing endpoint. It then
demonstrates how to get programmatic access to the model inferencing endpoint from outside
of Snowflake using a Programmatic Access Token (PAT).

## 1. Setup Snowflake
Before we proceed, go to the "Packages" pull-down, enter `scikit-learn` in the "Find Packages" textbox, and select `scikit-learn`. Do the same with `snowflake-ml-python`. Then click "Save", which will also restart the Notebook.

First we create a database, schema, and role for use in this example.

In [None]:
USE ROLE accountadmin;
CREATE ROLE IF NOT EXISTS ml_role;
GRANT ROLE ml_role TO ROLE ACCOUNTADMIN;
CREATE DATABASE IF NOT EXISTS api;
CREATE SCHEMA IF NOT EXISTS api.ml;
GRANT ALL ON DATABASE api TO ROLE ml_role;
GRANT ALL ON SCHEMA api.ml TO ROLE ml_role;

Next, let's create a compute pool for our service, and grant usage permissions to our `ML_ROLE` role. 
We also grant the `ML_ROLE` role the permission to create services with public endpoints.

In [None]:
USE ROLE accountadmin;
CREATE COMPUTE POOL IF NOT EXISTS pool_api 
    MIN_NODES = 1
    MAX_NODES = 1
    INSTANCE_FAMILY = CPU_X64_XS;
GRANT ALL ON COMPUTE POOL pool_api TO ROLE ml_role;
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO ROLE ml_role;

Since we are going to be using Snowpark Container Services to host the inferencing endpoint, we
will need an `IMAGE REPOSITORY` to store the model image. We create that using the `ML_ROLE` role.

In [None]:
USE ROLE ml_role;
CREATE IMAGE REPOSITORY IF NOT EXISTS api.ml.repo_ml;

## 2. Create the Model

Now we turn our attention to the actual machine learning model. 

For illustrative purposes, we create a simple linear regression model based on the diabetes
data set included in scikit-learn. The example is adapted from the [scikit-learn ordinary least squares example](https://scikit-learn.org/1.5/auto_examples/linear_model/plot_ols.html).

In [None]:
import numpy as np

from sklearn import datasets, linear_model

# Load the diabetes dataset
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Use only one feature
diabetes_X = diabetes_X[:, np.newaxis, 2]

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes_y[:-20]
diabetes_y_test = diabetes_y[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

Now that we have created our `regr` linear regression model, let's just test it by calling the `predict()` function directly.

In [None]:
regr.predict([[0.0779]])

## 3. Register the Model

Now we can turn our attention to registering our scikit-learn model in the Snowflake Model Registry.

First, we create a Snowpark Session.

In [None]:
from snowflake.snowpark.context import get_active_session
session = get_active_session()

Next, we create a Snowflake ML Registry object using the Snowpark Session. We provide the database, `API`, and the schema `ML`.

In [None]:
from snowflake.ml.registry import Registry

session.use_schema('API.ML')
session.use_role('ML_ROLE')
reg = Registry(session=session, database_name="API", schema_name="ML")

Next, we register the `regr` model with the Model Registry. We provide a name for the model (`linreg_diabetes`), a version name (`v1`), and an optional comment. We need to list the Anaconda dependencies for this model (in our case, we just depend on the `scikit-learn` package). We also provide some sample input data so that the schema of the data can be inferred. Lastly, we provide some options to limit warnings.

We then show the models in the Model Registry.

In [None]:
mv = reg.log_model(regr,
                   model_name="linreg_diabetes",
                   version_name="v1",
                   conda_dependencies=["scikit-learn"],
                   comment="Diabetes Linear Regression",
                   options={"relax_version": True},
                   sample_input_data=diabetes_X_test)
reg.show_models()

Now that we have the model registered, we will create a `SERVICE` in Snowpark Container Services (SPCS) to host the inferencing endpoint. We provide a service name (`linreg_diabetes_svc`), a compute pool to use (`pool_api`, which we created earlier), and an image repository to hold the image (`repo_ml`, which we created earlier). Lastly, we indicate that the service should expose the model inferencing endpoint publicly.

In [None]:
# Deploy the model to SPCS
mv.create_service(
    service_name="linreg_diabetes_svc",
    service_compute_pool="pool_api",
    image_repo="API.ML.REPO_ML",
    ingress_enabled=True)


## Accessing the Model Inferencing Endpoint

We want to set up a separate user and role to access the model inferencing endpoint, as opposed to using the role that created the service.

First, we create a new `ML_SCORING_ROLE` role and grant it access to the `API` database and `ML` schema.

In [None]:
USE ROLE ACCOUNTADMIN;
CREATE ROLE IF NOT EXISTS ml_scoring_role;
GRANT ROLE ml_scoring_role TO ROLE accountadmin;
GRANT USAGE ON DATABASE api TO ROLE ml_scoring_role;
GRANT USAGE ON SCHEMA api.ml TO ROLE ml_scoring_role;

Next, we create a user that we can use externally to access the endpoint. This user (`ML_SCORING_USER`) is granted the `ML_SCORING_ROLE` role.

In [None]:
USE ROLE ACCOUNTADMIN;
CREATE USER IF NOT EXISTS ml_scoring_user PASSWORD='User123' DEFAULT_ROLE = ml_scoring_role
    DEFAULT_SECONDARY_ROLES = ('ALL') MUST_CHANGE_PASSWORD = FALSE;
GRANT ROLE ml_scoring_role TO USER ml_scoring_user;

Next, we create a Programmatic Access Token (PAT) that we can use to programmatically access the model inferencing endpoint from outside of Snowflake. 

In order to create a PAT, the user must have a network policy, so we create a network policy that allows access from any source IP address. In practice, this network policy should be set as narrowly as possible. Then, we assign that network policy to our user.

Then, we create a PAT for the `ML_SCORING_USER` user. We will need this token to access from outside Snowflake.

In [None]:
USE ROLE ACCOUNTADMIN;
CREATE NETWORK POLICY IF NOT EXISTS api_np ALLOWED_IP_LIST = ('0.0.0.0/0');
ALTER USER ml_scoring_user SET NETWORK_POLICY = api_np;
ALTER USER IF EXISTS ml_scoring_user ADD PROGRAMMATIC ACCESS TOKEN ml_scoring_token;
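
To make the advice about narrow network policies concrete, here is a small sketch using Python's standard `ipaddress` module. It compares the `0.0.0.0/0` range used above with a narrower range. The `203.0.113.0/24` range and the client address are hypothetical placeholders taken from documentation-reserved address space; substitute the ranges your own clients connect from.

```python
import ipaddress

# The policy used in this notebook: every IPv4 address is allowed.
open_range = ipaddress.ip_network("0.0.0.0/0")

# Hypothetical narrower range (documentation-reserved address space).
narrow_range = ipaddress.ip_network("203.0.113.0/24")

# Hypothetical client address that is outside the narrow range.
client = ipaddress.ip_address("198.51.100.7")

open_size = open_range.num_addresses
narrow_size = narrow_range.num_addresses
allowed_by_open = client in open_range
allowed_by_narrow = client in narrow_range

print(f"{open_range} covers {open_size} addresses; client allowed: {allowed_by_open}")
print(f"{narrow_range} covers {narrow_size} addresses; client allowed: {allowed_by_narrow}")
```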

We now grant access to the public endpoint to the `ML_SCORING_ROLE`.

In [None]:
GRANT SERVICE ROLE api.ml.linreg_diabetes_svc!all_endpoints_usage TO ROLE ml_scoring_role;

Lastly, we need the actual hostname for the endpoint.

In [None]:
SHOW ENDPOINTS IN SERVICE api.ml.linreg_diabetes_svc;

## Access the Endpoint Programmatically

To access an endpoint in SPCS, we exchange the PAT for a short-lived access token using a Snowflake endpoint. Then we can use that access token to access the endpoint in SPCS. When the access token expires, we can re-exchange the PAT for a new access token. 

The [ex_spcs_token GitHub repo](https://github.com/sfc-gh-bhess/ex_spcs_token) provides an example implementation of this pattern that we can use. [This blog post](https://medium.com/snowflake/programmatic-access-to-snowpark-container-services-b49ef65a7694) also walks through the pattern in more detail.
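
The exchange-and-refresh pattern described above can be sketched as a small token cache. This is a minimal illustration, not the code from the GitHub repo: the `CachedToken` class, its `margin` parameter, and the `exchange` callable are all hypothetical names. The `exchange` callable stands in for the real PAT-for-access-token request and is assumed to return the token together with its lifetime in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class CachedToken:
    """Hold a short-lived access token and re-exchange it shortly before expiry."""

    def __init__(self,
                 exchange: Callable[[], Tuple[str, float]],
                 margin: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._exchange = exchange      # performs the PAT -> access token exchange
        self._margin = margin          # refresh this many seconds before expiry
        self._clock = clock            # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._exchange()
            self._token = token
            self._expires_at = now + lifetime
        return self._token


# Demonstration with a fake exchange function and a fake clock.
calls = []

def fake_exchange() -> Tuple[str, float]:
    calls.append(1)
    return f"token-{len(calls)}", 60.0   # pretend each token lives 60 seconds

fake_now = [0.0]
cache = CachedToken(fake_exchange, margin=10.0, clock=lambda: fake_now[0])

first = cache.get()       # no token yet, so the exchange runs
fake_now[0] = 20.0
second = cache.get()      # still fresh, served from the cache
fake_now[0] = 55.0
third = cache.get()       # inside the refresh margin, so the exchange runs again
```

Refreshing slightly before the expiry time avoids sending a request with a token that expires while the request is in flight.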

### Making Requests
The model scoring endpoint hosts a path for the scoring function of the model. For example, the `LinearRegression` model in this example has a `predict()` function, which is accessible at the `/predict` path on the ingress endpoint.

The format for sending in an exemplar to be scored and the format of the response are the same as [Snowflake's External Function format](https://docs.snowflake.com/en/sql-reference/external-functions-data-format.html#label-external-functions-data-format). In each call you can submit multiple exemplars and get an inference for each exemplar.

Specifically, the input payload is a JSON object with one field named `data` that is an array-of-arrays. Each array in the array-of-arrays is a record where the first element is the index (0-counted) of the exemplar and the subsequent elements are the input arguments to the scoring function. In our case, a request to score 3 exemplars would look like:

```json
{
    "data": [
        [0, 0.070],
        [1, 0.071],
        [2, 0.072]
    ]
}
```
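If your features start out as plain rows, the request envelope can be built with a small helper. This is a sketch only; `build_payload` is a hypothetical name, not part of any Snowflake library.

```python
import json
from typing import Iterable, Sequence


def build_payload(rows: Iterable[Sequence[float]]) -> dict:
    """Wrap feature rows as {"data": [[index, *features], ...]} with 0-counted indexes."""
    return {"data": [[index, *row] for index, row in enumerate(rows)]}


# Three exemplars with one feature each, matching the request shown above.
payload = build_payload([[0.070], [0.071], [0.072]])
print(json.dumps(payload, indent=4))
```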

The response is another JSON object with one field named `data` that is an array-of-arrays. Each array in the array-of-arrays is a record where the first element in the array is the index (0-counted) of the input exemplar followed by a JSON object with a key like `output_feature_0` and a value of the score. 

For the request above, the response would be:

```json
{
    "data": [
        [0,{"output_feature_0":218.59551211375583}],
        [1,{"output_feature_0":219.53374997500717}],
        [2,{"output_feature_0":220.4719878362585}]
    ]
}
```
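To turn a response like this into a plain list of scores, a helper along the following lines can be used. This is a sketch; `parse_scores` is a hypothetical name, and it assumes the output key is `output_feature_0` as in the response above. It sorts by the echoed index so that the scores line up with the order of the submitted exemplars.

```python
from typing import Any, Dict, List


def parse_scores(response: Dict[str, Any], key: str = "output_feature_0") -> List[float]:
    """Return the scores ordered by the row index echoed back in the response."""
    records = sorted(response["data"], key=lambda record: record[0])
    return [record[1][key] for record in records]


# The response shown above, as a Python dict (for example, from resp.json()).
example_response = {
    "data": [
        [0, {"output_feature_0": 218.59551211375583}],
        [1, {"output_feature_0": 219.53374997500717}],
        [2, {"output_feature_0": 220.4719878362585}],
    ]
}

scores = parse_scores(example_response)
print(scores)
```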

## Example
To make this easy, there are some helper classes in [this GitHub repo](https://github.com/sfc-gh-bhess/ex_spcs_token), as well as a command-line tool. The `PATGenerator` class is the one we will focus on for this example.

You can install the `snowkey` package from that GitHub repo using:
```bash
pip install git+https://github.com/sfc-gh-bhess/ex_spcs_token.git
```

or 
```bash
pipenv install git+https://github.com/sfc-gh-bhess/ex_spcs_token.git#egg=snowkey
```

You can test the command-line tool against our model inferencing endpoint. Execute the following cell to generate the specific command line for your account, then run that command from a terminal outside Snowflake:

In [None]:
import streamlit as st
from snowflake.snowpark.context import get_active_session
session = get_active_session()
identifier = session.sql("SELECT CURRENT_ORGANIZATION_NAME() || '-' || CURRENT_ACCOUNT_NAME() AS identifier").collect()[0]['IDENTIFIER']
account_url = f"{identifier}.snowflakecomputing.com"
pat = cell12.to_pandas().iloc[0].to_dict()['token_secret']
endpoint = cell13.to_pandas().iloc[0].to_dict()['ingress_url']
role = "ML_SCORING_ROLE"
scoring_endpoint = f"https://{endpoint}/predict"

st.markdown(f"""
```bash
python -m snowkey.spcs_request --account_url '{account_url}' \\
   --pat '{pat}' \\
   --role '{role}' \\
   --url '{scoring_endpoint}' \\
   --method 'POST' \\
   --data '{{"data": [[0, 0.070], [1, 0.071], [2, 0.072]]}}'
```
""")

Now we show an example of using the Python classes programmatically so you can incorporate them into your code.

Run the following cell to see the code you can use to access the endpoint programmatically in Python. It uses output from previous cells and some SQL to get the values needed in the sample code below.

In [None]:
import streamlit as st
from snowflake.snowpark.context import get_active_session
session = get_active_session()
identifier = session.sql("SELECT CURRENT_ORGANIZATION_NAME() || '-' || CURRENT_ACCOUNT_NAME() AS identifier").collect()[0]['IDENTIFIER']
pat = cell12.to_pandas().iloc[0].to_dict()['token_secret']
endpoint = cell13.to_pandas().iloc[0].to_dict()['ingress_url']
role = "ML_SCORING_ROLE"
scoring_endpoint = f"https://{endpoint}/predict"

st.markdown(f"""
```python
import requests
from pat_gen import PATGenerator

account_url = "{identifier}.snowflakecomputing.com"
pat = "{pat}"
endpoint = "https://{endpoint}"
role = "{role}"
scoring_endpoint = "{scoring_endpoint}"

# Set up once at the beginning of your program
gen = PATGenerator(account=account_url,
                    pat=pat, 
                    endpoint=endpoint, 
                    role=role)

# Each call to the endpoint looks like this:
resp = requests.post(url=scoring_endpoint, 
                        headers=gen.authorization_header(), 
                        json={{"data": [[0, 0.070], [1, 0.071], [2, 0.072]]}})

# Do something with the scores
scores = resp.json()
```
""")

## Cleanup
If you are finished with this example, you can now delete the scoring service, the model, the user, and the scoring role.

In [None]:
USE ROLE accountadmin;
ALTER SERVICE api.ml.linreg_diabetes_svc SUSPEND;
DROP SERVICE api.ml.linreg_diabetes_svc;
DROP USER ml_scoring_user;
DROP ROLE ml_scoring_role;
DROP MODEL api.ml.linreg_diabetes;

You can drop the following resources as well. If any of them are being used for other purposes (e.g., other services use the compute pool we created), comment out (or delete) the corresponding lines.

In [None]:
USE ROLE accountadmin;
DROP IMAGE REPOSITORY api.ml.repo_ml;
DROP COMPUTE POOL pool_api;
DROP ROLE ml_role;
DROP SCHEMA api.ml;