The purpose of this notebook is to deploy the microservice behind our image embedding application.  This notebook was developed using a **Databricks ML 11.2** cluster.

## Introduction

With our data prepared, we can now turn our attention to the deployment of a microservice that will take an image and search for similar images using our embedding data. The core components behind such a microservice are the sentence-transformers model and the set of embeddings we wish to search for matches, aka the *corpus_embeddings*.  The microservice will receive an image, convert it to an embedding, perform the search, and then return JSON-formatted results.
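
The receive, embed, search, and return flow described above can be sketched with toy data. This is only an illustration: the two-dimensional vectors stand in for real model embeddings, the helper names `cosine_similarity` and `top_k_matches` are hypothetical, and cosine similarity is assumed as the scoring function (it is the default used by the sentence-transformers search utility employed later in this notebook).

```python
import json
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_matches(query_embedding, corpus_embeddings, k=3):
    """Score every corpus embedding against the query and keep the k best."""
    scored = [
        {'corpus_id': i, 'score': cosine_similarity(query_embedding, emb)}
        for i, emb in enumerate(corpus_embeddings)
    ]
    scored.sort(key=lambda r: r['score'], reverse=True)
    return scored[:k]

# toy stand-ins for the corpus embeddings and an incoming query embedding
toy_corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.1]

# the service would return results like these as a JSON-formatted string
response = json.dumps(top_k_matches(toy_query, toy_corpus, k=2))
print(response)
```

The real service replaces the toy vectors with embeddings produced by the model and attaches recipe details to each match before returning it.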

A large, robust implementation would place the corpus embeddings along with the model behind the REST API and return enough information so that the requesting application could then retrieve metadata and full images from secondary stores.</p>

<img src='https://brysmiwasb.blob.core.windows.net/demos/images/recipes_fullarch.png' width=650>
</p>
But given our dataset is relatively small, we can house all the needed data with our model and embeddings behind the REST API, allowing the API to return a complete set of data with a single call:</p>

<img src='https://brysmiwasb.blob.core.windows.net/demos/images/recipes_simplearch.png' width=600>
</p>
The deployment of these information assets to a REST API can be a complicated task.  However, using MLFlow and Databricks model service, we can greatly simplify this task. A key focus of this notebook will be on how we make use of those components to accomplish this.

In [0]:
%pip install -U sentence-transformers

Python interpreter will be restarted.
Collecting sentence-transformers
  Downloading sentence-transformers-2.2.2.tar.gz (85 kB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.97-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Building wheels for collected packages: sentence-transformers
  Building wheel for sentence-transformers (setup.py): started
  Building wheel for sentence-transformers (setup.py): finished with status 'done'
  Created wheel for sentence-transformers: filename=sentence_transformers-2.2.2-py3-none-any.whl size=125938 sha256=674ddd3773baf795b769e7b5eb4783b27a3fa350cf2c44655bd49baed1fcfe9d
  Stored in directory: /root/.cache/pip/wheels/71/67/06/162a3760c40d74dd40bc855d527008d26341c2b0ecf3e8e11f
Successfully built sentence-transformers
Installing collected packages: sentencepiece, sentence-transformers
Successfully installed sentence-transformers-2.2.2 sentencepiece-0.1.97
Python interpreter will be restarted.


In [0]:
%run "./IE 00: Intro & Config"

In [0]:
import pyspark.pandas as ps
import pandas as pd
import numpy as np

import pickle
from PIL import Image
import base64

from sentence_transformers import SentenceTransformer, util
import torch

import mlflow.pyfunc

import os

import pprint
pp = pprint.PrettyPrinter(indent=4)



## Step 1: Persist Artifacts Required by Microservice

As mentioned in the introduction, our model will persist several artifacts in order to return the expected response.  These include the model with which an embedding will be calculated for a search image, a set of pre-computed embeddings which will be searched for matches, and additional information about the pre-computed embeddings which will allow us to connect more directly with the recipes associated with them.

To persist these items, we will first define a path to which these will be temporarily stored:

In [0]:
# define artifact path
artifact_path = 'dbfs:'+ config['mount path'] + '/deployment'

# build clean folder for artifacts
dbutils.fs.rm(artifact_path, recurse=True)
dbutils.fs.mkdirs(artifact_path)

Out[6]: True

Next, we will save the model to this path using its built-in *save* functionality.  Please note that this functionality requires storage locations in the Databricks file system to be presented as local paths, so the optional Databricks file system prefix of *dbfs:* is replaced with */dbfs*.  This directs file access through the */dbfs* fuse mount to the cloud storage that makes up the file system:

In [0]:
# instantiate model
model = SentenceTransformer('clip-ViT-B-32')

# save model using native functionality
model_path = artifact_path + '/model'
dbutils.fs.mkdirs(model_path) # make sure path exists

model.save(model_path.replace('dbfs:','/dbfs'))

ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.
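
The prefix swap used in the `model.save` call above can be wrapped in a small helper. The function name `to_local_path` is hypothetical, and this sketch assumes paths either start with *dbfs:* or are already local. Unlike a plain string `replace`, it only touches a leading prefix:

```python
def to_local_path(dbfs_path):
    """Map 'dbfs:/...' to '/dbfs/...'; leave any other path unchanged."""
    prefix = 'dbfs:'
    if dbfs_path.startswith(prefix):
        return '/dbfs' + dbfs_path[len(prefix):]
    return dbfs_path

print(to_local_path('dbfs:/mnt/image_embeddings/deployment/model'))
print(to_local_path('/dbfs/mnt/image_embeddings/deployment/model'))
```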


Finally, we can persist our data elements.  Please note that we are saving our searchable embeddings, *i.e.* the corpus embeddings, as a tensor and the additional recipe data associated with those embeddings as a pandas dataframe.  Both are derived from the same pandas dataframe retrieved at the top of the code block to ensure records are captured in the same order.  

These two structures are then incorporated into a dictionary which is then pickled for persistence.  The use of a dictionary makes retrieving these items easier in later steps:

In [0]:
# get rows with valid embeddings
recipes_pd = ps.read_table('recipes').dropna().to_pandas() # convert to traditional pandas

# extract embeddings as a single tensor (as required for search)
corpus_embeddings = torch.tensor(recipes_pd['embeddings'].to_list())

# read images into base64 encoded arrays
def convert_image_to_base64(image_path):
  # load image into base64 encoded array
  with open(image_path, 'rb') as f_in:
    image_encoded = base64.b64encode(f_in.read())
  return image_encoded.decode('utf-8')

recipes_pd['encoded_image'] = recipes_pd['image_path'].apply(convert_image_to_base64)

# get data that will be used for recipe lookups
lookup_info_pd = recipes_pd.drop(['embeddings','cleaned_ingredients','image_path'], axis=1)

# define location where artifacts will be saved
data_path = artifact_path + '/data'
dbutils.fs.mkdirs(data_path) # make sure path exists

# save artifacts as a pickled dictionary so that it will be easier to grab required information later
with open(data_path.replace('dbfs:','/dbfs')+'/data.pkl', 'wb') as f_out:
  pickle.dump({'corpus_embeddings':corpus_embeddings, 'lookup_info_pd':lookup_info_pd}, f_out)
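
The pickled-dictionary pattern used above can be tried out in memory with toy data. In this sketch the embedding values are made up, the lookup records are plain dictionaries rather than a pandas dataframe, and an in-memory buffer stands in for the file on DBFS. The point is that position *i* in the embeddings and position *i* in the lookup data describe the same record after a round trip:

```python
import io
import pickle

# toy stand-ins; positions in the two structures must line up
toy_embeddings = [[0.1, 0.2], [0.3, 0.4]]
toy_lookup = [
    {'id': 5815, 'title': 'Thai Chicken Curry'},
    {'id': 8410, 'title': 'Spiced Pumpkin Phyllo Pie'},
]

# write both structures as one pickled dictionary
buffer = io.BytesIO()
pickle.dump({'corpus_embeddings': toy_embeddings, 'lookup_info': toy_lookup}, buffer)

# read the dictionary back and retrieve items by key
buffer.seek(0)
restored = pickle.load(buffer)

# a search returns a position in the embeddings; the same position finds the record
best_position = 1
print(restored['lookup_info'][best_position]['title'])
```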


We can now examine the persisted items in the file system as follows:

In [0]:
def list_dbfs_dir_contents(dbfs_dir_path, i=1):
  
  # print item path
  print(dbfs_dir_path)
    
  # for each item in path
  for f in dbutils.fs.ls(dbfs_dir_path):
    
    # if is directory:
    if f.size==0 and f.name[-1]=='/':
      # print directory contents
      list_dbfs_dir_contents(f.path, i+1)
    else:
      # print item path
      print(f.path)
    
list_dbfs_dir_contents(artifact_path)

dbfs:/mnt/image_embeddings/deployment
dbfs:/mnt/image_embeddings/deployment/data/
dbfs:/mnt/image_embeddings/deployment/data/data.pkl
dbfs:/mnt/image_embeddings/deployment/model/
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/config.json
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/merges.txt
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/preprocessor_config.json
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/pytorch_model.bin
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/special_tokens_map.json
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/tokenizer.json
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/tokenizer_config.json
dbfs:/mnt/image_embeddings/deployment/model/0_CLIPModel/vocab.json
dbfs:/mnt/image_embeddings/deployment/model/README.md
dbfs:/mnt/image_embeddings/deployment/model/config_sentence_transformers.json
dbfs:/mnt/image_embeddings/deployment/model/mo

## Step 2: Save Model to MLFlow

With our items persisted, we can now focus on saving our model to MLFlow.  Saving a model to MLFlow allows us to more easily deploy the model as a microservice.  But in order to take advantage of this functionality, the model needs to provide a standard *predict* method which will be called when data is passed to that service. The sentence transformer model does not provide such a method but we can write a custom wrapper to map a *predict* method call to the functionality we wish to employ.

For our custom wrapper, we will make use of both a *predict* method and a *load_context* method.  The *load_context* method is called upon service instantiation. It is where we will load persisted artifacts into memory so that these items are ready for use with calls to *predict*. Calls to predict will submit either an encoded image or a text-search string.  Logic in the predict function will determine which is being submitted and trigger the appropriate logic from there.
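
The image-versus-text decision relies on a simple length heuristic: base64-encoded images are long strings, while text searches are short. A minimal sketch of that idea follows. The helper name `classify_search_input` is hypothetical, the 512-character cutoff mirrors the one used in the wrapper in this step, and the byte string is a stand-in for a real image file:

```python
import base64

IMAGE_LENGTH_THRESHOLD = 512  # same cutoff as the wrapper's heuristic

def classify_search_input(search_string):
    """Return 'image' for long strings (assumed base64 image data), else 'text'."""
    return 'image' if len(search_string) > IMAGE_LENGTH_THRESHOLD else 'text'

# stand-in for image bytes; a real JPEG would be read from disk
fake_image_bytes = bytes(range(256)) * 4
encoded = base64.b64encode(fake_image_bytes).decode('utf-8')

print(classify_search_input(encoded))
print(classify_search_input('pumpkin pie'))

# decoding recovers the original bytes exactly
decoded = base64.b64decode(encoded.encode('utf-8'))
```

A very long text query would be misread as an image under this heuristic, so a production service might prefer an explicit input-type field.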

One thing to note is that our predict method accepts its inputs as part of a pandas dataframe. This is part of a built-in mechanism provided by MLFlow which makes it easier for an application to pass multiple arguments per input record as well as multiple input records in a single API call.  Because the data arrives as a pandas dataframe, we will write our search logic as part of an internal function that we will apply to each incoming row of data:

In [0]:
class customWrapper(mlflow.pyfunc.PythonModel):
    
    
  def load_context(self, context):
    
    # instantiate the model
    self.model = SentenceTransformer(context.artifacts['model'])
    
    # instantiate the data assets
    with open(context.artifacts['data']+'/data.pkl', 'rb') as f_in:
      data = pickle.load(f_in)
    self.corpus_embeddings = data['corpus_embeddings']
    self.lookup_info_pd = data['lookup_info_pd']
    
    
  def predict(self, context, inputs_pd):
    
    import io, json
    import base64
    from PIL import Image
    from sentence_transformers import SentenceTransformer, util
    import torch
    
    def search(input):
      
      search_string = input['search']
      if 'k' in input:
        k = int(input['k'])
      else:
        k=3
    
      # convert search element into an embedding
      if len(search_string) > 512 : # if search string is likely an encoded image

        # decode image to bytes
        query_image_decoded = base64.b64decode(search_string.encode('utf-8'))

        # convert image bytes to PIL image
        query_image = Image.open(io.BytesIO(query_image_decoded)).convert('RGB')

        # generate image embedding
        query_embedding = self.model.encode([query_image], convert_to_tensor=True, show_progress_bar=False)

      else: # if search string is likely text
        query_embedding = self.model.encode([search_string], convert_to_tensor=True, show_progress_bar=False)

        
      # perform search on the embedding
      search_results = util.semantic_search(query_embedding, self.corpus_embeddings, top_k=k)[0]
      
      
      # attach lookup info to search results
      results = []
      for r in search_results:

        # retrieve corresponding lookup info record
        lookup_info = self.lookup_info_pd.iloc[r['corpus_id']].to_dict()

        # append info to search record
        r.update(lookup_info)

        results += [r]
      
      return json.dumps(results)
  
    # apply search to each row in inputs
    return inputs_pd.apply(search, axis=1)

Our model has quite a few dependencies.  We can add these to the default environment configuration as follows:

In [0]:
# get standard environment configuration
conda_env = mlflow.pyfunc.get_default_conda_env()

# add model-specific dependencies 
conda_env['dependencies'][-1]['pip'] += ['sentence_transformers','Pillow','torch']

print(conda_env)

{'name': 'mlflow-env', 'channels': ['conda-forge'], 'dependencies': ['python=3.9.5', 'pip<=21.2.4', {'pip': ['mlflow', 'cloudpickle==2.0.0', 'sentence_transformers', 'Pillow', 'torch']}]}
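
One optional refinement, not part of the original notebook, is to pin the added dependencies to the versions installed on the cluster so that the serving container is built with matching libraries. The sketch below uses the standard library's `importlib.metadata`. The helper name `pinned` and the `example_env` dictionary are hypothetical, with the dictionary copied from the default configuration printed above:

```python
from importlib import metadata

def pinned(requirement):
    """Return 'name==version' if the package is installed, else the bare name."""
    try:
        return f"{requirement}=={metadata.version(requirement)}"
    except metadata.PackageNotFoundError:
        return requirement

# copy of the default environment shown above, before adding dependencies
example_env = {
    'name': 'mlflow-env',
    'channels': ['conda-forge'],
    'dependencies': ['python=3.9.5', 'pip<=21.2.4', {'pip': ['mlflow', 'cloudpickle==2.0.0']}],
}

# add model-specific dependencies, pinned where a local version can be found
example_env['dependencies'][-1]['pip'] += [
    pinned(p) for p in ('sentence-transformers', 'Pillow', 'torch')
]

print(example_env['dependencies'][-1]['pip'])
```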


Now we can log our model to MLFlow.  Note that we are passing the path of our persisted artifacts to be stored under an artifact path in MLFlow.  Notice too that we are passing in both the custom wrapper defined above as well as the modified environment configuration.  Finally, we are registering our model using a preferred name which will make it easier to locate in the model registry:

In [0]:
# register the model with mlflow
with mlflow.start_run() as run:
  
  mlflow.pyfunc.log_model(
    artifact_path='root', 
    artifacts={'model':model_path, 'data':data_path},
    python_model=customWrapper(),
    conda_env=conda_env, 
    registered_model_name=config['model name']
    )  
  
# move latest model version to production
client = mlflow.MlflowClient()

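# pick the highest version number explicitly rather than relying on result order
model_version = str(max(
  int(mv.version) for mv in client.search_model_versions(f"name='{config['model name']}'")
  ))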
client.transition_model_version_stage(
  name=config['model name'],
  version=model_version,
  stage='production'
  )      

# set any other production models to archive status
for mv in client.search_model_versions(f"name='{config['model name']}'"):
  if mv.version != model_version:
    # if model with this name is marked production
    if mv.current_stage.lower() == 'production':
      # mark it as archived
      client.transition_model_version_stage(
        name=mv.name,
        version=mv.version,
        stage='archived'
        )

Registered model 'recipe_model' already exists. Creating a new version of this model...
2022/10/21 21:41:11 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: recipe_model, version 9
Created version '9' of model 'recipe_model'.
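
The promote-and-archive logic in the cell above can be reasoned about separately from MLFlow by treating it as a planning step over a list of model versions. In this sketch, `SimpleNamespace` objects stand in for the registry's model version records, the function name `plan_stage_transitions` is hypothetical, and version numbers are assumed to be strings that convert cleanly to integers:

```python
from types import SimpleNamespace

def plan_stage_transitions(model_versions):
    """Return (version, target_stage) pairs: promote the newest version,
    then archive any older version currently in production."""
    latest = max(model_versions, key=lambda mv: int(mv.version))
    plan = [(latest.version, 'production')]
    for mv in model_versions:
        if mv.version != latest.version and mv.current_stage.lower() == 'production':
            plan.append((mv.version, 'archived'))
    return plan

# toy stand-ins for registry entries, deliberately out of order
toy_versions = [
    SimpleNamespace(version='7', current_stage='Archived'),
    SimpleNamespace(version='9', current_stage='None'),
    SimpleNamespace(version='8', current_stage='Production'),
]

print(plan_stage_transitions(toy_versions))
```

Selecting the newest version by its number, rather than by its position in the search results, keeps the plan correct even if the registry returns versions in a different order.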


With our model now persisted and registered with MLFlow, we can use the Databricks workspace UI to inspect it.  It's important to note that the UI distinguishes between the persistence of the model to MLFlow as an *experiment* and the registration of the model to the MLFlow registry as a *model*. This distinction helps us separate model development & training activities from model deployment activities but can be a little confusing the first time it's encountered.

To examine the persisted model, use the *Experiments* item in the Databricks workspace UI to locate an experiment named for this notebook. Click on the experiment's name to locate the runs associated with it and then click on the *Start Time* value for the latest run to see the persisted experiment's details.

Navigating to the bottom of the resulting page, expand the *Artifacts* item, if it is not already open, to display a list of items persisted with the model.  You should see a root folder named as specified for the *artifact_path* argument supplied with the *log_model* call.  If you expand the *artifacts* folder under it, you should see the various items specified with the *artifacts* argument now persisted with your model.  The structure under this folder should mirror the one we created as we persisted these objects to the filesystem:</p>

<img src='https://brysmiwasb.blob.core.windows.net/demos/images/recipes_mlflow_artifacts.PNG' width=400>

To test our model, we can now load it from MLFlow.  The definition of the custom wrapper means that the model now has a simple, standard interface that we can use for making predictions:

In [0]:
model = mlflow.pyfunc.load_model( f"models:/{config['model name']}/production")

In [0]:
# sample image path
sample_image_path = f"/dbfs{config['mount path']}/dataset/images/pumpkin-pie-102601.jpg"

# load image into base64 encoded array
with open(sample_image_path, 'rb') as f_in:
  sample_image_encoded = base64.b64encode(f_in.read()).decode('utf-8')
  
inputs_pd = pd.DataFrame( [[sample_image_encoded, 2]], columns=['search','k'])

# get model prediction
result = model.predict( inputs_pd )

# present results (each result is a JSON-formatted string, so parse it rather than eval it)
import json
pp.pprint(
  json.loads(result[0])
 )

[   {   'corpus_id': 13167,
        'encoded_image': '/9j/4AAQSkZJRgABAQAAAQABAAD/4gIcSUNDX1BST0ZJTEUAAQEAAAIMbGNtcwIQAABtbnRyUkdCIFhZWiAH3AABABkAAwApADlhY3NwQVBQTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA9tYAAQAAAADTLWxjbXMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAApkZXNjAAAA/AAAAF5jcHJ0AAABXAAAAAt3dHB0AAABaAAAABRia3B0AAABfAAAABRyWFlaAAABkAAAABRnWFlaAAABpAAAABRiWFlaAAABuAAAABRyVFJDAAABzAAAAEBnVFJDAAABzAAAAEBiVFJDAAABzAAAAEBkZXNjAAAAAAAAAANjMgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB0ZXh0AAAAAEZCAABYWVogAAAAAAAA9tYAAQAAAADTLVhZWiAAAAAAAAADFgAAAzMAAAKkWFlaIAAAAAAAAG+iAAA49QAAA5BYWVogAAAAAAAAYpkAALeFAAAY2lhZWiAAAAAAAAAkoAAAD4QAALbPY3VydgAAAAAAAAAaAAAAywHJA2MFkghrC/YQPxVRGzQh8SmQMhg7kkYFUXdd7WtwegWJsZp8rGm/fdPD6TD////bAEMABAQEBAQEBAUFBAYGBgYGCQgHBwgJDQoKCgoKDRQNDw0NDw0UEhYSERIWEiAZFxcZICUfHh8lLSkpLTk2OUtLZP/bAEMBBAQEBAQEBAUFBAYGBgYGCQgHBwgJDQoKCgoKDRQNDw0NDw0UEhYSERIWEiAZFxcZICUfHh8lLSkpLTk2OUtLZP/AABEIAKkBEg

In [0]:
# sample string input
sample_string = "pumpkin pie"
  
inputs_pd = pd.DataFrame( [[sample_string, 2]], columns=['search','k'])

# get model prediction
result = model.predict( inputs_pd )

# present results (each result is a JSON-formatted string, so parse it rather than eval it)
import json
pp.pprint(
  json.loads(result[0])
 )

[   {   'corpus_id': 50,
        'encoded_image': '/9j/4AAQSkZJRgABAQAAAQABAAD/4gIcSUNDX1BST0ZJTEUAAQEAAAIMbGNtcwIQAABtbnRyUkdCIFhZWiAH3AABABkAAwApADlhY3NwQVBQTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA9tYAAQAAAADTLWxjbXMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAApkZXNjAAAA/AAAAF5jcHJ0AAABXAAAAAt3dHB0AAABaAAAABRia3B0AAABfAAAABRyWFlaAAABkAAAABRnWFlaAAABpAAAABRiWFlaAAABuAAAABRyVFJDAAABzAAAAEBnVFJDAAABzAAAAEBiVFJDAAABzAAAAEBkZXNjAAAAAAAAAANjMgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB0ZXh0AAAAAEZCAABYWVogAAAAAAAA9tYAAQAAAADTLVhZWiAAAAAAAAADFgAAAzMAAAKkWFlaIAAAAAAAAG+iAAA49QAAA5BYWVogAAAAAAAAYpkAALeFAAAY2lhZWiAAAAAAAAAkoAAAD4QAALbPY3VydgAAAAAAAAAaAAAAywHJA2MFkghrC/YQPxVRGzQh8SmQMhg7kkYFUXdd7WtwegWJsZp8rGm/fdPD6TD////bAEMABAQEBAQEBAUFBAYGBgYGCQgHBwgJDQoKCgoKDRQNDw0NDw0UEhYSERIWEiAZFxcZICUfHh8lLSkpLTk2OUtLZP/bAEMBBAQEBAQEBAUFBAYGBgYGCQgHBwgJDQoKCgoKDRQNDw0NDw0UEhYSERIWEiAZFxcZICUfHh8lLSkpLTk2OUtLZP/AABEIAKkBEgMBI

## Step 3: Deploy Microservice

To deploy our model, we'll make use of Databricks' [Serverless Real-Time Inference capabilities](https://docs.databricks.com/applications/mlflow/serverless-real-time-inference.html).  To do this, we must first enable these within our workspace using the steps [prescribed here](https://docs.databricks.com/applications/mlflow/migrate-and-enable-serverless-real-time-inference.html#enable-serverless-real-time-inference-for-your-workspace):</p>

<img src='https://brysmiwasb.blob.core.windows.net/demos/images/recipes_enable_rti2.png' width=700>

</p>

With the capability enabled, we can then deploy our model by switching the Databricks workspace UI to **Machine Learning** through the [workspace sidebar](https://docs.databricks.com/workspace/index.html#use-the-sidebar).  From the Machine Learning sidebar, we can then:
</p>

1. Click on the **Models** icon
2. Locate the registered model (named *recipe_model* by default within this notebook)
3. Click on the registered model name
4. Select the **Serving** tab from the model's page
5. Click on the **Enable Serverless Real-Time Inference** button
6. Wait until the **Status** item reads **Ready**

Once the Serverless Real-Time Inference endpoint has a status of **Ready**, monitor the version of the model elevated to production in prior steps.  It will be in a **Pending** state for a while as the container hosting the model is assembled and deployed. Once the model has been fully deployed, it too should move into a **Ready** state. (This step may take a few minutes to complete.)</p>

<img src='https://brysmiwasb.blob.core.windows.net/demos/images/recipe_model_ready.PNG' width=600>
</p>

Please note that once the model is in a **Ready** state, it will run indefinitely until it is explicitly stopped.  You can stop the model by clicking **Stop** next to the Status item.

Before moving on, you should explore the **Compute Settings** tab on the Serving page.  There you can set the size of the container for your model to accommodate the number of requests you intend to service.  If your needs exceed those of the default settings, please contact Databricks support to elevate the service's capacity.

We can then test the service using a modified version of the sample code provided with the model version, as presented in the Serving interface discussed above.  To access the service endpoint, you will need to set up a personal access token as described [here](https://docs.databricks.com/dev-tools/api/latest/authentication.html):


**NOTE** Sensitive data should not be recorded in plain text as is done here.  If you wish to protect your REST API endpoint from unauthorized access, consider recording your personal access token as a [Databricks secret](https://docs.databricks.com/security/secrets/index.html) and updating the code below appropriately.

In [0]:
# set model URI
databricks_instance=spark.sparkContext.getConf().get('spark.databricks.workspaceUrl')
MODEL_VERSION_URI = f'https://{databricks_instance}/model-endpoint/recipe_model/Production/invocations'

# set personal access token
DATABRICKS_TOKEN = 'token' # replace with your own credential here temporarily or set up a secret scope with your credential
#DATABRICKS_TOKEN = dbutils.secrets.get("solution-accelerator-cicd", "serving_personal_access_token")

In [0]:
import os
import requests
import numpy as np
import pandas as pd
import json

def score_model(dataset_pd):
  
  # assign url and headers
  url = MODEL_VERSION_URI
  headers = {'Authorization': f'Bearer {DATABRICKS_TOKEN}', 'Content-Type': 'application/json'}
  
  # assemble payload as expected by webservice
  ds_dict = {'dataframe_split': dataset_pd.to_dict(orient='split')}
  
  data_json = json.dumps(ds_dict, allow_nan=True)
  response = requests.request(method='POST', headers=headers, url=url, data=data_json)
  
  if response.status_code != 200:
    raise Exception(f'Request failed with status {response.status_code}, {response.text}')
  return response.json()
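
To see what `score_model` actually sends, it can help to build the same request body by hand. A pandas dataframe converted with `to_dict(orient='split')` yields `index`, `columns`, and `data` entries, and the sketch below assembles that structure from plain Python rows. The helper name `build_payload` is hypothetical:

```python
import json

def build_payload(rows, columns=('search', 'k')):
    """Assemble a 'dataframe_split' request body from plain Python rows."""
    body = {
        'dataframe_split': {
            'index': list(range(len(rows))),
            'columns': list(columns),
            'data': [list(row) for row in rows],
        }
    }
    return json.dumps(body)

payload_json = build_payload([['pumpkin pie', 3]])
print(payload_json)
```

An application written in another language can produce the same JSON structure without needing pandas.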

To score data using the deployed endpoint, we need to assemble a data payload and pass it to our service:

In [0]:
# sample image path
sample_image_path = f"/dbfs{config['mount path']}/dataset/images/pumpkin-pie-102601.jpg"

# load image into base64 encoded array
with open(sample_image_path, 'rb') as f_in:
  sample_image_encoded = base64.b64encode(f_in.read()).decode('utf-8')

# assemble pandas df
payload_pd = pd.DataFrame([[sample_image_encoded, 3]],columns=['search','k'])

In [0]:
score_model(payload_pd)

Out[11]: {'predictions': [{'0': '[{"corpus_id": 3849, "score": 0.3169584572315216, "id": 5815, "title": "Thai Chicken Curry", "ingredients": "[\'2 teaspoons vegetable oil\', \'1 4-ounce can or jar yellow curry paste\', \'3/4 pound carrots, peeled, cut into 1/2\\"-thick rounds\', \'1 medium onion, chopped\', \'1 red bell pepper, cut into 1\\" pieces\', \'1 pound Yukon Gold potatoes (about 3), peeled, cut into 1/2\\" pieces\', \'1 pound skinless, boneless chicken thighs, cut into 1\\" pieces\', \'1 13.5-ounce or 15-ounce can unsweetened coconut milk\', \'Chopped fresh basil and cilantro\']", "instructions": "Heat oil in a large heavy pot over medium heat. Add curry paste and cook, stirring, until fragrant, about 1 minute. Add carrots, onion, and pepper and cook, stirring occasionally, until onion is translucent, about 10 minutes.Add potatoes, chicken, coconut milk, and 1 1/2 cups water and bring to a boil. Reduce heat to a simmer and cook, stirring occasionally, until chicken is cooked t

We can test using search text as well:

In [0]:
payload_pd = pd.DataFrame([['pumpkin pie', 3]],columns=['search','k'])
score_model(payload_pd)

Out[11]: {'predictions': [{'0': '[{"corpus_id": 35, "score": 0.3144124448299408, "id": 8410, "title": "Spiced Pumpkin Phyllo Pie", "ingredients": "[\'1 1 3/4- to 2-pound sugar pumpkin or butternut squash, halved through core, seeded\', \'1 teaspoon ground cinnamon\', \'1 teaspoon ground ginger\', \'1/4 teaspoon freshly grated nutmeg plus additional for garnish\', \'1 cup (loosely packed) golden brown sugar\', \'3 large eggs\', \'6 tablespoons canned evaporated skim milk\', \'1 1/2 tablespoons heavy whipping cream\', \'1 tablespoon cornstarch\', \'3/4 teaspoon vanilla extract\', \'1/2 teaspoon salt\', \'10 14x9-inch sheets fresh phyllo pastry or frozen, thawed\', \'5 tablespoons butter, melted\', \'2 tablespoons sugar\']", "instructions": "Preheat oven to 375\\u00b0F. Line rimmed baking sheet with parchment paper. Place pumpkin, cut side down, on parchment. Bake until very tender, about 1 hour. Cool. Scoop pumpkin flesh into processor; discard skin. Combine cinnamon, ginger, and 1/4 tea
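
As the output above shows, each entry under `predictions` holds the matches for one submitted row as a JSON-formatted string, so a calling application needs a second parsing step. The sketch below uses a hand-written sample response, abbreviated to a few fields from the output above, and a hypothetical helper named `unpack_predictions`. The exact response layout is an assumption based on that output and may differ across serving versions:

```python
import json

# abbreviated sample shaped like the response above (one row, one match)
sample_response = {
    'predictions': [
        {'0': json.dumps([
            {'corpus_id': 35, 'score': 0.3144124448299408,
             'id': 8410, 'title': 'Spiced Pumpkin Phyllo Pie'}
        ])}
    ]
}

def unpack_predictions(response):
    """Return one list of match dictionaries per submitted row."""
    unpacked = []
    for row in response['predictions']:
        # each row holds a single JSON-formatted string; take its only value
        (encoded_matches,) = row.values()
        unpacked.append(json.loads(encoded_matches))
    return unpacked

matches = unpack_predictions(sample_response)
print(matches[0][0]['title'])
```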

&copy; 2022 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the [Databricks License](https://databricks.com/db-license-source).  All included or referenced third party libraries are subject to the licenses set forth below.

| library                                | description             | license    | source                                              |
|----------------------------------------|-------------------------|------------|-----------------------------------------------------|
|sentence-transformers | Provides an easy method to compute dense vector representations for sentences, paragraphs, and images | Apache 2.0| https://pypi.org/project/sentence-transformers/      |
| kaggle| Official API for https://www.kaggle.com, accessible using a command line tool implemented in Python | Apache 2.0 | https://pypi.org/project/kaggle/|