Article for inspiration: https://www.snowflake.com/blog/container-services-llama2-snowpark-ml/

Compute Pool: skhara_compute_gpu7

In [None]:
!pip install transformers

In [1]:
from snowflake.snowpark.session import Session
from snowflake.ml.registry import model_registry
from snowflake.ml.model import deploy_platforms

import json
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")

In [3]:
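# Use a context manager so the credentials file is closed after reading
with open('creds.json') as creds_file:
    connection_parameters = json.load(creds_file)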
session = Session.builder.configs(connection_parameters).create()
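
The cell above assumes that `creds.json` holds the Snowpark connection parameters. As a minimal sketch, the file could be loaded with a basic sanity check before the session is created. The helper name `load_connection_parameters` and the required key set are assumptions for illustration; other authentication methods would need different keys.

```python
import json
from pathlib import Path

# Assumed minimal key set; adjust to the authentication method in use.
REQUIRED_KEYS = ("account", "user", "password")

def load_connection_parameters(path="creds.json", required=REQUIRED_KEYS):
    """Read connection parameters from a JSON file and check required keys."""
    params = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in required if key not in params]
    if missing:
        raise KeyError(f"Missing connection parameters: {missing}")
    return params
```

The result could then be passed on as `Session.builder.configs(load_connection_parameters()).create()`.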

# LLAMA Model Setup

## Load LLAMA Model

In [4]:
HF_AUTH_TOKEN = "<YOUR_HF_TOKEN>" # Your token from Hugging Face (never commit a real token)

In [5]:
from transformers import pipeline
from snowflake.ml.model.models import huggingface_pipeline

llama_model = huggingface_pipeline.HuggingFacePipelineModel(task="text-generation",
                                                            model="meta-llama/Llama-2-7b-chat-hf",
                                                            token=HF_AUTH_TOKEN,
                                                            return_full_text=False,
                                                            max_new_tokens=100)

## Register the model

In [6]:
registry_name = 'SKHARA'
schema_name = 'BUILD_REGISTRY'

model_registry.create_model_registry(session= session,
                                     database_name= registry_name,
                                     schema_name= schema_name)

registry = model_registry.ModelRegistry(session= session,
                                        database_name= registry_name,
                                        schema_name= schema_name)

create_model_registry() is in private preview since 0.2.0. Do not use it in production. 


In [13]:
MODEL_NAME = "LLAMA2_MODEL_7b_CHAT"
MODEL_VERSION = "7"

llama_model_ref= registry.log_model(
    model_name=MODEL_NAME,
    model_version=MODEL_VERSION,
    model=llama_model
)

llama_model_ref

<snowflake.ml.registry.model_registry.ModelReference at 0x28acb3700>

## Deploy Model

In [14]:
llama_model_ref.deploy(
    deployment_name="llama_predict",
    platform= deploy_platforms.TargetPlatform.SNOWPARK_CONTAINER_SERVICES,
    options={"compute_pool": "SKHARA_COMPUTE_GPU3",
             "num_gpus": 1,
             # Remove the 'prebuilt_snowflake_image' argument below when running .deploy() for the first time
             #"prebuilt_snowflake_image": "sfsenorthamerica-fcto-spc.registry.snowflakecomputing.com/skhara/build_registry/snowml_repo/116da812e88f2751324c6a16eb00de3726ed06a3:latest"
            },
    permanent = True,
)



{'name': 'SKHARA.BUILD_REGISTRY.llama_predict',
 'platform': <TargetPlatform.SNOWPARK_CONTAINER_SERVICES: 'SNOWPARK_CONTAINER_SERVICES'>,
 'target_method': '__call__',
 'signature': ModelSignature(
                     inputs=[
                         FeatureSpec(dtype=DataType.STRING, name='inputs')
                     ],
                     outputs=[
                         FeatureSpec(dtype=DataType.STRING, name='outputs')
                     ]
                 ),
 'options': {'compute_pool': 'SKHARA_COMPUTE_GPU3', 'num_gpus': 1},
 'details': {'image_name': 'sfsenorthamerica-fcto-spc.registry.snowflakecomputing.com/skhara/build_registry/snowml_repo/116da812e88f2751324c6a16eb00de3726ed06a3:latest',
  'service_spec': "spec:\n  container:\n  - env:\n      MODEL_ZIP_STAGE_PATH: SKHARA.BUILD_REGISTRY.snowml_model_0dfc6cb071db11ee9c1d0a72b796458c/0dfc6cb071db11ee9c1d0a72b796458c.zip\n      NUM_WORKERS: 1\n      SNOWML_USE_GPU: true\n      TARGET_METHOD: __call__\n      _CONCURRENT_RE

# I/O Setup

We will load a JSON Lines file into a Snowflake table. For prediction, there are two options: a Snowpark DataFrame or a local pandas DataFrame.
For the sake of simplicity, we will use a local pandas DataFrame with only a few rows (at most 20). If the dataset is large, Snowpark DataFrames are advised.

## Load Data

In [15]:
json_dataset = pd.read_json("frosty_dataset_generator/frosty_transcripts_all.jsonl", lines=True).convert_dtypes()
json_dataset.head()

Unnamed: 0,transcript,name,location,toy_list
0,frosty: Hi there! This is Frosty. How can I he...,Alex,Houston,"[Barbie Science Lab Playset, Pokémon 8-Inch Pl..."
1,"frosty: Hello, happy holiday! How can I help y...",Amber,London,"[Dog-E, 2023 Holiday Fox 12-Inch Plush]"
2,"frosty: Hi! I'm Frosty, how can I assist you t...",Bella,San Francisco,[Transformers Rise of the Beasts Beast-Mode Bu...
3,frosty: Hello! This is Frosty. How can I help ...,Luke,Melbourne,"[LeapFrog Magic Adventures Microscope, Marvel'..."
4,"frosty: Hi, happy holidays! How can I assist y...",Owen,Toronto,"[Star Wars LOLA animatronic droid, Fisher-Pric..."


In [16]:
TABLE_NAME = "AK_BUILD_DATA"
session.write_pandas(json_dataset, table_name=TABLE_NAME, auto_create_table=True, overwrite=True)

<snowflake.snowpark.table.Table at 0x28997f6a0>

## Input: Prompt Engineering

In [17]:
session.sql('SELECT * from AK_BUILD_DATA LIMIT 5').to_pandas()

Unnamed: 0,transcript,name,location,toy_list
0,frosty: Hi there! This is Frosty. How can I he...,Alex,Houston,"[\n ""Barbie Science Lab Playset"",\n ""Pokémon..."
1,"frosty: Hello, happy holiday! How can I help y...",Amber,London,"[\n ""Dog-E"",\n ""2023 Holiday Fox 12-Inch Plu..."
2,"frosty: Hi! I'm Frosty, how can I assist you t...",Bella,San Francisco,"[\n ""Transformers Rise of the Beasts Beast-Mo..."
3,frosty: Hello! This is Frosty. How can I help ...,Luke,Melbourne,"[\n ""LeapFrog Magic Adventures Microscope"",\n..."
4,"frosty: Hi, happy holidays! How can I assist y...",Owen,Toronto,"[\n ""Star Wars LOLA animatronic droid"",\n ""F..."


In [18]:
sdf_input = session.table('AK_BUILD_DATA')
df_local = sdf_input.limit(20).to_pandas()
df_local.head()

Unnamed: 0,transcript,name,location,toy_list
0,frosty: Hi there! This is Frosty. How can I he...,Alex,Houston,"[\n ""Barbie Science Lab Playset"",\n ""Pokémon..."
1,"frosty: Hello, happy holiday! How can I help y...",Amber,London,"[\n ""Dog-E"",\n ""2023 Holiday Fox 12-Inch Plu..."
2,"frosty: Hi! I'm Frosty, how can I assist you t...",Bella,San Francisco,"[\n ""Transformers Rise of the Beasts Beast-Mo..."
3,frosty: Hello! This is Frosty. How can I help ...,Luke,Melbourne,"[\n ""LeapFrog Magic Adventures Microscope"",\n..."
4,"frosty: Hi, happy holidays! How can I assist y...",Owen,Toronto,"[\n ""Star Wars LOLA animatronic droid"",\n ""F..."
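
In the previews read back from Snowflake, `toy_list` appears to be a JSON-formatted string (with embedded newlines) rather than the Python list seen in the original pandas preview. If list values are needed locally, a sketch like the following could convert them back. The helper name `parse_toy_list` is hypothetical, and the sample string only mimics the shape shown above.

```python
import json

def parse_toy_list(value):
    """Turn a JSON-encoded array string into a Python list; pass lists through."""
    if isinstance(value, list):
        return value
    return json.loads(value)

# Sample value mimicking the string shape shown in the Snowflake preview.
raw = '[\n  "Dog-E",\n  "2023 Holiday Fox 12-Inch Plush"\n]'
toys = parse_toy_list(raw)
```

On the local DataFrame this could be applied as `df_local['toy_list'].apply(parse_toy_list)`.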


In [19]:
def add_prompt(transcript):
    prompt = f'''[INST] <PROMPT>
    Your output will be parsed by a computer program as a JSON object. Please respond ONLY with valid json that conforms to this JSON schema:
    {{
      "name": {{
        "type": "string",
        "description": "The name of the person calling"
      }},
      "location": {{
        "type": "string",
        "description": "The name of the location where the person is calling from."
      }},
      "toy_list": {{
        "type": "array",
        "description": "The list of toys requested by the person calling."
      }},
      "required": ["name", "location", "toy_list"]
    }}


    Example 1:
    Input: "{df_local['transcript'].iloc[0]}"
    Output: {{"name": {json.dumps(df_local['name'].iloc[0])}, "location": {json.dumps(df_local['location'].iloc[0])}, "toy_list": {df_local['toy_list'].iloc[0]}}}

    Example 2:
    Input: "{df_local['transcript'].iloc[1]}"
    Output: {{"name": {json.dumps(df_local['name'].iloc[1])}, "location": {json.dumps(df_local['location'].iloc[1])}, "toy_list": {df_local['toy_list'].iloc[1]}}}
    </PROMPT>

    Actual Input: {transcript}
    [/INST]
    '''
    return prompt
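
`add_prompt` reads its few-shot examples from the global `df_local`. As a hedged alternative, each example could be built by a small standalone function that serializes the expected output with `json.dumps`, so the example the model sees is itself valid JSON. The name `build_example` is hypothetical, and `toy_list` is assumed to be a Python list here (a JSON string from Snowflake would need `json.loads` first).

```python
import json

def build_example(transcript, name, location, toy_list):
    """Format one few-shot example with a JSON-encoded output line."""
    expected = {"name": name, "location": location, "toy_list": toy_list}
    return f'Input: "{transcript}"\nOutput: {json.dumps(expected)}'

# Hypothetical values for illustration only.
example_text = build_example(
    "caller: I'm Amber, calling from London.",
    "Amber",
    "London",
    ["Dog-E"],
)
```

Keeping the example builder separate from the DataFrame also makes the prompt easier to unit-test.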

In [20]:
df_local['inputs'] = df_local['transcript'].apply(add_prompt)
print(df_local['inputs'].iloc[3])

[INST] <PROMPT>
    Your output will be parsed by a computer program as a JSON object. Please respond ONLY with valid json that conforms to this JSON schema:
    {
      "name": {
        "type": "string",
        "description": "The name of the person calling"
      },
      "location": {
        "type": "string",
        "description": "The name of the location where the person is calling from."
      },
      "toy_list": {
        "type": "array",
        "description": "The list of toys requested by the person calling."
      },
      "required": ["name", "location", "toy_list"]
    }


    Example 1:
    Input: "frosty: Hi there! This is Frosty. How can I help you today?
caller: Hi Frosty, I want to make my holiday wish.
frosty: Of course! May I know your name, please?
caller: I'm Alex.
frosty: Hi Alex! Where are you calling from?
caller: From Houston.
frosty: Wonderful! Now, what's your holiday wish?
caller: I want the barbie science doll set and pokemon plushie.
frosty: Awesome 

## Output: Processing
Make sure the processing code matches the JSON structure requested in the prompt above.

In [21]:
import json
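def format_output(output_string):
    """Parse the model's raw output into a dict, or return an error message."""
    try:
        # Expected shape: a JSON-encoded list of {'generated_text': ...} dicts
        outer_list = json.loads(output_string)
        generated_text_str = outer_list[0]['generated_text']

        # Drop any text the model appended after the final closing brace
        end_pos = generated_text_str.rfind('}')
        if end_pos == -1:
            raise ValueError("No closing brace found in generated_text")
        json_str = generated_text_str[:end_pos + 1]

        return json.loads(json_str)
    except (ValueError, KeyError, IndexError, TypeError):
        # json.JSONDecodeError is a subclass of ValueError
        return 'Could not parse output'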
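
`format_output` trims text after the last `}` but still fails if the model writes anything before the JSON object, such as a short greeting. That is one possible cause of parse failures, not a confirmed one. A more tolerant sketch could take the span from the first `{` to the last `}` and also check the keys requested in the prompt. The name `extract_json_object` is hypothetical.

```python
import json

REQUIRED_KEYS = ("name", "location", "toy_list")

def extract_json_object(text, required=REQUIRED_KEYS):
    """Return the dict between the first '{' and the last '}', or None."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        candidate = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(candidate, dict):
        return None
    # Reject objects that lack any of the keys the prompt asked for.
    if any(key not in candidate for key in required):
        return None
    return candidate
```

This would be applied to the `generated_text` string rather than to the raw deployment output.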

# Get Predictions

## Get Deployed Model

In [22]:
registry_name = 'SKHARA'
schema_name = 'BUILD_REGISTRY'

registry = model_registry.ModelRegistry(session= session,
                                        database_name= registry_name,
                                        schema_name= schema_name)

In [23]:
model_list = registry.list_models()
model_list.to_pandas()

Unnamed: 0,CREATION_CONTEXT,CREATION_ENVIRONMENT_SPEC,CREATION_ROLE,CREATION_TIME,ID,INPUT_SPEC,NAME,OUTPUT_SPEC,RUNTIME_ENVIRONMENT_SPEC,TYPE,URI,VERSION,ARTIFACT_IDS,DESCRIPTION,METRICS,TAGS,REGISTRATION_TIMESTAMP
0,,"{\n ""python"": ""3.9.17""\n}","""SPC_USER_ROLE""",2023-10-18 11:19:26.257000-07:00,d90f1c246de211eeae210a72b796458c,,LLAMA2_MODEL_7b_CHAT,,,huggingface_pipeline,sfc://SKHARA.BUILD_REGISTRY.SNOWML_MODEL_D90F1...,3,[],,,,2023-10-18 11:19:27.494000-07:00
1,,"{\n ""python"": ""3.9.17""\n}","""SPC_USER_ROLE""",2023-10-18 11:20:31.620000-07:00,0168781e6de311eeae210a72b796458c,,LLAMA2_MODEL_7b_CHAT,,,huggingface_pipeline,sfc://SKHARA.BUILD_REGISTRY.SNOWML_MODEL_01687...,4,[],,,,2023-10-18 11:20:33.306000-07:00
2,,"{\n ""python"": ""3.9.17""\n}","""SPC_USER_ROLE""",2023-10-23 08:23:21.668000-07:00,1454b4e671b811eeb25b0a72b796458c,,LLAMA2_MODEL_7b_CHAT,,,huggingface_pipeline,sfc://SKHARA.BUILD_REGISTRY.SNOWML_MODEL_1454B...,5,[],,,,2023-10-23 08:23:22.988000-07:00
3,,"{\n ""python"": ""3.9.17""\n}","""SPC_USER_ROLE""",2023-10-23 11:06:19.746000-07:00,d870c6ec71ce11ee9c1d0a72b796458c,,LLAMA2_MODEL_7b_CHAT,,,huggingface_pipeline,sfc://SKHARA.BUILD_REGISTRY.SNOWML_MODEL_D870C...,6,[],,,,2023-10-23 11:06:21.405000-07:00
4,,"{\n ""python"": ""3.9.17""\n}","""SPC_USER_ROLE""",2023-10-23 12:33:41.971000-07:00,0dfc6cb071db11ee9c1d0a72b796458c,,LLAMA2_MODEL_7b_CHAT,,,huggingface_pipeline,sfc://SKHARA.BUILD_REGISTRY.SNOWML_MODEL_0DFC6...,7,[],,,,2023-10-23 12:33:43.307000-07:00


In [32]:
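# Use the literal model name here: `model_name` is only defined in a later cell
deployment_list = registry.list_deployments(model_name='LLAMA2_MODEL_7b_CHAT', model_version='7')
deployment_list.to_pandas()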

Unnamed: 0,MODEL_NAME,MODEL_VERSION,DEPLOYMENT_NAME,CREATION_TIME,TARGET_METHOD,TARGET_PLATFORM,SIGNATURE,OPTIONS,STAGE_PATH,ROLE
0,LLAMA2_MODEL_7b_CHAT,7,llama_predict,2023-10-23 12:35:19.101000-07:00,__call__,SNOWPARK_CONTAINER_SERVICES,"{\n ""inputs"": [\n {\n ""name"": ""inputs...","{\n ""compute_pool"": ""SKHARA_COMPUTE_GPU3"",\n ...",@SKHARA.BUILD_REGISTRY._SYSTEM_REGISTRY_DEPLOY...,"""SPC_USER_ROLE"""


In [29]:
model_name = 'LLAMA2_MODEL_7b_CHAT'
model = model_registry.ModelReference(registry=registry, model_name=model_name, model_version='7')

## Predict & See Outputs

In [30]:
res = model.predict(
    deployment_name= 'llama_predict',
    data= df_local[['inputs']]
)
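
Because the dataset already carries `name`, `location` and `toy_list` columns, the parsed predictions could also be scored against them once `res` is available. The sketch below is only an illustration: `field_accuracy` is a hypothetical helper, and its inputs are assumed to be plain lists, for example the parsed outputs and `df_local.to_dict('records')`.

```python
def field_accuracy(parsed_rows, expected_rows, fields=("name", "location")):
    """Fraction of rows whose parsed value equals the expected value, per field."""
    hits = {field: 0 for field in fields}
    for parsed, expected in zip(parsed_rows, expected_rows):
        if not isinstance(parsed, dict):
            continue  # rows that could not be parsed count as misses
        for field in fields:
            if parsed.get(field) == expected.get(field):
                hits[field] += 1
    total = len(expected_rows)
    return {field: (hits[field] / total if total else 0.0) for field in fields}
```

Exact string equality is a strict criterion; a case-insensitive comparison may be a better fit for free-text fields.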

In [31]:
for i in range(len(df_local)):
    print(f'\n\n **** Transcript # {i} ****')
    print(df_local['transcript'].iloc[i])
    print('\n')
    print(format_output(res['outputs'].iloc[i]))



 **** Transcript # 0 ****
frosty: Hi there! This is Frosty. How can I help you today?
caller: Hi Frosty, I want to make my holiday wish.
frosty: Of course! May I know your name, please?
caller: I'm Alex.
frosty: Hi Alex! Where are you calling from?
caller: From Houston.
frosty: Wonderful! Now, what's your holiday wish?
caller: I want the barbie science doll set and pokemon plushie.
frosty: Awesome choices, Alex! Your list has been added. Thanks for calling and have a jolly Holiday!


Could not parse output


 **** Transcript # 1 ****
frosty: Hello, happy holiday! How can I help you today?
caller: I'm Amber. I want to give my wish list.
frosty: Of course, Amber! What's on your wish list?
caller: robot dog and the fox plushie.
frosty: Brilliant choices, Amber! And where are you calling from? 
caller: From London.
frosty: Alright, Amber from London. Your list has been recorded. Have a fantastic holiday!


Could not parse output


 **** Transcript # 2 ****
frosty: Hi! I'm Frosty, how can