# LLM HOL Lab1

Please:
* Make sure you are running this notebook from the Lab1 directory.
* Make sure you have updated the connection.json file with all your credentials.
* Make sure you have created the compute pool and it is in IDLE state.
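
The second item in the checklist above can be verified before any cell runs. The snippet below is a minimal sketch, not part of the original lab: the helper name `missing_connection_keys` is hypothetical, and the key list covers only the entries this notebook reads explicitly in later cells (`huggingface_token`, `database`, `schema`, `compute_pool`). Your Snowflake login keys are passed straight to the session builder and are not checked here.

```python
import json
from pathlib import Path

# Keys that later cells of this notebook read from connection.json.
# Assumption: login keys (account, user, ...) are validated by Snowflake itself.
NOTEBOOK_KEYS = ("huggingface_token", "database", "schema", "compute_pool")


def missing_connection_keys(path="connection.json", required=NOTEBOOK_KEYS):
    """Return the required keys that are absent or empty in the JSON file."""
    with Path(path).open() as f:
        params = json.load(f)
    # A key counts as missing when it is not present or holds an empty value.
    return [key for key in required if not params.get(key)]
```

Calling `missing_connection_keys()` from the Lab1 directory should return an empty list when the file is filled in; any names it returns still need values.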


### Install Libraries

In [None]:
%pip install snowflake-ml-python==1.1.2

In [None]:
%pip install transformers==4.34.0 tokenizers

### Import Libraries

In [None]:
from snowflake.snowpark.session import Session
from snowflake.ml.model.models import llm
from snowflake.ml.registry import model_registry
from snowflake.ml.model import deploy_platforms
from snowflake.snowpark import VERSION
import snowflake.snowpark.functions as F

import sys
import os
import json
import pandas as pd
pd.set_option('display.max_colwidth', None)

### Establish Secure Connection

*NOTE: Update [connection.json](connection.json) with your credentials before running the next cell.*

In [None]:
# Create Snowflake Session object
# Use a context manager so the file handle is closed after reading
with open('connection.json') as f:
    connection_parameters = json.load(f)
session = Session.builder.configs(connection_parameters).create()
session.sql_simplifier_enabled = True

snowflake_environment = session.sql('select current_user(), current_version()').collect()
snowpark_version = VERSION

# Current Environment Details
print('Account                     : {}'.format(session.get_current_account()))
print('User                        : {}'.format(snowflake_environment[0][0]))
print('Role                        : {}'.format(session.get_current_role()))
print('Database                    : {}'.format(session.get_current_database()))
print('Schema                      : {}'.format(session.get_current_schema()))
print('Warehouse                   : {}'.format(session.get_current_warehouse()))
print('Snowflake version           : {}'.format(snowflake_environment[0][1]))
print('Snowpark for Python version : {}.{}.{}'.format(snowpark_version[0],snowpark_version[1],snowpark_version[2]))

### Reference Llama 2 from Hugging Face

In [None]:
options = llm.LLMOptions(
    token=connection_parameters['huggingface_token'],
    max_batch_size=100,
)
llama_model = llm.LLM(
    model_id_or_path="meta-llama/Llama-2-7b-chat-hf",
    options=options
)

### Register, Log and Deploy Llama 2 into Snowpark Container Services 

*NOTE: Logging and deploying the same model are one-time operations. Once the model is logged and deployed, use ModelReference to get a reference to the model.*

In [None]:
MODEL_NAME    = "LLAMA2_7b_CHAT"
MODEL_VERSION = "NewBaseV2.0"
MODEL_REGISTRY_DB = connection_parameters['database']
MODEL_REGISTRY_SCHEMA = connection_parameters['schema']
COMPUTE_POOL = connection_parameters['compute_pool']

registry = model_registry.ModelRegistry(
    session=session, 
    database_name=MODEL_REGISTRY_DB, 
    schema_name=MODEL_REGISTRY_SCHEMA, 
    create_if_not_exists=True)

llama_model_ref = registry.log_model(
    model_name=MODEL_NAME,
    model_version=MODEL_VERSION,
    model=llama_model
)

*NOTE: Deploying the model for the first time can take ~25-30 minutes.*

In [None]:
%%time
# Optionally enable INFO log level
import logging
logging.basicConfig()
logging.getLogger().setLevel(logging.INFO)

llama_model_ref.deploy(
    deployment_name="llama_predict", 
    platform=deploy_platforms.TargetPlatform.SNOWPARK_CONTAINER_SERVICES,
    permanent=True, 
    options={"compute_pool": COMPUTE_POOL, "num_gpus": 1})

llama_model_ref = model_registry.ModelReference(registry=registry,model_name=MODEL_NAME,model_version=MODEL_VERSION)
llama_model_ref

### Load Data from JSON into Snowflake

*NOTE: Reading the JSON data and storing it in a Snowflake table are one-time operations. Once the data is loaded, use Snowpark to read it from the existing table.*

In [None]:
df = pd.read_json("data/frosty_transcripts.json",lines=True)
sf_df = session.write_pandas(df,'frosty_transcripts',auto_create_table=True,quote_identifiers=False,overwrite=True)
sf_df.to_pandas().head()

### Simple Prompt Engineering Example

For every transcript, define a summarization instruction for the LLM.

In [None]:
begin_prompt = \
"""
[INST] Summarize this transcript in less than 200 words: 
"""
end_prompt = " [/INST]"

df_inputs = sf_df.with_column('"input"',F.concat_ws(F.lit(" "),F.lit(begin_prompt),F.col('transcript'),F.lit(end_prompt))).select('"input"')
df_inputs.to_pandas().head()
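
To make the shape of each `input` value concrete, here is a plain-Python sketch of what the `concat_ws` expression above assembles for a single row. It is an illustration only, not part of the lab code: `build_prompt` is a hypothetical helper, and the constants simply restate `begin_prompt` and `end_prompt` with their newlines written out.

```python
# Same text as begin_prompt / end_prompt in the cell above, with explicit newlines.
BEGIN_PROMPT = "\n[INST] Summarize this transcript in less than 200 words: \n"
END_PROMPT = " [/INST]"


def build_prompt(transcript, begin=BEGIN_PROMPT, end=END_PROMPT):
    """Join the three pieces with a single space, mirroring concat_ws(" ", ...)."""
    return " ".join([begin, transcript, end])
```

Because the separator is added between every pair of pieces, the transcript ends up wrapped between the `[INST]` instruction and the closing `[/INST]` tag, with one extra space on each side of it.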

### Inference using Simple Prompt

Pass the summarization instruction to the LLM and examine the results for the first 5 records.

In [None]:
df_predict_results = llama_model_ref.predict(deployment_name="llama_predict",data=df_inputs)
df_predict_results.select('"input"','"generated_text"').limit(5).to_pandas()

### Complex Prompt Engineering and Inference Example

For every transcript, define a more specific instruction for the LLM.

*NOTE: In the results, notice that the output is not consistent across all transcripts. The base model failed to follow the instructions in many of the cases as seen below.*

In [None]:
sf_df = session.table('frosty_transcripts')

begin_prompt = \
"""
[INST] Extract location and list of toys in JSON format: 
"""
end_prompt = " [/INST]"

df_inputs = sf_df.with_column('"input"',F.concat_ws(F.lit(" "),F.lit(begin_prompt),F.col('transcript'),F.lit(end_prompt))).select('"input"')
df_predict_results = llama_model_ref.predict(deployment_name="llama_predict",data=df_inputs)
df_predict_results.limit(10).to_pandas()
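
One way to quantify the inconsistency noted above is to test whether each generated answer contains a parseable JSON object at all. The following is a minimal sketch under stated assumptions: both function names are hypothetical, and the key names `location` and `toys` are guesses, since the prompt does not tell the model which keys to use.

```python
import json


def extract_json_object(generated_text):
    """Return the first parseable JSON object found in the text, or None."""
    decoder = json.JSONDecoder()
    start = generated_text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`
            # and ignores any text that follows it.
            obj, _ = decoder.raw_decode(generated_text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = generated_text.find("{", start + 1)
    return None


def follows_instruction(generated_text, required_keys=("location", "toys")):
    """True when a JSON object is present and contains every assumed key."""
    obj = extract_json_object(generated_text)
    return obj is not None and all(key in obj for key in required_keys)
```

Assuming the prediction output keeps the `generated_text` column shown earlier, something like `df_predict_results.limit(10).to_pandas()["generated_text"].map(follows_instruction)` would flag which of the sampled rows followed the instruction.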