# AI / ML Easy Button 
## *Basics using Review Data*
Welcome to the AI/ML Easy Button. This notebook lets you explore Snowflake's out-of-the-box functions and see how they work with custom prompts in both SQL and Python.  
We will use Take-Two game review data and game summary data for these analyses.

The Snowflake Docs are always a great reference point, so we will include these along the way.
[LLM Docs](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions)

![](https://www.fatherhood.org/hs-fs/hubfs/Images/Blog/easy-button.png?width=585&name=easy-button.png)

In [None]:
# !!! Add packages in the upper right under Packages --- this code needs 'snowflake-ml-python'
# Import python packages
import streamlit as st
import pandas as pd
import json

# We can also use Snowpark for our analyses!
from snowflake.snowpark import functions as F
from snowflake.cortex import Complete, Sentiment, Summarize, Translate
from snowflake.snowpark.context import get_active_session
session = get_active_session()


In [None]:
--first create your own schema to work in!
create schema if not exists MY_NAME ;
use schema MY_NAME;
---this should be your name!!
select current_database(), current_schema();





### Translate  
https://docs.snowflake.com/en/sql-reference/functions/translate-snowflake-cortex

In [None]:
-- TRANSLATE
-- Translate text for ad copy

---call the built-in SQL function with the text, the source language, and the target language
select
    snowflake.cortex.translate(
        'Buy this awesome product, we know you will love it',
        'en',
        'pt');

In [None]:
-- SQL Translate from any text to English...pass an empty string as the source language to auto-detect it
select
    snowflake.cortex.translate(
        '제가 가장 좋아하고 아늑한 소파가 오늘 판매 중입니다',
        '',
        'en');

In [None]:
#PYTHON IS EASY TOO...same function directly from Python
Translate(
    "Detta är det bästa videospelet någonsin!",
    "",
    "en"
)

In [None]:
--we can run this against millions of records easily, efficiently and securely
select
    REVIEW_TEXT,
    snowflake.cortex.translate(
        REVIEW_TEXT,
        '',
        'en'
    ) as review_in_english
from
    LAB_DATA.PUBLIC.REVIEWS
    where IS_ENGLISH = 0;

In [None]:
# Cell results can be turned into Snowpark DataFrames... or pandas ones too
# Assumes the SQL cell above is named sample_reviews
sp_df = sample_reviews.to_df()
sp_df.show(3)
pd_df = sample_reviews.to_pandas()
pd_df.head(3)

### Sentiment 
https://docs.snowflake.com/en/sql-reference/functions/sentiment-snowflake-cortex
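
SENTIMENT returns a score between -1 (most negative) and 1 (most positive). A raw number is hard to read in a report, so here is a minimal sketch of bucketing scores into labels. The helper name `label_sentiment`, the 0.3 threshold, and the sample scores are illustrative assumptions, not anything Snowflake defines.

```python
def label_sentiment(score: float, threshold: float = 0.3) -> str:
    """Map a sentiment score in [-1, 1] to a coarse label.

    The 0.3 threshold is an illustrative assumption, not a Snowflake default.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score must be in [-1, 1], got {score}")
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical scores standing in for values returned by a SENTIMENT query.
sample_scores = [0.82, -0.55, 0.05]
labels = [label_sentiment(s) for s in sample_scores]
print(labels)
```

The threshold is worth tuning against a handful of hand-labeled reviews before relying on the buckets.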

In [None]:
select
    REVIEW_TEXT,
    snowflake.cortex.sentiment(
        snowflake.cortex.translate( ---chained together, translate and sentiment
        REVIEW_TEXT,
        '',
        'en'
        ) 
    ) as review_sentiment
from
    LAB_DATA.PUBLIC.REVIEWS

### Summarize  
https://docs.snowflake.com/en/sql-reference/functions/summarize-snowflake-cortex

In [None]:
--summarize a webpage
select
    snowflake.cortex.summarize('
    The GE Profile 5.3 cu. Ft. Smart Front Load Washer (Model # PFW870SSVWW) features OdorBlock, UltraFresh Vent System, Microban Technology, and SmartHQ connectivity. It offers 12 wash cycles, steam cleaning, and is ENERGY STAR certified. Dimensions are 39.75" H x 28" W x 34" D.
    CYCLE INFORMATION	Number of Cycles	Wash Cycles	• Self Clean • Power Clean	• Downloaded• Whites• Towels• Bulky/Bedding• Sanitize + Allergen	• Quick Wash	• Delicates	• Cold Wash	• Rinse + Spin	Options	• Extra Rinse	• UltraFresh Vent	• More Water	• Time Saver	• PreWash	• Smart Dispense	• Tumble Care	• Soak Rinse for Sanitizer	FEATURES	Technologies	• dBT™ Quiet Control	• Amazon Alexa	• IFTTT	• The Google Assistant	• Sonos	• UltraFresh Vent System Plus™ with OdorBlock™	• SmartDispense™ Technology	• Load-Sensing Adaptive Fill	• PerfecTemp Deluxe	• WiFi Connect Built-in	Control Lock	Stackable	Reversible Door	Wi-Fi	Dispensers	• Bleach Timed Flow Through	• Fabric Softener Timed Flow Through	• PreWash	• Smart Dispense - Bulk Detergent	• Protected with Microban®	Additional Information	• Automatic Temperature Control	• Auto-Load Sensing	• Adaptive Spin	• Adjustable Leveling Legs	• Internal heater	• LED Basket Light	• Power Cord Attached		• Control Type:	• Rotary-Electronic w/LEDs	• Power On/Off	• Start/Pause	• Capacitive Touch		• Optional Soil Levels:	• Extra Heavy	• Heavy	• Normal	• Light	• Extra Light		• Washer Control Features:	• Control Lock	• Adjustable End-of-Cycle Signal	• LED Display	• Digital Cycle Countdown	• LED Indicators	• Add Garment/Pause	• Remote Start	• Delay Wash - Up to 24 hours	ELECTRICAL INFORMATION	Information	• 60 Hz	• 15 or 20 A	CERTIFICATIONS AND APPROVALS	Energy Star Certified	Information	ACCESSORIES	Optional	• Pedestal Accessory GFP1528PNRS	• Riser GFR0728PNRS	• Stacking Kit GFA28KITN	DIMENSIONS	Class Width	Width	Height	Depth	Weight	Additional Dimensions	SHIPPING DIMENSIONS	Weight	WARRANTY	Warranty	• Limited 10-year motor 12	• Normal							• Delay Wash		• Quiet-By-Design with dBT™ Quiet Control											Yes	Yes	Yes	Yes	• Detergent (Liquid/Powder) Timed Flow Through						• Door Style: See-Thru Glass with Tinted Door Protect, Chrome Outer Ring with Chrome Vent	• 120 V				Yes	ADA		• Fill Hose WH41X10207					28" - 30"	28 inch(es) / 71.12 cm	39.75 
inch(es) / 100.97 cm	34 inch(es) / 86.36 cm	245.99 lb(s) / 111.58 kg	Depth with Door Open 90 Degrees: 56 1/2 in		264 lb(s) / 119.75 kg		• Limited 1-Year Parts and Labor	') as WEB_PAGE_SUMMARY
	

    
    

### COMPLETE  
https://docs.snowflake.com/en/sql-reference/functions/complete-snowflake-cortex
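
COMPLETE returns plain text, so when a prompt asks for JSON (as the keyword example later in this section does) the model may still wrap the JSON in extra words. The sketch below shows one defensive way to pull the first JSON value out of a response using only the standard library. The helper name `extract_json` and the sample response are hypothetical.

```python
import json


def extract_json(response: str):
    """Return the first JSON object or array found in an LLM response, or None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(response):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            value, _ = decoder.raw_decode(response, start)
            return value
        except json.JSONDecodeError:
            continue  # Not valid JSON from here; try the next candidate position.
    return None


# Hypothetical response text, standing in for what COMPLETE might return.
raw = 'Sure! Here are the keywords: {"keywords": ["washer", "steam"]} Let me know if you need more.'
parsed = extract_json(raw)
print(parsed)
```

Returning `None` instead of raising keeps a batch job running when a single row comes back malformed.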

In [None]:
select
    snowflake.cortex.complete(
        'llama3-70b',
       'what is the meaning of life?' 
       ) meaning_of_life

In [None]:
#Similarly we can call this easily from Python
MeaningOfLife = Complete(
        "mistral-large",
        "What is the meaning of life?"
    )

print(MeaningOfLife)

In [None]:
select AUDIENCE_DESC from LAB_DATA.PUBLIC.SEGMENTS limit 5;

In [None]:
---SQL to leverage an LLM... it's this easy!  Millions of $$$ were spent to train these models, which you can use easily and cheaply.  And you can switch between them in seconds.
select
    snowflake.cortex.complete(
        'llama3-70b',
        'Write one catchy ad headline for a new computer for this audience, return the headline only : ' || AUDIENCE_DESC
    ) HEADLINE,
    AUDIENCE_DESC
    from  LAB_DATA.PUBLIC.SEGMENTS limit 1

In [None]:
select summary from LAB_DATA.PUBLIC.GAMES limit 5

In [None]:
---more complex scenario, join audiences to products and create ad copy
select 
 snowflake.cortex.complete(
        'mistral-large',
        'Write one catchy ad headline for this :product::::' || summary|| ':::: for the following audience, return the headline only :audience::::: ' || AUDIENCE_DESC
    ) HEADLINE,
    AUDIENCE_DESC

from
    LAB_DATA.PUBLIC.GAMES
    cross join LAB_DATA.PUBLIC.SEGMENTS -- pair every game with every audience
limit 5

In [None]:
# Easily save to a table in SQL or Python
# Assumes the SQL cell above is named save_this_data
save_this_data.to_df().write.mode("overwrite").save_as_table("AD_COPY_FOR_AUDIENCES")

In [None]:
--keywords from a webpage
select
    snowflake.cortex.complete( 'mistral-large','Return me the top 10 keywords ONLY from this product description as a JSON object:::::::::
    The GE Profile 5.3 cu. Ft. Smart Front Load Washer (Model # PFW870SSVWW) features OdorBlock, UltraFresh Vent System, Microban Technology, and SmartHQ connectivity. It offers 12 wash cycles, steam cleaning, and is ENERGY STAR certified. Dimensions are 39.75" H x 28" W x 34" D.
    CYCLE INFORMATION	Number of Cycles	Wash Cycles	• Self Clean • Power Clean	• Downloaded• Whites• Towels• Bulky/Bedding• Sanitize + Allergen	• Quick Wash	• Delicates	• Cold Wash	• Rinse + Spin	Options	• Extra Rinse	• UltraFresh Vent	• More Water	• Time Saver	• PreWash	• Smart Dispense	• Tumble Care	• Soak Rinse for Sanitizer	FEATURES	Technologies	• dBT™ Quiet Control	• Amazon Alexa	• IFTTT	• The Google Assistant	• Sonos	• UltraFresh Vent System Plus™ with OdorBlock™	• SmartDispense™ Technology	• Load-Sensing Adaptive Fill	• PerfecTemp Deluxe	• WiFi Connect Built-in	Control Lock	Stackable	Reversible Door	Wi-Fi	Dispensers	• Bleach Timed Flow Through	• Fabric Softener Timed Flow Through	• PreWash	• Smart Dispense - Bulk Detergent	• Protected with Microban®	Additional Information	• Automatic Temperature Control	• Auto-Load Sensing	• Adaptive Spin	• Adjustable Leveling Legs	• Internal heater	• LED Basket Light	• Power Cord Attached		• Control Type:	• Rotary-Electronic w/LEDs	• Power On/Off	• Start/Pause	• Capacitive Touch		• Optional Soil Levels:	• Extra Heavy	• Heavy	• Normal	• Light	• Extra Light		• Washer Control Features:	• Control Lock	• Adjustable End-of-Cycle Signal	• LED Display	• Digital Cycle Countdown	• LED Indicators	• Add Garment/Pause	• Remote Start	• Delay Wash - Up to 24 hours	ELECTRICAL INFORMATION	Information	• 60 Hz	• 15 or 20 A	CERTIFICATIONS AND APPROVALS	Energy Star Certified	Information	ACCESSORIES	Optional	• Pedestal Accessory GFP1528PNRS	• Riser GFR0728PNRS	• Stacking Kit GFA28KITN	DIMENSIONS	Class Width	Width	Height	Depth	Weight	Additional Dimensions	SHIPPING DIMENSIONS	Weight	WARRANTY	Warranty	• Limited 10-year motor 12	• Normal							• Delay Wash		• Quiet-By-Design with dBT™ Quiet Control											Yes	Yes	Yes	Yes	• Detergent (Liquid/Powder) Timed Flow Through						• Door Style: See-Thru Glass with Tinted Door Protect, Chrome Outer Ring with Chrome Vent	• 120 V				Yes	ADA		• Fill Hose WH41X10207					28" - 30"	28 inch(es) / 71.12 cm	39.75 
inch(es) / 100.97 cm	34 inch(es) / 86.36 cm	245.99 lb(s) / 111.58 kg	Depth with Door Open 90 Degrees: 56 1/2 in		264 lb(s) / 119.75 kg		• Limited 1-Year Parts and Labor	') as WEB_PAGE_KEYWORDS
	

    
    

In [None]:
select
   REVIEW_TEXT CHECK_FOR_TEXT_FOR_TOXIC_LANGUAGE
   from   LAB_DATA.PUBLIC.REVIEWS 
   limit 5

In [None]:
---DATA CLASSIFICATION

select
   REVIEW_TEXT, snowflake.cortex.complete(
        'llama3-8b',
        '[INST]
### 
Tell me based on the following Text, is there any toxic or offensive language in the text? Answer should be only one of the following words - \
"Likely Toxic" or "Unlikely Toxic" or "Unsure". Make sure there is no additional text.
Review -
###' || REVIEW_TEXT) as CLASSIFIED_TEXT
     from  LAB_DATA.PUBLIC.REVIEWS limit 5
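
Even with a strict prompt, a model can add quotes, punctuation, or a short sentence around the label. A minimal sketch of normalizing such responses back to the three allowed labels follows. The helper name `normalize_label` and the fallback to "Unsure" are assumptions made for illustration.

```python
ALLOWED_LABELS = ("Likely Toxic", "Unlikely Toxic", "Unsure")


def normalize_label(response: str) -> str:
    """Map a free-text model response onto one of the allowed labels."""
    cleaned = response.strip().strip('"\'.').lower()
    # Check longer labels first so "unlikely toxic" is not matched as "likely toxic".
    for label in sorted(ALLOWED_LABELS, key=len, reverse=True):
        if label.lower() in cleaned:
            return label
    # Assumption: anything unrecognized is treated as "Unsure".
    return "Unsure"


# Hypothetical responses, standing in for values in the CLASSIFIED_TEXT column.
responses = [' "Likely Toxic". ', "The text is Unlikely Toxic", "I cannot tell"]
normalized = [normalize_label(r) for r in responses]
print(normalized)
```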

### EMBED_TEXT
https://docs.snowflake.com/sql-reference/functions/embed_text-snowflake-cortex

In [None]:
select
    snowflake.cortex.embed_text_768(
        'snowflake-arctic-embed-m',
        'I love ads'
    );

In [None]:
select
    summary,
    snowflake.cortex.embed_text_768(
        'snowflake-arctic-embed-m',
        summary
    ) as summary_embedding
from
    LAB_DATA.PUBLIC.GAMES
limit 10;

### VECTOR SIMILARITY CALCULATIONS  
https://docs.snowflake.com/en/sql-reference/functions/vector_cosine_similarity
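
Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures direction rather than magnitude. The pure-Python sketch below mirrors that calculation on tiny hand-made vectors. It is only an illustration of the math, not Snowflake's implementation, and real embeddings have 768 dimensions rather than two or three.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score about 1, whatever their length.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Perpendicular vectors score 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```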

In [None]:
select
    vector_cosine_similarity(
        snowflake.cortex.embed_text_768('snowflake-arctic-embed-m', 'California Contemporary style'),
        snowflake.cortex.embed_text_768('snowflake-arctic-embed-m', 'California Contemporary style homes')
    );

In [None]:
--Snowflake's own SOTA text embedding model
create or replace table TEXT_EMBED as 
select REVIEW_TEXT as TEXT,
    snowflake.cortex.embed_text_768(
        'snowflake-arctic-embed-m',
        REVIEW_TEXT
    ) as TEXT_embedding
from
    LAB_DATA.PUBLIC.REVIEWS

In [None]:
--use cosine similarity to find pairs of similar review texts
select
    b.TEXT SIMILAR_TEXT, a.TEXT TEXT_SEARCHED, 
    vector_cosine_similarity(
        a.TEXT_embedding,
        b.TEXT_embedding
    ) as similarity
from
    TEXT_EMBED a
    cross join TEXT_EMBED b
where 
     b.TEXT < a.TEXT
     and similarity < .9
order by
    similarity desc
limit 20;

## Retrieval-Augmented Generation (RAG) 
RAG is the process of optimizing the output of a large language model so that it references an authoritative knowledge base outside its training data before generating a response.
We do this in two steps when querying the LLM:
1. First we take the question and get the relevant data using vector cosine similarities (as shown above).
2. Next we use that relevant data in the prompt to the LLM with our question.
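
The two steps can be sketched in plain Python to make the flow concrete before running it against Snowflake. Everything here is a toy assumption: the game titles, summaries, and three-dimensional vectors are made up, and the helper names `retrieve` and `build_prompt` are hypothetical. In the notebook the embeddings come from `embed_text_768` and the final prompt goes to `complete`.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(question_vec, documents, top_k=2):
    """Step 1: rank documents by similarity to the question and keep the best ones."""
    ranked = sorted(
        documents,
        key=lambda d: cosine(question_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, context_docs):
    """Step 2: place the retrieved text and the question into one prompt."""
    context = " | ".join(f"{d['title']}: {d['summary']}" for d in context_docs)
    return (
        "Answer the question using only the context.\n"
        f"<context>{context}</context>\n"
        f"<question>{question}</question>\n"
        "Answer:"
    )


# Toy documents with hand-made embeddings (placeholders, not real data).
documents = [
    {"title": "Game A", "summary": "A racing game set in a desert.", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Game B", "summary": "A puzzle game set in space.", "embedding": [0.1, 0.9, 0.0]},
    {"title": "Game C", "summary": "A racing game set in a city.", "embedding": [0.8, 0.2, 0.1]},
]
question = "Which games involve racing?"
question_vec = [1.0, 0.0, 0.0]  # hand-made stand-in for the question embedding

top_docs = retrieve(question_vec, documents, top_k=2)
prompt = build_prompt(question, top_docs)
print(prompt)
```

Only the retrieved documents reach the prompt, which keeps it short and focused on relevant material.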

In [None]:
#question = """What Cities does GTA V Take Place In?"""
#question = """Is Lebron James featured in any NBA 2K games?"""
question = """What year is Grand Theft Auto: Vice City set in?"""

In [None]:
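#step 1 get the relevant data
# Assumes a GAMES_EMBED table with title, summary and summary_embedding columns already exists
# The question is passed as a bind parameter so quotes in it cannot break the SQL
relevant_titles = session.sql("""
   select
            title, summary,
            vector_cosine_similarity(
                summary_embedding,
                snowflake.cortex.embed_text_768(
                  'snowflake-arctic-embed-m',
                  ?
                )
            ) as similarity
        from
            GAMES_EMBED
  order by
      similarity desc
  limit 10""", params=[question])
relevant_titles.show()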

In [None]:
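#step 2 feed to the LLM using a specific prompt
# Build the context from each row's title (x[0]) and summary (x[1]); titles alone give the LLM little to answer from
info = '. | '.join([f"{x[0]}: {x[1]}" for x in relevant_titles.collect()]).replace("'", "")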
prompt = f"""
            You are a video game expert. Please provide knowledge and guidance to the questions in the tags <question> and </question> based on the provided 
            context found between the tags <context> and </context>.

            <context>
            '{info}'
            </context>
            <question>
            '{question}'
            </question>
            Answer: """
query = """
      select
          snowflake.cortex.complete(
              ?, 
              ?
          ) as response
      """
complete = session.sql(query, params=['mistral-large', prompt])
with st.chat_message(name="Assistant"):
    st.write(complete.collect()[0][0])

# UI to Easily Create Chat Bots with (RAG & Search) is coming!
At Summit, our head of product randomly chose someone from the audience to create a chatbot in Snowflake.  They had logged into Snowflake only 7 times and were able to create a RAG chatbot using the new UI in minutes.

[Chatbot Demo with RAG](https://youtu.be/CWEjp1iadUc?feature=shared&t=532)

## Fine Tuning Too
Snowflake is making this easy too: you supply your training data to the model via a table and you get an additional version of the model, fine-tuned to your task.
Fine-tuning can offer the best combination of results and cost. Don't have training data?  Use a more expensive LLM to label a set of data, then train a smaller, cheaper model on it. Boom: cheap and effective.
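
The "expensive model labels, cheap model learns" idea can be sketched as building prompt and completion pairs. This assumes the fine-tuning job expects rows with a `prompt` and a `completion` column; check the docs linked below for the exact requirements. The helper `build_training_rows`, the instruction text, the sample reviews, and `fake_teacher` are all hypothetical. In practice the teacher would be a call to a larger model, such as `Complete("mistral-large", prompt)`.

```python
from typing import Callable, Dict, List

# Illustrative instruction; reuse whatever prompt the fine-tuned model will see in production.
INSTRUCTION = "Classify the review as Likely Toxic, Unlikely Toxic, or Unsure.\nReview: "


def build_training_rows(reviews: List[str], teacher: Callable[[str], str]) -> List[Dict[str, str]]:
    """Label each review with the teacher model and return prompt/completion rows."""
    rows = []
    for review in reviews:
        prompt = INSTRUCTION + review
        completion = teacher(prompt).strip()
        if completion:  # Skip rows where the teacher returned nothing.
            rows.append({"prompt": prompt, "completion": completion})
    return rows


def fake_teacher(prompt: str) -> str:
    """Stand-in for a larger LLM; a keyword rule used only so the sketch runs offline."""
    return "Likely Toxic" if "trash" in prompt.lower() else "Unlikely Toxic"


# Made-up reviews used purely for illustration.
example_reviews = [
    "Great game, loved the story.",
    "This game is trash and so are the devs.",
]
rows = build_training_rows(example_reviews, fake_teacher)
for row in rows:
    print(row["completion"])
```

From there the rows could be written to a table (for example with pandas and `session.write_pandas`) and used as the training data.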

[Snowflake Fine Tuning Docs](https://docs.snowflake.com/en/user-guide/snowflake-cortex/cortex-finetuning?_fsi=wWDeqSCS&_fsi=wWDeqSCS)

[Snowflake Fine Tuning Workshop](https://quickstarts.snowflake.com/guide/finetuning_llm_using_snowflake_cortex_ai/index.html?index=..%2F..index#2)