## Let's show how to use Cortex in Python

This notebook shows how to get started with Cortex in Python, both for individual queries and for working with tables in Snowflake.

Related Docs:
https://docs.snowflake.com/en/sql-reference/functions/complete-snowflake-cortex

In [6]:
# Snowpark for Python
from snowflake.snowpark.session import Session
from snowflake.snowpark.types import Variant
from snowflake.snowpark.version import VERSION

# Snowpark ML
from snowflake.ml.utils import connection_params

# Misc
import pandas as pd
import json
import logging
from snowflake import connector

# Only show Snowpark session log messages at ERROR level or above
logger = logging.getLogger("snowflake.snowpark.session")
logger.setLevel(logging.ERROR)

In [7]:
with open('../creds.json') as f:
    data = json.load(f)
    USERNAME = data['user']
    PASSWORD = data['password']
    SF_ACCOUNT = data['account']
    SF_WH = data['warehouse']

CONNECTION_PARAMETERS = {
   "account": SF_ACCOUNT,
   "user": USERNAME,
   "password": PASSWORD,
   "warehouse": SF_WH,  # loaded from creds.json above
}

session = Session.builder.configs(CONNECTION_PARAMETERS).create()
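
A small helper can keep the credential handling in one place. The sketch below is an assumption about how one might structure it, not part of the Snowpark API: `build_connection_parameters` is a hypothetical function that requires the `account`, `user`, and `password` keys and passes optional keys such as `warehouse` through only when they are present in the file.

```python
# Hypothetical helper (not part of Snowpark): validate a creds dict and
# build the dictionary that Session.builder.configs() expects.
REQUIRED_KEYS = ("account", "user", "password")
OPTIONAL_KEYS = ("warehouse", "database", "schema", "role")


def build_connection_parameters(creds):
    """Return connection parameters, failing early if a required key is missing."""
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise KeyError(f"creds file is missing: {', '.join(missing)}")
    params = {key: creds[key] for key in REQUIRED_KEYS}
    # Only include optional settings that were actually provided.
    params.update({key: creds[key] for key in OPTIONAL_KEYS if creds.get(key)})
    return params


# Placeholder values for illustration only.
example = build_connection_parameters(
    {"account": "my_account", "user": "me", "password": "secret", "warehouse": "my_wh"}
)
print(sorted(example))
```

The result could then be passed to `Session.builder.configs(...)` in place of a hand-written dictionary.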

In [29]:
snowflake_environment = session.sql('select current_user(), current_version()').collect()
snowpark_version = VERSION

from snowflake.ml import version
mlversion = version.VERSION


# Current Environment Details
print('User                        : {}'.format(snowflake_environment[0][0]))
print('Role                        : {}'.format(session.get_current_role()))
print('Database                    : {}'.format(session.get_current_database()))
print('Schema                      : {}'.format(session.get_current_schema()))
print('Warehouse                   : {}'.format(session.get_current_warehouse()))
print('Snowflake version           : {}'.format(snowflake_environment[0][1]))
print('Snowpark for Python version : {}.{}.{}'.format(snowpark_version[0],snowpark_version[1],snowpark_version[2]))
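# snowflake.ml exposes VERSION as a string such as '1.2.2', so print it directly
# rather than indexing individual characters (which breaks for versions like '1.10.0').
print('Snowflake ML version : {}'.format(mlversion))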

'User                        : RSHAH'
'Role                        : "RAJIV"'
'Database                    : "RAJIV"'
'Schema                      : "PUBLIC"'
'Warehouse                   : "RAJIV"'
'Snowflake version           : 8.9.1'
'Snowpark for Python version : 1.11.1'
'Snowflake ML version : 1.2.2'
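
The two version values above may not share a type: Snowpark's `VERSION` is indexed like a tuple here, while the Snowflake ML value appears to be a string. A minimal sketch of a formatter that accepts either form follows; `format_version` is a hypothetical helper name, and the assumption about the two input shapes is based only on how the values are used in this notebook.

```python
def format_version(version):
    """Return 'major.minor.patch' for a tuple like (1, 11, 1) or a string like '1.2.2'."""
    if isinstance(version, str):
        parts = version.split(".")
    else:
        parts = [str(part) for part in version]
    # Keep at most the first three components.
    return ".".join(parts[:3])


print(format_version((1, 11, 1)))
print(format_version("1.2.2"))
```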


## Run all the Cortex functions

In [5]:
from snowflake.cortex import Complete, ExtractAnswer, Sentiment, Summarize, Translate

text = """
    The Snowflake company was co-founded by Thierry Cruanes, Marcin Zukowski,
    Bob Muglia, and Benoit Dageville in 2012 and is headquartered in Bozeman,
    Montana.
"""

print(Complete("llama2-70b-chat", "how do snowflakes get their unique patterns?"))
print(ExtractAnswer(text, "When was snowflake founded?"))
print(Sentiment("I really enjoyed this restaurant. Fantastic service!"))
print(Summarize(text))
print(Translate(text, "en", "fr"))

Complete() is experimental since 1.0.12. Do not use it in production. 
ExtractAnswer() is experimental since 1.0.12. Do not use it in production. 


 Snowflakes get their unique patterns through a process called crystallization, which occurs when water vapor in the air freezes into ice crystals. As the water vapor freezes, it forms a nucleus, which is a small cluster of water molecules that acts as a center around which the crystal will grow. The nucleus is typically made up of a few hundred water molecules, and it can be formed in various ways, such as through the condensation of water vapor onto a dust particle or the collision of two water molecules.

Once the nucleus has formed, it begins to grow as water vapor in the air condenses onto it, forming a crystal lattice structure. The lattice structure is made up of a repeating pattern of water molecules that are arranged in a specific way, with each molecule bonded to its neighbors through hydrogen bonds. The unique pattern of the snowflake is determined by the way the water molecules arrange themselves in the lattice structure, and this is influenced by a number of factors, inclu

Sentiment() is experimental since 1.0.12. Do not use it in production. 


[
  {
    "answer": "2012",
    "score": 0.9998274
  }
]


Summarize() is experimental since 1.0.12. Do not use it in production. 


0.8329001


Translate() is experimental since 1.0.12. Do not use it in production. 


The Snowflake company was founded by Thierry Cruanes, Marcin Zukowski, Bob Muglia, and Benoit Dageville in 2012 and is based in Bozeman, Montana.
La société Snowflake a été fondée en 2012 par Thierry Cruanes, Marcin Zukowski, Bob Muglia et Benoit Dageville et a son siège social à Bozeman, au Montana.
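
The `ExtractAnswer` output above is a JSON list of candidate answers with scores. Assuming the function returns that list as a JSON string (as the printed output suggests), a short sketch for pulling out the top answer could look like this. `best_answer` and its `min_score` cutoff are hypothetical names chosen for illustration.

```python
import json

# Sample string copied from the ExtractAnswer output shown above.
raw = '[{"answer": "2012", "score": 0.9998274}]'


def best_answer(raw_json, min_score=0.5):
    """Return the highest-scoring answer, or None if nothing clears min_score."""
    candidates = json.loads(raw_json)
    if not candidates:
        return None
    top = max(candidates, key=lambda item: item.get("score", 0.0))
    return top["answer"] if top.get("score", 0.0) >= min_score else None


print(best_answer(raw))
```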


In [6]:
print(Complete("llama2-70b-chat", "how do snowflakes get their unique patterns?"))

 Snowflakes get their unique patterns through a process called crystallization, which occurs when water vapor in the air freezes into ice crystals. As the water vapor freezes, it forms a nucleus, which is a small cluster of water molecules that acts as a center around which the crystal will grow. The nucleus is typically made up of a few hundred water molecules, and it can be formed in various ways, such as through the condensation of water vapor onto a dust particle or the collision of two water molecules.

Once the nucleus has formed, it begins to grow as water vapor in the air condenses onto it, forming a crystal lattice structure. The lattice structure is made up of a repeating pattern of water molecules that are arranged in a specific way, with each molecule bonded to its neighbors through hydrogen bonds. The unique pattern of the snowflake is determined by the way the water molecules arrange themselves in the lattice structure, and this is influenced by a number of factors, inclu

## Use it on a column in a dataset

In [8]:
from snowflake.cortex import Complete, ExtractAnswer, Sentiment, Summarize, Translate
from snowflake.snowpark.functions import col

article_df = session.table("IMDB_SAMPLE")
article_df = article_df.withColumn(
    "text",
    Summarize(col("text"))
)
article_df.collect()

Summarize() is experimental since 1.0.12. Do not use it in production. 


[Row(LABEL=1, TEXT='The text expresses high praise for the movie, featuring great entertainment, wonderful performances by Belushi, Beach, Dalton, and Railsback, and a 10/10 rating.'),
 Row(LABEL=1, TEXT='The viewer enjoyed "Timothy Dalton\'s James Bond" movie despite it not being his third outing as expected. Belushi added humor with his performance, while Dalton hammed it up. The other British actor was surprising, but overall, it was a comedic film.'),
 Row(LABEL=1, TEXT='The person bought a movie they expected to dislike but found enjoyable. James Belushi and Timothy Dalton gave standout performances. The film\'s ending, featuring Bill "The Mouth" Manucci\'s theft of 12 Million Dollars from the Mafia and subsequent hiding in a witness protection program, was well-executed with good camera work, dialogues, and acting.'),
 Row(LABEL=1, TEXT='The movie "Made Men" surprised the viewer with its unexpected comedic elements instead of the anticipated action thriller genre. James Belushi s

In [10]:
df=article_df.toPandas()
df

|   | LABEL | TEXT |
|---|-------|------|
| 0 | 1 | The text expresses high praise for the movie, ... |
| 1 | 1 | The viewer enjoyed "Timothy Dalton's James Bon... |
| 2 | 1 | The person bought a movie they expected to dis... |
| 3 | 1 | The movie "Made Men" surprised the viewer with... |
| 4 | 1 | "Made Men" is a great action movie with many t... |
| 5 | 1 | This movie features explosions, shootouts, and... |
| 6 | 1 | This movie had surprising good one-liners, lau... |
| 7 | 1 | The text is a review of the movie "Beirut" (19... |
| 8 | 1 | Jack Dundee, played by Robin Williams, intends... |
| 9 | 1 | The movie "The Best of Times" is about a middl... |


## Converting Cortex SQL to Python

Simple example

In [None]:
df = session.sql("""SELECT SNOWFLAKE.CORTEX.COMPLETE('mixtral-8x7b', 'What are large language models?')""")
df.collect()

[Row(SNOWFLAKE.CORTEX.COMPLETE('MIXTRAL-8X7B', 'WHAT ARE LARGE LANGUAGE MODELS?')=' Large language models are artificial intelligence models that have been trained on a vast amount of text data to generate human-like text. They are called "large" because they typically have a large number of parameters (billions) and require a significant amount of computational resources to train. These models can be used for a variety of natural language processing tasks, such as text generation, translation, summarization, question answering, and more. They work by predicting the likelihood of a word given the previous words in a sentence, allowing them to generate coherent and contextually appropriate text. Some examples of large language models include OpenAI\'s GPT-3, Google\'s BERT, and Facebook\'s RoBERTa.')]
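
When the prompt comes from a Python variable rather than a hard-coded string, it has to be turned into a valid SQL string literal first. The sketch below shows one way to do that by doubling backslashes and single quotes. `sql_literal` and `complete_query` are hypothetical helpers, not Snowpark functions, and bind parameters are usually the safer route if your Snowpark version supports them.

```python
def sql_literal(text):
    """Quote text as a SQL string literal, escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def complete_query(model, prompt):
    """Build a SELECT statement that calls SNOWFLAKE.CORTEX.COMPLETE."""
    return (
        "SELECT SNOWFLAKE.CORTEX.COMPLETE("
        f"{sql_literal(model)}, {sql_literal(prompt)})"
    )


query = complete_query("mixtral-8x7b", "What's a large language model?")
print(query)
```

The resulting string can be passed to `session.sql(query)` in the same way as the hard-coded example.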

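Adjusting generation options such as temperature and max_tokens. When the prompt is passed as an array of messages together with an options object, COMPLETE returns a JSON string rather than plain text, so the response is parsed in the following cell.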

In [None]:
df = session.sql("""SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama2-70b-chat',
    [
        {'role': 'user',
         'content': 'how does a snowflake get its unique pattern?'}
    ],
    {
        'temperature': 0.7,
        'max_tokens': 30
    }
);""")
data_string = df.collect()

In [None]:
# Accessing the JSON string
json_string = data_string[0][0]

# Parse the JSON string
try:
    json_data = json.loads(json_string)
    # Extract the message
    message = json_data["choices"][0]["messages"] if "choices" in json_data and json_data["choices"] else "Message not found."
except json.JSONDecodeError:
    message = "Invalid JSON format."

# Print the extracted message
print(message)

 The unique pattern on a snowflake is formed by a combination of factors, including the temperature and humidity in the air, the shape
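
The parsing step can be wrapped in a reusable function. This is a minimal sketch that assumes the response shape used in the cell above (a `choices` list whose first element has a `messages` key). `extract_message` is a hypothetical helper, and the sample payload is made up for illustration rather than captured from Snowflake.

```python
import json


def extract_message(raw_json, default="Message not found."):
    """Return choices[0]['messages'] from a COMPLETE response, with fallbacks."""
    try:
        payload = json.loads(raw_json)
    except (json.JSONDecodeError, TypeError):
        return "Invalid JSON format."
    choices = payload.get("choices") if isinstance(payload, dict) else None
    if not choices or not isinstance(choices[0], dict):
        return default
    return choices[0].get("messages", default)


# Illustrative payload only; real responses may carry additional fields.
sample = json.dumps({"choices": [{"messages": " The unique pattern on a snowflake"}]})
print(extract_message(sample))
```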


## Let's run it across a column in a table 

In [30]:
df = session.sql("""SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-7b',
        CONCAT('Critique this review in bullet points: <review>', text, '</review>')
) FROM IMDB_SAMPLE LIMIT 10;""")
data_string = df.collect()
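
The `CONCAT` call wraps each review in `<review>` tags so the model can tell the instruction apart from the text being critiqued. The same prompt can be built on the Python side, for example to preview it before running the query. In this sketch `build_critique_prompt` is a hypothetical helper, and `max_chars` is an arbitrary guard chosen for illustration, not a documented Cortex limit.

```python
def build_critique_prompt(review, max_chars=4000):
    """Mirror the SQL CONCAT: wrap a (possibly trimmed) review in <review> tags."""
    trimmed = review.strip()[:max_chars]
    return f"Critique this review in bullet points: <review>{trimmed}</review>"


prompt = build_critique_prompt("  Great movie, loved the ending.  ")
print(prompt)
```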

In [31]:
import pandas as pd

# Collect the completion text (a string) from each Row object
parsed_data = [row[0] for row in data_string]

# Build a pandas DataFrame with a single 'text' column
df = pd.DataFrame(parsed_data, columns=['text'])
print(df['text'][2])

(' * The reviewer expresses initial skepticism about the movie but was '
 'pleasantly surprised.\n'
 '\n'
 '* James Belushi\'s performance as Bill "The Mouth" Manucci is commended.\n'
 '\n'
 "* Timothy Dalton's performance as the Sheriff is also praised.\n"
 '\n'
 "* The 'end' scene in Bill's house is highlighted as excellent, with good "
 'camera work, nice dialogues, and good acting.\n'
 '\n'
 '* The plot involves Bill Manucci, who has stolen 12 Million Dollars from the '
 'Mafia, living in South-Carolina under witness protection with his wife.\n'
 '\n'
 '* The Mafia tracks him down and wants the money back, leaving Bill to trust '
 'only himself.\n'
 '\n'
 '* No specific criticisms or negative comments are mentioned in the review.\n'
 '\n'
 "* The reviewer's language is informal and conversational, making it easy to "
 'read and understand.\n'
 '\n'
 '* The review does not provide enough context to determine the title or genre '
 'of the movie.\n'
 '\n'
 '* The reviewer does not men