# Learn Snowflake Cortex AI - Python Tutorial

## Welcome to Your First AI-Powered Data Analysis!

This tutorial will teach you how to use Python to talk to Snowflake's AI system called "Cortex". You'll learn to:
- Ask questions in plain English
- Have AI write SQL code for you
- Get answers from your data

**IMPORTANT**: Type every single line of code by hand. Do not copy and paste. This helps your brain learn the patterns.

Let's start!

## Step 1: Import the Tools We Need

Before we can do anything, we need to tell Python which tools (called "libraries" or "modules") we want to use.

Think of this like getting your toolbox ready before starting a project.

In [None]:
# The 'import' keyword tells Python we want to use a tool
# 'json' helps us work with data that looks like {"key": "value"}
import json

# This is Snowflake's special internal tool for talking to their AI
# The underscore (_) at the start means it's an internal tool
# It is only available when this code runs inside Snowflake (for example, in a Snowflake Notebook)
import _snowflake

# This gets us connected to our Snowflake database
# 'from X import Y' means "from toolbox X, just get tool Y"
from snowflake.snowpark.context import get_active_session
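
To get a feel for what `json` does before we use it on a real AI response, here is a small practice sketch. The `sample` text is made up for illustration and has nothing to do with Snowflake yet.

```python
import json

# A made-up piece of JSON text (a string), just for practice
sample = '{"deal": "Acme renewal", "amount": 1200}'

# json.loads turns JSON text into a Python dictionary
data = json.loads(sample)
print(data["deal"])      # Acme renewal
print(data["amount"])    # 1200

# json.dumps goes the other way: dictionary -> JSON text
text = json.dumps(data)
print(type(text))        # <class 'str'>
```

We will use `json.loads` later in the tutorial to unpack the AI's answer.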

## Step 2: Set Up Our Configuration

Now we need to set up some settings. These are like the "address" and "phone number" for Snowflake's AI system.

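In Python, we use ALL_CAPS names by convention for settings that should not change (called "constants"). Python will not stop you from changing them; the capital letters are a reminder to leave them alone.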

In [None]:
# This is the "web address" where Snowflake's AI lives
# Think of it like a URL, but for computer programs
API_ENDPOINT = "/api/v2/cortex/agent:run"

# This says "wait up to 50 seconds for an answer"
# 50_000 milliseconds = 50 seconds (the underscores make big numbers easier to read)
API_TIMEOUT_MS = 50_000

# This tells Snowflake which AI brain to use - we're using Claude-4-Sonnet
MODEL_NAME = "claude-4-sonnet"
#MODEL_NAME = "auto"

# This is the "address" of our search tool that looks through sales conversations
CORTEX_SEARCH_SERVICES = "sales_intelligence.data.sales_conversation_search"

# This is the "address" of our data model that knows about sales metrics
SEMANTIC_MODELS = "@sales_intelligence.data.models/sales_metrics_model_grc.yaml"
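
If the underscore inside `50_000` looks odd, this tiny sketch shows that Python treats it exactly like `50000`, and how the milliseconds turn into seconds. The name `timeout_seconds` is only used here for illustration.

```python
# Underscores inside a number are only for readability; Python ignores them
API_TIMEOUT_MS = 50_000
print(API_TIMEOUT_MS == 50000)   # True

# Convert milliseconds to seconds by dividing by 1000
timeout_seconds = API_TIMEOUT_MS / 1000
print(timeout_seconds)           # 50.0
```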


## Step 3: Connect to Snowflake

Now we need to connect to our Snowflake database. Think of this like logging into your email - we need to establish a connection before we can do anything.

In [None]:
# This gets our current connection to Snowflake
# 'session' is like a phone line - it stays open so we can keep talking to Snowflake
session = get_active_session()

## Step 4: Ask Your First Question

Now comes the fun part! We're going to ask the AI a question in plain English, and it will:
1. Understand what we want
2. Write SQL code to get the data
3. Give us an answer

Let's start with a simple question about sales data.

In [None]:
# This is our question in plain English - no programming knowledge needed!
# We store it in a variable called 'prompt' (a prompt is a question for AI)
prompt = "Create a weekly sales metrics summary for the last 8 weeks. Include total revenue and win rate."

In [None]:
# An alternative question - running this cell replaces the prompt from the cell above
prompt = "Show me the recent closed won deals"

## Step 5: Package Our Request

Now we need to package our question with instructions for the AI. This is like addressing an envelope - we need to tell it who should answer, what tools they can use, and what our question is.

We use something called a "dictionary" (the curly braces `{}`) to organize this information.

In [None]:
# This is our "package" of information for the AI
# The curly braces {} create a dictionary - think of it like a filing cabinet with labeled folders
payload = {
    # Tell the AI which "brain" to use
    "model": MODEL_NAME,
    
    # This is our conversation with the AI
    # Square brackets [] create a list - like a grocery list
    "messages": [
        {
            # "role": "user" means this message is from us (the human)
            "role": "user", 
            # "content" is what we're actually saying
            "content": [
                {
                    # "type": "text" means we're sending words, not pictures
                    "type": "text", 
                    # "text" is our actual question
                    "text": prompt
                }
            ]
        }
    ],
    
    # These are the "tools" the AI can use to answer our question
    "tools": [
        {
            # This tool can write SQL code from our English question
            "tool_spec": {
                "type": "cortex_analyst_text_to_sql", 
                "name": "analyst1"
            }
        },
        {
            # This tool can search through conversations and documents
            "tool_spec": {
                "type": "cortex_search", 
                "name": "search1"
            }
        }
    ],
    
    # These are the "settings" for each tool
    "tool_resources": {
        # Settings for the SQL-writing tool
        "analyst1": {
            # This tells the tool where to find information about our data structure
            "semantic_model_file": SEMANTIC_MODELS
        },
        # Settings for the search tool
        "search1": {
            # Where to search
            "name": CORTEX_SEARCH_SERVICES, 
            # Don't return more than 3 results
            "max_results": 3, 
            # Use this column as the unique identifier
            "id_column": "conversation_id"
        }
    }
}
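
The payload is just dictionaries and lists nested inside each other. To practice reading values back out, here is a sketch with a cut-down copy. The name `mini_payload` and the question inside it are made up for this exercise.

```python
# A cut-down copy of the payload, with a made-up question
mini_payload = {
    "model": "claude-4-sonnet",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Show me recent deals"}]}
    ],
}

# Dictionaries are read by key, lists by position (counting starts at 0)
first_message = mini_payload["messages"][0]
print(first_message["role"])                  # user
print(first_message["content"][0]["text"])    # Show me recent deals
```

The same pattern works on the full `payload`: follow the keys and positions from the outside in.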

## Step 6: Send Our Request to the AI

Now we actually send our question to Snowflake's AI. This is like pressing "Send" on an email.

We also need to check whether something went wrong, for example if the AI service is busy or the request took too long.

In [None]:
# Send our request to Snowflake's AI
# This is like making a phone call - we dial the number (API_ENDPOINT) and talk (payload)
resp = _snowflake.send_snow_api_request(
    "POST",           # "POST" means we're sending data (like mailing a letter)
    API_ENDPOINT,     # Where to send it
    {},               # Headers (extra info) - empty for now
    {},               # Parameters (settings) - empty for now  
    payload,          # Our actual question and settings
    None,             # Request ID (used for tracking) - None means we don't supply one
    API_TIMEOUT_MS    # How long to wait for an answer
)

# Check if something went wrong
# Status 200 means "everything worked perfectly"
if resp.get("status") != 200:
    # If something went wrong, stop and show an error message
    # The 'f' before the quotes lets us put variables inside the text
    raise RuntimeError(f"HTTP {resp.get('status')}: {resp.get('reason')} -> {resp}")
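
You can practice the error check without calling Snowflake at all. The sketch below wraps the same idea in a small function and tries it on two made-up response dictionaries. The function name `check_response` and both sample dictionaries are assumptions for illustration; they only mimic the `status` and `reason` keys used above.

```python
def check_response(resp):
    """Raise an error unless the response dictionary reports status 200."""
    status = resp.get("status")
    if status != 200:
        raise RuntimeError(f"HTTP {status}: {resp.get('reason')}")
    return resp

# Made-up response dictionaries, shaped like the one used in this step
ok = {"status": 200, "reason": "OK", "content": "[]"}
bad = {"status": 500, "reason": "Internal Server Error", "content": ""}

check_response(ok)            # passes silently

try:
    check_response(bad)
except RuntimeError as err:
    print(err)                # HTTP 500: Internal Server Error
```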

In [None]:
resp

## Step 7: Understand the AI's Response

The AI sends us back a lot of information as a list of "events". Each event is a small dictionary, and only some of them hold the answer. We need to extract the parts we care about:
1. The AI's explanation in English
2. The SQL code it wrote

Think of this like opening a package and sorting out what's inside.

In [None]:
# The AI sends back JSON data (looks like {"key": "value"})
# We convert this from text into something Python can understand
events = json.loads(resp["content"])

# Create empty variables to store what we find
# Think of these like empty boxes we'll fill up
assistant_text = ""  # This will hold the AI's explanation
generated_sql = ""   # This will hold the SQL code the AI wrote

# Look through all the pieces of the AI's response
# 'for' means "do this for each item in the list"
for ev in events:
    # We only care about "message.delta" events (these contain the actual answer)
    # '.get()' safely gets a value - if it doesn't exist, it returns None instead of crashing
    if ev.get("event") != "message.delta":
        continue  # Skip this event and go to the next one
    
    # Look through the content of this event
    # This is like opening nested boxes inside boxes
    for item in ev.get("data", {}).get("delta", {}).get("content", []):
        # If this piece is text (the AI's explanation)
        if item.get("type") == "text":
            # Add it to our explanation box
            # The '+=' means "add this to what we already have"
            assistant_text += item.get("text", "")
        
        # If this piece is tool results (SQL code and data)
        elif item.get("type") == "tool_results":
            # Look through each result from the tools
            for r in item.get("tool_results", {}).get("content", []):
                # If this result is JSON data
                if r.get("type") == "json":
                    # Get the JSON content
                    j = r.get("json", {})
                    # Add any text explanation to our explanation box
                    assistant_text += j.get("text", "")
                    # If there's SQL code, save it
                    if j.get("sql"):
                        generated_sql = j["sql"]
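
To see the nested loops work without a live connection, here is a sketch that puts the same logic in a function and feeds it hand-written events. The function name `parse_events`, the `sample_events` list, and the `"done"` event name are all made up for practice; real responses will contain more events and more fields.

```python
def parse_events(events):
    """Collect the explanation text and the last SQL statement from a list of events."""
    text, sql = "", ""
    for ev in events:
        if ev.get("event") != "message.delta":
            continue  # Skip events that don't carry part of the answer
        for item in ev.get("data", {}).get("delta", {}).get("content", []):
            if item.get("type") == "text":
                text += item.get("text", "")
            elif item.get("type") == "tool_results":
                for r in item.get("tool_results", {}).get("content", []):
                    if r.get("type") == "json":
                        j = r.get("json", {})
                        text += j.get("text", "")
                        if j.get("sql"):
                            sql = j["sql"]
    return text, sql

# Hand-written sample events (not real output), shaped like what the loop expects
sample_events = [
    {"event": "message.delta", "data": {"delta": {"content": [
        {"type": "tool_results", "tool_results": {"content": [
            {"type": "json", "json": {"text": "Here is the query. ", "sql": "SELECT 1;"}}
        ]}}
    ]}}},
    {"event": "message.delta", "data": {"delta": {"content": [
        {"type": "text", "text": "Done."}
    ]}}},
    {"event": "done", "data": {}},  # Ignored: not a "message.delta" event
]

text, sql = parse_events(sample_events)
print(text)   # Here is the query. Done.
print(sql)    # SELECT 1;
```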

## Step 8: Clean Up the Text

Sometimes the AI's text includes special bracket symbols (`【†` and `†】`, which typically wrap source references) that don't look nice. We'll swap them for ordinary square brackets to make the text more readable.

In [None]:
# Replace weird symbols with normal brackets
# The AI sometimes uses special Unicode symbols that look strange
# '.replace()' finds text and changes it to something else
assistant_text = assistant_text.replace("【†", "[").replace("†】", "]")
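
Here is the same clean-up on a made-up sentence, so you can see exactly what changes. The text in `raw` is invented for this example.

```python
# A made-up sentence containing both special symbols
raw = "Revenue grew last quarter【†1†】"

# Each .replace() returns a new string, so the calls can be chained
clean = raw.replace("【†", "[").replace("†】", "]")
print(clean)   # Revenue grew last quarter[1]

# The original string is left untouched
print(raw)     # Revenue grew last quarter【†1†】
```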

## Step 9: Show the Results

Now let's see what the AI found for us! We'll print out:
1. The AI's explanation in English
2. The SQL code it wrote

In [None]:
# Print the AI's explanation
# '\n' means "start a new line" - like pressing Enter
print("\n--- Assistant Text ---\n", assistant_text or "(none)")

# Print the SQL code the AI wrote
print("\n--- Generated SQL ---\n", generated_sql or "(none)")
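
The `or "(none)"` part is a handy trick worth a closer look. This sketch uses two made-up strings to show what it does.

```python
empty_answer = ""
real_answer = "Revenue was up."

# An empty string counts as "false", so 'or' falls through to the placeholder
print(empty_answer or "(none)")   # (none)

# A non-empty string counts as "true", so it is used as-is
print(real_answer or "(none)")    # Revenue was up.
```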

## Step 10: Run the SQL and See Our Data

If the AI wrote SQL code for us, let's run it and see what data we get back!

In [None]:
# Only run the SQL if we actually got some
# 'if' means "only do this when the condition is true"
if generated_sql:
    # Clean up the SQL by trimming a trailing semicolon (;) that might cause problems
    # Then run it and convert the results to a pandas DataFrame (like a spreadsheet)
    df = session.sql(generated_sql.strip().rstrip(";")).to_pandas()
    
    # Show the first few rows of our results
    # '.head()' means "show me the beginning"
    print("\n--- Query Results (first few rows) ---\n", df.head())
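
One detail worth knowing when trimming semicolons: `.replace(";", "")` removes every semicolon in the text, while `.strip().rstrip(";")` only trims the end. The sketch below compares the two on a made-up query whose text value happens to contain a semicolon.

```python
# A made-up query whose text value contains a semicolon
sql = "SELECT 'a;b' AS label;"

# replace() deletes every semicolon, including the one inside the quotes
print(sql.replace(";", ""))        # SELECT 'ab' AS label

# strip() then rstrip(";") only trims the end of the statement
print(sql.strip().rstrip(";"))     # SELECT 'a;b' AS label
```

Trimming only the end keeps the meaning of the query intact.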

## Step 11: Try Another Question

Great job! Now let's try a different question to see how the AI handles various types of requests.

This time we'll ask about revenue by week - a common business question.

In [None]:
# Let's ask a different question
prompt = "Total revenue for all Closed Won Deals by week"

## Step 12: Package the Second Request

We'll create the same type of package as before, but with our new question.

In [None]:
# Create the same type of package, but with our new question
payload = {
    "model": MODEL_NAME,
    "messages": [
        {
            "role": "user", 
            "content": [
                {
                    "type": "text", 
                    "text": prompt
                }
            ]
        }
    ],
    "tools": [
        {
            "tool_spec": {
                "type": "cortex_analyst_text_to_sql", 
                "name": "analyst1"
            }
        },
        {
            "tool_spec": {
                "type": "cortex_search", 
                "name": "search1"
            }
        }
    ],
    "tool_resources": {
        "analyst1": {
            "semantic_model_file": SEMANTIC_MODELS
        },
        "search1": {
            "name": CORTEX_SEARCH_SERVICES, 
            "max_results": 3, 
            "id_column": "conversation_id"
        }
    }
}
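
Since this package is identical to the one in Step 5 apart from the question, one option is to build it with a function. The sketch below is a possible helper, not part of Snowflake; the name `build_payload` is made up, and the constants are repeated from Step 2 so the block runs on its own.

```python
# Same constants as in Step 2
MODEL_NAME = "claude-4-sonnet"
CORTEX_SEARCH_SERVICES = "sales_intelligence.data.sales_conversation_search"
SEMANTIC_MODELS = "@sales_intelligence.data.models/sales_metrics_model_grc.yaml"

def build_payload(question):
    """Return the request dictionary for one question (hypothetical helper)."""
    return {
        "model": MODEL_NAME,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": question}]}
        ],
        "tools": [
            {"tool_spec": {"type": "cortex_analyst_text_to_sql", "name": "analyst1"}},
            {"tool_spec": {"type": "cortex_search", "name": "search1"}},
        ],
        "tool_resources": {
            "analyst1": {"semantic_model_file": SEMANTIC_MODELS},
            "search1": {
                "name": CORTEX_SEARCH_SERVICES,
                "max_results": 3,
                "id_column": "conversation_id",
            },
        },
    }

# Build a package for any question with a single line
payload = build_payload("Total revenue for all Closed Won Deals by week")
print(payload["messages"][0]["content"][0]["text"])
```

With a helper like this, asking a third or fourth question only needs a new call with different text.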

## Step 13: Complete the Second Request

Now we'll send our second question and process the response - same steps as before!

In [None]:
# Send the second request
resp = _snowflake.send_snow_api_request(
    "POST", 
    API_ENDPOINT, 
    {}, 
    {}, 
    payload, 
    None, 
    API_TIMEOUT_MS
)

# Check for errors
if resp.get("status") != 200:
    raise RuntimeError(f"HTTP {resp.get('status')}: {resp.get('reason')} -> {resp}")

# Parse the response
events = json.loads(resp["content"])

# Reset our variables for the new response
assistant_text = ""
generated_sql = ""

# Process the events (same logic as before)
for ev in events:
    if ev.get("event") != "message.delta":
        continue
    for item in ev.get("data", {}).get("delta", {}).get("content", []):
        if item.get("type") == "text":
            assistant_text += item.get("text", "")
        elif item.get("type") == "tool_results":
            for r in item.get("tool_results", {}).get("content", []):
                if r.get("type") == "json":
                    j = r.get("json", {})
                    assistant_text += j.get("text", "")
                    if j.get("sql"):
                        generated_sql = j["sql"]

# Clean up the text
assistant_text = assistant_text.replace("【†", "[").replace("†】", "]")

# Show the results
print("\n--- Assistant Text ---\n", assistant_text or "(none)")
print("\n--- Generated SQL ---\n", generated_sql or "(none)")

# Run the SQL if we got some
if generated_sql:
    df = session.sql(generated_sql.strip().rstrip(";")).to_pandas()
    print("\n--- Query Results (first few rows) ---\n", df.head())