In [None]:
import warnings
warnings.filterwarnings("ignore")
import streamlit as st
from snowflake.snowpark.context import get_active_session
from snowflake.core import Root
from snowflake.cortex import Complete
# Reuse the notebook's active Snowpark session for SQL and for the Python API
session = get_active_session()
root = Root(session)

# Cortex Agents
In this notebook you will set up multiple Cortex Search and Cortex Analyst services, which Cortex Agents will use to answer user queries on unstructured and structured data.
![text](https://github.com/michaelgorkow/snowflake_cortex_agents_demo/blob/main/resources/cortex_agents_notebook_small.png?raw=true)

# Set Up the Cortex Search Service [Unstructured Data]

Our database contains property descriptions for rental homes. Users should be able to search these descriptions and ask questions about them.

In [None]:
SELECT * FROM HOMES;

In [None]:
-- Create a Cortex Search Service for rental home descriptions
CREATE OR REPLACE CORTEX SEARCH SERVICE RENTAL_HOME_DESCRIPTIONS
  ON RENTAL_HOME_DESCRIPTION
  ATTRIBUTES RENTAL_HOME_ID
  WAREHOUSE = COMPUTE_WH
  TARGET_LAG = '1 hour'
  EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
AS (
  SELECT
      RENTAL_HOME_ID,
      RENTAL_HOME_DESCRIPTION
  FROM HOMES
);

### [Optional] Test Your Service in a Simple RAG Pipeline  

In this small example, we **combine Cortex Search with Cortex LLMs** to generate a response from context—also known as **Retrieval-Augmented Generation (RAG)**.  
This approach enhances responses by retrieving relevant data before generating an answer, improving accuracy and contextual relevance. 🚀  

In [None]:
question = 'I am looking for a place at the beach with a BBQ area. It should also be close to the airport.'

# Fetch service
my_service = (root
  .databases["CORTEX_AGENTS_DEMO"]
  .schemas["STAYBNB"]
  .cortex_search_services["RENTAL_HOME_DESCRIPTIONS"]
)

# Query service
resp = my_service.search(
  query=question,
  columns=["RENTAL_HOME_ID", "RENTAL_HOME_DESCRIPTION"],
  limit=1
)
resp = resp.results[0]

with st.expander(f'**Rental Home ID:** {resp["RENTAL_HOME_ID"]}', expanded=False):
    st.info(resp["RENTAL_HOME_DESCRIPTION"])

# Generate Response
model = 'mistral-large2'
prompt = f"{question} Answer based on the provided context: {resp['RENTAL_HOME_DESCRIPTION']}"
response = Complete(model, prompt).strip()

st.info(f'**LLM Response:**\n\n**{response}**')
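
The example above passes only the single top result to the LLM. If you raise `limit`, the prompt has to combine several retrieved descriptions. The sketch below shows one way this could be done in plain Python. `build_rag_prompt` is a hypothetical helper, not part of any Snowflake library, and the sample results are made up; they only mimic the dictionary shape of the entries in `resp.results` used above.

```python
def build_rag_prompt(question, results,
                     id_key="RENTAL_HOME_ID",
                     text_key="RENTAL_HOME_DESCRIPTION"):
    """Combine a question and retrieved documents into one prompt (hypothetical helper)."""
    if not results:
        # Nothing was retrieved: tell the model explicitly instead of sending an empty context
        return f"{question}\n\nNo context was retrieved. Say that you cannot answer."
    # Label each description with its ID so the answer can refer to a specific home
    context_blocks = [f"[Rental home {r[id_key]}]\n{r[text_key]}" for r in results]
    context = "\n\n".join(context_blocks)
    return f"{question}\n\nAnswer based only on the provided context:\n\n{context}"


# Made-up results for illustration only
sample_results = [
    {"RENTAL_HOME_ID": 1, "RENTAL_HOME_DESCRIPTION": "Beach house with a BBQ area."},
    {"RENTAL_HOME_ID": 2, "RENTAL_HOME_DESCRIPTION": "City loft close to the airport."},
]
demo_prompt = build_rag_prompt("Which home has a BBQ area?", sample_results)
print(demo_prompt)
```

The resulting string could be passed to `Complete(model, demo_prompt)` in the same way as the single-result prompt above.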

# Set Up the Cortex Analyst Service [Structured Data]   

## Dataset Overview  

The dataset consists of **two tables**:  

- **`HOMES`**  
  - Contains data about **property type** and **total square meters**.

- **`BOOKINGS`**  
  - Stores **booking details**, including:  
    - `BOOKING_START_DATE`  
    - `BOOKING_END_DATE`  
    - `PRICE_PER_NIGHT`  
    - `CLEANING_FEE`  

In addition, a stage holds images of each rental home.
To let users ask questions about available appliances, we'll turn these images into structured data:
- **`RENTAL_HOME_ROOMS_STRUCTURED`**  
  - Stores **room types** and their **appliances**, one row per appliance.

This structured dataset will allow **Cortex Analyst** to process user queries efficiently and return meaningful results. 🚀  


In [None]:
SELECT * FROM HOMES LIMIT 3;

In [None]:
SELECT * FROM BOOKINGS LIMIT 3;

In [None]:
-- Extract room types and appliances from images via multimodal LLM
CREATE OR REPLACE TABLE RENTAL_HOME_ROOMS AS (
    SELECT 
        split(RELATIVE_PATH,'/')[0]::text AS RENTAL_HOME_ID,
        PARSE_JSON(
            SNOWFLAKE.CORTEX.COMPLETE(
                'claude-3-5-sonnet',
                'Extract the room type and appliances in the room. 
                 Respond in JSON format.
                 Examples: 
                 {
                    "room":"kitchen", "appliances":["refrigerator","microwave", "oven", "coffee machine", "wine fridge", "dishwasher"]
                 }
                 {
                    "room":"living room", "appliances":["tv", "game console", "couch", "speakers"]
                 }',
                TO_FILE('@CORTEX_AGENTS_DEMO.STAYBNB.IMAGES', RELATIVE_PATH)
            )
        ) AS ROOM_APPLIANCES
    FROM 
        DIRECTORY(@CORTEX_AGENTS_DEMO.STAYBNB.IMAGES)
);

-- Format the output
CREATE OR REPLACE TABLE RENTAL_HOME_ROOMS_STRUCTURED AS (
    SELECT 
        RENTAL_HOME_ID,
        ROOM_APPLIANCES['room']::TEXT AS ROOM_TYPE,
        F.VALUE::TEXT AS APPLIANCE
    FROM 
        RENTAL_HOME_ROOMS,
        -- One output row per element of the "appliances" array
        LATERAL FLATTEN(input => ROOM_APPLIANCES['appliances']) AS F
);

SELECT * FROM RENTAL_HOME_ROOMS_STRUCTURED;
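
The statements above take the home ID from the first path segment, parse the model's JSON reply, and emit one row per appliance. The same logic can be sketched in plain Python to make each step explicit. This is an illustration only: `home_id_from_path`, `parse_llm_json`, and `flatten_rooms` are hypothetical helpers, the sample path and reply are made up, and stripping a Markdown code fence is an assumption about how a model might wrap its output, not documented behavior of `SNOWFLAKE.CORTEX.COMPLETE`.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker, built indirectly for readability


def home_id_from_path(relative_path):
    """Mirror split(RELATIVE_PATH, '/')[0]: the first path segment is the home ID."""
    return relative_path.split("/")[0]


def parse_llm_json(reply):
    """Parse a JSON object from an LLM reply; return None if it is not valid JSON."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Assumption: the model wrapped its JSON in a fenced block; drop both fence lines
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[: -len(FENCE)]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def flatten_rooms(rental_home_id, parsed):
    """Emit one (home ID, room type, appliance) tuple per appliance, like LATERAL FLATTEN."""
    if not isinstance(parsed, dict):
        return []
    return [
        (rental_home_id, parsed.get("room"), appliance)
        for appliance in parsed.get("appliances", [])
    ]


# Made-up path and reply for illustration only
sample_reply = FENCE + 'json\n{"room": "kitchen", "appliances": ["oven", "microwave"]}\n' + FENCE
rows = flatten_rooms(home_id_from_path("1/kitchen.jpg"), parse_llm_json(sample_reply))
print(rows)
```

Returning `None` for unparseable replies lets the caller skip a bad image instead of failing the whole batch, which is a design choice of this sketch rather than something the SQL above does.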

In [None]:
CREATE OR REPLACE CORTEX SEARCH SERVICE _ANALYST_ROOM_TYPE_SEARCH
  ON ROOM_TYPE
  WAREHOUSE = COMPUTE_WH
  TARGET_LAG = '1 hour'
  EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
AS (
  SELECT
      DISTINCT ROOM_TYPE
  FROM RENTAL_HOME_ROOMS_STRUCTURED
);

In [None]:
CREATE OR REPLACE CORTEX SEARCH SERVICE _ANALYST_APPLIANCE_SEARCH
  ON APPLIANCE
  WAREHOUSE = COMPUTE_WH
  TARGET_LAG = '1 hour'
  EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
AS (
  SELECT
      DISTINCT APPLIANCE
  FROM RENTAL_HOME_ROOMS_STRUCTURED
);

# Explore the Semantic Model in the Snowflake UI  

With our data available in **Snowflake** and the **Search Services** set up, it's time to explore the **native Semantic Model Generator** in **Snowsight**.  
![text](https://github.com/michaelgorkow/snowflake_cortex_agents_demo/blob/main/resources/semantic_model_ui.png?raw=true)

## Why Do You Need a Semantic Model?  

**Cortex Analyst** allows users to query **Snowflake** data using **natural language**. However, business users often use terminology that does not align with the database schema.  

### The Problem  
- Users specify **domain-specific business terms** in their questions  
- Underlying data is stored using **technical abbreviations**  
  - Example: **"CUST"** is used for **customers**  
- The **schema lacks semantic context**, making queries harder to interpret  

### The Solution: Semantic Models  
Semantic models **map business terminology to database schemas** and provide **contextual meaning**.  

#### Example  
When a user asks:  
🗣️ *"Total revenue last month"*  

The **semantic model** can:  
✅ Define **"revenue"** as **net revenue**  
✅ Define **"last month"** as **the previous calendar month**  

This mapping helps **Cortex Analyst** understand **user intent** and generate **accurate answers**.  
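
To make the mapping concrete, the sketch below resolves the two terms from the example in plain Python. It only illustrates the idea; it is neither the semantic model file format nor how Cortex Analyst works internally. `TERM_DEFINITIONS`, `previous_calendar_month`, and the `NET_REVENUE` column name are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical mapping from a business term to a schema expression
TERM_DEFINITIONS = {
    "revenue": "SUM(NET_REVENUE)",  # assumed column name, for illustration only
}


def previous_calendar_month(today):
    """Return (first_day, last_day) of the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    # The day before the first of this month is the last day of the previous month
    last_day = first_of_this_month - timedelta(days=1)
    return last_day.replace(day=1), last_day


start, end = previous_calendar_month(date(2025, 3, 15))
print(TERM_DEFINITIONS["revenue"], start, end)
# prints: SUM(NET_REVENUE) 2025-02-01 2025-02-28
```

Without such definitions, "last month" could equally mean the trailing 30 days, which is the kind of ambiguity a semantic model is meant to remove.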

🔗 More details can be found in the [Semantic Model Specification](https://docs.snowflake.com/en/user-guide/snowflake-cortex/cortex-analyst/semantic-model-spec).  
