# Snowflake Cost Analysis Agent v2.0

## Overview
This notebook creates a Cortex Analyst agent that can answer questions about **any time period** (up to 365 days).

## Key Changes in v2.0:
- ✅ **No time restrictions** - Ask about any time period within the available data (up to 365 days)
- ✅ **Simplified views** - Only 3 views (was 5)
- ✅ **Removed DAYS_BACK** - Time periods defined in questions
- ✅ **Cortex handles time filtering** - Views expose all data
- ✅ **Fully automated** - No manual steps

## Example Questions (Any Time Period!):
- What is my daily AI cost for the last 30 days?
- Show me costs for warehouse X in March 2025
- Compare my costs from last week to this week
- Show me Q1 2025 costs by product category
- What are my top 5 warehouses in the last 90 days?
- How much did I spend on compute vs storage in January?
- Show me cost trends over the past 6 months
- How much does the cost agent itself cost?

## Step 1: Configuration

Only ONE configuration variable needed (no more DAYS_BACK!):

In [None]:
-- Configuration
USE ROLE ACCOUNTADMIN;

-- Only variable you need to update
SET COST_PER_CREDIT = 2.5;  -- Your Snowflake cost per credit in USD

-- Create dedicated warehouse for cost agent
CREATE WAREHOUSE IF NOT EXISTS COST_AGENT_WH
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = FALSE
    COMMENT = 'Dedicated warehouse for Cost Analysis Agent - tracks agent costs';

USE WAREHOUSE COST_AGENT_WH;

-- Create database and schema
CREATE DATABASE IF NOT EXISTS COST_ANALYSIS_DB;
CREATE SCHEMA IF NOT EXISTS COST_ANALYSIS_DB.COST_VIEWS;
USE DATABASE COST_ANALYSIS_DB;
USE SCHEMA COST_VIEWS;

SELECT
    'Configuration complete!' AS STATUS,
    'Cost per credit: $' || $COST_PER_CREDIT AS COST_INFO,
    'Warehouse: COST_AGENT_WH' AS WAREHOUSE_INFO;

## Step 2: Create Simplified Cost Views

We create 3 views that add value WITHOUT time restrictions:

1. **COST_WITH_CATEGORIES** - Adds product category grouping (NO time filter)
2. **WAREHOUSE_COSTS** - Adds USD cost calculations (NO time filter)
3. **QUERY_COSTS** - Adds proportional cost estimates (NO time filter)

**Important**: These views do NOT filter by time - they expose all available data (up to 365 days).
The Cortex Analyst will handle time filtering based on user questions.

**Why views instead of direct ACCOUNT_USAGE access?**
- Add product category grouping (AI, Storage, Compute, etc.)
- Calculate USD costs from credits
- Provide proportional query cost estimates
- Reduce noise and consolidate metrics for better agent performance
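
As a rough illustration of the first two points, the sketch below rolls service-level credits up into product categories and converts them to USD. It is plain Python rather than the SQL the views use, the mapping is an abbreviated subset of the one in View 1, and the sample rows and helper names (`categorize`, `cost_by_category`) are made up for illustration.

```python
from collections import defaultdict

COST_PER_CREDIT = 2.5  # same illustrative rate as in Step 1

# Abbreviated subset of the SERVICE_TYPE -> PRODUCT_CATEGORY mapping (illustration only)
CATEGORY_BY_SERVICE = {
    "CORTEX_ANALYST": "AI & Machine Learning",
    "WAREHOUSE_METERING": "Data Transformation & Compute",
    "STORAGE": "Data Storage",
    "CLOUD_SERVICES": "Cloud Services",
}


def categorize(service_type):
    """Return the product category, falling back to 'Other Services'."""
    return CATEGORY_BY_SERVICE.get(service_type, "Other Services")


def cost_by_category(rows, cost_per_credit=COST_PER_CREDIT):
    """Sum credits * cost_per_credit per product category."""
    totals = defaultdict(float)
    for service_type, credits_used in rows:
        totals[categorize(service_type)] += credits_used * cost_per_credit
    return dict(totals)


# Made-up sample rows: (SERVICE_TYPE, CREDITS_USED)
sample_rows = [
    ("WAREHOUSE_METERING", 10.0),
    ("CORTEX_ANALYST", 2.0),
    ("WAREHOUSE_METERING", 4.0),
    ("SOMETHING_NEW", 1.0),  # unmapped types land in 'Other Services'
]

for category, cost in cost_by_category(sample_rows).items():
    print(f"{category}: ${cost:.2f}")
```

Doing this grouping once in a view means the agent only needs to filter on `PRODUCT_CATEGORY` instead of reasoning about individual service types.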

In [None]:
-- View 1: Cost with Product Categories
-- NO time filter - exposes all available data
CREATE OR REPLACE VIEW COST_WITH_CATEGORIES AS
SELECT
    USAGE_DATE,
    SERVICE_TYPE,
    CASE
        -- AI and ML Services
        WHEN SERVICE_TYPE IN ('SNOWPARK', 'SNOWPARK_CONTAINER_SERVICES', 'CORTEX_SEARCH',
                              'CORTEX_ANALYST', 'CORTEX_FINE_TUNING', 'CORTEX_INFERENCE',
                              'ML_FUNCTIONS', 'SNOWFLAKE_ML')
            THEN 'AI & Machine Learning'
        -- Data Transformation & Processing
        WHEN SERVICE_TYPE IN ('WAREHOUSE_METERING', 'COMPUTE', 'QUERY_ACCELERATION',
                              'MATERIALIZED_VIEW', 'PIPE', 'TASK', 'AUTOMATIC_CLUSTERING')
            THEN 'Data Transformation & Compute'
        -- Data Storage
        WHEN SERVICE_TYPE IN ('STORAGE', 'DATA_TRANSFER', 'DATABASE_STORAGE',
                              'STAGE_STORAGE', 'FAILSAFE_STORAGE')
            THEN 'Data Storage'
        -- Data Sharing & Collaboration
        WHEN SERVICE_TYPE IN ('DATA_SHARING', 'REPLICATION', 'EXTERNAL_FUNCTIONS',
                              'LISTING_AUTO_FULFILLMENT')
            THEN 'Data Sharing & Collaboration'
        -- Cloud Services
        WHEN SERVICE_TYPE IN ('CLOUD_SERVICES', 'CLOUD_SERVICES_ONLY')
            THEN 'Cloud Services'
        -- Serverless Features
        WHEN SERVICE_TYPE IN ('SERVERLESS_TASK', 'SERVERLESS_FEATURE')
            THEN 'Serverless Features'
        -- Search & Optimization
        WHEN SERVICE_TYPE IN ('SEARCH_OPTIMIZATION', 'AUTOMATIC_RECLUSTERING')
            THEN 'Search & Optimization'
        ELSE 'Other Services'
    END AS PRODUCT_CATEGORY,
    CREDITS_USED,
    CREDITS_USED * $COST_PER_CREDIT AS COST_USD,
    ACCOUNT_NAME
FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY;
-- NO WHERE clause! All data exposed.

SELECT 'View COST_WITH_CATEGORIES created successfully' AS STATUS;

In [None]:
-- View 2: Warehouse Costs
-- NO time filter - exposes all available data
CREATE OR REPLACE VIEW WAREHOUSE_COSTS AS
SELECT
    START_TIME,
    END_TIME,
    WAREHOUSE_NAME,
    CREDITS_USED,
    CREDITS_USED_COMPUTE,
    CREDITS_USED_CLOUD_SERVICES,
    CREDITS_USED * $COST_PER_CREDIT AS COST_USD,
    CREDITS_USED_COMPUTE * $COST_PER_CREDIT AS COMPUTE_COST_USD,
    CREDITS_USED_CLOUD_SERVICES * $COST_PER_CREDIT AS CLOUD_SERVICES_COST_USD,
    WAREHOUSE_ID
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY;
-- NO WHERE clause! All data exposed.

SELECT 'View WAREHOUSE_COSTS created successfully' AS STATUS;

In [None]:
-- View 3: Query Costs with Estimates
-- NO time filter - exposes all available data
CREATE OR REPLACE VIEW QUERY_COSTS AS
WITH warehouse_hourly_credits AS (
    SELECT
        DATE_TRUNC('HOUR', START_TIME) AS HOUR,
        WAREHOUSE_NAME,
        SUM(CREDITS_USED) AS TOTAL_CREDITS
    FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
    GROUP BY DATE_TRUNC('HOUR', START_TIME), WAREHOUSE_NAME
    -- NO WHERE clause!
),
query_hours AS (
    SELECT
        QUERY_ID,
        DATE_TRUNC('HOUR', START_TIME) AS HOUR,
        START_TIME,
        END_TIME,
        QUERY_TYPE,
        USER_NAME,
        WAREHOUSE_NAME,
        DATABASE_NAME,
        SCHEMA_NAME,
        EXECUTION_STATUS,
        TOTAL_ELAPSED_TIME / 1000.0 / 3600.0 AS QUERY_HOURS,
        BYTES_SCANNED,
        ROWS_PRODUCED
    FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
    WHERE QUERY_TYPE NOT IN ('USE', 'SHOW', 'DESCRIBE')
        AND WAREHOUSE_NAME IS NOT NULL
    -- NO time filter!
),
warehouse_total_hours AS (
    SELECT
        HOUR,
        WAREHOUSE_NAME,
        SUM(QUERY_HOURS) AS TOTAL_HOURS
    FROM query_hours
    GROUP BY HOUR, WAREHOUSE_NAME
)
SELECT
    qh.QUERY_ID,
    qh.START_TIME,
    qh.END_TIME,
    qh.QUERY_TYPE,
    qh.USER_NAME,
    qh.WAREHOUSE_NAME,
    qh.DATABASE_NAME,
    qh.SCHEMA_NAME,
    qh.EXECUTION_STATUS,
    qh.QUERY_HOURS,
    qh.BYTES_SCANNED,
    qh.ROWS_PRODUCED,
    whc.TOTAL_CREDITS AS WAREHOUSE_HOURLY_CREDITS,
    wth.TOTAL_HOURS AS WAREHOUSE_HOURLY_QUERY_HOURS,
    CASE
        WHEN wth.TOTAL_HOURS > 0 THEN
            (qh.QUERY_HOURS / wth.TOTAL_HOURS) * whc.TOTAL_CREDITS
        ELSE 0
    END AS ESTIMATED_CREDITS,
    CASE
        WHEN wth.TOTAL_HOURS > 0 THEN
            (qh.QUERY_HOURS / wth.TOTAL_HOURS) * whc.TOTAL_CREDITS * $COST_PER_CREDIT
        ELSE 0
    END AS ESTIMATED_COST_USD
FROM query_hours qh
LEFT JOIN warehouse_hourly_credits whc
    ON qh.HOUR = whc.HOUR
    AND qh.WAREHOUSE_NAME = whc.WAREHOUSE_NAME
LEFT JOIN warehouse_total_hours wth
    ON qh.HOUR = wth.HOUR
    AND qh.WAREHOUSE_NAME = wth.WAREHOUSE_NAME;

SELECT 'View QUERY_COSTS created successfully' AS STATUS;
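
The proportional estimate in QUERY_COSTS can be easier to follow with numbers. The sketch below is a minimal Python rendering of the same arithmetic for one warehouse and one hour: each query receives a share of that hour's credits equal to its share of total query time. The function name `estimate_query_costs` and the sample figures are illustrative assumptions, not part of the notebook.

```python
COST_PER_CREDIT = 2.5  # same illustrative rate as in Step 1


def estimate_query_costs(query_hours, warehouse_hourly_credits,
                         cost_per_credit=COST_PER_CREDIT):
    """Split one warehouse-hour of credits across queries by elapsed time.

    query_hours maps a query id to its elapsed time in hours.
    Returns {query_id: (estimated_credits, estimated_cost_usd)}.
    """
    total_hours = sum(query_hours.values())
    estimates = {}
    for query_id, hours in query_hours.items():
        if total_hours > 0:
            credits = (hours / total_hours) * warehouse_hourly_credits
        else:
            credits = 0.0  # mirrors the ELSE 0 branch in the view
        estimates[query_id] = (credits, credits * cost_per_credit)
    return estimates


# Made-up example: two queries in an hour where the warehouse used 2 credits
sample = {"q1": 0.25, "q2": 0.75}
estimates = estimate_query_costs(sample, warehouse_hourly_credits=2.0)
for query_id, (credits, cost) in estimates.items():
    print(f"{query_id}: {credits:.2f} credits, ${cost:.2f}")
```

As in the view, a query is attributed to the hour in which it started, and all credits for that hour (including idle time) are spread across the queries that ran, so the figures are estimates rather than billed amounts.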

## Step 3: Verify Views

Check that views were created and see data availability:

In [None]:
-- Check data availability and date ranges
SELECT 'COST_WITH_CATEGORIES' AS VIEW_NAME,
       COUNT(*) AS TOTAL_ROWS,
       MIN(USAGE_DATE) AS EARLIEST_DATE,
       MAX(USAGE_DATE) AS LATEST_DATE,
       DATEDIFF('day', MIN(USAGE_DATE), MAX(USAGE_DATE)) AS DAYS_OF_DATA
FROM COST_WITH_CATEGORIES
UNION ALL
SELECT 'WAREHOUSE_COSTS',
       COUNT(*),
       MIN(DATE(START_TIME)),
       MAX(DATE(START_TIME)),
       DATEDIFF('day', MIN(DATE(START_TIME)), MAX(DATE(START_TIME)))
FROM WAREHOUSE_COSTS
UNION ALL
SELECT 'QUERY_COSTS',
       COUNT(*),
       MIN(DATE(START_TIME)),
       MAX(DATE(START_TIME)),
       DATEDIFF('day', MIN(DATE(START_TIME)), MAX(DATE(START_TIME)))
FROM QUERY_COSTS;

In [None]:
-- Preview recent data from each view
SELECT 'Recent Service Costs' AS SAMPLE_TYPE;

SELECT USAGE_DATE, PRODUCT_CATEGORY, SERVICE_TYPE, COST_USD
FROM COST_WITH_CATEGORIES
ORDER BY USAGE_DATE DESC
LIMIT 5;

SELECT 'Recent Warehouse Costs' AS SAMPLE_TYPE;

SELECT DATE(START_TIME) AS DATE, WAREHOUSE_NAME, COST_USD
FROM WAREHOUSE_COSTS
ORDER BY START_TIME DESC
LIMIT 5;

In [None]:
-- Check if COST_AGENT_WH appears in costs
-- (May take a few hours due to ACCOUNT_USAGE latency)
SELECT
    DATE(START_TIME) AS DATE,
    WAREHOUSE_NAME,
    SUM(COST_USD) AS TOTAL_COST
FROM WAREHOUSE_COSTS
WHERE WAREHOUSE_NAME = 'COST_AGENT_WH'
GROUP BY DATE(START_TIME), WAREHOUSE_NAME
ORDER BY DATE DESC
LIMIT 10;
-- If empty, check back in a few hours!

## Step 4: Create Cortex Analyst Semantic Model

The semantic model defines how Cortex Analyst understands your data.

**Key Feature**: No time restrictions - users can ask about any time period!
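
To make "any time period" concrete, the sketch below shows the date ranges that a few of the example questions imply. It is plain Python for illustration only and says nothing about how Cortex Analyst resolves periods internally; the helper names (`last_n_days`, `calendar_month`, `calendar_quarter`) are hypothetical.

```python
from datetime import date, timedelta


def last_n_days(n, today):
    """Window starting n days before today, as in DATEADD('day', -n, CURRENT_DATE())."""
    return today - timedelta(days=n), today


def calendar_month(year, month):
    """First and last day of a calendar month."""
    start = date(year, month, 1)
    if month == 12:
        next_month_start = date(year + 1, 1, 1)
    else:
        next_month_start = date(year, month + 1, 1)
    return start, next_month_start - timedelta(days=1)


def calendar_quarter(year, quarter):
    """First and last day of a calendar quarter (1-4)."""
    first_month = 3 * (quarter - 1) + 1
    start = date(year, first_month, 1)
    _, end = calendar_month(year, first_month + 2)
    return start, end


today = date.today()
print("Last 30 days:", last_n_days(30, today))
print("March 2025:  ", calendar_month(2025, 3))
print("Q1 2025:     ", calendar_quarter(2025, 1))
```

Because the views carry no time filter of their own, any of these ranges can be applied to `USAGE_DATE` or `START_TIME`, provided the range falls within the data the views actually contain.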

In [None]:
# Create semantic model YAML
semantic_model_yaml = '''name: snowflake_cost_analysis
description: Analyze Snowflake costs for any time period (up to 365 days)

tables:
  - name: COST_WITH_CATEGORIES
    description: Service costs with product categories. Contains ALL historical data.
    base_table:
      database: COST_ANALYSIS_DB
      schema: COST_VIEWS
      table: COST_WITH_CATEGORIES
    dimensions:
      - name: USAGE_DATE
        description: Date of usage
        data_type: DATE
        synonyms: [date, day, when, time, period]
      - name: SERVICE_TYPE
        description: Specific Snowflake service
        data_type: VARCHAR
        synonyms: [service, service name, type]
      - name: PRODUCT_CATEGORY
        description: Product category grouping
        data_type: VARCHAR
        synonyms: [category, product, product type]
        sample_values:
          - AI & Machine Learning
          - Data Transformation & Compute
          - Data Storage
          - Cloud Services
    measures:
      - name: CREDITS_USED
        description: Credits consumed
        data_type: NUMBER
        aggregation: SUM
      - name: COST_USD
        description: Cost in US dollars
        data_type: NUMBER
        aggregation: SUM
        synonyms: [cost, spend, spending, expense]

  - name: WAREHOUSE_COSTS
    description: Warehouse costs. Contains ALL historical data.
    base_table:
      database: COST_ANALYSIS_DB
      schema: COST_VIEWS
      table: WAREHOUSE_COSTS
    dimensions:
      - name: START_TIME
        description: Start time of metering period
        data_type: TIMESTAMP_NTZ
        synonyms: [start, time, date, when]
      - name: WAREHOUSE_NAME
        description: Warehouse name
        data_type: VARCHAR
        synonyms: [warehouse, wh]
    measures:
      - name: COST_USD
        description: Total cost in USD
        data_type: NUMBER
        aggregation: SUM
        synonyms: [cost, spend]
      - name: COMPUTE_COST_USD
        description: Compute cost in USD
        data_type: NUMBER
        aggregation: SUM
      - name: CLOUD_SERVICES_COST_USD
        description: Cloud services cost in USD
        data_type: NUMBER
        aggregation: SUM

  - name: QUERY_COSTS
    description: Query-level costs. Contains ALL historical data.
    base_table:
      database: COST_ANALYSIS_DB
      schema: COST_VIEWS
      table: QUERY_COSTS
    dimensions:
      - name: START_TIME
        description: Query start time
        data_type: TIMESTAMP_NTZ
        synonyms: [start, time, date, when]
      - name: USER_NAME
        description: User who executed query
        data_type: VARCHAR
        synonyms: [user, username, who]
      - name: WAREHOUSE_NAME
        description: Warehouse used
        data_type: VARCHAR
        synonyms: [warehouse, wh]
    measures:
      - name: ESTIMATED_COST_USD
        description: Estimated cost in USD
        data_type: NUMBER
        aggregation: SUM
        synonyms: [cost, spend]

verified_queries:
  - name: ai_costs_flexible
    question: What is my daily AI cost for the last 30 days?
    sql: |
      SELECT USAGE_DATE, SUM(COST_USD) AS AI_COST
      FROM COST_ANALYSIS_DB.COST_VIEWS.COST_WITH_CATEGORIES
      WHERE PRODUCT_CATEGORY = 'AI & Machine Learning'
        AND USAGE_DATE >= DATEADD('day', -30, CURRENT_DATE())
      GROUP BY USAGE_DATE ORDER BY USAGE_DATE DESC

  - name: warehouse_costs_flexible
    question: Show me costs for a warehouse
    sql: |
      SELECT DATE(START_TIME) AS DATE, WAREHOUSE_NAME, SUM(COST_USD) AS COST
      FROM COST_ANALYSIS_DB.COST_VIEWS.WAREHOUSE_COSTS
      WHERE START_TIME >= DATEADD('day', -30, CURRENT_DATE())
      GROUP BY DATE(START_TIME), WAREHOUSE_NAME
      ORDER BY DATE DESC

  - name: cost_agent_costs
    question: How much does the cost agent itself cost?
    sql: |
      SELECT DATE(START_TIME) AS DATE, SUM(COST_USD) AS COST
      FROM COST_ANALYSIS_DB.COST_VIEWS.WAREHOUSE_COSTS
      WHERE WAREHOUSE_NAME = 'COST_AGENT_WH'
      GROUP BY DATE(START_TIME) ORDER BY DATE DESC
'''

print('Semantic model created successfully!')
print('No time restrictions - users can ask about any period!')

In [None]:
-- Create stage
CREATE STAGE IF NOT EXISTS COST_AGENT_STAGE;

SELECT 'Stage created successfully' AS STATUS;

In [None]:
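# Upload semantic model to stage
import os
import tempfile

from snowflake.snowpark.context import get_active_session

# Assumes this cell runs inside a Snowflake Notebook, where an active session is available
session = get_active_session()

# Write the YAML to a temporary local file so it can be PUT to the stage
with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as f:
    f.write(semantic_model_yaml)
    temp_file = f.name

try:
    put_result = session.file.put(
        temp_file,
        '@COST_AGENT_STAGE',
        auto_compress=False,
        overwrite=True
    )
    print('Semantic model uploaded successfully!')
    print(put_result)
finally:
    # Always remove the local temporary file
    os.unlink(temp_file)

# The staged file keeps the temporary file's name; list the stage to see it
session.sql('LIST @COST_AGENT_STAGE').show()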

## Step 5: Create Cortex Analyst Agent

The agent will be created automatically. If this fails, use the manual method below.

In [None]:
-- Create Cortex Analyst Agent
-- Note: this statement creates a Cortex Search Service. In the documented syntax, ON takes
-- a single text search column and additional columns are listed under ATTRIBUTES, so the
-- statement may fail as written. If it does, use the manual method below.
CREATE OR REPLACE CORTEX SEARCH SERVICE SNOWFLAKE_COST_ANALYST
    ON USAGE_DATE, SERVICE_TYPE, WAREHOUSE_NAME, PRODUCT_CATEGORY, START_TIME
    WAREHOUSE = COST_AGENT_WH
    TARGET_LAG = '1 hour'
    AS (
        SELECT
            USAGE_DATE,
            USAGE_DATE AS START_TIME,
            SERVICE_TYPE,
            'N/A' AS WAREHOUSE_NAME,
            PRODUCT_CATEGORY,
            CREDITS_USED,
            COST_USD
        FROM COST_WITH_CATEGORIES
        UNION ALL
        SELECT
            DATE(START_TIME) AS USAGE_DATE,
            START_TIME,
            'N/A' AS SERVICE_TYPE,
            WAREHOUSE_NAME,
            'N/A' AS PRODUCT_CATEGORY,
            CREDITS_USED,
            COST_USD
        FROM WAREHOUSE_COSTS
    );

SELECT 'Agent created! Go to Snowflake Intelligence to use it.' AS STATUS;

## Alternative: Manual Agent Creation

If automated creation failed:
1. Go to Snowflake Intelligence
2. Create Analyst
3. Use semantic model from @COST_AGENT_STAGE

In [None]:
-- Test: Different time periods
SELECT 'Last 7 Days' AS PERIOD, SUM(COST_USD) AS TOTAL_COST
FROM COST_WITH_CATEGORIES
WHERE USAGE_DATE >= DATEADD('day', -7, CURRENT_DATE())
UNION ALL
SELECT 'Last 30 Days', SUM(COST_USD)
FROM COST_WITH_CATEGORIES
WHERE USAGE_DATE >= DATEADD('day', -30, CURRENT_DATE())
UNION ALL
SELECT 'Last 90 Days', SUM(COST_USD)
FROM COST_WITH_CATEGORIES
WHERE USAGE_DATE >= DATEADD('day', -90, CURRENT_DATE());

## Summary - All Set! 🎉

### What We Created:
- 3 simplified views (NO time filters)
- Dedicated warehouse (COST_AGENT_WH)
- Semantic model (no time restrictions)
- Cortex Analyst agent

### Ask Questions About ANY Time Period:
- Last 7/30/90 days, 6 months, 1 year
- Specific months (March 2025)
- Quarters (Q1 2025)
- Compare periods (last week vs this week)

### Example Questions:
- What is my daily AI cost for the last 30 days?
- Show me costs for warehouse X in March 2025
- Compare my costs from last week to this week
- Show me Q1 2025 costs by product category
- How much does the cost agent itself cost?

**Go to Snowflake Intelligence and start asking!**

The Cortex Analyst will handle all time filtering based on your questions.