# üöÄ Cortex Analyst Interactive Tutorial
## Learn by Doing: Snowflake Cortex Analyst Service

**Author:** Li Ma  
**Date:** February 24, 2026  
**Project:** DIA v2.0 - Direct Marketing Analytics Intelligence

---

## üìö What You'll Learn

This interactive notebook teaches you how to:
1. ‚úÖ Connect to Snowflake with Python
2. ‚úÖ Use Cortex Analyst to convert natural language to SQL
3. ‚úÖ Execute queries and process results
4. ‚úÖ Build production-ready service wrappers
5. ‚úÖ Handle errors and log activities

## üéØ Prerequisites

- Docker containers running (`docker-compose up`)
- Snowflake credentials configured in `.env` file
- Semantic model deployed to Snowflake stage

---

**üí° Tip:** Run each cell with `Shift + Enter` and experiment with the code!

In [7]:
# Install required packages for this notebook
# Run this cell once to install dependencies
import sys
import subprocess

packages = [
    'structlog',
    'python-dotenv',
    'snowflake-snowpark-python'
]

print("üì¶ Installing required packages...")
for package in packages:
    print(f"   Installing {package}...")
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])
        print(f"   ‚úÖ {package} installed")
    except subprocess.CalledProcessError as e:
        print(f"   ‚ùå Failed to install {package}: {e}")

print("\n‚úÖ Installation complete!")
print("‚ö†Ô∏è  If this is the first install, please RESTART THE KERNEL:")
print("   Jupyter menu: Kernel ‚Üí Restart Kernel")

üì¶ Installing required packages...
   Installing structlog...
   ‚úÖ structlog installed
   Installing python-dotenv...
   ‚úÖ python-dotenv installed
   Installing snowflake-snowpark-python...
   ‚úÖ snowflake-snowpark-python installed

‚úÖ Installation complete!
‚ö†Ô∏è  If this is the first install, please RESTART THE KERNEL:
   Jupyter menu: Kernel ‚Üí Restart Kernel


In [9]:
# Add the parent directory to path so we can import modules
import sys
import os

# Calculate the project paths dynamically
# This notebook is in: notebooks/ folder
notebook_dir = os.getcwd()
project_root = os.path.abspath(os.path.join(notebook_dir, '..'))
orchestrator_path = os.path.join(project_root, 'orchestrator')

# Add paths for both local and Docker environments
sys.path.insert(0, orchestrator_path)
sys.path.insert(0, project_root)
sys.path.insert(0, '/app')  # For Docker environment

print(f"üìÅ Python paths added:")
print(f"   Project Root: {project_root}")
print(f"   Orchestrator: {orchestrator_path}")

# Verify orchestrator path exists
if os.path.exists(orchestrator_path):
    print(f"   ‚úÖ Orchestrator directory found")
else:
    print(f"   ‚ö†Ô∏è  Orchestrator directory NOT found at: {orchestrator_path}")

# Core Python libraries
import json
from typing import Dict, List, Any, Optional
from dataclasses import dataclass

# Snowflake libraries
from snowflake.snowpark import Session

# Environment and logging
from dotenv import load_dotenv

# Try to import custom logger with fallback
try:
    from utils.logging import get_logger
    logger = get_logger(__name__)
    print(f"   ‚úÖ Using custom structlog logger")
except ImportError as e:
    import logging
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
    logger = logging.getLogger(__name__)
    print(f"   ‚ö†Ô∏è  Using standard logging (utils.logging not found)")
    print(f"   Error details: {e}")

# Load environment variables from .env file
load_dotenv()

print("\n‚úÖ All libraries imported successfully!")
print(f"   Python version: {sys.version.split()[0]}")

ModuleNotFoundError: No module named 'utils'

In [2]:
@dataclass
class AnalystResponse:
    """
    Container for Cortex Analyst responses.
    
    Attributes:
        query (str): The natural language question
        sql (str): Generated SQL query
        results (List[Dict]): Query results as list of dictionaries
        metadata (Dict): Additional information (row count, execution time)
        error (str): Error message if something went wrong
    """
    query: str
    sql: Optional[str] = None
    results: Optional[List[Dict[str, Any]]] = None
    metadata: Optional[Dict[str, Any]] = None
    error: Optional[str] = None
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary (useful for JSON APIs)"""
        return {
            "query": self.query,
            "sql": self.sql,
            "results": self.results,
            "metadata": self.metadata,
            "error": self.error
        }

# Test it out!
sample_response = AnalystResponse(
    query="What is the average open rate?",
    sql="SELECT AVG(OPEN_RATE) FROM VW_SFMC_EMAIL_PERFORMANCE",
    results=[{"AVG_OPEN_RATE": 22.5}],
    metadata={"row_count": 1}
)

print("‚úÖ AnalystResponse created!")
print(f"   Query: {sample_response.query}")
print(f"   SQL: {sample_response.sql}")
print(f"   Results: {sample_response.results}")

‚úÖ AnalystResponse created!
   Query: What is the average open rate?
   SQL: SELECT AVG(OPEN_RATE) FROM VW_SFMC_EMAIL_PERFORMANCE
   Results: [{'AVG_OPEN_RATE': 22.5}]


In [10]:
# Import the CortexAnalyst service class
try:
    from services.cortex_analyst import CortexAnalyst
    print("‚úÖ CortexAnalyst class imported successfully!")
    print("   Ready to create instances and query Snowflake")
except ImportError as e:
    print(f"‚ùå Failed to import CortexAnalyst: {e}")
    print("\nüí° Troubleshooting:")
    print("   1. Make sure you ran Cell 2 (path setup)")
    print("   2. Check that orchestrator/services/cortex_analyst.py exists")
    print(f"   3. Current sys.path: {sys.path[:3]}")

‚ùå Failed to import CortexAnalyst: No module named 'services'

üí° Troubleshooting:
   1. Make sure you ran Cell 2 (path setup)
   2. Check that orchestrator/services/cortex_analyst.py exists
   3. Current sys.path: ['/app', '/app', '/app']


## üîß Import CortexAnalyst Service

Now let's import the complete `CortexAnalyst` class from the services module.

In [11]:
# Create analyst instance
with CortexAnalyst() as analyst:
    verification = analyst.verify_semantic_model()
    
    if verification['exists']:
        print("‚úÖ Semantic Model Found!")
        print(f"   File: {verification['file_name']}")
        print(f"   Size: {verification['file_size']:,} bytes ({verification['file_size']/1024:.1f} KB)")
        print(f"   Modified: {verification['last_modified']}")
        print(f"   Stage: {verification['stage_path']}")
    else:
        print("‚ùå Semantic Model NOT Found")
        print(f"   Error: {verification.get('error', 'Unknown')}")
        print("\nüí° Make sure you ran: python scripts/deploy_semantic_model.py")

NameError: name 'CortexAnalyst' is not defined

## üéì Summary: What You Learned

Congratulations! You've learned:

‚úÖ **Python OOP Concepts**
- Classes and objects  
- Instance methods and attributes
- Context managers (`with` statement)
- Type hints and dataclasses

‚úÖ **Snowflake Integration**
- Connecting with Snowpark
- Executing SQL queries
- Handling results

‚úÖ **Service Design Patterns**
- Lazy loading (efficient resource usage)
- Error handling and logging
- Structured responses

‚úÖ **Production Best Practices**
- Environment-based configuration
- Comprehensive logging
- Clean code with documentation

---

## üöÄ Next Steps

1. **Practice More**: Try different questions in the exercise cell above  
2. **Build Other Services**: Apply this pattern to `cortex_complete.py`, `cortex_search.py`
3. **Enhance Features**: Add caching, retry logic, rate limiting
4. **Integration**: Use in your FastAPI endpoints
5. **Testing**: Write pytest tests for edge cases

---

## üìö Resources

- [Snowflake Cortex Documentation](https://docs.snowflake.com/en/user-guide/snowflake-cortex)
- [CORTEX_ANALYST_LEARNING_GUIDE.md](../orchestrator/services/CORTEX_ANALYST_LEARNING_GUIDE.md)
- [Python Dataclasses](https://docs.python.org/3/library/dataclasses.html)
- [Structlog](https://www.structlog.org/)

---

**Happy Coding! üéâ**

In [None]:
# üèãÔ∏è Exercise: Write your own question!

with CortexAnalyst() as analyst:
    # TODO: Change this question to something you want to know!
    my_question = "CHANGE THIS TO YOUR QUESTION"
    
    response = analyst.send_message(my_question)
    
    if response.error:
        print(f"‚ùå Error: {response.error}")
    else:
        print(f"‚úÖ Question: {response.query}")
        print(f"\nüìä SQL: {response.sql}")
        print(f"\nüìà Results:")
        for i, row in enumerate(response.results[:10], 1):
            print(f"   {i}. {row}")

## üéØ Practice Exercise: Ask Your Own Question

**Your Turn!** Try asking different questions about your email data.

**Example Questions:**
- "What was the total emails sent last month?"
- "Show me click rate by market"
- "Which campaigns had bounce rate above 5%?"
- "What is the average open rate by business unit?"

**Instructions:**
1. Change the `my_question` variable below
2. Run the cell
3. See if Cortex Analyst can answer it!

In [None]:
with CortexAnalyst() as analyst:
    # Ask a question in natural language
    question = "What is the average open rate?"
    
    print(f"ü§î Asking: '{question}'")
    print("   Processing...")
    
    response = analyst.send_message(question)
    
    if response.error:
        print(f"\n‚ö†Ô∏è  Query Failed (Expected if Cortex Analyst not enabled)")
        print(f"   Error: {response.error}")
        print("\nüí° To enable Cortex Analyst:")
        print("   1. Contact Snowflake support or your account admin")
        print("   2. Request 'Cortex Analyst' feature activation")
    else:
        print("\n‚úÖ Query Successful!")
        print(f"\nüìù Question: {response.query}")
        print(f"\nüîç Generated SQL:")
        print(f"   {response.sql}")
        print(f"\nüìä Results:")
        for i, row in enumerate(response.results[:5], 1):  # First 5 rows
            print(f"   {i}. {row}")
        print(f"\n‚ÑπÔ∏è  Metadata: {response.metadata}")

## üß™ Test 3: Try Cortex Analyst (Natural Language Query)

Now let's try asking a question in natural language!

**Note:** This requires Cortex Analyst to be enabled in your Snowflake account. If not enabled yet, you'll see an error message (that's expected!).

In [None]:
with CortexAnalyst() as analyst:
    try:
        # Simple count query
        sql = "SELECT COUNT(*) AS ROW_COUNT FROM VW_SFMC_EMAIL_PERFORMANCE LIMIT 1"
        results = analyst._execute_sql(sql)
        
        print("‚úÖ SQL Execution Test Passed!")
        print(f"   Rows in VW_SFMC_EMAIL_PERFORMANCE: {results[0]['ROW_COUNT']:,}")
        
        # Get sample data
        sample_sql = "SELECT MARKET, OPEN_RATE, CLICK_RATE FROM VW_SFMC_EMAIL_PERFORMANCE LIMIT 5"
        sample_data = analyst._execute_sql(sample_sql)
        
        print("\nüìä Sample Data:")
        for i, row in enumerate(sample_data, 1):
            print(f"   {i}. Market: {row['MARKET']}, Open Rate: {row['OPEN_RATE']}%, Click Rate: {row['CLICK_RATE']}%")
        
    except Exception as e:
        print(f"‚ùå SQL Execution Failed: {e}")

## üß™ Test 2: Execute Simple SQL Query

Before trying Cortex Analyst, let's test basic SQL execution against your data.

## üß™ Test 1: Verify Semantic Model

Let's verify your semantic model is deployed correctly in Snowflake.

In [None]:
# Import the complete CortexAnalyst class
from services.cortex_analyst import CortexAnalyst

# Create an instance
analyst = CortexAnalyst()

print("‚úÖ CortexAnalyst instance created!")
print(f"   Database: {analyst.database}")
print(f"   Schema: {analyst.schema}")
print(f"   Semantic Model: @{analyst.stage_name}/{analyst.semantic_model_file}")

## 3Ô∏è‚É£ CortexAnalyst Class: Complete Implementation

Now let's import the complete CortexAnalyst class from our service module.

This class includes:
- Snowflake connection management
- Cortex Analyst API calls
- Error handling and logging
- Helper methods

## 2Ô∏è‚É£ Data Models: AnalystResponse

Before we build the service, let's create a data structure to hold responses from Cortex Analyst.

**Why dataclasses?**
- Clean, readable code
- Type hints for better error checking
- Built-in methods like `__repr__`
- Less boilerplate than regular classes

## 1Ô∏è‚É£ Import Required Libraries

First, we need to import all the Python libraries we'll use:
- **snowflake.snowpark**: For connecting to Snowflake
- **dotenv**: For loading environment variables
- **dataclasses**: For creating data structures
- **json**: For parsing JSON responses