# Cortex AI SQL Functions for Understood

This demo showcases how Understood can use Snowflake's built-in AI functions to:
1. **Analyze Sentiment** - Understand how families feel about programs
2. **Summarize Text** - Quickly digest long feedback
3. **Generate Responses** - Draft empathetic replies to parents
4. **Detect PII** - Protect sensitive family information

## Setup

In [None]:
from snowflake.snowpark.context import get_active_session
session = get_active_session()
session.sql("USE DATABASE UNDERSTOOD_DEMO").collect()
session.sql("USE SCHEMA AI_DEMO").collect()

---
## 1. Create Sample Program Feedback Data

In [None]:
session.sql("""
CREATE OR REPLACE TABLE program_feedback (
    feedback_id INT,
    program_name VARCHAR,
    parent_name VARCHAR,
    feedback_text VARCHAR,
    submitted_date DATE
)
""").collect()

session.sql("""
INSERT INTO program_feedback VALUES
(1, 'Reading Support', 'Sarah M.', 'The reading program has been incredible! My son went from struggling to actually enjoying books. The specialists are so patient and understanding.', '2024-01-15'),
(2, 'Math Tutoring', 'James W.', 'Very disappointed. We waited 3 weeks to get started and the tutor canceled twice. My daughter is falling further behind and I''m frustrated with the lack of communication.', '2024-01-18'),
(3, 'Social Skills Group', 'Maria G.', 'Mixed feelings. My child enjoys the sessions but I wish there was more parent involvement. Hard to know what''s happening or how to reinforce at home.', '2024-01-20'),
(4, 'ADHD Coaching', 'Emily B.', 'Life-changing! Coach David helped my daughter develop routines that actually stick. Her teachers have noticed a huge improvement in focus and organization.', '2024-01-22'),
(5, 'IEP Advocacy', 'Michael C.', 'The advocacy support gave us confidence to speak up at school meetings. We finally got the accommodations our son needs. Thank you for empowering us!', '2024-01-25')
""").collect()

print("Feedback data created!")
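
Real feedback often contains apostrophes (for example "I'm"), which end a single-quoted SQL string early if pasted in unescaped. The following is a minimal sketch, assuming values are inlined into the statement text as in the cell above. The helper name `sql_literal` is hypothetical; it doubles single quotes and escapes backslashes, which is the form Snowflake single-quoted string constants expect. For untrusted input, bind parameters (where your Snowpark version supports them) are the safer route.

```python
def sql_literal(text: str) -> str:
    """Return text as a single-quoted SQL string constant (hypothetical helper)."""
    # Escape backslashes, then double every single quote.
    escaped = text.replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


# Example: build one VALUES row from Python values.
row = (2, "Math Tutoring", "I'm frustrated with the lack of communication.")
values_clause = f"({row[0]}, {sql_literal(row[1])}, {sql_literal(row[2])})"
print(values_clause)
```

The printed clause can be appended to an `INSERT INTO ... VALUES` statement without breaking the surrounding quotes.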

In [None]:
session.sql("SELECT * FROM program_feedback").to_pandas()

---
## 2. SNOWFLAKE.CORTEX.SENTIMENT - Sentiment Analysis

**What SENTIMENT Does:**
`SNOWFLAKE.CORTEX.SENTIMENT(text)` analyzes text and returns a numeric score from **-1 (very negative) to +1 (very positive)**. This makes it easy to sort, filter, and set thresholds for automated workflows.

**How We're Using It Here:**
We analyze parent feedback and display the numeric score (e.g., -0.34, +0.85). Scores below -0.3 trigger a "Negative - Follow Up!" flag, helping staff prioritize outreach to struggling families before issues escalate.

In [None]:
sentiment_query = """
WITH scored AS (
    -- Score each row once and reuse the value for both output columns
    SELECT
        program_name,
        parent_name,
        feedback_text,
        SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) AS raw_score
    FROM program_feedback
)
SELECT
    program_name,
    parent_name,
    feedback_text,
    ROUND(raw_score, 3) AS sentiment_score,
    CASE
        WHEN raw_score > 0.3 THEN 'Positive'
        WHEN raw_score < -0.3 THEN 'Negative - Follow Up!'
        ELSE 'Neutral'
    END AS sentiment_flag
FROM scored
ORDER BY sentiment_score
"""
session.sql(sentiment_query).to_pandas()
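
The thresholds in the `CASE` expression above can be mirrored in plain Python, which is handy for checking boundary behavior or re-flagging scores that are already in a DataFrame. This is only a sketch: the function name `sentiment_flag` and its `threshold` parameter are assumptions for illustration, not part of Cortex.

```python
def sentiment_flag(score: float, threshold: float = 0.3) -> str:
    """Map a sentiment score in [-1, 1] to the flag used in the SQL query."""
    if score > threshold:
        return "Positive"
    if score < -threshold:
        return "Negative - Follow Up!"
    # Scores exactly at +/- threshold fall through to Neutral, as in the CASE.
    return "Neutral"


for score in (0.85, -0.34, 0.1):
    print(score, sentiment_flag(score))
```

Keeping the threshold as a parameter makes it easy to try a stricter or looser cutoff before changing the SQL.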

---
## 3. SNOWFLAKE.CORTEX.SUMMARIZE - Condense Long Text

**What SUMMARIZE Does:**
`SNOWFLAKE.CORTEX.SUMMARIZE(text)` uses a Snowflake-managed AI model to generate concise summaries of long text. It handles up to 32,000 tokens (~24,000 words) and automatically extracts key points - no model selection or prompt engineering required.

**How We're Using It Here:**
We summarize detailed parent feedback so program managers can quickly understand the gist without reading every word. This helps staff review hundreds of feedback submissions efficiently and spot trends across programs.

In [None]:
summarize_query = """
SELECT 
    program_name,
    SNOWFLAKE.CORTEX.SUMMARIZE(feedback_text) AS quick_summary
FROM program_feedback
WHERE program_name = 'Math Tutoring'
"""
result = session.sql(summarize_query).to_pandas()
print(f"Program: {result['PROGRAM_NAME'][0]}")
print(f"Summary: {result['QUICK_SUMMARY'][0]}")
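
Before sending very long documents to `SUMMARIZE`, it can help to estimate whether they fit the 32,000-token limit mentioned above. The sketch below assumes the same rough ratio implied by that figure (about 3 words per 4 tokens); real tokenizers vary, so treat the result as a rough screen, not an exact count. The names `estimated_tokens` and `fits_summarize_limit` are hypothetical.

```python
MAX_TOKENS = 32_000  # limit quoted in the text above


def estimated_tokens(text: str) -> int:
    """Estimate tokens from a whitespace word count, assuming 4 tokens per 3 words."""
    words = len(text.split())
    # Integer ceiling of words * 4 / 3.
    return (words * 4 + 2) // 3


def fits_summarize_limit(text: str) -> bool:
    """Return True if the estimate is within the assumed token limit."""
    return estimated_tokens(text) <= MAX_TOKENS


sample = "The reading program has been incredible!"
print(estimated_tokens(sample), fits_summarize_limit(sample))
```

Texts that fail the check could be split into sections and summarized piece by piece.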

---
## 4. AI_COMPLETE - Generate Custom AI Responses

**What AI_COMPLETE Does:**
`AI_COMPLETE(model, prompt)` sends a prompt to a large language model (LLM) and returns the generated text. You choose the model (Claude, Llama, Mistral, etc.) and craft the prompt - giving you full control for any AI task: text generation, Q&A, classification, extraction, or analysis.

**How We're Using It Here:**
We generate empathetic, personalized response drafts for support staff to send to concerned parents. The prompt instructs the AI to act as a "compassionate family support specialist" and write 2-3 sentences acknowledging concerns and offering next steps. This saves staff time while maintaining a warm, human touch.

In [None]:
response_query = """
SELECT 
    parent_name,
    feedback_text,
    AI_COMPLETE(
        'claude-3-5-sonnet',
        'You are a compassionate family support specialist at Understood.org. Write a brief, empathetic response to this parent feedback. Acknowledge their concerns and offer a concrete next step. Keep it to 2-3 sentences.

Feedback: ' || feedback_text
    ) AS suggested_response
FROM program_feedback
WHERE SNOWFLAKE.CORTEX.SENTIMENT(feedback_text) < 0
"""
result = session.sql(response_query).to_pandas()
print(f"Parent: {result['PARENT_NAME'][0]}")
print(f"\nOriginal Feedback: {result['FEEDBACK_TEXT'][0]}")
print(f"\nSuggested Response: {result['SUGGESTED_RESPONSE'][0]}")

---
## 5. AI_COMPLETE for PII Detection - Protect Family Data

**What AI_COMPLETE Does (for PII):**
While Snowflake offers `AI_REDACT(text)` for automatic PII redaction, `AI_COMPLETE` with a custom prompt lets you decide which PII types to look for and how the findings are reported. An LLM can recognize varied formats of SSNs, phone numbers, emails, and addresses that fixed patterns tend to miss. Its output is free text, so review it before it drives compliance decisions.

**How We're Using It Here:**
We scan family case notes to identify sensitive information that needs protection. The AI lists each PII item with its type, then classifies overall risk level (HIGH/MEDIUM/LOW/CLEAN). This helps compliance teams prioritize which records need immediate attention for data governance.

In [None]:
session.sql("""
CREATE OR REPLACE TABLE family_case_notes (
    note_id INT,
    family_name VARCHAR,
    case_notes VARCHAR
)
""").collect()

session.sql("""
INSERT INTO family_case_notes VALUES
(1, 'Johnson Family', 'Spoke with Sarah at 555-123-4567 about Emma''s IEP meeting. She mentioned her email is sarah.johnson@gmail.com for follow-up.'),
(2, 'Chen Family', 'Family enrolled in reading program. Child showing great progress with phonics.'),
(3, 'Garcia Family', 'Medical records indicate ADHD diagnosis. SSN on file is 123-45-6789. Insurance ID: BC12345678.'),
(4, 'Wilson Family', 'Scheduled follow-up for next Tuesday. Family is happy with current support plan.'),
(5, 'Brown Family', 'Mother DOB 03/15/1985. Credit card ending 4532 used for premium subscription. Address: 123 Oak Street, Austin TX 78701')
""").collect()

print("Case notes created!")

In [None]:
session.sql("SELECT * FROM family_case_notes").to_pandas()

### Detect PII in Case Notes
Identifies and categorizes each PII type (phone, email, SSN, etc.) found in unstructured case notes.

In [None]:
pii_query = """
SELECT 
    family_name,
    case_notes,
    AI_COMPLETE(
        'claude-3-5-sonnet',
        'Identify any PII (personally identifiable information) in this text. List each PII item found with its type (phone, email, SSN, address, DOB, etc). If no PII found, say "No PII detected". Be concise.

Text: ' || case_notes
    ) AS pii_detected
FROM family_case_notes
"""
session.sql(pii_query).to_pandas()
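
For comparison, the rigid pattern matching that the LLM approach is meant to improve on might look like the sketch below. The patterns are deliberately narrow, US-centric assumptions, and `find_pii` is a hypothetical helper. It catches the phone number, email, and SSN in the sample notes but misses the date of birth, card digits, and street address, which is the kind of gap a prompt-based check can cover.

```python
import re

# Narrow illustrative patterns; real data would need far broader coverage.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (type, matched_text) pairs for every pattern hit in text."""
    found = []
    for pii_type, pattern in PII_PATTERNS.items():
        found.extend((pii_type, match) for match in pattern.findall(text))
    return found


note = "Spoke with Sarah at 555-123-4567. Her email is sarah.johnson@gmail.com for follow-up."
print(find_pii(note))
```

A regex pass like this can serve as a cheap cross-check on the LLM output, since anything the patterns find should also appear in `pii_detected`.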

### Flag High-Risk Records for Review
Counts PII items and classifies risk level in one call - HIGH RISK (3+ items or SSN/financial), MEDIUM (1-2 items), LOW (names only), or CLEAN.

In [None]:
risk_query = """
SELECT 
    family_name,
    case_notes,
    AI_COMPLETE(
        'claude-3-5-sonnet',
        'Count the PII items in this text and classify risk level.
Return ONLY in this format: "X PII items found - [HIGH RISK/MEDIUM RISK/LOW RISK/CLEAN]"
HIGH RISK = 3+ items or SSN/financial data
MEDIUM RISK = 1-2 items  
LOW RISK = only names
CLEAN = no PII

Text: ' || case_notes
    ) AS risk_assessment
FROM family_case_notes
"""
session.sql(risk_query).to_pandas()
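
Because the prompt asks for a fixed format, the `risk_assessment` text can be parsed into a count and a level for sorting or filtering. The following is a sketch under the assumption that the model mostly follows the requested format; `parse_risk` and `RiskAssessment` are hypothetical names, and the parser returns `None` when the response does not match, so those rows can be routed to manual review.

```python
import re
from typing import NamedTuple, Optional


class RiskAssessment(NamedTuple):
    pii_count: int
    risk_level: str


# Tolerates optional quotes and brackets around the risk level.
_RISK_RE = re.compile(
    r"(\d+)\s+PII\s+items?\s+found\s*-\s*\[?\s*"
    r"(HIGH RISK|MEDIUM RISK|LOW RISK|CLEAN)\s*\]?",
    re.IGNORECASE,
)


def parse_risk(response: str) -> Optional[RiskAssessment]:
    """Extract the PII count and risk level, or None if the format is off."""
    match = _RISK_RE.search(response)
    if match is None:
        return None
    return RiskAssessment(int(match.group(1)), match.group(2).upper())


print(parse_risk('"3 PII items found - [HIGH RISK]"'))
print(parse_risk("I could not analyze this text."))
```

Applied to the DataFrame column (for example with `Series.apply`), this gives structured values that compliance staff can sort by risk level.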

---
## Summary

| Function | Use Case for Understood |
|----------|------------------------|
| `SENTIMENT()` | Identify unhappy families for proactive outreach |
| `SUMMARIZE()` | Quickly digest long feedback for program reviews |
| `AI_COMPLETE()` | Draft empathetic responses to parents |
| `AI_COMPLETE()` (PII) | Detect and flag sensitive family data for compliance |