# 📊 Snowflake ML Demo: Analytics Tables Setup

This notebook creates the analytics infrastructure and ML model tracking tables needed for our machine learning pipeline.

## 🎯 What We're Creating

### Analytics Tables (DEMO_ANALYTICS schema)
- **PATIENT_RISK_SCORES**: Store patient risk assessments
- **DRUG_INTERACTION_ALERTS**: Track potential drug interactions
- **AE_PREDICTIONS**: Log model predictions and results
- **POPULATION_VS_FAERS_COMPARISON**: Compare local vs national adverse event rates

### ML Model Tables (ML_MODELS schema)
- **MODEL_REGISTRY**: Track model versions and performance
- **FEATURE_IMPORTANCE**: Store feature importance scores
- **MODEL_PREDICTIONS_LOG**: Detailed prediction logging

## 📋 Purpose
These tables provide the infrastructure for model deployment, monitoring, and business analytics.


In [1]:
# 🔗 Establish Snowflake Connection
print("🔗 Connecting to Snowflake...")

# Import required libraries
import sys
import os

# Fix path for snowflake_connection module
current_dir = os.getcwd()
if os.path.basename(current_dir) == "notebooks":
    # Running from notebooks folder
    src_path = os.path.join(current_dir, "..", "src")
else:
    # Running from root folder
    src_path = os.path.join(current_dir, "src")

sys.path.append(src_path)
print(f"📁 Added to Python path: {src_path}")

# Now import the connection module
from snowflake_connection import get_session

# Create Snowpark session
session = get_session()

if session:
    print("✅ Connected to Snowflake successfully!")
    
    # Set context for analytics tables setup
    print("📊 Setting context for analytics tables...")
    session.sql("USE DATABASE ADVERSE_EVENT_MONITORING").collect()
    session.sql("USE WAREHOUSE ADVERSE_EVENT_WH").collect()
    print("✅ Context set successfully!")
else:
    print("❌ Failed to connect to Snowflake!")
    print("   Please check your .env file configuration")
    raise ConnectionError("Snowflake connection failed")


🔗 Connecting to Snowflake...
📁 Added to Python path: /Users/beddy/Desktop/Github/Snowflake_ML_HCLS/notebooks/../src
🚀 Initializing Snowflake ML Platform connection...
✅ Snowflake connection established successfully!
📍 Connected to: SFSENORTHAMERICA-SE-HCLS-EXPANSION-EAST
👤 User: BEDDY
🏢 Database: ADVERSE_EVENT_MONITORING
📊 Schema: DEMO_ANALYTICS
⚡ Warehouse: ADVERSE_EVENT_WH
🧪 Connection test passed!
   Snowflake Version: 9.21.0
✅ Demo environment already exists
🎉 Ready for ML Platform operations!
✅ Connected to Snowflake successfully!
📊 Setting context for analytics tables...
✅ Context set successfully!
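
The connection cell decides where `src` lives from the current working directory. The same lookup can be sketched as a small pure function, which makes it easy to check without a Snowflake session. This assumes the repository keeps `notebooks/` and `src/` side by side; `resolve_src_path` is a hypothetical helper name, not part of the project.

```python
from pathlib import Path


def resolve_src_path(cwd: Path) -> Path:
    """Return the src directory for a run from the repo root or from notebooks/."""
    # Look only at the final path component, so a parent folder whose name
    # merely contains "notebooks" does not select the wrong branch.
    repo_root = cwd.parent if cwd.name == "notebooks" else cwd
    return repo_root / "src"


# Both launch locations resolve to the same directory.
print(resolve_src_path(Path("/repo/notebooks")).as_posix())
print(resolve_src_path(Path("/repo")).as_posix())
```

Resolving against the parent also avoids the `..` segment that appears in the printed path above.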


In [2]:
# 📊 Create analytics tables in DEMO_ANALYTICS schema
print("📊 Creating analytics tables in DEMO_ANALYTICS schema...")

# Switch to DEMO_ANALYTICS schema
session.sql("USE SCHEMA DEMO_ANALYTICS").collect()
print("✅ Switched to DEMO_ANALYTICS schema")


📊 Creating analytics tables in DEMO_ANALYTICS schema...
✅ Switched to DEMO_ANALYTICS schema


In [3]:
# 📊 Create PATIENT_RISK_SCORES table
print("📊 Creating PATIENT_RISK_SCORES table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS PATIENT_RISK_SCORES (
        patient_id VARCHAR(50),
        risk_score FLOAT,
        risk_category VARCHAR(20),
        primary_risk_factors ARRAY,
        medication_count INTEGER,
        condition_count INTEGER,
        age_group VARCHAR(20),
        calculation_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    )
""").collect()

print("✅ PATIENT_RISK_SCORES table created successfully!")


📊 Creating PATIENT_RISK_SCORES table...
✅ PATIENT_RISK_SCORES table created successfully!


In [4]:
# 📊 Create DRUG_INTERACTION_ALERTS table
print("📊 Creating DRUG_INTERACTION_ALERTS table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS DRUG_INTERACTION_ALERTS (
        alert_id VARCHAR(50),
        patient_id VARCHAR(50),
        drug1 VARCHAR(200),
        drug2 VARCHAR(200),
        interaction_type VARCHAR(100),
        severity_level VARCHAR(20),
        clinical_significance VARCHAR(500),
        recommendation VARCHAR(1000),
        alert_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    )
""").collect()

print("✅ DRUG_INTERACTION_ALERTS table created successfully!")

# 🎯 Create AE_PREDICTIONS table
print("📊 Creating AE_PREDICTIONS table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS AE_PREDICTIONS (
        prediction_id VARCHAR(50),
        patient_id VARCHAR(50),
        predicted_ae VARCHAR(200),
        probability FLOAT,
        confidence_interval VARCHAR(50),
        model_version VARCHAR(20),
        prediction_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    )
""").collect()

print("✅ AE_PREDICTIONS table created successfully!")


📊 Creating DRUG_INTERACTION_ALERTS table...
✅ DRUG_INTERACTION_ALERTS table created successfully!
📊 Creating AE_PREDICTIONS table...
✅ AE_PREDICTIONS table created successfully!
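
To illustrate what a row destined for `AE_PREDICTIONS` might look like, here is a minimal sketch that assembles one record in plain Python. The helper name `build_ae_prediction`, the use of a UUID for `prediction_id`, and the range check on `probability` are all assumptions made for illustration; the notebook does not say how rows are produced. `confidence_interval` is left out because its string format is not specified, and `prediction_date` is omitted so the column default applies.

```python
import uuid


def build_ae_prediction(patient_id, predicted_ae, probability, model_version):
    """Assemble one AE_PREDICTIONS row as a dict keyed by column name."""
    # Assumption: probability is a value between 0 and 1 inclusive.
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    return {
        "PREDICTION_ID": str(uuid.uuid4()),  # 36 characters, fits VARCHAR(50)
        "PATIENT_ID": patient_id,
        "PREDICTED_AE": predicted_ae,
        "PROBABILITY": float(probability),
        "MODEL_VERSION": model_version,
    }


# Illustrative values only; "NAUSEA" and "v1.0" are made up.
row = build_ae_prediction("PAT_0000001", "NAUSEA", 0.42, "v1.0")
print(sorted(row))
```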


In [5]:
# 📊 Create POPULATION_VS_FAERS_COMPARISON table
print("📊 Creating POPULATION_VS_FAERS_COMPARISON table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS POPULATION_VS_FAERS_COMPARISON (
        medication VARCHAR(200),
        adverse_event VARCHAR(200),
        local_population_rate FLOAT,
        faers_national_rate FLOAT,
        rate_ratio FLOAT,
        statistical_significance FLOAT,
        sample_size INTEGER,
        analysis_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    )
""").collect()

print("✅ POPULATION_VS_FAERS_COMPARISON table created successfully!")


📊 Creating POPULATION_VS_FAERS_COMPARISON table...
✅ POPULATION_VS_FAERS_COMPARISON table created successfully!
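
The table stores a local rate, a national FAERS rate, and a `rate_ratio`. The notebook does not define the ratio, so the sketch below assumes the natural reading: local rate divided by national rate, with no value when the national rate is zero. The function name `rate_ratio` and the sample numbers are illustrative only.

```python
from typing import Optional


def rate_ratio(local_rate: float, faers_rate: float) -> Optional[float]:
    """Local adverse event rate relative to the national FAERS rate."""
    if faers_rate == 0:
        # No meaningful ratio; this would be stored as NULL in the table.
        return None
    return local_rate / faers_rate


# A ratio above 1 means the event is reported more often locally than nationally.
print(rate_ratio(0.03, 0.02))
print(rate_ratio(0.03, 0.0))
```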


In [6]:
# 🤖 Create ML model tracking tables in ML_MODELS schema
print("🤖 Creating ML model tracking tables in ML_MODELS schema...")

# Switch to ML_MODELS schema
session.sql("USE SCHEMA ML_MODELS").collect()
print("✅ Switched to ML_MODELS schema")

# Create MODEL_REGISTRY table
print("📊 Creating MODEL_REGISTRY table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS MODEL_REGISTRY (
        model_id VARCHAR(50),
        model_name VARCHAR(100),
        model_type VARCHAR(50),
        model_version VARCHAR(20),
        training_date TIMESTAMP,
        accuracy_score FLOAT,
        precision_score FLOAT,
        recall_score FLOAT,
        f1_score FLOAT,
        model_status VARCHAR(20),
        created_by VARCHAR(100)
    )
""").collect()

print("✅ MODEL_REGISTRY table created successfully!")


🤖 Creating ML model tracking tables in ML_MODELS schema...
✅ Switched to ML_MODELS schema
📊 Creating MODEL_REGISTRY table...
✅ MODEL_REGISTRY table created successfully!
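
`MODEL_REGISTRY` keeps precision, recall, and F1 as separate columns. F1 is the harmonic mean of precision and recall, so it can be derived rather than entered by hand. The sketch below shows that relationship; the helper names `f1_from` and `registry_metrics` are hypothetical, and the metric values are made up.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def registry_metrics(accuracy: float, precision: float, recall: float) -> dict:
    """Metric columns for one MODEL_REGISTRY row, with F1 derived."""
    return {
        "ACCURACY_SCORE": accuracy,
        "PRECISION_SCORE": precision,
        "RECALL_SCORE": recall,
        "F1_SCORE": f1_from(precision, recall),
    }


print(registry_metrics(accuracy=0.9, precision=0.8, recall=0.6))
```

Deriving F1 in one place keeps the three columns consistent with each other.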


In [7]:
# 📊 Create FEATURE_IMPORTANCE table
print("📊 Creating FEATURE_IMPORTANCE table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS FEATURE_IMPORTANCE (
        model_id VARCHAR(50),
        feature_name VARCHAR(100),
        importance_score FLOAT,
        feature_type VARCHAR(50),
        created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    )
""").collect()

print("✅ FEATURE_IMPORTANCE table created successfully!")

# 🔍 Create MODEL_PREDICTIONS_LOG table
print("📊 Creating MODEL_PREDICTIONS_LOG table...")

session.sql("""
    CREATE TABLE IF NOT EXISTS MODEL_PREDICTIONS_LOG (
        prediction_id VARCHAR(50),
        model_id VARCHAR(50),
        patient_id VARCHAR(50),
        input_features OBJECT,
        prediction_result OBJECT,
        confidence_score FLOAT,
        prediction_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    )
""").collect()

print("✅ MODEL_PREDICTIONS_LOG table created successfully!")


📊 Creating FEATURE_IMPORTANCE table...
✅ FEATURE_IMPORTANCE table created successfully!
📊 Creating MODEL_PREDICTIONS_LOG table...
✅ MODEL_PREDICTIONS_LOG table created successfully!
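
`input_features` and `prediction_result` are `OBJECT` columns, which hold key-value structures rather than flat scalars. One plausible way to prepare such a value on the Python side is to serialize a dict to a JSON string first. The sketch below covers only that serialization step; the helper name `to_json_payload` and the feature names are assumptions, and how the string is then loaded into Snowflake (for example with `PARSE_JSON`) is deliberately left out.

```python
import json


def to_json_payload(payload: dict) -> str:
    """Serialize a feature or result dict to a deterministic JSON string."""
    if not isinstance(payload, dict):
        raise TypeError("OBJECT columns hold key-value pairs, so a dict is expected")
    # sort_keys makes the output stable, which helps when comparing logged rows.
    return json.dumps(payload, sort_keys=True)


# Hypothetical feature names, loosely based on columns used in this notebook.
features = {"num_medications": 25, "age": 65}
print(to_json_payload(features))
```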


In [8]:
# ✅ Verify analytics tables were created successfully
print("✅ Verifying DEMO_ANALYTICS tables were created...")

# Switch to DEMO_ANALYTICS schema and show tables
session.sql("USE SCHEMA DEMO_ANALYTICS").collect()
analytics_tables = session.sql("SHOW TABLES").collect()

print("\n📊 DEMO_ANALYTICS Tables:")
for table in analytics_tables:
    print(f"   🔸 {table[1]}")  # table[1] contains the table name

# Note: SHOW TABLES lists every table in the schema, so this count also
# includes tables that were created outside this notebook.
print(f"\n✅ {len(analytics_tables)} analytics tables created successfully!")


✅ Verifying DEMO_ANALYTICS tables were created...

📊 DEMO_ANALYTICS Tables:
   🔸 AE_PREDICTIONS
   🔸 DRUG_INTERACTION_ALERTS
   🔸 ENHANCED_HEALTHCARE_DATA
   🔸 FAERS_HCLS_INTEGRATED_DATASET
   🔸 FEATURE_METADATA
   🔸 HEALTHCARE_FEATURES
   🔸 ML_TRAINING_FAERS_HCLS
   🔸 MODEL_ALERTS
   🔸 MODEL_MONITORING
   🔸 PATIENT_RISK_SCORES
   🔸 POPULATION_VS_FAERS_COMPARISON
   🔸 PREPARED_HEALTHCARE_ANALYTICS
   🔸 PREPARED_HEALTHCARE_DATA
   🔸 REGRESSION_HEALTHCARE_DATA

✅ 14 analytics tables created successfully!


In [9]:
# ✅ Verify ML model tables were created successfully
print("✅ Verifying ML_MODELS tables were created...")

# Switch to ML_MODELS schema and show tables
session.sql("USE SCHEMA ML_MODELS").collect()
ml_tables = session.sql("SHOW TABLES").collect()

print("\n🤖 ML_MODELS Tables:")
for table in ml_tables:
    print(f"   🔸 {table[1]}")  # table[1] contains the table name

print(f"\n✅ {len(ml_tables)} ML model tables created successfully!")
print("\n🎉 All analytics infrastructure is ready!")


✅ Verifying ML_MODELS tables were created...

🤖 ML_MODELS Tables:
   🔸 FEATURE_IMPORTANCE
   🔸 MODEL_PREDICTIONS_LOG
   🔸 MODEL_REGISTRY

✅ 3 ML model tables created successfully!

🎉 All analytics infrastructure is ready!
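
The two verification cells list whatever tables exist in each schema, which confirms the schemas are populated but does not check for specific names. A stricter check compares the listing against the tables this notebook is meant to create. The sketch below does that in plain Python; `EXPECTED_TABLES` and `missing_tables` are hypothetical names, and the expected sets are taken from the overview at the top of this notebook.

```python
EXPECTED_TABLES = {
    "DEMO_ANALYTICS": {
        "PATIENT_RISK_SCORES",
        "DRUG_INTERACTION_ALERTS",
        "AE_PREDICTIONS",
        "POPULATION_VS_FAERS_COMPARISON",
    },
    "ML_MODELS": {
        "MODEL_REGISTRY",
        "FEATURE_IMPORTANCE",
        "MODEL_PREDICTIONS_LOG",
    },
}


def missing_tables(schema: str, found_names) -> list:
    """Return the expected tables that are absent from a schema listing."""
    found = {name.upper() for name in found_names}
    return sorted(EXPECTED_TABLES[schema] - found)


# Names copied from the ML_MODELS listing above; an empty list means all present.
print(missing_tables("ML_MODELS", ["FEATURE_IMPORTANCE", "MODEL_PREDICTIONS_LOG", "MODEL_REGISTRY"]))
```

In the notebook, the names could be taken from the existing results, for instance `[row[1] for row in ml_tables]`, following the same indexing the verification cells use.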


In [10]:
# 📊 Create Healthcare Analytics Data (Required for Notebook 4)
print("📊 Creating healthcare analytics data for feature engineering...")

# Generate sample healthcare data for ML training
session.sql("""
CREATE OR REPLACE TABLE ADVERSE_EVENT_MONITORING.DEMO_ANALYTICS.PREPARED_HEALTHCARE_ANALYTICS AS
WITH patient_base AS (
    SELECT 
        CONCAT('PAT_', LPAD(SEQ4(), 7, '0')) as PATIENT_ID,
        FLOOR(UNIFORM(18, 90, RANDOM())) as AGE,
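        -- Float bounds give a continuous draw; with integer bounds UNIFORM returns only 0 or 1,
        -- which would make the split 50/50 instead of 60/40
        CASE WHEN UNIFORM(0::FLOAT, 1::FLOAT, RANDOM()) < 0.6 THEN 'F' ELSE 'M' END as GENDER,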
        FLOOR(UNIFORM(1, 15, RANDOM())) as NUM_CONDITIONS,
        FLOOR(UNIFORM(0, 25, RANDOM())) as NUM_MEDICATIONS,
        FLOOR(UNIFORM(1, 200, RANDOM())) as NUM_CLAIMS
    FROM TABLE(GENERATOR(ROWCOUNT => 50000))  -- Generate 50K sample patients
)
SELECT 
    PATIENT_ID,
    AGE,
    GENDER,
    NUM_CONDITIONS,
    NUM_MEDICATIONS, 
    NUM_CLAIMS,
    -- Add some derived healthcare metrics
    CASE 
        WHEN AGE >= 65 AND NUM_CONDITIONS >= 5 THEN 'HIGH_RISK'
        WHEN AGE >= 50 AND NUM_CONDITIONS >= 3 THEN 'MEDIUM_RISK'  
        ELSE 'LOW_RISK'
    END as RISK_CATEGORY,
    CASE WHEN NUM_MEDICATIONS >= 10 THEN 1 ELSE 0 END as POLYPHARMACY_FLAG,
    CURRENT_TIMESTAMP() as CREATED_DATE
FROM patient_base
""").collect()

# Verify the data was created
verification = session.sql("SELECT COUNT(*) as row_count FROM ADVERSE_EVENT_MONITORING.DEMO_ANALYTICS.PREPARED_HEALTHCARE_ANALYTICS").collect()
row_count = verification[0]['ROW_COUNT']

print(f"✅ Created PREPARED_HEALTHCARE_ANALYTICS table with {row_count:,} patient records")
print("📊 This table is now ready for feature engineering in notebook 4")

# Show sample of the data
print("\n🔍 Sample healthcare analytics data:")
sample_data = session.sql("""
    SELECT PATIENT_ID, AGE, GENDER, NUM_CONDITIONS, NUM_MEDICATIONS, NUM_CLAIMS, RISK_CATEGORY
    FROM ADVERSE_EVENT_MONITORING.DEMO_ANALYTICS.PREPARED_HEALTHCARE_ANALYTICS
    LIMIT 5
""").to_pandas()
print(sample_data.to_string(index=False))


📊 Creating healthcare analytics data for feature engineering...
✅ Created PREPARED_HEALTHCARE_ANALYTICS table with 50,000 patient records
📊 This table is now ready for feature engineering in notebook 4

🔍 Sample healthcare analytics data:
 PATIENT_ID  AGE GENDER  NUM_CONDITIONS  NUM_MEDICATIONS  NUM_CLAIMS RISK_CATEGORY
PAT_0000000   65      M               2               25         136      LOW_RISK
PAT_0000001   57      F              12               18          69   MEDIUM_RISK
PAT_0000002   50      F               6               22         173   MEDIUM_RISK
PAT_0000003   36      M              14               11         140      LOW_RISK
PAT_0000004   25      M               2               15         117      LOW_RISK
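
The `RISK_CATEGORY` and `POLYPHARMACY_FLAG` columns come from the two `CASE` expressions in the query above. The same rules can be written as plain Python functions, which is handy for unit-testing the thresholds away from Snowflake. The function names below are hypothetical; the thresholds are copied from the SQL, and the sample tuples are the five rows shown in the output.

```python
def risk_category(age: int, num_conditions: int) -> str:
    """Mirror of the RISK_CATEGORY CASE expression."""
    if age >= 65 and num_conditions >= 5:
        return "HIGH_RISK"
    if age >= 50 and num_conditions >= 3:
        return "MEDIUM_RISK"
    return "LOW_RISK"


def polypharmacy_flag(num_medications: int) -> int:
    """Mirror of the POLYPHARMACY_FLAG CASE expression."""
    return 1 if num_medications >= 10 else 0


# (AGE, NUM_CONDITIONS, NUM_MEDICATIONS) from the sample output above.
sample_rows = [(65, 2, 25), (57, 12, 18), (50, 6, 22), (36, 14, 11), (25, 2, 15)]
for age, conditions, medications in sample_rows:
    print(age, conditions, risk_category(age, conditions), polypharmacy_flag(medications))
```

Note that the first sample patient is 65 with only two conditions, so neither the high-risk nor the medium-risk rule applies, which matches the `LOW_RISK` value in the table.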


## ✅ Analytics Infrastructure Complete!

Your ML analytics infrastructure is now ready:

### 📊 Analytics Tables (DEMO_ANALYTICS)
- ✅ **PATIENT_RISK_SCORES**: Ready for risk assessments
- ✅ **DRUG_INTERACTION_ALERTS**: Drug interaction monitoring
- ✅ **AE_PREDICTIONS**: Model prediction storage
- ✅ **POPULATION_VS_FAERS_COMPARISON**: Population analysis
- ✅ **PREPARED_HEALTHCARE_ANALYTICS**: 50K synthetic patient records for ML training

### 🤖 ML Model Tables (ML_MODELS)
- ✅ **MODEL_REGISTRY**: Model version tracking
- ✅ **FEATURE_IMPORTANCE**: Feature analysis storage
- ✅ **MODEL_PREDICTIONS_LOG**: Detailed prediction logging

## 🎯 Ready for Machine Learning
The foundation is now complete! These tables will support:
- Model training and evaluation with the synthetic healthcare data generated in this notebook
- Real-time predictions and logging
- Performance monitoring and drift detection
- Business analytics and reporting

## 📋 Next Steps
1. ✅ **Run Notebook 4**: Feature engineering with Snowflake ML preprocessing
2. ✅ **Run Notebook 5**: Complete ML workflow (Feature Store + distributed training)
3. ✅ **Run Notebook 7**: Native ML observability and monitoring

---
*Together, the analytics infrastructure and the synthetic healthcare data provide the foundation for enterprise ML operations.*
