# FLUX Ops Center - Demo Setup Notebook

This notebook walks you through setting up the FLUX Ops Center demo in your own Snowflake account.

## Overview

The FLUX Ops Center is a comprehensive utility operations demo featuring:
- **596K+ meters** with realistic infrastructure topology (this notebook loads a 10,000-meter sample)
- **Real-time AMI data** with 15-minute interval readings
- **Work orders, outages, and power quality events**
- **Cortex Analyst** semantic model for natural language queries
- **Cortex Agent** for intelligent operations assistance

## Prerequisites

1. Snowflake account with ACCOUNTADMIN access
2. Python 3.8+ with required packages
3. Clone of the flux-ops-center-spcs repository
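
Before installing anything, it can help to confirm the interpreter version and see which packages are already importable. The following is a minimal sketch, assuming the import names `snowflake.connector`, `pandas`, and `pyarrow`; the helper name `check_prerequisites` is illustrative and not part of the repository.

```python
import importlib.util
import sys


def check_prerequisites(modules=("snowflake.connector", "pandas", "pyarrow"),
                        min_version=(3, 8)):
    """Report whether the Python version and each module look usable."""
    report = {"python_ok": sys.version_info[:2] >= min_version}
    for name in modules:
        try:
            # find_spec raises if a parent package (e.g. "snowflake") is absent
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        report[name] = found
    return report


print(check_prerequisites())
```

Any entry reported as `False` points at a package that the install cell still needs to provide.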

In [None]:
# Install required packages (the [pandas] extra provides write_pandas support)
%pip install "snowflake-connector-python[pandas]" pandas pyarrow snowflake-snowpark-python --quiet

In [None]:
import os
import pandas as pd
from pathlib import Path
from datetime import datetime

import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

print(f"Setup started at: {datetime.now()}")

## Step 1: Configure Snowflake Connection

Update these variables with your Snowflake account details.
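
If you would rather not type credentials into the notebook, one option is to read them from environment variables. The sketch below is an assumption, not part of the repository: the variable names (`SNOWFLAKE_ACCOUNT` and so on) and the helper `connection_params_from_env` are hypothetical, and the defaults mirror the values used in the configuration cell.

```python
import os

# Hypothetical environment variable names; adjust to your own conventions.
REQUIRED_VARS = {
    "account": "SNOWFLAKE_ACCOUNT",
    "user": "SNOWFLAKE_USER",
    "password": "SNOWFLAKE_PASSWORD",
}
OPTIONAL_VARS = {
    "role": ("SNOWFLAKE_ROLE", "ACCOUNTADMIN"),
    "warehouse": ("SNOWFLAKE_WAREHOUSE", "COMPUTE_WH"),
}


def connection_params_from_env(env=None):
    """Build keyword arguments for snowflake.connector.connect()."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    params = {key: env[var] for key, var in REQUIRED_VARS.items()}
    for key, (var, default) in OPTIONAL_VARS.items():
        params[key] = env.get(var, default)
    return params


# Example with an in-memory mapping instead of the real environment
demo_env = {
    "SNOWFLAKE_ACCOUNT": "xy12345.us-east-1",
    "SNOWFLAKE_USER": "demo_user",
    "SNOWFLAKE_PASSWORD": "not-a-real-password",
}
print(sorted(connection_params_from_env(demo_env)))
```

The returned dictionary can be unpacked into the connection call, for example `snowflake.connector.connect(**connection_params_from_env())`.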

In [None]:
# CONFIGURATION - Update these values
SNOWFLAKE_ACCOUNT = "your_account"  # e.g., "xy12345.us-east-1"
SNOWFLAKE_USER = "your_username"
SNOWFLAKE_PASSWORD = "your_password"  # Or use authenticator='externalbrowser'
SNOWFLAKE_ROLE = "ACCOUNTADMIN"
SNOWFLAKE_WAREHOUSE = "COMPUTE_WH"

# Target database/schema for the demo
TARGET_DATABASE = "FLUX_DEMO"
TARGET_SCHEMA = "PRODUCTION"

# Path to seed data (relative to this notebook)
SEED_DATA_DIR = Path(".")  # Assumes notebook is in scripts/seed_data/

In [None]:
# Alternative: Use named connection from ~/.snowflake/connections.toml
# Uncomment the following to use a named connection instead:

# CONNECTION_NAME = "my_connection"
# conn = snowflake.connector.connect(connection_name=CONNECTION_NAME)

# Standard connection
conn = snowflake.connector.connect(
    account=SNOWFLAKE_ACCOUNT,
    user=SNOWFLAKE_USER,
    password=SNOWFLAKE_PASSWORD,
    role=SNOWFLAKE_ROLE,
    warehouse=SNOWFLAKE_WAREHOUSE,
)

cursor = conn.cursor()
print("✅ Connected to Snowflake!")

## Step 2: Create Database and Schema
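
The cell in this step interpolates `TARGET_DATABASE` and `TARGET_SCHEMA` directly into SQL strings. If you change those values, a quick check that they are plain unquoted identifiers avoids surprising errors. This is a sketch only: `validate_identifier` is a hypothetical helper, and the pattern covers unquoted identifiers (letters, digits, underscores, and dollar signs, not starting with a digit or dollar sign).

```python
import re

# Unquoted identifier: starts with a letter or underscore,
# followed by letters, digits, underscores, or dollar signs.
_UNQUOTED_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def validate_identifier(name):
    """Return name unchanged if it is a plain unquoted identifier."""
    if not _UNQUOTED_IDENTIFIER.fullmatch(name):
        raise ValueError(f"Not a plain unquoted identifier: {name!r}")
    return name


for candidate in ("FLUX_DEMO", "PRODUCTION", "bad name; DROP"):
    try:
        print(validate_identifier(candidate), "is valid")
    except ValueError as err:
        print(err)
```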

In [None]:
# Create database and schema
cursor.execute(f"CREATE DATABASE IF NOT EXISTS {TARGET_DATABASE}")
cursor.execute(f"USE DATABASE {TARGET_DATABASE}")
cursor.execute(f"CREATE SCHEMA IF NOT EXISTS {TARGET_SCHEMA}")
cursor.execute(f"USE SCHEMA {TARGET_SCHEMA}")

print(f"✅ Created {TARGET_DATABASE}.{TARGET_SCHEMA}")

## Step 3: Create Tables

Create all required tables for the demo.
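
Because the seed files are loaded by column name, it can be useful to list the columns each DDL statement declares and compare them against a parquet file's columns. The sketch below assumes simple `CREATE TABLE` statements like the ones in this notebook (no constraints or quoted names); `ddl_column_names` is a hypothetical helper.

```python
def ddl_column_names(ddl):
    """Extract column names from a simple CREATE TABLE statement."""
    body = ddl[ddl.index("(") + 1:ddl.rindex(")")]
    columns, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        # Split only on commas outside parentheses, so NUMBER(23,2) stays whole
        if ch == "," and depth == 0:
            columns.append("".join(current))
            current = []
        else:
            current.append(ch)
    columns.append("".join(current))
    return [col.split()[0].upper() for col in columns if col.strip()]


sample_ddl = """
    CREATE TABLE IF NOT EXISTS AMI_INTERVAL_READINGS (
        METER_ID VARCHAR, TIMESTAMP TIMESTAMP_NTZ, USAGE_KWH FLOAT,
        VOLTAGE NUMBER, POWER_FACTOR NUMBER(23,2),
        CUSTOMER_SEGMENT_ID VARCHAR, SOURCE_TABLE VARCHAR
    )
"""
print(ddl_column_names(sample_ddl))
```

Comparing this list with `set(df.columns)` before a load shows missing or unexpected columns early.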

In [None]:
# Reference Tables
tables_ddl = {
    'SUBSTATIONS': """
        CREATE TABLE IF NOT EXISTS SUBSTATIONS (
            SUBSTATION_ID VARCHAR, SUBSTATION_NAME VARCHAR,
            LATITUDE FLOAT, LONGITUDE FLOAT, CAPACITY_MW FLOAT,
            VOLTAGE_LEVEL_KV FLOAT, INSTALL_DATE DATE, STATUS VARCHAR, REGION VARCHAR
        )
    """,
    'CIRCUIT_METADATA': """
        CREATE TABLE IF NOT EXISTS CIRCUIT_METADATA (
            CIRCUIT_ID VARCHAR, CIRCUIT_NAME VARCHAR, SUBSTATION_ID VARCHAR,
            VOLTAGE_CLASS VARCHAR, CIRCUIT_TYPE VARCHAR, TOTAL_CUSTOMERS NUMBER,
            TOTAL_TRANSFORMERS NUMBER, LINE_MILES FLOAT, INSTALL_DATE DATE,
            LATITUDE FLOAT, LONGITUDE FLOAT, STATUS VARCHAR
        )
    """,
    'TRANSFORMER_METADATA': """
        CREATE TABLE IF NOT EXISTS TRANSFORMER_METADATA (
            TRANSFORMER_ID VARCHAR, TRANSFORMER_NAME VARCHAR, CIRCUIT_ID VARCHAR,
            SUBSTATION_ID VARCHAR, KVA_RATING FLOAT, VOLTAGE_PRIMARY FLOAT,
            VOLTAGE_SECONDARY FLOAT, PHASE_CONFIG VARCHAR, INSTALL_DATE DATE,
            MANUFACTURER VARCHAR, LATITUDE FLOAT, LONGITUDE FLOAT, STATUS VARCHAR,
            LOAD_FACTOR FLOAT, LAST_MAINTENANCE_DATE DATE
        )
    """,
    'GRID_POLES_INFRASTRUCTURE': """
        CREATE TABLE IF NOT EXISTS GRID_POLES_INFRASTRUCTURE (
            POLE_ID VARCHAR, POLE_TYPE VARCHAR, MATERIAL VARCHAR, HEIGHT_FT NUMBER,
            INSTALL_DATE DATE, CIRCUIT_ID VARCHAR, LATITUDE FLOAT, LONGITUDE FLOAT,
            CONDITION_STATUS VARCHAR, LAST_INSPECTION_DATE DATE
        )
    """,
    'HOUSTON_WEATHER_HOURLY': """
        CREATE TABLE IF NOT EXISTS HOUSTON_WEATHER_HOURLY (
            TIMESTAMP TIMESTAMP_NTZ, TEMPERATURE_F FLOAT, HUMIDITY_PCT FLOAT,
            WIND_SPEED_MPH FLOAT, PRECIPITATION_IN FLOAT, WEATHER_CONDITION VARCHAR,
            HEAT_INDEX FLOAT, WIND_CHILL FLOAT
        )
    """,
    'ERCOT_LMP_HOUSTON_ZONE': """
        CREATE TABLE IF NOT EXISTS ERCOT_LMP_HOUSTON_ZONE (
            TIMESTAMP TIMESTAMP_NTZ, LMP_PRICE FLOAT, ENERGY_PRICE FLOAT,
            CONGESTION_PRICE FLOAT, LOSS_PRICE FLOAT, ZONE VARCHAR
        )
    """,
    'POWER_QUALITY_READINGS': """
        CREATE TABLE IF NOT EXISTS POWER_QUALITY_READINGS (
            READING_ID VARCHAR, METER_ID VARCHAR, TIMESTAMP TIMESTAMP_NTZ,
            VOLTAGE FLOAT, FREQUENCY FLOAT, THD_VOLTAGE FLOAT, THD_CURRENT FLOAT,
            POWER_FACTOR FLOAT, SAG_EVENT BOOLEAN, SWELL_EVENT BOOLEAN
        )
    """,
    'SAP_WORK_ORDERS': """
        CREATE TABLE IF NOT EXISTS SAP_WORK_ORDERS (
            WORK_ORDER_ID VARCHAR, WORK_ORDER_TYPE VARCHAR, PRIORITY VARCHAR,
            STATUS VARCHAR, CUSTOMER_ID VARCHAR, DESCRIPTION VARCHAR,
            CREATED_DATE TIMESTAMP_NTZ, SCHEDULED_DATE TIMESTAMP_NTZ,
            COMPLETED_DATE TIMESTAMP_NTZ, CREW_ID VARCHAR,
            ESTIMATED_DURATION_HOURS FLOAT, ACTUAL_DURATION_HOURS FLOAT,
            LABOR_COST FLOAT, PARTS_COST FLOAT
        )
    """,
    'OUTAGE_EVENTS': """
        CREATE TABLE IF NOT EXISTS OUTAGE_EVENTS (
            OUTAGE_ID VARCHAR, TRANSFORMER_ID VARCHAR, CIRCUIT_ID VARCHAR,
            OUTAGE_START_TIME TIMESTAMP_NTZ, OUTAGE_END_TIME TIMESTAMP_NTZ,
            OUTAGE_CAUSE VARCHAR, CUSTOMERS_AFFECTED NUMBER,
            WEATHER_RELATED BOOLEAN, RESTORATION_CREW VARCHAR
        )
    """,
    'METER_INFRASTRUCTURE': """
        CREATE TABLE IF NOT EXISTS METER_INFRASTRUCTURE (
            METER_ID VARCHAR, METER_LATITUDE FLOAT, METER_LONGITUDE FLOAT,
            COMMISSIONED_DATE DATE, METER_TYPE VARCHAR, CUSTOMER_SEGMENT_ID VARCHAR,
            POLE_ID VARCHAR, CIRCUIT_ID VARCHAR, TRANSFORMER_ID VARCHAR,
            SUBSTATION_ID VARCHAR, POLE_TYPE VARCHAR, POLE_MATERIAL VARCHAR,
            POLE_HEIGHT_FT NUMBER, CONDITION_STATUS VARCHAR, ZIP_CODE VARCHAR,
            CITY VARCHAR, COUNTY_NAME VARCHAR, HEALTH_SCORE FLOAT
        )
    """,
    'CUSTOMERS_MASTER_DATA': """
        CREATE TABLE IF NOT EXISTS CUSTOMERS_MASTER_DATA (
            CUSTOMER_ID VARCHAR, FIRST_NAME VARCHAR, LAST_NAME VARCHAR,
            FULL_NAME VARCHAR, PRIMARY_METER_ID VARCHAR, CUSTOMER_SEGMENT VARCHAR,
            SERVICE_ADDRESS VARCHAR, SERVICE_COUNTY VARCHAR, PHONE VARCHAR,
            EMAIL VARCHAR, ACCOUNT_STATUS VARCHAR, SERVICE_START_DATE DATE,
            CREATED_AT TIMESTAMP_NTZ, DATA_SOURCE VARCHAR, ZIP_CODE NUMBER, CITY VARCHAR
        )
    """,
    'AMI_INTERVAL_READINGS': """
        CREATE TABLE IF NOT EXISTS AMI_INTERVAL_READINGS (
            METER_ID VARCHAR, TIMESTAMP TIMESTAMP_NTZ, USAGE_KWH FLOAT,
            VOLTAGE NUMBER, POWER_FACTOR NUMBER(23,2),
            CUSTOMER_SEGMENT_ID VARCHAR, SOURCE_TABLE VARCHAR
        )
    """,
}

for table_name, ddl in tables_ddl.items():
    cursor.execute(ddl)
    print(f"✅ Created: {table_name}")

## Step 4: Load Seed Data

Load the parquet files into Snowflake tables.
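
Before running the load cells, you may want to confirm that the expected seed files are present. The following sketch checks a small manifest of `(table, pattern, subdirectory)` entries taken from the load cells in this step; the manifest is deliberately partial, and `find_missing_seed_files` is a hypothetical helper.

```python
from pathlib import Path

# Partial manifest for illustration; extend it with the remaining tables.
SEED_MANIFEST = [
    ("SUBSTATIONS", "substations*.parquet", "reference"),
    ("SAP_WORK_ORDERS", "sap_work_orders*.parquet", "operational"),
    ("METER_INFRASTRUCTURE", "meter_infrastructure_10k*.parquet", "samples"),
]


def find_missing_seed_files(seed_dir, manifest=SEED_MANIFEST):
    """Return (table, pattern) pairs for which no parquet file was found."""
    missing = []
    for table, pattern, subdir in manifest:
        directory = Path(seed_dir) / subdir
        has_files = directory.is_dir() and any(directory.glob(pattern))
        if not has_files:
            missing.append((table, pattern))
    return missing


for table, pattern in find_missing_seed_files("."):
    print(f"No files for {table} ({pattern})")
```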

In [None]:
def load_parquet_to_table(table_name, parquet_pattern, subdir):
    """Load parquet files matching pattern into table.

    Rows are appended, so re-running a load cell duplicates data.
    Returns the number of rows written.
    """
    parquet_dir = SEED_DATA_DIR / subdir
    files = list(parquet_dir.glob(parquet_pattern))
    
    if not files:
        print(f"⚠️  No files found for {table_name} ({parquet_pattern})")
        return 0
    
    total_rows = 0
    for f in sorted(files):
        df = pd.read_parquet(f)
        # write_pandas returns (success, num_chunks, num_rows, output)
        success, _, num_rows, _ = write_pandas(
            conn=conn, df=df, table_name=table_name,
            schema=TARGET_SCHEMA, quote_identifiers=False
        )
        if success:
            total_rows += num_rows
        else:
            print(f"⚠️  Failed to load {f.name} into {table_name}")
    
    print(f"✅ {table_name}: {total_rows:,} rows loaded")
    return total_rows

In [None]:
# Load Reference Data
print("Loading Reference Data...")
print("-" * 40)

load_parquet_to_table('SUBSTATIONS', 'substations*.parquet', 'reference')
load_parquet_to_table('CIRCUIT_METADATA', 'circuit_metadata*.parquet', 'reference')
load_parquet_to_table('TRANSFORMER_METADATA', 'transformer_metadata*.parquet', 'reference')
load_parquet_to_table('GRID_POLES_INFRASTRUCTURE', 'grid_poles*.parquet', 'reference')
load_parquet_to_table('HOUSTON_WEATHER_HOURLY', 'houston_weather*.parquet', 'reference')
load_parquet_to_table('ERCOT_LMP_HOUSTON_ZONE', 'ercot_lmp*.parquet', 'reference')
load_parquet_to_table('POWER_QUALITY_READINGS', 'power_quality*.parquet', 'reference')

In [None]:
# Load Operational Data
print("\nLoading Operational Data...")
print("-" * 40)

load_parquet_to_table('SAP_WORK_ORDERS', 'sap_work_orders*.parquet', 'operational')
load_parquet_to_table('OUTAGE_EVENTS', 'outage_events*.parquet', 'operational')

In [None]:
# Load Sample Data (10K meters + customers)
print("\nLoading Sample Data...")
print("-" * 40)

load_parquet_to_table('METER_INFRASTRUCTURE', 'meter_infrastructure_10k*.parquet', 'samples')
load_parquet_to_table('CUSTOMERS_MASTER_DATA', 'customers_master_data_10k*.parquet', 'samples')

## Step 5: Verify Data Loads
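
The verification cell in this step accepts a table when its row count reaches at least 90% of the expected value. The sketch below restates that rule as a function and, as an addition not present in the notebook, also flags counts above the expected value, which can happen when a load cell is run twice. `load_status` and its return labels are hypothetical.

```python
def load_status(actual, expected, tolerance=0.9):
    """Classify a row count as 'short', 'ok', or 'over'."""
    if actual < expected * tolerance:
        return "short"  # fewer rows than the tolerance allows
    if actual > expected:
        return "over"   # possibly loaded more than once
    return "ok"


for actual, expected in [(275, 275), (200, 275), (550, 275)]:
    print(f"{actual:>5} of {expected:>5}: {load_status(actual, expected)}")
```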

In [None]:
# Verify row counts
expected_counts = {
    'SUBSTATIONS': 275,
    'CIRCUIT_METADATA': 8842,
    'TRANSFORMER_METADATA': 91554,
    'GRID_POLES_INFRASTRUCTURE': 62038,
    'HOUSTON_WEATHER_HOURLY': 4464,
    'ERCOT_LMP_HOUSTON_ZONE': 45213,
    'POWER_QUALITY_READINGS': 10000,
    'SAP_WORK_ORDERS': 250488,
    'OUTAGE_EVENTS': 34252,
    'METER_INFRASTRUCTURE': 10000,
    'CUSTOMERS_MASTER_DATA': 11849,
}

print("\nVerifying Data Loads...")
print("-" * 60)
print(f"{'Table':<35} {'Expected':>10} {'Actual':>10} {'Status':>8}")
print("-" * 60)

for table, expected in expected_counts.items():
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    actual = cursor.fetchone()[0]
    status = "✅" if actual >= expected * 0.9 else "⚠️"
    print(f"{table:<35} {expected:>10,} {actual:>10,} {status:>8}")

## Step 6: Generate AMI Readings (Optional)

The AMI readings are too large to include in the repository. You have two options:

1. **Deploy Flux Data Forge** - Generate realistic streaming AMI data
2. **Generate sample data** - Use the code below for quick testing
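
To get a feel for the data volume of option 2, the number of readings is roughly meters × days × readings per day. A minimal sketch of that arithmetic, with `expected_reading_count` as a hypothetical helper:

```python
def expected_reading_count(meters, days, interval_minutes=15):
    """Approximate number of interval readings for a generation run."""
    readings_per_day = (24 * 60) // interval_minutes  # 96 at 15-minute intervals
    return meters * days * readings_per_day


# 100 meters over 7 days at 15-minute intervals
print(f"{expected_reading_count(100, 7):,}")
```

For 100 meters over 7 days this gives about 67,200 rows, which is small enough for a quick test load.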

In [None]:
# Quick sample AMI data generation (for testing)
import random
from datetime import timedelta

def generate_sample_ami(meters_df, days=7, interval_minutes=15):
    """Generate sample AMI readings for testing (first 100 meters only)."""
    readings = []
    end_time = datetime.now()  # Fix the window end once so every meter covers the same range
    start_date = end_time - timedelta(days=days)
    
    meter_ids = meters_df['METER_ID'].tolist()[:100]  # Limit for quick testing
    
    for meter_id in meter_ids:
        current_time = start_date
        while current_time < end_time:
            hour = current_time.hour
            # Time-of-day usage pattern
            if 14 <= hour <= 19:  # Peak
                usage = random.uniform(1.5, 3.5)
            elif 6 <= hour <= 9:  # Morning
                usage = random.uniform(1.0, 2.5)
            else:  # Off-peak
                usage = random.uniform(0.3, 1.5)
            
            readings.append({
                'METER_ID': meter_id,
                'TIMESTAMP': current_time,
                'USAGE_KWH': round(usage, 4),
                'VOLTAGE': random.randint(118, 122),
                'POWER_FACTOR': round(random.uniform(0.92, 0.99), 2),
                'CUSTOMER_SEGMENT_ID': 'RESIDENTIAL',
                'SOURCE_TABLE': 'GENERATED'
            })
            current_time += timedelta(minutes=interval_minutes)
    
    return pd.DataFrame(readings)

# Uncomment to generate sample AMI data:
# meters_df = cursor.execute("SELECT METER_ID FROM METER_INFRASTRUCTURE LIMIT 100").fetch_pandas_all()
# ami_df = generate_sample_ami(meters_df, days=7)
# If TIMESTAMP values load incorrectly, recent connector versions accept use_logical_type=True
# write_pandas(conn, ami_df, 'AMI_INTERVAL_READINGS', schema=TARGET_SCHEMA, quote_identifiers=False)
# print(f"Generated {len(ami_df):,} AMI readings")

## Step 7: Create Views

Run the view creation scripts from the repository.
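
If the views directory contains several scripts, the Snow CLI commands can be generated rather than typed by hand. The sketch below assumes the scripts should run in filename order, which the numeric prefixes suggest; `snow_sql_commands` is a hypothetical helper, and paths containing spaces would need quoting.

```python
from pathlib import Path


def snow_sql_commands(views_dir):
    """Return one 'snow sql -f' command per .sql file, in filename order."""
    directory = Path(views_dir)
    if not directory.is_dir():
        return []
    scripts = sorted(directory.glob("*.sql"))
    return [f"snow sql -f {script}" for script in scripts]


# Assumes this runs from scripts/seed_data/, so views live in ../views
for command in snow_sql_commands(Path("..") / "views"):
    print(command)
```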

In [None]:
# Point to view scripts (resolve first: Path(".").parent is still ".", not scripts/)
views_dir = SEED_DATA_DIR.resolve().parent / 'views'

print("View scripts to run:")
print(f"  1. {views_dir / '01_semantic_model_views.sql'}")
print(f"  2. {views_dir / '02_utility_views.sql'}")
print("\nRun these in Snowsight or using Snow CLI:")
print(f"  snow sql -f {views_dir / '01_semantic_model_views.sql'}")
print(f"  snow sql -f {views_dir / '02_utility_views.sql'}")

## Summary

Your FLUX Ops Center demo environment is now set up with:

| Data Type | Records | Description |
|-----------|---------|-------------|
| Substations | 275 | Grid substations |
| Circuits | 8,842 | Distribution circuits |
| Transformers | 91,554 | Distribution transformers |
| Poles | 62,038 | Grid pole infrastructure |
| Weather | 4,464 | Houston hourly weather |
| ERCOT Pricing | 45,213 | LMP pricing data |
| Power Quality | 10,000 | Power quality readings |
| Work Orders | 250,488 | SAP-style maintenance |
| Outages | 34,252 | Historical outage events |
| Meters | 10,000 | Sample meter infrastructure |
| Customers | 11,849 | Associated customers |

### Next Steps

1. **Create Views**: Run the SQL scripts in `scripts/views/`
2. **Deploy Flux Data Forge**: Generate real-time AMI data
3. **Deploy Flux Ops Center**: Main demo application
4. **Configure Cortex Analyst**: Set up semantic model

In [None]:
# Cleanup
cursor.close()
conn.close()
print("\n✅ Setup complete! Connection closed.")