# Intro to Hybrid Tables

## Overview
This template walks through Snowflake Hybrid Tables using a practical transactional application scenario. You'll learn how to create row-based, ACID-compliant tables designed for high-concurrency operational workloads that need real-time updates and transactional consistency.

**What you'll accomplish:**
- Create hybrid tables for transactional workloads
- Demonstrate ACID compliance with row-level locking
- Understand when to use hybrid vs standard tables
- Build real-world use cases like shopping carts and inventory management

**Business Context:**
You're building a modern e-commerce platform that needs real-time inventory tracking, user session management, and order processing with high concurrency. Hybrid Tables provide ACID-compliant, row-based storage within Snowflake's ecosystem for transactional workloads.

## Step 1: Setup & Environment
First, let's initialize our learning environment and clean up any existing objects from previous runs.

In [None]:
import random
import time

import snowflake.snowpark as snowpark

def get_session():
    """Get the active Snowpark session."""
    return snowpark.Session.builder.getOrCreate()

# Get session
session = get_session()

In [None]:
-- Initialize learning environment
USE ROLE SNOWFLAKE_LEARNING_ROLE;
USE WAREHOUSE SNOWFLAKE_LEARNING_WH;
USE DATABASE SNOWFLAKE_LEARNING_DB;

-- Create unique schema for this template
SET schema_name = CONCAT(CURRENT_USER(), '_HYBRID_TABLE');
DROP SCHEMA IF EXISTS IDENTIFIER($schema_name); 
CREATE SCHEMA IDENTIFIER($schema_name);
USE SCHEMA IDENTIFIER($schema_name);

In [None]:
-- Defensive cleanup: the schema was just recreated above, so these statements
-- only matter if the schema step was skipped
DROP TABLE IF EXISTS shopping_carts;
DROP TABLE IF EXISTS user_sessions;
DROP TABLE IF EXISTS inventory;
DROP TABLE IF EXISTS users;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS order_items;

## Step 2: Understanding Hybrid Tables vs Standard Tables

Before we create tables, let's understand the key differences:

**Hybrid Tables provide:**
- Row-based storage for transactional workloads
- ACID compliance with row-level locking
- Support for high-concurrency operations
- Point lookups by primary key
- Mandatory PRIMARY KEY constraints
- Foreign key and unique constraints support

**Standard Tables provide:**
- Columnar storage for analytical workloads
- Ideal for batch processing and complex queries
- Compressed storage for large datasets
- Designed for aggregations and full table scans
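
To make the storage difference concrete, here is a toy Python sketch. It is not Snowflake's storage engine; the sample records and the helper names `point_lookup` and `total_stock` are illustrative assumptions. It only shows why a keyed, row-oriented layout suits single-record lookups while a column-oriented layout suits aggregations over one column.

```python
# Toy illustration only: the same three records stored row-wise and column-wise.
rows = [
    {"product_id": "PROD_0001", "stock_quantity": 120, "warehouse": "NYC_WAREHOUSE"},
    {"product_id": "PROD_0002", "stock_quantity": 45, "warehouse": "LA_WAREHOUSE"},
    {"product_id": "PROD_0003", "stock_quantity": 0, "warehouse": "NYC_WAREHOUSE"},
]

# Row-oriented layout: a key index leads straight to one complete record.
row_store = {r["product_id"]: r for r in rows}

# Column-oriented layout: the values of each column are stored together.
column_store = {
    "product_id": [r["product_id"] for r in rows],
    "stock_quantity": [r["stock_quantity"] for r in rows],
    "warehouse": [r["warehouse"] for r in rows],
}


def point_lookup(product_id):
    """Fetch one full record by key (the access pattern hybrid tables target)."""
    return row_store[product_id]


def total_stock():
    """Aggregate a single column (the access pattern standard tables target)."""
    return sum(column_store["stock_quantity"])


print(point_lookup("PROD_0002"))
print(total_stock())
```

The lookup touches one record and ignores the rest, while the aggregation reads one column and ignores the others.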

## Step 3: Create Sample Data & Standard Tables
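Let's create standard tables for our product catalog and user profiles, which are better suited for analytical workloads. Note that on standard tables Snowflake records PRIMARY KEY and UNIQUE constraints but does not enforce them; only NOT NULL is enforced.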

In [None]:
-- Create Products table (Standard table for product catalog)
CREATE OR REPLACE TABLE products (
    product_id STRING PRIMARY KEY,
    product_name STRING NOT NULL,
    category STRING NOT NULL,
    base_price DECIMAL(10,2) NOT NULL,
    description STRING,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
);

In [None]:
# Generate sample product data
print("Generating sample product data...")
products_data = []
categories = ['Electronics', 'Clothing', 'Home & Garden', 'Sports', 'Books', 'Beauty', 'Automotive']
for i in range(1, 1001):  # 1000 products
    product_id = f"PROD_{i:04d}"
    category = random.choice(categories)
    product_name = f"{category} Item {i}"
    base_price = round(random.uniform(9.99, 999.99), 2)
    description = f"High-quality {category.lower()} product for everyday use"
    products_data.append((product_id, product_name, category, base_price, description))

# Insert products data in batches
batch_size = 100
for i in range(0, len(products_data), batch_size):
    batch = products_data[i:i + batch_size]
    values_str = ",".join([f"('{p[0]}', '{p[1]}', '{p[2]}', {p[3]}, '{p[4]}')" for p in batch])
    session.sql(f"""
    INSERT INTO products (product_id, product_name, category, base_price, description) 
    VALUES {values_str}
    """).collect()

print(f"Inserted {len(products_data)} products into standard table.")
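
The batch insert above builds the VALUES list by interpolating strings, which breaks as soon as a value contains a single quote. A possible alternative is to generate `?` placeholders and pass the values separately. The helper below is a hypothetical sketch (`build_batch_insert` is not part of Snowpark), and it still assumes the table and column names come from trusted code.

```python
def build_batch_insert(table, columns, rows):
    """Return (sql, params) for a multi-row INSERT that uses ? placeholders."""
    if not rows:
        raise ValueError("rows must not be empty")
    if any(len(row) != len(columns) for row in rows):
        raise ValueError("every row must have one value per column")
    # One "(?, ?, ...)" group per row, one "?" per column.
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    values_clause = ", ".join(row_placeholder for _ in rows)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values_clause}"
    # Flatten the rows so the values line up with the placeholders in order.
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_batch_insert(
    "products",
    ["product_id", "product_name"],
    [("PROD_0001", "Women's Jacket"), ("PROD_0002", "Desk Lamp")],
)
print(sql)
print(params)
```

Recent Snowpark versions appear to accept bind values as `session.sql(sql, params=params)`; treat that call as an assumption and check it against the Snowpark version you have installed.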

In [None]:
-- Create Users table (Standard table for user profiles)
CREATE OR REPLACE TABLE users (
    user_id STRING PRIMARY KEY,
    username STRING NOT NULL UNIQUE,
    email STRING NOT NULL UNIQUE,
    first_name STRING,
    last_name STRING,
    registration_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    last_login TIMESTAMP
);

In [None]:
# Generate sample user data
print("Generating sample user data...")
users_data = []
for i in range(1, 501):  # 500 users
    user_id = f"USER_{i:04d}"
    username = f"user{i}"
    email = f"user{i}@example.com"
    first_name = f"FirstName{i}"
    last_name = f"LastName{i}"
    users_data.append((user_id, username, email, first_name, last_name))

# Insert users data in batches
for i in range(0, len(users_data), batch_size):
    batch = users_data[i:i + batch_size]
    values_str = ",".join([f"('{u[0]}', '{u[1]}', '{u[2]}', '{u[3]}', '{u[4]}')" for u in batch])
    session.sql(f"""
    INSERT INTO users (user_id, username, email, first_name, last_name) 
    VALUES {values_str}
    """).collect()

print(f"Inserted {len(users_data)} users into standard table.")

# Verify standard table creation
products_count = session.sql("SELECT COUNT(*) as count FROM products").collect()[0]['COUNT']
users_count = session.sql("SELECT COUNT(*) as count FROM users").collect()[0]['COUNT']
print(f"Standard tables created: {products_count} products, {users_count} users")

## Step 4: Basic Hybrid Table Creation
Now let's create hybrid tables optimized for transactional workloads. Every hybrid table must declare a PRIMARY KEY; Snowflake enforces it and uses it to index rows for fast point lookups.

In [None]:
-- Create Inventory Hybrid Table (Real-time stock levels with PRIMARY KEY)
CREATE OR REPLACE HYBRID TABLE inventory (
    product_id STRING PRIMARY KEY,
    warehouse_location STRING NOT NULL,
    stock_quantity INTEGER NOT NULL DEFAULT 0,
    reserved_quantity INTEGER NOT NULL DEFAULT 0,
    last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    min_stock_level INTEGER DEFAULT 10,
    max_stock_level INTEGER DEFAULT 1000
);

In [None]:
-- Create User Sessions Hybrid Table (Active user session tracking)
CREATE OR REPLACE HYBRID TABLE user_sessions (
    session_id STRING PRIMARY KEY,
    user_id STRING NOT NULL,
    session_start TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    last_activity TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    is_active BOOLEAN DEFAULT TRUE,
    ip_address STRING,
    user_agent STRING DEFAULT 'web-browser'
);

In [None]:
print("Created hybrid tables with PRIMARY KEY constraints")

# Generate inventory data for all products
print("Populating inventory hybrid table...")
inventory_data = []
warehouse_locations = ['NYC_WAREHOUSE', 'LA_WAREHOUSE', 'CHI_WAREHOUSE', 'MIA_WAREHOUSE']

for i in range(1, 1001):  # Match all products
    product_id = f"PROD_{i:04d}"
    warehouse = random.choice(warehouse_locations)
    stock_qty = random.randint(0, 500)
    reserved_qty = random.randint(0, min(50, stock_qty))
    min_level = random.randint(5, 20)
    max_level = random.randint(100, 1000)
    inventory_data.append((product_id, warehouse, stock_qty, reserved_qty, min_level, max_level))

# Insert inventory data in batches
for i in range(0, len(inventory_data), batch_size):
    batch = inventory_data[i:i + batch_size]
    values_str = ",".join([f"('{inv[0]}', '{inv[1]}', {inv[2]}, {inv[3]}, {inv[4]}, {inv[5]})" for inv in batch])
    session.sql(f"""
    INSERT INTO inventory (product_id, warehouse_location, stock_quantity, reserved_quantity, min_stock_level, max_stock_level) 
    VALUES {values_str}
    """).collect()

print(f"Inserted {len(inventory_data)} inventory records into hybrid table")

In [None]:
# Generate active user sessions
print("Creating active user sessions...")
session_data = []
for i in range(1, 101):  # 100 active sessions
    session_id = f"SESSION_{i:04d}_{int(time.time())}"
    user_id = f"USER_{random.randint(1, 500):04d}"
    ip_address = f"192.168.{random.randint(1, 255)}.{random.randint(1, 255)}"
    is_active = random.choice([True, True, True, False])  # 75% active
    session_data.append((session_id, user_id, ip_address, is_active))

# Insert session data in batches
for i in range(0, len(session_data), batch_size):
    batch = session_data[i:i + batch_size]
    values_str = ",".join([f"('{s[0]}', '{s[1]}', '{s[2]}', {str(s[3]).upper()})" for s in batch])
    session.sql(f"""
    INSERT INTO user_sessions (session_id, user_id, ip_address, is_active) 
    VALUES {values_str}
    """).collect()

print(f"Created {len(session_data)} user sessions in hybrid table")

# Verify hybrid table creation
inventory_count = session.sql("SELECT COUNT(*) as count FROM inventory").collect()[0]['COUNT']
sessions_count = session.sql("SELECT COUNT(*) as count FROM user_sessions").collect()[0]['COUNT']
print(f"Hybrid tables created: {inventory_count} inventory records, {sessions_count} sessions")

## Step 5: Transactional Operations
Let's demonstrate the key features of hybrid tables: point lookups by primary key, concurrent updates, and ACID compliance.

In [None]:
# Point Lookup - Single-record retrieval by primary key
print("Retrieving inventory record by primary key...")
result = session.sql("SELECT * FROM inventory WHERE product_id = 'PROD_0001'").collect()
print(f"Found inventory record: {result[0]}")

In [None]:
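# Conditional updates - simulate order fulfillment against inventory.
# Note: this loop issues its updates one after another from a single session;
# real concurrency would come from multiple sessions running in parallel.
print("\nSimulating inventory updates...")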
update_operations = []

# Simulate order fulfillment - reducing stock
for _ in range(10):
    product_id = f"PROD_{random.randint(1, 100):04d}"
    qty_sold = random.randint(1, 5)
    
    session.sql(f"""
    UPDATE inventory 
    SET stock_quantity = stock_quantity - {qty_sold},
        last_updated = CURRENT_TIMESTAMP()
    WHERE product_id = '{product_id}' AND stock_quantity >= {qty_sold}
    """).collect()
    
    # The UPDATE changes zero rows when stock is insufficient, so log it as an attempt
    update_operations.append(f"Attempted sale of {qty_sold} units of {product_id}")

print(f"Issued {len(update_operations)} inventory update statements")

# Simulate restocking operations
print("Simulating restocking operations...")
for _ in range(5):
    product_id = f"PROD_{random.randint(1, 100):04d}"
    restock_qty = random.randint(50, 200)
    
    session.sql(f"""
    UPDATE inventory 
    SET stock_quantity = stock_quantity + {restock_qty},
        last_updated = CURRENT_TIMESTAMP()
    WHERE product_id = '{product_id}'
    """).collect()
    
    print(f"Restocked {restock_qty} units of {product_id}")
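
The `WHERE ... AND stock_quantity >= qty` guard in the sales update is what keeps stock from going negative when many orders arrive at once. The sketch below simulates that idea locally with Python threads and a lock standing in for a row-level lock. It does not connect to Snowflake, and `InventoryRow`, `try_sell`, and `simulate` are hypothetical names used only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InventoryRow:
    """Local stand-in for one inventory row guarded by a row-level lock."""

    def __init__(self, stock_quantity):
        self.stock_quantity = stock_quantity
        self._lock = threading.Lock()

    def try_sell(self, qty):
        """Mimic UPDATE ... SET stock = stock - qty WHERE stock >= qty."""
        with self._lock:
            if self.stock_quantity >= qty:
                self.stock_quantity -= qty
                return True   # one row updated
            return False      # zero rows updated; stock left unchanged


def simulate(initial_stock, order_sizes, workers=8):
    """Run all orders against one row from a pool of worker threads."""
    row = InventoryRow(initial_stock)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(row.try_sell, order_sizes))
    sold = sum(qty for qty, ok in zip(order_sizes, results) if ok)
    return row.stock_quantity, sold


remaining, sold = simulate(initial_stock=20, order_sizes=[3] * 10)
print(remaining, sold)
```

Because the check and the decrement happen under the same lock, the units sold plus the units remaining always add up to the starting stock, no matter how the threads interleave.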

### ACID Compliance Demonstration
Let's demonstrate transaction handling with atomic operations that can be committed or rolled back as a unit.

In [None]:
-- Show current state before transaction
SELECT product_id, stock_quantity, reserved_quantity 
FROM inventory 
WHERE product_id IN ('PROD_0001', 'PROD_0002') 
ORDER BY product_id;

In [None]:
-- Perform atomic update - reserve inventory for an order
BEGIN TRANSACTION;

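-- Reserve 10 units from PROD_0001 and 5 units from PROD_0002.
-- Note: if a stock_quantity guard fails, that UPDATE changes zero rows but the
-- transaction still commits; an application should check row counts and ROLLBACK.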
UPDATE inventory 
SET reserved_quantity = reserved_quantity + 10,
    stock_quantity = stock_quantity - 10,
    last_updated = CURRENT_TIMESTAMP()
WHERE product_id = 'PROD_0001' AND stock_quantity >= 10;

UPDATE inventory 
SET reserved_quantity = reserved_quantity + 5,
    stock_quantity = stock_quantity - 5,
    last_updated = CURRENT_TIMESTAMP()
WHERE product_id = 'PROD_0002' AND stock_quantity >= 5;

COMMIT;

In [None]:
-- Verify transaction results
SELECT product_id, stock_quantity, reserved_quantity 
FROM inventory 
WHERE product_id IN ('PROD_0001', 'PROD_0002') 
ORDER BY product_id;
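
In the transaction above, an UPDATE whose stock guard fails simply changes zero rows, and COMMIT still applies whatever the other statement did. An application that wants a true all-or-nothing reservation would check the affected row count and roll back. The sketch below shows that pattern using Python's built-in `sqlite3` as a local stand-in for Snowflake, so it runs anywhere; the `reserve_all_or_nothing` helper and the sample stock values are assumptions for illustration.

```python
import sqlite3


def reserve_all_or_nothing(conn, reservations):
    """Reserve every (product_id, qty) pair, or roll back if any guard fails."""
    cur = conn.cursor()
    cur.execute("BEGIN")
    for product_id, qty in reservations:
        cur.execute(
            """
            UPDATE inventory
            SET reserved_quantity = reserved_quantity + ?,
                stock_quantity = stock_quantity - ?
            WHERE product_id = ? AND stock_quantity >= ?
            """,
            (qty, qty, product_id, qty),
        )
        if cur.rowcount != 1:  # guard failed: missing product or not enough stock
            cur.execute("ROLLBACK")
            return False
    cur.execute("COMMIT")
    return True


# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE inventory (product_id TEXT PRIMARY KEY, "
    "stock_quantity INTEGER NOT NULL, reserved_quantity INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("PROD_0001", 50, 0), ("PROD_0002", 3, 0)],
)

# PROD_0002 has only 3 units, so the whole reservation is rolled back.
ok = reserve_all_or_nothing(conn, [("PROD_0001", 10), ("PROD_0002", 5)])
state = conn.execute(
    "SELECT product_id, stock_quantity, reserved_quantity "
    "FROM inventory ORDER BY product_id"
).fetchall()
print(ok, state)
```

The same shape should carry over to a Snowflake session: run the statements inside an explicit transaction, inspect how many rows each UPDATE changed, and issue ROLLBACK instead of COMMIT when a guard fails.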

## Step 6: Advanced Hybrid Table Features
Let's create a shopping cart system that demonstrates foreign key constraints, unique constraints, and mixed workload patterns.

In [None]:
-- Create Shopping Cart Hybrid Table with multiple constraints
CREATE OR REPLACE HYBRID TABLE shopping_carts (
    cart_id STRING PRIMARY KEY,
    session_id STRING NOT NULL,
    product_id STRING NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 1,
    added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
    FOREIGN KEY (session_id) REFERENCES user_sessions(session_id),
    UNIQUE (session_id, product_id)
);

In [None]:
print("Created shopping_carts hybrid table with multiple constraints")

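# Populate shopping carts with sample data
print("Adding items to shopping carts...")
# Reuse session IDs that already exist in user_sessions. Regenerating them with a
# fresh time.time() would produce IDs that violate the FOREIGN KEY on session_id.
existing_session_ids = [s[0] for s in session_data[:20]]
cart_data = []
for i in range(1, 51):  # 50 cart items
    cart_id = f"CART_{i:04d}"
    session_id = random.choice(existing_session_ids)
    product_id = f"PROD_{random.randint(1, 100):04d}"
    quantity = random.randint(1, 5)
    cart_data.append((cart_id, session_id, product_id, quantity))

# Insert cart data one row at a time so a rejected row does not block the rest
inserted = 0
for cart in cart_data:
    try:
        session.sql(f"""
        INSERT INTO shopping_carts (cart_id, session_id, product_id, quantity)
        VALUES ('{cart[0]}', '{cart[1]}', '{cart[2]}', {cart[3]})
        """).collect()
        inserted += 1
    except Exception as e:
        # Expected when UNIQUE (session_id, product_id) rejects a duplicate;
        # report it so other failures are not silently hidden
        print(f"Skipped {cart[0]}: {e}")

print(f"Added {inserted} of {len(cart_data)} items to shopping carts")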
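
Rather than letting the UNIQUE (session_id, product_id) constraint reject duplicate rows, an application could merge duplicate cart lines before inserting them. The following is a minimal sketch of that idea in plain Python; `merge_cart_lines` is a hypothetical helper and the sample tuples are made up for illustration.

```python
def merge_cart_lines(lines):
    """Combine (session_id, product_id, quantity) tuples that share a key.

    Quantities for the same (session_id, product_id) pair are summed, and the
    order of first appearance is preserved.
    """
    merged = {}
    for session_id, product_id, quantity in lines:
        key = (session_id, product_id)
        merged[key] = merged.get(key, 0) + quantity
    return [(s, p, q) for (s, p), q in merged.items()]


lines = [
    ("S1", "PROD_0001", 2),
    ("S2", "PROD_0001", 1),
    ("S1", "PROD_0001", 3),  # same session and product as the first line
]
print(merge_cart_lines(lines))
```

After merging, each (session_id, product_id) pair appears once, so the inserts no longer depend on constraint errors to filter out duplicates.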

## Step 7: Mixed Workload & Analytics
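Hybrid tables can be joined with standard tables in the same query, so you can run lightweight analytics directly on live transactional data. Large scans and heavy aggregations are still better served by standard tables. Let's run some analytical queries on our transactional data.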

In [None]:
-- Analytical query on hybrid table data - Most popular products in carts
SELECT 
    p.product_name,
    p.category,
    SUM(sc.quantity) as total_in_carts,
    COUNT(DISTINCT sc.session_id) as unique_sessions,
    AVG(i.stock_quantity) as avg_stock_level
FROM shopping_carts sc
JOIN products p ON sc.product_id = p.product_id
JOIN inventory i ON p.product_id = i.product_id
GROUP BY p.product_id, p.product_name, p.category
ORDER BY total_in_carts DESC
LIMIT 10;

In [None]:
-- Real-time inventory alerts based on shopping cart demand
SELECT 
    i.product_id,
    p.product_name,
    i.stock_quantity,
    i.min_stock_level,
    COALESCE(SUM(sc.quantity), 0) as pending_demand,
    (i.stock_quantity - COALESCE(SUM(sc.quantity), 0)) as available_after_carts
FROM inventory i
JOIN products p ON i.product_id = p.product_id
LEFT JOIN shopping_carts sc ON i.product_id = sc.product_id
GROUP BY i.product_id, p.product_name, i.stock_quantity, i.min_stock_level
HAVING i.stock_quantity <= i.min_stock_level 
   OR (i.stock_quantity - COALESCE(SUM(sc.quantity), 0)) < i.min_stock_level
ORDER BY available_after_carts ASC
LIMIT 5;
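
The HAVING clause in the alert query combines two conditions: stock is already at or below the minimum, or it would drop below the minimum once everything sitting in carts is purchased. Here is the same arithmetic as a small Python function. `needs_alert` is a hypothetical name, and the numbers in the examples are invented.

```python
def needs_alert(stock_quantity, min_stock_level, pending_demand=0):
    """Mirror the alert query's HAVING clause for a single product."""
    available_after_carts = stock_quantity - pending_demand
    return (
        stock_quantity <= min_stock_level
        or available_after_carts < min_stock_level
    )


# Plenty of stock and little demand: no alert.
print(needs_alert(stock_quantity=100, min_stock_level=10, pending_demand=5))
# Stock already at or below the minimum: alert.
print(needs_alert(stock_quantity=8, min_stock_level=10))
# Stock looks fine, but carts would take it below the minimum: alert.
print(needs_alert(stock_quantity=25, min_stock_level=10, pending_demand=20))
```

Note that the two comparisons differ: the first uses `<=` and the second uses `<`, so a product whose stock after carts lands exactly on the minimum is not flagged by the second condition.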

## Step 8: Real-world Use Cases
Let's explore practical applications where hybrid tables provide significant value.

In [None]:
-- Use Case 1: Real-time Shopping Cart Management
SELECT 
    sc.session_id,
    COUNT(*) as items_in_cart,
    ROUND(SUM(sc.quantity * p.base_price), 2) as cart_value,
    CASE WHEN us.is_active THEN 'Active' ELSE 'Inactive' END as session_status
FROM shopping_carts sc
JOIN products p ON sc.product_id = p.product_id
LEFT JOIN user_sessions us ON sc.session_id = us.session_id
GROUP BY sc.session_id, us.is_active
ORDER BY cart_value DESC
LIMIT 5;

In [None]:
-- Use Case 2: Cross-channel Inventory Synchronization
SELECT 
    warehouse_location,
    COUNT(*) as total_products,
    SUM(stock_quantity) as total_stock,
    SUM(reserved_quantity) as total_reserved,
    ROUND(AVG(stock_quantity), 1) as avg_stock_per_product,
    COUNT(CASE WHEN stock_quantity <= min_stock_level THEN 1 END) as low_stock_alerts
FROM inventory
GROUP BY warehouse_location
ORDER BY total_stock DESC;

In [None]:
-- Use Case 3: Real-time User Activity & Personalization
SELECT 
    u.user_id,
    u.username,
    COUNT(DISTINCT us.session_id) as total_sessions,
    COUNT(DISTINCT sc.product_id) as unique_products_in_carts,
    COALESCE(SUM(sc.quantity), 0) as total_items_in_carts
FROM users u
LEFT JOIN user_sessions us ON u.user_id = us.user_id
LEFT JOIN shopping_carts sc ON us.session_id = sc.session_id
GROUP BY u.user_id, u.username
HAVING COUNT(DISTINCT us.session_id) > 0
ORDER BY total_items_in_carts DESC NULLS LAST
LIMIT 5;

## Step 9: Best Practices & Key Takeaways

### When to Use Hybrid Tables:
- Real-time inventory management and stock updates
- User session tracking and management 
- Shopping cart and order processing systems
- Application metadata and configuration storage
- High-concurrency transactional workloads
- Systems requiring point lookups by primary key

### Best Practices:
- **Always use PRIMARY KEY constraints** (mandatory for hybrid tables)
- **Design for transactional workloads** with frequent updates
- **Combine with standard tables** for mixed analytical workloads
- **Monitor storage footprint** as row-based storage uses more space
- **Leverage constraints** for referential integrity and data validation
- **Use explicit transactions** so multi-statement operations commit or roll back as a unit

### Current Limitations:
- Available only in select cloud regions; check the Snowflake documentation for current availability
- Higher storage footprint compared to columnar tables
- PRIMARY KEY constraint is mandatory
- Designed for operational workloads, not large analytical scans

In [None]:
-- View all constraints implemented in this demo
SELECT 
    table_name,
    constraint_type,
    constraint_name
FROM information_schema.table_constraints
WHERE table_schema = CURRENT_SCHEMA()
AND constraint_type IN ('PRIMARY KEY', 'FOREIGN KEY', 'UNIQUE')
ORDER BY table_name, constraint_type;

## Cleanup
Let's clean up all the objects we created during this demo.

In [None]:
-- Clean up all created objects
DROP TABLE IF EXISTS shopping_carts;
DROP TABLE IF EXISTS user_sessions;
DROP TABLE IF EXISTS inventory;
DROP TABLE IF EXISTS users;
DROP TABLE IF EXISTS products;

## Key Learning Points

**You've successfully learned how to:**
- Create hybrid tables with mandatory PRIMARY KEY constraints
- Implement ACID-compliant transactions with row-level locking
- Build real-world transactional applications like inventory management
- Combine operational and analytical workloads in a single system
- Understand when hybrid tables provide value over standard tables

**Key Insights:**
- Hybrid Tables provide ACID-compliant, row-based storage for transactional workloads
- PRIMARY KEY constraints enable point lookups and concurrent operations
- Designed for real-time applications requiring high concurrency
- Seamlessly integrate with standard Snowflake tables for mixed workloads
- Useful for operational applications that need to live alongside analytical data in Snowflake

## Additional Resources

- [Snowflake Hybrid Tables Documentation](https://docs.snowflake.com/en/user-guide/tables-hybrid)
- [Hybrid Tables vs Standard Tables Comparison](https://docs.snowflake.com/en/user-guide/tables-hybrid#comparison-with-standard-tables)
- [Constraint Management in Hybrid Tables](https://docs.snowflake.com/en/user-guide/table-constraints-hybrid)
- [Templates Hub](https://app.snowflake.com/templates)