# RFID Badge Event Simulator

**What you're about to see:** Snowflake ingesting real-time data via REST API: no Kafka, no message queues, just HTTP POST.

This notebook will:
1. Validate your environment is ready
2. Authenticate using JWT key-pair auth
3. Open a streaming channel via REST API
4. Send 1,000 badge events via HTTP POST
5. Verify data landed in your tables

**The Technology:** Snowpipe Streaming REST API (GA since Sept 2024)
- Direct HTTP ingestion (no middleware)
- Up to 10 GB/sec throughput per table
- Sub-10-second latency
- Production-ready, fully managed by Snowflake


## Step 1: Environment Validation

Let's verify your environment is properly configured before we start.


In [None]:
# Import libraries and validate environment
import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import col
import _snowflake
import requests
import json
import time

session = snowpark.context.get_active_session()

print("="*70)
print("ENVIRONMENT VALIDATION")
print("="*70)

# Check 1: Database and schema exist
try:
    result = session.sql("SHOW DATABASES LIKE 'SNOWFLAKE_EXAMPLE'").collect()
    db_exists = len(result) > 0
    print(f"[{'PASS' if db_exists else 'FAIL'}] Database: SNOWFLAKE_EXAMPLE")
except Exception as e:
    print(f"[FAIL] Database check failed: {e}")
    db_exists = False

# Check 2: Pipe exists
try:
    result = session.sql("""
        SELECT COUNT(*) as cnt 
        FROM SNOWFLAKE_EXAMPLE.INFORMATION_SCHEMA.PIPES 
        WHERE PIPE_SCHEMA = 'RAW_INGESTION' 
          AND PIPE_NAME = 'SFE_BADGE_EVENTS_PIPE'
    """).collect()
    pipe_exists = result[0]['CNT'] > 0
    print(f"[{'PASS' if pipe_exists else 'FAIL'}] Pipe: SFE_BADGE_EVENTS_PIPE")
except Exception as e:
    print(f"[FAIL] Pipe check failed: {e}")
    pipe_exists = False

# Check 3: Target table exists
try:
    result = session.sql("""
        SELECT COUNT(*) as cnt 
        FROM SNOWFLAKE_EXAMPLE.INFORMATION_SCHEMA.TABLES 
        WHERE TABLE_SCHEMA = 'RAW_INGESTION' 
          AND TABLE_NAME = 'RAW_BADGE_EVENTS'
    """).collect()
    table_exists = result[0]['CNT'] > 0
    print(f"[{'PASS' if table_exists else 'FAIL'}] Table: RAW_BADGE_EVENTS")
except Exception as e:
    print(f"[FAIL] Table check failed: {e}")
    table_exists = False

# Check 4: Secrets configured
secrets_ok = True
missing_secrets = []
for secret_name in ['SFE_SS_JWT_KEY', 'SFE_SS_ACCOUNT', 'SFE_SS_USER']:
    try:
        # Secrets must be referenced by their fully qualified name
        _snowflake.get_generic_secret_string(f'SNOWFLAKE_EXAMPLE.DEMO_REPO.{secret_name}')
        print(f"[PASS] Secret: {secret_name} configured")
    except Exception as e:
        print(f"[FAIL] Secret: {secret_name} NOT FOUND")
        print(f"       Error: {str(e)}")
        missing_secrets.append(secret_name)
        secrets_ok = False

print("="*70)

# Summary
all_ok = db_exists and pipe_exists and table_exists and secrets_ok

if all_ok:
    print("[READY] All prerequisites are configured.")
    print("        Continue to the next cell to start sending data.")
else:
    print("[SETUP INCOMPLETE] Please fix the following:")
    if not db_exists or not pipe_exists or not table_exists:
        print("   -> Run: sql/00_git_setup/03_deploy_from_git.sql")
    if not secrets_ok:
        print("   -> Run: sql/00_git_setup/02_configure_secrets.sql")
        print(f"   -> Missing: {', '.join(missing_secrets)}")
    print()
    print("   After fixing, re-run this cell to validate.")
    raise SystemExit("Environment validation failed - please complete setup first")

print("="*70)


## Step 2: Load Authentication Configuration

Now we'll load your credentials from Snowflake secrets.


In [None]:
# Load credentials from Snowflake secrets
import hashlib
import base64
import random
from datetime import datetime, timedelta
from cryptography.hazmat.primitives import serialization, hashes
from cryptography.hazmat.primitives.asymmetric import padding

# Load secrets (fully qualified names required)
private_key_pem = _snowflake.get_generic_secret_string('SNOWFLAKE_EXAMPLE.DEMO_REPO.SFE_SS_JWT_KEY')
account = _snowflake.get_generic_secret_string('SNOWFLAKE_EXAMPLE.DEMO_REPO.SFE_SS_ACCOUNT')
user = _snowflake.get_generic_secret_string('SNOWFLAKE_EXAMPLE.DEMO_REPO.SFE_SS_USER')

config = {
    'account': account,
    'user': user,
    'private_key_pem': private_key_pem,
    'database': 'SNOWFLAKE_EXAMPLE',
    'schema': 'RAW_INGESTION',
    'pipe': 'SFE_BADGE_EVENTS_PIPE'
}

print("="*70)
print("AUTHENTICATION CONFIGURATION")
print("="*70)
print(f"Account: {config['account']}")
print(f"User: {config['user']}")
print(f"Target Pipe: {config['database']}.{config['schema']}.{config['pipe']}")
print(f"Private Key: {len(config['private_key_pem'])} bytes loaded")
print("="*70)


## Step 3: Initialize JWT Authentication

We'll create a JWT token generator using RS256 key-pair authentication, the same method Snowflake connectors use.
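
Before sending a token anywhere, it can help to decode its header and payload locally and check the claims. The following is a minimal sketch, not part of the notebook's flow: `decode_jwt_segments` and the demo token are hypothetical, the account and user values are placeholders, and the helper deliberately does not verify the signature.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_jwt_segments(token: str) -> tuple:
    """Return (header, payload) as dicts. Does NOT verify the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


# Hypothetical, unsigned demo token with placeholder claims (illustration only)
demo_header = {"alg": "RS256", "typ": "JWT"}
demo_payload = {
    "iss": "MYACCOUNT.MYUSER.SHA256:placeholder",
    "sub": "MYACCOUNT.MYUSER",
    "iat": 1700000000,
    "exp": 1700000000 + 59 * 60,
}
demo_token = ".".join([
    b64url_encode(json.dumps(demo_header).encode("utf-8")),
    b64url_encode(json.dumps(demo_payload).encode("utf-8")),
    "fake-signature",
])

header, payload = decode_jwt_segments(demo_token)
print(header["alg"], "lifetime (s):", payload["exp"] - payload["iat"])
```

The same helper could be pointed at a token produced by `generate_jwt()` to confirm that `iss` ends with the key fingerprint and that `exp - iat` stays under one hour.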


In [None]:
# JWT Authentication Class
class SnowflakeAuth:
    """Generates JWT tokens for Snowflake REST API authentication"""
    
    def __init__(self, account, user, private_key_pem):
        self.account = account
        self.user = user
        self.private_key = self._load_private_key(private_key_pem)
        self.public_key_fingerprint = self._calculate_fingerprint()
    
    def _load_private_key(self, pem_string):
        """Load RSA private key from PEM format"""
        key_bytes = pem_string.encode() if isinstance(pem_string, str) else pem_string
        return serialization.load_pem_private_key(key_bytes, password=None)
    
    @staticmethod
    def _base64url_encode(data: bytes) -> str:
        """Base64 URL-safe encoding (no padding)"""
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode("utf-8")
    
    def _calculate_fingerprint(self):
        """Calculate SHA256 fingerprint of public key"""
        public_key = self.private_key.public_key()
        public_key_der = public_key.public_bytes(
            encoding=serialization.Encoding.DER,
            format=serialization.PublicFormat.SubjectPublicKeyInfo
        )
        sha256_hash = hashlib.sha256(public_key_der).digest()
        return 'SHA256:' + base64.b64encode(sha256_hash).decode('utf-8')
    
    def generate_jwt(self, expiration_minutes=59):
        """Generate JWT token (max 60 minutes lifetime)"""
        # Use epoch seconds directly: .timestamp() on a naive
        # datetime.utcnow() is interpreted as local time, which skews
        # iat/exp on hosts whose timezone is not UTC.
        now = int(time.time())
        qualified_username = f"{self.account}.{self.user}".upper()
        
        # JWT payload
        payload = {
            "iss": f"{qualified_username}.{self.public_key_fingerprint}",
            "sub": qualified_username,
            "iat": now,
            "exp": now + expiration_minutes * 60
        }
        
        # Build JWT: header.payload.signature
        header = {"alg": "RS256", "typ": "JWT"}
        header_segment = self._base64url_encode(
            json.dumps(header, separators=(",", ":")).encode("utf-8")
        )
        payload_segment = self._base64url_encode(
            json.dumps(payload, separators=(",", ":")).encode("utf-8")
        )
        signing_input = f"{header_segment}.{payload_segment}".encode("utf-8")
        signature = self.private_key.sign(signing_input, padding.PKCS1v15(), hashes.SHA256())
        signature_segment = self._base64url_encode(signature)
        
        return f"{header_segment}.{payload_segment}.{signature_segment}"

# Initialize authenticator
auth = SnowflakeAuth(
    account=config['account'],
    user=config['user'],
    private_key_pem=config['private_key_pem']
)

# Verify we can generate tokens
token = auth.generate_jwt()
print("="*70)
print("JWT AUTHENTICATION")
print("="*70)
print(f"Key Fingerprint: {auth.public_key_fingerprint[:40]}...")
print(f"JWT Token Generated: {len(token)} bytes")
print(f"Preview: {token[:60]}...")
print("="*70)


## Step 4: Build the REST API Client

This is where it gets interesting. We'll create a client that talks directly to Snowflake's streaming endpoints.

**The 3-Step REST API Flow:**
1. **GET** `/v2/streaming/hostname` → Get control plane URL
2. **POST** `/v2/streaming/.../pipes/{PIPE}:open-channel` → Open streaming channel
3. **POST** `/v2/streaming/.../channels/{CHANNEL}:insert-rows` → Send data (THIS IS IT!)
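
The three URLs can be assembled up front to see how they relate. This is a rough sketch that mirrors the path layout used by this notebook's client code; `streaming_urls` is a hypothetical helper, the `*.example.invalid` hosts are placeholders, and percent-encoding the path parts is an added precaution rather than something the notebook does.

```python
from urllib.parse import quote


def streaming_urls(account_url, control_host, ingest_host,
                   database, schema, pipe, channel):
    """Build the three endpoint URLs used in the flow above (illustrative helper)."""
    # Percent-encode each path component so unusual names cannot break the URL
    pipe_path = (
        f"/v2/streaming/databases/{quote(database, safe='')}"
        f"/schemas/{quote(schema, safe='')}"
        f"/pipes/{quote(pipe, safe='')}"
    )
    return {
        "hostname": f"{account_url}/v2/streaming/hostname",
        "open_channel": f"https://{control_host}{pipe_path}:open-channel",
        "insert_rows": (
            f"https://{ingest_host}{pipe_path}"
            f"/channels/{quote(channel, safe='')}:insert-rows"
        ),
    }


# Placeholder hosts; the real ones come from the hostname and open-channel responses
urls = streaming_urls(
    account_url="https://myaccount.example.invalid",
    control_host="control.example.invalid",
    ingest_host="ingest.example.invalid",
    database="SNOWFLAKE_EXAMPLE",
    schema="RAW_INGESTION",
    pipe="SFE_BADGE_EVENTS_PIPE",
    channel="rfid_demo_1",
)
for name, url in urls.items():
    print(f"{name}: {url}")
```

Note that each step targets a different host: the account URL for discovery, the control host for opening the channel, and the ingest host for the rows themselves.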


In [None]:
# Snowpipe Streaming REST API Client
class SnowpipeStreamingClient:
    """Direct HTTP client for Snowflake's streaming ingestion API"""
    
    def __init__(self, auth, database, schema, pipe):
        self.auth = auth
        self.database = database
        self.schema = schema
        self.pipe = pipe
        
        # Build account URL
        account_for_url = auth.account.replace('_', '-').lower()
        self.account_url = f"https://{account_for_url}.snowflakecomputing.com"
        
        # Session state (populated during workflow)
        self.control_host = None
        self.ingest_host = None
        self.scoped_token = None
        self.continuation_token = None
    
    def get_control_host(self):
        """Step 1: Discover the control plane hostname"""
        jwt_token = self.auth.generate_jwt()
        
        response = requests.get(
            f"{self.account_url}/v2/streaming/hostname",
            headers={"Authorization": f"Bearer {jwt_token}"}
        )
        response.raise_for_status()
        
        self.control_host = response.text.strip('"')
        return self.control_host
    
    def open_channel(self, channel_name):
        """Step 2: Open a streaming channel (returns ingest host + scoped token)"""
        if not self.control_host:
            self.get_control_host()
        
        jwt_token = self.auth.generate_jwt()
        url = f"https://{self.control_host}/v2/streaming/databases/{self.database}/schemas/{self.schema}/pipes/{self.pipe}:open-channel"
        
        response = requests.post(
            url,
            headers={
                "Authorization": f"Bearer {jwt_token}",
                "Content-Type": "application/json"
            },
            json={"channel_name": channel_name}
        )
        response.raise_for_status()
        
        data = response.json()
        self.ingest_host = data['ingest_host']
        self.scoped_token = data['scoped_token']
        self.continuation_token = data['continuation_token']
        
        return data
    
    def insert_rows(self, channel_name, rows):
        """Step 3: Send data via HTTP POST (THE MAIN EVENT!)"""
        url = f"https://{self.ingest_host}/v2/streaming/databases/{self.database}/schemas/{self.schema}/pipes/{self.pipe}/channels/{channel_name}:insert-rows"
        
        response = requests.post(
            url,
            headers={
                "Authorization": f"Bearer {self.scoped_token}",
                "Content-Type": "application/json",
                "X-Snowflake-Streaming-Continuation-Token": self.continuation_token
            },
            json={"rows": rows}
        )
        response.raise_for_status()
        
        result = response.json()
        self.continuation_token = result.get('continuation_token', self.continuation_token)
        
        return result

# Initialize client
client = SnowpipeStreamingClient(
    auth=auth,
    database=config['database'],
    schema=config['schema'],
    pipe=config['pipe']
)

print("="*70)
print("REST API CLIENT READY")
print("="*70)
print(f"Account URL: {client.account_url}")
print(f"Target: {client.database}.{client.schema}.{client.pipe}")
print("Ready to stream data via HTTP POST")
print("="*70)


## Step 5: Generate Sample RFID Events

Let's create realistic badge scan events. We'll simulate 100 employees accessing 20 zones via 10 RFID readers.
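
One detail worth knowing before changing the defaults: the generator spreads zones evenly across five zone types using integer division, so `num_zones` is effectively rounded down to a multiple of five. A small standalone sketch of that logic (`build_zone_ids` is a hypothetical name, extracted here only for illustration):

```python
ZONE_TYPES = ["LOBBY", "OFFICE", "CONF", "SECURE", "PARKING"]


def build_zone_ids(num_zones):
    """Reproduce the generator's zone naming: num_zones // 5 zones per type."""
    per_type = num_zones // len(ZONE_TYPES)
    return [
        f"ZONE-{zone_type}-{i}"
        for zone_type in ZONE_TYPES
        for i in range(1, per_type + 1)
    ]


for requested in (20, 22, 4):
    zones = build_zone_ids(requested)
    print(f"requested={requested} -> generated={len(zones)}")
```

Requesting fewer than five zones yields an empty list, and `random.choice` raises `IndexError` on an empty sequence, so keep `num_zones` at five or more.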


In [None]:
# RFID Badge Event Generator
class BadgeEventGenerator:
    """Generates realistic RFID badge scan events"""
    
    def __init__(self, num_users=100, num_zones=20, num_readers=10):
        self.badge_ids = [f"BADGE-{str(i).zfill(5)}" for i in range(1, num_users + 1)]
        self.user_ids = [f"USR-{str(i).zfill(3)}" for i in range(1, num_users + 1)]
        self.zone_ids = [f"ZONE-{zone_type}-{i}" 
                        for zone_type in ["LOBBY", "OFFICE", "CONF", "SECURE", "PARKING"]
                        for i in range(1, (num_zones // 5) + 1)]
        self.reader_ids = [f"RDR-{str(i).zfill(3)}" for i in range(1, num_readers + 1)]
        self.directions = ["ENTRY", "EXIT"]
    
    def generate_event(self, timestamp=None):
        """Generate a single badge scan event"""
        if timestamp is None:
            timestamp = datetime.utcnow()
        
        user_idx = random.randint(0, len(self.user_ids) - 1)
        
        return {
            "badge_id": self.badge_ids[user_idx],
            "user_id": self.user_ids[user_idx],
            "zone_id": random.choice(self.zone_ids),
            "event_timestamp": timestamp.isoformat() + "Z",
            "event_type": random.choice(self.directions),
            "reader_id": random.choice(self.reader_ids),
            "signal_strength": random.randint(-85, -20),
            "direction": random.choice(self.directions)
        }
    
    def generate_batch(self, count=100, start_time=None):
        """Generate a batch of events with realistic timestamps"""
        if start_time is None:
            start_time = datetime.utcnow()
        
        events = []
        for i in range(count):
            timestamp = start_time + timedelta(seconds=i*0.01)  # 10ms apart
            events.append(self.generate_event(timestamp))
        
        return events

# Initialize generator
generator = BadgeEventGenerator(num_users=100, num_zones=20, num_readers=10)

# Generate one sample to show what we're sending
sample = generator.generate_event()

print("="*70)
print("EVENT GENERATOR READY")
print("="*70)
print(f"Simulating: 100 users, 20 zones, 10 readers")
print(f"Sample event generated:")
print()
print(json.dumps(sample, indent=2))
print("="*70)


## Step 6: THE DEMO - Stream Data to Snowflake

**This is it!** Watch as we send 1,000 RFID events directly to Snowflake via HTTP POST.

No Kafka. No Kinesis. No message queues. Just REST API calls hitting Snowflake's ingestion endpoints.
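
The demo sends events in fixed-size batches, with a smaller final batch when the total does not divide evenly. A minimal sketch of that arithmetic, using the same ceiling-division approach as the demo function (`batch_sizes` is a hypothetical helper for illustration):

```python
def batch_sizes(num_events, batch_size):
    """Return the size of each batch needed to send num_events."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Ceiling division: number of batches needed to cover every event
    num_batches = (num_events + batch_size - 1) // batch_size
    sizes = []
    sent = 0
    for _ in range(num_batches):
        size = min(batch_size, num_events - sent)  # last batch may be short
        sizes.append(size)
        sent += size
    return sizes


print(batch_sizes(1000, 100))
print(batch_sizes(250, 100))
```

With the defaults used in this notebook (1,000 events, batch size 100) that works out to ten HTTP POST requests.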


In [None]:
# Execute the streaming demo
def stream_events_to_snowflake(num_events=1000, batch_size=100):
    """
    Send events to Snowflake via REST API
    
    This function demonstrates the complete Snowpipe Streaming workflow:
    1. Get control host
    2. Open channel  
    3. POST data in batches
    """
    
    channel_name = f"rfid_demo_{int(time.time())}"
    
    print("="*70)
    print("STREAMING DEMO STARTING")
    print("="*70)
    print()
    
    # STEP 1: Get control plane host
    print("Step 1: Discovering control plane...")
    control_host = client.get_control_host()
    print(f"   Control host: {control_host}")
    print()
    
    # STEP 2: Open streaming channel
    print(f"Step 2: Opening channel '{channel_name}'...")
    channel_data = client.open_channel(channel_name)
    print(f"   Channel opened!")
    print(f"   Ingest host: {client.ingest_host}")
    print(f"   Token: {client.scoped_token[:20]}...")
    print()
    
    # STEP 3: Stream data in batches
    print(f"Step 3: Streaming {num_events} events via HTTP POST...")
    print()
    
    total_sent = 0
    start_time = time.time()
    num_batches = (num_events + batch_size - 1) // batch_size
    
    for batch_num in range(num_batches):
        # Generate batch
        batch_count = min(batch_size, num_events - total_sent)
        events = generator.generate_batch(batch_count)
        
        # Send via REST API - THIS IS THE STAR OF THE SHOW!
        result = client.insert_rows(channel_name, events)
        
        total_sent += batch_count
        elapsed = time.time() - start_time
        rate = total_sent / elapsed if elapsed > 0 else 0
        
        # Progress indicator
        progress = "█" * (batch_num + 1) + "░" * (num_batches - batch_num - 1)
        print(f"   [{progress}] Batch {batch_num + 1}/{num_batches}: "
              f"{total_sent:,} events | {rate:.0f} events/sec")
        
        # Brief pause between batches
        time.sleep(0.1)
    
    elapsed = time.time() - start_time
    
    print()
    print("="*70)
    print("STREAMING COMPLETE!")
    print("="*70)
    print(f"   Events sent: {total_sent:,}")
    print(f"   Duration: {elapsed:.2f} seconds")
    print(f"   Throughput: {total_sent/elapsed:.0f} events/sec")
    print(f"   Channel: {channel_name}")
    print("="*70)
    print()
    print("Data is now flowing through Snowflake's pipeline:")
    print("   REST API -> Pipe -> RAW table -> Stream -> Task -> STAGING -> ANALYTICS")
    print()
    
    return total_sent

# RUN THE DEMO!
events_sent = stream_events_to_snowflake(num_events=1000, batch_size=100)


## Step 7: Verify Data Landed in Snowflake

Let's query your tables to confirm the data arrived and flowed through the pipeline.
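
A fixed wait is simple, but ingestion latency varies, so polling until the expected row count appears (or a timeout passes) can be more robust. The sketch below is an optional alternative, not what the notebook's cell does: `wait_for_count` is a hypothetical helper, and `get_count` stands in for any callable that returns the current row count, such as a wrapper around the `COUNT(*)` query.

```python
import time


def wait_for_count(get_count, expected, timeout_s=30.0, interval_s=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_count() until it reaches expected or timeout_s elapses.

    Returns the last observed count, so the caller can tell success from timeout.
    """
    deadline = clock() + timeout_s
    count = get_count()
    while count < expected and clock() < deadline:
        sleep(interval_s)
        count = get_count()
    return count


# Demo with a fake counter instead of a real query (rows "arrive" over three polls)
fake_counts = iter([0, 400, 1000])
final = wait_for_count(lambda: next(fake_counts), expected=1000,
                       sleep=lambda seconds: None)
print("rows observed:", final)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays; in the notebook the defaults would be used.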


In [None]:
# Verify data landed and check pipeline status
print("Checking your tables...")
print()
print("Waiting 5 seconds for ingestion to complete...")
time.sleep(5)

print("="*70)
print("PIPELINE STATUS")
print("="*70)

# Query actual row counts from YOUR tables
raw_count = session.sql(
    "SELECT COUNT(*) as cnt FROM SNOWFLAKE_EXAMPLE.RAW_INGESTION.RAW_BADGE_EVENTS"
).collect()[0]['CNT']

staging_count = session.sql(
    "SELECT COUNT(*) as cnt FROM SNOWFLAKE_EXAMPLE.STAGING_LAYER.STG_BADGE_EVENTS"
).collect()[0]['CNT']

analytics_count = session.sql(
    "SELECT COUNT(*) as cnt FROM SNOWFLAKE_EXAMPLE.ANALYTICS_LAYER.FCT_ACCESS_EVENTS"
).collect()[0]['CNT']

# Check if stream still has data (means tasks are processing)
stream_status = session.sql(
    "SELECT SYSTEM$STREAM_HAS_DATA('SNOWFLAKE_EXAMPLE.RAW_INGESTION.sfe_badge_events_stream') as has_data"
).collect()[0]['HAS_DATA']

# Display results
print()
print(f"{'Layer':<20} {'Row Count':>15} {'Status'}")
print("-"*70)
print(f"{'RAW_BADGE_EVENTS':<20} {raw_count:>15,} {'PASS - Received' if raw_count > 0 else 'FAIL - No data'}")
print(f"{'STG_BADGE_EVENTS':<20} {staging_count:>15,} {'PASS - Processed' if staging_count > 0 else 'Processing...'}")
print(f"{'FCT_ACCESS_EVENTS':<20} {analytics_count:>15,} {'PASS - Transformed' if analytics_count > 0 else 'Processing...'}")
print()
print(f"Stream Status: {'Processing' if stream_status else 'Empty (all caught up)'}")
print("="*70)

# Success message
if raw_count >= events_sent:
    print()
    print("SUCCESS! Your data is in Snowflake!")
    print(f"   You sent {events_sent:,} events via REST API")
    print(f"   Snowflake received {raw_count:,} events")
    print()
    
    if staging_count == raw_count and analytics_count == raw_count:
        print("BONUS: Complete end-to-end pipeline validated!")
        print("   Data flowed: REST API -> RAW -> Stream -> Task -> STAGING -> ANALYTICS")
    elif staging_count > 0 or analytics_count > 0:
        print("Pipeline still processing downstream tables (normal)")
        print("   Wait 1-2 minutes for tasks to complete, then re-run this cell")
else:
    print()
    print("Data still arriving (wait a few seconds and re-run)")

print("="*70)


## Step 8: View Your Data

Let's look at the actual events that just arrived in your tables.


In [None]:
# Show sample events from RAW table
print("="*70)
print("SAMPLE EVENTS FROM YOUR RAW TABLE (Most Recent)")
print("="*70)
print()

sample_df = session.sql("""
    SELECT 
        badge_id,
        user_id,
        zone_id,
        event_type,
        event_timestamp,
        ingestion_time
    FROM SNOWFLAKE_EXAMPLE.RAW_INGESTION.RAW_BADGE_EVENTS
    ORDER BY ingestion_time DESC
    LIMIT 10
""")

sample_df.show()

# Show analytics summary
print()
print("="*70)
print("ANALYTICS SUMMARY")
print("="*70)
print()

summary_df = session.sql("""
    SELECT 
        event_type,
        COUNT(*) as event_count,
        COUNT(DISTINCT user_id) as unique_users,
        COUNT(DISTINCT zone_id) as unique_zones
    FROM SNOWFLAKE_EXAMPLE.RAW_INGESTION.RAW_BADGE_EVENTS
    GROUP BY event_type
    ORDER BY event_count DESC
""")

summary_df.show()

print()
print("="*70)
print("DEMO COMPLETE!")
print("="*70)
print()
print("What you just saw:")
print("  - REST API authentication (JWT)")
print("  - Streaming channel opened")
print("  - 1,000 events sent via HTTP POST")
print("  - Data landed in Snowflake in seconds")
print("  - No middleware required")
print()
print("Next steps:")
print("  - Explore the ANALYTICS_LAYER tables for transformed data")
print("  - Check sql/03_monitoring/monitoring_views.sql for pipeline metrics")
print("  - Try modifying num_events and batch_size in the demo function")
print("  - Review the REST API calls in the SnowpipeStreamingClient class")
print("="*70)


In [None]:
# Cell 4: Snowpipe Streaming REST API Client
# Demonstrates the complete REST API workflow

class SnowpipeStreamingClient:
    """Client for Snowflake Snowpipe Streaming REST API"""
    
    def __init__(self, auth, database, schema, pipe):
        self.auth = auth
        self.database = database
        self.schema = schema
        self.pipe = pipe
        
        # Build account URL
        account_for_url = auth.account.replace('_', '-').lower()
        self.account_url = f"https://{account_for_url}.snowflakecomputing.com"
        
        # Session state
        self.control_host = None
        self.ingest_host = None
        self.scoped_token = None
        self.continuation_token = None
    
    def get_control_host(self):
        """Step 1: Get control plane hostname"""
        jwt_token = self.auth.generate_jwt()
        
        response = requests.get(
            f"{self.account_url}/v2/streaming/hostname",
            headers={"Authorization": f"Bearer {jwt_token}"}
        )
        response.raise_for_status()
        
        self.control_host = response.text.strip('"')
        print(f"   Control host: {self.control_host}")
        return self.control_host
    
    def open_channel(self, channel_name):
        """Step 2: Open streaming channel"""
        if not self.control_host:
            self.get_control_host()
        
        jwt_token = self.auth.generate_jwt()
        url = f"https://{self.control_host}/v2/streaming/databases/{self.database}/schemas/{self.schema}/pipes/{self.pipe}:open-channel"
        
        response = requests.post(
            url,
            headers={
                "Authorization": f"Bearer {jwt_token}",
                "Content-Type": "application/json"
            },
            json={"channel_name": channel_name}
        )
        response.raise_for_status()
        
        data = response.json()
        self.ingest_host = data['ingest_host']
        self.scoped_token = data['scoped_token']
        self.continuation_token = data['continuation_token']
        
        print(f"   ✅ Channel '{channel_name}' opened")
        print(f"   Ingest host: {self.ingest_host}")
        return data
    
    def insert_rows(self, channel_name, rows):
        """Step 3: Insert rows via REST API - THIS IS THE KEY DEMO!"""
        url = f"https://{self.ingest_host}/v2/streaming/databases/{self.database}/schemas/{self.schema}/pipes/{self.pipe}/channels/{channel_name}:insert-rows"
        
        response = requests.post(
            url,
            headers={
                "Authorization": f"Bearer {self.scoped_token}",
                "Content-Type": "application/json",
                "X-Snowflake-Streaming-Continuation-Token": self.continuation_token
            },
            json={"rows": rows}
        )
        response.raise_for_status()
        
        result = response.json()
        self.continuation_token = result.get('continuation_token', self.continuation_token)
        
        return result

# Initialize client
client = SnowpipeStreamingClient(
    auth=auth,
    database=config['database'],
    schema=config['schema'],
    pipe=config['pipe']
)

print("✅ Snowpipe Streaming client initialized")


In [None]:
# Cell 5: RFID Badge Event Generator
# Generates realistic badge scan events

class BadgeEventGenerator:
    """Generate realistic RFID badge events"""
    
    def __init__(self, num_users=100, num_zones=20, num_readers=10):
        self.badge_ids = [f"BADGE-{str(i).zfill(5)}" for i in range(1, num_users + 1)]
        self.user_ids = [f"USR-{str(i).zfill(3)}" for i in range(1, num_users + 1)]
        self.zone_ids = [f"ZONE-{zone_type}-{i}" 
                        for zone_type in ["LOBBY", "OFFICE", "CONF", "SECURE", "PARKING"]
                        for i in range(1, (num_zones // 5) + 1)]
        self.reader_ids = [f"RDR-{str(i).zfill(3)}" for i in range(1, num_readers + 1)]
        self.directions = ["ENTRY", "EXIT"]
    
    def generate_event(self, timestamp=None):
        """Generate a single badge event"""
        if timestamp is None:
            timestamp = datetime.utcnow()
        
        user_idx = random.randint(0, len(self.user_ids) - 1)
        
        event = {
            "badge_id": self.badge_ids[user_idx],
            "user_id": self.user_ids[user_idx],
            "zone_id": random.choice(self.zone_ids),
            "event_timestamp": timestamp.isoformat() + "Z",
            "event_type": random.choice(self.directions),
            "reader_id": random.choice(self.reader_ids),
            "signal_strength": random.randint(-85, -20),  # dBm
            "direction": random.choice(self.directions)
        }
        
        return event
    
    def generate_batch(self, count=100, start_time=None):
        """Generate a batch of events"""
        if start_time is None:
            start_time = datetime.utcnow()
        
        events = []
        for i in range(count):
            # Spread events over time (0.01 seconds apart)
            timestamp = start_time + timedelta(seconds=i*0.01)
            events.append(self.generate_event(timestamp))
        
        return events

# Initialize generator
generator = BadgeEventGenerator(num_users=100, num_zones=20, num_readers=10)

# Test generation
sample_event = generator.generate_event()
print("✅ Event generator initialized")
print(f"   Sample event: {json.dumps(sample_event, indent=2)}")


## 🚀 Run the Simulation

This is where the magic happens! We'll:
1. Open a streaming channel
2. Generate badge events
3. Send them via REST API POST
4. Validate they arrived in Snowflake

**This demonstrates the core value:** Direct HTTP ingestion with no middleware!


In [None]:
# Cell 6: Execute Simulation - Send Data via REST API
# This is the main demo of Snowpipe Streaming REST API!

def run_simulation(num_events=1000, batch_size=100):
    """Run RFID simulation - sends data to Snowflake REST API"""
    
    channel_name = f"rfid_channel_{int(time.time())}"
    
    print("="*70)
    print("🚀 Starting RFID Badge Event Simulation")
    print("="*70)
    print()
    
    # Step 1: Get control host
    print("📡 Step 1: Getting control plane hostname...")
    client.get_control_host()
    print()
    
    # Step 2: Open channel
    print(f"🔓 Step 2: Opening streaming channel '{channel_name}'...")
    client.open_channel(channel_name)
    print()
    
    # Step 3: Send events in batches
    print(f"📤 Step 3: Sending {num_events} events via REST API...")
    total_sent = 0
    start_time = time.time()
    
    num_batches = (num_events + batch_size - 1) // batch_size
    
    for batch_num in range(num_batches):
        # Generate batch
        batch_count = min(batch_size, num_events - total_sent)
        events = generator.generate_batch(batch_count)
        
        # Send via REST API - THIS IS THE KEY DEMO!
        result = client.insert_rows(channel_name, events)
        
        total_sent += batch_count
        elapsed = time.time() - start_time
        rate = total_sent / elapsed if elapsed > 0 else 0
        
        print(f"   Batch {batch_num + 1}/{num_batches}: {batch_count} events sent | "
              f"Total: {total_sent} | Rate: {rate:.0f} events/sec")
        
        # Brief pause between batches
        time.sleep(0.1)
    
    elapsed = time.time() - start_time
    print()
    print("="*70)
    print("✅ Simulation Complete!")
    print(f"   Events sent: {total_sent}")
    print(f"   Duration: {elapsed:.2f} seconds")
    print(f"   Average rate: {total_sent/elapsed:.0f} events/sec")
    print("="*70)
    print()
    
    return total_sent

# Run simulation with 1000 events
events_sent = run_simulation(num_events=1000, batch_size=100)


In [None]:
# Cell 7: Validate Data Arrived in Snowflake
# Query the table to confirm REST API ingestion worked

def validate_pipeline():
    """Check that events made it through the pipeline"""
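    # get_session() is not defined anywhere in this notebook; use the active Snowpark session
    session = snowpark.context.get_active_session()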
    
    print("🔍 Validating data pipeline...")
    print()
    
    # Wait a moment for ingestion to complete
    print("   Waiting 5 seconds for ingestion to complete...")
    time.sleep(5)
    
    # Schema and stream names below match the objects validated in Step 1
    # and queried in Step 7.
    # Check raw table
    raw_count = session.sql(
        "SELECT COUNT(*) FROM SNOWFLAKE_EXAMPLE.RAW_INGESTION.RAW_BADGE_EVENTS"
    ).collect()[0][0]
    
    # Check staging table
    staging_count = session.sql(
        "SELECT COUNT(*) FROM SNOWFLAKE_EXAMPLE.STAGING_LAYER.STG_BADGE_EVENTS"
    ).collect()[0][0]
    
    # Check analytics table
    analytics_count = session.sql(
        "SELECT COUNT(*) FROM SNOWFLAKE_EXAMPLE.ANALYTICS_LAYER.FCT_ACCESS_EVENTS"
    ).collect()[0][0]
    
    # Check stream status
    stream_has_data = session.sql(
        "SELECT SYSTEM$STREAM_HAS_DATA('SNOWFLAKE_EXAMPLE.RAW_INGESTION.sfe_badge_events_stream')"
    ).collect()[0][0]
    
    print("📊 Pipeline Status:")
    print("   " + "="*66)
    print(f"   {'Layer':<20} | {'Row Count':>10} | {'Status':>30}")
    print("   " + "-"*66)
    print(f"   {'RAW':<20} | {raw_count:>10,} | {'✅ Data received' if raw_count > 0 else '❌ No data'}")
    print(f"   {'STAGING':<20} | {staging_count:>10,} | {'✅ Processed' if staging_count > 0 else '⏳ Processing'}")
    print(f"   {'ANALYTICS':<20} | {analytics_count:>10,} | {'✅ Transformed' if analytics_count > 0 else '⏳ Processing'}")
    print("   " + "="*66)
    print(f"   Stream Status: {'⏳ Processing' if stream_has_data else '✅ Empty (all processed)'}")
    print()
    
    if raw_count > 0:
        print("   ✅ SUCCESS! REST API ingestion is working!")
        print("   Data flowed: REST API → Snowpipe → RAW table")
        
        if staging_count == raw_count and analytics_count == raw_count:
            print("   ✅ BONUS! Complete pipeline validated!")
            print("   Data flowed: RAW → Streams → Tasks → STAGING → ANALYTICS")
        elif staging_count > 0 or analytics_count > 0:
            print("   ⏳ Pipeline still processing... (wait 1-2 minutes for tasks)")
    else:
        print("   ⚠️  No data in RAW table yet. Wait a few seconds and re-run.")
    
    print()
    
    # Show sample events
    if raw_count > 0:
        print("📋 Sample Events (5 most recent):")
        sample_df = session.sql(
            "SELECT badge_id, zone_id, event_timestamp, event_type "
            "FROM SNOWFLAKE_EXAMPLE.RAW_INGESTION.RAW_BADGE_EVENTS "
            "ORDER BY ingestion_time DESC LIMIT 5"
        )
        sample_df.show()

# Run validation
validate_pipeline()


## 🎯 What We Just Demonstrated

This notebook showcased **Snowflake's Snowpipe Streaming REST API**:

### Key Capabilities:
1. **Native HTTP Ingestion** - No external infrastructure required
2. **JWT Authentication** - Secure key-pair auth with RS256
3. **Channel-Based Streaming** - Isolated streams with continuation tokens
4. **High Performance** - 1000+ events/sec with sub-second batching
5. **Low Latency** - <10 seconds from POST to queryable data

### The API Workflow:
```
1. GET /v2/streaming/hostname
   → Returns control plane host

2. POST /v2/streaming/.../pipes/{PIPE}:open-channel
   → Opens channel, returns ingest host + scoped token

3. POST /v2/streaming/.../channels/{CHANNEL}:insert-rows
   → Sends data via HTTP POST (THIS IS THE STAR!)
   → Includes continuation token for ordering

4. Data flows automatically:
   REST API → PIPE → RAW table → Stream → Task → STAGING → ANALYTICS
```
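
The continuation token is the piece of state that ties consecutive `insert-rows` calls together: each response supplies the token to send with the next request. The sketch below illustrates that hand-off with an in-memory stand-in. `FakeChannel` is hypothetical and is not the Snowflake API; in particular, rejecting a stale token is an assumption made here purely to show why the token has to be carried forward.

```python
class FakeChannel:
    """In-memory stand-in for a streaming channel (illustration only)."""

    def __init__(self):
        self._seq = 0
        self.rows = []

    def open(self):
        # Opening the channel hands back the first continuation token
        return {"continuation_token": f"tok-{self._seq}"}

    def insert_rows(self, continuation_token, rows):
        # Assumed behavior: only the most recently issued token is accepted
        if continuation_token != f"tok-{self._seq}":
            raise ValueError("stale or out-of-order continuation token")
        self.rows.extend(rows)
        self._seq += 1
        return {"continuation_token": f"tok-{self._seq}"}


channel = FakeChannel()
token = channel.open()["continuation_token"]

for batch in (["event-1", "event-2"], ["event-3"]):
    result = channel.insert_rows(token, batch)
    # Same pattern as the notebook's client: keep the old token if none is returned
    token = result.get("continuation_token", token)

print(channel.rows, token)
```

This is the same update pattern `SnowpipeStreamingClient.insert_rows` applies after every POST.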

### Why This Matters:
- **Zero middleware** - RFID vendors POST directly to Snowflake
- **Snowflake-native** - No Kafka, no message queues, no external services
- **Production-ready** - GA since September 2024, supports 10 GB/sec per table
- **Cost-efficient** - Throughput-based pricing, no compute overhead

---

## 📚 Next Steps

1. **View the data:**
   ```sql
   SELECT * FROM SNOWFLAKE_EXAMPLE.ANALYTICS_LAYER.FCT_ACCESS_EVENTS
   ORDER BY event_timestamp DESC LIMIT 100;
   ```

2. **Test with curl:**
   See `README.md#tldr` for direct curl commands to hit the REST API

3. **Explore the pipeline:**
   - Streams: `SHOW STREAMS IN DATABASE SNOWFLAKE_EXAMPLE;`
   - Tasks: `SHOW TASKS IN DATABASE SNOWFLAKE_EXAMPLE;`
   - Monitoring: query the views defined in `sql/03_monitoring/monitoring_views.sql`

4. **Customize:**
   - Modify event schema: `sql/01_setup/01_core_setup.sql` (RAW_BADGE_EVENTS table)
   - Add transformations: `sql/01_setup/01_core_setup.sql` (PIPE definition)
   - Extend analytics model: `sql/01_setup/02_analytics_layer.sql` (dimensions/facts)
