# Customer.IO Data Pipelines API - Events and Tracking

## Purpose

This notebook demonstrates event tracking with Customer.IO's Data Pipelines API.
It covers standard, custom, and semantic events, along with batch submission, context enrichment, validation, and error handling.

## Prerequisites

- Complete setup from `00_setup_and_configuration.ipynb`
- Complete authentication setup from `01_authentication_and_utilities.ipynb`
- Customer.IO API key configured in Databricks secrets
- Sample data available in Delta tables

## Key Concepts

- **Event Tracking**: Recording user actions and behaviors
- **Semantic Events**: Standardized events with consistent schemas
- **Custom Properties**: Flexible event attributes for business logic
- **Event Validation**: Type-safe event creation and validation
- **Batch Event Processing**: Efficient bulk event submission
- **Event Context**: Rich contextual data for better targeting

## Event Categories Covered

1. **Standard Events**: Basic tracking events (page views, clicks, etc.)
2. **Ecommerce Events**: Product views, cart actions, purchases
3. **Engagement Events**: Content interaction, feature usage
4. **Lifecycle Events**: Onboarding, activation, churn signals
5. **Custom Events**: Business-specific tracking events
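
Every category above reduces to a track call with the same basic shape. The sketch below builds such a payload using only the standard library. `build_track_payload` is a hypothetical helper for illustration and is not part of the notebook's `utils` package; the keys mirror the `userId`, `event`, `properties`, and `timestamp` fields used throughout this notebook.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_track_payload(
    user_id: str,
    event: str,
    properties: Optional[Dict[str, Any]] = None,
    timestamp: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build a track-style payload (hypothetical helper, for illustration only)."""
    if not user_id:
        raise ValueError("user_id is required")
    if not event:
        raise ValueError("event name is required")
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "userId": user_id,
        "event": event,
        "properties": dict(properties or {}),
        # ISO 8601 string so the payload is JSON-serializable
        "timestamp": ts.isoformat(),
    }


payload = build_track_payload(
    "user_123",
    "Page Viewed",
    {"page_name": "Home"},
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(payload)
```

The timestamp is converted to a string up front so that the payload can be passed to `json.dumps` without a custom serializer.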

## Setup and Imports

In [None]:
# Standard library imports
import sys
import os
from datetime import datetime, timezone, timedelta
from typing import Dict, List, Optional, Any, Union
import json
import uuid
from decimal import Decimal

print("SUCCESS: Standard libraries imported")

In [None]:
# Add the repository root to the Python path so `utils` is importable as a package
sys.path.append('/Workspace/Repos/customer_io_notebooks')
print("SUCCESS: Repository root added to Python path")

In [None]:
# Import Customer.IO API utilities
from utils.api_client import CustomerIOClient
from utils.validators import (
    TrackRequest,
    EcommerceEventProperties,
    OrderCompletedProperties,
    ProductViewedProperties,
    EmailEventProperties,
    MobileAppEventProperties,
    VideoEventProperties,
    validate_request_size,
    create_context
)

print("SUCCESS: Customer.IO API utilities imported")

In [None]:
# Import transformation utilities
from utils.transformers import (
    EventTransformer,
    BatchTransformer,
    ContextTransformer
)

print("SUCCESS: Transformation utilities imported")

In [None]:
# Import error handling utilities
from utils.error_handlers import (
    CustomerIOError,
    RateLimitError,
    ValidationError,
    NetworkError,
    retry_on_error,
    ErrorContext
)

print("SUCCESS: Error handling utilities imported")

In [None]:
# Import Databricks and Spark utilities
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import *
from delta.tables import DeltaTable

print("SUCCESS: Databricks and Spark utilities imported")

In [None]:
# Import validation and logging
import structlog
from pydantic import ValidationError as PydanticValidationError, BaseModel, Field

# Import EventManager for advanced event handling
from utils.event_manager import EventManager, EventTemplate, EventCategory, EventPriority, EventSession

# Initialize logger
logger = structlog.get_logger("events_tracking")

print("SUCCESS: Validation, logging, and EventManager imported")

## Configuration and Client Setup

In [None]:
# Load configuration from setup notebook (secure approach)
try:
    CUSTOMERIO_REGION = dbutils.widgets.get("customerio_region") or "us"
    DATABASE_NAME = dbutils.widgets.get("database_name") or "customerio_demo"
    CATALOG_NAME = dbutils.widgets.get("catalog_name") or "main"
    ENVIRONMENT = dbutils.widgets.get("environment") or "test"
    
    print("Configuration loaded from setup notebook:")
    print(f"  Region: {CUSTOMERIO_REGION}")
    print(f"  Database: {CATALOG_NAME}.{DATABASE_NAME}")
    print(f"  Environment: {ENVIRONMENT}")
    
except Exception as e:
    print(f"WARNING: Could not load configuration from setup notebook: {str(e)}")
    print("INFO: Using fallback configuration")
    CUSTOMERIO_REGION = "us"
    DATABASE_NAME = "customerio_demo"
    CATALOG_NAME = "main"
    ENVIRONMENT = "test"

In [None]:
# Get Customer.IO API key from secure storage
CUSTOMERIO_API_KEY = dbutils.secrets.get("customerio", "api_key")
print("SUCCESS: Customer.IO API key retrieved from secure storage")

In [None]:
# Configure Spark to use the specified database
spark.sql(f"USE {CATALOG_NAME}.{DATABASE_NAME}")
print("SUCCESS: Database configured")

In [None]:
# Initialize the Customer.IO client
try:
    client = CustomerIOClient(
        api_key=CUSTOMERIO_API_KEY,
        region=CUSTOMERIO_REGION,
        timeout=30,
        max_retries=3,
        retry_backoff_factor=2.0,
        enable_logging=True,
        spark_session=spark
    )
    print("SUCCESS: Customer.IO client initialized for event tracking")
    
except Exception as e:
    print(f"ERROR: Failed to initialize Customer.IO client: {str(e)}")
    raise

In [None]:
# Initialize the EventManager with the Customer.IO client
event_manager = EventManager(client)
print("SUCCESS: EventManager initialized with default templates")
print(f"Available templates: {list(event_manager.templates.keys())}")

## Test-Driven Development: Event Validation Functions

In [None]:
# Test function: Validate basic event structure
def test_basic_event_validation():
    """Test that basic events have required fields and pass validation."""
    
    # Test valid event
    valid_event = {
        "userId": "user_123",
        "event": "Page Viewed",
        "properties": {
            "page_name": "Home",
            "url": "https://example.com"
        },
        "timestamp": datetime.now(timezone.utc)
    }
    
    try:
        track_request = TrackRequest(**valid_event)
        assert track_request.userId == "user_123"
        assert track_request.event == "Page Viewed"
        print("SUCCESS: Basic event validation test passed")
        return True
    except Exception as e:
        print(f"ERROR: Basic event validation test failed: {str(e)}")
        return False

# Run the test
test_basic_event_validation()

In [None]:
# Test function: Validate ecommerce event properties
def test_ecommerce_event_validation():
    """Test that ecommerce events validate correctly with proper schemas."""
    
    # Test product viewed event
    try:
        product_viewed = ProductViewedProperties(
            product_id="prod_123",
            name="Test Product",
            price=29.99,
            currency="USD"
        )
        assert product_viewed.product_id == "prod_123"
        assert product_viewed.price == 29.99
        print("SUCCESS: Product viewed validation test passed")
    except Exception as e:
        print(f"ERROR: Product viewed validation test failed: {str(e)}")
        return False
    
    # Test order completed event
    try:
        order_completed = OrderCompletedProperties(
            order_id="order_456",
            total=89.97,
            currency="USD",
            products=[
                {"product_id": "prod_123", "quantity": 3, "price": 29.99}
            ]
        )
        assert order_completed.order_id == "order_456"
        assert order_completed.total == 89.97
        print("SUCCESS: Order completed validation test passed")
        return True
    except Exception as e:
        print(f"ERROR: Order completed validation test failed: {str(e)}")
        return False

# Run the test
test_ecommerce_event_validation()
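
Order totals are easier to trust when they are checked against the line items. The sketch below sums `price * quantity` with `Decimal` rather than floats; `line_items_total` is a hypothetical helper, and whether `OrderCompletedProperties` performs a similar check is not shown in this notebook.

```python
from decimal import Decimal
from typing import Any, Dict, List


def line_items_total(products: List[Dict[str, Any]]) -> Decimal:
    """Sum price * quantity using Decimal to avoid float rounding drift."""
    total = Decimal("0")
    for item in products:
        # Convert through str so 29.99 becomes Decimal("29.99") exactly
        total += Decimal(str(item["price"])) * int(item["quantity"])
    return total


products = [
    {"product_id": "prod_123", "quantity": 3, "price": 29.99},
]
expected_total = Decimal("89.97")
print(line_items_total(products) == expected_total)
```

Converting each price through `str` keeps the arithmetic exact for values with two decimal places, which is the usual case for currency amounts.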

In [None]:
# Test function: Validate event size limits
def test_event_size_validation():
    """Test that events respect size limits and validation."""
    
    # Test normal-sized event
    normal_event = {
        "userId": "user_123",
        "event": "Feature Used",
        "properties": {
            "feature_name": "search",
            "search_query": "test product"
        }
    }
    
    if not validate_request_size(normal_event):
        print("ERROR: Normal event failed size validation")
        return False
    
    print("SUCCESS: Normal event passed size validation")
    
    # Test oversized event
    oversized_event = {
        "userId": "user_123",
        "event": "Large Data Event",
        "properties": {
            "large_payload": "x" * (33 * 1024)  # 33KB - exceeds limit
        }
    }
    
    if validate_request_size(oversized_event):
        print("ERROR: Oversized event incorrectly passed validation")
        return False
    
    print("SUCCESS: Oversized event correctly failed size validation")
    return True

# Run the test
test_event_size_validation()
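
The implementation of `validate_request_size` lives in `utils.validators` and is not shown here. A minimal sketch of the idea, assuming a 32 KB per-event limit (an assumption to confirm against the API documentation), measures the UTF-8 length of the JSON-encoded payload. `request_size_ok` and `MAX_EVENT_BYTES` are hypothetical names.

```python
import json
from typing import Any, Dict

MAX_EVENT_BYTES = 32 * 1024  # assumed per-event limit; confirm against the API docs


def request_size_ok(payload: Dict[str, Any], limit: int = MAX_EVENT_BYTES) -> bool:
    """Return True if the JSON-encoded payload fits within `limit` bytes."""
    # default=str lets datetimes and similar objects be measured as strings
    encoded = json.dumps(payload, default=str).encode("utf-8")
    return len(encoded) <= limit


small = {"userId": "user_123", "event": "Feature Used", "properties": {"q": "test"}}
large = {"userId": "user_123", "event": "Large", "properties": {"blob": "x" * (33 * 1024)}}
print(request_size_ok(small), request_size_ok(large))
```

Measuring encoded bytes rather than character count matters for non-ASCII property values, which can take several bytes per character.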

## Standard Event Tracking Implementation

In [None]:
# Implementation: Create basic tracking event using EventManager
# Test the EventManager implementation
sample_event = event_manager.create_event(
    user_id="demo_user_001",
    template_name="page_viewed",
    properties={
        "page_name": "Product Catalog",
        "url": "https://example.com/products",
        "referrer": "https://google.com",
        "page_load_time": 1.25
    }
)

print("Basic event created using EventManager:")
print(json.dumps(sample_event, indent=2, default=str))

In [None]:
# Implementation: Send single event using EventManager
if ENVIRONMENT == "test":
    result = event_manager.send_event_with_fallback(sample_event)
    print(f"Event send result: {result}")
else:
    try:
        result = event_manager.send_event(sample_event)
        print(f"Event sent successfully: {result}")
    except Exception as e:
        print(f"Error sending event: {str(e)}")

## Ecommerce Event Tracking

In [None]:
# Implementation: Create product viewed event using EventManager
product_event = event_manager.create_ecommerce_event(
    user_id="demo_user_001",
    event_type="product_viewed",
    product_data={
        "product_id": "prod_widget_premium_001",
        "name": "Premium Widget Pro",
        "price": 149.99,
        "currency": "USD"
    },
    category="Electronics",
    brand="WidgetCorp",
    sku="WCP-PRO-001"
)

print("Product Viewed event created using EventManager:")
print(json.dumps(product_event, indent=2, default=str))

In [None]:
# Implementation: Create order completed event using EventManager
order_event = event_manager.create_ecommerce_event(
    user_id="demo_user_001",
    event_type="order_completed",
    product_data={
        "order_id": "order_premium_001",
        "total": 159.98,  # 149.99 + 9.99, matching the line items below
        "currency": "USD",
        "products": [
            {
                "product_id": "prod_widget_premium_001",
                "name": "Premium Widget Pro",
                "quantity": 1,
                "price": 149.99
            },
            {
                "product_id": "shipping_standard",
                "name": "Standard Shipping",
                "quantity": 1,
                "price": 9.99
            }
        ]
    },
    payment_method="credit_card",
    shipping_method="standard",
    discount_amount=0.00
)

print("Order Completed event created using EventManager:")
print(json.dumps(order_event, indent=2, default=str))

## Custom Event Types and Templates

In [None]:
# Implementation: Feature usage tracking using EventManager
feature_event = event_manager.create_event(
    user_id="demo_user_001",
    template_name="feature_used",
    properties={
        "feature_name": "search",
        "action": "query_executed",
        "search_query": "premium widgets",
        "results_count": 15,
        "search_time_ms": 245,
        "filters_applied": ["category:electronics", "price:100-200"]
    }
)

print("Feature Usage event created using EventManager:")
print(json.dumps(feature_event, indent=2, default=str))

In [None]:
# Implementation: User engagement tracking using EventManager
engagement_event = event_manager.create_event(
    user_id="demo_user_001",
    template_name="content_engaged",
    properties={
        "content_type": "blog_post",
        "content_id": "post_advanced_widgets_guide",
        "engagement_type": "read_completion",
        "read_percentage": 95,
        "time_spent_seconds": 285,
        "scroll_depth": 100,
        "shares": 0,
        "likes": 1
    }
)

print("Content Engagement event created using EventManager:")
print(json.dumps(engagement_event, indent=2, default=str))

## Batch Event Processing

In [None]:
# Implementation: Create multiple events for batch processing
def create_user_session_events(user_id: str, session_id: str) -> List[Dict[str, Any]]:
    """Create a realistic user session with multiple events."""
    
    events = []
    base_time = datetime.now(timezone.utc)
    
    # Session start
    events.append({
        "userId": user_id,
        "event": "Session Started",
        "properties": {
            "session_id": session_id,
            "platform": "web",
            "referrer": "https://google.com"
        },
        "timestamp": base_time
    })
    
    # Page views
    pages = [
        {"name": "Home", "url": "/", "time_offset": 5},
        {"name": "Products", "url": "/products", "time_offset": 30},
        {"name": "Product Detail", "url": "/products/widget-pro", "time_offset": 45},
        {"name": "Cart", "url": "/cart", "time_offset": 120}
    ]
    
    for page in pages:
        events.append({
            "userId": user_id,
            "event": "Page Viewed",
            "properties": {
                "session_id": session_id,
                "page_name": page["name"],
                "url": page["url"],
                "page_load_time": 1.2 + (page["time_offset"] * 0.01)
            },
            "timestamp": base_time + timedelta(seconds=page["time_offset"])
        })
    
    # Product interaction
    events.append({
        "userId": user_id,
        "event": "Product Added to Cart",
        "properties": {
            "session_id": session_id,
            "product_id": "prod_widget_pro_001",
            "product_name": "Widget Pro",
            "price": 99.99,
            "quantity": 1,
            "cart_total": 99.99
        },
        "timestamp": base_time + timedelta(seconds=135)
    })
    
    return events

# Create sample session events
session_events = create_user_session_events(
    user_id="demo_user_002",
    session_id=f"session_{uuid.uuid4().hex[:8]}"
)

print(f"Created {len(session_events)} events for user session:")
for i, event in enumerate(session_events[:2]):  # Show first 2
    print(f"  Event {i+1}: {event['event']} at {event['timestamp']}")

In [None]:
# Implementation: Batch event submission using EventManager
batch_results = event_manager.send_events_batch(
    events=session_events,
    optimize_batches=True
)

print("\nBatch submission results using EventManager:")
for result in batch_results:
    status_msg = f"Batch {result['batch_id']}: {result['status']} ({result['count']} events)"
    if result['status'] == 'failed':
        status_msg += f" - Error: {result.get('error', 'Unknown error')}"
    print(f"  {status_msg}")
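
What `optimize_batches=True` does internally is not shown in this notebook. One plausible approach, sketched below, greedily groups events so that each batch stays under a byte budget. `chunk_events` and `MAX_BATCH_BYTES` are hypothetical names, and the 500 KB figure is an assumption to confirm against the API documentation.

```python
import json
from typing import Any, Dict, List

MAX_BATCH_BYTES = 500 * 1024  # assumed batch limit; confirm against the API docs


def chunk_events(
    events: List[Dict[str, Any]], max_bytes: int = MAX_BATCH_BYTES
) -> List[List[Dict[str, Any]]]:
    """Greedily group events so the summed JSON size of each batch stays within max_bytes."""
    batches: List[List[Dict[str, Any]]] = []
    current: List[Dict[str, Any]] = []
    current_size = 0
    for event in events:
        # Size of the event alone; the batch envelope adds a little overhead on top
        size = len(json.dumps(event, default=str).encode("utf-8"))
        if size > max_bytes:
            raise ValueError("single event exceeds the batch size limit")
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(event)
        current_size += size
    if current:
        batches.append(current)
    return batches


events = [{"userId": f"u{i}", "event": "Page Viewed"} for i in range(10)]
batches = chunk_events(events, max_bytes=200)
print([len(batch) for batch in batches])
```

Because the sketch counts only the events themselves, a real implementation would leave some headroom below the limit for the surrounding request body.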

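## Event Context and Enrichment

Context attaches platform, device, and locale details to an event. The custom templates registered in the next section are used by the enriched web and mobile events that follow.
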
## Custom Event Templates

In [None]:
# Register custom event templates
newsletter_template = EventTemplate(
    name="Newsletter Signup",
    category=EventCategory.ENGAGEMENT,
    priority=EventPriority.HIGH,
    required_properties=["newsletter_type", "signup_source"],
    default_properties={"platform": "web"}
)

push_notification_template = EventTemplate(
    name="Push Notification Opened",
    category=EventCategory.ENGAGEMENT,
    priority=EventPriority.NORMAL,
    required_properties=["notification_id", "action_taken"],
    default_properties={"platform": "mobile"}
)

# Register templates with EventManager
event_manager.register_template(newsletter_template)
event_manager.register_template(push_notification_template)

print("Custom event templates registered:")
print(f"  Newsletter Signup: {newsletter_template.required_properties}")
print(f"  Push Notification Opened: {push_notification_template.required_properties}")
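
To make the role of `required_properties` and `default_properties` concrete, here is a minimal sketch of how a template might merge defaults and reject incomplete events. `SimpleTemplate` is a hypothetical stand-in written for this example; the real `EventTemplate` in `utils.event_manager` may behave differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SimpleTemplate:
    """Hypothetical stand-in for EventTemplate, for illustration only."""

    name: str
    required_properties: List[str] = field(default_factory=list)
    default_properties: Dict[str, Any] = field(default_factory=dict)

    def build(self, user_id: str, properties: Dict[str, Any]) -> Dict[str, Any]:
        # Defaults first so caller-supplied values take precedence
        merged = {**self.default_properties, **properties}
        missing = [key for key in self.required_properties if key not in merged]
        if missing:
            raise ValueError(f"{self.name}: missing required properties {missing}")
        return {"userId": user_id, "event": self.name, "properties": merged}


template = SimpleTemplate(
    name="Newsletter Signup",
    required_properties=["newsletter_type", "signup_source"],
    default_properties={"platform": "web"},
)
event = template.build(
    "demo_user_003",
    {"newsletter_type": "weekly_product_updates", "signup_source": "product_page_footer"},
)
print(event["properties"])
```

Failing at build time means an incomplete event never reaches the network call, which keeps validation errors separate from delivery errors.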

In [None]:
# Implementation: Create enriched events using EventManager with custom templates
enriched_web_event = event_manager.create_enriched_event(
    user_id="demo_user_003",
    template_name="newsletter_signup",
    properties={
        "newsletter_type": "weekly_product_updates",
        "signup_source": "product_page_footer",
        "email": "demo@example.com"
    },
    platform="web",
    ip="192.168.1.100",
    user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    url="https://example.com/products",
    referrer="https://google.com/search",
    locale="en-US",
    timezone="America/New_York"
)

print("Enriched Web Event using EventManager:")
print(json.dumps(enriched_web_event, indent=2, default=str))

In [None]:
# Test enriched mobile event using EventManager
enriched_mobile_event = event_manager.create_enriched_event(
    user_id="demo_user_003",
    template_name="push_notification_opened",
    properties={
        "notification_id": "notif_promo_001",
        "campaign_name": "Weekend Sale Alert",
        "action_taken": "opened_app",
        "time_to_open_seconds": 45
    },
    platform="mobile",
    app_name="Customer IO Demo App",
    app_version="2.1.0",
    os_name="iOS",
    os_version="17.2",
    device_model="iPhone 15 Pro",
    device_id="device_demo_001",
    locale="en-US",
    timezone="America/Los_Angeles"
)

print("Enriched Mobile Event using EventManager:")
print(json.dumps(enriched_mobile_event, indent=2, default=str))
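
The flat keyword arguments passed to `create_enriched_event` presumably end up in a nested context object. The sketch below shows one way that grouping could work. `build_context` is a hypothetical helper, and the nested layout (`page`, `app`, `os`, `device`, `userAgent`) follows a common Segment-style convention, which is an assumption about the output rather than something this notebook confirms.

```python
from typing import Any, Dict


def build_context(platform: str, **details: Any) -> Dict[str, Any]:
    """Group flat keyword arguments into a nested context dict (illustrative layout)."""
    context: Dict[str, Any] = {}
    # Fields shared by web and mobile events
    for key in ("ip", "locale", "timezone"):
        if key in details:
            context[key] = details[key]
    if platform == "web":
        if "user_agent" in details:
            context["userAgent"] = details["user_agent"]
        page = {k: details[k] for k in ("url", "referrer") if k in details}
        if page:
            context["page"] = page
    elif platform == "mobile":
        groups = {
            "app": {"name": "app_name", "version": "app_version"},
            "os": {"name": "os_name", "version": "os_version"},
            "device": {"id": "device_id", "model": "device_model"},
        }
        for group, mapping in groups.items():
            values = {name: details[arg] for name, arg in mapping.items() if arg in details}
            if values:
                context[group] = values
    return context


web_context = build_context(
    "web", ip="192.168.1.100", url="https://example.com/products", locale="en-US"
)
mobile_context = build_context(
    "mobile", os_name="iOS", os_version="17.2", device_id="device_demo_001"
)
print(web_context)
print(mobile_context)
```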

## Data-Driven Event Generation from Spark

In [None]:
# Load sample data for event generation
print("=== Data-Driven Event Generation ===")

# Load customer and event data
customers_sample = spark.table(f"{CATALOG_NAME}.{DATABASE_NAME}.customers").limit(3)
events_sample = spark.table(f"{CATALOG_NAME}.{DATABASE_NAME}.events").limit(5)

print("Sample customers:")
customers_sample.select("user_id", "email", "plan", "created_at").show()

print("Sample events:")
events_sample.select("user_id", "event_name", "timestamp").show()

In [None]:
# Transform Spark data to Customer.IO events
print("Transforming Spark data to Customer.IO events:")

# Transform events using the EventTransformer
track_requests = EventTransformer.spark_to_track_requests(
    df=events_sample,
    user_id_col="user_id",
    event_name_col="event_name",
    properties_cols=["properties"],
    timestamp_col="timestamp"
)

print(f"Generated {len(track_requests)} track requests from Spark data")

# Show sample transformed event
if track_requests:
    print("\nSample transformed event:")
    print(json.dumps(track_requests[0], indent=2, default=str))
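
`EventTransformer.spark_to_track_requests` is defined in `utils.transformers` and is not shown here. The core mapping can be sketched without Spark by treating each row as a plain dict, such as the result of PySpark's `Row.asDict()`. `rows_to_track_requests` is a hypothetical function, and the handling of a JSON-string `properties` column is an assumption about the table layout.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List


def rows_to_track_requests(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map row dicts (for example from Row.asDict()) to track-style payloads."""
    requests = []
    for row in rows:
        props = row.get("properties") or {}
        if isinstance(props, str):
            # Properties stored as a JSON string column are decoded here
            props = json.loads(props)
        ts = row.get("timestamp")
        requests.append({
            "userId": row["user_id"],
            "event": row["event_name"],
            "properties": props,
            "timestamp": ts.isoformat() if isinstance(ts, datetime) else ts,
        })
    return requests


rows = [
    {
        "user_id": "u1",
        "event_name": "Page Viewed",
        "properties": '{"page_name": "Home"}',
        "timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc),
    },
]
track = rows_to_track_requests(rows)
print(json.dumps(track[0], indent=2))
```

For a small sample like the one above, the row dicts could come from `[row.asDict() for row in events_sample.collect()]`; larger tables would need a distributed approach rather than collecting to the driver.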

In [None]:
# Process the transformed events using EventManager
if track_requests:
    print("Processing transformed events using EventManager batch:")
    
    # Submit as batch using EventManager
    spark_batch_results = event_manager.send_events_batch(
        events=track_requests,
        optimize_batches=True
    )
    
    print("\nSpark-to-Customer.IO batch results using EventManager:")
    for result in spark_batch_results:
        status_msg = f"Batch {result['batch_id']}: {result['status']} ({result['count']} events)"
        if result['status'] == 'failed':
            status_msg += f" - Error: {result.get('error', 'Unknown error')}"
        print(f"  {status_msg}")
else:
    print("No events to process from Spark data")

## Performance Monitoring and Metrics

In [None]:
# Implementation: Event tracking metrics using EventManager
def track_event_metrics():
    """Display current event tracking performance metrics."""
    
    print("=== Event Tracking Metrics ===")
    
    # Get EventManager metrics
    manager_metrics = event_manager.get_metrics()
    
    # Template information
    print("Event Templates:")
    print(f"  Registered templates: {manager_metrics['templates']['registered_count']}")
    print(f"  Available templates: {', '.join(manager_metrics['templates']['template_names'])}")
    
    # Rate limiting status
    rate_limit = manager_metrics['client']['rate_limit']
    print("\nRate Limiting:")
    print(f"  Current requests: {rate_limit['current_requests']}")
    print(f"  Max requests: {rate_limit['max_requests']}")
    print(f"  Can make request: {rate_limit['can_make_request']}")
    
    # Client configuration
    print("\nClient Configuration:")
    print(f"  Base URL: {manager_metrics['client']['base_url']}")
    print(f"  Max retries: {manager_metrics['client']['max_retries']}")
    
    return manager_metrics

# Display metrics
metrics = track_event_metrics()
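
The `current_requests`, `max_requests`, and `can_make_request` values reported above suggest a windowed rate limiter inside the client. The client's actual algorithm is not shown in this notebook; the sketch below is one common way to produce those three values, using a sliding window of request timestamps. `SlidingWindowLimiter` is a hypothetical class, and the injectable `clock` exists only to make the example deterministic.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Track request timestamps within a rolling window (illustrative sketch)."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def _prune(self) -> None:
        # Drop timestamps that have aged out of the window
        cutoff = self._clock() - self.window_seconds
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()

    def can_make_request(self) -> bool:
        self._prune()
        return len(self._stamps) < self.max_requests

    def record_request(self) -> None:
        self._stamps.append(self._clock())

    def status(self) -> Dict[str, object]:
        self._prune()
        return {
            "current_requests": len(self._stamps),
            "max_requests": self.max_requests,
            "can_make_request": len(self._stamps) < self.max_requests,
        }


now = [0.0]  # fake clock so the example does not depend on real time
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=1.0, clock=lambda: now[0])
limiter.record_request()
limiter.record_request()
print(limiter.status())
now[0] = 1.5
print(limiter.can_make_request())
```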

## Error Handling and Recovery Patterns

In [None]:
# Implementation: Robust event sending using EventManager
test_event = {
    "userId": "demo_user_resilience",
    "event": "Error Handling Test",
    "properties": {
        "test_type": "retry_mechanism",
        "timestamp": datetime.now(timezone.utc).isoformat()
    },
    "timestamp": datetime.now(timezone.utc)
}

try:
    result = event_manager.send_event(test_event)
    print(f"Event sent using EventManager: {result}")
except Exception as e:
    print(f"Event failed: {str(e)}")
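
The client was configured earlier with `max_retries=3` and `retry_backoff_factor=2.0`. How the client applies those settings is not shown, but the usual pattern is exponential backoff, sketched below. `call_with_retry` is a hypothetical helper, `base_delay` is an assumed parameter, and the injectable `sleep` lets the example run without real waiting.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_retries: int = 3,
    backoff_factor: float = 2.0,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on exception with exponentially growing delays."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            # A production version would retry only transient errors
            if attempt >= max_retries:
                raise
            sleep(base_delay * (backoff_factor ** attempt))
            attempt += 1


calls = {"count": 0}
delays: List[float] = []


def flaky_send() -> str:
    """Fail twice, then succeed, to exercise the retry path."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "sent"


result = call_with_retry(flaky_send, sleep=delays.append)
print(result, delays)
```

With these settings the delays grow as 1, 2, and 4 seconds before the final failure is re-raised to the caller.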

In [None]:
# Implementation: Graceful error handling using EventManager fallback
fallback_result = event_manager.send_event_with_fallback(
    event_data=test_event,
    save_failed_events=True
)
print(f"Event with fallback using EventManager: {fallback_result}")
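
The fallback pattern trades an exception for a status result and a saved copy of the event. The sketch below shows the idea with an in-memory list as the store. `send_with_fallback` here is a hypothetical free function, not the `EventManager` method, and where the real implementation persists failed events when `save_failed_events=True` is not shown in this notebook.

```python
import json
from typing import Any, Callable, Dict, List


def send_with_fallback(
    event: Dict[str, Any],
    send: Callable[[Dict[str, Any]], Any],
    failed_store: List[str],
) -> Dict[str, Any]:
    """Try to send; on failure, keep a JSON copy of the event for later replay."""
    try:
        send(event)
        return {"status": "sent"}
    except Exception as exc:
        # default=str keeps datetimes and similar values serializable
        failed_store.append(json.dumps(event, default=str))
        return {"status": "failed", "error": str(exc), "saved": True}


def always_fails(event: Dict[str, Any]) -> None:
    """Stand-in sender that simulates a delivery failure."""
    raise TimeoutError("simulated timeout")


failed: List[str] = []
outcome = send_with_fallback({"userId": "u1", "event": "Test"}, always_fails, failed)
print(outcome["status"], len(failed))
```

Storing the serialized event rather than the original dict means the saved copy cannot be changed by later mutation of the event object.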

## Clean Up and Summary

In [None]:
# Final metrics and cleanup
print("=== Final Event Tracking Summary ===")

# Display final metrics
final_metrics = track_event_metrics()

print("\n=== Events Created in This Session ===")
print("SUCCESS: Basic page view event")
print("SUCCESS: Product viewed semantic event")
print("SUCCESS: Order completed semantic event")
print("SUCCESS: Feature usage tracking event")
print("SUCCESS: Content engagement event")
print("SUCCESS: User session events (batch)")
print("SUCCESS: Enriched events with context")
print("SUCCESS: Data-driven events from Spark")
print("SUCCESS: Error handling and retry patterns")

print("\n=== Key Capabilities Demonstrated ===")
print("SUCCESS: Type-safe event creation with Pydantic validation")
print("SUCCESS: Semantic event schemas for ecommerce tracking")
print("SUCCESS: Batch processing with size optimization")
print("SUCCESS: Rich context integration (web and mobile)")
print("SUCCESS: Data transformation from Spark to Customer.IO")
print("SUCCESS: Comprehensive error handling and recovery")
print("SUCCESS: Rate limiting protection")
print("SUCCESS: Performance monitoring and metrics")

In [None]:
# Close the API client connection
client.close()
print("SUCCESS: API client connection closed")

print("\nCOMPLETED: Event tracking notebook finished successfully!")
print("Ready for people management operations in the next notebook.")