# S3 Quickstart: Screen & Events

This notebook demonstrates reading `NormalizedEvent` JSONL from S3 (with a local fallback),
computing a few KPIs, and showing a minimal alignment placeholder.

## 1) Environment & dependencies
Install extras if needed:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e '.[video,s3,dev]'
```

If a bucket is set, the notebook tries to read from S3 using `boto3`. If no bucket is set, or the S3 call fails (missing credentials or a client error), it falls back to `examples/demo_events.jsonl`.

In [None]:
import json
from pathlib import Path
import pandas as pd
import boto3
from botocore.exceptions import NoCredentialsError, ClientError

LOCAL = Path('examples/demo_events.jsonl')

def load_events_from_local(path: Path) -> pd.DataFrame:
    rows = [json.loads(l) for l in path.read_text(encoding='utf-8').splitlines() if l.strip()]
    return pd.DataFrame(rows)

def load_events_from_s3(bucket: str, prefix: str = '') -> pd.DataFrame:
    s3 = boto3.client('s3')
    # list_objects_v2 returns at most 1,000 keys per call, so paginate.
    paginator = s3.get_paginator('list_objects_v2')
    rows = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for o in page.get('Contents', []):
            resp = s3.get_object(Bucket=bucket, Key=o['Key'])
            body = resp['Body'].read().decode('utf-8')
            for line in body.splitlines():
                if line.strip():
                    rows.append(json.loads(line))
    return pd.DataFrame(rows)

# Try S3, else local
def load_events(bucket=None, prefix=None):
    if bucket:
        try:
            df = load_events_from_s3(bucket, prefix or '')
            # Use `prefix or ''` so a missing prefix does not print as "None"
            print(f"Loaded {len(df)} events from s3://{bucket}/{prefix or ''}")
            return df
        except (NoCredentialsError, ClientError) as e:
            print('S3 load failed, falling back to local demo:', e)
    if LOCAL.exists():
        df = load_events_from_local(LOCAL)
        print(f'Loaded {len(df)} events from {LOCAL}')
        return df
    raise RuntimeError('No events source available')
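
Both loaders above expect one JSON object per line. The following is a minimal, standard-library sketch of that parsing step, not the library's own reader. The `parse_jsonl` helper, the sample records, and the `tap` kind are hypothetical; only the `kind` and `t_event_ms` fields come from this notebook. Unlike the loaders above, it counts and skips malformed lines instead of raising, which may or may not be the behavior you want.

```python
import json

def parse_jsonl(text: str) -> tuple[list[dict], int]:
    """Parse JSONL text and return (records, number_of_skipped_lines)."""
    records, skipped = [], 0
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are ignored, as in the loaders above
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # malformed line: count it and move on
            continue
        if isinstance(obj, dict):
            records.append(obj)
        else:
            skipped += 1  # valid JSON, but not an object
    return records, skipped

# Hypothetical sample data; the last line is deliberately truncated.
sample = "\n".join([
    '{"kind": "session_start", "t_event_ms": 1200}',
    '',
    '{"kind": "tap", "t_event_ms": 1850}',
    '{"kind": "tap", "t_event_ms": ',
])
records, skipped = parse_jsonl(sample)
print(len(records), 'records,', skipped, 'skipped')
```

Reporting the skipped count makes silent data loss visible; for strict pipelines, raising on the first bad line (as the loaders above do) is the safer default.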

## 2) Compute simple KPIs
Load events and compute totals, per-kind counts, and median timestamp.

In [None]:
# Replace with your bucket/prefix if you want to try S3
bucket = None
prefix = None
df = load_events(bucket, prefix)

print('Total events:', len(df))
print(df['kind'].value_counts())
print('Median t_event_ms:', int(df['t_event_ms'].median()))
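
The KPI cell above assumes the frame is non-empty and has both `kind` and `t_event_ms` columns. As a hedged sketch, the same three KPIs can be wrapped in a helper that tolerates an empty frame (for example, an S3 prefix that matches no objects). `compute_kpis` and the demo rows are hypothetical names and data, not part of the library.

```python
import pandas as pd

def compute_kpis(df: pd.DataFrame) -> dict:
    """Summarize an events frame; tolerate an empty frame or missing columns."""
    kpis = {'total': len(df), 'per_kind': {}, 'median_t_event_ms': None}
    if 'kind' in df.columns:
        # value_counts() gives the per-kind event counts
        kpis['per_kind'] = df['kind'].value_counts().to_dict()
    if 't_event_ms' in df.columns and not df['t_event_ms'].dropna().empty:
        kpis['median_t_event_ms'] = float(df['t_event_ms'].median())
    return kpis

# Hypothetical demo rows using the two fields this notebook relies on.
demo = pd.DataFrame([
    {'kind': 'session_start', 't_event_ms': 1200},
    {'kind': 'tap', 't_event_ms': 1850},
    {'kind': 'tap', 't_event_ms': 2400},
])
print(compute_kpis(demo))
print(compute_kpis(pd.DataFrame()))  # empty input yields zero totals, no median
```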

## 3) Minimal alignment placeholder
This cell is a minimal placeholder showing how to estimate the offset between the video gate (`app_open`) and event timestamps. For full alignment, use the library's `correlate/align.py` logic.

In [None]:
# Minimal example: assume app_open at video t=0 and find earliest session_start event
session_starts = df[df['kind'] == 'session_start']
if not session_starts.empty:
    first_event = int(session_starts['t_event_ms'].min())
    print('Earliest session_start t_event_ms =', first_event)
    print('Estimated offset (video->event):', first_event, 'ms')
else:
    print('No session_start events found in dataset')
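
Once an offset is estimated, it can be used to map event timestamps onto the video timeline. The sketch below illustrates that idea under strong assumptions: a single constant offset, no clock drift, and the gate at a known video time. The helper names `estimate_offset_ms` and `to_video_ms`, the `gate_video_ms` parameter, and the sample events are hypothetical and do not reflect what `correlate/align.py` does.

```python
def estimate_offset_ms(events, gate_kind='session_start', gate_video_ms=0):
    """Return offset such that t_video_ms = t_event_ms - offset, or None."""
    gate_times = [e['t_event_ms'] for e in events if e.get('kind') == gate_kind]
    if not gate_times:
        return None  # no gate event, so no offset can be estimated
    # Anchor the earliest gate event to the assumed video time of the gate.
    return min(gate_times) - gate_video_ms

def to_video_ms(t_event_ms, offset_ms):
    """Map an event timestamp onto the video timeline using a constant offset."""
    return t_event_ms - offset_ms

# Hypothetical sample events.
events = [
    {'kind': 'session_start', 't_event_ms': 1200},
    {'kind': 'tap', 't_event_ms': 1850},
]
offset = estimate_offset_ms(events)
if offset is not None:
    print('offset_ms =', offset)
    print('tap at video t =', to_video_ms(1850, offset), 'ms')
```

Returning `None` when no gate event exists keeps the "no `session_start` found" case explicit rather than silently defaulting to a zero offset.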

## End-to-end notes
This notebook is intended as a quick, reproducible starting point for data teams. For production workflows, use the `s2e run` CLI and the adapters in `src/screen2events/events/`.