# GameHistoryCollector Example

This notebook demonstrates how to use the GameHistoryCollector to automatically collect game data from the gameHistory[] rolling window.

## Overview

The GameHistoryCollector provides:
- **Automatic deduplication** by gameId
- **Passive collection** from gameStateUpdate events
- **JSONL storage** compatible with existing recordings
- **RL training export** functionality

In [None]:
# Setup: add the project's jupyter directory to the import path
import sys

from _paths import *  # provides CLAUDE_FLOW_ROOT

sys.path.insert(0, str(CLAUDE_FLOW_ROOT / 'jupyter'))

## Basic Usage with Mock Data

First, let's try the collector with mock data to understand how it works:

In [None]:
from lib import MockGameHistoryCollector
import tempfile
from pathlib import Path

# Create a temporary directory for this demo
demo_dir = Path(tempfile.mkdtemp())
print(f"Demo storage: {demo_dir}")

# Initialize mock collector
collector = MockGameHistoryCollector(storage_dir=demo_dir, auto_save=False)
collector.start_collecting()

print("\nMock games generated!")
print(f"Total collected: {collector.stats['total_collected']}")

### View Collected Games

In [None]:
import pandas as pd

games = collector.get_collected_games()

# Create a summary DataFrame
df = pd.DataFrame([
    {
        'game_id': g['id'],
        'timestamp': pd.to_datetime(g['timestamp'], unit='ms'),
        'num_ticks': len(g['prices']),
        'peak_price': max(g['prices'], default=float('nan')),  # NaN if a game has no ticks
        'rug_point': g['rugPoint']
    }
    for g in games
])

display(df)

### Visualize a Game

In [None]:
import matplotlib.pyplot as plt

if games:
    game = games[0]
    
    plt.figure(figsize=(12, 6))
    plt.plot(game['prices'], linewidth=2)
    plt.axhline(y=game['rugPoint'], color='r', linestyle='--', label=f"Rug Point: {game['rugPoint']:.2f}x")
    plt.title(f"Game {game['id']} - Price History")
    plt.xlabel('Tick')
    plt.ylabel('Price Multiplier')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()

### Test Deduplication

In [None]:
# Feed the same games through the collector a second time.
# Note: _process_game is a private method, called directly here only for the demo.
initial_count = collector.stats['total_collected']

for game in games:
    collector._process_game(game)

print(f"Initial: {initial_count} games")
print(f"After re-processing: {collector.stats['total_collected']} games")
print(f"Duplicates skipped: {collector.stats['duplicates_skipped']}")
print("\n✓ Deduplication working correctly!")

### Validate Game Structure

In [None]:
validation = collector.validate_game_structure()

print(f"Games analyzed: {validation['games_analyzed']}")
print(f"Unique fields: {validation['total_unique_fields']}")
print("\nField Coverage:")

# Create DataFrame for better display
field_df = pd.DataFrame([
    {
        'field': field,
        'coverage': info['coverage'],
        'types': ', '.join(info['types'])
    }
    for field, info in validation['fields'].items()
])

display(field_df)

## Statistics

In [None]:
stats = collector.get_statistics()

print("Collection Statistics:")
print("=" * 40)
for key, value in stats.items():
    print(f"{key:.<30} {value}")

## Export for RL Training

In [None]:
export_path = collector.export_for_rl_training(
    output_file=demo_dir / "rl_training_export.jsonl"
)

print(f"Exported to: {export_path}")
print(f"File size: {export_path.stat().st_size} bytes")

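# Preview the first exported game, truncated to 500 characters
import json

print("\nFirst game in export:")
with open(export_path, 'r', encoding='utf-8') as f:
    first_line = f.readline()

if first_line.strip():
    game = json.loads(first_line)
    print(json.dumps(game, indent=2)[:500] + "...")
else:
    print("(export file is empty)")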

## Integration with CDPCapture

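**Note:** This requires a live Chrome connection with the Chrome DevTools Protocol (CDP) enabled, so the block below is shown as reference code rather than as an executable cell.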

```python
from lib import CDPCapture, GameHistoryCollector

# Connect to Chrome
capture = CDPCapture()
if capture.connect():
    # Attach game history collector
    collector = GameHistoryCollector()
    collector.attach_to_capture(capture)
    
    print("Collecting games from live session...")
    # Games are now collected automatically from gameStateUpdate events
    
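    # Report stats every 30 seconds until interrupted with Ctrl+C
    import time
    try:
        while True:
            time.sleep(30)
            stats = collector.get_statistics()
            print(f"Collected: {stats['total_collected']}, Skipped: {stats['duplicates_skipped']}")
    except KeyboardInterrupt:
        print("Stopped monitoring")
else:
    print("Could not connect to Chrome; check that CDP is enabled")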
```

## Cleanup

In [None]:
collector.stop_collecting()
print("Collection stopped")

# Cleanup demo directory
import shutil
shutil.rmtree(demo_dir)
print(f"Cleaned up: {demo_dir}")

## Summary

The GameHistoryCollector provides:

1. **Zero-effort collection**: Passive monitoring during live sessions
2. **Automatic deduplication**: Tracks game IDs across sessions
3. **Storage efficiency**: JSONL format compatible with existing tools
4. **RL training ready**: Export functionality for training pipelines
5. **Data validation**: Built-in structure analysis
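
Points 2 and 3 can be sketched together: with JSONL (one JSON object per line), the IDs already on disk can be reloaded at the start of a session so that only unseen games are appended. This is a minimal sketch under assumptions, not the collector's actual storage code; `load_seen_ids`, `append_new_games`, and the `games.jsonl` file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterable, Set


def load_seen_ids(path: Path) -> Set[str]:
    """Read the ids already stored in a JSONL file (one JSON object per line)."""
    if not path.exists():
        return set()
    with path.open("r", encoding="utf-8") as f:
        return {json.loads(line)["id"] for line in f if line.strip()}


def append_new_games(path: Path, games: Iterable[Dict[str, Any]]) -> int:
    """Append games whose id is not yet in the file; return how many were written."""
    seen = load_seen_ids(path)
    written = 0
    with path.open("a", encoding="utf-8") as f:
        for game in games:
            if game["id"] in seen:
                continue
            f.write(json.dumps(game) + "\n")
            seen.add(game["id"])
            written += 1
    return written


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "games.jsonl"
    session_one = append_new_games(store, [{"id": "g1", "prices": [1.0, 1.4]}])
    # A later session sees g1 again plus one new game.
    session_two = append_new_games(
        store,
        [{"id": "g1", "prices": [1.0, 1.4]}, {"id": "g2", "prices": [1.0, 0.2]}],
    )
    stored_ids = sorted(load_seen_ids(store))

print(session_one, session_two, stored_ids)
```

Appending line by line keeps the file valid after every write, which is one reason JSONL suits long-running passive collection.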

### Value Proposition

| Metric | Manual CDP | GameHistoryCollector |
|--------|------------|----------------------|
| Effort | High | Zero |
| Data completeness | Session-dependent | Rolling window |
| Deduplication | Manual | Automatic |
| Historical depth | 929 games | ~10 recent (instant) |

### Next Steps

1. Connect to live rugs.fun session via CDP
2. Attach GameHistoryCollector
3. Let it run passively
4. Export games for RL training
5. Validate collected data structure
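
For step 5, the kind of structure report shown earlier (per-field coverage and observed types) can be approximated on any list of game dictionaries. The sketch below is an assumption about what such an analysis might compute, not the behavior of `validate_game_structure()`. `field_coverage` is a hypothetical helper, and it reports coverage as a fraction between 0 and 1, whereas the library may use a different scale.

```python
from collections import defaultdict
from typing import Any, Dict, List


def field_coverage(games: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Per top-level field: fraction of games containing it and the value types seen."""
    counts = defaultdict(int)
    types = defaultdict(set)
    for game in games:
        for field, value in game.items():
            counts[field] += 1
            types[field].add(type(value).__name__)
    total = len(games)
    # An empty input yields an empty report, so no division by zero occurs.
    return {
        field: {
            "coverage": counts[field] / total,
            "types": sorted(types[field]),
        }
        for field in counts
    }


# Illustrative records only; real games carry more fields.
sample = [
    {"id": "g1", "prices": [1.0, 1.2], "rugPoint": 1.2},
    {"id": "g2", "prices": [1.0]},
]
report = field_coverage(sample)
for name, info in report.items():
    print(name, info)
```

A field with coverage below 1.0, such as `rugPoint` in the sample, marks records that a training pipeline would need to filter or impute.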

See `GAME_HISTORY_COLLECTOR_GUIDE.md` for complete documentation.