# Gaming Events Social Scraper

This notebook demonstrates the Gaming Events Scraper using the refactored Python modules.

## Features
- Parse Facebook public posts and events for gaming events
- Monitor Discord servers for event announcements
- Extract structured data: game system, venue, date, time
- Save events in consistent JSON schema
- Extensible for additional platforms

## Installation

First, make sure you have installed the package:

```bash
# Install the project and its dependencies with uv
uv sync
```

In [1]:
# Import the refactored modules
from src.gaming_events_scraper import (
    GamingEvent,
    EventExtractor,
    FacebookEventScraper,
    DiscordEventScraper,
    EventStorage,
    GamingEventsScraper
)

import json

## Quick Start Example

Let's create a simple example using the refactored modules.

In [2]:
# Create a sample gaming event
sample_event = GamingEvent(
    title="Friday Night Magic - Modern",
    game_system="MTG",
    venue="Local Card Shop",
    date="2024-07-18",
    start_time="19:00",
    source="example"
)

print("Sample Gaming Event:")
print(f"Title: {sample_event.title}")
print(f"Game: {sample_event.game_system}")
print(f"Venue: {sample_event.venue}")
print(f"Date: {sample_event.date}")
print(f"Time: {sample_event.start_time}")
print(f"Extracted at: {sample_event.extracted_at}")

# Convert to dictionary
print("\nAs JSON:")
print(json.dumps(sample_event.to_dict(), indent=2))

Sample Gaming Event:
Title: Friday Night Magic - Modern
Game: MTG
Venue: Local Card Shop
Date: 2024-07-18
Time: 19:00
Extracted at: 2025-08-03T03:46:31.830601+00:00

As JSON:
{
  "title": "Friday Night Magic - Modern",
  "game_system": "MTG",
  "venue": "Local Card Shop",
  "date": "2024-07-18",
  "start_time": "19:00",
  "source": "example",
  "source_url": null,
  "description": null,
  "extracted_at": "2025-08-03T03:46:31.830601+00:00"
}
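
The JSON above suggests a flat record with two optional fields and an automatic timestamp. As a rough sketch only, a model of that shape could be written as a dataclass. `EventSketch` and `_utc_now_iso` are hypothetical names, and the real `GamingEvent` in `src.gaming_events_scraper` may be defined differently.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now_iso() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class EventSketch:
    """Hypothetical stand-in for GamingEvent; fields mirror the JSON keys above."""

    title: str
    game_system: str
    venue: str
    date: str            # ISO date, e.g. "2024-07-18"
    start_time: str      # 24-hour clock, e.g. "19:00"
    source: str
    source_url: Optional[str] = None
    description: Optional[str] = None
    # Filled in at construction time so every record carries its extraction time.
    extracted_at: str = field(default_factory=_utc_now_iso)

    def to_dict(self) -> dict:
        """Return a plain dict that json.dumps can serialize directly."""
        return asdict(self)


event = EventSketch(
    "Friday Night Magic - Modern", "MTG", "Local Card Shop",
    "2024-07-18", "19:00", "example",
)
print(list(event.to_dict()))
```

Using `default_factory` rather than a plain default means the timestamp is computed per instance instead of once at import time.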


## Text Extraction Testing

Test the extraction logic with sample text.

In [3]:
# Test extraction with sample data
extractor = EventExtractor()

sample_texts = [
    "Friday Night Magic - Modern format at Card Kingdom, July 18th 7:00 PM",
    "Warhammer 40K tournament this Saturday 10 AM at Games Workshop",
    "D&D Adventurers League - Wed Aug 15 6:30pm at Local Game Store",
    "Pokemon League Challenge on 8/20/2024 starting at 12:00 PM"
]

print("Testing extraction logic:")
print("=" * 50)

for i, text in enumerate(sample_texts, 1):
    print(f"\nSample {i}: {text}")
    print(f"Game System: {extractor.extract_game_system(text)}")
    print(f"Date: {extractor.extract_date(text)}")
    print(f"Time: {extractor.extract_time(text)}")
    print(f"Contains gaming keywords: {extractor.contains_gaming_keywords(text)}")

Testing extraction logic:
==================================================

Sample 1: Friday Night Magic - Modern format at Card Kingdom, July 18th 7:00 PM
Game System: MTG
Date: 2025-07-18
Time: 7:00 PM
Contains gaming keywords: True

Sample 2: Warhammer 40K tournament this Saturday 10 AM at Games Workshop
Game System: Warhammer
Date: None
Time: 10:00 AM
Contains gaming keywords: True

Sample 3: D&D Adventurers League - Wed Aug 15 6:30pm at Local Game Store
Game System: D&D
Date: 2025-08-15
Time: 6:30 PM
Contains gaming keywords: True

Sample 4: Pokemon League Challenge on 8/20/2024 starting at 12:00 PM
Game System: Pokemon
Date: 2024-08-20
Time: 12:00 PM
Contains gaming keywords: True
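
The times in the output are normalized to one `H:MM AM/PM` form even though the inputs vary (`7:00 PM`, `10 AM`, `6:30pm`). One way this could be done is with a single regular expression, sketched below. `normalize_time` is a hypothetical helper and is not necessarily how `EventExtractor.extract_time` works.

```python
import re
from typing import Optional

# Hour (1-12), optional ":MM", optional spaces, then AM/PM in any case.
_TIME_RE = re.compile(r"\b(1[0-2]|0?[1-9])(?::([0-5]\d))?\s*([AaPp])[Mm]\b")


def normalize_time(text: str) -> Optional[str]:
    """Return the first 12-hour time found in text as 'H:MM AM/PM', or None."""
    match = _TIME_RE.search(text)
    if match is None:
        return None
    hour, minutes, meridiem = match.groups()
    # Missing minutes ("10 AM") default to ":00".
    return f"{int(hour)}:{minutes or '00'} {meridiem.upper()}M"


samples = [
    "Friday Night Magic - Modern format at Card Kingdom, July 18th 7:00 PM",
    "Warhammer 40K tournament this Saturday 10 AM at Games Workshop",
    "D&D Adventurers League - Wed Aug 15 6:30pm at Local Game Store",
    "Pokemon League Challenge on 8/20/2024 starting at 12:00 PM",
]
for sample in samples:
    print(normalize_time(sample))
```

Because the pattern requires an AM/PM suffix, digits in `40K`, `18th`, and `8/20/2024` are not mistaken for times.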


## Event Storage Testing

Test event storage and retrieval functionality.

In [4]:
# Create sample events for testing storage
sample_events = [
    GamingEvent(
        title="Friday Night Magic",
        game_system="MTG",
        venue="Card Kingdom",
        date="2024-08-02",
        start_time="19:00",
        source="facebook"
    ),
    GamingEvent(
        title="Warhammer Tournament",
        game_system="Warhammer",
        venue="Games Workshop",
        date="2024-08-03",
        start_time="10:00",
        source="discord"
    ),
    GamingEvent(
        title="D&D Adventure League",
        game_system="D&D",
        venue="Community Center",
        date="2024-08-05",
        start_time="18:30",
        source="facebook"
    )
]

# Initialize storage
storage = EventStorage("demo_events_data")

# Save events
filepath = storage.save_events(sample_events, "demo_events.json")
print(f"Saved events to: {filepath}")

# Load events back
loaded_events = storage.load_events("demo_events.json")
print(f"\nLoaded {len(loaded_events)} events:")
for event in loaded_events:
    print(f"- {event.title} ({event.game_system})")

# Test filtering
mtg_events = storage.get_events_by_game("MTG")
print(f"\nMTG events: {len(mtg_events)}")

facebook_events = storage.get_events_by_source("facebook")
print(f"Facebook events: {len(facebook_events)}")

# Test upcoming events (the sample dates are in the past, so this returns 0, but it demonstrates the call)
upcoming = storage.get_upcoming_events(days_ahead=30)
print(f"Upcoming events (next 30 days): {len(upcoming)}")

Saved 3 events to demo_events_data/demo_events.json
Saved events to: demo_events_data/demo_events.json

Loaded 3 events:
- Friday Night Magic (MTG)
- Warhammer Tournament (Warhammer)
- D&D Adventure League (D&D)

MTG events: 1
Facebook events: 2
Upcoming events (next 30 days): 0
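
The count of 0 is expected because every sample date falls before the day the notebook was run. A minimal sketch of a date-window filter makes that behavior easy to see. It works on plain dicts, and `upcoming` with its `today` parameter is a hypothetical helper, not the actual `get_upcoming_events` implementation.

```python
from datetime import date, timedelta
from typing import Iterable, List, Optional


def upcoming(events: Iterable[dict], days_ahead: int = 30,
             today: Optional[date] = None) -> List[dict]:
    """Keep events dated from today through today + days_ahead, inclusive."""
    today = today or date.today()
    cutoff = today + timedelta(days=days_ahead)
    result = []
    for event in events:
        try:
            event_date = date.fromisoformat(event["date"])
        except (KeyError, TypeError, ValueError):
            continue  # skip events without a parseable ISO date
        if today <= event_date <= cutoff:
            result.append(event)
    return result


demo = [
    {"title": "Friday Night Magic", "date": "2024-08-02"},
    {"title": "Warhammer Tournament", "date": "2024-08-03"},
    {"title": "D&D Adventure League", "date": "2024-08-05"},
]
print(len(upcoming(demo, today=date(2025, 8, 3))))  # dates are in the past
print(len(upcoming(demo, today=date(2024, 8, 1))))  # all three fall in the window
```

Passing `today` explicitly keeps the function deterministic, which is convenient in tests.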


## Facebook Scraper Testing

Test the Facebook scraper (without actual API calls).

In [5]:
# Test Facebook scraper functionality (without making actual API calls)
facebook_scraper = FacebookEventScraper()

# Test parsing functionality with sample Facebook event data
sample_facebook_event = {
    "name": "Friday Night Magic - Standard Format",
    "description": "Join us for Standard format FNM. Entry fee $5, prizes for top players!",
    "start_time": "2024-08-02T19:00:00-07:00",
    "place": {
        "name": "Local Card Kingdom"
    }
}

# Parse the sample event
parsed_event = facebook_scraper._parse_facebook_event(sample_facebook_event, "https://facebook.com/event/123")

if parsed_event:
    print("Successfully parsed Facebook event:")
    print(f"Title: {parsed_event.title}")
    print(f"Game System: {parsed_event.game_system}")
    print(f"Venue: {parsed_event.venue}")
    print(f"Date: {parsed_event.date}")
    print(f"Time: {parsed_event.start_time}")
    print(f"Source: {parsed_event.source}")
else:
    print("Event was not recognized as a gaming event")

# Test text parsing
sample_text = "Magic the Gathering tournament at Card Kingdom on July 20th at 6 PM"
text_event = facebook_scraper._parse_text_event(sample_text, "https://facebook.com/page/123")

if text_event:
    print(f"\nParsed from text: {text_event.title}")
    print(f"Game System: {text_event.game_system}")
else:
    print("\nText was not recognized as a gaming event")

Successfully parsed Facebook event:
Title: Friday Night Magic - Standard Format
Game System: MTG
Venue: Local Card Kingdom
Date: 2024-08-02
Time: 19:00
Source: facebook

Parsed from text: Magic the Gathering tournament at Card Kingdom on July 20th at 6 PM
Game System: MTG
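
The parsed event keeps the local wall-clock date and time from the ISO 8601 `start_time` string (`2024-08-02` and `19:00`). Assuming the input always carries an offset in the `-07:00` form shown in the sample, the split could be done with the standard library alone. `split_start_time` is a hypothetical helper for illustration.

```python
from datetime import datetime
from typing import Tuple


def split_start_time(start_time: str) -> Tuple[str, str]:
    """Split an ISO 8601 timestamp into ('YYYY-MM-DD', 'HH:MM') in its own offset."""
    parsed = datetime.fromisoformat(start_time)
    # No conversion to UTC: the venue's local time is what attendees care about.
    return parsed.date().isoformat(), parsed.strftime("%H:%M")


print(split_start_time("2024-08-02T19:00:00-07:00"))
```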


## Discord Scraper Testing

Test the Discord scraper functionality.

In [6]:
# Test Discord scraper functionality
discord_scraper = DiscordEventScraper()

print(f"Discord.py available: {discord_scraper.is_discord_available()}")

# Create a mock Discord message for testing
class MockDiscordMessage:
    def __init__(self, content, guild_name="Gaming Server", guild_id=12345, channel_id=67890, message_id=11111):
        self.content = content
        self.guild = MockGuild(guild_name, guild_id)
        self.channel = MockChannel(channel_id)
        self.id = message_id

class MockGuild:
    def __init__(self, name, guild_id):
        self.name = name
        self.id = guild_id

class MockChannel:
    def __init__(self, channel_id):
        self.id = channel_id

# Test parsing different types of messages
test_messages = [
    "Friday Night Magic tonight at 7 PM! Modern format, bring your best deck!",
    "Warhammer 40K tournament this Saturday starting at 10 AM. Registration required.",
    "D&D session Wednesday at 6:30 PM. New players welcome!",
    "Regular book club meeting tonight.",  # Should not be detected
    "Pokemon League Challenge on Saturday 2 PM. Prizes for winners!"
]

print("\nTesting Discord message parsing:")
for i, message_content in enumerate(test_messages, 1):
    mock_message = MockDiscordMessage(message_content)
    parsed_event = discord_scraper._parse_discord_message(mock_message)
    
    print(f"\nMessage {i}: {message_content[:50]}...")
    if parsed_event:
        print(f"  ✓ Detected: {parsed_event.game_system} event")
        print(f"  Title: {parsed_event.title}")
        print(f"  Venue: {parsed_event.venue}")
    else:
        print("  ✗ Not detected as gaming event")

Discord.py available: True

Testing Discord message parsing:

Message 1: Friday Night Magic tonight at 7 PM! Modern format,...
  ✓ Detected: MTG event
  Title: Friday Night Magic tonight at 7 PM! Modern format, bring your best deck!
  Venue: Gaming Server

Message 2: Warhammer 40K tournament this Saturday starting at...
  ✓ Detected: Warhammer event
  Title: Warhammer 40K tournament this Saturday starting at 10 AM. Registration required.
  Venue: Gaming Server

Message 3: D&D session Wednesday at 6:30 PM. New players welc...
  ✓ Detected: D&D event
  Title: D&D session Wednesday at 6:30 PM. New players welcome!
  Venue: Gaming Server

Message 4: Regular book club meeting tonight....
  ✗ Not detected as gaming event

Message 5: Pokemon League Challenge on Saturday 2 PM. Prizes ...
  ✓ Detected: Pokemon event
  Title: Pokemon League Challenge on Saturday 2 PM. Prizes for winners!
  Venue: Gaming Server
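
The results above are consistent with keyword-based detection: messages that mention a known game are tagged, and the book club message is skipped. The sketch below shows one simple way to do that. The `GAME_KEYWORDS` table and `detect_game_system` are assumptions made for illustration, and the project's real keyword list and matching rules may differ.

```python
from typing import Optional

# Hypothetical keyword table; a real one would likely cover more games and aliases.
GAME_KEYWORDS = {
    "MTG": ("magic the gathering", "friday night magic", "mtg", "fnm"),
    "Warhammer": ("warhammer", "40k"),
    "D&D": ("d&d", "dungeons & dragons", "dungeons and dragons"),
    "Pokemon": ("pokemon", "pokémon"),
}


def detect_game_system(text: str) -> Optional[str]:
    """Return the first game system whose keyword appears in text, else None."""
    lowered = text.lower()
    for system, keywords in GAME_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return system
    return None


messages = [
    "Friday Night Magic tonight at 7 PM! Modern format, bring your best deck!",
    "Warhammer 40K tournament this Saturday starting at 10 AM. Registration required.",
    "D&D session Wednesday at 6:30 PM. New players welcome!",
    "Regular book club meeting tonight.",
    "Pokemon League Challenge on Saturday 2 PM. Prizes for winners!",
]
for message in messages:
    print(detect_game_system(message))
```

Plain substring matching is easy to read but can produce false positives on short keywords. Matching on word boundaries with `re` would be stricter.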


## Main Scraper Orchestrator

Initialize and configure the main scraper.

In [7]:
# Configuration
# Add your API tokens here (optional - will use public scraping fallbacks)
FACEBOOK_ACCESS_TOKEN = None  # Get from Facebook Developers
DISCORD_BOT_TOKEN = None      # Get from Discord Developer Portal

# Configure scraping targets
FACEBOOK_PAGES = [
    # Add Facebook page IDs or usernames
    # Example: "yourlocalcardshop", "magicthegathering"
]

DISCORD_SERVERS = [
    # Add Discord server configurations
    # Example: {"guild_id": 123456789, "channels": ["events", "announcements"]}
]

# Initialize the main scraper
scraper = GamingEventsScraper(
    facebook_token=FACEBOOK_ACCESS_TOKEN,
    discord_token=DISCORD_BOT_TOKEN,
    storage_dir="gaming_events_data"
)

print("Gaming Events Scraper initialized!")
print(f"Facebook token configured: {FACEBOOK_ACCESS_TOKEN is not None}")
print(f"Discord token configured: {DISCORD_BOT_TOKEN is not None}")
print("Storage directory: gaming_events_data")

# Test duplicate removal
test_events = [
    GamingEvent("Event A", "MTG", "Venue 1", "2024-07-18", "19:00", "facebook"),
    GamingEvent("Event A", "MTG", "Venue 2", "2024-07-18", "19:00", "discord"),  # Counted as a duplicate of the first event (same title, date, and time) even though venue and source differ
    GamingEvent("Event B", "Warhammer", "Venue 1", "2024-07-19", "10:00", "facebook"),
]

unique_events = scraper._remove_duplicates(test_events)
print("\nDuplicate removal test:")
print(f"Original events: {len(test_events)}")
print(f"Unique events: {len(unique_events)}")

Gaming Events Scraper initialized!
Facebook token configured: False
Discord token configured: False
Storage directory: gaming_events_data

Duplicate removal test:
Original events: 3
Unique events: 2
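
The test shows that two events with the same title, date, and time collapse into one even though their venue and source differ. A minimal sketch of that behavior, assuming the key is exactly (title, date, start time), is shown below. The actual key used by `_remove_duplicates` is not shown in this notebook, and `remove_duplicates` here is a hypothetical stand-in that works on plain dicts.

```python
from typing import Iterable, List


def remove_duplicates(events: Iterable[dict]) -> List[dict]:
    """Keep the first event seen for each (title, date, start_time) key."""
    seen = set()
    unique = []
    for event in events:
        # Normalize the title so casing and stray spaces do not defeat the match.
        key = (event["title"].strip().lower(), event["date"], event["start_time"])
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


events = [
    {"title": "Event A", "venue": "Venue 1", "date": "2024-07-18", "start_time": "19:00"},
    {"title": "Event A", "venue": "Venue 2", "date": "2024-07-18", "start_time": "19:00"},
    {"title": "Event B", "venue": "Venue 1", "date": "2024-07-19", "start_time": "10:00"},
]
print(len(remove_duplicates(events)))
```

Keeping the first occurrence means source order decides which copy survives, so listing the more reliable source first is one way to control that.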


## Running the Scraper

To actually scrape events, uncomment and configure the code below.

In [8]:
# Uncomment and configure these to actually scrape events
# 
# # Add your Facebook pages and Discord servers
# FACEBOOK_PAGES = ["your_facebook_page_id"]
# DISCORD_SERVERS = [{"guild_id": 123456789, "channels": ["events"]}]
# 
# # Run the scraper
# print("Starting gaming events scraper...")
# 
# try:
#     # Scrape and save events
#     output_file = scraper.scrape_and_save(
#         facebook_pages=FACEBOOK_PAGES,
#         discord_servers=DISCORD_SERVERS
#     )
#     
#     if output_file:
#         print(f"\nEvents saved to: {output_file}")
#     else:
#         print("\nNo events were found or saved.")
#         
# except Exception as e:
#     print(f"Error running scraper: {e}")

print("Scraper is ready to use!")
print("Configure FACEBOOK_PAGES and DISCORD_SERVERS, then uncomment the code above to scrape events.")

Scraper is ready to use!
Configure FACEBOOK_PAGES and DISCORD_SERVERS, then uncomment the code above to scrape events.


## Working with Saved Events

Load and analyze previously scraped events.

In [9]:
# Access the storage system
storage = scraper.get_storage()

# List all saved event files
files = storage.list_files()
print(f"Event files found: {files}")

if files:
    # Load events from the first file
    events = storage.load_events(files[0])
    print(f"\nLoaded {len(events)} events from {files[0]}")
    
    for event in events[:3]:  # Show first 3
        print(f"- {event.title}")
        print(f"  {event.game_system} | {event.date} {event.start_time} | {event.venue}")
        print(f"  Source: {event.source}")
        print()
    
    # Get events by game system
    all_events = storage.get_all_events()
    if all_events:
        print("Events by game system:")
        game_systems = set(event.game_system for event in all_events)
        for game in game_systems:
            game_events = storage.get_events_by_game(game)
            print(f"  {game}: {len(game_events)} events")
    
    # Get upcoming events
    upcoming = storage.get_upcoming_events(days_ahead=30)
    print(f"\nUpcoming events (next 30 days): {len(upcoming)}")
else:
    print("No event files found. Run the scraper first or create some demo events.")

Event files found: []
No event files found. Run the scraper first or create some demo events.


## Running Tests

Run the test suite to verify that everything works.

In [10]:
# Run tests from the notebook
# Uncomment the lines below to run tests

# import subprocess
# import sys

# # Run the test suite
# try:
#     result = subprocess.run([sys.executable, "-m", "pytest", "tests/", "-v"], 
#                           capture_output=True, text=True, cwd=".")
#     print("Test Results:")
#     print(result.stdout)
#     if result.stderr:
#         print("Errors:")
#         print(result.stderr)
# except Exception as e:
#     print(f"Error running tests: {e}")

print("To run tests manually:")
print("1. Open a terminal in the project directory")
print("2. Run: uv run pytest tests/ -v")
print("3. Or run: python -m pytest tests/ -v")
print("")
print("Available test modules:")
print("- tests/test_models.py - Test event data models")
print("- tests/test_extractors.py - Test text extraction")
print("- tests/test_storage.py - Test event storage")
print("- tests/test_facebook_scraper.py - Test Facebook scraper")
print("- tests/test_discord_scraper.py - Test Discord scraper")
print("- tests/test_scraper.py - Test main orchestrator")

To run tests manually:
1. Open a terminal in the project directory
2. Run: uv run pytest tests/ -v
3. Or run: python -m pytest tests/ -v

Available test modules:
- tests/test_models.py - Test event data models
- tests/test_extractors.py - Test text extraction
- tests/test_storage.py - Test event storage
- tests/test_facebook_scraper.py - Test Facebook scraper
- tests/test_discord_scraper.py - Test Discord scraper
- tests/test_scraper.py - Test main orchestrator


## Next Steps

### Getting Started

1. **Get API Credentials** (Optional but recommended):
   - **Facebook**: Create a Facebook App and get an access token from [Facebook Developers](https://developers.facebook.com/)
   - **Discord**: Create a Discord Bot and get a bot token from [Discord Developer Portal](https://discord.com/developers/applications)

2. **Configure Target Sources**:
   - Add Facebook page IDs or usernames to `FACEBOOK_PAGES`
   - Add Discord server configurations to `DISCORD_SERVERS`

3. **Run the Scraper**:
   - Uncomment and configure the scraping code above
   - Execute the cells to start collecting events

### Expanding the Scraper

1. **Add More Sources**:
   - Reddit gaming communities
   - Twitter/X gaming accounts  
   - Eventbrite gaming events
   - Local gaming store websites

2. **Improve Extraction**:
   - Train ML models for better text parsing
   - Add more game systems and keywords
   - Improve date/time parsing accuracy

3. **Add Features**:
   - Email notifications for new events
   - Calendar export (ICS format)
   - Web dashboard for viewing events
   - Integration with calendar apps

4. **Storage Options**:
   - Database storage (PostgreSQL, MongoDB)
   - Cloud storage (AWS S3, Google Cloud)
   - Real-time event updates

### Development

- **Run Tests**: `uv run pytest tests/ -v`
- **Code Formatting**: `uv run black .`
- **Linting**: `uv run ruff check .`
- **Type Checking**: `uv run mypy src/`

The platform scrapers, the extractor, and the storage layer are separate modules, so each can be extended or replaced independently.