# Multi-League Scraper Examples

This notebook shows how to use the scraper suite for the five HockeyTech leagues covered here:
- QMJHL (Quebec Major Junior Hockey League)
- OHL (Ontario Hockey League)
- WHL (Western Hockey League)
- PWHL (Professional Women's Hockey League)
- AHL (American Hockey League)

Each league has the following scrapers:
1. **Schedule**: Get game schedules with scores and status
2. **Teams**: Get team information
3. **Standings**: Get league standings
4. **Player Stats**: Get player statistics (skaters)
5. **Roster**: Get team rosters

In [None]:
# Import all league scrapers
from scrapernhl.qmjhl.scrapers import (
    scrapeSchedule as qmjhl_schedule,
    scrapeTeams as qmjhl_teams,
    scrapeStandings as qmjhl_standings,
    scrapePlayerStats as qmjhl_stats,
    scrapeRoster as qmjhl_roster
)

from scrapernhl.ohl.scrapers import (
    scrapeSchedule as ohl_schedule,
    scrapeTeams as ohl_teams,
    scrapeStandings as ohl_standings,
    scrapePlayerStats as ohl_stats,
    scrapeRoster as ohl_roster
)

from scrapernhl.whl.scrapers import (
    scrapeSchedule as whl_schedule,
    scrapeTeams as whl_teams,
    scrapeStandings as whl_standings,
    scrapePlayerStats as whl_stats,
    scrapeRoster as whl_roster
)

from scrapernhl.pwhl.scrapers import (
    scrapeSchedule as pwhl_schedule,
    scrapeTeams as pwhl_teams,
    scrapeStandings as pwhl_standings,
    scrapePlayerStats as pwhl_stats,
    scrapeRoster as pwhl_roster
)

from scrapernhl.ahl.scrapers import (
    scrapeSchedule as ahl_schedule,
    scrapeTeams as ahl_teams,
    scrapeStandings as ahl_standings,
    scrapePlayerStats as ahl_stats,
    scrapeRoster as ahl_roster
)

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
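
The five import blocks above differ only in the league segment of the module path (`scrapernhl.<league>.scrapers`). As a minimal sketch, the same modules could be loaded by name instead of aliasing every function. `scraper_module_name` and `load_scrapers` are hypothetical helpers written for this example, not part of the package.

```python
import importlib

# League codes as they appear in the import paths above.
LEAGUES = ("qmjhl", "ohl", "whl", "pwhl", "ahl")


def scraper_module_name(league: str) -> str:
    """Build the dotted module path for a league, e.g. 'scrapernhl.ohl.scrapers'."""
    code = league.strip().lower()
    if code not in LEAGUES:
        raise ValueError(f"Unknown league: {league!r}")
    return f"scrapernhl.{code}.scrapers"


def load_scrapers(league: str):
    """Import and return the scraper module for one league (requires scrapernhl)."""
    return importlib.import_module(scraper_module_name(league))


print(scraper_module_name("OHL"))
```

With the package installed, `load_scrapers("ohl").scrapeStandings(season=68)` should resolve to the same function as the `ohl_standings` alias imported above.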

## 1. QMJHL Examples

The QMJHL exposes two schedule scrapers, each backed by a different API endpoint:
- `scrapeSchedule()`: uses the statviewfeed endpoint for structured data
- `scrapeScorebar()`: uses the modulekit endpoint for recent games (not imported in this notebook)

In [None]:
# Get QMJHL schedule for season 211 (2024-25); team_id=-1 appears to select all teams
qmjhl_schedule_df = qmjhl_schedule(season=211, team_id=-1)
print(f"Retrieved {len(qmjhl_schedule_df)} QMJHL games")
qmjhl_schedule_df.head()

In [None]:
# Get QMJHL standings
qmjhl_standings_df = qmjhl_standings(season=211, group_by='division')
print(f"Retrieved {len(qmjhl_standings_df)} QMJHL teams in standings")
qmjhl_standings_df[['team_code', 'wins', 'losses', 'ot_losses', 'points']].head(10)
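
Raw points can mislead when teams have played different numbers of games. The sketch below derives a points percentage from the columns selected in the standings cell above. The sample rows are made up, `add_points_pct` is a hypothetical helper, and the calculation assumes that `wins`, `losses`, and `ot_losses` together account for every game played, which may not hold if a league reports shootout losses separately.

```python
import pandas as pd

# Hypothetical sample rows; real data would come from qmjhl_standings(...).
standings = pd.DataFrame(
    {
        "team_code": ["AAA", "BBB", "CCC"],
        "wins": [30, 25, 10],
        "losses": [10, 20, 35],
        "ot_losses": [5, 0, 0],
        "points": [65, 50, 20],
    }
)


def add_points_pct(df: pd.DataFrame) -> pd.DataFrame:
    """Return a sorted copy with games_played and points_pct columns added."""
    out = df.copy()
    cols = ["wins", "losses", "ot_losses", "points"]
    # Coerce to numbers in case the API delivered the values as strings.
    out[cols] = out[cols].apply(pd.to_numeric)
    # Assumption: these three result columns cover every game played.
    out["games_played"] = out["wins"] + out["losses"] + out["ot_losses"]
    # Two points are available per game, so the maximum is 2 * games_played.
    out["points_pct"] = (out["points"] / (2 * out["games_played"])).round(3)
    return out.sort_values("points_pct", ascending=False).reset_index(drop=True)


ranked = add_points_pct(standings)
print(ranked[["team_code", "games_played", "points", "points_pct"]])
```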

In [None]:
# Get top QMJHL skaters
qmjhl_stats_df = qmjhl_stats(season=211, player_type='skater', sort='points', limit=20)
print(f"Retrieved {len(qmjhl_stats_df)} QMJHL players")
qmjhl_stats_df[['name', 'team_code', 'games_played', 'goals', 'assists', 'points']].head(10)

In [None]:
# Get roster for Quebec Remparts (team_id=52)
qmjhl_roster_df = qmjhl_roster(team_id=52, season=211)
print(f"Retrieved {len(qmjhl_roster_df)} QMJHL roster players")
qmjhl_roster_df[['name', 'tp_jersey_number', 'position', 'birthdate', 'birthplace']].head()

## 2. OHL Examples

In [None]:
# Get OHL schedule for season 68
ohl_schedule_df = ohl_schedule(season=68, team_id=-1)
print(f"Retrieved {len(ohl_schedule_df)} OHL games")
ohl_schedule_df.head()

In [None]:
# Get OHL standings
ohl_standings_df = ohl_standings(season=68)
print(f"Retrieved {len(ohl_standings_df)} OHL teams in standings")
ohl_standings_df[['team_code', 'wins', 'losses', 'ot_losses', 'points']].head(10)

In [None]:
# Get top OHL skaters
ohl_stats_df = ohl_stats(season=68, limit=20)
print(f"Retrieved {len(ohl_stats_df)} OHL players")
ohl_stats_df[['name', 'team_code', 'games_played', 'goals', 'assists', 'points']].head(10)

## 3. Cross-League Comparison

Compare the size of the standings tables across leagues, reusing the QMJHL and OHL frames from the sections above.

In [None]:
# Get standings from multiple leagues
pwhl_standings_df = pwhl_standings(season=2)
ahl_standings_df = ahl_standings(season=71)

print(f"PWHL: {len(pwhl_standings_df)} teams")
print(f"AHL: {len(ahl_standings_df)} teams")
print(f"QMJHL: {len(qmjhl_standings_df)} teams")
print(f"OHL: {len(ohl_standings_df)} teams")
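
To compare more than row counts, the per-league frames can be stacked into one table with a `league` column. This is a sketch using made-up stand-in frames. `combine_standings` is a hypothetical helper, and it assumes the leagues share at least some column names (columns missing from one league are filled with NaN by `pd.concat`).

```python
import pandas as pd

# Hypothetical stand-ins for the standings frames fetched above.
sample_standings = {
    "QMJHL": pd.DataFrame({"team_code": ["AAA", "BBB"], "points": [65, 50]}),
    "OHL": pd.DataFrame({"team_code": ["CCC"], "points": [70]}),
}


def combine_standings(frames: dict) -> pd.DataFrame:
    """Stack per-league standings into one frame with a leading 'league' column."""
    tagged = [df.assign(league=league) for league, df in frames.items()]
    combined = pd.concat(tagged, ignore_index=True)
    # Move 'league' to the front for readability.
    cols = ["league"] + [c for c in combined.columns if c != "league"]
    return combined[cols]


combined = combine_standings(sample_standings)
print(combined.groupby("league")["points"].agg(["count", "mean", "max"]))
```

With live data, the input would be a dict such as `{"QMJHL": qmjhl_standings_df, "OHL": ohl_standings_df, ...}`.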

## 4. Player Stats Comparison

Get top scorers from each league

In [None]:
# Get top scorers from each league
leagues = {
    'QMJHL': qmjhl_stats(season=211, limit=5),
    'OHL': ohl_stats(season=68, limit=5),
    'PWHL': pwhl_stats(season=2, limit=5),
    'AHL': ahl_stats(season=71, limit=5)
}

for league, df in leagues.items():
    print(f"\n{league} Top 5 Scorers:")
    if len(df) > 0:
        print(df[['name', 'team_code', 'points']].head())
    else:
        print("No data available")
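
Point totals are hard to compare directly when leagues play different numbers of games, so a per-game rate can be more informative. The sketch below merges the per-league frames and ranks players by points per game. The sample data is invented, `scoring_rates` is a hypothetical helper, and it assumes the `games_played` and `points` columns are numeric and that `games_played` is never zero.

```python
import pandas as pd

COLUMNS = ["name", "team_code", "games_played", "points"]

# Hypothetical sample data standing in for the `leagues` dict built above.
sample_leagues = {
    "QMJHL": pd.DataFrame(
        [["Player A", "AAA", 60, 90], ["Player B", "BBB", 50, 80]], columns=COLUMNS
    ),
    "PWHL": pd.DataFrame([["Player C", "CCC", 30, 33]], columns=COLUMNS),
    "AHL": pd.DataFrame(columns=COLUMNS),  # simulates a league with no data
}


def scoring_rates(frames: dict) -> pd.DataFrame:
    """Combine leagues into one frame ranked by points per game."""
    parts = []
    for league, df in frames.items():
        if len(df) == 0:
            continue  # skip leagues that returned no data
        part = df[COLUMNS].copy()
        part.insert(0, "league", league)
        parts.append(part)
    if not parts:
        return pd.DataFrame(columns=["league", *COLUMNS, "points_per_game"])
    out = pd.concat(parts, ignore_index=True)
    out["points_per_game"] = (out["points"] / out["games_played"]).round(2)
    return out.sort_values("points_per_game", ascending=False).reset_index(drop=True)


rates = scoring_rates(sample_leagues)
print(rates)
```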

## 5. Roster Analysis

Analyze team rosters across leagues

In [None]:
# Get rosters from different leagues
pwhl_roster_df = pwhl_roster(team_id=1, season=2)
ahl_roster_df = ahl_roster(team_id=1, season=71)

print(f"PWHL roster size: {len(pwhl_roster_df)}")
print(f"AHL roster size: {len(ahl_roster_df)}")
print(f"QMJHL roster size: {len(qmjhl_roster_df)}")

# Show position breakdown
if len(pwhl_roster_df) > 0:
    print("\nPWHL Position Breakdown:")
    print(pwhl_roster_df['position'].value_counts())
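
The same `value_counts()` breakdown can be laid out side by side for several leagues. This is a sketch with invented rosters. `position_breakdown` is a hypothetical helper, and the position labels (`F`, `D`, `G`) are assumed for illustration; the scrapers may return different codes.

```python
import pandas as pd

# Hypothetical rosters; real frames would come from the roster scrapers above.
sample_rosters = {
    "PWHL": pd.DataFrame({"position": ["F", "F", "D", "G"]}),
    "QMJHL": pd.DataFrame({"position": ["F", "D", "D"]}),
}


def position_breakdown(rosters: dict) -> pd.DataFrame:
    """Return a table of player counts: one row per position, one column per league."""
    counts = {
        league: df["position"].value_counts()
        for league, df in rosters.items()
        if len(df) > 0
    }
    # Positions missing from a league become NaN after alignment; treat them as 0.
    table = pd.DataFrame(counts).fillna(0).astype(int)
    return table.sort_index()


table = position_breakdown(sample_rosters)
print(table)
```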

## 6. Export Data

Export scraped data to CSV and Excel.

In [None]:
# Export to CSV
qmjhl_standings_df.to_csv('qmjhl_standings.csv', index=False)
ohl_stats_df.to_csv('ohl_player_stats.csv', index=False)

# Export to Excel (multiple sheets); .xlsx output needs an Excel engine such as openpyxl installed
with pd.ExcelWriter('multi_league_data.xlsx') as writer:
    qmjhl_standings_df.to_excel(writer, sheet_name='QMJHL Standings', index=False)
    ohl_standings_df.to_excel(writer, sheet_name='OHL Standings', index=False)
    pwhl_standings_df.to_excel(writer, sheet_name='PWHL Standings', index=False)
    ahl_standings_df.to_excel(writer, sheet_name='AHL Standings', index=False)

print("Data exported successfully!")
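
The sheet names above are short literals, but names built programmatically can run into Excel's rules: a sheet name may be at most 31 characters and may not contain `[ ] : * ? / \`. A minimal sketch of a sanitizer follows; `safe_sheet_name` is a hypothetical helper, not something pandas or the scraper package provides.

```python
import re

# Characters Excel rejects in sheet names, and its length limit.
_FORBIDDEN = re.compile(r"[\[\]:*?/\\]")
MAX_SHEET_NAME = 31


def safe_sheet_name(name: str) -> str:
    """Replace forbidden characters and truncate to Excel's 31-character limit."""
    cleaned = _FORBIDDEN.sub("-", name).strip()
    return cleaned[:MAX_SHEET_NAME]


print(safe_sheet_name("QMJHL Standings"))
print(safe_sheet_name("PWHL Player Stats: Regular Season 2024/25"))
```

It could be used as `df.to_excel(writer, sheet_name=safe_sheet_name(title), index=False)`. Truncation can make two long names collide, so distinct prefixes are still worth choosing.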

## Summary

This notebook demonstrated:
1. Scraping schedules, standings, player stats, and rosters from four of the five leagues (the WHL scrapers are imported but not called, and `scrapeTeams` is not demonstrated)
2. Comparing data across leagues
3. Analyzing rosters and player statistics
4. Exporting data to CSV and Excel

All scrapers follow the same pattern:
- `scrapeSchedule(season, team_id, ...)`
- `scrapeTeams(season)`
- `scrapeStandings(season, ...)`
- `scrapePlayerStats(season, limit, ...)`
- `scrapeRoster(team_id, season)`
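
Because the signatures match, one loop can drive the same scraper for several leagues. The sketch below assumes only that each scraper accepts a `season` keyword, as in the calls earlier in this notebook. `collect` and `fake_standings` are hypothetical names. The season IDs are the ones used above, and the WHL is left out because no WHL season ID appears in this notebook.

```python
from typing import Any, Callable, Dict

# Season IDs used earlier in this notebook (IDs are league-specific).
SEASONS = {"QMJHL": 211, "OHL": 68, "PWHL": 2, "AHL": 71}


def collect(
    scrapers: Dict[str, Callable[..., Any]],
    seasons: Dict[str, int],
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call each league's scraper with that league's season ID."""
    results = {}
    for league, scrape in scrapers.items():
        if league not in seasons:
            continue  # no known season ID for this league
        results[league] = scrape(season=seasons[league], **kwargs)
    return results


# Stand-in scraper so the sketch runs without network access.
def fake_standings(season: int, **kwargs: Any) -> dict:
    return {"season": season, **kwargs}


demo = collect(
    {"QMJHL": fake_standings, "OHL": fake_standings, "WHL": fake_standings},
    SEASONS,
)
print(demo)
```

With the real functions, the first argument would be a mapping such as `{"QMJHL": qmjhl_standings, "OHL": ohl_standings, "PWHL": pwhl_standings, "AHL": ahl_standings}`.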

They all support:
- pandas and polars output formats
- Built-in caching (TTL-based)
- Error handling
- Progress indicators
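
To illustrate what TTL-based caching means in general, here is a small standalone sketch. It is not the package's implementation, and `TTLCache` and its methods are hypothetical names. It only shows the idea that a stored response is reused until it is older than a time-to-live, after which the data is fetched again.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny time-to-live cache: entries expire `ttl` seconds after being stored."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, recomputing it if missing or expired."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = compute()
        self._store[key] = (now, value)
        return value


# Demonstration with a fake clock and a fake fetch function.
fake_now = [0.0]
cache = TTLCache(ttl=60, clock=lambda: fake_now[0])
calls = []


def fetch() -> str:
    calls.append(1)
    return f"payload-{len(calls)}"


first = cache.get_or_compute(("standings", 68), fetch)
second = cache.get_or_compute(("standings", 68), fetch)  # served from the cache
fake_now[0] = 61.0  # move past the 60-second TTL
third = cache.get_or_compute(("standings", 68), fetch)  # expired, fetched again
print(first, second, third, len(calls))
```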