# DraftKings Slate Data Collection

This notebook demonstrates how to collect live slate data from the DraftKings API using the `draft_kings` Python client.

## Features
- Query available NBA contests
- Extract draft group IDs
- Retrieve player pool with salaries
- Get matchup and game information
- Export data for optimization pipeline

## API Documentation
Client: https://github.com/jaebradley/draftkings_client

## Installation
```bash
pip install draft-kings
```

## Setup

In [4]:
import sys
from pathlib import Path
import pandas as pd
import numpy as np
from datetime import datetime, timezone
import json

try:
    from draft_kings import Sport, Client
    print('draft_kings client imported successfully')
except ImportError:
    print('ERROR: draft_kings not installed')
    print('Install with: pip install draft-kings')
    raise

repo_root = Path.cwd().parent
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 50)
pd.set_option('display.width', None)

print('Setup complete')

draft_kings client imported successfully
Setup complete


## Initialize DraftKings Client

The client does not require authentication for public contest data.

In [5]:
client = Client()
print('DraftKings client initialized')

DraftKings client initialized


## Step 1: Query Available NBA Contests

Get all active NBA contests currently available on DraftKings.

In [6]:
print('Fetching available NBA contests...')

contests_response = client.contests(sport=Sport.NBA)

print(f'Response type: {type(contests_response)}')
print(f'Response attributes: {dir(contests_response)}')

if contests_response and hasattr(contests_response, 'contests'):
    contests = contests_response.contests
    print(f'Found {len(contests)} active NBA contests')
    
    contests_list = [vars(c) if hasattr(c, '__dict__') else c for c in contests]
    contests_df = pd.DataFrame(contests_list)
    
    print('\nContest Summary:')
    if 'game_type' in contests_df.columns:
        print(f'  Game types: {contests_df["game_type"].nunique()}')
    if 'draft_group_id' in contests_df.columns:
        print(f'  Unique draft groups: {contests_df["draft_group_id"].nunique()}')
    if 'entry_fee' in contests_df.columns:
        contests_df['entry_fee'] = pd.to_numeric(contests_df['entry_fee'], errors='coerce')
        print(f'  Entry fees range: ${contests_df["entry_fee"].min():.2f} - ${contests_df["entry_fee"].max():.2f}')
    
    print('\nSample contests:')
    display_cols = ['name', 'draft_group_id', 'game_type', 'entry_fee', 'total_payouts', 'starts_at']
    display_cols = [col for col in display_cols if col in contests_df.columns]
    display(contests_df[display_cols].head(20))
    
else:
    print('No contests found or API error')
    print(f'Response: {contests_response}')
    contests_df = pd.DataFrame()

Fetching available NBA contests...
Response type: <class 'draft_kings.output.objects.contests.ContestsDetails'>
Response attributes: ['__annotations__', '__class__', '__dataclass_fields__', '__dataclass_params__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__firstlineno__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__match_args__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__replace__', '__repr__', '__setattr__', '__sizeof__', '__static_attributes__', '__str__', '__subclasshook__', '__weakref__', 'contests', 'draft_groups']
Found 1293 active NBA contests

Contest Summary:
  Unique draft groups: 10

Sample contests:


Unnamed: 0,name,draft_group_id,starts_at
0,NBA $750K Opening Tip Off [$200K to 1st],133712,2025-10-21 23:30:00+00:00
1,NBA $6K Fadeaway [$2K to 1st],135398,2025-10-15 23:00:00+00:00
2,NBA $300K Power Dunk [$100K to 1st],133712,2025-10-21 23:30:00+00:00
3,NBA Showdown $300K Fadeaway [$100K to 1st] (GS...,133716,2025-10-22 02:00:00+00:00
4,NBA $750 And-One [20 Entry Max],135398,2025-10-15 23:00:00+00:00
5,"NBA $540 Showtime [Single Entry, Top 2 Win]",135398,2025-10-15 23:00:00+00:00
6,NBA $100K Four Point Play [20 Entry Max],133712,2025-10-21 23:30:00+00:00
7,NBA $30K High Five [Single Entry],133712,2025-10-21 23:30:00+00:00
8,NBA $10K High Five [Single Entry],133712,2025-10-21 23:30:00+00:00
9,"NBA $408 Winner Take All [$1.5K to 1st, Must F...",135398,2025-10-15 23:00:00+00:00


## Step 2: Select Target Draft Group

Choose a draft group to analyze. Typically you want the main slate (largest player pool).

In [7]:
if contests_df.empty:
    print('No contests available to select from')
else:
    draft_group_counts = contests_df['draft_group_id'].value_counts()
    
    print('Draft Groups by Contest Count:')
    print('='*60)
    for draft_group_id, count in draft_group_counts.head(10).items():
        group_contests = contests_df[contests_df['draft_group_id'] == draft_group_id]
        starts_at = group_contests['starts_at'].iloc[0] if 'starts_at' in group_contests.columns else 'Unknown'
        print(f'  Draft Group {draft_group_id}: {count} contests | Starts: {starts_at}')
    
    target_draft_group_id = draft_group_counts.index[0]
    
    print(f'\nSelected Draft Group: {target_draft_group_id}')
    print(f'Contests using this group: {draft_group_counts.iloc[0]}')
    
    print('\nTo select a different draft group, set:')
    print('  target_draft_group_id = YOUR_DRAFT_GROUP_ID')

Draft Groups by Contest Count:
  Draft Group 133712: 445 contests | Starts: 2025-10-21 23:30:00+00:00
  Draft Group 133716: 214 contests | Starts: 2025-10-22 02:00:00+00:00
  Draft Group 133715: 161 contests | Starts: 2025-10-21 23:30:00+00:00
  Draft Group 135398: 151 contests | Starts: 2025-10-15 23:00:00+00:00
  Draft Group 133721: 150 contests | Starts: 2025-10-21 23:30:00+00:00
  Draft Group 133718: 82 contests | Starts: 2025-10-22 03:00:00+00:00
  Draft Group 133178: 46 contests | Starts: 2025-10-21 23:30:00+00:00
  Draft Group 135513: 16 contests | Starts: 2025-10-15 23:00:00+00:00
  Draft Group 133713: 14 contests | Starts: 2025-10-21 23:30:00+00:00
  Draft Group 133717: 14 contests | Starts: 2025-10-22 02:00:00+00:00

Selected Draft Group: 133712
Contests using this group: 445

To select a different draft group, set:
  target_draft_group_id = YOUR_DRAFT_GROUP_ID


## Step 3: Get Draft Group Details

Retrieve metadata about the selected draft group (games, sport, start time, etc.).

In [8]:
print(f'Fetching draft group details for {target_draft_group_id}...')

draft_group_response = client.draft_group_details(draft_group_id=target_draft_group_id)

print(f'Response type: {type(draft_group_response)}')

if draft_group_response and hasattr(draft_group_response, 'draft_group'):
    draft_group_obj = draft_group_response.draft_group
    draft_group = vars(draft_group_obj) if hasattr(draft_group_obj, '__dict__') else draft_group_obj
    
    print('\nDraft Group Details:')
    print('='*60)
    print(f'  Draft Group ID: {draft_group.get("draft_group_id", "N/A")}')
    print(f'  Sport: {draft_group.get("sport", "N/A")}')
    print(f'  Start Time: {draft_group.get("starts_at", "N/A")}')
    
    if 'games' in draft_group:
        games = draft_group['games']
        if games and isinstance(games, list):
            games_list = [vars(g) if hasattr(g, '__dict__') else g for g in games]
            games_df = pd.DataFrame(games_list)
            print(f'  Games: {len(games_df)}')
            
            print('\nGames in Draft Group:')
            display_cols = ['game_id', 'home_team', 'away_team', 'starts_at']
            display_cols = [col for col in display_cols if col in games_df.columns]
            if display_cols:
                display(games_df[display_cols])
            else:
                display(games_df.head())
    
    print(f'\nFull draft group structure:')
    print(json.dumps(draft_group, indent=2, default=str)[:1000] + '...')
    
else:
    print('Could not retrieve draft group details')
    print(f'Response: {draft_group_response}')

Fetching draft group details for 133712...
Response type: <class 'draft_kings.output.objects.draft_group.DraftGroupDetails'>
Could not retrieve draft group details
Response: DraftGroupDetails(contest_details=ContestDetails(game_type_description='SalaryCap', type_id=70), draft_group_id=133712, games=[GameDetails(away_team_id=10, description='HOU @ OKC', game_id=6126966, home_team_id=25, location='Chesapeake Energy Arena', name=None, starts_at=datetime.datetime(2025, 10, 21, 23, 30, tzinfo=datetime.timezone.utc), status_description='Upcoming'), GameDetails(away_team_id=9, description='GSW @ LAL', game_id=6126972, home_team_id=13, location='Staples Center', name=None, starts_at=datetime.datetime(2025, 10, 22, 2, 0, tzinfo=datetime.timezone.utc), status_description='Upcoming')], leagues=[LeagueDetails(abbreviation='NBA', league_id=4, name='National Basketball Association')], sport=<Sport.NBA: 'NBA'>, start_time_details=StartTimeDetails(maximum=datetime.datetime(2025, 10, 22, 2, 0, tzinfo=d

## Step 4: Get Available Players (Draftables)

Retrieve the complete player pool with salaries, positions, and team information.

In [10]:
print(f'Fetching draftable players for draft group {target_draft_group_id}...')

draftables_response = client.draftables(draft_group_id=target_draft_group_id)

print(f'Response type: {type(draftables_response)}')

if draftables_response and hasattr(draftables_response, 'draftables'):
    draftables = draftables_response.draftables
    print(f'Found {len(draftables)} draftable players')
    
    draftables_list = [vars(d) if hasattr(d, '__dict__') else d for d in draftables]
    draftables_df = pd.DataFrame(draftables_list)
    
    print('\nPlayer Pool Summary:')
    print('='*60)
    
    if 'salary' in draftables_df.columns:
        draftables_df['salary'] = pd.to_numeric(draftables_df['salary'], errors='coerce')
        print(f'  Total players: {len(draftables_df)}')
        print(f'  Average salary: ${draftables_df["salary"].mean():,.0f}')
        print(f'  Min salary: ${draftables_df["salary"].min():,.0f}')
        print(f'  Max salary: ${draftables_df["salary"].max():,.0f}')
    
    if 'position' in draftables_df.columns:
        print(f'\n  Players by position:')
        print(draftables_df['position'].value_counts().to_string())
    
    if 'roster_slot_id' in draftables_df.columns:
        print(f'\n  Players by roster slot:')
        print(draftables_df['roster_slot_id'].value_counts().to_string())
    
    print('\nTop 20 Highest Salaries:')
    display_cols = ['display_name', 'position', 'team_abbreviation', 'salary', 'status']
    display_cols = [col for col in display_cols if col in draftables_df.columns]
    
    if 'salary' in draftables_df.columns:
        top_salaries = draftables_df.nlargest(20, 'salary')[display_cols]
        display(top_salaries)
    else:
        display(draftables_df[display_cols].head(20))
    
    print('\nSample player structure:')
    sample_player = draftables_list[0] if draftables_list else {}
    print(json.dumps(sample_player, indent=2, default=str))
    
else:
    print('Could not retrieve draftables')
    print(f'Response: {draftables_response}')
    draftables_df = pd.DataFrame()

Fetching draftable players for draft group 133712...
Response type: <class 'draft_kings.output.objects.draftables.DraftablesDetails'>
Could not retrieve draftables
Response: DraftablesDetails(competitions=[CompetitionDetails(are_depth_charts_available=True, are_starting_lineups_available=False, away_team=CompetitionTeamDetails(abbreviation='HOU', city='Houston', name='Rockets', team_id=10), competition_id=6126966, home_team=CompetitionTeamDetails(abbreviation='OKC', city='Oklahoma City', name='Thunder', team_id=25), name='HOU @ OKC', sport=<Sport.NBA: 'NBA'>, starts_at=datetime.datetime(2025, 10, 21, 23, 30, tzinfo=datetime.timezone.utc), state_description='Upcoming', venue='Chesapeake Energy Arena', weather=None), CompetitionDetails(are_depth_charts_available=True, are_starting_lineups_available=False, away_team=CompetitionTeamDetails(abbreviation='GSW', city='Golden State', name='Warriors', team_id=9), competition_id=6126972, home_team=CompetitionTeamDetails(abbreviation='LAL', city=

## Step 5: Parse and Structure Data

Convert DraftKings API response to structured format compatible with optimization pipeline.

In [None]:
if not draftables_df.empty:
    print('Structuring data for optimization pipeline...')
    
    slate_df = draftables_df.copy()
    
    column_mapping = {
        'draftable_id': 'playerID',
        'display_name': 'playerName',
        'first_name': 'firstName',
        'last_name': 'lastName',
        'position': 'pos',
        'team_abbreviation': 'team',
        'salary': 'salary',
        'roster_slot_id': 'rosterSlot',
        'status': 'status',
        'player_id': 'dkPlayerID'
    }
    
    available_mappings = {k: v for k, v in column_mapping.items() if k in slate_df.columns}
    slate_df = slate_df.rename(columns=available_mappings)
    
    slate_df['salary'] = pd.to_numeric(slate_df['salary'], errors='coerce')
    
    slate_df['draft_group_id'] = target_draft_group_id
    slate_df['collected_at'] = datetime.now(timezone.utc).isoformat()
    
    if 'starts_at' in draft_group:
        slate_df['slate_start_time'] = draft_group['starts_at']
    
    print(f'\nStructured slate data:')
    print(f'  Players: {len(slate_df)}')
    print(f'  Columns: {len(slate_df.columns)}')
    print(f'\nAvailable columns: {slate_df.columns.tolist()}')
    
    print('\nData preview:')
    preview_cols = ['playerName', 'pos', 'team', 'salary', 'status']
    preview_cols = [col for col in preview_cols if col in slate_df.columns]
    display(slate_df[preview_cols].head(20))
    
else:
    print('No draftables data to structure')

## Step 6: Filter Active Players

Remove players with status issues (O, IR, etc.) and those unlikely to play.

In [None]:
if not slate_df.empty and 'status' in slate_df.columns:
    print('Filtering active players...')
    
    print(f'\nStatus distribution before filtering:')
    print(slate_df['status'].value_counts())
    
    active_statuses = ['', None, 'GTD', 'PROBABLE']
    
    slate_active = slate_df[
        (slate_df['status'].isna()) | 
        (slate_df['status'].isin(active_statuses)) |
        (slate_df['status'] == '')
    ].copy()
    
    print(f'\nFiltering results:')
    print(f'  Original players: {len(slate_df)}')
    print(f'  Active players: {len(slate_active)}')
    print(f'  Removed: {len(slate_df) - len(slate_active)}')
    
    if len(slate_active) < len(slate_df):
        removed_df = slate_df[~slate_df.index.isin(slate_active.index)]
        print(f'\nRemoved players:')
        removed_cols = ['playerName', 'team', 'status']
        removed_cols = [col for col in removed_cols if col in removed_df.columns]
        display(removed_df[removed_cols].head(20))
    
else:
    print('No status column or empty dataframe')
    slate_active = slate_df.copy() if not slate_df.empty else pd.DataFrame()

## Step 7: Export Slate Data

Save collected data for use in optimization pipeline.

In [None]:
if not slate_active.empty:
    output_dir = repo_root / 'data' / 'draftkings_slates'
    output_dir.mkdir(parents=True, exist_ok=True)
    
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    
    csv_path = output_dir / f'slate_{target_draft_group_id}_{timestamp}.csv'
    slate_active.to_csv(csv_path, index=False)
    print(f'Saved CSV: {csv_path}')
    
    parquet_path = output_dir / f'slate_{target_draft_group_id}_{timestamp}.parquet'
    slate_active.to_parquet(parquet_path, index=False)
    print(f'Saved Parquet: {parquet_path}')
    
    metadata = {
        'draft_group_id': target_draft_group_id,
        'collected_at': datetime.now(timezone.utc).isoformat(),
        'player_count': len(slate_active),
        'game_count': len(draft_group.get('games', [])) if 'draft_group' in locals() else 0,
        'starts_at': draft_group.get('starts_at') if 'draft_group' in locals() else None,
        'min_salary': int(slate_active['salary'].min()) if 'salary' in slate_active.columns else None,
        'max_salary': int(slate_active['salary'].max()) if 'salary' in slate_active.columns else None,
    }
    
    metadata_path = output_dir / f'slate_{target_draft_group_id}_{timestamp}_metadata.json'
    with open(metadata_path, 'w') as f:
        json.dump(metadata, f, indent=2, default=str)
    print(f'Saved metadata: {metadata_path}')
    
    print('\nExport complete!')
    print(f'Files saved to: {output_dir}')
    
else:
    print('No data to export')

## Step 8: Game Type Rules (Optional)

Retrieve contest rules for specific game types (salary cap, roster construction, scoring).

In [None]:
if not contests_df.empty and 'game_type_id' in contests_df.columns:
    game_type_id = contests_df['game_type_id'].iloc[0]
    
    print(f'Fetching game type rules for game_type_id={game_type_id}...')
    
    rules_response = client.game_type_rules(game_type_id=game_type_id)
    
    print(f'Response type: {type(rules_response)}')
    
    if rules_response:
        if hasattr(rules_response, '__dict__'):
            rules_dict = vars(rules_response)
        else:
            rules_dict = rules_response
            
        print('\nGame Type Rules:')
        print('='*60)
        print(json.dumps(rules_dict, indent=2, default=str)[:2000])
    else:
        print('Could not retrieve game type rules')
else:
    print('No game type ID available from contests')

## Step 9: Integration with Existing Pipeline

Map DraftKings data to format expected by `LinearProgramOptimizer`.

In [None]:
if not slate_active.empty:
    print('Preparing data for optimization pipeline...')
    
    required_columns = ['playerID', 'playerName', 'pos', 'team', 'salary']
    missing_columns = [col for col in required_columns if col not in slate_active.columns]
    
    if missing_columns:
        print(f'\nWARNING: Missing required columns: {missing_columns}')
        print('Data may need additional processing before optimization')
    else:
        print('\nAll required columns present')
    
    optimization_ready = slate_active[required_columns].copy()
    
    optimization_ready['projected_fpts'] = 0.0
    
    print(f'\nOptimization-ready data:')
    print(f'  Players: {len(optimization_ready)}')
    print(f'  Columns: {optimization_ready.columns.tolist()}')
    
    print('\nSample data:')
    display(optimization_ready.head(10))
    
    print('\nNext steps:')
    print('  1. Load historical data for these players')
    print('  2. Generate projections using trained models')
    print('  3. Update projected_fpts column')
    print('  4. Pass to LinearProgramOptimizer.optimize()')
    
else:
    print('No active slate data available')

## Summary

This notebook demonstrated:

1. Querying available NBA contests from DraftKings API
2. Selecting target draft groups (slates)
3. Retrieving player pool with salaries and positions
4. Filtering active players by injury status
5. Exporting data in CSV/Parquet formats
6. Preparing data for optimization pipeline

### Integration with Existing Pipeline

To use this data with your existing models:

```python
from src.optimization.optimizers.linear_program import LinearProgramOptimizer
from src.optimization.constraints.draftkings import DraftKingsConstraints

projections_df = optimization_ready.copy()
projections_df['projected_fpts'] = model.predict(features)

optimizer = LinearProgramOptimizer(
    constraints=[DraftKingsConstraints()],
    salary_cap=50000
)

lineups = optimizer.optimize(projections_df, num_lineups=10)
```

### Next Steps

1. Schedule automated collection before daily slates
2. Match DraftKings player IDs to Tank01 playerIDs
3. Generate projections from trained models
4. Optimize lineups
5. Export for DraftKings CSV upload