# FDS Challenge: Starter Notebook

This notebook will guide you through the first steps of the competition. Our goal here is to show you how to:

1.  Load the `train.jsonl` and `test.jsonl` files from the competition data.
2.  Create a very simple set of features from the data.
3.  Train a basic model.
4.  Generate a `submission.csv` file in the correct format.
5.  Submit your results.

Let's get started!

In [47]:
from typing import Any
import json
import os
from pprint import pprint
import numpy as np
import pandas as pd

from sklearn.preprocessing import StandardScaler

from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.svm import SVC

from xgboost import XGBClassifier

from sklearn.metrics import make_scorer, accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
from sklearn.model_selection import KFold, StratifiedKFold, cross_validate

pd.set_option("display.max_columns", 0)

### 1. Loading and Inspecting the Data

When you create a notebook within a Kaggle competition, the competition's data is automatically attached and available in the `../input/` directory.

The dataset is in a `.jsonl` format, which means each line is a separate JSON object. This is great because we can process it one line at a time without needing to load the entire large file into memory.

Let's write a simple loop to load the training data and inspect the first battle.

In [48]:
COMPETITION_NAME = 'fds-pokemon-battles-prediction-2025'
DATA_PATH = os.getcwd() #os.path.join('../input', COMPETITION_NAME)

train_file_path = os.path.join(DATA_PATH, 'train.jsonl')
test_file_path = os.path.join(DATA_PATH, 'test.jsonl')

print(f"Loading data from '{train_file_path}'...")
try:
    with open(train_file_path, 'r', encoding="utf-8") as f:
        train_data = [json.loads(line) for line in f]

    print(f"Successfully loaded {len(train_data)} battles.")

    #print("\n--- Structure of the first train battle: ---")
    if train_data:
        first_battle = train_data[0]
        
        battle_for_display = first_battle.copy()
        battle_for_display['battle_timeline'] = battle_for_display.get('battle_timeline', [])[:2] # Show first 2 turns
        
        #pprint(battle_for_display)
        if len(first_battle.get('battle_timeline', [])) > 3:
            print("    ...")
            print("    (battle_timeline has been truncated for display)")

except FileNotFoundError:
    print(f"ERROR: Could not find the training file at '{train_file_path}'.")
    print("Please make sure you have added the competition data to this notebook.")

Loading data from 'c:\Users\stefa\PycharmProjects\pokemon-challenge\train.jsonl'...
Successfully loaded 10000 battles.
    ...
    (battle_timeline has been truncated for display)


### 2. Basic Feature Engineering

A successful model will likely require creating many complex features. For this starter notebook, however, we will create a very simple feature set based **only on the initial team stats**. This will be enough to train a model and generate a submission file.

It's up to you to engineer more powerful features!

In [49]:
def features_check(data: dict) -> None:
    print("All battles have at least one turn: ", all(all(turn for turn in battle.get('battle_timeline', False)) for battle in data))
    print("All battles' turns have at least one P1 move: ", 
        all((
            any((turn.get("p1_move_details", False) for turn in battle.get('battle_timeline', False))) for battle in data
        ))
    )
    print("All battles' turns have at least one P2 move: ", 
        all((
            any((turn.get("p2_move_details", False) for turn in battle.get('battle_timeline', False))) for battle in data
        ))
    )
    print("player_won feature always exists: ", all(('player_won' in battle for battle in data)))
    print("P1 Team always exists: ", all(battle.get('p1_team_details', False) for battle in data))
    print("P2 Team always exists: ", all(battle.get('p2_team_details', False) for battle in data))
    
    return None

In [114]:
def agg_pokemons_stats(prefix: str, stats: dict[str, Any]):
    return {
        f"{prefix}_mean_power": np.mean(stats["powers"]) if stats["powers"] else 0,
        f"{prefix}_mean_accuracy": np.mean(stats["accuracy"]) if stats["accuracy"] else 0,
        f"{prefix}_lost_hp": stats["lost_hp"],
        f"{prefix}_turns_statused": stats["turns_statused"],
        f"{prefix}_missed_turns": stats["missed_turns"],
        f"{prefix}_switches": stats["switches"],
        f"{prefix}_net_boost": stats["net_boost"],
    }

In [115]:
def create_features(data: list[dict]) -> pd.DataFrame:
    feature_list = []
    
    features_check(data)

    for battle in data:
        features = {}

        p1_stats = {
            "powers": [], "accuracy": [], "hp_t0": {}, "lost_hp": 0, "turns_statused": 0,
            "missed_turns": 0, "priority": 0, "switches": 0, "net_boost": 0,
            "base_boosts": {"atk": 0, "def": 0, "spa": 0, "spd": 0, "spe": 0}
        }
        
        p2_stats = {
            "powers": [], "accuracy": [], "hp_t0": {}, "lost_hp": 0, "turns_statused": 0,
            "missed_turns": 0, "priority": 0, "switches": 0, "net_boost": 0,
            "base_boosts": {"atk": 0, "def": 0, "spa": 0, "spd": 0, "spe": 0}
        }

        # --- Initial Pokémon states ---
        first_turn = battle["battle_timeline"][0]
        p1_lead = first_turn.get("p1_pokemon_state", {}).get("name", "")
        p2_lead = battle.get("p2_lead_details", {}).get("name", "")
        p1_prev_hp = first_turn.get("p1_pokemon_state", {}).get("hp_pct", 1.0)
        p2_prev_hp = first_turn.get("p2_pokemon_state", {}).get("hp_pct", 1.0)
        
        # --- Player 1 Team Features ---
        p1_team = battle.get('p1_team_details', None)
        features['p1_mean_hp'] = np.mean([p.get('base_hp') for p in p1_team])
        features['p1_mean_spe'] = np.mean([p.get('base_spe') for p in p1_team])
        features['p1_mean_atk'] = np.mean([p.get('base_atk') for p in p1_team])
        features['p1_mean_def'] = np.mean([p.get('base_def') for p in p1_team])

        # --- Player 2 Lead Features ---
        if p2_lead := battle.get('p2_lead_details'):
            # Player 2's lead Pokémon's stats
            features['p2_lead_hp'] = p2_lead.get('base_hp')
            features['p2_lead_spe'] = p2_lead.get('base_spe')
            features['p2_lead_atk'] = p2_lead.get('base_atk')
            features['p2_lead_def'] = p2_lead.get('base_def')
        
        # --- Battle Timeline Features ---
        if timeline := battle.get('battle_timeline', []):
            turns = len(timeline)
            p1_names = [t['p1_pokemon_state']['name'] for t in timeline if t.get('p1_pokemon_state')]
            p1_moves = [t['p1_move_details']['name'] for t in timeline if t.get('p1_move_details')]
            p2_names = [t['p2_pokemon_state']['name'] for t in timeline if t.get('p2_pokemon_state')]

            # Number of turns and unique Pokémon
            features['n_turns'] = turns
            features['p1_unique_pokemon'] = len(set(p1_names))
            #features['p1_unique_moves'] = len(set(p1_moves))
            features['p2_unique_pokemon'] = len(set(p2_names))

            # Compute damage dealt (approximate)
            # delta HP of opponent between turns
            p2_hp_deltas = []
            for t, t_stats in enumerate(timeline):
                p1_state = t_stats.get("p1_pokemon_state", {})
                p2_state = t_stats.get("p2_pokemon_state", {})

                # --- Moves and accuracy ---
                for player, stats, move_key in [
                    ("p1", p1_stats, "p1_move_details"),
                    ("p2", p2_stats, "p2_move_details")
                ]:
                    move = t_stats.get(move_key)
                    if move:
                        stats["powers"].append(move.get("base_power", 0))
                        stats["accuracy"].append(move.get("accuracy", 0))
                    else:
                        stats["missed_turns"] += 1

                # --- Status tracking ---
                if p1_state.get("status") != "nostatus":
                    p1_stats["turns_statused"] += 1
                if p2_state.get("status") != "nostatus":
                    p2_stats["turns_statused"] += 1
                
                # --- HP and damage tracking ---
                p1_name, p2_name = p1_state.get("name", ""), p2_state.get("name", "")
                p1_hp, p2_hp = p1_state.get("hp_pct", 1.0), p2_state.get("hp_pct", 1.0)

                # --- Switches ---
                if p1_name != p1_lead and p1_prev_hp > 0:
                    p1_stats["switches"] += 1

                if p2_name != p2_lead and p2_prev_hp > 0:
                    p2_stats["switches"] += 1

                # --- Boost tracking ---
                if p1_name != p1_lead:
                    p1_stats["base_boosts"] = {k: 0 for k in p1_stats["base_boosts"]}
                if p2_name != p2_lead:
                    p2_stats["base_boosts"] = {k: 0 for k in p2_stats["base_boosts"]}

                p1_boosts = p1_state.get("boosts", {})
                p2_boosts = p2_state.get("boosts", {})

                for stat in ["atk", "def", "spa", "spd", "spe"]:
                    p1_stats["net_boost"] += (p1_boosts.get(stat, 0) - p1_stats["base_boosts"].get(stat, 0))
                    p2_stats["net_boost"] += (p2_boosts.get(stat, 0) - p2_stats["base_boosts"].get(stat, 0))
                
                prev_hp = timeline[t-1]['p2_pokemon_state']['hp_pct']
                curr_hp = timeline[t]['p2_pokemon_state']['hp_pct']
                p2_hp_deltas.append(prev_hp - curr_hp)
            features['mean_damage_dealt'] = np.mean([d for d in p2_hp_deltas if d > 0]) if p2_hp_deltas else None

            # Final HP and KO counts
            last_state = timeline[-1]['p1_pokemon_state']
            features['final_p1_hp'] = last_state.get('hp_pct', None)
            features['p1_fainted_count'] = sum(t['p1_pokemon_state']['status'] == 'fnt' for t in timeline)
            features['p2_fainted_count'] = sum(t['p2_pokemon_state']['status'] == 'fnt' for t in timeline)
            
        else:
            features.update({
                'n_turns': None,
                'p1_unique_pokemon': None,
                'p1_unique_moves': None,
                'p2_unique_pokemon': None,
                'mean_damage_dealt': None,
                'final_p1_hp': None,
                'p1_fainted_count': None,
                'p2_fainted_count': None,
            })

        features.update(agg_pokemons_stats("p1", p1_stats))
        features.update(agg_pokemons_stats("p2", p2_stats))


        features['battle_id'] = battle.get('battle_id')
        if 'player_won' in battle:
            features['player_won'] = int(battle['player_won'])
            
        feature_list.append(features)
        
    return pd.DataFrame(feature_list).fillna(0)

print("Processing training data...")
train_df = create_features(train_data)

print("\nProcessing test data...")
with open(test_file_path, 'r', encoding="utf-8") as f:
    test_data = [json.loads(line) for line in f]
        
test_df = create_features(test_data)

print("\nTraining dataset preview:")
display(train_df.head())
display(train_df.describe())
display(train_df.dtypes)

print("\nTesting dataset preview:")
display(test_df.head())
display(test_df.describe())
display(test_df.dtypes)

Processing training data...
All battles have at least one turn:  True
All battles' turns have at least one P1 move:  False
All battles' turns have at least one P2 move:  False
player_won feature always exists:  True
P1 Team always exists:  True
P2 Team always exists:  False


  return _methods._mean(a, axis=axis, dtype=dtype,
  ret = ret.dtype.type(ret / rcount)



Processing test data...
All battles have at least one turn:  True
All battles' turns have at least one P1 move:  False
All battles' turns have at least one P2 move:  False
player_won feature always exists:  False
P1 Team always exists:  True
P2 Team always exists:  False

Training dataset preview:


Unnamed: 0,p1_mean_hp,p1_mean_spe,p1_mean_atk,p1_mean_def,p2_lead_hp,p2_lead_spe,p2_lead_atk,p2_lead_def,n_turns,p1_unique_pokemon,p2_unique_pokemon,mean_damage_dealt,final_p1_hp,p1_fainted_count,p2_fainted_count,p1_mean_power,p1_mean_accuracy,p1_lost_hp,p1_turns_statused,p1_missed_turns,p1_switches,p1_net_boost,p2_mean_power,p2_mean_accuracy,p2_lost_hp,p2_turns_statused,p2_missed_turns,p2_switches,p2_net_boost,battle_id,player_won
0,115.833333,80.0,72.5,63.333333,60,115,75,85,30,4,4,0.292968,0.291022,1,1,57.592593,0.925926,0,7,3,22,0,68.75,0.9875,0,17,14,30,-4,0,1
1,123.333333,61.666667,72.5,65.833333,55,120,50,45,30,6,6,0.191667,0.45,3,0,87.652174,0.963043,0,11,7,23,0,63.478261,0.969565,0,5,7,30,-4,1,1
2,124.166667,65.833333,84.166667,71.666667,250,50,5,5,30,3,4,0.26,0.52,1,0,35.222222,0.944444,0,15,3,14,10,54.318182,0.943182,0,14,8,30,-6,2,1
3,121.666667,75.833333,77.5,65.833333,75,110,100,95,30,5,4,0.336667,0.04,3,0,67.608696,0.954348,0,20,7,27,0,89.8,0.94,0,5,5,30,0,3,1
4,114.166667,72.5,75.833333,79.166667,60,115,75,85,30,5,5,0.351818,1.0,1,0,34.961538,0.990385,0,9,4,17,-4,47.115385,0.971154,0,22,4,30,0,4,1


Unnamed: 0,p1_mean_hp,p1_mean_spe,p1_mean_atk,p1_mean_def,p2_lead_hp,p2_lead_spe,p2_lead_atk,p2_lead_def,n_turns,p1_unique_pokemon,p2_unique_pokemon,mean_damage_dealt,final_p1_hp,p1_fainted_count,p2_fainted_count,p1_mean_power,p1_mean_accuracy,p1_lost_hp,p1_turns_statused,p1_missed_turns,p1_switches,p1_net_boost,p2_mean_power,p2_mean_accuracy,p2_lost_hp,p2_turns_statused,p2_missed_turns,p2_switches,p2_net_boost,battle_id,player_won
count,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0,10000.0
mean,113.124317,75.772917,77.711017,70.703667,68.2016,104.4755,63.1382,58.6147,30.0,5.2092,5.1672,0.321432,0.588801,2.404,0.3502,59.861355,0.956606,0.0,12.8262,6.4095,22.7847,3.0507,60.226817,0.957072,0.0,11.1331,6.5131,29.742,4.2357,4999.5,0.5
std,13.405444,8.116724,7.118607,9.887678,29.326111,21.457829,18.474405,22.24854,0.0,0.860993,0.876425,0.094029,0.362657,1.468468,0.583433,19.160794,0.037278,0.0,5.346031,3.073948,5.510116,22.544879,19.09124,0.038735,0.0,5.625349,3.064175,2.770235,25.865522,2886.89568,0.500025
min,63.333333,46.666667,55.833333,48.333333,50.0,30.0,5.0,5.0,30.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-58.0,0.0,0.0,0.0,0.0,0.0,0.0,-60.0,0.0,0.0
25%,109.166667,69.166667,72.5,63.333333,55.0,95.0,50.0,45.0,30.0,5.0,5.0,0.258571,0.280962,1.0,0.0,47.2,0.940741,0.0,9.0,4.0,21.0,-2.0,47.606522,0.941304,0.0,7.0,4.0,30.0,-2.0,2499.75,0.0
50%,116.666667,75.833333,75.833333,69.166667,60.0,115.0,65.0,60.0,30.0,5.0,5.0,0.315,0.66,2.0,0.0,60.809524,0.964583,0.0,13.0,6.0,24.0,0.0,61.538462,0.964583,0.0,11.0,6.0,30.0,0.0,4999.5,0.5
75%,121.666667,80.0,81.666667,75.833333,65.0,120.0,75.0,85.0,30.0,6.0,6.0,0.374555,1.0,3.0,1.0,73.695652,0.982,0.0,17.0,8.0,26.0,0.0,73.958333,0.981818,0.0,15.0,8.0,30.0,0.0,7499.25,1.0
max,135.833333,110.833333,110.666667,112.5,250.0,130.0,134.0,180.0,30.0,6.0,6.0,1.0,1.0,6.0,4.0,134.615385,1.0,0.0,30.0,30.0,29.0,324.0,143.0,1.0,0.0,28.0,30.0,30.0,344.0,9999.0,1.0


p1_mean_hp           float64
p1_mean_spe          float64
p1_mean_atk          float64
p1_mean_def          float64
p2_lead_hp             int64
p2_lead_spe            int64
p2_lead_atk            int64
p2_lead_def            int64
n_turns                int64
p1_unique_pokemon      int64
p2_unique_pokemon      int64
mean_damage_dealt    float64
                      ...   
p1_missed_turns        int64
p1_switches            int64
p1_net_boost           int64
p2_mean_power        float64
p2_mean_accuracy     float64
p2_lost_hp             int64
p2_turns_statused      int64
p2_missed_turns        int64
p2_switches            int64
p2_net_boost           int64
battle_id              int64
player_won             int64
Length: 31, dtype: object


Testing dataset preview:


Unnamed: 0,p1_mean_hp,p1_mean_spe,p1_mean_atk,p1_mean_def,p2_lead_hp,p2_lead_spe,p2_lead_atk,p2_lead_def,n_turns,p1_unique_pokemon,p2_unique_pokemon,mean_damage_dealt,final_p1_hp,p1_fainted_count,p2_fainted_count,p1_mean_power,p1_mean_accuracy,p1_lost_hp,p1_turns_statused,p1_missed_turns,p1_switches,p1_net_boost,p2_mean_power,p2_mean_accuracy,p2_lost_hp,p2_turns_statused,p2_missed_turns,p2_switches,p2_net_boost,battle_id
0,117.5,78.333333,74.166667,61.666667,65,130,65,60,30,5,5,0.352222,1.0,4,0,58.55,0.9875,0,19,10,24,0,73.333333,0.94375,0,17,6,30,40,0
1,70.166667,95.833333,95.666667,96.666667,55,120,50,45,30,4,6,0.154615,1.0,1,0,53.888889,0.87963,0,4,3,28,82,85.0,1.0,0,13,18,30,0,1
2,120.0,61.666667,90.833333,88.333333,55,120,50,45,30,5,6,0.098413,1.0,1,0,35.0,0.822,0,1,5,25,0,51.4,0.946,0,17,5,30,12,2
3,114.166667,71.666667,70.0,71.666667,160,30,110,65,30,3,5,0.361429,0.32,0,0,33.821429,0.9875,0,19,2,22,-4,73.541667,0.983333,0,17,6,30,-4,3
4,116.666667,78.333333,75.0,65.833333,60,110,65,60,30,5,6,0.431722,0.189802,3,1,74.375,0.977083,0,11,6,23,-2,64.5,0.93125,0,9,6,30,-2,4


Unnamed: 0,p1_mean_hp,p1_mean_spe,p1_mean_atk,p1_mean_def,p2_lead_hp,p2_lead_spe,p2_lead_atk,p2_lead_def,n_turns,p1_unique_pokemon,p2_unique_pokemon,mean_damage_dealt,final_p1_hp,p1_fainted_count,p2_fainted_count,p1_mean_power,p1_mean_accuracy,p1_lost_hp,p1_turns_statused,p1_missed_turns,p1_switches,p1_net_boost,p2_mean_power,p2_mean_accuracy,p2_lost_hp,p2_turns_statused,p2_missed_turns,p2_switches,p2_net_boost,battle_id
count,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0
mean,112.775633,75.915333,77.737933,70.8491,68.4294,104.683,62.7956,58.493,30.0,5.2178,5.185,0.318708,0.587873,2.4084,0.3412,59.878691,0.957001,0.0,12.7122,6.4236,22.8206,3.5818,60.542835,0.956858,0.0,11.2052,6.5288,29.718,3.5304,2499.5
std,13.591369,8.136431,7.172697,9.797174,30.67828,20.984103,18.341818,22.215865,0.0,0.861227,0.880074,0.093353,0.363823,1.470388,0.56608,19.190063,0.037974,0.0,5.444199,3.067123,5.437235,24.274562,19.026479,0.037411,0.0,5.732893,3.059321,2.895195,23.894787,1443.520003
min,63.333333,46.666667,57.5,49.166667,50.0,30.0,5.0,5.0,30.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-50.0,0.0,0.0,0.0,0.0,0.0,0.0,-116.0,0.0
25%,108.333333,69.166667,72.5,63.333333,55.0,95.0,50.0,45.0,30.0,5.0,5.0,0.256667,0.28,1.0,0.0,46.991071,0.94115,0.0,9.0,4.0,21.0,-2.0,48.0,0.940909,0.0,7.0,5.0,30.0,-2.0,1249.75
50%,116.666667,76.666667,75.833333,69.166667,60.0,115.0,65.0,60.0,30.0,5.0,5.0,0.31,0.66,2.0,0.0,60.652174,0.965,0.0,13.0,6.0,24.0,0.0,62.083333,0.964,0.0,11.0,6.0,30.0,0.0,2499.5
75%,121.666667,80.0,81.666667,75.833333,65.0,120.0,75.0,85.0,30.0,6.0,6.0,0.371232,1.0,3.0,1.0,73.697665,0.982609,0.0,17.0,8.0,26.0,0.0,74.5,0.981818,0.0,15.0,8.0,30.0,0.0,3749.25
max,135.833333,116.666667,105.666667,112.5,250.0,130.0,134.0,180.0,30.0,6.0,6.0,0.88,1.0,6.0,4.0,128.333333,1.0,0.0,28.0,30.0,29.0,344.0,119.62963,1.0,0.0,29.0,30.0,30.0,344.0,4999.0


p1_mean_hp           float64
p1_mean_spe          float64
p1_mean_atk          float64
p1_mean_def          float64
p2_lead_hp             int64
p2_lead_spe            int64
p2_lead_atk            int64
p2_lead_def            int64
n_turns                int64
p1_unique_pokemon      int64
p2_unique_pokemon      int64
mean_damage_dealt    float64
                      ...   
p1_turns_statused      int64
p1_missed_turns        int64
p1_switches            int64
p1_net_boost           int64
p2_mean_power        float64
p2_mean_accuracy     float64
p2_lost_hp             int64
p2_turns_statused      int64
p2_missed_turns        int64
p2_switches            int64
p2_net_boost           int64
battle_id              int64
Length: 30, dtype: object

In [52]:
scaler = StandardScaler(with_mean=True, with_std=True)


### 3. Training Models

In [110]:
# Define predictor features (X) and target (y)
features = [col for col in train_df.columns if col not in ['battle_id', 'player_won']]
X_train = train_df[features]
y_train = train_df['player_won']

X_test = test_df[features]

print("Training...")
model = XGBClassifier(
    random_state=100,
    n_estimators=200,
    learning_rate=0.05,
    max_depth=3,
    eval_metric='logloss',
    n_jobs=-1
)
model.fit(X_train, y_train)
print("Model training complete.")

Training...
Model training complete.


In [111]:
cv_results = cross_validate(
    model,
    X_train,
    y_train,
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=100),
    scoring={
        "accuracy_score": make_scorer(accuracy_score),
        "precision_score": make_scorer(precision_score),
        "recall_score": make_scorer(recall_score),
        "f1_score": make_scorer(f1_score),
        "roc_auc_score": make_scorer(roc_auc_score)
    },
    return_train_score=True,
    n_jobs=-1
)

results_df = pd.DataFrame(cv_results)
display(results_df)

Unnamed: 0,fit_time,score_time,test_accuracy_score,train_accuracy_score,test_precision_score,train_precision_score,test_recall_score,train_recall_score,test_f1_score,train_f1_score,test_roc_auc_score,train_roc_auc_score
0,0.728911,0.050324,0.803,0.837333,0.803607,0.837183,0.802,0.837556,0.802803,0.837369,0.803,0.837333
1,0.585918,0.046988,0.811,0.838333,0.817996,0.839768,0.8,0.836222,0.808898,0.837991,0.811,0.838333
2,0.526318,0.034729,0.817,0.837556,0.815109,0.838458,0.82,0.836222,0.817547,0.837339,0.817,0.837556
3,0.6294,0.036181,0.804,0.838778,0.808943,0.839457,0.796,0.837778,0.802419,0.838616,0.804,0.838778
4,0.520566,0.039515,0.833,0.837556,0.832335,0.837856,0.834,0.837111,0.833167,0.837483,0.833,0.837556
5,0.76131,0.033772,0.833,0.836444,0.841889,0.837494,0.82,0.834889,0.8308,0.83619,0.833,0.836444
6,0.525335,0.036054,0.821,0.837444,0.828221,0.836771,0.81,0.838444,0.819009,0.837607,0.821,0.837444
7,0.753355,0.037128,0.835,0.835,0.831683,0.83627,0.84,0.833111,0.835821,0.834688,0.835,0.835
8,0.383168,0.02925,0.83,0.836111,0.818533,0.837536,0.848,0.834,0.833006,0.835764,0.83,0.836111
9,0.369893,0.019873,0.832,0.835667,0.833333,0.836489,0.83,0.834444,0.831663,0.835466,0.832,0.835667


### 4. Creating the Submission File

The competition requires a `.csv` file with two columns: `battle_id` and `player_won`. Let's use our trained model to make predictions on the test set and format them correctly.

In [55]:
print("Generating predictions on the test set...")
submission_df = pd.DataFrame({
    'battle_id': test_df['battle_id'],
    'player_won': model.predict(X_test)
})

submission_df.to_csv('submission.csv', index=False)

print("\n'submission.csv' file created successfully!")
display(submission_df.head())

Generating predictions on the test set...

'submission.csv' file created successfully!


Unnamed: 0,battle_id,player_won
0,0,0
1,1,1
2,2,1
3,3,1
4,4,1


### 5. Submitting Your Results

Once you have generated your `submission.csv` file, there are two primary ways to submit it to the competition.

---

#### Method A: Submitting Directly from the Notebook

This is the standard method for code competitions. It ensures that your submission is linked to the code that produced it, which is crucial for reproducibility.

1.  **Save Your Work:** Click the **"Save Version"** button in the top-right corner of the notebook editor.
2.  **Run the Notebook:** In the pop-up window, select **"Save & Run All (Commit)"** and then click the **"Save"** button. This will run your entire notebook from top to bottom and save the output, including your `submission.csv` file.
3.  **Go to the Viewer:** Once the save process is complete, navigate to the notebook viewer page. 
4.  **Submit to Competition:** In the viewer, find the **"Submit to Competition"** section. This is usually located in the header of the output section or in the vertical "..." menu on the right side of the page. Clicking the **Submit** button this will submit your generated `submission.csv` file.

After submitting, you will see your score in the **"Submit to Competition"** section or in the [Public Leaderboard](https://www.kaggle.com/competitions/fds-pokemon-battles-prediction-2025/leaderboard?).

---

#### Method B: Manual Upload

You can also generate your predictions and submission file using any environment you prefer (this notebook, Google Colab, or your local machine).

1.  **Generate the `submission.csv` file** using your model.
2.  **Download the file** to your computer.
3.  **Navigate to the [Leaderboard Page](https://www.kaggle.com/competitions/fds-pokemon-battles-prediction-2025/leaderboard?)** and click on the **"Submit Predictions"** button.
4.  **Upload Your File:** Drag and drop or select your `submission.csv` file to upload it.

This method is quick, but keep in mind that for the final evaluation, you might be required to provide the code that generated your submission.

Good luck!