# Comparing Models: Draft-Only vs Draft + Early Objectives

In this notebook we:
- Load the preprocessed draft features and labels
- Build Model A (draft only)
- Extend the features with first-objective indicators (first blood, tower, inhibitor, Baron, dragon, Rift Herald)
- Build Model B (draft + objectives)
- Compare accuracy and ROC-AUC on a shared test split

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score, classification_report

# Load draft features + labels
X_draft = pd.read_csv("../data/features.csv")
y = pd.read_csv("../data/labels.csv").squeeze()

# Load original dataset for objectives
df = pd.read_csv("../data/games.csv")

print("Draft feature matrix:", X_draft.shape)
print("Labels:", y.shape)

Draft feature matrix: (51490, 566)
Labels: (51490,)


## 1. Add Objective Features
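For each objective we add two binary indicators, one per team, marking which team took it first. With six objectives this adds 12 columns, taking the feature matrix from 566 to 578 columns.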

In [2]:
objectives = ['firstBlood','firstTower','firstInhibitor','firstBaron','firstDragon','firstRiftHerald']

X_obj = X_draft.copy()

for obj in objectives:
    if obj in df.columns:
        X_obj[f"{obj}_team1"] = (df[obj] == 1).astype(int)
        X_obj[f"{obj}_team2"] = (df[obj] == 2).astype(int)

print("Draft+Objectives feature matrix:", X_obj.shape)

Draft+Objectives feature matrix: (51490, 578)
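
The encoding above can be sketched on a toy frame to make the resulting columns concrete. This is a minimal illustration, not the notebook's own helper: `add_objective_flags`, the toy data, and the reading of `0` as "neither team" are assumptions. The index check is there because pandas aligns on the index when assigning a Series as a column, so mismatched indices would silently produce misaligned or missing values.

```python
import pandas as pd

# Toy stand-in for games.csv. The 0/1/2 coding (0 = neither team) is an assumption.
toy_games = pd.DataFrame({"firstBlood": [1, 2, 0, 1], "firstBaron": [0, 2, 2, 1]})
toy_features = pd.DataFrame({"draft_feature": [0.1, 0.2, 0.3, 0.4]})


def add_objective_flags(features, games, objectives):
    """Return a copy of `features` with <objective>_team1/_team2 indicator columns."""
    # Column assignment aligns on the index, so both frames must share it.
    if not features.index.equals(games.index):
        raise ValueError("features and games must share the same row index")
    out = features.copy()
    for obj in objectives:
        if obj in games.columns:  # objectives missing from the data are skipped
            out[f"{obj}_team1"] = (games[obj] == 1).astype(int)
            out[f"{obj}_team2"] = (games[obj] == 2).astype(int)
    return out


encoded = add_objective_flags(
    toy_features, toy_games, ["firstBlood", "firstBaron", "firstDragon"]
)
print(encoded)
```

A row where neither indicator is set corresponds to a game in which the objective value was neither 1 nor 2.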


## 2. Train/Test Split
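Both models use the same split so the comparison is fair. The two calls below share `y`, `stratify`, `test_size` and `random_state`, so they select the same rows, and `y_train`/`y_test` from the first call apply to both feature matrices.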

In [3]:
X_train_A, X_test_A, y_train, y_test = train_test_split(
    X_draft, y, test_size=0.2, stratify=y, random_state=42
)

X_train_B, X_test_B, _, _ = train_test_split(
    X_obj, y, test_size=0.2, stratify=y, random_state=42
)

print("Draft-only train size:", X_train_A.shape)
print("Draft+Objectives train size:", X_train_B.shape)

Draft-only train size: (41192, 566)
Draft+Objectives train size: (41192, 578)
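
Matching shapes do not by themselves prove that the two splits contain the same games. A quick check, sketched here on synthetic data, is to compare the row indices of the resulting frames. The variable names and sizes below are illustrative stand-ins, not the notebook's data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for X_draft, X_obj and y (names and sizes are illustrative).
rng = np.random.default_rng(0)
n = 200
X_small = pd.DataFrame(rng.normal(size=(n, 3)), columns=["a", "b", "c"])
X_big = X_small.assign(extra=rng.integers(0, 2, size=n))
y_toy = pd.Series(rng.integers(0, 2, size=n))

split_args = dict(test_size=0.2, stratify=y_toy, random_state=42)

# Two separate calls with identical arguments.
Xs_train, Xs_test, y_train_s, y_test_s = train_test_split(X_small, y_toy, **split_args)
Xb_train, Xb_test, y_train_b, y_test_b = train_test_split(X_big, y_toy, **split_args)

# The splits are row-aligned if the indices match, in the same order.
aligned = Xs_test.index.equals(Xb_test.index) and Xs_train.index.equals(Xb_train.index)
print("Row-aligned splits:", aligned)
```

Passing both feature matrices to a single `train_test_split` call, which accepts any number of arrays, gives the same alignment by construction.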


## 3. Train Both Models
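Both models are random forest classifiers with identical hyperparameters (200 trees, maximum depth 12, fixed seed), so any difference in performance comes from the features.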

In [4]:
clf_A = RandomForestClassifier(n_estimators=200, max_depth=12, random_state=42, n_jobs=-1)
clf_B = RandomForestClassifier(n_estimators=200, max_depth=12, random_state=42, n_jobs=-1)

clf_A.fit(X_train_A, y_train)
clf_B.fit(X_train_B, y_train)

print("Models trained.")

Models trained.


## 4. Evaluation
We compare accuracy and ROC-AUC for both models on the held-out test set (10,298 games).

In [5]:
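# Hard class predictions feed accuracy; the probability of the second class
# (clf.classes_[1]) feeds ROC-AUC.
y_pred_A = clf_A.predict(X_test_A)
y_proba_A = clf_A.predict_proba(X_test_A)[:, 1]

y_pred_B = clf_B.predict(X_test_B)
y_proba_B = clf_B.predict_proba(X_test_B)[:, 1]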

print("Model A (Draft only):")
print("Accuracy:", accuracy_score(y_test, y_pred_A))
print("ROC AUC:", roc_auc_score(y_test, y_proba_A))

print("\nModel B (Draft + Objectives):")
print("Accuracy:", accuracy_score(y_test, y_pred_B))
print("ROC AUC:", roc_auc_score(y_test, y_proba_B))

Model A (Draft only):
Accuracy: 0.5478733734705769
ROC AUC: 0.5652107140357883

Model B (Draft + Objectives):
Accuracy: 0.8954165857448049
ROC AUC: 0.9447076893651671
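
For a side-by-side view, the same two metrics can be collected into a small table. The sketch below uses a hypothetical helper `summarize` and hand-made toy predictions. In the notebook, the inputs would be `y_test` and the `y_pred_*` / `y_proba_*` arrays.

```python
import pandas as pd
from sklearn.metrics import accuracy_score, roc_auc_score


def summarize(y_true, results):
    """Build a metrics table from {model name: (hard predictions, positive-class scores)}."""
    rows = {}
    for name, (y_pred, y_score) in results.items():
        rows[name] = {
            "accuracy": accuracy_score(y_true, y_pred),
            "roc_auc": roc_auc_score(y_true, y_score),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


# Toy labels and predictions, for illustration only.
y_true = [0, 0, 1, 1]
results = {
    "weak": ([0, 1, 0, 1], [0.40, 0.60, 0.45, 0.70]),
    "strong": ([0, 0, 1, 1], [0.10, 0.20, 0.80, 0.90]),
}

summary = summarize(y_true, results)
print(summary)
```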


## 5. Report
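Adding the objective indicators raises test accuracy from about 0.548 to 0.895 and ROC-AUC from about 0.565 to 0.945. The draft-only ROC-AUC of 0.565 is only modestly above the 0.5 of a random ranker, while first-objective information is strongly predictive of the winner.

Two caveats apply. Objectives are observed during the game, so Model B is not a pre-game predictor and the two models answer different questions. The comparison also rests on a single train/test split, so it shows a large gap but does not by itself establish statistical significance.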
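
One way to put an uncertainty range on the accuracy gap is a paired bootstrap over the test set: resample test rows with replacement and recompute the difference in accuracy each time. The sketch below is a minimal illustration on synthetic predictions. The function name, the number of resamples, and the toy accuracies of roughly 55% and 90% are assumptions, not results from this notebook.

```python
import numpy as np


def paired_bootstrap_accuracy_diff(y_true, pred_a, pred_b, n_boot=2000, seed=0):
    """Return (mean, 2.5th pct, 97.5th pct) of accuracy(B) - accuracy(A) over resamples."""
    y_true = np.asarray(y_true)
    correct_a = (np.asarray(pred_a) == y_true).astype(float)
    correct_b = (np.asarray(pred_b) == y_true).astype(float)

    rng = np.random.default_rng(seed)
    n = len(y_true)
    # Each row of idx is one resample; the same rows are used for both models (paired).
    idx = rng.integers(0, n, size=(n_boot, n))
    diffs = correct_b[idx].mean(axis=1) - correct_a[idx].mean(axis=1)

    low, high = np.percentile(diffs, [2.5, 97.5])
    return float(diffs.mean()), float(low), float(high)


# Synthetic example: model A is right about 55% of the time, model B about 90%.
rng = np.random.default_rng(1)
y_toy = rng.integers(0, 2, size=1000)
pred_a = np.where(rng.random(1000) < 0.55, y_toy, 1 - y_toy)
pred_b = np.where(rng.random(1000) < 0.90, y_toy, 1 - y_toy)

mean_diff, low, high = paired_bootstrap_accuracy_diff(y_toy, pred_a, pred_b)
print(f"Accuracy gain: {mean_diff:.3f} (95% interval {low:.3f} to {high:.3f})")
```

In the notebook, the function would be called with `y_test`, `y_pred_A` and `y_pred_B`. An interval that excludes zero suggests the gain is not an artifact of which games landed in the test set.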