# Boosting Model Benchmark: LightGBM × XGBoost × CatBoost (F1-Score)


This notebook benchmarks three popular boosting frameworks on the Titanic dataset:

- LightGBM  
- XGBoost  
- CatBoost  

We use:
- Standardized preprocessing  
- F1-score as the main evaluation metric  
- Training time comparison  

This mirrors real-world model selection under cost, performance, and complexity constraints.
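
Since F1-score drives every comparison in this notebook, it helps to see how it is derived. The sketch below computes F1 for the positive class as the harmonic mean of precision and recall. The function name `f1_from_labels` and the toy labels are assumptions made for illustration; the notebook itself relies on `sklearn.metrics.f1_score`.

```python
def f1_from_labels(y_true, y_pred, positive=1):
    """Compute F1 for the positive class from paired label sequences."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up for illustration, not Titanic data)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
toy_f1 = f1_from_labels(y_true, y_pred)
print(f"F1 = {toy_f1:.2f}")
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and so is F1.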


In [None]:
import pandas as pd
import time
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from sklearn.preprocessing import LabelEncoder

import lightgbm as lgb
import xgboost as xgb
from catboost import CatBoostClassifier

df = pd.read_csv("/kaggle/input/titanic/train.csv")
df.head()


## Preprocessing
We apply simple, consistent preprocessing for all models:

- Drop columns with many missing values or little signal for this demo  
- Impute `Age` and `Embarked`  
- Label-encode categorical features  
- Train/validation split with stratification
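
To make the label-encoding step concrete, here is a minimal pure-Python sketch of the idea: each distinct category is assigned an integer according to the sorted order of the categories, which is also how `LabelEncoder` orders its classes. The helper `label_encode` is hypothetical and only illustrates the mapping.

```python
def label_encode(values):
    """Map each distinct value to an integer, using sorted order of the distinct values."""
    classes = sorted(set(values))
    mapping = {label: idx for idx, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping


# Toy port codes (made-up sample, same categories as the Embarked column)
codes, mapping = label_encode(["S", "C", "Q", "S", "C"])
print(mapping)
print(codes)
```

The integer codes imply an ordering (`C` < `Q` < `S`) that has no real meaning. Tree-based models can usually split around this, but it is worth keeping in mind when reusing the same preprocessing for other model families.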


In [None]:
# Drop columns with high missingness or low signal for this demo
df = df.drop(columns=['Cabin','Name','Ticket'])

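# Basic imputations: assign the result back instead of using inplace=True on a
# column selection, which newer pandas versions warn about and may not apply
df['Age'] = df['Age'].fillna(df['Age'].median())
df['Embarked'] = df['Embarked'].fillna('S')  # 'S' is the most frequent port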

# Encode categorical features
for col in ['Sex','Embarked']:
    df[col] = LabelEncoder().fit_transform(df[col])

# Features / target
X = df.drop(columns=['PassengerId','Survived'])
y = df['Survived']

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
X_train.shape, X_val.shape
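
The `stratify=y` argument keeps the class balance the same in both splits. The sketch below shows the idea in pure Python: split each class separately, then combine. It is an illustration under simplifying assumptions, not scikit-learn's actual implementation, and `stratified_split` is a hypothetical helper.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, val_idx) with roughly equal class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * test_size)  # validation share of this class
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy target: 40 positives and 60 negatives (made-up data)
labels = [1] * 40 + [0] * 60
train_idx, val_idx = stratified_split(labels)


def positive_rate(idx):
    """Fraction of positive labels among the given indices."""
    return sum(labels[i] for i in idx) / len(idx)


print(len(train_idx), len(val_idx))
print(positive_rate(train_idx), positive_rate(val_idx))
```

Both splits keep the 40% positive rate of the full toy target, which matters for F1 because the metric is sensitive to the share of positives in the validation set.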


## Training & Benchmarking

We train each model, compute F1-score on the validation set, and record training time (seconds).


In [None]:
results = {}

# LightGBM: time only the fit call so the recorded figure is training time
t0 = time.perf_counter()
lgb_model = lgb.LGBMClassifier(random_state=42)
lgb_model.fit(X_train, y_train)
train_time = time.perf_counter() - t0
pred = lgb_model.predict(X_val)
results['LightGBM'] = (f1_score(y_val, pred), train_time)

# XGBoost (use_label_encoder is omitted: it is deprecated in recent releases)
t0 = time.perf_counter()
xgb_model = xgb.XGBClassifier(
    eval_metric='logloss',
    random_state=42
)
xgb_model.fit(X_train, y_train)
train_time = time.perf_counter() - t0
pred = xgb_model.predict(X_val)
results['XGBoost'] = (f1_score(y_val, pred), train_time)

# CatBoost
t0 = time.perf_counter()
cb_model = CatBoostClassifier(verbose=0, random_state=42)
cb_model.fit(X_train, y_train)
train_time = time.perf_counter() - t0
pred = cb_model.predict(X_val)
results['CatBoost'] = (f1_score(y_val, pred), train_time)

results
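
A single wall-clock measurement on a dataset this small can be noisy. One option is to repeat the fit a few times and keep the fastest run. The helper `time_fit` and the stand-in workload below are assumptions for illustration; in the notebook the callable would wrap a real `fit` call.

```python
import time


def time_fit(fit_fn, repeats=3):
    """Call fit_fn `repeats` times and return the fastest wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fit_fn()
        durations.append(time.perf_counter() - start)
    return min(durations)


# Stand-in workload; in the notebook this could be
# lambda: lgb.LGBMClassifier(random_state=42).fit(X_train, y_train)
def dummy_fit():
    sum(i * i for i in range(100_000))


elapsed = time_fit(dummy_fit)
print(f"best of 3: {elapsed:.4f} s")
```

Taking the minimum reduces the influence of one-off delays such as library warm-up or background load. Reporting the mean or median would be an equally reasonable choice.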


## F1-Score Comparison


In [None]:
import matplotlib.pyplot as plt

models = list(results.keys())
scores = [results[m][0] for m in models]
times  = [results[m][1] for m in models]

plt.figure(figsize=(6,4))
plt.bar(models, scores)
plt.title("F1-Score by Model")
plt.ylabel("F1-Score")
plt.ylim(0,1)
plt.show()

for m in models:
    print(f"{m}: F1 = {results[m][0]:.3f} | Train time = {results[m][1]:.3f} s")
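
To read the two metrics together, the `(F1, time)` tuples can be ranked by F1-score, with training time as a tie-breaker. The sketch below uses a dictionary shaped like `results`, but the model names and numbers are placeholders invented for illustration and are not measured values.

```python
# Placeholder numbers for illustration only; they are NOT measured results.
example_results = {
    "model_a": (0.75, 0.30),
    "model_b": (0.78, 1.20),
    "model_c": (0.78, 0.45),
}

# Sort by F1 descending, then by training time ascending to break ties
ranking = sorted(example_results.items(), key=lambda kv: (-kv[1][0], kv[1][1]))

for name, (f1, seconds) in ranking:
    print(f"{name:<8} F1 = {f1:.3f} | Train time = {seconds:.3f} s")
```

With these placeholder values, `model_c` ranks first because it ties on F1 with `model_b` but trains faster.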


## Conclusion

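All three boosting frameworks perform well on the Titanic dataset with default hyperparameters. Because the dataset is small (891 rows) and we evaluate on a single 80/20 split, treat small differences in F1-score and training time as indicative rather than conclusive.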

**LightGBM**  
- Very fast training time  
- Strong F1-score  
- Great choice when latency and scale matter  

**XGBoost**  
- Stable, widely adopted  
- Rich ecosystem and documentation  

**CatBoost**  
- Handles categorical features natively (not exercised here, because `Sex` and `Embarked` are label-encoded before training)  
- Often competitive with minimal tuning  

In real projects, the final choice usually balances:
- Performance (F1-score and other metrics)  
- Training / inference time  
- Implementation complexity  
- Infrastructure constraints  

This benchmark provides a reproducible starting point for that decision.
