# Hyperparameter Tuning

This notebook tunes the hyperparameters of both the LightGBM and CatBoost models on both the standard and complex name sets.

We will tune each model on each data set and then do a final comparison. The results will inform both the best parameter choices and the best model choice.

In [1]:
!pip install lightgbm
!pip install sqlalchemy
!pip install catboost
!pip install optuna

Collecting lightgbm
  Downloading lightgbm-4.6.0-py3-none-manylinux_2_28_x86_64.whl.metadata (17 kB)
Downloading lightgbm-4.6.0-py3-none-manylinux_2_28_x86_64.whl (3.6 MB)
Installing collected packages: lightgbm
Successfully installed lightgbm-4.6.0
Collecting sqlalchemy
  Downloading sqlalchemy-2.0.41-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting greenlet>=1 (from sqlalchemy)
  Downloading greenlet-3.2.3-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (4.1 kB)
Downloading sqlalchemy-2.0.41-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)

In [2]:
import json
import gc
import re
import random
import matplotlib.pyplot as plt
import lightgbm as lgb
import pandas as pd
import numpy as np
import sqlite3
import optuna
from catboost import CatBoostRanker, Pool
from datetime import datetime
from collections import defaultdict
from sqlalchemy import select, func
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.metrics import ndcg_score

In [5]:
from google.colab import drive

drive.mount("/content/drive")

Mounted at /content/drive


## Standard Nameset

We will begin on the standard set.

In [6]:
# Paths to files in Drive
db_path = "/content/drive/MyDrive/Colab Notebooks/database.db"
train_ids_path = "/content/train_std_ids.csv"
val_ids_path = "/content/validation_std_ids.csv"

Pull the whole standard dataset from the SQLite database into a pandas DataFrame.

In [5]:
conn = sqlite3.connect(db_path)  # or your local path
full_df = pd.read_sql_query("SELECT * FROM feature_matrix", conn)

The training and validation IDs were saved to CSV files, so we load them here to split the data set.

In [6]:
train_ids = pd.read_csv(train_ids_path)["train_ids"].dropna().astype(int).tolist()
val_ids = pd.read_csv(val_ids_path)["validation_ids"].dropna().astype(int).tolist()

The validation set is pulled out of the full set by ID, and the remaining rows form the training set. This ensures that we don't train on the validation set.

In [7]:
val_ids_set = set(val_ids)
val_df = full_df[full_df["clean_row_id"].isin(val_ids_set)]
full_df = full_df[~full_df["clean_row_id"].isin(val_ids_set)]
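
A quick sanity check can confirm the split behaves as intended. The sketch below uses a small made-up frame (the IDs and the `check_split` helper are illustrative assumptions, not part of the notebook) and verifies that no `clean_row_id` appears on both sides.

```python
import pandas as pd


def check_split(train_df, val_df, id_col="clean_row_id"):
    """Return the set of IDs shared by both frames (empty when the split is clean)."""
    return set(train_df[id_col]) & set(val_df[id_col])


# Toy data standing in for the feature matrix (hypothetical values)
toy_df = pd.DataFrame({"clean_row_id": [1, 1, 2, 2, 3], "label": [1, 0, 0, 1, 1]})
toy_val_ids = {2}

# Same isin / ~isin pattern as the split above
toy_val = toy_df[toy_df["clean_row_id"].isin(toy_val_ids)]
toy_train = toy_df[~toy_df["clean_row_id"].isin(toy_val_ids)]

overlap = check_split(toy_train, toy_val)
print("Overlapping IDs:", overlap)
```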

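The ranking-metrics helper below was pulled from the initial model development notebook. It computes accuracy@1, recall@k and mean reciprocal rank (MRR) per `clean_row_id` group.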

In [13]:
def compute_ranking_metrics(df, k=3):
    # Group
    grouped = df.groupby("clean_row_id")

    # Get top 1 and calculate accuracy
    top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
    acc1 = (top1["label"] == 1).mean()

    # Same with recall
    topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
    recall_k = topk.groupby("clean_row_id")["label"].max().mean()

    # MRR (mean reciprocal rank)
    def reciprocal_rank(g):
        sorted_g = g.sort_values("score", ascending=False).reset_index()
        match = sorted_g[sorted_g["label"] == 1]
        return 1.0 / (match.index[0] + 1) if not match.empty else 0.0

    mrr = grouped.apply(reciprocal_rank).mean()
    return acc1, recall_k, mrr
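
To make the three metrics concrete, here is a small self-contained sketch on made-up scores. `toy_ranking_metrics` is a hypothetical, vectorized stand-in for the helper above (it is not the notebook's function); it ranks rows within each group and looks at the rank of the first positive label.

```python
import pandas as pd


def toy_ranking_metrics(df, k=3):
    """Accuracy@1, recall@k and MRR from the rank of the first positive per group."""
    ranked = df.sort_values(["clean_row_id", "score"], ascending=[True, False]).copy()
    ranked["rank"] = ranked.groupby("clean_row_id").cumcount() + 1

    # Rank of the best-placed positive in each group (NaN if a group has none)
    first_hit = ranked[ranked["label"] == 1].groupby("clean_row_id")["rank"].min()
    first_hit = first_hit.reindex(df["clean_row_id"].unique())

    acc1 = (first_hit == 1).mean()
    recall_k = (first_hit <= k).mean()
    mrr = (1.0 / first_hit).fillna(0.0).mean()
    return acc1, recall_k, mrr


# Two groups: the positive is ranked 1st in group 1 and 4th in group 2
toy = pd.DataFrame(
    {
        "clean_row_id": [1, 1, 1, 2, 2, 2, 2],
        "label": [1, 0, 0, 0, 0, 0, 1],
        "score": [0.9, 0.5, 0.1, 0.8, 0.7, 0.6, 0.2],
    }
)
acc1, recall3, mrr = toy_ranking_metrics(toy, k=3)
print(acc1, recall3, mrr)
```

Here accuracy@1 and recall@3 both come out at 0.5 (only group 1 has its positive in the top 3), and MRR is (1 + 1/4) / 2 = 0.625.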

We define the training and validation sets once here so that both the CatBoost and LightGBM tuning runs can reuse them.

In [9]:
# Training set
X_train = full_df.drop(
    columns=["label", "clean_row_id", "investor", "firm", "template_id"]
)
y_train = full_df["label"]
train_group_sizes = full_df.groupby("clean_row_id").size().tolist()

# Validation set
X_val = val_df.drop(
    columns=["label", "clean_row_id", "investor", "firm", "template_id"]
)
y_val = val_df["label"]
val_group_sizes = val_df.groupby("clean_row_id").size().tolist()
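
One thing worth checking here: `groupby(...).size()` returns sizes ordered by sorted `clean_row_id`, while LightGBM's `group` argument assumes that each group's rows are contiguous and appear in that same order. Whether `feature_matrix` is already stored that way is not shown in this notebook, so the following is only a defensive sketch; `sort_for_ranking` is a hypothetical helper and the data is made up.

```python
import pandas as pd


def sort_for_ranking(df, group_col="clean_row_id"):
    """Sort rows so each group is contiguous, then return matching group sizes."""
    # A stable sort keeps the original row order inside each group
    ordered = df.sort_values(group_col, kind="stable").reset_index(drop=True)
    sizes = ordered.groupby(group_col, sort=True).size().tolist()
    return ordered, sizes


# Toy frame with interleaved groups (hypothetical values)
toy = pd.DataFrame(
    {"clean_row_id": [2, 1, 2, 1, 3], "feature": [0.1, 0.2, 0.3, 0.4, 0.5]}
)
ordered, sizes = sort_for_ranking(toy)
print(ordered["clean_row_id"].tolist(), sizes)
```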

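We define a CatBoost objective function for Optuna to tune our parameters. Each trial trains a ranker with sampled hyperparameters and returns 1 - recall@3 on the validation set, which the study minimizes. We want a full sweep over the hyperparameters to make sure our model performs as well as it can.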

In [10]:
def catboost_objective(trial):
    params = {
        "loss_function": "YetiRank",
        "eval_metric": "NDCG:top=3",
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.2),
        "depth": trial.suggest_int("depth", 4, 10),
        "l2_leaf_reg": trial.suggest_float("l2_leaf_reg", 1, 10),
        "random_strength": trial.suggest_float("random_strength", 0, 10),
        "min_data_in_leaf": trial.suggest_int("min_data_in_leaf", 10, 100),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "colsample_bylevel": trial.suggest_float("colsample_bylevel", 0.5, 1.0),
        "grow_policy": trial.suggest_categorical(
            "grow_policy", ["SymmetricTree", "Lossguide"]
        ),
        "iterations": 100,
        "early_stopping_rounds": 5,
        "random_seed": 42,
        "verbose": False,
        "task_type": "CPU",
    }

    # CatBoost pools
    train_pool = Pool(
        data=X_train,
        label=y_train,
        group_id=np.repeat(np.arange(len(train_group_sizes)), train_group_sizes),
    )
    val_pool = Pool(
        data=X_val,
        label=y_val,
        group_id=np.repeat(np.arange(len(val_group_sizes)), val_group_sizes),
    )

    # Train
    model = CatBoostRanker(**params)
    model.fit(train_pool, eval_set=val_pool)

    # Score
    val_df_copy = val_df.copy()
    val_df_copy["score"] = model.predict(val_pool)
    acc1, recall3, mrr = compute_ranking_metrics(val_df_copy, k=3)

    return 1 - recall3  # minimize (maximize recall@3)
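
The `group_id` construction inside the pools can look opaque. As a small illustration (toy sizes, not the real data), `np.repeat` expands a list of group sizes into one integer ID per row, which is the per-row form passed to `Pool` above:

```python
import numpy as np

# Three query groups with 2, 1 and 3 candidate rows (made-up sizes)
toy_group_sizes = [2, 1, 3]

# One group ID per row, repeated according to each group's size
toy_group_id = np.repeat(np.arange(len(toy_group_sizes)), toy_group_sizes)
print(toy_group_id.tolist())  # [0, 0, 1, 2, 2, 2]
```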

And the same for LightGBM.

In [11]:
def lightgbm_objective(trial):
    params = {
        "objective": "lambdarank",
        "metric": "ndcg",
        "ndcg_eval_at": [3],
        "boosting_type": "gbdt",
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.2),
        "num_leaves": trial.suggest_int("num_leaves", 16, 128),
        "min_data_in_leaf": trial.suggest_int("min_data_in_leaf", 10, 100),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.5, 1.0),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.5, 1.0),
        "lambda_l1": trial.suggest_float("lambda_l1", 0, 5),
        "lambda_l2": trial.suggest_float("lambda_l2", 0, 5),
        "bagging_freq": trial.suggest_int("bagging_freq", 1, 5),
        "max_depth": trial.suggest_int("max_depth", 4, 12),
        "verbose": -1,
    }

    train_set = lgb.Dataset(X_train, label=y_train, group=train_group_sizes)
    val_set = lgb.Dataset(X_val, label=y_val, group=val_group_sizes)

    model = lgb.train(
        params,
        train_set,
        num_boost_round=100,
        valid_sets=[val_set],
        callbacks=[
            lgb.early_stopping(stopping_rounds=30),
            lgb.log_evaluation(period=0),
        ],
    )

    preds = model.predict(X_val, num_iteration=model.best_iteration)
    val_df_copy = val_df.copy()
    val_df_copy["score"] = preds

    acc1, recall3, mrr = compute_ranking_metrics(val_df_copy, k=3)
    return 1 - recall3

Let's start our sweep.

In [12]:
# Run the sweep
print("Running CatBoost tuning...")
catboost_study = optuna.create_study(direction="minimize", study_name="catboost")
catboost_study.optimize(catboost_objective, n_trials=20)
# Save best params locally (note: /content is Colab's local disk, not Drive)
with open("/content/catboost_std_best.json", "w") as f:
    json.dump(catboost_study.best_params, f)

[I 2025-07-16 15:16:02,320] A new study created in memory with name: catboost


Running CatBoost tuning...


[I 2025-07-16 15:16:51,027] Trial 0 finished with value: 0.006078809983551481 and parameters: {'learning_rate': 0.06275390956292835, 'depth': 7, 'l2_leaf_reg': 5.887087145752858, 'random_strength': 7.15393253729215, 'min_data_in_leaf': 74, 'subsample': 0.5917943183420644, 'colsample_bylevel': 0.533663660106312, 'grow_policy': 'SymmetricTree'}. Best is trial 0 with value: 0.006078809983551481.
[I 2025-07-16 15:18:03,523] Trial 1 finished with value: 0.005649717514124242 and parameters: {'learning_rate': 0.02600340439667241, 'depth': 8, 'l2_leaf_reg': 2.8609555177546246, 'ran

In [13]:
print("Running LightGBM tuning...")
lgbm_study = optuna.create_study(direction="minimize", study_name="lightgbm")
lgbm_study.optimize(lightgbm_objective, n_trials=20)
# Save best params locally (note: /content is Colab's local disk, not Drive)
with open("/content/light_gbm_std_best.json", "w") as f:
    json.dump(lgbm_study.best_params, f)

[I 2025-07-16 15:51:10,804] A new study created in memory with name: lightgbm


Running LightGBM tuning...
Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.960617


[I 2025-07-16 15:51:47,010] Trial 0 finished with value: 0.003575770578559734 and parameters: {'learning_rate': 0.1648331261312991, 'num_leaves': 22, 'min_data_in_leaf': 56, 'feature_fraction': 0.6220368308245395, 'bagging_fraction': 0.9675613645209333, 'lambda_l1': 1.88343488477213, 'lambda_l2': 4.302216020220545, 'bagging_freq': 4, 'max_depth': 8}. Best is trial 0 with value: 0.003575770578559734.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[12]	valid_0's ndcg@3: 0.942363


[I 2025-07-16 15:52:11,583] Trial 1 finished with value: 0.006364871629836233 and parameters: {'learning_rate': 0.06319651151842785, 'num_leaves': 91, 'min_data_in_leaf': 76, 'feature_fraction': 0.8801400758268227, 'bagging_fraction': 0.6902664883445727, 'lambda_l1': 0.8220476237823104, 'lambda_l2': 1.8966107727201336, 'bagging_freq': 3, 'max_depth': 5}. Best is trial 0 with value: 0.003575770578559734.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.956018


[I 2025-07-16 15:52:45,931] Trial 2 finished with value: 0.003861832224844486 and parameters: {'learning_rate': 0.1628675204371491, 'num_leaves': 109, 'min_data_in_leaf': 54, 'feature_fraction': 0.808298131840267, 'bagging_fraction': 0.6816736027621615, 'lambda_l1': 2.719538148099195, 'lambda_l2': 0.008274120466558177, 'bagging_freq': 2, 'max_depth': 4}. Best is trial 0 with value: 0.003575770578559734.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[99]	valid_0's ndcg@3: 0.960953


[I 2025-07-16 15:53:20,899] Trial 3 finished with value: 0.0035042551669884903 and parameters: {'learning_rate': 0.11780394667603233, 'num_leaves': 44, 'min_data_in_leaf': 28, 'feature_fraction': 0.5694367865340455, 'bagging_fraction': 0.8157898821760543, 'lambda_l1': 3.387242537447426, 'lambda_l2': 2.196643362517005, 'bagging_freq': 4, 'max_depth': 8}. Best is trial 3 with value: 0.0035042551669884903.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.961518


[I 2025-07-16 15:53:50,949] Trial 4 finished with value: 0.0031466781091324947 and parameters: {'learning_rate': 0.06999205716538405, 'num_leaves': 72, 'min_data_in_leaf': 22, 'feature_fraction': 0.9357340112477351, 'bagging_fraction': 0.9800252910545488, 'lambda_l1': 2.8648804109739916, 'lambda_l2': 3.698302104252323, 'bagging_freq': 1, 'max_depth': 11}. Best is trial 4 with value: 0.0031466781091324947.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.963902


[I 2025-07-16 15:54:32,836] Trial 5 finished with value: 0.002932131874418986 and parameters: {'learning_rate': 0.14738398322068144, 'num_leaves': 121, 'min_data_in_leaf': 64, 'feature_fraction': 0.6433135050247664, 'bagging_fraction': 0.8514804878194124, 'lambda_l1': 1.507404135359871, 'lambda_l2': 0.9794529738376201, 'bagging_freq': 2, 'max_depth': 9}. Best is trial 5 with value: 0.002932131874418986.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[97]	valid_0's ndcg@3: 0.952907


[I 2025-07-16 15:55:08,277] Trial 6 finished with value: 0.00421940928270037 and parameters: {'learning_rate': 0.11180475399040998, 'num_leaves': 43, 'min_data_in_leaf': 96, 'feature_fraction': 0.5253208524365685, 'bagging_fraction': 0.9078280071557396, 'lambda_l1': 3.1083686627264444, 'lambda_l2': 4.543261230556205, 'bagging_freq': 4, 'max_depth': 5}. Best is trial 5 with value: 0.002932131874418986.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[97]	valid_0's ndcg@3: 0.964107


[I 2025-07-16 15:55:45,732] Trial 7 finished with value: 0.002789101051276499 and parameters: {'learning_rate': 0.19426829578921662, 'num_leaves': 64, 'min_data_in_leaf': 75, 'feature_fraction': 0.9047338201190456, 'bagging_fraction': 0.9624860112209651, 'lambda_l1': 0.9753561860776749, 'lambda_l2': 4.367377782782343, 'bagging_freq': 3, 'max_depth': 10}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[90]	valid_0's ndcg@3: 0.940059


[I 2025-07-16 15:56:18,090] Trial 8 finished with value: 0.00536365586783949 and parameters: {'learning_rate': 0.04694578401285204, 'num_leaves': 91, 'min_data_in_leaf': 87, 'feature_fraction': 0.9853894028995265, 'bagging_fraction': 0.8310542061183517, 'lambda_l1': 3.474502993210868, 'lambda_l2': 3.679038815230335, 'bagging_freq': 3, 'max_depth': 4}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.963886


[I 2025-07-16 15:56:53,658] Trial 9 finished with value: 0.0028606164628477426 and parameters: {'learning_rate': 0.17133439934835623, 'num_leaves': 71, 'min_data_in_leaf': 63, 'feature_fraction': 0.8935746135401298, 'bagging_fraction': 0.8359065031942383, 'lambda_l1': 1.9979525322747542, 'lambda_l2': 3.0203094561358634, 'bagging_freq': 4, 'max_depth': 11}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[84]	valid_0's ndcg@3: 0.962762


[I 2025-07-16 15:57:27,590] Trial 10 finished with value: 0.003218193520703738 and parameters: {'learning_rate': 0.19992341367008631, 'num_leaves': 60, 'min_data_in_leaf': 41, 'feature_fraction': 0.7304021151823121, 'bagging_fraction': 0.5325650750056574, 'lambda_l1': 4.696823988195911, 'lambda_l2': 4.8135067623116585, 'bagging_freq': 5, 'max_depth': 12}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[87]	valid_0's ndcg@3: 0.964408


[I 2025-07-16 15:58:02,162] Trial 11 finished with value: 0.002932131874418986 and parameters: {'learning_rate': 0.18648665553874924, 'num_leaves': 73, 'min_data_in_leaf': 74, 'feature_fraction': 0.8456555767294333, 'bagging_fraction': 0.7620069958993859, 'lambda_l1': 0.3229943801895545, 'lambda_l2': 2.8260905958204057, 'bagging_freq': 5, 'max_depth': 10}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[96]	valid_0's ndcg@3: 0.954851


[I 2025-07-16 15:58:37,399] Trial 12 finished with value: 0.00443395551741399 and parameters: {'learning_rate': 0.013865012169456875, 'num_leaves': 48, 'min_data_in_leaf': 76, 'feature_fraction': 0.7419794180363664, 'bagging_fraction': 0.9167119842334155, 'lambda_l1': 1.6318670693130386, 'lambda_l2': 3.3796462837763004, 'bagging_freq': 3, 'max_depth': 12}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[86]	valid_0's ndcg@3: 0.964195


[I 2025-07-16 15:59:12,279] Trial 13 finished with value: 0.0030036472859901187 and parameters: {'learning_rate': 0.14315945630224736, 'num_leaves': 89, 'min_data_in_leaf': 43, 'feature_fraction': 0.9140660329991824, 'bagging_fraction': 0.8976669467954999, 'lambda_l1': 0.9166693506581738, 'lambda_l2': 2.880573092689219, 'bagging_freq': 4, 'max_depth': 10}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.961243


[I 2025-07-16 15:59:49,451] Trial 14 finished with value: 0.0032897089322748707 and parameters: {'learning_rate': 0.17739359072451508, 'num_leaves': 24, 'min_data_in_leaf': 65, 'feature_fraction': 0.9883234001756489, 'bagging_fraction': 0.9986713339902848, 'lambda_l1': 0.05656406117400681, 'lambda_l2': 1.533955548723954, 'bagging_freq': 2, 'max_depth': 7}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.963043


[I 2025-07-16 16:00:23,357] Trial 15 finished with value: 0.0031466781091324947 and parameters: {'learning_rate': 0.13663201070222775, 'num_leaves': 60, 'min_data_in_leaf': 98, 'feature_fraction': 0.8113741919550228, 'bagging_fraction': 0.7391807320124738, 'lambda_l1': 2.2025453459421973, 'lambda_l2': 3.85769969942106, 'bagging_freq': 3, 'max_depth': 10}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[94]	valid_0's ndcg@3: 0.963903


[I 2025-07-16 16:00:57,416] Trial 16 finished with value: 0.0031466781091324947 and parameters: {'learning_rate': 0.19650698464977978, 'num_leaves': 78, 'min_data_in_leaf': 86, 'feature_fraction': 0.9173550312690267, 'bagging_fraction': 0.5850269953630121, 'lambda_l1': 1.0364090140900855, 'lambda_l2': 3.1627364893943533, 'bagging_freq': 5, 'max_depth': 11}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[90]	valid_0's ndcg@3: 0.96156


[I 2025-07-16 16:01:27,718] Trial 17 finished with value: 0.003075162697561362 and parameters: {'learning_rate': 0.081744877926514, 'num_leaves': 58, 'min_data_in_leaf': 11, 'feature_fraction': 0.7883771480365709, 'bagging_fraction': 0.791480212218783, 'lambda_l1': 4.123555981517363, 'lambda_l2': 4.02983994148207, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[94]	valid_0's ndcg@3: 0.963846


[I 2025-07-16 16:02:04,973] Trial 18 finished with value: 0.0031466781091324947 and parameters: {'learning_rate': 0.17164432437650498, 'num_leaves': 104, 'min_data_in_leaf': 49, 'feature_fraction': 0.6884650431234792, 'bagging_fraction': 0.8709047214406931, 'lambda_l1': 2.282139761417465, 'lambda_l2': 4.991774985764998, 'bagging_freq': 4, 'max_depth': 11}. Best is trial 7 with value: 0.002789101051276499.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[99]	valid_0's ndcg@3: 0.961642


[I 2025-07-16 16:02:42,787] Trial 19 finished with value: 0.003075162697561362 and parameters: {'learning_rate': 0.12476106886954541, 'num_leaves': 82, 'min_data_in_leaf': 64, 'feature_fraction': 0.8602645795424939, 'bagging_fraction': 0.9363487391805796, 'lambda_l1': 1.317271262416911, 'lambda_l2': 1.2829909720510708, 'bagging_freq': 2, 'max_depth': 7}. Best is trial 7 with value: 0.002789101051276499.


In [14]:
print("\nBest CatBoost:")
print(catboost_study.best_params)
print("Recall@3:", 1 - catboost_study.best_value)

print("\nBest LightGBM:")
print(lgbm_study.best_params)
print("Recall@3:", 1 - lgbm_study.best_value)


Best CatBoost:
{'learning_rate': 0.13275757957731918, 'depth': 6, 'l2_leaf_reg': 7.142519331365267, 'random_strength': 3.395785387976391, 'min_data_in_leaf': 84, 'subsample': 0.9048958560910838, 'colsample_bylevel': 0.511123337191838, 'grow_policy': 'Lossguide'}
Recall@3: 0.9961381677751555

Best LightGBM:
{'learning_rate': 0.19426829578921662, 'num_leaves': 64, 'min_data_in_leaf': 75, 'feature_fraction': 0.9047338201190456, 'bagging_fraction': 0.9624860112209651, 'lambda_l1': 0.9753561860776749, 'lambda_l2': 4.367377782782343, 'bagging_freq': 3, 'max_depth': 10}
Recall@3: 0.9972108989487235
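
Note that `best_params` holds only the values Optuna sampled, so fixed settings such as the objective and metric have to be merged back in before retraining. A minimal sketch of that round trip, using a temporary file and made-up parameter values rather than the real study:

```python
import json
import os
import tempfile

# Fixed settings that were never part of the search space (illustrative subset)
base_params = {"objective": "lambdarank", "metric": "ndcg", "learning_rate": 0.1}

# Stand-in for study.best_params (hypothetical values)
tuned_params = {"learning_rate": 0.05, "num_leaves": 64}

# Write the tuned values to disk and read them back, as the notebook does
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "best_params.json")
    with open(path, "w") as f:
        json.dump(tuned_params, f)
    with open(path, "r") as f:
        loaded_params = json.load(f)

# Later keys win, so tuned values override any overlapping base defaults
final_params = {**base_params, **loaded_params}
print(final_params)
```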


Now that we have our parameters, let's train both models and see which performs best.

In [15]:
def train_model_lightGBM(
    parameters: dict,
    n_rounds: int = 500,
    lr_decay_gamma: float = 0.95,
    val_df_input=None,
):
    # Load validation set
    lgb_val = lgb.Dataset(X_val, label=y_val, group=val_group_sizes, free_raw_data=False)

    # Load full training set
    lgb_train = lgb.Dataset(X_train, label=y_train, group=train_group_sizes, free_raw_data=False)

    # Learning rate schedule
    def lr_decay(current_round):
        return parameters["learning_rate"] * (lr_decay_gamma ** current_round)

    # Train model
    model = lgb.train(
        params=parameters,
        train_set=lgb_train,
        num_boost_round=n_rounds,
        valid_sets=[lgb_train, lgb_val],
        valid_names=["train", "val"],
        callbacks=[
            lgb.reset_parameter(learning_rate=lr_decay),
            lgb.early_stopping(stopping_rounds=500),
            lgb.log_evaluation(period=1)
        ]
    )

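    # Predict and evaluate on a copy so the caller's frame is not mutated;
    # fall back to the global val_df when no frame is passed in
    preds = model.predict(X_val, num_iteration=model.best_iteration)
    scored_df = (val_df if val_df_input is None else val_df_input).copy()
    scored_df["score"] = preds

    acc1, recall3, mrr = compute_ranking_metrics(scored_df, k=3)

    print("\nEvaluation Metrics (Validation Set):")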
    print(f"Accuracy@1 : {acc1:.4f}")
    print(f"Recall@3   : {recall3:.4f}")
    print(f"MRR        : {mrr:.4f}")

    return model
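
The `lr_decay` schedule multiplies the base learning rate by `lr_decay_gamma` every round, so it shrinks geometrically. The sketch below, which uses the tuned LightGBM learning rate rounded to 0.194 purely as an example input, prints the effective rate at a few rounds. It suggests that with a gamma of 0.95 the rate falls below 1% of its starting value by round 100, so later rounds contribute very little.

```python
def decayed_lr(base_lr, gamma, current_round):
    """Exponential decay: base_lr * gamma ** current_round."""
    return base_lr * (gamma ** current_round)


base_lr = 0.194  # tuned LightGBM learning rate from the sweep, rounded
gamma = 0.95

# Effective learning rate at a few boosting rounds
for rnd in (0, 10, 50, 100):
    print(f"round {rnd:>3}: lr = {decayed_lr(base_lr, gamma, rnd):.6f}")
```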


In [11]:
def train_catboost_model(parameters: dict, n_rounds: int = 500, val_df_input=None):
    # Training group_id
    train_group_id = np.repeat(np.arange(len(train_group_sizes)), train_group_sizes)

    # Validation group_id
    val_group_id = np.repeat(np.arange(len(val_group_sizes)), val_group_sizes)
    # Create Pools for CatBoost
    train_pool = Pool(data=X_train, label=y_train, group_id=train_group_id)
    val_pool = Pool(data=X_val, label=y_val, group_id=val_group_id)

    # Train the model
    model = CatBoostRanker(iterations=n_rounds, **parameters)

    model.fit(train_pool, eval_set=val_pool, early_stopping_rounds=50, verbose=True)

    # Predict and evaluate
    preds = model.predict(val_pool)
    val_df = val_df_input.copy()
    val_df["score"] = preds

    acc1, recall3, mrr = compute_ranking_metrics(val_df, k=3)

    print("\nEvaluation Metrics (Validation Set):")
    print(f"Accuracy@1 : {acc1:.4f}")
    print(f"Recall@3   : {recall3:.4f}")
    print(f"MRR        : {mrr:.4f}")

    return model

In [20]:
# Static/default parameters that LightGBM expects
base_params = {
    "objective": "lambdarank",
    "metric": ["ndcg"],
    "eval_at": [1, 3],
    "boosting_type": "gbdt",
    "verbosity": -1,
    "force_row_wise": True,
}

# Load from file
with open("light_gbm_std_best.json", "r") as f:
    best_params_from_json = json.load(f)

# Merge base + Optuna best
final_params = {**base_params, **best_params_from_json}

train_model_lightGBM(parameters=final_params, n_rounds=1000, lr_decay_gamma=0.95)

[1]	train's ndcg@1: 0.876358	train's ndcg@3: 0.94871	val's ndcg@1: 0.67675	val's ndcg@3: 0.853679
Training until validation scores don't improve for 500 rounds
[2]	train's ndcg@1: 0.924617	train's ndcg@3: 0.970516	val's ndcg@1: 0.852106	val's ndcg@3: 0.940443
[3]	train's ndcg@1: 0.929556	train's ndcg@3: 0.972374	val's ndcg@1: 0.877566	val's ndcg@3: 0.949844
[4]	train's ndcg@1: 0.929909	train's ndcg@3: 0.972571	val's ndcg@1: 0.879783	val's ndcg@3: 0.951019
[5]	train's ndcg@1: 0.930351	train's ndcg@3: 0.972744	val's ndcg@1: 0.879282	val's ndcg@3: 0.950731
[6]	train's ndcg@1: 0.930717	train's ndcg@3: 0.972866	val's ndcg@1: 0.881785	val's ndcg@3: 0.951585
[7]	train's ndcg@1: 0.93112	train's ndcg@3: 0.973051	val's ndcg@1: 0.88393	val's ndcg@3: 0.95257
[8]	train's ndcg@1: 0.93123	train's ndcg@3: 0.973103	val's ndcg@1: 0.884074	val's ndcg@3: 0.952595
[9]	train's ndcg@1: 0.931496	train's ndcg@3: 0.97322	val's ndcg@1: 0.885003	val's ndcg@3: 0.953002
[10]	train's ndcg@1: 0.931746	train's ndcg@3:



\Evaluation Metrics (Validation Set):
Accuracy@1 : 0.9005
Recall@3   : 0.9966
MRR        : 0.9471




<lightgbm.basic.Booster at 0x7a8b2cbdfad0>

Strong results. The correct match is almost always among the top 3 ranked candidates (Recall@3 of 0.9966) and takes the top spot about 90% of the time (Accuracy@1 of 0.9005).

In [24]:
# Base/defaults for CatBoost
base_params = {
    "loss_function": "YetiRank",
    "eval_metric": "NDCG:top=3",
    "random_seed": 42,
    "task_type": "CPU",  # or "GPU"
}

# Load from file
with open("catboost_std_best.json", "r") as f:
    best_params_from_json = json.load(f)

# Merge base + Optuna best
final_params = {**base_params, **best_params_from_json}

best_catboost_model = train_catboost_model(
    parameters=final_params, n_rounds=1000, val_df_input=val_df
)

0:	test: 0.8931327	best: 0.8931327 (0)	total: 2.67s	remaining: 44m 29s
1:	test: 0.9371556	best: 0.9371556 (1)	total: 5.34s	remaining: 44m 22s
2:	test: 0.9366829	best: 0.9371556 (1)	total: 7.92s	remaining: 43m 52s
3:	test: 0.9369486	best: 0.9371556 (1)	total: 11s	remaining: 45m 29s
4:	test: 0.9365407	best: 0.9371556 (1)	total: 13.7s	remaining: 45m 35s
5:	test: 0.9394015	best: 0.9394015 (5)	total: 16.8s	remaining: 46m 27s
6:	test: 0.9397957	best: 0.9397957 (6)	total: 19.7s	remaining: 46m 41s
7:	test: 0.9400222	best: 0.9400222 (7)	total: 22.4s	remaining: 46m 12s
8:	test: 0.9400545	best: 0.9400545 (8)	total: 24.8s	remaining: 45m 30s
9:	test: 0.9403287	best: 0.9403287 (9)	total: 27.6s	remaining: 45m 29s
10:	test: 0.9410855	best: 0.9410855 (10)	total: 30.2s	remaining: 45m 14s
11:	test: 0.9444052	best: 0.9444052 (11)	total: 32.5s	remaining: 44m 38s
12:	test: 0.9449433	best: 0.9449433 (12)	total: 34.8s	remaining: 44m 2s
13:	test: 0.9430693	best: 0.9449433 (12)	total: 37.5s	remaining: 43m 58s
1




Evaluation Metrics (Validation Set):
Accuracy@1 : 0.9010
Recall@3   : 0.9961
MRR        : 0.9473




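CatBoost has a slight edge on Accuracy@1 (0.9010 vs. 0.9005) and MRR (0.9473 vs. 0.9471), while LightGBM is marginally ahead on Recall@3 (0.9966 vs. 0.9961). Let's see how they perform on the complex set.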

## Complex Nameset

Now we will tune the complex model as well. We will delete the previous data to save on RAM first.

In [None]:
# Delete training-related objects
del full_df, X_train, y_train, train_group_sizes
del val_df, X_val, y_val, val_group_sizes

# Run garbage collection
gc.collect()

In [7]:
# Load the train/validation ID lists for the complex set
train_ids_path = "/content/train_com_ids.csv"
val_ids_path = "/content/validation_com_ids.csv"
train_ids = pd.read_csv(train_ids_path)["train_ids"].dropna().astype(int).tolist()
val_ids = pd.read_csv(val_ids_path)["validation_ids"].dropna().astype(int).tolist()

# Reconnect and load the second dataset
conn = sqlite3.connect(db_path)
full_df = pd.read_sql_query("SELECT * FROM feature_matrix_complex", conn)
conn.close()  # the table is in memory now, so release the SQLite handle

# Split on clean_row_id: validation IDs go to val_df, everything else is training data
val_ids_set = set(val_ids)
val_df = full_df[full_df["clean_row_id"].isin(val_ids_set)]
full_df = full_df[~full_df["clean_row_id"].isin(val_ids_set)]


# Rebuild train/val splits
X_train = full_df.drop(
    columns=["label", "clean_row_id", "investor", "firm", "template_id"]
)
y_train = full_df["label"]
train_group_sizes = full_df.groupby("clean_row_id").size().tolist()
X_val = val_df.drop(
    columns=["label", "clean_row_id", "investor", "firm", "template_id"]
)
y_val = val_df["label"]
val_group_sizes = val_df.groupby("clean_row_id").size().tolist()
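
One thing worth checking when building group sizes this way: LightGBM applies the sizes to the rows in order (the first `n` rows form the first group, and so on), while `groupby(...).size()` returns sizes in sorted key order by default. The two only line up if the rows are already ordered by `clean_row_id`. Whether the feature tables here are stored in that order is not shown in this section, so the sketch below, on toy data, is a sanity check rather than a confirmed bug fix.

```python
import pandas as pd

# Toy frame whose groups are contiguous but not in sorted key order.
df = pd.DataFrame({
    "clean_row_id": [7, 7, 7, 2, 2, 5],
    "label":        [1, 0, 0, 0, 1, 1],
})

# groupby sorts keys by default, so these sizes belong to ids 2, 5, 7 ...
sizes_sorted_keys = df.groupby("clean_row_id").size().tolist()
# ... while the rows actually appear in the order 7, 2, 5.
sizes_row_order = df.groupby("clean_row_id", sort=False).size().tolist()

print(sizes_sorted_keys)  # sizes in sorted key order
print(sizes_row_order)    # sizes in order of first appearance

# One way to make them agree: stable-sort the rows by the group key first.
df_sorted = df.sort_values("clean_row_id", kind="stable").reset_index(drop=True)
sizes_after_sort = df_sorted.groupby("clean_row_id").size().tolist()
```

If the two size lists differ on the real data, sorting both the feature frame and the labels by `clean_row_id` before building `X_train`, `y_train`, and the group sizes keeps them consistent.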

Now let's rerun the full sweep.

In [26]:
# Run the sweep
print("Running CatBoost tuning...")
catboost_study = optuna.create_study(direction="minimize", study_name="catboost")
catboost_study.optimize(catboost_objective, n_trials=50)
# Save the best parameters (note: /content is the Colab VM's local disk, not the mounted Drive)
with open("/content/catboost_com_best.json", "w") as f:
    json.dump(catboost_study.best_params, f)

[I 2025-07-16 17:29:25,081] A new study created in memory with name: catboost


Running CatBoost tuning...


[I 2025-07-16 17:36:20,396] Trial 0 finished with value: 0.016249999999999987 and parameters: {'learning_rate': 0.16997943397292264, 'depth': 9, 'l2_leaf_reg': 4.325054229205765, 'random_strength': 3.195199898203583, 'min_data_in_leaf': 70, 'subsample': 0.8460831562701309, 'colsample_bylevel': 0.9935706982846684, 'grow_policy': 'Lossguide'}. Best is trial 0 with value: 0.016249999999999987.
[I 2025-07-16 17:46:37,382] Trial 1 finished with value: 0.018750000000000044 and parameters: {'learning_rate': 0.1337634400131964, 'depth': 9, 'l2_leaf_reg': 1.8495819339681328, 'random

In [27]:
print("Running LightGBM tuning...")
lgbm_study = optuna.create_study(direction="minimize", study_name="lightgbm")
lgbm_study.optimize(lightgbm_objective, n_trials=50)
# Save the best parameters (note: /content is the Colab VM's local disk, not the mounted Drive)
with open("/content/light_gbm_com_best.json", "w") as f:
    json.dump(lgbm_study.best_params, f)

[I 2025-07-16 22:46:54,491] A new study created in memory with name: lightgbm


Running LightGBM tuning...
Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.939641


[I 2025-07-16 22:47:34,869] Trial 0 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.16225298188131157, 'num_leaves': 55, 'min_data_in_leaf': 30, 'feature_fraction': 0.8866622471040231, 'bagging_fraction': 0.9882933452901567, 'lambda_l1': 2.9888525353614726, 'lambda_l2': 3.345021012693289, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[99]	valid_0's ndcg@3: 0.912125


[I 2025-07-16 22:48:16,395] Trial 1 finished with value: 0.02749999999999997 and parameters: {'learning_rate': 0.011889350993384286, 'num_leaves': 63, 'min_data_in_leaf': 27, 'feature_fraction': 0.6743691753707486, 'bagging_fraction': 0.6859109687414484, 'lambda_l1': 2.3088736630587126, 'lambda_l2': 3.8311251377579225, 'bagging_freq': 4, 'max_depth': 7}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[66]	valid_0's ndcg@3: 0.906366


[I 2025-07-16 22:48:58,020] Trial 2 finished with value: 0.030000000000000027 and parameters: {'learning_rate': 0.04098924000068181, 'num_leaves': 115, 'min_data_in_leaf': 96, 'feature_fraction': 0.9016092904066846, 'bagging_fraction': 0.833344685517506, 'lambda_l1': 0.6205022354544598, 'lambda_l2': 4.89630222632678, 'bagging_freq': 3, 'max_depth': 6}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.918152


[I 2025-07-16 22:49:35,533] Trial 3 finished with value: 0.02749999999999997 and parameters: {'learning_rate': 0.02553754553002321, 'num_leaves': 114, 'min_data_in_leaf': 15, 'feature_fraction': 0.8543081550302887, 'bagging_fraction': 0.9228072050173419, 'lambda_l1': 2.046699148140832, 'lambda_l2': 2.0743399615844424, 'bagging_freq': 1, 'max_depth': 11}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[99]	valid_0's ndcg@3: 0.931352


[I 2025-07-16 22:50:21,183] Trial 4 finished with value: 0.01749999999999996 and parameters: {'learning_rate': 0.06481569434502592, 'num_leaves': 89, 'min_data_in_leaf': 77, 'feature_fraction': 0.8042858567059298, 'bagging_fraction': 0.9204961297607742, 'lambda_l1': 1.9533492636279455, 'lambda_l2': 3.255927604510748, 'bagging_freq': 3, 'max_depth': 9}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[95]	valid_0's ndcg@3: 0.929804


[I 2025-07-16 22:51:08,588] Trial 5 finished with value: 0.02124999999999999 and parameters: {'learning_rate': 0.06210022987018839, 'num_leaves': 36, 'min_data_in_leaf': 68, 'feature_fraction': 0.8020633392588385, 'bagging_fraction': 0.7795796403250379, 'lambda_l1': 3.4526360059127597, 'lambda_l2': 2.1377737992159007, 'bagging_freq': 2, 'max_depth': 8}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.934209


[I 2025-07-16 22:51:56,189] Trial 6 finished with value: 0.016249999999999987 and parameters: {'learning_rate': 0.06876502637976797, 'num_leaves': 105, 'min_data_in_leaf': 60, 'feature_fraction': 0.9241846931925701, 'bagging_fraction': 0.7291122028578276, 'lambda_l1': 2.35395165380886, 'lambda_l2': 3.8512691134746184, 'bagging_freq': 2, 'max_depth': 12}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[92]	valid_0's ndcg@3: 0.934507


[I 2025-07-16 22:52:47,864] Trial 7 finished with value: 0.016249999999999987 and parameters: {'learning_rate': 0.17709852985123783, 'num_leaves': 97, 'min_data_in_leaf': 33, 'feature_fraction': 0.612721123167651, 'bagging_fraction': 0.5557109274524021, 'lambda_l1': 1.9343561858566076, 'lambda_l2': 3.78579226355027, 'bagging_freq': 2, 'max_depth': 5}. Best is trial 0 with value: 0.01375000000000004.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[97]	valid_0's ndcg@3: 0.941843


[I 2025-07-16 22:53:32,380] Trial 8 finished with value: 0.010000000000000009 and parameters: {'learning_rate': 0.19508897815396264, 'num_leaves': 47, 'min_data_in_leaf': 97, 'feature_fraction': 0.8200412888498705, 'bagging_fraction': 0.8738167254478661, 'lambda_l1': 4.906485872313661, 'lambda_l2': 2.254967729989295, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[57]	valid_0's ndcg@3: 0.94409


[I 2025-07-16 22:54:13,186] Trial 9 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.14904209602614502, 'num_leaves': 54, 'min_data_in_leaf': 96, 'feature_fraction': 0.5427647720005709, 'bagging_fraction': 0.6123402354869276, 'lambda_l1': 1.0812031780603892, 'lambda_l2': 0.6656882010689114, 'bagging_freq': 4, 'max_depth': 10}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.935429


[I 2025-07-16 22:54:56,643] Trial 10 finished with value: 0.01749999999999996 and parameters: {'learning_rate': 0.11928112721387439, 'num_leaves': 29, 'min_data_in_leaf': 49, 'feature_fraction': 0.9894268770049781, 'bagging_fraction': 0.841159884745378, 'lambda_l1': 4.922302457945405, 'lambda_l2': 0.7814828519650374, 'bagging_freq': 5, 'max_depth': 5}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[71]	valid_0's ndcg@3: 0.93467


[I 2025-07-16 22:55:43,015] Trial 11 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.1962460465842539, 'num_leaves': 47, 'min_data_in_leaf': 44, 'feature_fraction': 0.7189137856733783, 'bagging_fraction': 0.9976259614979406, 'lambda_l1': 4.913435869812661, 'lambda_l2': 2.791527958008163, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[89]	valid_0's ndcg@3: 0.940563


[I 2025-07-16 22:56:25,023] Trial 12 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.15139431102915868, 'num_leaves': 74, 'min_data_in_leaf': 11, 'feature_fraction': 0.780771100583937, 'bagging_fraction': 0.9798090645663569, 'lambda_l1': 3.5946212472731034, 'lambda_l2': 1.4290245876316787, 'bagging_freq': 1, 'max_depth': 8}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[93]	valid_0's ndcg@3: 0.937632


[I 2025-07-16 22:57:05,739] Trial 13 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.19833253244313423, 'num_leaves': 19, 'min_data_in_leaf': 83, 'feature_fraction': 0.9933682094570295, 'bagging_fraction': 0.8706904734900003, 'lambda_l1': 3.916259130668703, 'lambda_l2': 2.798212143910034, 'bagging_freq': 1, 'max_depth': 10}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[99]	valid_0's ndcg@3: 0.936575


[I 2025-07-16 22:58:04,112] Trial 14 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.1599052142021365, 'num_leaves': 76, 'min_data_in_leaf': 36, 'feature_fraction': 0.8787419833805229, 'bagging_fraction': 0.9238614963374626, 'lambda_l1': 4.175421044410306, 'lambda_l2': 4.8869882768727795, 'bagging_freq': 2, 'max_depth': 7}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[97]	valid_0's ndcg@3: 0.930295


[I 2025-07-16 22:58:56,992] Trial 15 finished with value: 0.02124999999999999 and parameters: {'learning_rate': 0.12756143084742066, 'num_leaves': 79, 'min_data_in_leaf': 41, 'feature_fraction': 0.7175878226940691, 'bagging_fraction': 0.9049485599768822, 'lambda_l1': 4.298492044978823, 'lambda_l2': 4.92794017528262, 'bagging_freq': 2, 'max_depth': 4}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.939209


[I 2025-07-16 22:59:43,744] Trial 16 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.0949618528996377, 'num_leaves': 127, 'min_data_in_leaf': 58, 'feature_fraction': 0.8389486508963309, 'bagging_fraction': 0.7872005943127132, 'lambda_l1': 4.254797994942413, 'lambda_l2': 1.4801837735893928, 'bagging_freq': 2, 'max_depth': 7}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[98]	valid_0's ndcg@3: 0.941486


[I 2025-07-16 23:00:33,775] Trial 17 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.1743635560041369, 'num_leaves': 40, 'min_data_in_leaf': 82, 'feature_fraction': 0.9152685827365797, 'bagging_fraction': 0.7056249015539376, 'lambda_l1': 4.456682586307752, 'lambda_l2': 1.4840102213200277, 'bagging_freq': 3, 'max_depth': 7}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[98]	valid_0's ndcg@3: 0.935161


[I 2025-07-16 23:01:28,034] Trial 18 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.13252642968021977, 'num_leaves': 65, 'min_data_in_leaf': 71, 'feature_fraction': 0.6361021465236331, 'bagging_fraction': 0.8202276746354825, 'lambda_l1': 3.1622001859919635, 'lambda_l2': 0.05885210828274001, 'bagging_freq': 2, 'max_depth': 6}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[91]	valid_0's ndcg@3: 0.937304


[I 2025-07-16 23:02:22,779] Trial 19 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.17963218827054106, 'num_leaves': 86, 'min_data_in_leaf': 100, 'feature_fraction': 0.7614512405557169, 'bagging_fraction': 0.9382556285823048, 'lambda_l1': 4.945006656437579, 'lambda_l2': 4.449022646545192, 'bagging_freq': 4, 'max_depth': 10}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's ndcg@3: 0.936545


[I 2025-07-16 23:02:57,082] Trial 20 finished with value: 0.01749999999999996 and parameters: {'learning_rate': 0.08885438738620009, 'num_leaves': 20, 'min_data_in_leaf': 22, 'feature_fraction': 0.8518542189751281, 'bagging_fraction': 0.8678903564286913, 'lambda_l1': 0.019014596147159946, 'lambda_l2': 2.364152273806752, 'bagging_freq': 1, 'max_depth': 11}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[95]	valid_0's ndcg@3: 0.939507


[I 2025-07-16 23:03:37,618] Trial 21 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.15443003246253595, 'num_leaves': 55, 'min_data_in_leaf': 34, 'feature_fraction': 0.8906204483986593, 'bagging_fraction': 0.9606154968288049, 'lambda_l1': 2.9190643419362012, 'lambda_l2': 4.3159832531077855, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[80]	valid_0's ndcg@3: 0.940891


[I 2025-07-16 23:04:18,074] Trial 22 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.14567594267518513, 'num_leaves': 65, 'min_data_in_leaf': 39, 'feature_fraction': 0.9559526643712262, 'bagging_fraction': 0.9524075487702776, 'lambda_l1': 2.815110863422298, 'lambda_l2': 4.347384733837286, 'bagging_freq': 1, 'max_depth': 8}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[75]	valid_0's ndcg@3: 0.938286


[I 2025-07-16 23:04:57,821] Trial 23 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.14055743783564806, 'num_leaves': 62, 'min_data_in_leaf': 52, 'feature_fraction': 0.9541589054490743, 'bagging_fraction': 0.9449249109907848, 'lambda_l1': 2.748300168052951, 'lambda_l2': 4.33866770740528, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[98]	valid_0's ndcg@3: 0.94162


[I 2025-07-16 23:05:35,432] Trial 24 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.11549364295239463, 'num_leaves': 48, 'min_data_in_leaf': 23, 'feature_fraction': 0.9366890381526735, 'bagging_fraction': 0.8768434426919739, 'lambda_l1': 1.5538529502016452, 'lambda_l2': 3.2742957692194055, 'bagging_freq': 1, 'max_depth': 8}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[87]	valid_0's ndcg@3: 0.943525


[I 2025-07-16 23:06:18,436] Trial 25 finished with value: 0.010000000000000009 and parameters: {'learning_rate': 0.18614718595949256, 'num_leaves': 39, 'min_data_in_leaf': 41, 'feature_fraction': 0.9573862765250418, 'bagging_fraction': 0.9599362734881016, 'lambda_l1': 2.7872874637913254, 'lambda_l2': 4.359440999222507, 'bagging_freq': 1, 'max_depth': 11}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[89]	valid_0's ndcg@3: 0.943391


[I 2025-07-16 23:07:10,055] Trial 26 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.18866637689452367, 'num_leaves': 31, 'min_data_in_leaf': 47, 'feature_fraction': 0.8365048733461136, 'bagging_fraction': 0.8902895870077736, 'lambda_l1': 3.44515958029183, 'lambda_l2': 2.849062049752076, 'bagging_freq': 3, 'max_depth': 12}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[79]	valid_0's ndcg@3: 0.940697


[I 2025-07-16 23:08:06,381] Trial 27 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.16726255783424782, 'num_leaves': 42, 'min_data_in_leaf': 62, 'feature_fraction': 0.9706960189158425, 'bagging_fraction': 0.9629722870184746, 'lambda_l1': 1.3880439418367205, 'lambda_l2': 3.5506997937044327, 'bagging_freq': 2, 'max_depth': 11}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[95]	valid_0's ndcg@3: 0.943361


[I 2025-07-16 23:08:53,213] Trial 28 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.1892292463764014, 'num_leaves': 29, 'min_data_in_leaf': 54, 'feature_fraction': 0.805150003502603, 'bagging_fraction': 0.7966402745943224, 'lambda_l1': 3.6916021767385185, 'lambda_l2': 1.883199850442399, 'bagging_freq': 5, 'max_depth': 10}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[89]	valid_0's ndcg@3: 0.936188


[I 2025-07-16 23:09:34,214] Trial 29 finished with value: 0.016249999999999987 and parameters: {'learning_rate': 0.16123033277872845, 'num_leaves': 54, 'min_data_in_leaf': 31, 'feature_fraction': 0.884749437175844, 'bagging_fraction': 0.995962792258265, 'lambda_l1': 3.02883697709224, 'lambda_l2': 4.22526140512565, 'bagging_freq': 1, 'max_depth': 11}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[68]	valid_0's ndcg@3: 0.941486


[I 2025-07-16 23:10:16,574] Trial 30 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.1845576000109755, 'num_leaves': 50, 'min_data_in_leaf': 22, 'feature_fraction': 0.874122724524548, 'bagging_fraction': 0.9703034864132998, 'lambda_l1': 3.287234299549743, 'lambda_l2': 3.066627525216757, 'bagging_freq': 1, 'max_depth': 12}. Best is trial 8 with value: 0.010000000000000009.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[99]	valid_0's ndcg@3: 0.941545


[I 2025-07-16 23:10:57,533] Trial 31 finished with value: 0.008750000000000036 and parameters: {'learning_rate': 0.14925362110050383, 'num_leaves': 61, 'min_data_in_leaf': 39, 'feature_fraction': 0.9509281191383269, 'bagging_fraction': 0.9584197752403462, 'lambda_l1': 2.766186105296192, 'lambda_l2': 4.511482480952715, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[98]	valid_0's ndcg@3: 0.940861


[I 2025-07-16 23:11:40,440] Trial 32 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.16783013495969817, 'num_leaves': 60, 'min_data_in_leaf': 37, 'feature_fraction': 0.9137916596078073, 'bagging_fraction': 0.90275577201779, 'lambda_l1': 2.609900132385227, 'lambda_l2': 4.593855612770385, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[88]	valid_0's ndcg@3: 0.940132


[I 2025-07-16 23:12:22,028] Trial 33 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.16254184310081038, 'num_leaves': 42, 'min_data_in_leaf': 28, 'feature_fraction': 0.9997472017608418, 'bagging_fraction': 0.8459185705923274, 'lambda_l1': 2.381944175630537, 'lambda_l2': 4.048083396781808, 'bagging_freq': 1, 'max_depth': 10}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[71]	valid_0's ndcg@3: 0.940861


[I 2025-07-16 23:13:06,520] Trial 34 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.19543111881522443, 'num_leaves': 58, 'min_data_in_leaf': 44, 'feature_fraction': 0.9450001174540787, 'bagging_fraction': 0.638318839628841, 'lambda_l1': 2.8264204162081366, 'lambda_l2': 4.640119267602978, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[83]	valid_0's ndcg@3: 0.939507


[I 2025-07-16 23:14:05,264] Trial 35 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.15510397647680255, 'num_leaves': 67, 'min_data_in_leaf': 89, 'feature_fraction': 0.9052510025852627, 'bagging_fraction': 0.9671532227604275, 'lambda_l1': 2.12276800370513, 'lambda_l2': 3.609260117083212, 'bagging_freq': 2, 'max_depth': 8}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[88]	valid_0's ndcg@3: 0.939804


[I 2025-07-16 23:14:45,813] Trial 36 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.1729146589589571, 'num_leaves': 34, 'min_data_in_leaf': 67, 'feature_fraction': 0.8738291453596493, 'bagging_fraction': 0.933104194437236, 'lambda_l1': 1.7452123655747371, 'lambda_l2': 4.100988681995403, 'bagging_freq': 1, 'max_depth': 11}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[80]	valid_0's ndcg@3: 0.93912


[I 2025-07-16 23:15:34,699] Trial 37 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.1378088797534821, 'num_leaves': 70, 'min_data_in_leaf': 16, 'feature_fraction': 0.8375204907626204, 'bagging_fraction': 0.5052126086954803, 'lambda_l1': 1.0382978438342283, 'lambda_l2': 2.4039823096092396, 'bagging_freq': 2, 'max_depth': 9}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[97]	valid_0's ndcg@3: 0.937795


[I 2025-07-16 23:16:11,400] Trial 38 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.10395664183191303, 'num_leaves': 46, 'min_data_in_leaf': 31, 'feature_fraction': 0.9732807552814717, 'bagging_fraction': 0.7509592471715375, 'lambda_l1': 3.8851040810040542, 'lambda_l2': 3.9125192457634723, 'bagging_freq': 1, 'max_depth': 10}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[90]	valid_0's ndcg@3: 0.942304


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:17:08,852] Trial 39 finished with value: 0.010000000000000009 and parameters: {'learning_rate': 0.18319687641379984, 'num_leaves': 82, 'min_data_in_leaf': 50, 'feature_fraction': 0.8155794224356111, 'bagging_fraction': 0.9011573915090634, 'lambda_l1': 4.623771755117341, 'lambda_l2': 3.5639370175706917, 'bagging_freq': 3, 'max_depth': 7}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[40]	valid_0's ndcg@3: 0.901157


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:17:41,792] Trial 40 finished with value: 0.02749999999999997 and parameters: {'learning_rate': 0.010193776878215655, 'num_leaves': 87, 'min_data_in_leaf': 49, 'feature_fraction': 0.7313728058835619, 'bagging_fraction': 0.9125536786878152, 'lambda_l1': 4.541988319071369, 'lambda_l2': 3.5641396398322494, 'bagging_freq': 4, 'max_depth': 6}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[81]	valid_0's ndcg@3: 0.938227


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:18:38,431] Trial 41 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.18293981755221142, 'num_leaves': 81, 'min_data_in_leaf': 42, 'feature_fraction': 0.819832463580882, 'bagging_fraction': 0.9496359280878136, 'lambda_l1': 4.506561428624602, 'lambda_l2': 4.718057612245007, 'bagging_freq': 3, 'max_depth': 8}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[88]	valid_0's ndcg@3: 0.939209


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:19:22,853] Trial 42 finished with value: 0.011249999999999982 and parameters: {'learning_rate': 0.17445507050793407, 'num_leaves': 99, 'min_data_in_leaf': 34, 'feature_fraction': 0.7818950053446081, 'bagging_fraction': 0.8494002620518861, 'lambda_l1': 4.719090880724252, 'lambda_l2': 3.751167389586577, 'bagging_freq': 1, 'max_depth': 7}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[40]	valid_0's ndcg@3: 0.936873


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:19:58,787] Trial 43 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.19997363640478644, 'num_leaves': 57, 'min_data_in_leaf': 57, 'feature_fraction': 0.8943885152018101, 'bagging_fraction': 0.8884870320838483, 'lambda_l1': 2.3951700023052833, 'lambda_l2': 3.0894041574816185, 'bagging_freq': 5, 'max_depth': 9}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[93]	valid_0's ndcg@3: 0.938911


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:20:57,736] Trial 44 finished with value: 0.01375000000000004 and parameters: {'learning_rate': 0.1870962356424833, 'num_leaves': 93, 'min_data_in_leaf': 64, 'feature_fraction': 0.9311317257995595, 'bagging_fraction': 0.9992055575055298, 'lambda_l1': 3.8509618782939454, 'lambda_l2': 1.962968239339415, 'bagging_freq': 3, 'max_depth': 8}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[78]	valid_0's ndcg@3: 0.941813


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:21:48,402] Trial 45 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.1539296225549537, 'num_leaves': 72, 'min_data_in_leaf': 51, 'feature_fraction': 0.5176712977913858, 'bagging_fraction': 0.9253309233844215, 'lambda_l1': 2.1712913995347476, 'lambda_l2': 4.996140795239308, 'bagging_freq': 1, 'max_depth': 9}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[72]	valid_0's ndcg@3: 0.913241


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:22:36,015] Trial 46 finished with value: 0.025000000000000022 and parameters: {'learning_rate': 0.03930525801832191, 'num_leaves': 26, 'min_data_in_leaf': 47, 'feature_fraction': 0.7844098511370348, 'bagging_fraction': 0.8240999271118816, 'lambda_l1': 4.108114208147697, 'lambda_l2': 4.1022087957852875, 'bagging_freq': 2, 'max_depth': 6}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[83]	valid_0's ndcg@3: 0.934507


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:23:30,666] Trial 47 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.12637347009795435, 'num_leaves': 38, 'min_data_in_leaf': 75, 'feature_fraction': 0.8680593754551115, 'bagging_fraction': 0.9834351606367228, 'lambda_l1': 4.7381986552629645, 'lambda_l2': 4.519630002082221, 'bagging_freq': 2, 'max_depth': 7}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Did not meet early stopping. Best iteration is:
[84]	valid_0's ndcg@3: 0.938882


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:24:22,234] Trial 48 finished with value: 0.012499999999999956 and parameters: {'learning_rate': 0.1920428649594978, 'num_leaves': 51, 'min_data_in_leaf': 26, 'feature_fraction': 0.7560982715103031, 'bagging_fraction': 0.9051172767594667, 'lambda_l1': 3.308447166112485, 'lambda_l2': 3.3993554590269572, 'bagging_freq': 4, 'max_depth': 5}. Best is trial 31 with value: 0.008750000000000036.


Training until validation scores don't improve for 30 rounds
Early stopping, best iteration is:
[61]	valid_0's ndcg@3: 0.936843


  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)
  mrr = grouped.apply(reciprocal_rank).mean()
[I 2025-07-16 23:25:03,535] Trial 49 finished with value: 0.015000000000000013 and parameters: {'learning_rate': 0.16975654380712957, 'num_leaves': 109, 'min_data_in_leaf': 35, 'feature_fraction': 0.8182234916546454, 'bagging_fraction': 0.8655903725582945, 'lambda_l1': 3.0090595304292993, 'lambda_l2': 3.907284352997599, 'bagging_freq': 1, 'max_depth': 10}. Best is trial 31 with value: 0.008750000000000036.


In [28]:
print("\nBest CatBoost:")
print(catboost_study.best_params)
print("Recall@3:", 1 - catboost_study.best_value)

print("\nBest LightGBM:")
print(lgbm_study.best_params)
print("Recall@3:", 1 - lgbm_study.best_value)


Best CatBoost:
{'learning_rate': 0.1329416174119581, 'depth': 8, 'l2_leaf_reg': 6.628807908190987, 'random_strength': 4.881288964388853, 'min_data_in_leaf': 100, 'subsample': 0.708915364734102, 'colsample_bylevel': 0.6444718826631752, 'grow_policy': 'Lossguide'}
Recall@3: 0.985

Best LightGBM:
{'learning_rate': 0.14925362110050383, 'num_leaves': 61, 'min_data_in_leaf': 39, 'feature_fraction': 0.9509281191383269, 'bagging_fraction': 0.9584197752403462, 'lambda_l1': 2.766186105296192, 'lambda_l2': 4.511482480952715, 'bagging_freq': 1, 'max_depth': 9}
Recall@3: 0.99125
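
The next cell reads tuned LightGBM parameters back from a JSON file, but the step that writes that file is not shown in this part of the notebook. A minimal sketch of one way to do the round trip is below. The helper names `save_best_params` and `load_and_merge`, the file name, and the parameter values are made up for illustration; in the notebook the dictionary would come from something like `lgbm_study.best_params`.

```python
import json
import os
import tempfile


def save_best_params(params, path):
    """Write a plain dict of tuned parameters to a JSON file."""
    with open(path, "w") as f:
        json.dump(params, f, indent=2)


def load_and_merge(base_params, path):
    """Load tuned parameters and overlay them on the static base parameters.

    Keys in the tuned file win over keys in base_params, matching the
    {**base_params, **best_params_from_json} merge used in the notebook.
    """
    with open(path, "r") as f:
        tuned = json.load(f)
    return {**base_params, **tuned}


# Stand-in for study.best_params (hypothetical values, not the real results).
example_best = {"learning_rate": 0.15, "num_leaves": 61}

# Hypothetical file name in a temp directory so the example leaves no clutter.
example_path = os.path.join(tempfile.gettempdir(), "example_best_params.json")
save_best_params(example_best, example_path)

merged = load_and_merge(
    {"objective": "lambdarank", "learning_rate": 0.1}, example_path
)
print(merged)
```

Keeping the static settings (objective, metric, verbosity) in code and only the tuned values in the file means a new study can overwrite the file without touching the training cell.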


In [29]:
# Static/default parameters that LightGBM expects
base_params = {
    "objective": "lambdarank",
    "metric": ["ndcg"],
    "eval_at": [1, 3],
    "boosting_type": "gbdt",
    "verbosity": -1,
    "force_row_wise": True,
}

# Load from file
with open("light_gbm_com_best.json", "r") as f:
    best_params_from_json = json.load(f)

# Merge base + Optuna best
final_params = {**base_params, **best_params_from_json}

train_model_lightGBM(parameters=final_params, n_rounds=1000, lr_decay_gamma=0.95)

[1]	train's ndcg@1: 0.712771	train's ndcg@3: 0.881039	val's ndcg@1: 0.49875	val's ndcg@3: 0.697153
Training until validation scores don't improve for 500 rounds
[2]	train's ndcg@1: 0.916362	train's ndcg@3: 0.966712	val's ndcg@1: 0.69125	val's ndcg@3: 0.83077
[3]	train's ndcg@1: 0.934003	train's ndcg@3: 0.973692	val's ndcg@1: 0.7825	val's ndcg@3: 0.894476
[4]	train's ndcg@1: 0.936357	train's ndcg@3: 0.974752	val's ndcg@1: 0.79125	val's ndcg@3: 0.898494
[5]	train's ndcg@1: 0.938536	train's ndcg@3: 0.975715	val's ndcg@1: 0.7925	val's ndcg@3: 0.898792
[6]	train's ndcg@1: 0.942472	train's ndcg@3: 0.977236	val's ndcg@1: 0.79375	val's ndcg@3: 0.899714
[7]	train's ndcg@1: 0.943334	train's ndcg@3: 0.977538	val's ndcg@1: 0.795	val's ndcg@3: 0.899848
[8]	train's ndcg@1: 0.945068	train's ndcg@3: 0.978197	val's ndcg@1: 0.795	val's ndcg@3: 0.900176
[9]	train's ndcg@1: 0.946764	train's ndcg@3: 0.978825	val's ndcg@1: 0.8025	val's ndcg@3: 0.903107
[10]	train's ndcg@1: 0.948822	train's ndcg@3: 0.979593	

  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)


Evaluation Metrics (Validation Set):
Accuracy@1 : 0.8137
Recall@3   : 0.9712
MRR        : 0.8911


  mrr = grouped.apply(reciprocal_rank).mean()


<lightgbm.basic.Booster at 0x7a8a37fa4250>

In [14]:
# Base/defaults for CatBoost
base_params = {
    "loss_function": "YetiRank",
    "eval_metric": "NDCG:top=3",
    "random_seed": 42,
    "task_type": "CPU",  # or "GPU"
}

# Best parameters from the CatBoost Optuna study, pasted inline (not loaded from a file)
best_params_from_json = {
    "learning_rate": 0.1329416174119581,
    "depth": 8,
    "l2_leaf_reg": 6.628807908190987,
    "random_strength": 4.881288964388853,
    "min_data_in_leaf": 100,
    "subsample": 0.708915364734102,
    "colsample_bylevel": 0.6444718826631752,
    "grow_policy": "Lossguide",
}

# Merge base + Optuna best
final_params = {**base_params, **best_params_from_json}

best_catboost_model = train_catboost_model(
    parameters=final_params, n_rounds=1000, val_df_input=val_df
)

0:	test: 0.7810507	best: 0.7810507 (0)	total: 8.58s	remaining: 2h 22m 50s
1:	test: 0.8003522	best: 0.8003522 (1)	total: 17.3s	remaining: 2h 24m 9s
2:	test: 0.8017659	best: 0.8017659 (2)	total: 26.1s	remaining: 2h 24m 43s
3:	test: 0.8056795	best: 0.8056795 (3)	total: 34.4s	remaining: 2h 22m 39s
4:	test: 0.8048909	best: 0.8056795 (3)	total: 43.3s	remaining: 2h 23m 32s
5:	test: 0.8048909	best: 0.8056795 (3)	total: 52s	remaining: 2h 23m 42s
6:	test: 0.8073612	best: 0.8073612 (6)	total: 1m	remaining: 2h 23m 45s
7:	test: 0.8083136	best: 0.8083136 (7)	total: 1m 9s	remaining: 2h 22m 35s
8:	test: 0.8084772	best: 0.8084772 (8)	total: 1m 17s	remaining: 2h 22m 33s
9:	test: 0.8093999	best: 0.8093999 (9)	total: 1m 25s	remaining: 2h 21m 40s
10:	test: 0.8109476	best: 0.8109476 (10)	total: 1m 34s	remaining: 2h 21m 36s
11:	test: 0.8120339	best: 0.8120339 (11)	total: 1m 43s	remaining: 2h 21m 27s
12:	test: 0.8200403	best: 0.8200403 (12)	total: 1m 51s	remaining: 2h 21m 1s
13:	test: 0.8527336	best: 0.852733

  top1 = grouped.apply(lambda g: g.loc[g["score"].idxmax()]).reset_index(drop=True)
  topk = grouped.apply(lambda g: g.nlargest(k, "score")).reset_index(drop=True)



Evaluation Metrics (Validation Set):
Accuracy@1 : 0.8612
Recall@3   : 0.9875
MRR        : 0.9246


  mrr = grouped.apply(reciprocal_rank).mean()


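CatBoost takes it again, with a larger margin this time: on the validation set it reaches Accuracy@1 0.8612 vs 0.8137, Recall@3 0.9875 vs 0.9712, and MRR 0.9246 vs 0.8911 for LightGBM. I think this is the way to go. The next thing to do is to finalise the whole pipeline and do one final training run.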
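
For reference, here is a rough sketch of how the three comparison metrics can be computed from a scored validation frame without `groupby.apply` (the `grouped.apply(...)` lines echoed in the outputs above look like fragments of pandas warnings). This is not the notebook's evaluation function. The column names `query_id`, `score` and `label` are assumptions, and it assumes exactly one relevant candidate (`label == 1`) per group.

```python
import pandas as pd


def ranking_metrics(df, k=3, group_col="query_id", score_col="score", label_col="label"):
    """Accuracy@1, Recall@k and MRR, assuming one relevant row per group.

    Groups with no relevant row contribute nothing to the averages,
    so check the data first if that can happen.
    """
    # Rank candidates within each group: 1 = highest score, ties broken by row order.
    ranks = df.groupby(group_col)[score_col].rank(method="first", ascending=False)
    # Keep only the rank achieved by the relevant candidate in each group.
    hit_ranks = ranks[df[label_col] == 1]
    return {
        "accuracy@1": float((hit_ranks == 1).mean()),
        f"recall@{k}": float((hit_ranks <= k).mean()),
        "mrr": float((1.0 / hit_ranks).mean()),
    }


# Tiny made-up example: two queries, one relevant candidate each.
demo = pd.DataFrame(
    {
        "query_id": [1, 1, 1, 2, 2, 2, 2],
        "score": [0.9, 0.5, 0.1, 0.2, 0.8, 0.6, 0.4],
        "label": [1, 0, 0, 0, 0, 0, 1],
    }
)
metrics = ranking_metrics(demo, k=3)
print(metrics)
```

In the toy data the relevant candidate is ranked first for query 1 and third for query 2, so Accuracy@1 is 0.5, Recall@3 is 1.0, and MRR is the mean of 1 and 1/3.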