# Homework: Learning to Rank

In this homework we will:
- learn to work with [MSLR](https://www.microsoft.com/en-us/research/project/mslr/), a standard dataset for learning to rank
- put into practice everything we learned in the seminar

## How to submit the homework

You need to:
- fork this repo
- create a branch to do your work in
- implement the Model class in this notebook
- make sure your implementation reaches an NDCG@10 above the baseline (see below)
- push your branch and open a Pull Request
- state the score you achieved in a comment

We (the organizers) will then:

- check out your branch
- verify that your implementation really achieves the stated score

Both you and we are expected to work in the same virtual environment as in the learning-to-rank seminar: seminars/7-learning-to-rank/requirements.txt (for more on working with virtual environments, see the README in the root of this repo).

Grading:
- A score above **0.507** earns **5** points; a score of **0.510** or higher earns the maximum of 10 points
- The participant with the highest score gets an extra 10 points

When submitting your code, keep in mind that:
- The code must not contain hardcoded hyperparameters pulled out of thin air (number of trees, learning rate, etc.): you must include the code that tunes them!
- The solution must be stable from run to run (on CPU), i.e. all seeds for random number generators must be fixed
- We (the organizers) will run the code on CPU, so even if you used a GPU for hyperparameter tuning, the final score must be reported on CPU
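The reproducibility requirement above can be sketched as follows. This is a minimal sketch and not part of the original notebook: `SEED` and `seed_everything` are hypothetical names, and the value 123 simply mirrors the `random_seed` used in `DEFAULT_PARAMS` later in this notebook. NumPy is seeded only if it is installed.

```python
import random

SEED = 123  # hypothetical constant; mirrors random_seed in DEFAULT_PARAMS


def seed_everything(seed=SEED):
    """Seed the global Python and (if available) NumPy generators."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy is optional for this sketch
    np.random.seed(seed)


# Re-seeding reproduces the same pseudo-random sequence
seed_everything()
first = [random.random() for _ in range(3)]
seed_everything()
second = [random.random() for _ in range(3)]
print(first == second)
```

Libraries that keep their own generators need the seed passed explicitly: CatBoost takes `random_seed`, and hyperopt's `fmin` accepts an `rstate` argument whose expected type depends on the installed hyperopt version.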

## Prerequisites

Let's import everything we need for the rest of the work:

In [1]:
import pathlib
from timeit import default_timer as timer
import copy

import numpy as np
import pandas as pd

from hyperopt import hp, tpe
from hyperopt.fmin import fmin
from hyperopt.pyll import scope

from catboost import datasets, utils, CatBoost, Pool

## The MSLR (Microsoft Learning to Rank) dataset

Let's load the MSLR dataset.

The full dataset can be downloaded from the official site: https://www.microsoft.com/en-us/research/project/mslr/

Strictly speaking, it consists of 2 parts:

- the main dataset, MSLR-WEB30K, which contains more than 30 thousand queries
- the "small" dataset, MSLR-WEB10K, which contains only 10 thousand queries and is a random sample of MSLR-WEB30K

In this homework we will work with MSLR-WEB10K, because the full version may simply not fit into RAM (let alone GPU memory, if we train on a GPU).

We assume that we have downloaded MSLR-WEB10K from the official site ourselves, placed it in ROOT-OF-THIS-REPO/data/mslr-web10k, and unzipped it. (The cells below were run on Kaggle, where the same data is available under /kaggle/input/mslr-web10k.)

As a result, we should get the following folder structure:

In [2]:
!ls /kaggle/input/

mslr-web10k


Note that the dataset is fairly large: unpacked, it takes up 7.7 GB.

The dataset consists of several folds, which are essentially different splits of the same data into training, validation, and test sets.

From here on we will use only the first fold, Fold1.

Let's look inside:

In [3]:
!ls -lh /kaggle/input/mslr-web10k/Fold1

total 1.3G
-rw-r--r-- 1 nobody nogroup 267M Apr 24 11:09 test.txt
-rw-r--r-- 1 nobody nogroup 800M Apr 24 11:09 train.txt
-rw-r--r-- 1 nobody nogroup 261M Apr 24 11:09 vali.txt


We see 3 files with self-explanatory names that correspond to the splits of our dataset.

Let's look at the contents of one of the files:

In [4]:
!head -n 1 /kaggle/input/mslr-web10k/Fold1/train.txt

2 qid:1 1:3 2:3 3:0 4:0 5:3 6:1 7:1 8:0 9:0 10:1 11:156 12:4 13:0 14:7 15:167 16:6.931275 17:22.076928 18:19.673353 19:22.255383 20:6.926551 21:3 22:3 23:0 24:0 25:6 26:1 27:1 28:0 29:0 30:2 31:1 32:1 33:0 34:0 35:2 36:1 37:1 38:0 39:0 40:2 41:0 42:0 43:0 44:0 45:0 46:0.019231 47:0.75000 48:0 49:0 50:0.035928 51:0.00641 52:0.25000 53:0 54:0 55:0.011976 56:0.00641 57:0.25000 58:0 59:0 60:0.011976 61:0.00641 62:0.25000 63:0 64:0 65:0.011976 66:0 67:0 68:0 69:0 70:0 71:6.931275 72:22.076928 73:0 74:0 75:13.853103 76:1.152128 77:5.99246 78:0 79:0 80:2.297197 81:3.078917 82:8.517343 83:0 84:0 85:6.156595 86:2.310425 87:7.358976 88:0 89:0 90:4.617701 91:0.694726 92:1.084169 93:0 94:0 95:2.78795 96:1 97:1 98:0 99:0 100:1 101:1 102:1 103:0 104:0 105:1 106:12.941469 107:20.59276 108:0 109:0 110:16.766961 111:-18.567793 112:-7.760072 113:-20.838749 114:-25.436074 115:-14.518523 116:-21.710022 117:-21.339609 118:-24.497864 119:-27.690319 120:-20.203779 121:-15.449379 122:-4.474452 123:-23.634899 

The data is in the format already familiar from the seminar:

- The first column holds the target (the assessor's judgment) on a 5-point relevance scale: from 0 to 4 (inclusive)
- The second column holds the query ID, which can be used to group all document judgments belonging to the same query
- Then comes a vector of 136 features (such as BM25 values, etc.); their exact nature does not matter to us right now

In the file, qid and all features are encoded as KEY:VALUE, e.g. 130:116, where 130 is the feature number and 116 is its value.

In the machine learning world this format is often called the SVM-Light format (after the once-popular SVM-Light library).
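To make the KEY:VALUE layout concrete, here is a minimal sketch that parses a single line of this format into a label, a query ID, and a feature dictionary. The function name `parse_svmlight_line` is hypothetical, and the sample line is a shortened version of the one printed above.

```python
def parse_svmlight_line(line):
    """Parse one 'label qid:ID k1:v1 k2:v2 ...' line.

    Returns (label, qid, features), where features maps the
    feature number to its float value.
    """
    fields = line.split()
    label = int(fields[0])

    qid_key, qid_value = fields[1].split(":")
    if qid_key != "qid":
        raise ValueError(f"expected a qid field, got {fields[1]!r}")
    qid = int(qid_value)

    features = {}
    for field in fields[2:]:
        key, value = field.split(":")
        features[int(key)] = float(value)
    return label, qid, features


label, qid, features = parse_svmlight_line("2 qid:1 1:3 2:3 16:6.931275")
print(label, qid, features[16])
```

A dictionary is used here only for readability; the loader in this notebook fills a dense NumPy row instead, which works because every line in MSLR lists all features in order.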

Let's write a bit of helper code to load this dataset:

In [5]:
def generate_column_names(num_features):
    """Generates column names for LETOR-like datasets"""
    columns = ['label', 'qid']
    for i in range(num_features):
        column = f"feature_{i+1}"
        columns.append(column)
    return columns
    
def load_svmlight_file(input_file, max_num_lines=0):
    """Loads dataset split in SVM-Light format"""
    def _parse_field(field):
        parts = field.split(':')
        if len(parts) != 2:
            raise Exception(f"invalid number of parts in field {field}")
        return parts

    num_features = 136
    exp_num_fields = num_features + 2
    num_lines = 0
    X = []
    with open(input_file, 'rt') as f:
        for line in f:
            try:
                num_lines += 1
                                  
                # Parse into fields
                fields = line.rstrip().split(' ')
                num_fields = len(fields)
                if num_fields != exp_num_fields:
                    raise Exception(f"invalid number of fields {num_fields}")
    
                # Parse every field
                x = np.zeros(exp_num_fields, dtype=np.float32)
                label = int(fields[0])
                x[0] = label
                _, qid_str = _parse_field(fields[1])
                qid = int(qid_str)
                x[1] = qid
                for i, field in enumerate(fields[2:]):
                    _, feature_str = _parse_field(field)
                    x[i+2] = float(feature_str)
    
                # Add new object
                X.append(x)
                if num_lines % 50000 == 0:
                    print(f"Loaded {num_lines} lines...")
                if max_num_lines > 0 and num_lines == max_num_lines:
                    print(f"WARNING: stop loading, line limit reached: max_num_lines = {max_num_lines} input_file = {input_file}")
                    break
            except Exception as e:
                raise Exception(f"error at line {num_lines} in {input_file}") from e
    
    # To pandas
    df = pd.DataFrame(X, columns=generate_column_names(num_features))
    print(f"Loaded SVM-Light file {input_file}")
    return df

And now let's load the dataset:

In [6]:
fold_dir = pathlib.Path("/kaggle/input/mslr-web10k/Fold1")

df_train = load_svmlight_file(fold_dir.joinpath("train.txt"))
df_valid = load_svmlight_file(fold_dir.joinpath("vali.txt"))
df_test = load_svmlight_file(fold_dir.joinpath("test.txt"))
print(f"Dataset loaded from fold_dir {fold_dir}")

Loaded 50000 lines...
Loaded 100000 lines...
Loaded 150000 lines...
Loaded 200000 lines...
Loaded 250000 lines...
Loaded 300000 lines...
Loaded 350000 lines...
Loaded 400000 lines...
Loaded 450000 lines...
Loaded 500000 lines...
Loaded 550000 lines...
Loaded 600000 lines...
Loaded 650000 lines...
Loaded 700000 lines...
Loaded SVM-Light file /kaggle/input/mslr-web10k/Fold1/train.txt
Loaded 50000 lines...
Loaded 100000 lines...
Loaded 150000 lines...
Loaded 200000 lines...
Loaded SVM-Light file /kaggle/input/mslr-web10k/Fold1/vali.txt
Loaded 50000 lines...
Loaded 100000 lines...
Loaded 150000 lines...
Loaded 200000 lines...
Loaded SVM-Light file /kaggle/input/mslr-web10k/Fold1/test.txt
Dataset loaded from fold_dir /kaggle/input/mslr-web10k/Fold1


Let's look at the data:

In [7]:
print(df_train.head(5))

   label  qid  feature_1  feature_2  feature_3  feature_4  feature_5  \
0    2.0  1.0        3.0        3.0        0.0        0.0        3.0   
1    2.0  1.0        3.0        0.0        3.0        0.0        3.0   
2    0.0  1.0        3.0        0.0        2.0        0.0        3.0   
3    2.0  1.0        3.0        0.0        3.0        0.0        3.0   
4    1.0  1.0        3.0        0.0        3.0        0.0        3.0   

   feature_6  feature_7  feature_8  ...  feature_127  feature_128  \
0        1.0        1.0   0.000000  ...         62.0   11089534.0   
1        1.0        0.0   1.000000  ...         54.0   11089534.0   
2        1.0        0.0   0.666667  ...         45.0          3.0   
3        1.0        0.0   1.000000  ...         56.0   11089534.0   
4        1.0        0.0   1.000000  ...         64.0          5.0   

   feature_129  feature_130  feature_131  feature_132  feature_133  \
0          2.0        116.0      64034.0         13.0          3.0   
1          2

So the data is now available in exactly the same form as in the seminar.

Let's do a bit of EDA.

In total we have 723 thousand documents in train:

In [8]:
print(df_train.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 723412 entries, 0 to 723411
Columns: 138 entries, label to feature_136
dtypes: float32(138)
memory usage: 380.8 MB
None


235 thousand documents in validation:

In [9]:
print(df_valid.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 235259 entries, 0 to 235258
Columns: 138 entries, label to feature_136
dtypes: float32(138)
memory usage: 123.8 MB
None


And 241 thousand documents in test:

In [10]:
print(df_test.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 241521 entries, 0 to 241520
Columns: 138 entries, label to feature_136
dtypes: float32(138)
memory usage: 127.1 MB
None
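The memory figures reported above follow directly from the shapes: every one of the 138 columns is float32, i.e. 4 bytes per value. The sketch below redoes that arithmetic; `frame_size_mib` is a hypothetical helper, and it assumes that the "MB" printed by pandas is computed with a factor of 1024 (to our knowledge it is), while ignoring the negligible size of the index.

```python
BYTES_PER_FLOAT32 = 4
NUM_COLUMNS = 138  # label + qid + 136 features


def frame_size_mib(num_rows, num_columns=NUM_COLUMNS):
    """Approximate size of a dense float32 table in MiB (2**20 bytes)."""
    return num_rows * num_columns * BYTES_PER_FLOAT32 / 2**20


for name, rows in [("train", 723412), ("valid", 235259), ("test", 241521)]:
    print(f"{name}: {frame_size_mib(rows):.1f} MiB")
```

The three values come out as 380.8, 123.8, and 127.1, matching the `info()` output, so all three splits together need roughly 630 MiB of RAM as DataFrames.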


How many queries do we have?

In [11]:
num_queries_train = df_train['qid'].nunique()
num_queries_valid = df_valid['qid'].nunique()
num_queries_test = df_test['qid'].nunique()
print(f"Got {num_queries_train} train, {num_queries_valid} valid and {num_queries_test} test queries")

Got 6000 train, 2000 valid and 2000 test queries


## Training the model

Now we can move on to actually training the model.

Let's declare the model class that has to be implemented in this homework:

In [12]:
from hyperopt import hp, tpe
from hyperopt.fmin import fmin
from hyperopt.pyll import scope
from catboost import CatBoost, Pool

In [13]:
import torch

train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('CUDA is not available.  Training on CPU ...')
else:
    print('CUDA is available!  Training on GPU ...')

CUDA is available!  Training on GPU ...


In [14]:
EVAL_METRIC = 'NDCG:top=10;type=Exp'
DEFAULT_PARAMS = {
    'n_estimators': 1000,            # maximum possible number of trees
    'eval_metric': EVAL_METRIC,      # metric used for early stopping
    'random_seed': 123,
    'verbose': 10,
    'eta': 0.1,
    'max_bin': 64,
    'max_depth': 4,
    'task_type': 'GPU' if train_on_gpu else 'CPU',  # fall back to CPU when CUDA is unavailable
    'loss_function': 'YetiRank'
}

In [15]:
def to_catboost_dataset(df):
    y = df['label'].to_numpy()                       # Label: [0-4]
    q = df['qid'].to_numpy().astype('uint32')        # Query Id
    X = df.drop(columns=['label', 'qid']).to_numpy() # 136 features
    return (X, y, q)

In [16]:
X_train, y_train, q_train = to_catboost_dataset(df_train)
X_test, y_test, q_test = to_catboost_dataset(df_test)
X_valid, y_valid, q_valid = to_catboost_dataset(df_valid)
        
pool_train = Pool(data=X_train, label=y_train, group_id=q_train)
pool_test = Pool(data=X_test, label=y_test, group_id=q_test)
pool_valid = Pool(data=X_valid, label=y_valid, group_id=q_valid)
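To our knowledge, CatBoost expects all rows that share a `group_id` to sit next to each other in a Pool. The MSLR files already satisfy this, but the property is easy to break by shuffling or filtering rows. The sketch below is a cheap sanity check; `is_grouped_by_query` is a hypothetical helper, not a CatBoost function.

```python
def is_grouped_by_query(qids):
    """Return True if every query ID forms one contiguous run."""
    seen = set()
    previous = None
    for qid in qids:
        if qid != previous:
            if qid in seen:
                # This query ID already appeared in an earlier run
                return False
            seen.add(qid)
            previous = qid
    return True


print(is_grouped_by_query([1, 1, 16, 16, 16, 31]))
print(is_grouped_by_query([1, 16, 1]))
```

In this notebook it could be called as `is_grouped_by_query(q_train.tolist())` before building the pools. Query IDs do not have to be sorted, only contiguous.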

In [17]:
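def objective(params=DEFAULT_PARAMS):
    # Start from the defaults and override them with the values sampled by hyperopt
    p = copy.deepcopy(DEFAULT_PARAMS)
    p.update(params)
    
    model = CatBoost(p)
    
    # use_best_model=True keeps the iteration with the best validation metric
    model.fit(pool_train, eval_set=pool_valid, use_best_model=True)
    valid_hat = model.predict(pool_valid)
    
    # hyperopt minimizes the objective, so return the negated NDCG@10
    return -utils.eval_metric(y_valid, valid_hat, EVAL_METRIC, group_id=q_valid)[0]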

In [18]:
class FindBestModel:
    def __init__(self):
        self.best_params = None
        self.space = {
            'n_estimators': scope.int(hp.quniform('n_estimators', low=500, high=3500, q=1)),
            'max_bin': scope.int(hp.quniform('max_bin', low=63, high=257, q=1)),
            'max_depth': scope.int(hp.quniform('max_depth', low=4, high=10, q=1))
        }
        
    def find_best(self):
        self.best_params = fmin(fn=objective, space=self.space, algo=tpe.suggest, max_evals=15, show_progressbar=True)
        
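    def get_best(self):
        params = copy.deepcopy(DEFAULT_PARAMS)
        # fmin returns the raw sampled values as floats (e.g. 111.0), so cast them back to int
        params.update({name: int(value) for name, value in self.best_params.items()})
        model = CatBoost(params)
        model.fit(pool_train, eval_set=pool_valid, use_best_model=True)
        
        return model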

In [19]:
model = FindBestModel()

start = timer()
model.find_best()
elapsed = timer() - start

print(f"Model fit: elapsed = {elapsed:.3f}")

  0%|          | 0/15 [00:00<?, ?trial/s, best loss=?]

Default metric period is 5 because NDCG
 is/are not implemented for GPU

Metric NDCG:type=Base is not implemented on GPU. Will use CPU for metric computation, this could significantly affect learning time

Metric NDCG:top=10;type=Exp is not implemented on GPU. Will use CPU for metric computation, this could significantly affect learning time



0:	test: 0.3305098	best: 0.3305098 (0)	total: 512ms	remaining: 25m 51s

10:	test: 0.4235520	best: 0.4235520 (10)	total: 972ms	remaining: 4m 27s

20:	test: 0.4318579	best: 0.4318579 (20)	total: 1.29s	remaining: 3m 5s

30:	test: 0.4440280	best: 0.4440280 (30)	total: 1.61s	remaining: 2m 36s

40:	test: 0.4515038	best: 0.4515038 (40)	total: 1.93s	remaining: 2m 21s

50:	test: 0.4559475	best: 0.4559475 (50)	total: 2.24s	remaining: 2m 11s

60:	test: 0.4597177	best: 0.4597177 (60)	total: 2.56s	remaining: 2m 4s

70:	test: 0.4643410	best: 0.4643410 (70)	total: 2.88s	remaining: 2m

80:	test: 0.4673886	best: 0.4673886 (80)	total: 3.19s	remaining: 1m 56s

90:	test: 0.4707830	best: 0.4707830 (90)	total: 3.5s	remaining: 1m 53s

100:	test: 0.4732528	best: 0.4732528 (100)	total: 3.82s	remaining: 1m 50s

110:	test: 0.4759078	best: 0.4759078 (110)	total: 4.12s	remaining: 1m 48s

120:	test: 0.4779856	best: 0.4779856 (120)	total: 4.41s	remaining: 1m 46s

130:	test: 0.4803785	best: 0.4803785 (130)	total: 4.7




0:	test: 0.3305098	best: 0.3305098 (0)	total: 82.6ms	remaining: 57.4s            

10:	test: 0.4259285	best: 0.4259285 (10)	total: 532ms	remaining: 33.2s           

20:	test: 0.4323091	best: 0.4323091 (20)	total: 862ms	remaining: 27.7s           

30:	test: 0.4438363	best: 0.4438363 (30)	total: 1.19s	remaining: 25.5s           

40:	test: 0.4515857	best: 0.4515857 (40)	total: 1.52s	remaining: 24.2s           

50:	test: 0.4545468	best: 0.4545468 (50)	total: 1.84s	remaining: 23.3s           

60:	test: 0.4605629	best: 0.4605629 (60)	total: 2.18s	remaining: 22.7s           

70:	test: 0.4627597	best: 0.4627597 (70)	total: 2.51s	remaining: 22.1s           

80:	test: 0.4675805	best: 0.4675805 (80)	total: 2.83s	remaining: 21.5s           

90:	test: 0.4699340	best: 0.4699340 (90)	total: 3.16s	remaining: 21s             

100:	test: 0.4726433	best: 0.4726433 (100)	total: 3.49s	remaining: 20.5s         

110:	test: 0.4750135	best: 0.4750135 (110)	total: 3.81s	remaining: 20.1s         

120:




0:	test: 0.3035030	best: 0.3035030 (0)	total: 66.3ms	remaining: 3m 39s           

10:	test: 0.4094649	best: 0.4094649 (10)	total: 452ms	remaining: 2m 15s          

20:	test: 0.4239933	best: 0.4239933 (20)	total: 716ms	remaining: 1m 52s          

30:	test: 0.4380516	best: 0.4380516 (30)	total: 979ms	remaining: 1m 43s          

40:	test: 0.4459011	best: 0.4459011 (40)	total: 1.24s	remaining: 1m 39s          

50:	test: 0.4522585	best: 0.4522585 (50)	total: 1.51s	remaining: 1m 36s          

60:	test: 0.4566453	best: 0.4566453 (60)	total: 1.77s	remaining: 1m 34s          

70:	test: 0.4606864	best: 0.4606864 (70)	total: 2.03s	remaining: 1m 32s          

80:	test: 0.4637705	best: 0.4637705 (80)	total: 2.29s	remaining: 1m 31s          

90:	test: 0.4654960	best: 0.4654960 (90)	total: 2.55s	remaining: 1m 30s          

100:	test: 0.4680721	best: 0.4680721 (100)	total: 2.81s	remaining: 1m 29s        

110:	test: 0.4704299	best: 0.4704299 (110)	total: 3.07s	remaining: 1m 28s        

120:




0:	test: 0.2686260	best: 0.2686260 (0)	total: 61.5ms	remaining: 2m 19s           

10:	test: 0.3914561	best: 0.3914561 (10)	total: 409ms	remaining: 1m 24s          

20:	test: 0.4140621	best: 0.4140621 (20)	total: 657ms	remaining: 1m 10s          

30:	test: 0.4289021	best: 0.4289021 (30)	total: 893ms	remaining: 1m 4s           

40:	test: 0.4403381	best: 0.4403381 (40)	total: 1.13s	remaining: 1m 1s           

50:	test: 0.4453322	best: 0.4453322 (50)	total: 1.36s	remaining: 59.4s           

60:	test: 0.4513907	best: 0.4513907 (60)	total: 1.6s	remaining: 58.1s            

70:	test: 0.4541289	best: 0.4541289 (70)	total: 1.84s	remaining: 57.1s           

80:	test: 0.4572853	best: 0.4572853 (80)	total: 2.1s	remaining: 56.8s            

90:	test: 0.4598308	best: 0.4598308 (90)	total: 2.34s	remaining: 56.1s           

100:	test: 0.4621531	best: 0.4621531 (100)	total: 2.58s	remaining: 55.4s         

110:	test: 0.4639538	best: 0.4639538 (110)	total: 2.82s	remaining: 54.9s         

120:




0:	test: 0.3454577	best: 0.3454577 (0)	total: 93.1ms	remaining: 4m 54s           

10:	test: 0.4331950	best: 0.4331950 (10)	total: 638ms	remaining: 3m 2s           

20:	test: 0.4469700	best: 0.4469700 (20)	total: 1.09s	remaining: 2m 43s          

30:	test: 0.4552667	best: 0.4552667 (30)	total: 1.55s	remaining: 2m 36s          

40:	test: 0.4618028	best: 0.4618028 (40)	total: 2.01s	remaining: 2m 33s          

50:	test: 0.4656601	best: 0.4656601 (50)	total: 2.47s	remaining: 2m 30s          

60:	test: 0.4690902	best: 0.4690902 (60)	total: 2.93s	remaining: 2m 28s          

70:	test: 0.4718949	best: 0.4718949 (70)	total: 3.39s	remaining: 2m 27s          

80:	test: 0.4756263	best: 0.4756263 (80)	total: 3.88s	remaining: 2m 27s          

90:	test: 0.4790121	best: 0.4790121 (90)	total: 4.38s	remaining: 2m 27s          

100:	test: 0.4828266	best: 0.4828266 (100)	total: 4.83s	remaining: 2m 26s        

110:	test: 0.4841857	best: 0.4841857 (110)	total: 5.29s	remaining: 2m 25s        

120:




0:	test: 0.3448810	best: 0.3448810 (0)	total: 90.8ms	remaining: 1m 2s            

10:	test: 0.4337507	best: 0.4337507 (10)	total: 629ms	remaining: 38.8s           

20:	test: 0.4492481	best: 0.4492481 (20)	total: 1.03s	remaining: 32.7s           

30:	test: 0.4544605	best: 0.4544605 (30)	total: 1.43s	remaining: 30.3s           

40:	test: 0.4600075	best: 0.4600075 (40)	total: 1.82s	remaining: 28.8s           

50:	test: 0.4653929	best: 0.4653929 (50)	total: 2.22s	remaining: 27.8s           

60:	test: 0.4695723	best: 0.4695723 (60)	total: 2.62s	remaining: 27s             

70:	test: 0.4736115	best: 0.4736115 (70)	total: 3.02s	remaining: 26.3s           

80:	test: 0.4766646	best: 0.4766646 (80)	total: 3.42s	remaining: 25.7s           

90:	test: 0.4807572	best: 0.4807572 (90)	total: 3.82s	remaining: 25.1s           

100:	test: 0.4832644	best: 0.4832644 (100)	total: 4.22s	remaining: 24.6s         

110:	test: 0.4849614	best: 0.4849614 (110)	total: 4.62s	remaining: 24.1s         

120:




0:	test: 0.2354538	best: 0.2354538 (0)	total: 50.3ms	remaining: 1m 11s           

10:	test: 0.3887324	best: 0.3887324 (10)	total: 372ms	remaining: 47.6s           

20:	test: 0.4050617	best: 0.4050617 (20)	total: 577ms	remaining: 38.5s           

30:	test: 0.4178292	best: 0.4178292 (30)	total: 765ms	remaining: 34.3s           

40:	test: 0.4282227	best: 0.4282227 (40)	total: 953ms	remaining: 32.1s           

50:	test: 0.4314208	best: 0.4314208 (50)	total: 1.16s	remaining: 31.2s           

60:	test: 0.4415318	best: 0.4415318 (60)	total: 1.37s	remaining: 30.6s           

70:	test: 0.4465823	best: 0.4465823 (70)	total: 1.59s	remaining: 30.2s           

80:	test: 0.4518575	best: 0.4518575 (80)	total: 1.8s	remaining: 29.8s            

90:	test: 0.4534747	best: 0.4534747 (90)	total: 2.02s	remaining: 29.5s           

100:	test: 0.4560308	best: 0.4560308 (100)	total: 2.23s	remaining: 29.2s         

110:	test: 0.4573881	best: 0.4573881 (110)	total: 2.46s	remaining: 29s           

120:




0:	test: 0.3454577	best: 0.3454577 (0)	total: 111ms	remaining: 3m 34s            

10:	test: 0.4290658	best: 0.4290658 (10)	total: 674ms	remaining: 1m 58s          

20:	test: 0.4473333	best: 0.4473333 (20)	total: 1.14s	remaining: 1m 44s          

30:	test: 0.4543474	best: 0.4543474 (30)	total: 1.62s	remaining: 1m 40s          

40:	test: 0.4605663	best: 0.4605663 (40)	total: 2.09s	remaining: 1m 37s          

50:	test: 0.4666727	best: 0.4666727 (50)	total: 2.57s	remaining: 1m 35s          

60:	test: 0.4692063	best: 0.4692063 (60)	total: 3.04s	remaining: 1m 33s          

70:	test: 0.4723051	best: 0.4723051 (70)	total: 3.51s	remaining: 1m 32s          

80:	test: 0.4771152	best: 0.4771152 (80)	total: 3.98s	remaining: 1m 31s          

90:	test: 0.4789892	best: 0.4789892 (90)	total: 4.45s	remaining: 1m 30s          

100:	test: 0.4812594	best: 0.4812594 (100)	total: 4.92s	remaining: 1m 29s        

110:	test: 0.4837223	best: 0.4837223 (110)	total: 5.39s	remaining: 1m 28s        

120:




0:	test: 0.2686260	best: 0.2686260 (0)	total: 55.3ms	remaining: 2m 1s            

10:	test: 0.3980837	best: 0.3980837 (10)	total: 385ms	remaining: 1m 16s          

20:	test: 0.4155278	best: 0.4155278 (20)	total: 607ms	remaining: 1m 2s           

30:	test: 0.4281692	best: 0.4281692 (30)	total: 835ms	remaining: 58.1s           

40:	test: 0.4373016	best: 0.4373016 (40)	total: 1.07s	remaining: 56.1s           

50:	test: 0.4430516	best: 0.4430516 (50)	total: 1.3s	remaining: 54.7s            

60:	test: 0.4496925	best: 0.4496925 (60)	total: 1.54s	remaining: 53.7s           

70:	test: 0.4532435	best: 0.4532435 (70)	total: 1.77s	remaining: 52.7s           

80:	test: 0.4572926	best: 0.4572926 (80)	total: 1.98s	remaining: 51.6s           

90:	test: 0.4592057	best: 0.4592057 (90)	total: 2.2s	remaining: 50.7s            

100:	test: 0.4624363	best: 0.4624363 (100)	total: 2.43s	remaining: 50.2s         

110:	test: 0.4650298	best: 0.4650298 (110)	total: 2.66s	remaining: 49.8s         

120:




0:	test: 0.2717128	best: 0.2717128 (0)	total: 54.1ms	remaining: 2m 18s           

10:	test: 0.3953301	best: 0.3953301 (10)	total: 365ms	remaining: 1m 24s          

20:	test: 0.4103364	best: 0.4103364 (20)	total: 599ms	remaining: 1m 12s          

30:	test: 0.4272071	best: 0.4272071 (30)	total: 825ms	remaining: 1m 7s           

40:	test: 0.4362770	best: 0.4362770 (40)	total: 1.03s	remaining: 1m 3s           

50:	test: 0.4431007	best: 0.4431007 (50)	total: 1.25s	remaining: 1m 1s           

60:	test: 0.4501155	best: 0.4501155 (60)	total: 1.46s	remaining: 59.7s           

70:	test: 0.4533306	best: 0.4533306 (70)	total: 1.67s	remaining: 58.4s           

80:	test: 0.4564760	best: 0.4564760 (80)	total: 1.87s	remaining: 57.4s           

90:	test: 0.4588103	best: 0.4588103 (90)	total: 2.08s	remaining: 56.5s           

100:	test: 0.4612105	best: 0.4612105 (100)	total: 2.29s	remaining: 55.9s         

110:	test: 0.4637424	best: 0.4637424 (110)	total: 2.51s	remaining: 55.3s         

120:




0:	test: 0.2347339	best: 0.2347339 (0)	total: 57.8ms	remaining: 2m 59s            

10:	test: 0.3895952	best: 0.3895952 (10)	total: 393ms	remaining: 1m 50s           

20:	test: 0.4074315	best: 0.4074315 (20)	total: 640ms	remaining: 1m 33s           

30:	test: 0.4189417	best: 0.4189417 (30)	total: 868ms	remaining: 1m 26s           

40:	test: 0.4282542	best: 0.4282542 (40)	total: 1.08s	remaining: 1m 20s           

50:	test: 0.4347243	best: 0.4347243 (50)	total: 1.31s	remaining: 1m 18s           

60:	test: 0.4422436	best: 0.4422436 (60)	total: 1.54s	remaining: 1m 17s           

70:	test: 0.4468532	best: 0.4468532 (70)	total: 1.78s	remaining: 1m 16s           

80:	test: 0.4504671	best: 0.4504671 (80)	total: 2.03s	remaining: 1m 15s           

90:	test: 0.4531859	best: 0.4531859 (90)	total: 2.25s	remaining: 1m 14s           

100:	test: 0.4567612	best: 0.4567612 (100)	total: 2.47s	remaining: 1m 13s         

110:	test: 0.4577927	best: 0.4583827 (105)	total: 2.68s	remaining: 1m 12s   




0:	test: 0.3029651	best: 0.3029651 (0)	total: 58.9ms	remaining: 3m 1s             

10:	test: 0.4099538	best: 0.4099538 (10)	total: 393ms	remaining: 1m 49s           

20:	test: 0.4255094	best: 0.4255094 (20)	total: 651ms	remaining: 1m 35s           

30:	test: 0.4381382	best: 0.4381382 (30)	total: 909ms	remaining: 1m 29s           

40:	test: 0.4459608	best: 0.4459608 (40)	total: 1.17s	remaining: 1m 26s           

50:	test: 0.4519370	best: 0.4519370 (50)	total: 1.43s	remaining: 1m 24s           

60:	test: 0.4555992	best: 0.4555992 (60)	total: 1.71s	remaining: 1m 24s           

70:	test: 0.4602166	best: 0.4602166 (70)	total: 1.97s	remaining: 1m 23s           

80:	test: 0.4638481	best: 0.4638481 (80)	total: 2.23s	remaining: 1m 22s           

90:	test: 0.4661155	best: 0.4661155 (90)	total: 2.49s	remaining: 1m 22s           

100:	test: 0.4680726	best: 0.4680726 (100)	total: 2.75s	remaining: 1m 21s         

110:	test: 0.4697756	best: 0.4697756 (110)	total: 3.01s	remaining: 1m 20s   




0:	test: 0.2691789	best: 0.2691789 (0)	total: 57.9ms	remaining: 1m 57s            

10:	test: 0.3975317	best: 0.3975317 (10)	total: 406ms	remaining: 1m 14s           

20:	test: 0.4115430	best: 0.4115430 (20)	total: 634ms	remaining: 1m               

30:	test: 0.4259471	best: 0.4259471 (30)	total: 860ms	remaining: 55.5s            

40:	test: 0.4380428	best: 0.4380428 (40)	total: 1.08s	remaining: 52.7s            

50:	test: 0.4440155	best: 0.4440155 (50)	total: 1.31s	remaining: 50.9s            

60:	test: 0.4497332	best: 0.4497332 (60)	total: 1.53s	remaining: 49.6s            

70:	test: 0.4535811	best: 0.4535811 (70)	total: 1.76s	remaining: 48.7s            

80:	test: 0.4580418	best: 0.4580418 (80)	total: 1.99s	remaining: 47.9s            

90:	test: 0.4589226	best: 0.4589226 (90)	total: 2.21s	remaining: 47.3s            

100:	test: 0.4621927	best: 0.4621927 (100)	total: 2.44s	remaining: 46.8s          

110:	test: 0.4643027	best: 0.4643027 (110)	total: 2.69s	remaining: 46.6s    




0:	test: 0.2701194	best: 0.2701194 (0)	total: 67.6ms	remaining: 2m 48s            

10:	test: 0.4007102	best: 0.4007102 (10)	total: 407ms	remaining: 1m 31s           

20:	test: 0.4132794	best: 0.4132794 (20)	total: 648ms	remaining: 1m 16s           

30:	test: 0.4267182	best: 0.4267182 (30)	total: 880ms	remaining: 1m 10s           

40:	test: 0.4381576	best: 0.4381576 (40)	total: 1.11s	remaining: 1m 6s            

50:	test: 0.4434908	best: 0.4434908 (50)	total: 1.34s	remaining: 1m 4s            

60:	test: 0.4483035	best: 0.4483035 (60)	total: 1.57s	remaining: 1m 3s            

70:	test: 0.4525847	best: 0.4525847 (70)	total: 1.81s	remaining: 1m 2s            

80:	test: 0.4562688	best: 0.4562688 (80)	total: 2.04s	remaining: 1m               

90:	test: 0.4594746	best: 0.4594746 (90)	total: 2.26s	remaining: 59.9s            

100:	test: 0.4617401	best: 0.4617401 (100)	total: 2.48s	remaining: 58.9s          

110:	test: 0.4640238	best: 0.4640238 (110)	total: 2.69s	remaining: 58s      




0:	test: 0.2686260	best: 0.2686260 (0)	total: 61.5ms	remaining: 3m 9s             

10:	test: 0.3916085	best: 0.3916085 (10)	total: 412ms	remaining: 1m 55s           

20:	test: 0.4142697	best: 0.4142697 (20)	total: 653ms	remaining: 1m 35s           

30:	test: 0.4289084	best: 0.4289084 (30)	total: 889ms	remaining: 1m 27s           

40:	test: 0.4389140	best: 0.4389140 (40)	total: 1.1s	remaining: 1m 21s            

50:	test: 0.4448593	best: 0.4448593 (50)	total: 1.31s	remaining: 1m 18s           

60:	test: 0.4502921	best: 0.4502921 (60)	total: 1.52s	remaining: 1m 15s           

70:	test: 0.4538242	best: 0.4538242 (70)	total: 1.78s	remaining: 1m 15s           

80:	test: 0.4573320	best: 0.4573320 (80)	total: 2.02s	remaining: 1m 15s           

90:	test: 0.4597935	best: 0.4597935 (90)	total: 2.27s	remaining: 1m 14s           

100:	test: 0.4618585	best: 0.4618585 (100)	total: 2.5s	remaining: 1m 13s          

110:	test: 0.4629360	best: 0.4629360 (110)	total: 2.74s	remaining: 1m 13s   

Let's build and apply the model:

In [20]:
print(model.best_params)

{'max_bin': 111.0, 'max_depth': 10.0, 'n_estimators': 3164.0}
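Note that the values printed above are floats even though the search space wraps every parameter in `scope.int`: `fmin` returns the raw sampled values, not the evaluated space. Below is a minimal sketch of casting them back before building the final model; `cast_integer_params` is a hypothetical helper, and the default key list is an assumption that matches the search space used in this notebook. hyperopt also provides `space_eval(space, best)`, which re-applies the space's own conversions.

```python
def cast_integer_params(best, integer_keys=("n_estimators", "max_bin", "max_depth")):
    """Return a copy of `best` with the listed parameters cast to int."""
    casted = dict(best)
    for key in integer_keys:
        if key in casted:
            casted[key] = int(round(casted[key]))
    return casted


best = {'max_bin': 111.0, 'max_depth': 10.0, 'n_estimators': 3164.0}
print(cast_integer_params(best))
```

The input dictionary is left untouched, so the raw result of the search stays available for logging.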


In [29]:
# Fit
start = timer()
best_model = model.get_best()
elapsed = timer() - start

print(f"Model fit: elapsed = {elapsed:.3f}")

Save the model:

In [22]:
model_file = "/kaggle/working/model.cbm"

# Save and zip model
best_model.save_model(model_file)
!zip model /kaggle/working/model.cbm

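In [26]:
# Load the saved model under a new name so that `model` (the FindBestModel instance) is not shadowed
loaded_model = CatBoost()
loaded_model.load_model(model_file)

<catboost.core.CatBoost at 0x7bd219bccdf0>

In [32]:
y_hat_test = loaded_model.predict(pool_test)
print(f"Predicted: y_hat_test.shape = {y_hat_test.shape}")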

Predicted: y_hat_test.shape = (241521,)


Now that we have predictions, we can compute quality metrics:

In [33]:
def compute_metrics(y_true, y_hat, q):
    # List of metrics to evaluate
    eval_metrics = ['NDCG:top=10;type=Exp']
    
    for eval_metric in eval_metrics:
        scores = utils.eval_metric(y_true, y_hat, eval_metric, group_id=q)
    
        # Print scores
        print(f"metric = {eval_metric} score = {scores[0]:.3f}")

# Get test targets and groups
y_test = df_test['label'].to_numpy()
q_test = df_test['qid'].to_numpy().astype('uint32')
    
# Compute metrics on test
compute_metrics(y_test, y_hat_test, q_test)

metric = NDCG:top=10;type=Exp score = 0.511
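For intuition about what `NDCG:top=10;type=Exp` measures, here is a minimal pure-Python sketch of NDCG@10 with exponential gain, averaged over queries. It is an illustration under stated assumptions, not CatBoost's implementation: `dcg_at_k` and `ndcg_at_k` are hypothetical names, ties keep their input order, and a query with no relevant documents is counted as 1.0. Small numerical differences from `utils.eval_metric` are therefore possible.

```python
import math
from collections import defaultdict


def dcg_at_k(labels, k=10):
    """DCG with exponential gain: sum of (2^rel - 1) / log2(position + 1)."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(labels[:k]))


def ndcg_at_k(y_true, y_score, group_ids, k=10):
    """Mean NDCG@k over queries; documents are ranked by descending score."""
    groups = defaultdict(list)
    for rel, score, qid in zip(y_true, y_score, group_ids):
        groups[qid].append((score, rel))

    total = 0.0
    for docs in groups.values():
        ranked = [rel for _, rel in sorted(docs, key=lambda d: d[0], reverse=True)]
        ideal = sorted((rel for _, rel in docs), reverse=True)
        ideal_dcg = dcg_at_k(ideal, k)
        # Assumption of this sketch: a query without relevant documents scores 1.0
        total += dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 1.0
    return total / len(groups)


# One query, two documents, ranked in the wrong order
print(ndcg_at_k([0, 2], [0.9, 0.1], [1, 1]))
```

In the toy call, the relevant document lands in second position, so the score is (3 / log2(3)) / 3, about 0.63, instead of 1.0.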


We expect your model to beat the baseline!

 **Result: 0.511**