# Lesson objective
In this lesson we will apply CatBoost to a regression problem.

CatBoost (Categorical Boosting) is a machine learning library for gradient boosting on decision trees. It was developed by Yandex and has a number of advantages over other gradient boosting libraries such as XGBoost and LightGBM.

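One of CatBoost's main advantages is native handling of categorical features: you only tell the model which columns are categorical (the `cat_features` parameter), and it encodes them internally, so no manual one-hot or label encoding is required. It also handles missing values in numeric features out of the box and ships with reasonable defaults; for example, the learning rate is chosen automatically when it is not set (see the first line of the training log below). CatBoost also supports multithreaded CPU training and GPU training, and accepts numeric, categorical, and text features.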

In [1]:
# Install the catboost library (uncomment if it is not installed yet)
# !pip install catboost

In [2]:
# Import the required libraries
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from catboost import CatBoostRegressor
from sklearn.metrics import mean_squared_error

In [3]:
# Load the data and split it into training and test sets
data = fetch_california_housing()
X = pd.DataFrame(data['data'], columns=data['feature_names'])
y = pd.Series(data['target'])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [4]:
# Initialize and train a CatBoostRegressor with default parameters
model = CatBoostRegressor()
model.fit(X_train, y_train)

Learning rate set to 0.063766
0:	learn: 1.1147097	total: 59ms	remaining: 58.9s
1:	learn: 1.0778029	total: 61ms	remaining: 30.4s
2:	learn: 1.0430071	total: 62.6ms	remaining: 20.8s
3:	learn: 1.0106614	total: 64.3ms	remaining: 16s
4:	learn: 0.9819674	total: 65.8ms	remaining: 13.1s
5:	learn: 0.9546991	total: 67.2ms	remaining: 11.1s
6:	learn: 0.9281868	total: 68.8ms	remaining: 9.76s
7:	learn: 0.9037831	total: 70.5ms	remaining: 8.74s
8:	learn: 0.8825569	total: 72.1ms	remaining: 7.93s
9:	learn: 0.8608727	total: 73.6ms	remaining: 7.29s
10:	learn: 0.8418439	total: 75.1ms	remaining: 6.75s
11:	learn: 0.8242003	total: 76.7ms	remaining: 6.32s
12:	learn: 0.8085311	total: 78.1ms	remaining: 5.93s
13:	learn: 0.7918148	total: 79.4ms	remaining: 5.59s
14:	learn: 0.7785217	total: 81.3ms	remaining: 5.34s
15:	learn: 0.7662624	total: 83.5ms	remaining: 5.13s
16:	learn: 0.7528313	total: 85.3ms	remaining: 4.93s
17:	learn: 0.7416752	total: 86.7ms	remaining: 4.73s
18:	learn: 0.7316411	total: 88.1ms	remaining: 4.55
...

232:	learn: 0.4562158	total: 395ms	remaining: 1.3s
233:	learn: 0.4559226	total: 396ms	remaining: 1.3s
234:	learn: 0.4555963	total: 398ms	remaining: 1.29s
235:	learn: 0.4552493	total: 399ms	remaining: 1.29s
236:	learn: 0.4549251	total: 400ms	remaining: 1.29s
237:	learn: 0.4546846	total: 401ms	remaining: 1.28s
238:	learn: 0.4543230	total: 403ms	remaining: 1.28s
239:	learn: 0.4540855	total: 405ms	remaining: 1.28s
240:	learn: 0.4536201	total: 406ms	remaining: 1.28s
241:	learn: 0.4532681	total: 408ms	remaining: 1.28s
242:	learn: 0.4529652	total: 409ms	remaining: 1.27s
243:	learn: 0.4527180	total: 410ms	remaining: 1.27s
244:	learn: 0.4524231	total: 412ms	remaining: 1.27s
245:	learn: 0.4520919	total: 413ms	remaining: 1.27s
246:	learn: 0.4518734	total: 414ms	remaining: 1.26s
247:	learn: 0.4515555	total: 416ms	remaining: 1.26s
248:	learn: 0.4512858	total: 417ms	remaining: 1.26s
249:	learn: 0.4508883	total: 419ms	remaining: 1.25s
250:	learn: 0.4505844	total: 420ms	remaining: 1.25s
...

497:	learn: 0.3984481	total: 791ms	remaining: 798ms
498:	learn: 0.3983340	total: 793ms	remaining: 796ms
499:	learn: 0.3981924	total: 794ms	remaining: 794ms
500:	learn: 0.3980439	total: 795ms	remaining: 792ms
501:	learn: 0.3979232	total: 796ms	remaining: 790ms
502:	learn: 0.3977006	total: 797ms	remaining: 788ms
503:	learn: 0.3976408	total: 799ms	remaining: 786ms
504:	learn: 0.3975152	total: 800ms	remaining: 784ms
505:	learn: 0.3973568	total: 801ms	remaining: 782ms
506:	learn: 0.3971558	total: 802ms	remaining: 780ms
507:	learn: 0.3969344	total: 804ms	remaining: 779ms
508:	learn: 0.3968406	total: 805ms	remaining: 777ms
509:	learn: 0.3966924	total: 807ms	remaining: 775ms
510:	learn: 0.3966185	total: 808ms	remaining: 773ms
511:	learn: 0.3964841	total: 809ms	remaining: 772ms
512:	learn: 0.3961797	total: 811ms	remaining: 770ms
513:	learn: 0.3960527	total: 812ms	remaining: 768ms
514:	learn: 0.3959269	total: 813ms	remaining: 766ms
515:	learn: 0.3957726	total: 815ms	remaining: 764ms
...

764:	learn: 0.3647082	total: 1.19s	remaining: 364ms
765:	learn: 0.3645694	total: 1.19s	remaining: 363ms
766:	learn: 0.3644659	total: 1.19s	remaining: 361ms
767:	learn: 0.3644030	total: 1.19s	remaining: 360ms
768:	learn: 0.3642909	total: 1.19s	remaining: 358ms
769:	learn: 0.3641889	total: 1.19s	remaining: 356ms
770:	learn: 0.3640961	total: 1.2s	remaining: 355ms
771:	learn: 0.3640330	total: 1.2s	remaining: 354ms
772:	learn: 0.3639340	total: 1.2s	remaining: 352ms
773:	learn: 0.3637815	total: 1.2s	remaining: 350ms
774:	learn: 0.3637084	total: 1.2s	remaining: 349ms
775:	learn: 0.3636285	total: 1.2s	remaining: 347ms
776:	learn: 0.3634953	total: 1.2s	remaining: 346ms
777:	learn: 0.3633941	total: 1.21s	remaining: 344ms
778:	learn: 0.3633098	total: 1.21s	remaining: 342ms
779:	learn: 0.3632252	total: 1.21s	remaining: 341ms
780:	learn: 0.3631540	total: 1.21s	remaining: 339ms
781:	learn: 0.3630184	total: 1.21s	remaining: 338ms
782:	learn: 0.3629240	total: 1.21s	remaining: 336ms
...

<catboost.core.CatBoostRegressor at 0x17b25a8c0>

In [5]:
# Predict on the test set
y_pred = model.predict(X_test)

In [6]:
# Evaluate model quality with mean squared error on the test set
mse = mean_squared_error(y_test, y_pred)

print(f"Mean Squared Error: {mse}")

Mean Squared Error: 0.19760814499635423
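MSE is expressed in squared target units, which makes it hard to read. As a small worked step, the square root (RMSE) brings the error back to the units of the target. In the California housing dataset the target is the median house value in units of $100,000, so the conversion to dollars below relies on that fact; the variable name `reported_mse` is hypothetical and simply holds the value printed above.

```python
import math

# MSE value copied from the output above.
reported_mse = 0.19760814499635423

# RMSE is in the same units as the target (hundreds of thousands of dollars).
rmse = math.sqrt(reported_mse)
rmse_dollars = rmse * 100_000

print(f"RMSE: {rmse:.4f}")
print(f"RMSE in dollars: ~${rmse_dollars:,.0f}")
```

So the model's typical error on the test set is about 0.44 target units, or roughly $44,000 in median house value.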
