### TP AutoML CatBoost - Part 3
#### Main features

* **Great quality without parameter tuning:**
Reduce the time spent on parameter tuning, because CatBoost provides excellent results with default parameters.
* **Categorical feature support:**
Improve your training results with CatBoost, which lets you use non-numeric factors instead of having to pre-process your data or spend time and effort converting it to numbers.
* **Fast and scalable GPU version:**
Train your model on a fast GPU implementation of the gradient boosting algorithm. Use a multi-card configuration for large datasets.
* **Improved accuracy:**
Reduce overfitting when building your models with a novel gradient boosting scheme.
* **Fast prediction:**
Apply your trained model quickly and efficiently, even in latency-critical tasks, using CatBoost's model applier.

[Library home page (documentation and examples)](https://catboost.ai/)

To showcase the library's most relevant features for categorical attributes, we will use the "titanic" binary classification dataset.

Some components must be installed in order to display some of the feature-analysis plots:
```
!pip install catboost
!pip install ipywidgets
!jupyter nbextension enable --py widgetsnbextension
```

In [48]:
# import the basic libraries
import numpy as np
import pandas as pd

from numpy.random import RandomState

# CatBoost ships with built-in datasets for testing
from catboost.datasets import titanic
# CatBoost classifier, Pool to bundle the dataset information, and cv for cross-validation
from catboost import CatBoostClassifier, Pool, cv
from catboost import MetricVisualizer

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [49]:
train_df, test_df = titanic()

train_df.head()

|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 |  | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 |  | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 |  | S |


We check for NaN and missing values, then fill them with a sentinel value so that the model can take them into account.

In [50]:
null_value_stats = train_df.isnull().sum(axis=0)
null_value_stats[null_value_stats != 0]

Age         177
Cabin       687
Embarked      2
dtype: int64

In [51]:
train_df.fillna(-999, inplace=True)
test_df.fillna(-999, inplace=True)

We split the data into the features and the target to predict, which is the "Survived" column.

In [52]:
X = train_df.drop('Survived', axis=1)
y = train_df.Survived

In [53]:
# check the data types of the columns
print(X.dtypes)

PassengerId      int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object


In [54]:
# build the list of column indices that hold categorical attributes
# (np.float was removed in NumPy 1.24; the builtin float is equivalent here)
categorical_features_indices = np.where(X.dtypes != float)[0]

# int64 columns are also treated as categorical, i.e. as IDs or category codes

In [55]:
categorical_features_indices

array([ 0,  1,  2,  3,  5,  6,  7,  9, 10])

The dataset is split into training and validation sets.

In [56]:
X_train, X_validation, y_train, y_validation = train_test_split(X, y, train_size=0.75, random_state=42)
X_test = test_df

### Train the model

Let's create the model with default parameters, since they provide a really good baseline. The `custom_loss` parameter is specified so that we can follow this metric (Accuracy) during training, alongside Logloss.
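To make the two metrics concrete, here is a minimal pure-Python sketch of binary Logloss and Accuracy. The helper names (`logloss`, `accuracy`) and the toy data are made up for illustration and are not CatBoost's implementation. It also shows why the first logged values in the training output sit close to 0.69: a model that predicts a probability of 0.5 for every row scores ln 2.

```python
import math

def logloss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; p_pred holds P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of rows whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

# Toy labels and probabilities (made up for illustration).
y_true = [1, 0, 1, 0]
uninformed = [0.5, 0.5, 0.5, 0.5]
confident = [0.9, 0.2, 0.7, 0.4]

print(round(logloss(y_true, uninformed), 4))  # ln(2), about 0.6931
print(round(logloss(y_true, confident), 4))
print(accuracy(y_true, confident))
```

Logloss rewards well-calibrated probabilities, while Accuracy only looks at which side of the threshold each prediction falls on, so the two can move differently during training.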

In [57]:
model = CatBoostClassifier(
    custom_loss=['Accuracy'],
    random_seed=42,
    verbose=50
)

In [58]:
model.fit(
    X_train, y_train,
    cat_features=categorical_features_indices,
    eval_set=(X_validation, y_validation),
    plot=True
);

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))

Learning rate set to 0.028683
0:	learn: 0.6739988	test: 0.6742630	best: 0.6742630 (0)	total: 55.2ms	remaining: 55.2s
50:	learn: 0.4167287	test: 0.4462036	best: 0.4462036 (50)	total: 379ms	remaining: 7.04s
100:	learn: 0.3711681	test: 0.4134127	best: 0.4134127 (100)	total: 583ms	remaining: 5.19s
150:	learn: 0.3448945	test: 0.4008998	best: 0.4008998 (150)	total: 802ms	remaining: 4.51s
200:	learn: 0.3265759	test: 0.3967844	best: 0.3965197 (194)	total: 1.05s	remaining: 4.16s
250:	learn: 0.3105904	test: 0.3946490	best: 0.3945034 (247)	total: 1.37s	remaining: 4.09s
300:	learn: 0.2953614	test: 0.3944096	best: 0.3934154 (283)	total: 1.66s	remaining: 3.86s
350:	learn: 0.2777571	test: 0.3940834	best: 0.3925689 (343)	total: 1.9s	remaining: 3.51s
400:	learn: 0.2594838	test: 0.3946853	best: 0.3925689 (343)	total: 2.3s	remaining: 3.43s
450:	learn: 0.2447373	test: 0.3979196	best: 0.3925689 (343)	total: 2.54s	remaining: 3.1s
500:	learn: 0.2299916	test: 0.4000835	best: 0.3925689 (343)	total: 2.81s	remai

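The best validation accuracy obtained was about 83%, at around iteration 150. Accuracy is tracked through `custom_loss`; the values printed in the log above are Logloss.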

#### Validate with cross-validation

In [59]:
# Stratified cross-validation is not requested here

cv_params = model.get_params()
cv_params.update({
    'loss_function': 'Logloss'
})

cv_data = cv(
    Pool(X, y, cat_features=categorical_features_indices),
    cv_params,
    plot=True
)


MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))

0:	learn: 0.6764772	test: 0.6769355	best: 0.6769355 (0)	total: 1.13s	remaining: 18m 46s
50:	learn: 0.4032720	test: 0.4493630	best: 0.4493630 (50)	total: 58.3s	remaining: 18m 3s
100:	learn: 0.3517610	test: 0.4281837	best: 0.4281837 (100)	total: 1m 58s	remaining: 17m 33s
150:	learn: 0.3258848	test: 0.4231984	best: 0.4231186 (142)	total: 3m 3s	remaining: 17m 10s
200:	learn: 0.3043477	test: 0.4217408	best: 0.4213916 (167)	total: 4m 8s	remaining: 16m 29s
250:	learn: 0.2864213	test: 0.4214269	best: 0.4210862 (237)	total: 5m 20s	remaining: 15m 56s
300:	learn: 0.2690885	test: 0.4225761	best: 0.4210862 (237)	total: 6m 36s	remaining: 15m 20s
350:	learn: 0.2501281	test: 0.4236048	best: 0.4210862 (237)	total: 7m 57s	remaining: 14m 42s
400:	learn: 0.2324071	test: 0.4248877	best: 0.4210862 (237)	total: 9m 9s	remaining: 13m 40s
450:	learn: 0.2183595	test: 0.4250668	best: 0.4210862 (237)	total: 10m 18s	remaining: 12m 33s
500:	learn: 0.2055346	test: 0.4271164	best: 0.4210862 (237)	total: 11m 25s	remain

In [60]:
# Stratified cross-validation was not requested, so the standard deviation may be larger

print('Best validation accuracy: {:.2f}±{:.2f} at iteration {}'.format(
    np.max(cv_data['test-Accuracy-mean']),
    cv_data['test-Accuracy-std'][np.argmax(cv_data['test-Accuracy-mean'])],
    np.argmax(cv_data['test-Accuracy-mean'])
))

Best validation accuracy: 0.83±0.02 at iteration 543


In [61]:
print('Exact validation accuracy: {}'.format(np.max(cv_data['test-Accuracy-mean'])))

Exact validation accuracy: 0.8260381593714928


The initial estimate of 83% accuracy on a single validation set was slightly optimistic compared with the cross-validated 82.6%, which is why cross-validation is worth running.

### Cross-validation

In [62]:
def print_cv_summary(cv_data):
    cv_data.head(10)

    best_value = cv_data['test-Logloss-mean'].min()
    best_iter = cv_data['test-Logloss-mean'].values.argmin()

    print('Best Logloss value: {:.4f}±{:.4f} at iteration {}'.format(
        best_value,
        cv_data['test-Logloss-std'][best_iter],
        best_iter)
    )

#### Stratified cross-validation
If `stratified=True`, each fold contains the same percentage of rows of each class as the full dataset. If `stratified=False`, the folds are sampled uniformly.

Stratified cross-validation usually yields a smaller standard deviation of the loss function value.
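The idea behind stratification can be sketched in plain Python. This is only an illustration under simple assumptions (no shuffling, round-robin assignment) with a hypothetical helper `stratified_folds`; it is not how CatBoost builds its folds internally.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, fold_count):
    """Assign each row index to a fold so class ratios stay roughly equal."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(fold_count)]
    for indices in by_class.values():
        # Deal the rows of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % fold_count].append(idx)
    return folds

# Toy target: 60% class 0, 40% class 1 (made-up data).
labels = [0] * 12 + [1] * 8
folds = stratified_folds(labels, fold_count=4)

for fold in folds:
    print(Counter(labels[i] for i in fold))
```

Every fold ends up with the same 60/40 class mix as the full target, which is what keeps the per-fold metric values close to each other.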

In [63]:
params = {
    'loss_function': 'Logloss',
    'iterations': 1000,
    'custom_loss': 'AUC',
    'learning_rate': 0.05,
}

In [64]:
cv_data = cv(
    params = params,
    pool = Pool(X, y, cat_features=categorical_features_indices),
    fold_count=5,
    shuffle=True,
    partition_random_seed=0,
    plot=True,
    stratified=True,
    verbose=10
)

print_cv_summary(cv_data)

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))

0:	learn: 0.6673448	test: 0.6672121	best: 0.6672121 (0)	total: 2s	remaining: 33m 19s
10:	learn: 0.5012465	test: 0.5160606	best: 0.5160606 (10)	total: 22.9s	remaining: 34m 18s
20:	learn: 0.4412190	test: 0.4696547	best: 0.4696547 (20)	total: 43s	remaining: 33m 22s
30:	learn: 0.4082074	test: 0.4491204	best: 0.4491204 (30)	total: 1m 3s	remaining: 32m 56s
40:	learn: 0.3860414	test: 0.4319088	best: 0.4319088 (40)	total: 1m 23s	remaining: 32m 23s
50:	learn: 0.3702245	test: 0.4212413	best: 0.4212413 (50)	total: 1m 43s	remaining: 32m 3s
60:	learn: 0.3590748	test: 0.4188728	best: 0.4188728 (60)	total: 2m 3s	remaining: 31m 42s
70:	learn: 0.3490142	test: 0.4157400	best: 0.4157400 (70)	total: 2m 24s	remaining: 31m 24s
80:	learn: 0.3404113	test: 0.4131682	best: 0.4131682 (80)	total: 2m 44s	remaining: 31m 3s
90:	learn: 0.3336128	test: 0.4120402	best: 0.4120220 (89)	total: 3m 5s	remaining: 30m 52s
100:	learn: 0.3273486	test: 0.4117166	best: 0.4117166 (100)	total: 3m 26s	remaining: 30m 37s
110:	learn: 

890:	learn: 0.1010352	test: 0.4473438	best: 0.4117166 (100)	total: 32m 13s	remaining: 3m 56s
900:	learn: 0.1000364	test: 0.4477044	best: 0.4117166 (100)	total: 32m 34s	remaining: 3m 34s
910:	learn: 0.0991702	test: 0.4482303	best: 0.4117166 (100)	total: 32m 55s	remaining: 3m 12s
920:	learn: 0.0982790	test: 0.4482049	best: 0.4117166 (100)	total: 33m 16s	remaining: 2m 51s
930:	learn: 0.0970930	test: 0.4487783	best: 0.4117166 (100)	total: 33m 38s	remaining: 2m 29s
940:	learn: 0.0960667	test: 0.4490552	best: 0.4117166 (100)	total: 33m 58s	remaining: 2m 7s
950:	learn: 0.0952262	test: 0.4493474	best: 0.4117166 (100)	total: 34m 20s	remaining: 1m 46s
960:	learn: 0.0943647	test: 0.4502592	best: 0.4117166 (100)	total: 34m 41s	remaining: 1m 24s
970:	learn: 0.0933399	test: 0.4511375	best: 0.4117166 (100)	total: 35m 2s	remaining: 1m 2s
980:	learn: 0.0924160	test: 0.4515593	best: 0.4117166 (100)	total: 35m 23s	remaining: 41.1s
990:	learn: 0.0913994	test: 0.4519991	best: 0.4117166 (100)	total: 35m 43s

The following parameters control how the dataset is split for cross-validation:
- `stratified` (True / False)
- `folds` (stratified_shuffle_split / shuffle_split / repeated_split / custom iterator)
- `type` (Classical / Inverted / TimeSeries)


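**CatBoost can use sklearn folds:** https://scikit-learn.org/stable/modules/classes.html#module-sklearn.model_selection

Consider three sklearn splitters: ShuffleSplit, RepeatedKFold, and StratifiedShuffleSplit. The snippets below assume the `params` dictionary defined above and a `train_pool` built from the training split, as done later in this notebook.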
    
    
```python
from sklearn.model_selection import ShuffleSplit
shuffle_split = ShuffleSplit(n_splits=5, test_size=.25, random_state=0)
cv_data = cv(
    params = params,
    pool = train_pool,
    verbose = False,
    folds = shuffle_split,
    plot=True
)

```

```python
from sklearn.model_selection import RepeatedKFold
repeated_split = RepeatedKFold(n_splits=5, n_repeats=2, random_state=2652124)
cv_data = cv(
    params = params,
    pool = train_pool,
    verbose = False,
    plot=True,
    folds = repeated_split
)

```

```python
from sklearn.model_selection import StratifiedShuffleSplit
stratified_shuffle_split = StratifiedShuffleSplit(n_splits=5, test_size=0.5, random_state=0)
cv_data = cv(
    params = params,
    pool = train_pool,
    verbose = False,
    folds = stratified_shuffle_split,
    plot = True
)
```
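The parameter list above also mentions a custom iterator for `folds`. As a rough sketch, and assuming such an iterator yields `(train_idx, test_idx)` pairs the way scikit-learn splitters do, a hand-written equivalent of a shuffle split could look like this. The generator `shuffle_split` is hypothetical, and the exact contract expected by `cv` should be checked in the CatBoost documentation.

```python
import random

def shuffle_split(n_rows, n_splits, test_size, seed=0):
    """Yield (train_idx, test_idx) pairs from independent random permutations."""
    rng = random.Random(seed)
    n_test = int(round(n_rows * test_size))
    for _ in range(n_splits):
        order = list(range(n_rows))
        rng.shuffle(order)
        # The first n_test shuffled rows form the test set, the rest the train set.
        yield sorted(order[n_test:]), sorted(order[:n_test])

splits = list(shuffle_split(n_rows=8, n_splits=3, test_size=0.25))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Unlike k-fold splitting, each split here is drawn independently, so a row may appear in several test sets or in none.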
    
    
    
    

#### Fold types in CatBoost

If ```type = 'Classical'```, each test set consists of a single fold, and the corresponding training set consists of the remaining k-1 folds.

If ```type = 'Inverted'```, each test set consists of k-1 folds, and the corresponding training set consists of the remaining fold.

If ```type = 'TimeSeries'```, each training set consists of the first k folds, and the corresponding test set consists of the (k + 1)-th fold.

This fold type typically reduces the standard deviation of the loss function value when the order of the rows in the dataset matters.


```python
cv_data = cv(
    params = params,
    pool = train_pool,
    fold_count=5,
    shuffle=True,
    plot=True,
    type = 'TimeSeries',
    verbose=False
)
```
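The three fold types described in this section can be illustrated at the level of fold numbers. This is a sketch of the descriptions given here, with a hypothetical helper `fold_splits`; it does not call CatBoost and may differ from its internals in details.

```python
def fold_splits(n_folds, fold_type):
    """Return (train_folds, test_folds) pairs of fold numbers for each split."""
    folds = list(range(n_folds))
    if fold_type == 'Classical':
        # Test on one fold, train on the other k-1 folds.
        return [([f for f in folds if f != k], [k]) for k in folds]
    if fold_type == 'Inverted':
        # Train on one fold, test on the other k-1 folds.
        return [([k], [f for f in folds if f != k]) for k in folds]
    if fold_type == 'TimeSeries':
        # Train on the first k folds, test on the fold that follows them.
        return [(folds[:k], [k]) for k in range(1, n_folds)]
    raise ValueError('unknown fold type: {}'.format(fold_type))

for fold_type in ('Classical', 'Inverted', 'TimeSeries'):
    print(fold_type, fold_splits(3, fold_type))
```

In the `TimeSeries` layout the training folds always precede the test fold, so the model is never evaluated on rows that come before its training data.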

### Applying the predictions
* The `predict` method returns a list with the predicted labels for the dataset passed in (here, the test set).

* The `predict_proba` method returns a list with an array giving the probability of each label for the dataset passed in.

In [65]:
pred = model.predict(X_test)
print(pred[:10])

[0 0 0 0 1 0 1 0 1 0]


In [66]:
pred_proba = model.predict_proba(X_test)

# probability of label 0 and probability of label 1
print(pred_proba[:10])

[[0.85473931 0.14526069]
 [0.76313031 0.23686969]
 [0.88972889 0.11027111]
 [0.87876173 0.12123827]
 [0.3611047  0.6388953 ]
 [0.90513381 0.09486619]
 [0.33434185 0.66565815]
 [0.78468564 0.21531436]
 [0.39429048 0.60570952]
 [0.94047549 0.05952451]]
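The link between the two outputs can be checked by hand. The sketch below copies the ten probability rows printed above and picks the most probable class per row, assuming `predict` uses that rule by default (for two classes this is the same as a 0.5 threshold).

```python
# Class probabilities copied from the predict_proba output above:
# column 0 is P(label 0), column 1 is P(label 1).
pred_proba = [
    [0.85473931, 0.14526069],
    [0.76313031, 0.23686969],
    [0.88972889, 0.11027111],
    [0.87876173, 0.12123827],
    [0.3611047, 0.6388953],
    [0.90513381, 0.09486619],
    [0.33434185, 0.66565815],
    [0.78468564, 0.21531436],
    [0.39429048, 0.60570952],
    [0.94047549, 0.05952451],
]

# Pick the index of the largest probability in each row.
labels = [row.index(max(row)) for row in pred_proba]
print(labels)  # [0, 0, 0, 0, 1, 0, 1, 0, 1, 0]
```

The result matches the first ten labels printed by `model.predict(X_test)`.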


### CatBoost functionality

1. **Use best model**
   
It is generally better to use the ```use_best_model``` parameter during training. It is enabled (True) by default when a validation set is provided. When enabled, the resulting tree ensemble is truncated to the best iteration. For example, if 5000 iterations are requested, the resulting ensemble may contain fewer trees.

In [67]:
params_catboost = {
    'loss_function': 'Logloss',
    'iterations': 1000,
    'custom_loss': 'AUC',
    'learning_rate': 0.05,
    'use_best_model': False  # True by default when an eval set is provided; set it to False to keep the full ensemble
}

We instantiate the classification model using the parameters defined above.

In [68]:
# model without use_best_model
catboost_model = CatBoostClassifier(**params_catboost)

# set use_best_model to True and create another classifier
best_model_params = params_catboost.copy()
best_model_params.update({
    'use_best_model': True
})

catboost_best_model = CatBoostClassifier(**best_model_params)


# create the train and validation pools
train_pool = Pool(X_train, 
                  y_train, 
                  cat_features=categorical_features_indices
                 )

validate_pool = Pool(X_validation, 
                     y_validation, 
                     cat_features=categorical_features_indices
                    )

In [69]:
# train both models
catboost_model.fit(train_pool, 
                   eval_set=validate_pool,
                   plot=False,
                   verbose=False
                  );

catboost_best_model.fit(train_pool, 
                   eval_set=validate_pool,
                   plot=False,
                   verbose=False
                  );


In [70]:
print('Validation accuracy: {:.4}'.format(
    accuracy_score(y_validation, catboost_model.predict(X_validation))
))
print('Tree count: ' +  str(catboost_model.tree_count_))
print('')

print('Best model validation accuracy: {:.4}'.format(
    accuracy_score(y_validation, catboost_best_model.predict(X_validation))
))
print('Tree count: ' +  str(catboost_best_model.tree_count_) + ' - the tree count was truncated to the best iteration' )
print('')

Validation accuracy: 0.8251
Tree count: 1000

Best model validation accuracy: 0.8206
Tree count: 225 - the tree count was truncated to the best iteration
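The truncation can be pictured as taking the iteration with the lowest validation loss and keeping the trees built up to and including it. The sketch below uses made-up loss values and a hypothetical helper `best_iteration`; it only illustrates the bookkeeping. It is consistent with the numbers in this notebook: the training log further below reports the best iteration as 224, and iterations are counted from 0, which gives the 225 trees shown above.

```python
def best_iteration(eval_losses):
    """Index of the lowest validation loss (the first one on ties)."""
    return min(range(len(eval_losses)), key=eval_losses.__getitem__)

# Made-up validation Logloss per iteration: it improves, then overfits.
eval_losses = [0.67, 0.52, 0.44, 0.41, 0.40, 0.42, 0.45, 0.47]

best = best_iteration(eval_losses)
kept_trees = best + 1  # iterations are 0-based, so keep best + 1 trees

print(best, kept_trees)  # 4 5
```

Note that the best iteration is chosen on the loss, so the truncated model is not guaranteed to win on a different metric such as accuracy.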



2. **Early stopping**

It is usually easier and better to use early stopping. This feature is similar to the previous one, but in addition to improving quality it also saves time.
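The time saving comes from stopping the training loop once the validation loss has stopped improving. The following is a minimal sketch, assuming that `early_stopping_rounds` (used later in this section) acts as a patience counter on the evaluation metric. The function `train_with_early_stopping` and the loss values are hypothetical and stand in for the real training loop.

```python
def train_with_early_stopping(eval_losses, patience):
    """Return (best_iteration, iterations_run) for a stream of validation losses."""
    best_iter, best_loss = 0, float('inf')
    for iteration, loss in enumerate(eval_losses):
        if loss < best_loss:
            best_iter, best_loss = iteration, loss
        elif iteration - best_iter >= patience:
            # No improvement for `patience` iterations: stop training here.
            return best_iter, iteration + 1
    return best_iter, len(eval_losses)

# Made-up validation Logloss per iteration.
eval_losses = [0.67, 0.52, 0.44, 0.41, 0.40, 0.42, 0.45, 0.47, 0.50, 0.53]

print(train_with_early_stopping(eval_losses, patience=3))    # (4, 8)
print(train_with_early_stopping(eval_losses, patience=100))  # (4, 10)
```

Both runs find the same best iteration, but the run with a small patience stops after 8 iterations instead of 10. On a real model with 1000 iterations the skipped work is proportionally larger.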

In [71]:
%%time
model = CatBoostClassifier(**best_model_params)
model.fit(train_pool, eval_set=validate_pool)

0:	learn: 0.6661115	test: 0.6664227	best: 0.6664227 (0)	total: 13.1ms	remaining: 13.1s
1:	learn: 0.6362938	test: 0.6373589	best: 0.6373589 (1)	total: 21.8ms	remaining: 10.9s
2:	learn: 0.6129752	test: 0.6138551	best: 0.6138551 (2)	total: 34.7ms	remaining: 11.5s
3:	learn: 0.5913690	test: 0.5936460	best: 0.5936460 (3)	total: 44ms	remaining: 11s
4:	learn: 0.5720040	test: 0.5745107	best: 0.5745107 (4)	total: 49.2ms	remaining: 9.79s
5:	learn: 0.5549127	test: 0.5589440	best: 0.5589440 (5)	total: 62ms	remaining: 10.3s
6:	learn: 0.5397623	test: 0.5458768	best: 0.5458768 (6)	total: 68.1ms	remaining: 9.66s
7:	learn: 0.5321961	test: 0.5386301	best: 0.5386301 (7)	total: 70.7ms	remaining: 8.77s
8:	learn: 0.5217373	test: 0.5290940	best: 0.5290940 (8)	total: 77.6ms	remaining: 8.55s
9:	learn: 0.5099187	test: 0.5181228	best: 0.5181228 (9)	total: 88.5ms	remaining: 8.76s
10:	learn: 0.4997593	test: 0.5091932	best: 0.5091932 (10)	total: 95.9ms	remaining: 8.62s
11:	learn: 0.4894315	test: 0.4990581	best: 0.49

112:	learn: 0.3316453	test: 0.4017035	best: 0.4017035 (112)	total: 1.01s	remaining: 7.92s
113:	learn: 0.3309326	test: 0.4022236	best: 0.4017035 (112)	total: 1.02s	remaining: 7.91s
114:	learn: 0.3300984	test: 0.4017055	best: 0.4017035 (112)	total: 1.03s	remaining: 7.92s
115:	learn: 0.3300476	test: 0.4017062	best: 0.4017035 (112)	total: 1.03s	remaining: 7.86s
116:	learn: 0.3296911	test: 0.4019286	best: 0.4017035 (112)	total: 1.04s	remaining: 7.87s
117:	learn: 0.3289675	test: 0.4024211	best: 0.4017035 (112)	total: 1.05s	remaining: 7.87s
118:	learn: 0.3274966	test: 0.4030426	best: 0.4017035 (112)	total: 1.06s	remaining: 7.86s
119:	learn: 0.3269642	test: 0.4031505	best: 0.4017035 (112)	total: 1.07s	remaining: 7.84s
120:	learn: 0.3252655	test: 0.4032811	best: 0.4017035 (112)	total: 1.08s	remaining: 7.82s
121:	learn: 0.3245251	test: 0.4031647	best: 0.4017035 (112)	total: 1.08s	remaining: 7.8s
122:	learn: 0.3231629	test: 0.4023914	best: 0.4017035 (112)	total: 1.09s	remaining: 7.79s
123:	learn:

223:	learn: 0.2736367	test: 0.3939295	best: 0.3936039 (222)	total: 2.04s	remaining: 7.05s
224:	learn: 0.2721890	test: 0.3934973	best: 0.3934973 (224)	total: 2.05s	remaining: 7.05s
225:	learn: 0.2720077	test: 0.3938456	best: 0.3934973 (224)	total: 2.06s	remaining: 7.06s
226:	learn: 0.2714347	test: 0.3940282	best: 0.3934973 (224)	total: 2.07s	remaining: 7.05s
227:	learn: 0.2707710	test: 0.3945215	best: 0.3934973 (224)	total: 2.08s	remaining: 7.04s
228:	learn: 0.2706186	test: 0.3948545	best: 0.3934973 (224)	total: 2.09s	remaining: 7.02s
229:	learn: 0.2690993	test: 0.3936733	best: 0.3934973 (224)	total: 2.1s	remaining: 7.01s
230:	learn: 0.2682132	test: 0.3939455	best: 0.3934973 (224)	total: 2.11s	remaining: 7.01s
231:	learn: 0.2665247	test: 0.3960825	best: 0.3934973 (224)	total: 2.12s	remaining: 7.01s
232:	learn: 0.2662854	test: 0.3960323	best: 0.3934973 (224)	total: 2.13s	remaining: 7.01s
233:	learn: 0.2662065	test: 0.3959754	best: 0.3934973 (224)	total: 2.14s	remaining: 7.01s
234:	learn:

328:	learn: 0.2229219	test: 0.4018457	best: 0.3934973 (224)	total: 3.05s	remaining: 6.21s
329:	learn: 0.2227278	test: 0.4017520	best: 0.3934973 (224)	total: 3.06s	remaining: 6.2s
330:	learn: 0.2223145	test: 0.4019490	best: 0.3934973 (224)	total: 3.06s	remaining: 6.2s
331:	learn: 0.2212678	test: 0.4021486	best: 0.3934973 (224)	total: 3.07s	remaining: 6.18s
332:	learn: 0.2206488	test: 0.4019114	best: 0.3934973 (224)	total: 3.08s	remaining: 6.17s
333:	learn: 0.2198825	test: 0.4018050	best: 0.3934973 (224)	total: 3.09s	remaining: 6.16s
334:	learn: 0.2190972	test: 0.4017126	best: 0.3934973 (224)	total: 3.12s	remaining: 6.18s
335:	learn: 0.2187154	test: 0.4020202	best: 0.3934973 (224)	total: 3.13s	remaining: 6.2s
336:	learn: 0.2184664	test: 0.4022574	best: 0.3934973 (224)	total: 3.14s	remaining: 6.19s
337:	learn: 0.2182589	test: 0.4018414	best: 0.3934973 (224)	total: 3.15s	remaining: 6.17s
338:	learn: 0.2176334	test: 0.4023003	best: 0.3934973 (224)	total: 3.16s	remaining: 6.17s
339:	learn: 0

434:	learn: 0.1863233	test: 0.4030989	best: 0.3934973 (224)	total: 4.07s	remaining: 5.28s
435:	learn: 0.1862809	test: 0.4029619	best: 0.3934973 (224)	total: 4.08s	remaining: 5.27s
436:	learn: 0.1859309	test: 0.4034397	best: 0.3934973 (224)	total: 4.1s	remaining: 5.28s
437:	learn: 0.1856564	test: 0.4036096	best: 0.3934973 (224)	total: 4.11s	remaining: 5.27s
438:	learn: 0.1855794	test: 0.4035988	best: 0.3934973 (224)	total: 4.12s	remaining: 5.26s
439:	learn: 0.1852298	test: 0.4038266	best: 0.3934973 (224)	total: 4.13s	remaining: 5.25s
440:	learn: 0.1846981	test: 0.4037714	best: 0.3934973 (224)	total: 4.13s	remaining: 5.24s
441:	learn: 0.1841001	test: 0.4042563	best: 0.3934973 (224)	total: 4.14s	remaining: 5.23s
442:	learn: 0.1836799	test: 0.4045402	best: 0.3934973 (224)	total: 4.15s	remaining: 5.22s
443:	learn: 0.1830037	test: 0.4043488	best: 0.3934973 (224)	total: 4.16s	remaining: 5.21s
444:	learn: 0.1829182	test: 0.4042936	best: 0.3934973 (224)	total: 4.17s	remaining: 5.2s
445:	learn: 

526:	learn: 0.1623362	test: 0.4063700	best: 0.3934973 (224)	total: 5.12s	remaining: 4.6s
527:	learn: 0.1621115	test: 0.4065510	best: 0.3934973 (224)	total: 5.13s	remaining: 4.59s
528:	learn: 0.1620903	test: 0.4065453	best: 0.3934973 (224)	total: 5.14s	remaining: 4.58s
529:	learn: 0.1615827	test: 0.4072985	best: 0.3934973 (224)	total: 5.15s	remaining: 4.57s
530:	learn: 0.1614671	test: 0.4071436	best: 0.3934973 (224)	total: 5.17s	remaining: 4.56s
531:	learn: 0.1612529	test: 0.4071254	best: 0.3934973 (224)	total: 5.19s	remaining: 4.56s
532:	learn: 0.1608891	test: 0.4074124	best: 0.3934973 (224)	total: 5.21s	remaining: 4.56s
533:	learn: 0.1606280	test: 0.4075417	best: 0.3934973 (224)	total: 5.22s	remaining: 4.55s
534:	learn: 0.1603693	test: 0.4075007	best: 0.3934973 (224)	total: 5.23s	remaining: 4.54s
535:	learn: 0.1601960	test: 0.4072386	best: 0.3934973 (224)	total: 5.24s	remaining: 4.53s
536:	learn: 0.1600332	test: 0.4069727	best: 0.3934973 (224)	total: 5.25s	remaining: 4.52s
537:	learn:

630:	learn: 0.1386964	test: 0.4155630	best: 0.3934973 (224)	total: 6.12s	remaining: 3.58s
631:	learn: 0.1386049	test: 0.4155320	best: 0.3934973 (224)	total: 6.13s	remaining: 3.57s
632:	learn: 0.1384697	test: 0.4155111	best: 0.3934973 (224)	total: 6.14s	remaining: 3.56s
633:	learn: 0.1383596	test: 0.4156923	best: 0.3934973 (224)	total: 6.15s	remaining: 3.55s
634:	learn: 0.1382533	test: 0.4161131	best: 0.3934973 (224)	total: 6.17s	remaining: 3.54s
635:	learn: 0.1381519	test: 0.4159563	best: 0.3934973 (224)	total: 6.17s	remaining: 3.53s
636:	learn: 0.1379729	test: 0.4153001	best: 0.3934973 (224)	total: 6.18s	remaining: 3.52s
637:	learn: 0.1376866	test: 0.4155902	best: 0.3934973 (224)	total: 6.19s	remaining: 3.51s
638:	learn: 0.1376108	test: 0.4157340	best: 0.3934973 (224)	total: 6.2s	remaining: 3.5s
639:	learn: 0.1373940	test: 0.4155320	best: 0.3934973 (224)	total: 6.21s	remaining: 3.5s
640:	learn: 0.1373056	test: 0.4155109	best: 0.3934973 (224)	total: 6.22s	remaining: 3.48s
641:	learn: 0

733:	learn: 0.1225613	test: 0.4162365	best: 0.3934973 (224)	total: 7.14s	remaining: 2.59s
734:	learn: 0.1223399	test: 0.4164676	best: 0.3934973 (224)	total: 7.15s	remaining: 2.58s
735:	learn: 0.1221428	test: 0.4163327	best: 0.3934973 (224)	total: 7.16s	remaining: 2.57s
736:	learn: 0.1220187	test: 0.4165281	best: 0.3934973 (224)	total: 7.17s	remaining: 2.56s
737:	learn: 0.1218730	test: 0.4168618	best: 0.3934973 (224)	total: 7.18s	remaining: 2.55s
738:	learn: 0.1218568	test: 0.4168313	best: 0.3934973 (224)	total: 7.19s	remaining: 2.54s
739:	learn: 0.1214881	test: 0.4171112	best: 0.3934973 (224)	total: 7.2s	remaining: 2.53s
740:	learn: 0.1213694	test: 0.4172766	best: 0.3934973 (224)	total: 7.21s	remaining: 2.52s
741:	learn: 0.1212176	test: 0.4173662	best: 0.3934973 (224)	total: 7.21s	remaining: 2.51s
742:	learn: 0.1211018	test: 0.4175433	best: 0.3934973 (224)	total: 7.23s	remaining: 2.5s
743:	learn: 0.1209372	test: 0.4174329	best: 0.3934973 (224)	total: 7.24s	remaining: 2.49s
744:	learn: 

835:	learn: 0.1091916	test: 0.4217603	best: 0.3934973 (224)	total: 8.16s	remaining: 1.6s
836:	learn: 0.1090841	test: 0.4217743	best: 0.3934973 (224)	total: 8.17s	remaining: 1.59s
837:	learn: 0.1089637	test: 0.4221013	best: 0.3934973 (224)	total: 8.18s	remaining: 1.58s
838:	learn: 0.1088849	test: 0.4220676	best: 0.3934973 (224)	total: 8.18s	remaining: 1.57s
839:	learn: 0.1088501	test: 0.4220255	best: 0.3934973 (224)	total: 8.19s	remaining: 1.56s
840:	learn: 0.1086580	test: 0.4220097	best: 0.3934973 (224)	total: 8.2s	remaining: 1.55s
841:	learn: 0.1082784	test: 0.4229413	best: 0.3934973 (224)	total: 8.21s	remaining: 1.54s
842:	learn: 0.1082444	test: 0.4229678	best: 0.3934973 (224)	total: 8.21s	remaining: 1.53s
843:	learn: 0.1082174	test: 0.4230005	best: 0.3934973 (224)	total: 8.22s	remaining: 1.52s
844:	learn: 0.1080392	test: 0.4232288	best: 0.3934973 (224)	total: 8.24s	remaining: 1.51s
845:	learn: 0.1079757	test: 0.4233507	best: 0.3934973 (224)	total: 8.25s	remaining: 1.5s
846:	learn: 0

940:	learn: 0.0972237	test: 0.4332945	best: 0.3934973 (224)	total: 9.14s	remaining: 573ms
941:	learn: 0.0970403	test: 0.4336040	best: 0.3934973 (224)	total: 9.15s	remaining: 563ms
942:	learn: 0.0969263	test: 0.4339504	best: 0.3934973 (224)	total: 9.16s	remaining: 554ms
943:	learn: 0.0967754	test: 0.4339189	best: 0.3934973 (224)	total: 9.17s	remaining: 544ms
944:	learn: 0.0967069	test: 0.4340484	best: 0.3934973 (224)	total: 9.18s	remaining: 534ms
945:	learn: 0.0967016	test: 0.4340586	best: 0.3934973 (224)	total: 9.19s	remaining: 524ms
946:	learn: 0.0966063	test: 0.4343879	best: 0.3934973 (224)	total: 9.2s	remaining: 515ms
947:	learn: 0.0965588	test: 0.4342716	best: 0.3934973 (224)	total: 9.21s	remaining: 505ms
948:	learn: 0.0964985	test: 0.4343901	best: 0.3934973 (224)	total: 9.22s	remaining: 495ms
949:	learn: 0.0964171	test: 0.4343663	best: 0.3934973 (224)	total: 9.23s	remaining: 486ms
950:	learn: 0.0963118	test: 0.4349201	best: 0.3934973 (224)	total: 9.24s	remaining: 476ms
951:	learn:

<catboost.core.CatBoostClassifier at 0x7f684c43eda0>

In [72]:
%%time
earlystop_params = best_model_params.copy()
earlystop_params.update({
    'early_stopping_rounds': 40
})

model = CatBoostClassifier(**best_model_params)
model.fit(train_pool, eval_set=validate_pool)

earlystop_model = CatBoostClassifier(**earlystop_params)
earlystop_model.fit(train_pool, eval_set=validate_pool);

0:	learn: 0.6661115	test: 0.6664227	best: 0.6664227 (0)	total: 7.59ms	remaining: 7.58s
1:	learn: 0.6362938	test: 0.6373589	best: 0.6373589 (1)	total: 14.5ms	remaining: 7.24s
2:	learn: 0.6129752	test: 0.6138551	best: 0.6138551 (2)	total: 29.9ms	remaining: 9.94s
3:	learn: 0.5913690	test: 0.5936460	best: 0.5936460 (3)	total: 45.9ms	remaining: 11.4s
4:	learn: 0.5720040	test: 0.5745107	best: 0.5745107 (4)	total: 51.1ms	remaining: 10.2s
5:	learn: 0.5549127	test: 0.5589440	best: 0.5589440 (5)	total: 64.8ms	remaining: 10.7s
6:	learn: 0.5397623	test: 0.5458768	best: 0.5458768 (6)	total: 77.2ms	remaining: 10.9s
7:	learn: 0.5321961	test: 0.5386301	best: 0.5386301 (7)	total: 79.9ms	remaining: 9.91s
8:	learn: 0.5217373	test: 0.5290940	best: 0.5290940 (8)	total: 83.5ms	remaining: 9.19s
9:	learn: 0.5099187	test: 0.5181228	best: 0.5181228 (9)	total: 93.7ms	remaining: 9.27s
10:	learn: 0.4997593	test: 0.5091932	best: 0.5091932 (10)	total: 102ms	remaining: 9.18s
11:	learn: 0.4894315	test: 0.4990581	best:

102:	learn: 0.3374607	test: 0.4019946	best: 0.4019946 (102)	total: 925ms	remaining: 8.05s
103:	learn: 0.3369098	test: 0.4022174	best: 0.4019946 (102)	total: 933ms	remaining: 8.04s
104:	learn: 0.3367348	test: 0.4024092	best: 0.4019946 (102)	total: 950ms	remaining: 8.09s
105:	learn: 0.3358539	test: 0.4024527	best: 0.4019946 (102)	total: 956ms	remaining: 8.06s
106:	learn: 0.3353567	test: 0.4025521	best: 0.4019946 (102)	total: 963ms	remaining: 8.04s
107:	learn: 0.3348592	test: 0.4024851	best: 0.4019946 (102)	total: 977ms	remaining: 8.07s
108:	learn: 0.3340554	test: 0.4021210	best: 0.4019946 (102)	total: 992ms	remaining: 8.11s
109:	learn: 0.3327101	test: 0.4028799	best: 0.4019946 (102)	total: 1s	remaining: 8.12s
110:	learn: 0.3325930	test: 0.4024938	best: 0.4019946 (102)	total: 1.01s	remaining: 8.08s
111:	learn: 0.3320595	test: 0.4020349	best: 0.4019946 (102)	total: 1.01s	remaining: 8.04s
112:	learn: 0.3316453	test: 0.4017035	best: 0.4017035 (112)	total: 1.02s	remaining: 8.03s
113:	learn: 0

196:	learn: 0.2870780	test: 0.3953145	best: 0.3939341 (172)	total: 1.81s	remaining: 7.4s
197:	learn: 0.2861543	test: 0.3958038	best: 0.3939341 (172)	total: 1.82s	remaining: 7.38s
198:	learn: 0.2860905	test: 0.3957613	best: 0.3939341 (172)	total: 1.83s	remaining: 7.37s
199:	learn: 0.2844734	test: 0.3941847	best: 0.3939341 (172)	total: 1.84s	remaining: 7.36s
200:	learn: 0.2843032	test: 0.3941173	best: 0.3939341 (172)	total: 1.86s	remaining: 7.4s
201:	learn: 0.2841056	test: 0.3940873	best: 0.3939341 (172)	total: 1.87s	remaining: 7.39s
202:	learn: 0.2838816	test: 0.3944619	best: 0.3939341 (172)	total: 1.88s	remaining: 7.38s
203:	learn: 0.2834191	test: 0.3946855	best: 0.3939341 (172)	total: 1.89s	remaining: 7.37s
204:	learn: 0.2833602	test: 0.3946689	best: 0.3939341 (172)	total: 1.9s	remaining: 7.36s
205:	learn: 0.2823548	test: 0.3950284	best: 0.3939341 (172)	total: 1.91s	remaining: 7.35s
206:	learn: 0.2822267	test: 0.3950451	best: 0.3939341 (172)	total: 1.92s	remaining: 7.34s
207:	learn: 0

300:	learn: 0.2358881	test: 0.4025869	best: 0.3934973 (224)	total: 2.86s	remaining: 6.64s
301:	learn: 0.2358284	test: 0.4025841	best: 0.3934973 (224)	total: 2.87s	remaining: 6.62s
302:	learn: 0.2357635	test: 0.4023529	best: 0.3934973 (224)	total: 2.87s	remaining: 6.61s
303:	learn: 0.2353495	test: 0.4024581	best: 0.3934973 (224)	total: 2.88s	remaining: 6.59s
304:	learn: 0.2345390	test: 0.4022856	best: 0.3934973 (224)	total: 2.89s	remaining: 6.58s
305:	learn: 0.2344851	test: 0.4022721	best: 0.3934973 (224)	total: 2.9s	remaining: 6.57s
306:	learn: 0.2337840	test: 0.4015348	best: 0.3934973 (224)	total: 2.91s	remaining: 6.56s
307:	learn: 0.2333662	test: 0.4019050	best: 0.3934973 (224)	total: 2.92s	remaining: 6.55s
308:	learn: 0.2332488	test: 0.4021346	best: 0.3934973 (224)	total: 2.92s	remaining: 6.54s
309:	learn: 0.2319784	test: 0.4017143	best: 0.3934973 (224)	total: 2.94s	remaining: 6.54s
310:	learn: 0.2314757	test: 0.4009766	best: 0.3934973 (224)	total: 2.94s	remaining: 6.53s
311:	learn:

405:	learn: 0.1951586	test: 0.4026080	best: 0.3934973 (224)	total: 3.86s	remaining: 5.65s
406:	learn: 0.1946763	test: 0.4027811	best: 0.3934973 (224)	total: 3.87s	remaining: 5.64s
407:	learn: 0.1942547	test: 0.4027530	best: 0.3934973 (224)	total: 3.88s	remaining: 5.63s
408:	learn: 0.1938664	test: 0.4027429	best: 0.3934973 (224)	total: 3.89s	remaining: 5.62s
409:	learn: 0.1937407	test: 0.4027188	best: 0.3934973 (224)	total: 3.9s	remaining: 5.61s
410:	learn: 0.1937116	test: 0.4027131	best: 0.3934973 (224)	total: 3.91s	remaining: 5.61s
411:	learn: 0.1932613	test: 0.4025284	best: 0.3934973 (224)	total: 3.92s	remaining: 5.6s
412:	learn: 0.1930785	test: 0.4025539	best: 0.3934973 (224)	total: 3.93s	remaining: 5.59s
413:	learn: 0.1927541	test: 0.4027931	best: 0.3934973 (224)	total: 3.94s	remaining: 5.58s
414:	learn: 0.1920875	test: 0.4024993	best: 0.3934973 (224)	total: 3.95s	remaining: 5.57s
415:	learn: 0.1919569	test: 0.4025684	best: 0.3934973 (224)	total: 3.96s	remaining: 5.55s
416:	learn: 

497:	learn: 0.1701176	test: 0.4039360	best: 0.3934973 (224)	total: 4.74s	remaining: 4.77s
498:	learn: 0.1698055	test: 0.4040335	best: 0.3934973 (224)	total: 4.75s	remaining: 4.77s
499:	learn: 0.1697771	test: 0.4040005	best: 0.3934973 (224)	total: 4.76s	remaining: 4.76s
500:	learn: 0.1697685	test: 0.4039794	best: 0.3934973 (224)	total: 4.76s	remaining: 4.75s
501:	learn: 0.1691811	test: 0.4047605	best: 0.3934973 (224)	total: 4.77s	remaining: 4.73s
502:	learn: 0.1688547	test: 0.4046422	best: 0.3934973 (224)	total: 4.78s	remaining: 4.72s
503:	learn: 0.1684313	test: 0.4045723	best: 0.3934973 (224)	total: 4.79s	remaining: 4.72s
504:	learn: 0.1682768	test: 0.4047450	best: 0.3934973 (224)	total: 4.8s	remaining: 4.71s
505:	learn: 0.1679218	test: 0.4045880	best: 0.3934973 (224)	total: 4.81s	remaining: 4.7s
506:	learn: 0.1676768	test: 0.4045255	best: 0.3934973 (224)	total: 4.82s	remaining: 4.69s
507:	learn: 0.1676650	test: 0.4045291	best: 0.3934973 (224)	total: 4.83s	remaining: 4.68s
508:	learn: 

590:	learn: 0.1458445	test: 0.4105030	best: 0.3934973 (224)	total: 5.68s	remaining: 3.93s
591:	learn: 0.1455429	test: 0.4105921	best: 0.3934973 (224)	total: 5.69s	remaining: 3.92s
592:	learn: 0.1452805	test: 0.4107758	best: 0.3934973 (224)	total: 5.7s	remaining: 3.91s
593:	learn: 0.1452517	test: 0.4107528	best: 0.3934973 (224)	total: 5.71s	remaining: 3.9s
594:	learn: 0.1451723	test: 0.4107632	best: 0.3934973 (224)	total: 5.72s	remaining: 3.89s
595:	learn: 0.1449999	test: 0.4108991	best: 0.3934973 (224)	total: 5.72s	remaining: 3.88s
596:	learn: 0.1449666	test: 0.4109306	best: 0.3934973 (224)	total: 5.73s	remaining: 3.87s
597:	learn: 0.1447806	test: 0.4113453	best: 0.3934973 (224)	total: 5.74s	remaining: 3.86s
598:	learn: 0.1447521	test: 0.4114683	best: 0.3934973 (224)	total: 5.75s	remaining: 3.85s
599:	learn: 0.1443244	test: 0.4126090	best: 0.3934973 (224)	total: 5.76s	remaining: 3.84s
600:	learn: 0.1441914	test: 0.4124774	best: 0.3934973 (224)	total: 5.78s	remaining: 3.83s
601:	learn: 

688:	learn: 0.1286441	test: 0.4151272	best: 0.3934973 (224)	total: 6.7s	remaining: 3.02s
689:	learn: 0.1286340	test: 0.4152068	best: 0.3934973 (224)	total: 6.71s	remaining: 3.01s
690:	learn: 0.1285865	test: 0.4150973	best: 0.3934973 (224)	total: 6.72s	remaining: 3s
691:	learn: 0.1284324	test: 0.4149987	best: 0.3934973 (224)	total: 6.72s	remaining: 2.99s
692:	learn: 0.1279411	test: 0.4152615	best: 0.3934973 (224)	total: 6.73s	remaining: 2.98s
693:	learn: 0.1278570	test: 0.4151877	best: 0.3934973 (224)	total: 6.74s	remaining: 2.97s
694:	learn: 0.1278400	test: 0.4151874	best: 0.3934973 (224)	total: 6.75s	remaining: 2.96s
695:	learn: 0.1275985	test: 0.4155036	best: 0.3934973 (224)	total: 6.75s	remaining: 2.95s
696:	learn: 0.1275374	test: 0.4154218	best: 0.3934973 (224)	total: 6.77s	remaining: 2.94s
697:	learn: 0.1273582	test: 0.4154246	best: 0.3934973 (224)	total: 6.78s	remaining: 2.93s
698:	learn: 0.1273474	test: 0.4153870	best: 0.3934973 (224)	total: 6.79s	remaining: 2.92s
699:	learn: 0.

780:	learn: 0.1168797	test: 0.4172845	best: 0.3934973 (224)	total: 7.57s	remaining: 2.12s
781:	learn: 0.1167469	test: 0.4170513	best: 0.3934973 (224)	total: 7.57s	remaining: 2.11s
782:	learn: 0.1166496	test: 0.4171006	best: 0.3934973 (224)	total: 7.58s	remaining: 2.1s
783:	learn: 0.1165821	test: 0.4170774	best: 0.3934973 (224)	total: 7.59s	remaining: 2.09s
784:	learn: 0.1164386	test: 0.4169819	best: 0.3934973 (224)	total: 7.6s	remaining: 2.08s
785:	learn: 0.1163521	test: 0.4169142	best: 0.3934973 (224)	total: 7.61s	remaining: 2.07s
786:	learn: 0.1163012	test: 0.4170665	best: 0.3934973 (224)	total: 7.61s	remaining: 2.06s
787:	learn: 0.1162850	test: 0.4171080	best: 0.3934973 (224)	total: 7.62s	remaining: 2.05s
788:	learn: 0.1161318	test: 0.4172540	best: 0.3934973 (224)	total: 7.63s	remaining: 2.04s
789:	learn: 0.1161234	test: 0.4172486	best: 0.3934973 (224)	total: 7.64s	remaining: 2.03s
790:	learn: 0.1159326	test: 0.4176226	best: 0.3934973 (224)	total: 7.65s	remaining: 2.02s
791:	learn: 

880:	learn: 0.1037939	test: 0.4267653	best: 0.3934973 (224)	total: 8.51s	remaining: 1.15s
881:	learn: 0.1037216	test: 0.4267257	best: 0.3934973 (224)	total: 8.52s	remaining: 1.14s
882:	learn: 0.1034465	test: 0.4275159	best: 0.3934973 (224)	total: 8.53s	remaining: 1.13s
883:	learn: 0.1033259	test: 0.4278346	best: 0.3934973 (224)	total: 8.54s	remaining: 1.12s
884:	learn: 0.1031101	test: 0.4279060	best: 0.3934973 (224)	total: 8.55s	remaining: 1.11s
885:	learn: 0.1029100	test: 0.4284699	best: 0.3934973 (224)	total: 8.56s	remaining: 1.1s
886:	learn: 0.1028759	test: 0.4284343	best: 0.3934973 (224)	total: 8.56s	remaining: 1.09s
887:	learn: 0.1027722	test: 0.4286250	best: 0.3934973 (224)	total: 8.57s	remaining: 1.08s
888:	learn: 0.1027681	test: 0.4285918	best: 0.3934973 (224)	total: 8.58s	remaining: 1.07s
889:	learn: 0.1027477	test: 0.4286621	best: 0.3934973 (224)	total: 8.59s	remaining: 1.06s
890:	learn: 0.1026126	test: 0.4291721	best: 0.3934973 (224)	total: 8.6s	remaining: 1.05s
891:	learn: 

974:	learn: 0.0938631	test: 0.4359327	best: 0.3934973 (224)	total: 9.39s	remaining: 241ms
975:	learn: 0.0938093	test: 0.4358564	best: 0.3934973 (224)	total: 9.4s	remaining: 231ms
976:	learn: 0.0937957	test: 0.4358291	best: 0.3934973 (224)	total: 9.41s	remaining: 222ms
977:	learn: 0.0937147	test: 0.4359749	best: 0.3934973 (224)	total: 9.42s	remaining: 212ms
978:	learn: 0.0936237	test: 0.4361099	best: 0.3934973 (224)	total: 9.43s	remaining: 202ms
979:	learn: 0.0935963	test: 0.4361211	best: 0.3934973 (224)	total: 9.44s	remaining: 193ms
980:	learn: 0.0935669	test: 0.4361244	best: 0.3934973 (224)	total: 9.45s	remaining: 183ms
981:	learn: 0.0933992	test: 0.4361082	best: 0.3934973 (224)	total: 9.46s	remaining: 173ms
982:	learn: 0.0931804	test: 0.4362405	best: 0.3934973 (224)	total: 9.47s	remaining: 164ms
983:	learn: 0.0931436	test: 0.4362865	best: 0.3934973 (224)	total: 9.47s	remaining: 154ms
984:	learn: 0.0930827	test: 0.4363361	best: 0.3934973 (224)	total: 9.48s	remaining: 144ms
985:	learn:

70:	learn: 0.3590455	test: 0.4080519	best: 0.4080519 (70)	total: 611ms	remaining: 8s
71:	learn: 0.3579120	test: 0.4075957	best: 0.4075957 (71)	total: 620ms	remaining: 7.99s
72:	learn: 0.3574215	test: 0.4076435	best: 0.4075957 (71)	total: 628ms	remaining: 7.98s
73:	learn: 0.3568130	test: 0.4075172	best: 0.4075172 (73)	total: 634ms	remaining: 7.94s
74:	learn: 0.3568124	test: 0.4075158	best: 0.4075158 (74)	total: 635ms	remaining: 7.84s
75:	learn: 0.3556253	test: 0.4070813	best: 0.4070813 (75)	total: 643ms	remaining: 7.82s
76:	learn: 0.3539924	test: 0.4062711	best: 0.4062711 (76)	total: 650ms	remaining: 7.79s
77:	learn: 0.3528299	test: 0.4058454	best: 0.4058454 (77)	total: 658ms	remaining: 7.77s
78:	learn: 0.3527896	test: 0.4059041	best: 0.4058454 (77)	total: 660ms	remaining: 7.69s
79:	learn: 0.3522908	test: 0.4049107	best: 0.4049107 (79)	total: 667ms	remaining: 7.67s
80:	learn: 0.3506812	test: 0.4041611	best: 0.4041611 (80)	total: 677ms	remaining: 7.68s
81:	learn: 0.3496995	test: 0.404119

176:	learn: 0.2993261	test: 0.3941297	best: 0.3939341 (172)	total: 1.65s	remaining: 7.65s
177:	learn: 0.2990009	test: 0.3943818	best: 0.3939341 (172)	total: 1.65s	remaining: 7.64s
178:	learn: 0.2984980	test: 0.3944077	best: 0.3939341 (172)	total: 1.66s	remaining: 7.62s
179:	learn: 0.2974294	test: 0.3948760	best: 0.3939341 (172)	total: 1.67s	remaining: 7.62s
180:	learn: 0.2967891	test: 0.3943291	best: 0.3939341 (172)	total: 1.68s	remaining: 7.61s
181:	learn: 0.2962807	test: 0.3945732	best: 0.3939341 (172)	total: 1.7s	remaining: 7.65s
182:	learn: 0.2959431	test: 0.3945935	best: 0.3939341 (172)	total: 1.71s	remaining: 7.65s
183:	learn: 0.2954397	test: 0.3948167	best: 0.3939341 (172)	total: 1.72s	remaining: 7.64s
184:	learn: 0.2941583	test: 0.3948699	best: 0.3939341 (172)	total: 1.73s	remaining: 7.63s
185:	learn: 0.2936413	test: 0.3949253	best: 0.3939341 (172)	total: 1.75s	remaining: 7.65s
186:	learn: 0.2934959	test: 0.3948023	best: 0.3939341 (172)	total: 1.76s	remaining: 7.63s
187:	learn:

<catboost.core.CatBoostClassifier at 0x7f6854837550>

In [73]:


print('Simple model tree count: {}'.format(model.tree_count_))
print('Simple model validation accuracy: {:.4}'.format(
    accuracy_score(y_validation, model.predict(X_validation))
))
print('')

print('Early-stopped model tree count: {}'.format(earlystop_model.tree_count_))
print('Early-stopped model validation accuracy: {:.4}'.format(
    accuracy_score(y_validation, earlystop_model.predict(X_validation))
))



Simple model tree count: 225
Simple model validation accuracy: 0.8206

Early-stopped model tree count: 173
Early-stopped model validation accuracy: 0.8251
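
Both tree counts above are one more than the best iteration reported in the corresponding log (224 and 172), which is what one would expect when a model is cut back to its best validation iteration. The following is a minimal sketch of an iteration-based overfitting detector. The loss values and the `patience` parameter are hypothetical and are not taken from the runs above, and CatBoost's own detector may differ in its details.

```python
def best_iteration_with_patience(losses, patience):
    """Return (best_index, stop_index) for a simple patience-based detector.

    Training stops once `patience` iterations have passed without a new
    minimum of the validation loss.
    """
    best_index = 0
    for i, loss in enumerate(losses):
        if loss < losses[best_index]:
            best_index = i
        elif i - best_index >= patience:
            return best_index, i
    return best_index, len(losses) - 1


# Hypothetical validation losses (illustration only).
val_losses = [0.60, 0.50, 0.45, 0.43, 0.44, 0.46, 0.47, 0.48]

best, stopped = best_iteration_with_patience(val_losses, patience=3)
tree_count = best + 1  # iterations are 0-based, so keep best + 1 trees
print(best, stopped, tree_count)
```

The full run spends every remaining iteration after the best one, while the detector gives up a fixed number of iterations after the last improvement.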


3. **Using a baseline**

The raw predictions of a previously trained model can be used as a baseline (starting point) for a subsequent training run.

In [74]:
current_params = params.copy()
current_params.update({
    'iterations': 10
})

model = CatBoostClassifier(**current_params).fit(X_train, y_train, categorical_features_indices)

# Get baseline (only with prediction_type='RawFormulaVal')
baseline = model.predict(X_train, prediction_type='RawFormulaVal')
print('')
print('')
# Fit new model
model.fit(X_train, y_train, categorical_features_indices, baseline=baseline);


0:	learn: 0.6770412	total: 2.25ms	remaining: 20.2ms
1:	learn: 0.6612952	total: 4.11ms	remaining: 16.4ms
2:	learn: 0.6448236	total: 5.11ms	remaining: 11.9ms
3:	learn: 0.6317603	total: 6.71ms	remaining: 10.1ms
4:	learn: 0.6178815	total: 10.6ms	remaining: 10.6ms
5:	learn: 0.6055198	total: 15.7ms	remaining: 10.5ms
6:	learn: 0.5939761	total: 21.8ms	remaining: 9.33ms
7:	learn: 0.5832543	total: 25ms	remaining: 6.24ms
8:	learn: 0.5736928	total: 28.5ms	remaining: 3.17ms
9:	learn: 0.5647460	total: 33.1ms	remaining: 0us


0:	learn: 0.5542795	total: 2.04ms	remaining: 18.3ms
1:	learn: 0.5468125	total: 5.09ms	remaining: 20.4ms
2:	learn: 0.5390119	total: 6.78ms	remaining: 15.8ms
3:	learn: 0.5325658	total: 11.9ms	remaining: 17.9ms
4:	learn: 0.5274449	total: 13.6ms	remaining: 13.6ms
5:	learn: 0.5216457	total: 15.8ms	remaining: 10.5ms
6:	learn: 0.5166928	total: 18.9ms	remaining: 8.09ms
7:	learn: 0.5126876	total: 26.5ms	remaining: 6.64ms
8:	learn: 0.5086844	total: 29.7ms	remaining: 3.3ms
9:	learn: 0.5039
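
A minimal sketch of what the baseline means numerically, assuming a `Logloss` model: `RawFormulaVal` is the untransformed model output (log-odds), and the second model only has to learn an additive correction on top of it. All numbers below are hypothetical.

```python
import math


def sigmoid(raw):
    """Map a raw log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-raw))


# Hypothetical raw outputs of the first model for three objects.
baseline_raw = [-1.2, 0.3, 2.0]
# Hypothetical corrections learned by the second model on top of the baseline.
correction_raw = [0.4, -0.1, 0.5]

combined_raw = [b + c for b, c in zip(baseline_raw, correction_raw)]
probabilities = [sigmoid(r) for r in combined_raw]
print(probabilities)
```

This is consistent with the second log above starting at a lower `learn` value than the first one did: training starts from the baseline predictions instead of from zero.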

4. **Snapshots**

CatBoost supports snapshots. They can be used to resume training after the process is interrupted, or to start a new training run from earlier results.

In [75]:
params_with_snapshot = params.copy()
params_with_snapshot.update({
    'iterations': 5,
    'learning_rate': 0.5,
    'logging_level': 'Verbose'
})
model = CatBoostClassifier(**params_with_snapshot).fit(train_pool, eval_set=validate_pool, save_snapshot=True)
params_with_snapshot.update({
    'iterations': 10,
    'learning_rate': 0.1,
})
model = CatBoostClassifier(**params_with_snapshot).fit(train_pool, eval_set=validate_pool, save_snapshot=True)


bestTest = 0.428116432
bestIteration = 9


bestTest = 0.428116432
bestIteration = 9
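
The idea behind a snapshot can be illustrated without CatBoost. This sketch is a generic checkpoint-and-resume loop, not CatBoost's snapshot format: the file name, the `train` helper and the `interrupt_at` parameter are all hypothetical.

```python
import json
import os
import tempfile


def train(total_iterations, snapshot_path, interrupt_at=None):
    """Toy training loop that saves its state after every iteration."""
    state = {"iteration": 0, "value": 0.0}
    if os.path.exists(snapshot_path):
        # Resume from the last saved state instead of starting over.
        with open(snapshot_path) as f:
            state = json.load(f)
    while state["iteration"] < total_iterations:
        if interrupt_at is not None and state["iteration"] == interrupt_at:
            return state  # simulate the process being interrupted
        state["value"] += 1.0  # stand-in for fitting one more tree
        state["iteration"] += 1
        with open(snapshot_path, "w") as f:
            json.dump(state, f)
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "snapshot.json")
    partial = train(10, path, interrupt_at=4)  # stops after 4 iterations
    resumed = train(10, path)                  # continues from iteration 4

print(partial, resumed)
```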



5. **Staged prediction**

CatBoost provides the ```staged_predict``` method, which yields predictions iteratively for a given range of trees.

In [76]:
model = CatBoostClassifier(iterations=10, random_seed=42, logging_level='Silent').fit(train_pool)

ntree_start, ntree_end, eval_period = 3, 9, 2

predictions_iterator = model.staged_predict(validate_pool, 'Probability', ntree_start, ntree_end, eval_period)

for preds, tree_count in zip(predictions_iterator, range(ntree_start, ntree_end, eval_period)):
    print('First class probabilities using the first {} trees: {}'.format(tree_count, preds[:5, 1]))



First class probabilities using the first 3 trees: [0.53597869 0.41039128 0.42057479 0.64281031 0.46576685]
First class probabilities using the first 5 trees: [0.63722688 0.42492029 0.46209302 0.70926021 0.44280772]
First class probabilities using the first 7 trees: [0.66964764 0.42409144 0.46124982 0.76101033 0.47205986]
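
Three lines are printed because `range(3, 9, 2)` yields the tree counts 3, 5 and 7. The sketch below mimics that loop on a toy additive model. It assumes, as the printed labels suggest, that each stage applies all trees up to the stated count. The per-tree outputs are hypothetical.

```python
import math
from itertools import accumulate

# Hypothetical raw contribution of each of 10 trees for a single object.
tree_outputs = [0.2, -0.1, 0.3, 0.1, 0.05, -0.02, 0.1, 0.04, 0.0, 0.01]


def staged_probabilities(outputs, ntree_start, ntree_end, eval_period):
    """Yield (tree_count, probability) for the requested tree counts."""
    cumulative = list(accumulate(outputs))  # raw score after 1, 2, ... trees
    for n in range(ntree_start, ntree_end, eval_period):
        raw = cumulative[n - 1]
        yield n, 1.0 / (1.0 + math.exp(-raw))


stages = list(staged_probabilities(tree_outputs, 3, 9, 2))
for tree_count, probability in stages:
    print(tree_count, round(probability, 4))
```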


6. **Most important features (columns)**

It is important to understand which features contributed most to the final result. For this, CatBoost provides the ```get_feature_importance``` method.

In [77]:
model = CatBoostClassifier(iterations=50, random_seed=42, logging_level='Silent').fit(train_pool)

feature_importances = model.get_feature_importance(train_pool)

feature_names = X_train.columns

for score, name in sorted(zip(feature_importances, feature_names), reverse=True):
    print('{}: {}'.format(name, score))


Sex: 59.004092014268586
Pclass: 16.340887169747035
Ticket: 6.028107169932204
Cabin: 3.8347242202560192
Fare: 3.712969667934384
Age: 3.484451204182482
Parch: 3.378089740355865
Embarked: 2.3139994072899555
SibSp: 1.9026794060334504
PassengerId: 0.0
Name: 0.0
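
The scores printed above add up to roughly 100, so in this output each one can be read as a percentage share of the total importance. The short check below uses the values exactly as printed. Whether other importance types are normalized in the same way is not shown here.

```python
# Importances copied from the output above.
importances = {
    "Sex": 59.004092014268586,
    "Pclass": 16.340887169747035,
    "Ticket": 6.028107169932204,
    "Cabin": 3.8347242202560192,
    "Fare": 3.712969667934384,
    "Age": 3.484451204182482,
    "Parch": 3.378089740355865,
    "Embarked": 2.3139994072899555,
    "SibSp": 1.9026794060334504,
    "PassengerId": 0.0,
    "Name": 0.0,
}

total = sum(importances.values())
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
top_two_share = ranked[0][1] + ranked[1][1]

print("total: {:.2f}".format(total))
print("share of the two top features: {:.2f}".format(top_two_share))
```

`Sex` and `Pclass` together account for about three quarters of the total, while `PassengerId` and `Name` contribute nothing in this model.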


7. **Eval Metrics**

CatBoost provides the ```eval_metrics``` method, which computes the requested metrics on a given dataset and can also plot them.

In [78]:
model = CatBoostClassifier(iterations=50, random_seed=42, logging_level='Silent').fit(train_pool)
eval_metrics = model.eval_metrics(validate_pool, ['AUC','Accuracy', 'F1'], plot=True)

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))

In [79]:
print(eval_metrics['AUC'][:6])
print(eval_metrics['F1'][:6])


[0.8627368774106994, 0.8623176253563642, 0.8602213650846889, 0.8514170719436525, 0.8495723629045783, 0.8569092738554419]
[0.7124999999999999, 0.7305389221556887, 0.7349397590361446, 0.7349397590361446, 0.7407407407407406, 0.725]
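
For reference, two of the metrics requested above can be computed by hand from predicted labels. This is a minimal sketch with hypothetical labels. It uses the standard definitions of accuracy and F1 and is not CatBoost's implementation.

```python
def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1


# Hypothetical labels (illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(accuracy, f1)
```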


8. **Comparing learning processes**

The learning processes of different models can be compared in a single plot.

In [80]:
model1 = CatBoostClassifier(iterations=10, depth=1, train_dir='model_depth_1/', logging_level='Silent')
model1.fit(train_pool, eval_set=validate_pool)

model2 = CatBoostClassifier(iterations=10, depth=5, train_dir='model_depth_5/', logging_level='Silent')
model2.fit(train_pool, eval_set=validate_pool);

In [81]:
widget = MetricVisualizer(['model_depth_1', 'model_depth_5'])
widget.start()

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))

9. **Saving the model**

CatBoost can save a trained model to disk and load it back later. This is very useful when deploying an application and for avoiding retraining (especially when training takes a long time).

In [82]:
model = CatBoostClassifier(iterations=10, random_seed=42, logging_level='Silent').fit(train_pool)
model.save_model('catboost_model.dump')
model = CatBoostClassifier()
model.load_model('catboost_model.dump');


### Hyperparameter tuning

Although the optimal number of iterations can be selected through cross-validation and learning-curve plots, it is also important to tune some of the model parameters, for example ```l2_leaf_reg``` and ```learning_rate```.

We will select these parameters using the hyperopt package.

```
!pip install hyperopt
```


In [83]:
import hyperopt

def hyperopt_objective(params):
    
    model = CatBoostClassifier(
        
        l2_leaf_reg=int(params['l2_leaf_reg']),
        learning_rate=params['learning_rate'],
        iterations=500,
        eval_metric='Accuracy',
        random_seed=42,
        verbose=False,
        loss_function='Logloss',
        
    )
    
    cv_data = cv(
        Pool(X_train, y_train, cat_features=categorical_features_indices),
        model.get_params()
    )
    
    best_accuracy = np.max(cv_data['test-Accuracy-mean'])
    
    return 1 - best_accuracy # as hyperopt minimises

In [84]:
params_space = {
    'l2_leaf_reg': hyperopt.hp.qloguniform('l2_leaf_reg', 0, 2, 1),
    'learning_rate': hyperopt.hp.uniform('learning_rate', 1e-3, 5e-1),
}

trials = hyperopt.Trials()

best = hyperopt.fmin(
    hyperopt_objective,
    space=params_space,
    algo=hyperopt.tpe.suggest,
    max_evals=10,
    trials=trials,
    rstate=RandomState(123)
)


100%|██████████| 10/10 [3:43:06<00:00, 1338.65s/trial, best loss: 0.17811561561561573]


In [132]:
# Show the best parameters found
print(best)

{'l2_leaf_reg': 2.0, 'learning_rate': 0.1661030060980962}
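
The value `2.0` for `l2_leaf_reg` follows from how hyperopt documents `hp.qloguniform(label, low, high, q)`: it returns `round(exp(uniform(low, high)) / q) * q`. The stdlib-only sketch below reproduces that formula for `low=0`, `high=2`, `q=1`. The `sample_qloguniform` helper and the seed are hypothetical.

```python
import math
import random


def sample_qloguniform(rng, low, high, q):
    """Draw one value as round(exp(uniform(low, high)) / q) * q."""
    return round(math.exp(rng.uniform(low, high)) / q) * q


rng = random.Random(123)
samples = [sample_qloguniform(rng, 0, 2, 1) for _ in range(1000)]

# exp(0) = 1 and exp(2) is about 7.39, so only the integers 1..7 can occur.
print(sorted(set(samples)))
```

hyperopt reports the sampled value as a float, which is why the objective function and the final model wrap it in `int(...)`.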


In [106]:
# Using the best hyperparameters

model = CatBoostClassifier(
    l2_leaf_reg=int(best['l2_leaf_reg']),
    learning_rate=best['learning_rate'],
    iterations=500,
    eval_metric='Accuracy',
    random_seed=42,
    verbose=50,
    loss_function='Logloss',
)

# Run cross-validation with the model's parameters

cv_data = cv(Pool(X, y, cat_features=categorical_features_indices), model.get_params())



0:	learn: 0.7985410	test: 0.7946128	best: 0.7946128 (0)	total: 1.29s	remaining: 10m 46s
50:	learn: 0.9085297	test: 0.8114478	best: 0.8181818 (48)	total: 54.1s	remaining: 7m 56s
100:	learn: 0.9511785	test: 0.8170595	best: 0.8181818 (48)	total: 1m 46s	remaining: 7m
150:	learn: 0.9713805	test: 0.8092031	best: 0.8181818 (48)	total: 2m 37s	remaining: 6m 4s
200:	learn: 0.9837262	test: 0.8159371	best: 0.8181818 (48)	total: 3m 30s	remaining: 5m 13s
250:	learn: 0.9898990	test: 0.8103255	best: 0.8181818 (48)	total: 4m 39s	remaining: 4m 37s
300:	learn: 0.9932660	test: 0.8125701	best: 0.8181818 (48)	total: 5m 48s	remaining: 3m 50s
350:	learn: 0.9943883	test: 0.8092031	best: 0.8181818 (48)	total: 7m 6s	remaining: 3m 1s
400:	learn: 0.9966330	test: 0.8058361	best: 0.8181818 (48)	total: 8m 50s	remaining: 2m 11s
450:	learn: 0.9966330	test: 0.8058361	best: 0.8181818 (48)	total: 9m 58s	remaining: 1m 5s
499:	learn: 0.9977553	test: 0.8058361	best: 0.8181818 (48)	total: 10m 57s	remaining: 0us


In [134]:
print('Accuracy obtained: {}'.format(np.max(cv_data['test-Accuracy-mean'])))

Accuracy obtained: 0.8181818181818182


### Feature analysis
CatBoost has tools for statistically analyzing the features of the dataset.

In [98]:
# Load a previously saved model

model = CatBoostClassifier()
model.load_model('catboost_model.dump');

In [99]:
feature = 'Fare'
res = model.calc_feature_statistics(X, y, feature, plot=True)

In [100]:
feature = 'Sex'
res = model.calc_feature_statistics(X, y, feature, plot=True)

In [105]:
feature = 'Age'
res = model.calc_feature_statistics(X, y, feature, plot=True)

## Grid Search

CatBoost also implements grid search, randomized search, and multiple grid search.

In [107]:
# Prepare the Titanic dataset

titanic_train, titanic_test = titanic()
titanic_train_target = titanic_train.Survived

titanic_train.drop(['PassengerId', 'Survived', 'Name', 'Parch', 'Ticket', 'Cabin', 'Embarked'], axis=1, inplace=True)
titanic_test.drop(['PassengerId', 'Name', 'Parch', 'Ticket', 'Cabin', 'Embarked'], axis=1, inplace=True)

titanic_train.head(3)

| | Pclass | Sex | Age | SibSp | Fare |
|---|---|---|---|---|---|
| 0 | 3 | male | 22.0 | 1 | 7.25 |
| 1 | 1 | female | 38.0 | 1 | 71.2833 |
| 2 | 3 | female | 26.0 | 0 | 7.925 |


### Simple grid search

In [109]:
titanic_model = CatBoostClassifier(iterations=1000)

In [111]:
# Create the train and test pools
train_pool = Pool(titanic_train, titanic_train_target, cat_features=['Pclass', 'Sex', 'SibSp'])
test_pool = Pool(titanic_test, cat_features=['Pclass', 'Sex', 'SibSp'])

In [112]:
grid = {
    'learning_rate': [0.03, 0.1],
    'depth':[4, 6, 10],
    'l2_leaf_reg': [1, 3, 5, 7, 9]
}

grid_search_results = titanic_model.grid_search(grid, train_pool, shuffle=False, verbose=3, plot=True)

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))


bestTest = 0.3533606492
bestIteration = 704

0:	loss: 0.3533606	best: 0.3533606 (0)	total: 1.74s	remaining: 50.4s

bestTest = 0.3645722309
bestIteration = 165


bestTest = 0.3559884562
bestIteration = 908


bestTest = 0.3519706972
bestIteration = 255

3:	loss: 0.3519707	best: 0.3519707 (3)	total: 6.22s	remaining: 40.4s

bestTest = 0.3577066876
bestIteration = 971


bestTest = 0.3565241144
bestIteration = 303


bestTest = 0.3591680582
bestIteration = 774

6:	loss: 0.3591681	best: 0.3519707 (3)	total: 9.96s	remaining: 32.7s

bestTest = 0.3573501719
bestIteration = 381


bestTest = 0.3586468709
bestIteration = 969


bestTest = 0.3608339509
bestIteration = 275

9:	loss: 0.3608340	best: 0.3519707 (3)	total: 13.3s	remaining: 26.7s

bestTest = 0.3515593532
bestIteration = 449


bestTest = 0.3554182624
bestIteration = 91


bestTest = 0.3504053528
bestIteration = 709

12:	loss: 0.3504054	best: 0.3504054 (12)	total: 18.6s	remaining: 24.3s

bestTest = 0.3520440963
bestIteration = 102


bestTest 

Parameters obtained for the best value of the cost function:

In [118]:
grid_search_results['params']

{'depth': 6, 'l2_leaf_reg': 7, 'learning_rate': 0.03}
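
The grid above is searched exhaustively, so the number of candidates is the product of the list lengths: 2 × 3 × 5 = 30. Here is a small sketch of that enumeration with `itertools.product`. It only illustrates the search space and does not reproduce how CatBoost trains or scores each candidate.

```python
from itertools import product

grid = {
    'learning_rate': [0.03, 0.1],
    'depth': [4, 6, 10],
    'l2_leaf_reg': [1, 3, 5, 7, 9],
}

names = list(grid)
# One dict per candidate: the Cartesian product of all value lists.
combinations = [dict(zip(names, values)) for values in product(*grid.values())]

print(len(combinations))
print(combinations[0])
```

The best parameters reported above are one of these 30 candidates.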

Statistics available from cross-validation:

In [119]:
grid_search_results['cv_results'].keys()

dict_keys(['iterations', 'test-Logloss-mean', 'test-Logloss-std', 'train-Logloss-mean', 'train-Logloss-std'])

Quality estimated by cross-validation:

In [120]:
grid_search_results['cv_results']['test-Logloss-mean'][-1]

0.44332543740080776

The model is ready to use after the hyperparameter search:

In [121]:
predicted = titanic_model.predict_proba(test_pool)
predicted[:3]

array([[0.93320467, 0.06679533],
       [0.95417845, 0.04582155],
       [0.61872333, 0.38127667]])

### Multiple grid search

In [126]:
titanic_model = CatBoostClassifier(iterations=1000)

In [127]:
grid_1 = {
    'learning_rate': [0.03, 0.1, 0.3, 0.9],
    'bootstrap_type':['Bayesian'],
    'bagging_temperature': [0, 1, 10]
}
grid_2 = {
    'learning_rate': [0.03, 0.1, 0.3, 0.9],
    'bootstrap_type':['Bernoulli'],
    'subsample': [0.66, 0.7, 0.8]
}

grid_search_results = titanic_model.grid_search([grid_1, grid_2], train_pool, shuffle=False, verbose=4, plot=True)

Grid #0


MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))


bestTest = 0.3553808424
bestIteration = 392

0:	loss: 0.3553808	best: 0.3553808 (0)	total: 1.36s	remaining: 15s

bestTest = 0.3507052896
bestIteration = 261


bestTest = 0.3598624264
bestIteration = 40


bestTest = 0.3531334503
bestIteration = 3


bestTest = 0.360983333
bestIteration = 351

4:	loss: 0.3609833	best: 0.3507053 (1)	total: 7.55s	remaining: 10.6s

bestTest = 0.3631232071
bestIteration = 76


bestTest = 0.3638504011
bestIteration = 26


bestTest = 0.3695870848
bestIteration = 12


bestTest = 0.3673042864
bestIteration = 327

8:	loss: 0.3673043	best: 0.3507053 (1)	total: 14.4s	remaining: 4.8s

bestTest = 0.3600570241
bestIteration = 106


bestTest = 0.4068758899
bestIteration = 124


bestTest = 0.4257563136
bestIteration = 35

11:	loss: 0.4257563	best: 0.3507053 (1)	total: 19.7s	remaining: 0us
Grid #1

bestTest = 0.3525782562
bestIteration = 420

0:	loss: 0.3525783	best: 0.3525783 (0)	total: 1.81s	remaining: 19.9s

bestTest = 0.347737115
bestIteration = 109


bestTest = 0.36

In [128]:
grid_search_results['params']

{'subsample': 0.7, 'learning_rate': 0.9, 'bootstrap_type': 'Bernoulli'}
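
Passing a list of grids keeps each bootstrap type paired only with its own tuning parameter: `bagging_temperature` with `Bayesian` and `subsample` with `Bernoulli`, as in the cell above. The sketch below counts the candidates. The `expand` helper and the merged grid are hypothetical and serve only as a comparison.

```python
from itertools import product


def expand(grid):
    """Return every combination of a grid as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


grid_1 = {
    'learning_rate': [0.03, 0.1, 0.3, 0.9],
    'bootstrap_type': ['Bayesian'],
    'bagging_temperature': [0, 1, 10],
}
grid_2 = {
    'learning_rate': [0.03, 0.1, 0.3, 0.9],
    'bootstrap_type': ['Bernoulli'],
    'subsample': [0.66, 0.7, 0.8],
}

separate = expand(grid_1) + expand(grid_2)

# Hypothetical single grid mixing both bootstrap types and both parameters.
merged = expand({
    'learning_rate': [0.03, 0.1, 0.3, 0.9],
    'bootstrap_type': ['Bayesian', 'Bernoulli'],
    'bagging_temperature': [0, 1, 10],
    'subsample': [0.66, 0.7, 0.8],
})

print(len(expand(grid_1)), len(expand(grid_2)), len(separate), len(merged))
```

Each grid has 12 candidates, which matches the `Grid #0` log above ending at index 11. A single merged grid would have 72 candidates, every one of which sets a parameter intended for the other bootstrap type.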

### Randomized search

In [130]:
from scipy import stats

titanic_model = CatBoostClassifier(iterations=1000)

class StrangeDistribution:
    def __init__(self, values):
        self.values = values

    def rvs(self):
        return self.values[0]

param_distribution = {
    'one_hot_max_size': stats.bernoulli(p=0.2, loc=2),
    'learning_rate': StrangeDistribution([0.03, 0.1]),
    'l2_leaf_reg': [1, 3, 5, 7, 9],
    'depth': stats.binom(n=10, p=0.2)
}

randomized_search_results = titanic_model.randomized_search(
    param_distribution,
    train_pool,
    n_iter=12,
    shuffle=False,
    plot=True
)

MetricVisualizer(layout=Layout(align_self='stretch', height='500px'))


bestTest = 0.37851656
bestIteration = 998

0:	loss: 0.3785166	best: 0.3785166 (0)	total: 859ms	remaining: 9.45s

bestTest = 0.3772842649
bestIteration = 876

1:	loss: 0.3772843	best: 0.3772843 (1)	total: 1.48s	remaining: 7.41s

bestTest = 0.6545574596
bestIteration = 132

2:	loss: 0.6545575	best: 0.3772843 (1)	total: 1.76s	remaining: 5.29s

bestTest = 0.375956388
bestIteration = 999

3:	loss: 0.3759564	best: 0.3759564 (3)	total: 2.38s	remaining: 4.75s

bestTest = 0.3655980784
bestIteration = 972

4:	loss: 0.3655981	best: 0.3655981 (4)	total: 3.03s	remaining: 4.24s

bestTest = 0.425041713
bestIteration = 995

5:	loss: 0.4250417	best: 0.3655981 (4)	total: 3.56s	remaining: 3.56s

bestTest = 0.375956388
bestIteration = 999

6:	loss: 0.3759564	best: 0.3655981 (4)	total: 4.35s	remaining: 3.11s

bestTest = 0.425041713
bestIteration = 995

7:	loss: 0.4250417	best: 0.3655981 (4)	total: 4.95s	remaining: 2.48s

bestTest = 0.3794279383
bestIteration = 997

8:	loss: 0.3794279	best: 0.3655981 (4)	t

In [131]:
randomized_search_results['params']

{'l2_leaf_reg': 7,
 'depth': 4.0,
 'one_hot_max_size': 3.0,
 'learning_rate': 0.03}
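
In the search space above, each entry is either a plain list or an object exposing an `rvs()` method. The stdlib-only sketch below shows how such a space can be sampled. It is not CatBoost's internal sampler. `ShiftedBernoulli` is a hypothetical stand-in for `stats.bernoulli(p=0.2, loc=2)`, and `draw` is a hypothetical helper.

```python
import random


class ShiftedBernoulli:
    """Stand-in for a Bernoulli variable shifted by `loc`: loc or loc + 1."""

    def __init__(self, p, loc, rng):
        self.p = p
        self.loc = loc
        self.rng = rng

    def rvs(self):
        return self.loc + (1 if self.rng.random() < self.p else 0)


def draw(spec, rng):
    """Sample from a distribution-like object or pick from a list."""
    if hasattr(spec, 'rvs'):
        return spec.rvs()
    return rng.choice(spec)


rng = random.Random(0)
space = {
    'one_hot_max_size': ShiftedBernoulli(p=0.2, loc=2, rng=rng),
    'l2_leaf_reg': [1, 3, 5, 7, 9],
}

candidates = [
    {name: draw(spec, rng) for name, spec in space.items()}
    for _ in range(12)
]
print(candidates[:3])
```

`StrangeDistribution.rvs()` in the cell above always returns `values[0]`, so `learning_rate` can only ever be 0.03, which matches the reported result.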

#### These examples and many more can be found in the tutorials section of the library documentation or in the repository for this assignment.

In summary, this library ships with a good set of default parameters that yield very good, highly accurate results. The auxiliary tools for feature analysis and learning-curve plots help a great deal in understanding the problem and the resulting model.