<h1><center>Laboratorio 9: Optimización de modelos 💯</center></h1>

<center><strong>MDS7202: Laboratorio de Programación Científica para Ciencia de Datos</strong></center>

### Cuerpo Docente:

- Profesores: Ignacio Meza, Sebastián Tinoco
- Auxiliares: Catherine Benavides y Consuelo Rojas
- Ayudante: Nicolás Ojeda, Eduardo Moya

### Equipo: **SUPER IMPORTANTE - notebooks sin nombre no serán revisados**

- Nombre de alumno 1: Sergio Rehbein
- Nombre de alumno 2: Matías Cornejo

\### **Link de repositorio de GitHub Matias:** https://github.com/s-kill/MDS7202

\### **Link de repositorio de GitHub Sergio:** https://github.com/sergiorehbein/MDS7201---Proyecto-de-Ciencia-de-Datos


### Temas a tratar

- Predicción de demanda usando `xgboost`
- Búsqueda del modelo óptimo de clasificación usando `optuna`
- Uso de pipelines.

### Reglas:

- **Grupos de 2 personas**
- Cualquier duda fuera del horario de clases al foro. Mensajes al equipo docente serán respondidos por este medio.
- Prohibidas las copias.
- Pueden usar cualquer matrial del curso que estimen conveniente.
- Código que no se pueda ejecutar, no será revisado.

### Objetivos principales del laboratorio

- Optimizar modelos usando `optuna`
- Recurrir a técnicas de *prunning*
- Forzar el aprendizaje de relaciones entre variables mediante *constraints*
- Fijar un pipeline con un modelo base que luego se irá optimizando.

El laboratorio deberá ser desarrollado sin el uso indiscriminado de iteradores nativos de python (aka "for", "while"). La idea es que aprendan a exprimir al máximo las funciones optimizadas que nos entrega `pandas`, las cuales vale mencionar, son bastante más eficientes que los iteradores nativos sobre DataFrames.

### **Link de repositorio de GitHub:** `http://....`

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Importamos librerias útiles

In [4]:
!pip install -qq xgboost optuna

# El emprendimiento de Fiu

Tras liderar de manera exitosa la implementación de un proyecto de ciencia de datos para caracterizar los datos generados en Santiago 2023, el misterioso corpóreo **Fiu** se anima y decide levantar su propio negocio de consultoría en machine learning. Tras varias e intensas negociaciones, Fiu logra encontrar su *primera chamba*: predecir la demanda (cantidad de venta) de una famosa productora de bebidas de calibre mundial. Como usted tuvo un rendimiento sobresaliente en el proyecto de caracterización de datos, Fiu lo contrata como *data scientist* de su emprendimiento.

Para este laboratorio deben trabajar con los datos `sales.csv` subidos a u-cursos, el cual contiene una muestra de ventas de la empresa para diferentes productos en un determinado tiempo.

Para comenzar, cargue el dataset señalado y visualice a través de un `.head` los atributos que posee el dataset.

<i><p align="center">Fiu siendo felicitado por su excelente desempeño en el proyecto de caracterización de datos</p></i>
<p align="center">
  <img src="https://media-front.elmostrador.cl/2023/09/A_UNO_1506411_2440e.jpg">
</p>

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime

df = pd.read_csv('sales.csv')
df['date'] = pd.to_datetime(df['date'])

df.head()

  df['date'] = pd.to_datetime(df['date'])


Unnamed: 0,id,date,city,lat,long,pop,shop,brand,container,capacity,price,quantity
0,0,2012-01-31,Athens,37.97945,23.71622,672130,shop_1,kinder-cola,glass,500ml,0.96,13280
1,1,2012-01-31,Athens,37.97945,23.71622,672130,shop_1,kinder-cola,plastic,1.5lt,2.86,6727
2,2,2012-01-31,Athens,37.97945,23.71622,672130,shop_1,kinder-cola,can,330ml,0.87,9848
3,3,2012-01-31,Athens,37.97945,23.71622,672130,shop_1,adult-cola,glass,500ml,1.0,20050
4,4,2012-01-31,Athens,37.97945,23.71622,672130,shop_1,adult-cola,can,330ml,0.39,25696


## 1 Generando un Baseline (0.5 puntos)

<p align="center">
  <img src="https://media.tenor.com/O-lan6TkadUAAAAC/what-i-wnna-do-after-a-baseline.gif">
</p>

Antes de entrenar un algoritmo, usted recuerda los apuntes de su magíster en ciencia de datos y recuerda que debe seguir una serie de *buenas prácticas* para entrenar correcta y debidamente su modelo. Después de un par de vueltas, llega a las siguientes tareas:

1. Separe los datos en conjuntos de train (70%), validation (20%) y test (10%). Fije una semilla para controlar la aleatoriedad.
2. Implemente un `FunctionTransformer` para extraer el día, mes y año de la variable `date`. Guarde estas variables en el formato categorical de pandas.
3. Implemente un `ColumnTransformer` para procesar de manera adecuada los datos numéricos y categóricos. Use `OneHotEncoder` para las variables categóricas.
4. Guarde los pasos anteriores en un `Pipeline`, dejando como último paso el regresor `DummyRegressor` para generar predicciones en base a promedios.
5. Entrene el pipeline anterior y reporte la métrica `mean_absolute_error` sobre los datos de validación. ¿Cómo se interpreta esta métrica para el contexto del negocio?
6. Finalmente, vuelva a entrenar el `Pipeline` pero esta vez usando `XGBRegressor` como modelo **utilizando los parámetros por default**. ¿Cómo cambia el MAE al implementar este algoritmo? ¿Es mejor o peor que el `DummyRegressor`?
7. Guarde ambos modelos en un archivo .pkl (uno cada uno)

In [25]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor
import joblib


data = df.copy()

# División de los datos en conjuntos de entrenamiento, validación y prueba
train_data, temp_data = train_test_split(data, test_size=0.3, random_state=42)
val_data, test_data = train_test_split(temp_data, test_size=1/3, random_state=42)

# Función para extraer día, mes y año de la fecha
def extract_date_features(df):
    df = df.copy()
    df['date'] = pd.to_datetime(df['date'], format='%d/%m/%y')
    df['day'] = df['date'].dt.day.astype('category')
    df['month'] = df['date'].dt.month.astype('category')
    df['year'] = df['date'].dt.year.astype('category')
    return df.drop(columns=['date','id'])

# Transformer para la extracción de características de la fecha
date_transformer = FunctionTransformer(extract_date_features)

# Aplicar la transformación a los datos de entrenamiento y validación
transformed_train_data = date_transformer.fit_transform(train_data)
transformed_val_data = date_transformer.transform(val_data)

# Separar características y variable objetivo
X_train = transformed_train_data.drop(columns=['quantity'])
y_train = transformed_train_data['quantity']

X_val = transformed_val_data.drop(columns=['quantity'])
y_val = transformed_val_data['quantity']

# Definir el column transformer
numeric_features = ['lat', 'long', 'pop', 'price']
categorical_features = ['city', 'shop', 'brand', 'container', 'capacity', 'day', 'month', 'year']

preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)
    ])

# Definir el pipeline con DummyRegressor
pipeline_dummy = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('regressor', DummyRegressor(strategy='mean'))
])

# Definir el pipeline con XGBRegressor
pipeline_xgb = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('regressor', XGBRegressor())
])

# Entrenar el pipeline con DummyRegressor
pipeline_dummy.fit(X_train, y_train)

# Predicciones y cálculo del MAE con DummyRegressor
y_val_pred_dummy = pipeline_dummy.predict(X_val)
mae_dummy = mean_absolute_error(y_val, y_val_pred_dummy)

# Entrenar el pipeline con XGBRegressor
pipeline_xgb.fit(X_train, y_train)

# Predicciones y cálculo del MAE con XGBRegressor
y_val_pred_xgb = pipeline_xgb.predict(X_val)
mae_xgb = mean_absolute_error(y_val, y_val_pred_xgb)

# Guardar ambos modelos en archivos .pkl
joblib.dump(pipeline_dummy, 'dummy_regressor_pipeline.pkl')
joblib.dump(pipeline_xgb, 'xgb_regressor_pipeline.pkl')

# Imprimir los resultados
print(f'MAE DummyRegressor: {mae_dummy}')
print(f'MAE XGBRegressor: {mae_xgb}')

MAE DummyRegressor: 13298.497767341096
MAE XGBRegressor: 2433.320936196607


## 2. Forzando relaciones entre parámetros con XGBoost (1.0 puntos)

<p align="center">
  <img src="https://64.media.tumblr.com/14cc45f9610a6ee341a45fd0d68f4dde/20d11b36022bca7b-bf/s640x960/67ab1db12ff73a530f649ac455c000945d99c0d6.gif">
</p>

Un colega aficionado a la economía le *sopla* que la demanda guarda una relación inversa con el precio del producto. Motivado para impresionar al querido corpóreo, se propone hacer uso de esta información para mejorar su modelo realizando las siguientes tareas:

1. Vuelva a entrenar el `Pipeline`, pero esta vez forzando una relación monótona negativa entre el precio y la cantidad. Para aplicar esta restricción apóyese en la siguiente <a href = https://xgboost.readthedocs.io/en/stable/tutorials/monotonic.html>documentación</a>. Hint: Para implementar el constraint se le sugiere hacerlo especificando el nombre de la variable. De ser así, probablemente le sea útil **mantener el formato de pandas** antes del step de entrenamiento.

2. Luego, vuelva a reportar el `MAE` sobre el conjunto de validación.

3. ¿Cómo cambia el error al incluir esta relación? ¿Tenía razón su amigo?




In [49]:
X_train.columns

Index(['city', 'lat', 'long', 'pop', 'shop', 'brand', 'container', 'capacity',
       'price', 'day', 'month', 'year'],
      dtype='object')

In [51]:
import xgboost as xgb
# Ajustamos el preprocesador para obtener los nombres de las características procesadas
preprocessor.fit(X_train)
X_train_processed = preprocessor.transform(X_train)
X_val_processed = preprocessor.transform(X_val)

# Crear DMatrix para XGBoost
dtrain = xgb.DMatrix(X_train_processed, label=y_train)
dval = xgb.DMatrix(X_val_processed, label=y_val)

# Definir parámetros con restricciones de monotonía
# La restricción se aplica en la posición correspondiente a 'price' después de OneHotEncoding
params = {
    'objective': 'reg:squarederror',
    'monotone_constraints': '(0,0,0,0,0,0,0,0,-1,0,0,0)'  # Ajustar según el número de características
}

# Definir el conjunto de evaluación
evallist = [(dtrain, 'train'), (dval, 'eval')]

# Entrenar el modelo con restricciones de monotonía
model_with_constraints = xgb.train(params, dtrain, num_boost_round=1000, evals=evallist, early_stopping_rounds=10)

# Predicciones y cálculo del MAE
y_val_pred_xgb_monotonic = model_with_constraints.predict(dval)
mae_xgb_monotonic = mean_absolute_error(y_val, y_val_pred_xgb_monotonic)

# Guardar el modelo
joblib.dump(model_with_constraints, 'xgb_regressor_monotonic_model.pkl')

# Imprimir los resultados
print(f'MAE XGBRegressor with Monotonic Constraint: {mae_xgb_monotonic}')

[0]	train-rmse:14358.74031	eval-rmse:14425.54552
[1]	train-rmse:12086.40610	eval-rmse:12310.91967
[2]	train-rmse:10363.16486	eval-rmse:10705.48184
[3]	train-rmse:9015.11646	eval-rmse:9497.48118
[4]	train-rmse:8125.69084	eval-rmse:8653.92247
[5]	train-rmse:7309.49944	eval-rmse:7913.85200
[6]	train-rmse:6742.70352	eval-rmse:7385.17539
[7]	train-rmse:6116.11076	eval-rmse:6818.22918
[8]	train-rmse:5762.69212	eval-rmse:6503.54446
[9]	train-rmse:5319.56047	eval-rmse:6100.98820
[10]	train-rmse:5020.93902	eval-rmse:5864.68814
[11]	train-rmse:4792.98022	eval-rmse:5656.41176
[12]	train-rmse:4519.17089	eval-rmse:5405.79187
[13]	train-rmse:4369.70338	eval-rmse:5293.33037
[14]	train-rmse:4118.68878	eval-rmse:5067.56430
[15]	train-rmse:3952.11328	eval-rmse:4934.09139
[16]	train-rmse:3829.15266	eval-rmse:4826.21858
[17]	train-rmse:3726.88112	eval-rmse:4782.57366
[18]	train-rmse:3632.17398	eval-rmse:4722.63810
[19]	train-rmse:3494.02078	eval-rmse:4601.30130
[20]	train-rmse:3440.97291	eval-rmse:4564.87

Aplicar Monotonic Contraint baja el MAE de 2433.32 a 2296.99

## 3. Optimización de Hiperparámetros con Optuna (2.0 puntos)

<p align="center">
  <img src="https://media.tenor.com/fmNdyGN4z5kAAAAi/hacking-lucy.gif">
</p>

Luego de presentarle sus resultados, Fiu le pregunta si es posible mejorar *aun más* su modelo. En particular, le comenta de la optimización de hiperparámetros con metodologías bayesianas a través del paquete `optuna`. Como usted es un aficionado al entrenamiento de modelos de ML, se propone implementar la descabellada idea de su jefe.

A partir de la mejor configuración obtenida en la sección anterior, utilice `optuna` para optimizar sus hiperparámetros. En particular, se le pide:

- Fijar una semilla en las instancias necesarias para garantizar la reproducibilidad de resultados
- Utilice `TPESampler` como método de muestreo
- De `XGBRegressor`, optimice los siguientes hiperparámetros:
    - `learning_rate` buscando valores flotantes en el rango (0.001, 0.1)
    - `n_estimators` buscando valores enteros en el rango (50, 1000)
    - `max_depth` buscando valores enteros en el rango (3, 10)
    - `max_leaves` buscando valores enteros en el rango (0, 100)
    - `min_child_weight` buscando valores enteros en el rango (1, 5)
    - `reg_alpha` buscando valores flotantes en el rango (0, 1)
    - `reg_lambda` buscando valores flotantes en el rango (0, 1)
- De `OneHotEncoder`, optimice el hiperparámetro `min_frequency` buscando el mejor valor flotante en el rango (0.0, 1.0)
- Explique cada hiperparámetro y su rol en el modelo. ¿Hacen sentido los rangos de optimización indicados?
- Fije el tiempo de entrenamiento a 5 minutos
- Reportar el número de *trials*, el `MAE` y los mejores hiperparámetros encontrados. ¿Cómo cambian sus resultados con respecto a la sección anterior? ¿A qué se puede deber esto?
- Guardar su modelo en un archivo .pkl

In [53]:
from optuna.samplers import TPESampler
import optuna
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split, cross_val_score
def objective(trial):
    # Optimizar hiperparámetros del preprocesador
    min_frequency = trial.suggest_float('min_frequency', 0.0, 1.0)
    params = {
        'objective': 'reg:squarederror',
        'learning_rate': trial.suggest_float('learning_rate', 0.001, 0.1),
        'n_estimators': trial.suggest_int('n_estimators', 50, 1000),
        'max_depth': trial.suggest_int('max_depth', 3, 10),
        'max_leaves': trial.suggest_int('max_leaves', 0, 100),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 5),
        'reg_alpha': trial.suggest_float('reg_alpha', 0, 1),
        'reg_lambda': trial.suggest_float('reg_lambda', 0, 1),
        'monotone_constraints':'(0,0,0,0,0,0,0,0,-1,0,0,0)'
    }

    min_frequency = trial.suggest_float('min_frequency', 0.0, 1.0)
    categorical_transformer = OneHotEncoder(handle_unknown='ignore')
    categorical_transformer.set_params(min_frequency=min_frequency)
    preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', categorical_transformer, categorical_features)
    ])

    # Entrenar el modelo con restricciones de monotonía
    model = XGBRegressor(seed=42, **params)

    pipeline = Pipeline(steps=[
        ('preprocessor', preprocessor),
        ('regressor', model)
    ])

    # Predicciones y cálculo del MAE
    pipeline.fit(X_train, y_train)
    y_val_pred = pipeline.predict(X_val)
    mae = mean_absolute_error(y_val, y_val_pred)

    return mae

# Fijar la semilla para reproducibilidad
optuna.logging.set_verbosity(optuna.logging.WARNING) #Quitar Log
sampler = TPESampler(seed=42)

# Crear y ejecutar el estudio
study = optuna.create_study(direction='minimize', sampler=sampler)
study.optimize(objective, n_trials=100, timeout=300) #Fijado a 100 trials o 5 min


In [54]:
# Reportar los resultados
print("Number of finished trials: ", len(study.trials))
print("Best trial:")
trial = study.best_trial

print("  Value: ", trial.value)
print("  Params: ")
for key, value in trial.params.items():
    print(f"    {key}: {value}")

Number of finished trials:  100
Best trial:
  Value:  1961.6943135047263
  Params: 
    min_frequency: 0.002924968027274369
    learning_rate: 0.058141548044030086
    n_estimators: 998
    max_depth: 8
    max_leaves: 78
    min_child_weight: 4
    reg_alpha: 0.5899725982095636
    reg_lambda: 0.3757980010998163


In [61]:
#Mejore parametros
best_params = study.best_params
#Mejor modelo
# Crear y ajustar el modelo final
preprocessor = ColumnTransformer(
transformers=[
    ('num', StandardScaler(), numeric_features),
    ('cat', OneHotEncoder(handle_unknown='ignore', min_frequency=best_params['min_frequency']), categorical_features)
])

# Entrenar el modelo con mejores parametros
model = XGBRegressor(
    seed=42,
    learning_rate=best_params['learning_rate'],
    n_estimators=best_params['n_estimators'],
    max_depth=best_params['max_depth'],
    max_leaves=best_params['max_leaves'],
    min_child_weight=best_params['min_child_weight'],
    reg_alpha=best_params['reg_alpha'],
    reg_lambda=best_params['reg_lambda'])

pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('regressor', model)
])

# Entrenamiento y Dump
pipeline.fit(X_train, y_train)
joblib.dump(pipeline, 'best_model.pkl')

['preprocessor.pkl']

Explicación:
1. Explique cada hiperparámetro y su rol en el modelo. ¿Hacen sentido los rangos de optimización indicados?:

Hiperparámetros del Modelo:

* min_frequency: Este parámetro del OneHotEncoder controla la frecuencia mínima de las categorías para ser incluidas en la codificación. Un valor más alto reduce la dimensionalidad eliminando categorías raras. Rango (0.0, 1.0) es razonable, ya que cubre desde no filtrar categorías hasta eliminar las más raras.
* learning_rate: Tasa de aprendizaje que controla el tamaño del paso en cada iteración. Valores más bajos permiten que el modelo aprenda de manera más precisa pero requieren más iteraciones. Rango (0.001, 0.1) es adecuado para evitar tanto el subajuste como el sobreajuste.
* n_estimators: Número de árboles en el modelo. Más árboles generalmente mejoran el ajuste pero pueden aumentar el riesgo de sobreajuste. Rango (50, 1000) es amplio para explorar tanto modelos más simples como más complejos.
* max_depth: Profundidad máxima de los árboles. Controla la complejidad del modelo. Valores más altos pueden capturar relaciones más complejas pero también pueden llevar al sobreajuste. Rango (3, 10) es adecuado para encontrar un equilibrio entre complejidad y generalización.
* max_leaves: Número máximo de hojas en los árboles. Similar a max_depth, controla la complejidad del modelo. Rango (0, 100) es razonable para explorar desde modelos muy simples hasta muy complejos.
* min_child_weight: Peso mínimo de la suma de instancias necesarias en una hoja hija. Ayuda a evitar el sobreajuste. Rango (1, 5) es adecuado para controlar la división de hojas en el árbol.
* reg_alpha: Término de regularización L1 que penaliza las características menos importantes, ayudando a evitar el sobreajuste. Rango (0, 1) es adecuado para explorar desde ninguna penalización hasta una penalización fuerte.
* reg_lambda: Término de regularización L2 que penaliza las características de manera diferente a reg_alpha. Rango (0, 1) es adecuado para explorar diferentes niveles de penalización.
*monotone_constraints: Restricciones de monotonía para garantizar que el modelo respete una relación monótona entre las características y la variable objetivo. En este caso, garantiza que un aumento en el precio (price) resulte en una disminución en la cantidad (quantity).

2. Reportar el número de trials, el MAE y los mejores hiperparámetros encontrados. ¿Cómo cambian sus resultados con respecto a la sección anterior? ¿A qué se puede deber esto?:

Número de Trials, MAE y Mejores Hiperparámetros Encontrados:

* Número de trials: 100
* MAE (Mean Absolute Error): 1961.69
Mejores Hiperparámetros:
* min_frequency: 0.002924968027274369
* learning_rate: 0.058141548044030086
* n_estimators: 998
* max_depth: 8
* max_leaves: 78
* min_child_weight: 4
* reg_alpha: 0.5899725982095636
* reg_lambda: 0.3757980010998163

De lo anterior podemos notar que:
* MAE con Hiperparámetros Optimizados (Optuna): 1961.69
* MAE Anterior (sin Optuna): 2296.988080173751

El MAE ha mejorado significativamente con la optimización de hiperparámetros utilizando Optuna. Esta mejora se debe a que el proceso de optimización bayesiana explora sistemáticamente el espacio de hiperparámetros para encontrar la combinación que minimiza el error de predicción

## 4. Optimización de Hiperparámetros con Optuna y Prunners (1.7)

<p align="center">
  <img src="https://i.pinimg.com/originals/90/16/f9/9016f919c2259f3d0e8fe465049638a7.gif">
</p>

Después de optimizar el rendimiento de su modelo varias veces, Fiu le pregunta si no es posible optimizar el entrenamiento del modelo en sí mismo. Después de leer un par de post de personas de dudosa reputación en la *deepweb*, usted llega a la conclusión que puede cumplir este objetivo mediante la implementación de **Prunning**.

Vuelva a optimizar los mismos hiperparámetros que la sección pasada, pero esta vez utilizando **Prunning** en la optimización. En particular, usted debe:

- Responder: ¿Qué es prunning? ¿De qué forma debería impactar en el entrenamiento?
- Utilizar `optuna.integration.XGBoostPruningCallback` como método de **Prunning**
- Fijar nuevamente el tiempo de entrenamiento a 5 minutos
- Reportar el número de *trials*, el `MAE` y los mejores hiperparámetros encontrados. ¿Cómo cambian sus resultados con respecto a la sección anterior? ¿A qué se puede deber esto?
- Guardar su modelo en un archivo .pkl

Nota: Si quieren silenciar los prints obtenidos en el prunning, pueden hacerlo mediante el siguiente comando:

```
optuna.logging.set_verbosity(optuna.logging.WARNING)
```

De implementar la opción anterior, pueden especificar `show_progress_bar = True` en el método `optimize` para *más sabor*.

Hint: Si quieren especificar parámetros del método .fit() del modelo a través del pipeline, pueden hacerlo por medio de la siguiente sintaxis: `pipeline.fit(stepmodelo__parametro = valor)`

Hint2: Este <a href = https://stackoverflow.com/questions/40329576/sklearn-pass-fit-parameters-to-xgboost-in-pipeline>enlace</a> les puede ser de ayuda en su implementación

In [64]:
!pip install optuna-integration

Collecting optuna-integration
  Downloading optuna_integration-3.6.0-py3-none-any.whl (93 kB)
[?25l     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/93.4 kB[0m [31m?[0m eta [36m-:--:--[0m[2K     [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[90m╺[0m [32m92.2/93.4 kB[0m [31m3.1 MB/s[0m eta [36m0:00:01[0m[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m93.4/93.4 kB[0m [31m2.2 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: optuna-integration
Successfully installed optuna-integration-3.6.0


In [75]:
from optuna.samplers import TPESampler
import optuna
from xgboost import XGBRegressor
def objective(trial):
    # Optimizar hiperparámetros del preprocesador
    min_frequency = trial.suggest_float('min_frequency', 0.0, 1.0)
    params = {
        'objective': 'reg:squarederror',
        "eval_metric": "mae",
        'learning_rate': trial.suggest_float('learning_rate', 0.001, 0.1),
        'n_estimators': trial.suggest_int('n_estimators', 50, 1000),
        'max_depth': trial.suggest_int('max_depth', 3, 10),
        'max_leaves': trial.suggest_int('max_leaves', 0, 100),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 5),
        'reg_alpha': trial.suggest_float('reg_alpha', 0, 1),
        'reg_lambda': trial.suggest_float('reg_lambda', 0, 1),
        'monotone_constraints':'(0,0,0,0,0,0,0,0,-1,0,0,0)'
    }

    min_frequency = trial.suggest_float('min_frequency', 0.0, 1.0)
    categorical_transformer = OneHotEncoder(handle_unknown='ignore')
    categorical_transformer.set_params(min_frequency=min_frequency)
    preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', categorical_transformer, categorical_features)
    ])
    # Entrenar el modelo con prunning
    pruning_callback = optuna.integration.XGBoostPruningCallback(
        trial, observation_key="validation_0-mae"
    )
    # Entrenar el modelo con restricciones de monotonía
    model = XGBRegressor(seed=42, **params)

    pipeline = Pipeline(steps=[
        ('preprocessor', preprocessor),
        ('regressor', model)
    ])
    X_train_transformed = preprocessor.fit_transform(X_train)
    X_val_transformed = preprocessor.transform(X_val)
    # Predicciones y cálculo del MAE
    pipeline.fit(X_train, y_train, regressor__eval_set=[(X_val_transformed, y_val)], regressor__early_stopping_rounds=10, regressor__callbacks=[pruning_callback])
    y_val_pred = pipeline.predict(X_val)
    mae = mean_absolute_error(y_val, y_val_pred)

    return mae

# Fijar la semilla para reproducibilidad
optuna.logging.set_verbosity(optuna.logging.WARNING) #Quitar Log
sampler = TPESampler(seed=42)

# Crear y ejecutar el estudio
study = optuna.create_study(direction='minimize', sampler=sampler)
study.optimize(objective, n_trials=100, timeout=300, show_progress_bar = True) #Fijado a 100 trials o 5 min

  0%|          | 0/100 [00:00<?, ?it/s]

[0]	validation_0-mae:12592.86576
[1]	validation_0-mae:11998.08750
[2]	validation_0-mae:11492.05480
[3]	validation_0-mae:11048.62646
[4]	validation_0-mae:10665.51945
[5]	validation_0-mae:10346.54007
[6]	validation_0-mae:10084.89470




[7]	validation_0-mae:9869.63969
[8]	validation_0-mae:9673.69848
[9]	validation_0-mae:9501.85683
[10]	validation_0-mae:9329.10604
[11]	validation_0-mae:9199.70781
[12]	validation_0-mae:9091.05398
[13]	validation_0-mae:8982.87844
[14]	validation_0-mae:8891.23863
[15]	validation_0-mae:8811.53678
[16]	validation_0-mae:8733.13107
[17]	validation_0-mae:8673.36623
[18]	validation_0-mae:8627.67944
[19]	validation_0-mae:8579.37769
[20]	validation_0-mae:8546.48327
[21]	validation_0-mae:8519.31996
[22]	validation_0-mae:8492.86745
[23]	validation_0-mae:8464.73626
[24]	validation_0-mae:8440.39767
[25]	validation_0-mae:8427.25415
[26]	validation_0-mae:8409.20505
[27]	validation_0-mae:8397.23210
[28]	validation_0-mae:8384.48886
[29]	validation_0-mae:8361.96410
[30]	validation_0-mae:8351.30970
[31]	validation_0-mae:8349.69375
[32]	validation_0-mae:8336.44146
[33]	validation_0-mae:8339.38079
[34]	validation_0-mae:8327.85994
[35]	validation_0-mae:8313.80361
[36]	validation_0-mae:8308.19398
[37]	validati



[20]	validation_0-mae:8607.38724
[21]	validation_0-mae:8568.67665
[22]	validation_0-mae:8535.97672
[23]	validation_0-mae:8508.93630
[24]	validation_0-mae:8475.66755
[25]	validation_0-mae:8458.13071
[26]	validation_0-mae:8435.47802
[27]	validation_0-mae:8422.50159
[28]	validation_0-mae:8410.47751
[29]	validation_0-mae:8401.46466
[30]	validation_0-mae:8401.19135
[31]	validation_0-mae:8388.43038
[32]	validation_0-mae:8383.43781
[33]	validation_0-mae:8374.87741
[34]	validation_0-mae:8372.22665
[35]	validation_0-mae:8374.64699
[36]	validation_0-mae:8375.09618
[37]	validation_0-mae:8380.46320
[38]	validation_0-mae:8378.79226
[39]	validation_0-mae:8374.89031
[40]	validation_0-mae:8377.41930
[41]	validation_0-mae:8375.82993
[42]	validation_0-mae:8373.43452
[43]	validation_0-mae:8378.45503
[0]	validation_0-mae:12854.62528
[1]	validation_0-mae:12446.77584
[2]	validation_0-mae:12061.12471
[3]	validation_0-mae:11705.63174
[4]	validation_0-mae:11366.66655
[5]	validation_0-mae:11055.87494
[6]	valida



[20]	validation_0-mae:8228.10479
[21]	validation_0-mae:8117.04663
[22]	validation_0-mae:8017.28281
[23]	validation_0-mae:7911.39798
[24]	validation_0-mae:7817.63617
[25]	validation_0-mae:7722.76742
[26]	validation_0-mae:7647.81023
[27]	validation_0-mae:7555.38127
[28]	validation_0-mae:7472.56167
[29]	validation_0-mae:7399.21832
[30]	validation_0-mae:7334.11391
[31]	validation_0-mae:7272.96041
[32]	validation_0-mae:7212.26135
[33]	validation_0-mae:7162.78875
[34]	validation_0-mae:7107.60715
[35]	validation_0-mae:7047.69944
[36]	validation_0-mae:7001.25329
[37]	validation_0-mae:6963.97866
[38]	validation_0-mae:6918.99868
[39]	validation_0-mae:6871.49718
[40]	validation_0-mae:6836.68817
[41]	validation_0-mae:6795.42380
[42]	validation_0-mae:6764.49340
[43]	validation_0-mae:6729.57847
[44]	validation_0-mae:6705.24112
[45]	validation_0-mae:6672.56812
[46]	validation_0-mae:6650.92597
[47]	validation_0-mae:6624.31807
[48]	validation_0-mae:6604.86351
[49]	validation_0-mae:6587.46908
[50]	valid



[29]	validation_0-mae:8388.60821
[30]	validation_0-mae:8382.86480
[31]	validation_0-mae:8378.02766
[32]	validation_0-mae:8376.93315
[33]	validation_0-mae:8376.45877
[34]	validation_0-mae:8373.29588
[35]	validation_0-mae:8377.26275
[36]	validation_0-mae:8373.42088
[37]	validation_0-mae:8367.46219
[38]	validation_0-mae:8370.48340
[39]	validation_0-mae:8363.69659
[40]	validation_0-mae:8368.06264
[41]	validation_0-mae:8371.82144
[42]	validation_0-mae:8377.23453
[43]	validation_0-mae:8374.98132
[44]	validation_0-mae:8379.32486
[45]	validation_0-mae:8386.62057
[46]	validation_0-mae:8390.34767
[47]	validation_0-mae:8393.45797
[48]	validation_0-mae:8391.35698
[0]	validation_0-mae:12530.95382
[1]	validation_0-mae:11847.58985
[2]	validation_0-mae:11248.15742
[3]	validation_0-mae:10703.82785
[4]	validation_0-mae:10235.03444
[5]	validation_0-mae:9793.87549
[6]	validation_0-mae:9402.16248
[7]	validation_0-mae:9105.29472
[8]	validation_0-mae:8753.19265
[9]	validation_0-mae:8435.51441
[10]	validation



[17]	validation_0-mae:6608.97607
[18]	validation_0-mae:6440.74507
[19]	validation_0-mae:6283.90818
[20]	validation_0-mae:6121.32705
[21]	validation_0-mae:5959.42829
[22]	validation_0-mae:5802.04566
[23]	validation_0-mae:5676.22255
[24]	validation_0-mae:5535.47165
[25]	validation_0-mae:5414.99821
[26]	validation_0-mae:5313.78819
[27]	validation_0-mae:5201.95085
[28]	validation_0-mae:5105.70138
[29]	validation_0-mae:5002.06905
[30]	validation_0-mae:4893.01921
[31]	validation_0-mae:4817.38336
[32]	validation_0-mae:4737.25309
[33]	validation_0-mae:4622.17508
[34]	validation_0-mae:4557.66789
[35]	validation_0-mae:4486.41174
[36]	validation_0-mae:4430.00504
[37]	validation_0-mae:4349.49654
[38]	validation_0-mae:4288.34941
[39]	validation_0-mae:4238.46052
[40]	validation_0-mae:4180.07534
[41]	validation_0-mae:4136.25762
[42]	validation_0-mae:4061.28163
[43]	validation_0-mae:4006.93083
[44]	validation_0-mae:3965.65001
[45]	validation_0-mae:3924.25960
[46]	validation_0-mae:3875.59839
[47]	valid



[0]	validation_0-mae:13120.37489




[0]	validation_0-mae:12540.67415
[1]	validation_0-mae:11885.32211
[2]	validation_0-mae:11287.81576
[3]	validation_0-mae:10720.86966
[4]	validation_0-mae:10232.97187
[5]	validation_0-mae:9779.74533
[6]	validation_0-mae:9375.16762
[7]	validation_0-mae:8971.02224
[8]	validation_0-mae:8606.86503
[9]	validation_0-mae:8279.03690
[10]	validation_0-mae:7958.53281
[11]	validation_0-mae:7681.89493
[12]	validation_0-mae:7414.37048
[13]	validation_0-mae:7174.81226
[14]	validation_0-mae:6946.76235




[15]	validation_0-mae:6711.20195
[16]	validation_0-mae:6473.36818
[17]	validation_0-mae:6300.87273
[18]	validation_0-mae:6109.62007
[19]	validation_0-mae:5935.15754
[20]	validation_0-mae:5743.76437
[21]	validation_0-mae:5603.15448
[22]	validation_0-mae:5440.61949
[23]	validation_0-mae:5318.07687
[24]	validation_0-mae:5177.68792
[25]	validation_0-mae:5055.31963
[26]	validation_0-mae:4943.65139
[27]	validation_0-mae:4829.27556
[28]	validation_0-mae:4719.16282
[29]	validation_0-mae:4607.39510
[30]	validation_0-mae:4510.46599
[31]	validation_0-mae:4411.26035
[32]	validation_0-mae:4338.07338
[33]	validation_0-mae:4242.52330
[34]	validation_0-mae:4178.30538
[35]	validation_0-mae:4106.56905
[36]	validation_0-mae:4028.62350
[37]	validation_0-mae:3965.80946
[38]	validation_0-mae:3901.76071
[39]	validation_0-mae:3845.08796
[40]	validation_0-mae:3776.29975
[41]	validation_0-mae:3727.85955
[42]	validation_0-mae:3671.89287
[43]	validation_0-mae:3632.68859
[44]	validation_0-mae:3588.66467
[45]	valid



[3]	validation_0-mae:10506.10271
[4]	validation_0-mae:9977.11320
[5]	validation_0-mae:9494.83895
[6]	validation_0-mae:9073.35571
[7]	validation_0-mae:8692.59849
[8]	validation_0-mae:8324.82049
[9]	validation_0-mae:8027.90071
[10]	validation_0-mae:7718.97614
[11]	validation_0-mae:7414.80918
[12]	validation_0-mae:7168.12047
[13]	validation_0-mae:6906.81612
[14]	validation_0-mae:6681.30018
[15]	validation_0-mae:6496.64975
[16]	validation_0-mae:6280.14209
[17]	validation_0-mae:6113.55497
[18]	validation_0-mae:5932.91315
[19]	validation_0-mae:5791.11814
[20]	validation_0-mae:5633.81800
[21]	validation_0-mae:5483.47620
[22]	validation_0-mae:5339.51940
[23]	validation_0-mae:5214.46807
[24]	validation_0-mae:5093.35396
[25]	validation_0-mae:4969.52923
[26]	validation_0-mae:4876.52316
[27]	validation_0-mae:4767.09617
[28]	validation_0-mae:4672.45044
[29]	validation_0-mae:4571.06747
[30]	validation_0-mae:4479.96480
[31]	validation_0-mae:4387.11162
[32]	validation_0-mae:4322.15123
[33]	validation_



[13]	validation_0-mae:6785.32061
[14]	validation_0-mae:6559.81974
[15]	validation_0-mae:6337.01788
[16]	validation_0-mae:6172.75549
[17]	validation_0-mae:5971.33836
[18]	validation_0-mae:5814.09071
[19]	validation_0-mae:5629.63234
[20]	validation_0-mae:5446.39603
[21]	validation_0-mae:5290.09866
[22]	validation_0-mae:5169.92845
[23]	validation_0-mae:5019.78358
[24]	validation_0-mae:4903.98433
[25]	validation_0-mae:4802.69692
[26]	validation_0-mae:4698.96349
[27]	validation_0-mae:4615.29756
[28]	validation_0-mae:4524.55563
[29]	validation_0-mae:4431.88159
[30]	validation_0-mae:4354.29656
[31]	validation_0-mae:4281.62570
[32]	validation_0-mae:4203.39230
[33]	validation_0-mae:4127.73536
[34]	validation_0-mae:4028.70712
[35]	validation_0-mae:3966.64532
[36]	validation_0-mae:3913.34121
[37]	validation_0-mae:3845.17252
[38]	validation_0-mae:3795.49430
[39]	validation_0-mae:3743.10469
[40]	validation_0-mae:3705.03155
[41]	validation_0-mae:3653.51021
[42]	validation_0-mae:3601.62231
[43]	valid



[0]	validation_0-mae:12583.40113
[1]	validation_0-mae:11965.62753
[2]	validation_0-mae:11432.12662




[3]	validation_0-mae:10971.31654
[4]	validation_0-mae:10581.89844
[5]	validation_0-mae:10252.87610
[6]	validation_0-mae:9961.88020
[7]	validation_0-mae:9715.92564
[8]	validation_0-mae:9492.16654
[9]	validation_0-mae:9303.22809
[10]	validation_0-mae:9146.51926
[11]	validation_0-mae:9005.76081
[12]	validation_0-mae:8889.21329
[13]	validation_0-mae:8788.61727
[14]	validation_0-mae:8700.77067
[15]	validation_0-mae:8620.65785
[16]	validation_0-mae:8564.30941
[17]	validation_0-mae:8510.38177
[18]	validation_0-mae:8472.01613




[0]	validation_0-mae:12993.62525




[0]	validation_0-mae:12510.06704
[1]	validation_0-mae:11806.05032
[2]	validation_0-mae:11185.37403
[3]	validation_0-mae:10647.55975
[4]	validation_0-mae:10162.79271
[5]	validation_0-mae:9701.54330
[6]	validation_0-mae:9295.22991
[7]	validation_0-mae:8906.99757
[8]	validation_0-mae:8573.95697
[9]	validation_0-mae:8277.65206
[10]	validation_0-mae:7994.63371
[11]	validation_0-mae:7711.70167
[12]	validation_0-mae:7438.96514
[13]	validation_0-mae:7199.53502
[14]	validation_0-mae:6948.45597
[15]	validation_0-mae:6745.81907




[16]	validation_0-mae:6541.33460
[17]	validation_0-mae:6376.94071
[18]	validation_0-mae:6210.38365
[19]	validation_0-mae:6053.80438
[20]	validation_0-mae:5893.09533
[21]	validation_0-mae:5737.81784
[22]	validation_0-mae:5603.12097
[23]	validation_0-mae:5488.14096
[24]	validation_0-mae:5343.89236
[25]	validation_0-mae:5240.61196
[26]	validation_0-mae:5123.05642
[27]	validation_0-mae:5018.09417
[28]	validation_0-mae:4940.50282
[29]	validation_0-mae:4838.98575
[30]	validation_0-mae:4760.91356
[31]	validation_0-mae:4657.42567
[32]	validation_0-mae:4588.20388
[33]	validation_0-mae:4502.02210
[34]	validation_0-mae:4422.01187
[35]	validation_0-mae:4350.83851
[36]	validation_0-mae:4303.04942
[37]	validation_0-mae:4230.71729
[38]	validation_0-mae:4167.70761
[39]	validation_0-mae:4111.55314
[40]	validation_0-mae:4049.75796
[41]	validation_0-mae:4013.30467
[42]	validation_0-mae:3943.59264
[43]	validation_0-mae:3892.51977
[44]	validation_0-mae:3868.29730
[45]	validation_0-mae:3823.44409
[46]	valid



[13]	validation_0-mae:7474.43201
[14]	validation_0-mae:7234.00904
[15]	validation_0-mae:7044.77311
[16]	validation_0-mae:6838.74284
[17]	validation_0-mae:6677.39001
[18]	validation_0-mae:6482.35418
[19]	validation_0-mae:6287.99154
[20]	validation_0-mae:6152.63251
[21]	validation_0-mae:6007.53531
[22]	validation_0-mae:5854.04412
[23]	validation_0-mae:5703.16590
[24]	validation_0-mae:5580.80523
[25]	validation_0-mae:5471.63980
[26]	validation_0-mae:5370.24798
[27]	validation_0-mae:5252.81484
[28]	validation_0-mae:5160.78010
[29]	validation_0-mae:5092.50237
[30]	validation_0-mae:5003.39783
[31]	validation_0-mae:4921.99455
[32]	validation_0-mae:4844.53557
[33]	validation_0-mae:4754.24339
[34]	validation_0-mae:4694.03622
[35]	validation_0-mae:4616.92836
[36]	validation_0-mae:4552.97071
[37]	validation_0-mae:4481.01272
[38]	validation_0-mae:4426.24012
[39]	validation_0-mae:4365.80928
[40]	validation_0-mae:4334.75694
[41]	validation_0-mae:4282.41740
[42]	validation_0-mae:4230.61384
[43]	valid



[18]	validation_0-mae:7063.52939
[19]	validation_0-mae:6968.86279
[20]	validation_0-mae:6883.79235
[21]	validation_0-mae:6815.97943
[22]	validation_0-mae:6758.23328
[23]	validation_0-mae:6693.27457
[24]	validation_0-mae:6643.44824
[25]	validation_0-mae:6591.54321
[26]	validation_0-mae:6554.40473
[27]	validation_0-mae:6522.63220
[28]	validation_0-mae:6506.93108
[29]	validation_0-mae:6477.22313
[30]	validation_0-mae:6451.41445
[31]	validation_0-mae:6433.34425
[32]	validation_0-mae:6414.11695
[33]	validation_0-mae:6396.93903
[34]	validation_0-mae:6382.84399
[35]	validation_0-mae:6366.73100
[36]	validation_0-mae:6368.17717
[37]	validation_0-mae:6358.93518
[38]	validation_0-mae:6350.11052
[39]	validation_0-mae:6337.75140
[40]	validation_0-mae:6333.03173
[41]	validation_0-mae:6328.93158
[42]	validation_0-mae:6322.46879
[43]	validation_0-mae:6314.57745
[44]	validation_0-mae:6311.16688
[45]	validation_0-mae:6311.49527
[0]	validation_0-mae:12667.25825




[0]	validation_0-mae:12712.33812




[0]	validation_0-mae:12429.31786
[1]	validation_0-mae:11690.18382
[2]	validation_0-mae:11022.39873
[3]	validation_0-mae:10449.45973
[4]	validation_0-mae:9918.93890
[5]	validation_0-mae:9450.77031
[6]	validation_0-mae:9033.31337
[7]	validation_0-mae:8637.19922
[8]	validation_0-mae:8294.57828
[9]	validation_0-mae:8005.13009
[10]	validation_0-mae:7689.45887
[11]	validation_0-mae:7424.53021
[12]	validation_0-mae:7112.97834
[13]	validation_0-mae:6874.80204
[14]	validation_0-mae:6629.97495
[15]	validation_0-mae:6402.62958
[16]	validation_0-mae:6198.91159
[17]	validation_0-mae:6016.69858
[18]	validation_0-mae:5819.97343
[19]	validation_0-mae:5601.74240
[20]	validation_0-mae:5407.07111
[21]	validation_0-mae:5271.17177
[22]	validation_0-mae:5128.36427
[23]	validation_0-mae:4973.99603
[24]	validation_0-mae:4840.78709
[25]	validation_0-mae:4745.21679
[26]	validation_0-mae:4623.54752
[27]	validation_0-mae:4494.78619
[28]	validation_0-mae:4417.39529
[29]	validation_0-mae:4299.37355
[30]	validation_



[0]	validation_0-mae:12855.87152
[0]	validation_0-mae:12549.36450
[1]	validation_0-mae:11907.67301




[2]	validation_0-mae:11355.94351
[3]	validation_0-mae:10890.36273
[4]	validation_0-mae:10495.72064
[5]	validation_0-mae:10156.31445
[6]	validation_0-mae:9863.24738
[7]	validation_0-mae:9615.97270
[8]	validation_0-mae:9398.58047
[9]	validation_0-mae:9213.95462
[10]	validation_0-mae:9069.00069
[11]	validation_0-mae:8935.84804
[12]	validation_0-mae:8824.71151
[13]	validation_0-mae:8721.09482
[14]	validation_0-mae:8644.20639
[15]	validation_0-mae:8580.94235
[16]	validation_0-mae:8521.53080
[17]	validation_0-mae:8475.75171
[18]	validation_0-mae:8441.02551
[19]	validation_0-mae:8409.55195
[0]	validation_0-mae:13241.93469




[0]	validation_0-mae:12633.71349




[0]	validation_0-mae:12651.62236
[0]	validation_0-mae:12455.36119




[1]	validation_0-mae:11738.01471
[2]	validation_0-mae:11120.59513
[3]	validation_0-mae:10613.43213
[4]	validation_0-mae:10174.67646
[5]	validation_0-mae:9796.37062
[6]	validation_0-mae:9475.74059
[7]	validation_0-mae:9214.44432
[8]	validation_0-mae:8980.44674
[9]	validation_0-mae:8796.47601
[10]	validation_0-mae:8637.35334
[11]	validation_0-mae:8493.19806
[12]	validation_0-mae:8373.34216
[13]	validation_0-mae:8266.80824
[14]	validation_0-mae:8176.62677
[15]	validation_0-mae:8107.84997
[16]	validation_0-mae:8048.44243
[17]	validation_0-mae:7992.33119
[18]	validation_0-mae:7954.01085
[19]	validation_0-mae:7922.55635
[20]	validation_0-mae:7888.66495
[21]	validation_0-mae:7869.99304
[22]	validation_0-mae:7854.45933
[23]	validation_0-mae:7841.19489
[24]	validation_0-mae:7818.68433
[0]	validation_0-mae:12529.24963
[1]	validation_0-mae:11858.84709
[2]	validation_0-mae:11265.31772
[3]	validation_0-mae:10705.65724
[4]	validation_0-mae:10205.11938
[5]	validation_0-mae:9742.81762
[6]	validation_0



[12]	validation_0-mae:7463.55716
[13]	validation_0-mae:7191.98478
[14]	validation_0-mae:6949.29555
[15]	validation_0-mae:6745.26388
[16]	validation_0-mae:6531.59615
[17]	validation_0-mae:6330.18904
[18]	validation_0-mae:6156.79431
[19]	validation_0-mae:5984.86134
[20]	validation_0-mae:5823.11733
[21]	validation_0-mae:5680.43639
[22]	validation_0-mae:5516.61757
[23]	validation_0-mae:5385.38494
[24]	validation_0-mae:5266.28388
[25]	validation_0-mae:5133.41995
[26]	validation_0-mae:5029.53712
[27]	validation_0-mae:4918.01578
[28]	validation_0-mae:4825.00125
[29]	validation_0-mae:4724.03939
[30]	validation_0-mae:4617.53887
[31]	validation_0-mae:4526.60784
[32]	validation_0-mae:4430.47026
[33]	validation_0-mae:4359.75374
[34]	validation_0-mae:4284.85584
[35]	validation_0-mae:4227.86093
[36]	validation_0-mae:4162.47255
[37]	validation_0-mae:4088.65946
[38]	validation_0-mae:4034.21984
[39]	validation_0-mae:3986.86346
[40]	validation_0-mae:3932.41192
[41]	validation_0-mae:3875.25757
[42]	valid



[0]	validation_0-mae:12767.34221




[0]	validation_0-mae:12630.34557




[0]	validation_0-mae:12728.89190




[0]	validation_0-mae:12489.00208
[1]	validation_0-mae:11799.98971
[2]	validation_0-mae:11216.00455
[3]	validation_0-mae:10733.57107
[4]	validation_0-mae:10340.75334
[5]	validation_0-mae:10003.73430
[6]	validation_0-mae:9711.50848
[7]	validation_0-mae:9471.15821
[8]	validation_0-mae:9265.45966
[9]	validation_0-mae:9099.83037
[10]	validation_0-mae:8960.14030
[11]	validation_0-mae:8837.68117
[12]	validation_0-mae:8732.29728
[13]	validation_0-mae:8659.57847
[14]	validation_0-mae:8585.63468




[15]	validation_0-mae:8524.89761
[16]	validation_0-mae:8482.27698
[17]	validation_0-mae:8459.08509
[18]	validation_0-mae:8420.88989




[0]	validation_0-mae:12511.72295
[1]	validation_0-mae:11823.00218
[2]	validation_0-mae:11213.77961
[3]	validation_0-mae:10671.11273
[4]	validation_0-mae:10176.99608
[5]	validation_0-mae:9751.66789
[6]	validation_0-mae:9372.78323
[7]	validation_0-mae:9038.48388
[8]	validation_0-mae:8743.57439
[9]	validation_0-mae:8472.61374
[10]	validation_0-mae:8242.04515
[11]	validation_0-mae:8027.50280
[12]	validation_0-mae:7838.34939
[13]	validation_0-mae:7666.62294
[14]	validation_0-mae:7510.35631




[15]	validation_0-mae:7363.74912
[16]	validation_0-mae:7237.96084
[17]	validation_0-mae:7104.26473
[18]	validation_0-mae:7012.33618
[19]	validation_0-mae:6910.23211
[20]	validation_0-mae:6833.02051
[21]	validation_0-mae:6768.41661
[22]	validation_0-mae:6714.68039
[23]	validation_0-mae:6659.29775
[24]	validation_0-mae:6601.64979
[25]	validation_0-mae:6568.04175
[26]	validation_0-mae:6525.20710
[27]	validation_0-mae:6487.44578
[28]	validation_0-mae:6456.83425
[29]	validation_0-mae:6438.62915
[30]	validation_0-mae:6422.05209
[31]	validation_0-mae:6402.10242
[32]	validation_0-mae:6395.99772
[33]	validation_0-mae:6375.84990
[34]	validation_0-mae:6376.59074
[35]	validation_0-mae:6367.63659
[36]	validation_0-mae:6356.04908
[37]	validation_0-mae:6349.23412
[38]	validation_0-mae:6344.17357
[39]	validation_0-mae:6333.34284
[40]	validation_0-mae:6330.01993
[41]	validation_0-mae:6330.11249
[42]	validation_0-mae:6325.90362
[43]	validation_0-mae:6325.25994
[44]	validation_0-mae:6325.32227
[45]	valid



[0]	validation_0-mae:12425.97045




[1]	validation_0-mae:11696.45057
[2]	validation_0-mae:11036.39609
[3]	validation_0-mae:10455.37122
[4]	validation_0-mae:9948.20983
[5]	validation_0-mae:9490.75640
[6]	validation_0-mae:9119.93749
[7]	validation_0-mae:8751.76926
[8]	validation_0-mae:8428.73165
[9]	validation_0-mae:8114.60042
[10]	validation_0-mae:7847.96408
[11]	validation_0-mae:7609.82800
[12]	validation_0-mae:7388.84228
[13]	validation_0-mae:7200.49778
[14]	validation_0-mae:7027.09185
[15]	validation_0-mae:6882.62917
[16]	validation_0-mae:6754.07167
[17]	validation_0-mae:6629.92077
[18]	validation_0-mae:6530.66077
[19]	validation_0-mae:6449.76628
[20]	validation_0-mae:6376.28528
[21]	validation_0-mae:6304.04858
[22]	validation_0-mae:6224.45438
[23]	validation_0-mae:6170.75718
[24]	validation_0-mae:6117.48326
[25]	validation_0-mae:6079.62241
[26]	validation_0-mae:6028.89567
[27]	validation_0-mae:5993.91071
[28]	validation_0-mae:5972.69281
[29]	validation_0-mae:5940.08085
[30]	validation_0-mae:5920.04585
[31]	validation_



[0]	validation_0-mae:13116.34137




[0]	validation_0-mae:12694.18579




[0]	validation_0-mae:12732.93109




[0]	validation_0-mae:12635.37072




[0]	validation_0-mae:12510.56730
[1]	validation_0-mae:11835.91469
[2]	validation_0-mae:11257.12899
[3]	validation_0-mae:10774.19375
[4]	validation_0-mae:10371.22787
[5]	validation_0-mae:10020.74575
[6]	validation_0-mae:9727.57191
[7]	validation_0-mae:9479.27770
[8]	validation_0-mae:9277.91961
[9]	validation_0-mae:9120.18208
[10]	validation_0-mae:8977.56858
[11]	validation_0-mae:8860.82591
[12]	validation_0-mae:8762.29354
[13]	validation_0-mae:8682.24474
[14]	validation_0-mae:8615.14099
[15]	validation_0-mae:8554.56755
[16]	validation_0-mae:8516.26777
[17]	validation_0-mae:8478.91285
[18]	validation_0-mae:8456.61095




[0]	validation_0-mae:12810.82993




[0]	validation_0-mae:12595.09619
[0]	validation_0-mae:12755.27615




[0]	validation_0-mae:12917.65344




[0]	validation_0-mae:12416.37295
[1]	validation_0-mae:11683.01773
[2]	validation_0-mae:11016.68106
[3]	validation_0-mae:10442.46179
[4]	validation_0-mae:9931.37406
[5]	validation_0-mae:9480.62759
[6]	validation_0-mae:9107.44330
[7]	validation_0-mae:8751.38170
[8]	validation_0-mae:8425.53328
[9]	validation_0-mae:8118.76050
[10]	validation_0-mae:7858.46264
[11]	validation_0-mae:7617.75318
[12]	validation_0-mae:7400.54900
[13]	validation_0-mae:7216.50603
[14]	validation_0-mae:7048.37667
[15]	validation_0-mae:6895.83454
[16]	validation_0-mae:6759.30253
[17]	validation_0-mae:6651.24892
[18]	validation_0-mae:6540.35609




[19]	validation_0-mae:6440.09265
[20]	validation_0-mae:6359.62241
[21]	validation_0-mae:6279.79095
[22]	validation_0-mae:6215.06940
[23]	validation_0-mae:6162.32163
[24]	validation_0-mae:6114.63272
[25]	validation_0-mae:6072.81952
[26]	validation_0-mae:6026.99964
[27]	validation_0-mae:5990.31253
[28]	validation_0-mae:5949.45128
[29]	validation_0-mae:5913.31694
[30]	validation_0-mae:5887.10597
[31]	validation_0-mae:5873.69903
[32]	validation_0-mae:5855.33551
[33]	validation_0-mae:5838.51404
[34]	validation_0-mae:5816.78681
[35]	validation_0-mae:5791.23614
[36]	validation_0-mae:5775.20264
[37]	validation_0-mae:5766.92064
[38]	validation_0-mae:5752.53943
[39]	validation_0-mae:5735.52907
[40]	validation_0-mae:5734.86702
[41]	validation_0-mae:5724.83089
[42]	validation_0-mae:5706.79879
[43]	validation_0-mae:5714.58230
[44]	validation_0-mae:5710.56969
[0]	validation_0-mae:12533.67228
[1]	validation_0-mae:11851.50336
[2]	validation_0-mae:11237.41674
[3]	validation_0-mae:10707.19429
[4]	valida



[14]	validation_0-mae:7119.43714
[15]	validation_0-mae:6916.25164
[16]	validation_0-mae:6700.75582
[17]	validation_0-mae:6532.76961
[18]	validation_0-mae:6358.56510
[19]	validation_0-mae:6181.56274
[20]	validation_0-mae:6029.42188
[21]	validation_0-mae:5871.17499
[22]	validation_0-mae:5725.85086
[23]	validation_0-mae:5587.89160
[24]	validation_0-mae:5449.63527
[25]	validation_0-mae:5324.94874
[26]	validation_0-mae:5220.08061
[27]	validation_0-mae:5112.32283
[28]	validation_0-mae:5004.24374
[29]	validation_0-mae:4905.23787
[30]	validation_0-mae:4804.64547
[31]	validation_0-mae:4731.59919
[32]	validation_0-mae:4650.39459
[33]	validation_0-mae:4564.54690
[34]	validation_0-mae:4485.52737
[35]	validation_0-mae:4413.92110
[36]	validation_0-mae:4355.03931
[37]	validation_0-mae:4303.74334
[38]	validation_0-mae:4234.63858
[39]	validation_0-mae:4183.19526
[40]	validation_0-mae:4133.81459
[41]	validation_0-mae:4075.06902
[42]	validation_0-mae:4034.92307
[43]	validation_0-mae:3979.53519
[44]	valid



[14]	validation_0-mae:6381.01627
[15]	validation_0-mae:6202.21455
[16]	validation_0-mae:5984.46887
[17]	validation_0-mae:5804.10404
[18]	validation_0-mae:5618.45169
[19]	validation_0-mae:5457.64731
[20]	validation_0-mae:5283.91188
[21]	validation_0-mae:5129.89436
[22]	validation_0-mae:4985.35138
[23]	validation_0-mae:4858.79180
[24]	validation_0-mae:4724.82823
[25]	validation_0-mae:4599.76338
[26]	validation_0-mae:4476.03931
[27]	validation_0-mae:4381.85598
[28]	validation_0-mae:4302.83674
[29]	validation_0-mae:4206.02737
[30]	validation_0-mae:4128.16031
[31]	validation_0-mae:4057.45048
[32]	validation_0-mae:3976.36724
[33]	validation_0-mae:3903.75586
[34]	validation_0-mae:3832.85424
[35]	validation_0-mae:3765.28678
[36]	validation_0-mae:3712.61660
[37]	validation_0-mae:3669.22601
[38]	validation_0-mae:3616.74659
[39]	validation_0-mae:3566.19160
[40]	validation_0-mae:3519.58676
[41]	validation_0-mae:3484.20388
[42]	validation_0-mae:3445.36909
[43]	validation_0-mae:3386.78234
[44]	valid



[13]	validation_0-mae:7296.23530
[14]	validation_0-mae:7087.22602
[15]	validation_0-mae:6858.74798
[16]	validation_0-mae:6689.65257
[17]	validation_0-mae:6535.85837
[18]	validation_0-mae:6360.79830
[19]	validation_0-mae:6194.45710
[20]	validation_0-mae:6022.81513
[21]	validation_0-mae:5882.37082
[22]	validation_0-mae:5730.90471
[23]	validation_0-mae:5594.36098
[24]	validation_0-mae:5491.98322
[25]	validation_0-mae:5347.68274
[26]	validation_0-mae:5229.49883
[27]	validation_0-mae:5133.93463
[28]	validation_0-mae:5021.53485
[29]	validation_0-mae:4922.95992
[30]	validation_0-mae:4822.78257
[31]	validation_0-mae:4755.33738
[32]	validation_0-mae:4663.93960
[33]	validation_0-mae:4579.21815
[34]	validation_0-mae:4526.34233
[35]	validation_0-mae:4456.09915
[36]	validation_0-mae:4401.04419
[37]	validation_0-mae:4345.34935
[38]	validation_0-mae:4279.27712
[39]	validation_0-mae:4212.79209
[40]	validation_0-mae:4164.74515
[41]	validation_0-mae:4114.18099
[42]	validation_0-mae:4072.80574
[43]	valid



[13]	validation_0-mae:7266.62788
[14]	validation_0-mae:7098.00341
[15]	validation_0-mae:6941.71359
[16]	validation_0-mae:6799.89319
[17]	validation_0-mae:6677.03495
[18]	validation_0-mae:6578.94603
[19]	validation_0-mae:6475.97828
[20]	validation_0-mae:6381.37321
[21]	validation_0-mae:6311.13914
[22]	validation_0-mae:6242.09904
[23]	validation_0-mae:6181.09488
[24]	validation_0-mae:6133.97443
[25]	validation_0-mae:6080.62611
[26]	validation_0-mae:6039.67858
[27]	validation_0-mae:6000.87968
[28]	validation_0-mae:5959.94881
[29]	validation_0-mae:5931.37931
[30]	validation_0-mae:5898.96971
[31]	validation_0-mae:5884.60689
[32]	validation_0-mae:5869.03036
[33]	validation_0-mae:5861.21904
[34]	validation_0-mae:5841.62743
[35]	validation_0-mae:5822.77661
[36]	validation_0-mae:5805.36057
[37]	validation_0-mae:5786.17036
[38]	validation_0-mae:5775.27363
[39]	validation_0-mae:5766.23698
[40]	validation_0-mae:5758.40703
[41]	validation_0-mae:5756.65819
[42]	validation_0-mae:5744.39927
[43]	valid



[0]	validation_0-mae:12658.00031




[0]	validation_0-mae:12705.04094




[0]	validation_0-mae:12412.16284
[1]	validation_0-mae:11680.45301
[2]	validation_0-mae:11015.56373
[3]	validation_0-mae:10439.71363
[4]	validation_0-mae:9927.40028
[5]	validation_0-mae:9461.43468
[6]	validation_0-mae:9102.26617
[7]	validation_0-mae:8754.88805
[8]	validation_0-mae:8425.79738
[9]	validation_0-mae:8125.88528
[10]	validation_0-mae:7862.26158
[11]	validation_0-mae:7623.38235
[12]	validation_0-mae:7402.05005
[13]	validation_0-mae:7213.01824
[14]	validation_0-mae:7044.33445
[15]	validation_0-mae:6890.11879
[16]	validation_0-mae:6757.50097
[17]	validation_0-mae:6641.93249




[18]	validation_0-mae:6530.17497
[19]	validation_0-mae:6443.30666
[20]	validation_0-mae:6350.11209
[21]	validation_0-mae:6269.97894
[22]	validation_0-mae:6210.54653
[23]	validation_0-mae:6144.75619
[24]	validation_0-mae:6094.11914
[25]	validation_0-mae:6054.56517
[26]	validation_0-mae:6018.66640
[27]	validation_0-mae:5973.61433
[28]	validation_0-mae:5942.89990
[29]	validation_0-mae:5913.26302
[30]	validation_0-mae:5884.04172
[31]	validation_0-mae:5863.91474
[32]	validation_0-mae:5864.78338
[33]	validation_0-mae:5837.81460
[34]	validation_0-mae:5815.89449
[35]	validation_0-mae:5807.46491
[36]	validation_0-mae:5798.88511
[37]	validation_0-mae:5789.11496
[38]	validation_0-mae:5779.30028
[39]	validation_0-mae:5766.38996
[40]	validation_0-mae:5766.99914
[41]	validation_0-mae:5755.96560
[42]	validation_0-mae:5753.21708
[43]	validation_0-mae:5743.23982
[44]	validation_0-mae:5741.74120
[45]	validation_0-mae:5728.02250
[0]	validation_0-mae:12845.61154




[0]	validation_0-mae:12400.91333
[1]	validation_0-mae:11618.35355
[2]	validation_0-mae:10908.46263
[3]	validation_0-mae:10277.13987
[4]	validation_0-mae:9711.01877
[5]	validation_0-mae:9205.82556
[6]	validation_0-mae:8730.45434
[7]	validation_0-mae:8315.35326
[8]	validation_0-mae:7938.53334




[9]	validation_0-mae:7586.04359
[10]	validation_0-mae:7274.31326
[11]	validation_0-mae:7014.58875
[12]	validation_0-mae:6763.56117
[13]	validation_0-mae:6449.21046
[14]	validation_0-mae:6208.22462
[15]	validation_0-mae:5953.42684
[16]	validation_0-mae:5749.47780
[17]	validation_0-mae:5544.69897
[18]	validation_0-mae:5384.78820
[19]	validation_0-mae:5213.13627
[20]	validation_0-mae:5046.03158
[21]	validation_0-mae:4926.35705
[22]	validation_0-mae:4777.34989
[23]	validation_0-mae:4633.35046
[24]	validation_0-mae:4514.01777
[25]	validation_0-mae:4397.46382
[26]	validation_0-mae:4284.35204
[27]	validation_0-mae:4188.32253
[28]	validation_0-mae:4089.15230
[29]	validation_0-mae:4005.96794
[30]	validation_0-mae:3925.96752
[31]	validation_0-mae:3833.96672
[32]	validation_0-mae:3767.61686
[33]	validation_0-mae:3702.88575
[34]	validation_0-mae:3638.26754
[35]	validation_0-mae:3573.62147
[36]	validation_0-mae:3504.60748
[37]	validation_0-mae:3456.03192
[38]	validation_0-mae:3406.61746
[39]	valida



[12]	validation_0-mae:6820.17288
[13]	validation_0-mae:6512.54891
[14]	validation_0-mae:6304.00459
[15]	validation_0-mae:6087.05713
[16]	validation_0-mae:5877.80688
[17]	validation_0-mae:5686.11177
[18]	validation_0-mae:5514.92300
[19]	validation_0-mae:5333.31250
[20]	validation_0-mae:5179.55031
[21]	validation_0-mae:5006.97108
[22]	validation_0-mae:4874.41320
[23]	validation_0-mae:4739.56052
[24]	validation_0-mae:4613.46216
[25]	validation_0-mae:4494.14069
[26]	validation_0-mae:4374.72260
[27]	validation_0-mae:4275.59027
[28]	validation_0-mae:4193.90526
[29]	validation_0-mae:4115.41344
[30]	validation_0-mae:4046.41540
[31]	validation_0-mae:3971.68105
[32]	validation_0-mae:3905.58861
[33]	validation_0-mae:3831.91248
[34]	validation_0-mae:3737.06454
[35]	validation_0-mae:3671.84802
[36]	validation_0-mae:3613.25930
[37]	validation_0-mae:3566.54637
[38]	validation_0-mae:3518.58097
[39]	validation_0-mae:3465.80597
[40]	validation_0-mae:3408.68707
[41]	validation_0-mae:3368.31141
[42]	valid



[11]	validation_0-mae:7038.38347
[12]	validation_0-mae:6762.00870
[13]	validation_0-mae:6507.58090
[14]	validation_0-mae:6271.14979
[15]	validation_0-mae:6034.46007
[16]	validation_0-mae:5829.06522
[17]	validation_0-mae:5639.78383
[18]	validation_0-mae:5462.66802
[19]	validation_0-mae:5298.45200
[20]	validation_0-mae:5148.32966
[21]	validation_0-mae:5006.92824
[22]	validation_0-mae:4868.39409
[23]	validation_0-mae:4757.76010
[24]	validation_0-mae:4621.70514
[25]	validation_0-mae:4499.17662
[26]	validation_0-mae:4405.94437
[27]	validation_0-mae:4302.49977
[28]	validation_0-mae:4215.32231
[29]	validation_0-mae:4126.03700
[30]	validation_0-mae:4029.38495
[31]	validation_0-mae:3954.21794
[32]	validation_0-mae:3880.18113
[33]	validation_0-mae:3808.42637
[34]	validation_0-mae:3749.14020
[35]	validation_0-mae:3692.64850
[36]	validation_0-mae:3636.36548
[37]	validation_0-mae:3586.85004
[38]	validation_0-mae:3533.72372
[39]	validation_0-mae:3496.66216
[40]	validation_0-mae:3458.80099
[41]	valid



[9]	validation_0-mae:7872.89478
[10]	validation_0-mae:7581.91865
[11]	validation_0-mae:7284.37030
[12]	validation_0-mae:7035.34301
[13]	validation_0-mae:6733.94266
[14]	validation_0-mae:6494.14937
[15]	validation_0-mae:6280.43619
[16]	validation_0-mae:6051.96639
[17]	validation_0-mae:5840.38394
[18]	validation_0-mae:5653.54991
[19]	validation_0-mae:5462.39633
[20]	validation_0-mae:5312.06602
[21]	validation_0-mae:5136.64821
[22]	validation_0-mae:5005.90198
[23]	validation_0-mae:4867.21250
[24]	validation_0-mae:4737.49880
[25]	validation_0-mae:4626.21403
[26]	validation_0-mae:4517.49868
[27]	validation_0-mae:4411.11270
[28]	validation_0-mae:4297.03238
[29]	validation_0-mae:4203.31668
[30]	validation_0-mae:4121.30027
[31]	validation_0-mae:4032.24543
[32]	validation_0-mae:3965.11336
[33]	validation_0-mae:3883.98785
[34]	validation_0-mae:3810.10878
[35]	validation_0-mae:3725.09656
[36]	validation_0-mae:3659.19503
[37]	validation_0-mae:3604.34216
[38]	validation_0-mae:3550.94326
[39]	valida



[0]	validation_0-mae:12432.56170
[1]	validation_0-mae:11672.15668
[2]	validation_0-mae:10990.79981
[3]	validation_0-mae:10364.91724
[4]	validation_0-mae:9819.73992
[5]	validation_0-mae:9306.32176
[6]	validation_0-mae:8856.31918
[7]	validation_0-mae:8419.89908
[8]	validation_0-mae:8056.82457
[9]	validation_0-mae:7703.99007
[10]	validation_0-mae:7361.23235
[11]	validation_0-mae:7059.42920




[12]	validation_0-mae:6770.40241
[13]	validation_0-mae:6520.51201
[14]	validation_0-mae:6293.46960
[15]	validation_0-mae:6065.89575
[16]	validation_0-mae:5849.00412
[17]	validation_0-mae:5621.72211
[18]	validation_0-mae:5458.79175
[19]	validation_0-mae:5285.55584
[20]	validation_0-mae:5123.63823
[21]	validation_0-mae:4971.27088
[22]	validation_0-mae:4843.64856
[23]	validation_0-mae:4708.81386
[24]	validation_0-mae:4579.48027
[25]	validation_0-mae:4447.22673
[26]	validation_0-mae:4334.83197
[27]	validation_0-mae:4226.13231
[28]	validation_0-mae:4122.95248
[29]	validation_0-mae:4022.16182
[30]	validation_0-mae:3933.02940
[31]	validation_0-mae:3843.96087
[32]	validation_0-mae:3768.31161
[33]	validation_0-mae:3695.96349
[34]	validation_0-mae:3638.09547
[35]	validation_0-mae:3573.49075
[36]	validation_0-mae:3520.37795
[37]	validation_0-mae:3467.11482
[38]	validation_0-mae:3416.27143
[39]	validation_0-mae:3361.51897
[40]	validation_0-mae:3307.11557
[41]	validation_0-mae:3260.64481
[42]	valid



[0]	validation_0-mae:12521.04248
[1]	validation_0-mae:11845.32269
[2]	validation_0-mae:11226.04915
[3]	validation_0-mae:10647.45930
[4]	validation_0-mae:10117.75513
[5]	validation_0-mae:9628.84715
[6]	validation_0-mae:9197.58739
[7]	validation_0-mae:8801.66816
[8]	validation_0-mae:8447.93961
[9]	validation_0-mae:8104.92306
[10]	validation_0-mae:7782.61093
[11]	validation_0-mae:7514.92855




[12]	validation_0-mae:7228.98719
[13]	validation_0-mae:6981.50969
[14]	validation_0-mae:6754.03415
[15]	validation_0-mae:6535.11914
[16]	validation_0-mae:6321.16495
[17]	validation_0-mae:6110.36680
[18]	validation_0-mae:5929.94971
[19]	validation_0-mae:5740.98685
[20]	validation_0-mae:5586.47514
[21]	validation_0-mae:5415.00997
[22]	validation_0-mae:5251.79889
[23]	validation_0-mae:5109.31018
[24]	validation_0-mae:4978.37114
[25]	validation_0-mae:4832.57002
[26]	validation_0-mae:4730.79243
[27]	validation_0-mae:4602.06030
[28]	validation_0-mae:4497.85359
[29]	validation_0-mae:4395.82351
[30]	validation_0-mae:4295.16646
[31]	validation_0-mae:4192.27148
[32]	validation_0-mae:4101.29742
[33]	validation_0-mae:4024.22497
[34]	validation_0-mae:3944.57141
[35]	validation_0-mae:3873.73239
[36]	validation_0-mae:3807.54264
[37]	validation_0-mae:3740.59093
[38]	validation_0-mae:3668.48444
[39]	validation_0-mae:3608.28181
[40]	validation_0-mae:3552.42239
[41]	validation_0-mae:3497.30459
[42]	valid



[1]	validation_0-mae:11588.93975
[2]	validation_0-mae:10861.64053
[3]	validation_0-mae:10215.58788
[4]	validation_0-mae:9637.04217
[5]	validation_0-mae:9132.09751
[6]	validation_0-mae:8669.91686
[7]	validation_0-mae:8256.55562
[8]	validation_0-mae:7865.46451
[9]	validation_0-mae:7498.63869
[10]	validation_0-mae:7188.45270
[11]	validation_0-mae:6917.27145
[12]	validation_0-mae:6615.01143
[13]	validation_0-mae:6353.37636
[14]	validation_0-mae:6114.42121
[15]	validation_0-mae:5892.53651
[16]	validation_0-mae:5671.70336
[17]	validation_0-mae:5448.87643
[18]	validation_0-mae:5253.94666
[19]	validation_0-mae:5078.27119
[20]	validation_0-mae:4907.15094
[21]	validation_0-mae:4766.26478
[22]	validation_0-mae:4617.45789
[23]	validation_0-mae:4501.20007
[24]	validation_0-mae:4378.85325
[25]	validation_0-mae:4270.44440
[26]	validation_0-mae:4168.49795
[27]	validation_0-mae:4063.72923
[28]	validation_0-mae:3974.67553
[29]	validation_0-mae:3887.41372
[30]	validation_0-mae:3804.12776
[31]	validation_



[5]	validation_0-mae:9130.03381
[6]	validation_0-mae:8665.66428
[7]	validation_0-mae:8245.63365
[8]	validation_0-mae:7854.76007
[9]	validation_0-mae:7504.51340
[10]	validation_0-mae:7185.59412
[11]	validation_0-mae:6907.59406
[12]	validation_0-mae:6630.74786
[13]	validation_0-mae:6367.47218
[14]	validation_0-mae:6150.08565
[15]	validation_0-mae:5905.72056
[16]	validation_0-mae:5697.54284
[17]	validation_0-mae:5479.96144
[18]	validation_0-mae:5292.95770
[19]	validation_0-mae:5122.76026
[20]	validation_0-mae:4941.96383
[21]	validation_0-mae:4793.19806
[22]	validation_0-mae:4648.44075
[23]	validation_0-mae:4512.31204
[24]	validation_0-mae:4386.82324
[25]	validation_0-mae:4271.08001
[26]	validation_0-mae:4175.00423
[27]	validation_0-mae:4083.80228
[28]	validation_0-mae:3980.38429
[29]	validation_0-mae:3876.88755
[30]	validation_0-mae:3803.31144
[31]	validation_0-mae:3717.68561
[32]	validation_0-mae:3641.13203
[33]	validation_0-mae:3572.97142
[34]	validation_0-mae:3521.99816
[35]	validation



[6]	validation_0-mae:8653.07439
[7]	validation_0-mae:8235.39785
[8]	validation_0-mae:7846.64035
[9]	validation_0-mae:7498.88635
[10]	validation_0-mae:7178.06687
[11]	validation_0-mae:6901.18912
[12]	validation_0-mae:6629.86356
[13]	validation_0-mae:6378.68546
[14]	validation_0-mae:6117.17806
[15]	validation_0-mae:5878.99080
[16]	validation_0-mae:5660.76028
[17]	validation_0-mae:5443.54134
[18]	validation_0-mae:5258.57541
[19]	validation_0-mae:5069.89781
[20]	validation_0-mae:4913.82662
[21]	validation_0-mae:4752.11522
[22]	validation_0-mae:4610.57406
[23]	validation_0-mae:4481.72772
[24]	validation_0-mae:4361.14052
[25]	validation_0-mae:4243.98628
[26]	validation_0-mae:4140.47536
[27]	validation_0-mae:4048.95656
[28]	validation_0-mae:3953.61693
[29]	validation_0-mae:3861.77878
[30]	validation_0-mae:3787.85165
[31]	validation_0-mae:3717.65327
[32]	validation_0-mae:3638.61234
[33]	validation_0-mae:3575.21560
[34]	validation_0-mae:3518.85966
[35]	validation_0-mae:3463.17394
[36]	validatio



[10]	validation_0-mae:7215.32482
[11]	validation_0-mae:6908.78204
[12]	validation_0-mae:6623.37200
[13]	validation_0-mae:6352.09547
[14]	validation_0-mae:6110.87882
[15]	validation_0-mae:5879.28934
[16]	validation_0-mae:5660.62469
[17]	validation_0-mae:5481.03009
[18]	validation_0-mae:5287.94654
[19]	validation_0-mae:5116.34692
[20]	validation_0-mae:4949.26538
[21]	validation_0-mae:4800.72349
[22]	validation_0-mae:4671.49107
[23]	validation_0-mae:4528.87710
[24]	validation_0-mae:4400.59700
[25]	validation_0-mae:4283.29691
[26]	validation_0-mae:4192.42955
[27]	validation_0-mae:4086.35418
[28]	validation_0-mae:3990.83747
[29]	validation_0-mae:3891.69086
[30]	validation_0-mae:3812.51910
[31]	validation_0-mae:3729.40029
[32]	validation_0-mae:3660.97781
[33]	validation_0-mae:3592.51620
[34]	validation_0-mae:3527.34151
[35]	validation_0-mae:3472.30508
[36]	validation_0-mae:3420.19312
[37]	validation_0-mae:3367.81684
[38]	validation_0-mae:3319.50575
[39]	validation_0-mae:3276.20355
[40]	valid



[0]	validation_0-mae:12441.12115
[1]	validation_0-mae:11696.41028
[2]	validation_0-mae:11018.03895
[3]	validation_0-mae:10392.02413
[4]	validation_0-mae:9849.18629
[5]	validation_0-mae:9337.79940
[6]	validation_0-mae:8890.67741
[7]	validation_0-mae:8465.14609
[8]	validation_0-mae:8090.99478
[9]	validation_0-mae:7740.62755
[10]	validation_0-mae:7413.91802




[11]	validation_0-mae:7133.22848
[12]	validation_0-mae:6856.43914
[13]	validation_0-mae:6599.88387
[14]	validation_0-mae:6367.31644
[15]	validation_0-mae:6124.47444
[16]	validation_0-mae:5928.59999
[17]	validation_0-mae:5735.83681
[18]	validation_0-mae:5569.13440
[19]	validation_0-mae:5376.68559
[20]	validation_0-mae:5234.72729
[21]	validation_0-mae:5076.85123
[22]	validation_0-mae:4924.93503
[23]	validation_0-mae:4798.69577
[24]	validation_0-mae:4666.38540
[25]	validation_0-mae:4543.73526
[26]	validation_0-mae:4445.85249
[27]	validation_0-mae:4341.51236
[28]	validation_0-mae:4233.46940
[29]	validation_0-mae:4137.10507
[30]	validation_0-mae:4045.53048
[31]	validation_0-mae:3959.39322
[32]	validation_0-mae:3878.32427
[33]	validation_0-mae:3806.19266
[34]	validation_0-mae:3730.27401
[35]	validation_0-mae:3663.01596
[36]	validation_0-mae:3607.37289
[37]	validation_0-mae:3551.60839
[38]	validation_0-mae:3508.39642
[39]	validation_0-mae:3453.28610
[40]	validation_0-mae:3404.75445
[41]	valid



[10]	validation_0-mae:7610.64282
[11]	validation_0-mae:7288.43165
[12]	validation_0-mae:7021.28075
[13]	validation_0-mae:6759.86123
[14]	validation_0-mae:6508.07348
[15]	validation_0-mae:6264.37375
[16]	validation_0-mae:6030.97764
[17]	validation_0-mae:5819.64055
[18]	validation_0-mae:5623.61158
[19]	validation_0-mae:5449.89608
[20]	validation_0-mae:5277.79774
[21]	validation_0-mae:5124.97231
[22]	validation_0-mae:4969.66893
[23]	validation_0-mae:4839.18999
[24]	validation_0-mae:4712.62631
[25]	validation_0-mae:4584.57154
[26]	validation_0-mae:4472.12747
[27]	validation_0-mae:4364.15477
[28]	validation_0-mae:4256.41598
[29]	validation_0-mae:4162.83905
[30]	validation_0-mae:4071.77779
[31]	validation_0-mae:3992.99997
[32]	validation_0-mae:3900.55700
[33]	validation_0-mae:3822.02503
[34]	validation_0-mae:3744.57750
[35]	validation_0-mae:3681.11648
[36]	validation_0-mae:3617.29482
[37]	validation_0-mae:3553.08597
[38]	validation_0-mae:3504.32929
[39]	validation_0-mae:3455.41847
[40]	valid



[0]	validation_0-mae:12382.31712
[1]	validation_0-mae:11593.89132
[2]	validation_0-mae:10875.46547
[3]	validation_0-mae:10224.37270
[4]	validation_0-mae:9641.34497
[5]	validation_0-mae:9134.02011
[6]	validation_0-mae:8674.52556
[7]	validation_0-mae:8254.02208
[8]	validation_0-mae:7866.32162
[9]	validation_0-mae:7517.36074
[10]	validation_0-mae:7192.42687




[11]	validation_0-mae:6866.73934
[12]	validation_0-mae:6602.01846
[13]	validation_0-mae:6355.68504
[14]	validation_0-mae:6128.37029
[15]	validation_0-mae:5912.04096
[16]	validation_0-mae:5685.45609
[17]	validation_0-mae:5486.88020
[18]	validation_0-mae:5283.46066
[19]	validation_0-mae:5114.88065
[20]	validation_0-mae:4950.24370
[21]	validation_0-mae:4800.73667
[22]	validation_0-mae:4652.52359
[23]	validation_0-mae:4543.05803
[24]	validation_0-mae:4416.03949
[25]	validation_0-mae:4309.94492
[26]	validation_0-mae:4194.22536
[27]	validation_0-mae:4100.36697
[28]	validation_0-mae:4005.36749
[29]	validation_0-mae:3922.10753
[30]	validation_0-mae:3829.23859
[31]	validation_0-mae:3739.91159
[32]	validation_0-mae:3659.84532
[33]	validation_0-mae:3583.50871
[34]	validation_0-mae:3518.31042
[35]	validation_0-mae:3464.05930
[36]	validation_0-mae:3407.36116
[37]	validation_0-mae:3367.15156
[38]	validation_0-mae:3315.02669
[39]	validation_0-mae:3262.32350
[40]	validation_0-mae:3216.26805
[41]	valid



[0]	validation_0-mae:12462.00928
[1]	validation_0-mae:11738.89166
[2]	validation_0-mae:11071.27787
[3]	validation_0-mae:10475.06817
[4]	validation_0-mae:9944.69952
[5]	validation_0-mae:9470.68291
[6]	validation_0-mae:9058.77919
[7]	validation_0-mae:8700.26669
[8]	validation_0-mae:8379.28002
[9]	validation_0-mae:8078.81250




Prunning permitirá "podar" todos aquellos intentos que tienden a disminuir el desempeño final del modelo

In [76]:
# Reportar los resultados
print("Number of finished trials: ", len(study.trials))
print("Best trial:")
trial = study.best_trial

print("  Value: ", trial.value)
print("  Params: ")
for key, value in trial.params.items():
    print(f"    {key}: {value}")

Number of finished trials:  100
Best trial:
  Value:  1972.1627931652415
  Params: 
    min_frequency: 0.06682216541029647
    learning_rate: 0.0916109108265852
    n_estimators: 648
    max_depth: 9
    max_leaves: 90
    min_child_weight: 3
    reg_alpha: 0.3613749021550724
    reg_lambda: 0.6886578311868585


In [79]:
#Mejore parametros
best_params = study.best_params
#Mejor modelo
# Crear y ajustar el modelo final
preprocessor = ColumnTransformer(
transformers=[
    ('num', StandardScaler(), numeric_features),
    ('cat', OneHotEncoder(handle_unknown='ignore', min_frequency=best_params['min_frequency']), categorical_features)
])

# Entrenar el modelo con mejores parametros
model = XGBRegressor(
    seed=42,
    learning_rate=best_params['learning_rate'],
    n_estimators=best_params['n_estimators'],
    max_depth=best_params['max_depth'],
    max_leaves=best_params['max_leaves'],
    min_child_weight=best_params['min_child_weight'],
    reg_alpha=best_params['reg_alpha'],
    reg_lambda=best_params['reg_lambda'])

pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('regressor', model)
])

# Entrenamiento y Dump
pipeline.fit(X_train, y_train)
joblib.dump(pipeline, 'best_model.pkl')

['best_model.pkl']

* MAE con Hiperparámetros Optimizados sin prunning (Optuna): 1961.69
* MAE con Hiperparámetros Optimizados con prunning (Optuna): 1972.16


El MAE subió levemente comparado al metodo sin prunning, esto no es algo extraño, ya que el método de prunning no cambia la optimización en sí, sino que "poda" pruebas o trials menos prometedoras, acelerando el proceso de actualización. Esta leve diferencia puede estar dada por una razón más probabilistica.

## 5. Visualizaciones (0.5 puntos)

<p align="center">
  <img src="https://media.tenor.com/F-LgB1xTebEAAAAd/look-at-this-graph-nickelback.gif">
</p>


Satisfecho con su trabajo, Fiu le pregunta si es posible generar visualizaciones que permitan entender el entrenamiento de su modelo.

A partir del siguiente <a href = https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/005_visualization.html#visualization>enlace</a>, genere las siguientes visualizaciones:

1. Gráfico de historial de optimización
2. Gráfico de coordenadas paralelas
3. Gráfico de importancia de hiperparámetros

Comente sus resultados:

4. ¿Desde qué *trial* se empiezan a observar mejoras notables en sus resultados?
5. ¿Qué tendencias puede observar a partir del gráfico de coordenadas paralelas?
6. ¿Cuáles son los hiperparámetros con mayor importancia para la optimización de su modelo?

In [77]:
# Crear las visualizaciones
fig1 = optuna.visualization.plot_optimization_history(study)
fig2 = optuna.visualization.plot_parallel_coordinate(study)
fig3 = optuna.visualization.plot_param_importances(study)

In [78]:
# Mostrar las visualizaciones
fig1.show()
fig2.show()
fig3.show()

*   Del primer gráfico, se observa que las mejoras son a partir del trial = 4. En este punto hay una caida significativa en el valor objetivo.

*   Del segundo gráfico:
  *   Los valores bajos de learning_rate (alrededor de 0.05) tienden a estar asociados con menores valores de error (MAE).

  *   Los valores más altos de max_depth (alrededor de 8-10) están asociados con mejores resultados.
  *   min_frequency muestra que valores más bajos tienden a estar asociados con un menor MAE.
  *   La cantidad de estimadores (n_estimators) no tiene una tendencia clara, pero los valores intermedios a altos (alrededor de 500-800) parecen estar asociados con mejores resultados.

* Del Tercer grafico:

  * min_frequency: Es el hiperparámetro más importante, con una importancia de 0.50. Esto indica que ajustar la frecuencia mínima de las categorías tiene un gran impacto en el rendimiento del modelo.
  * learning_rate: Tiene una importancia de 0.18, lo que sugiere que el ajuste de la tasa de aprendizaje es crucial para el rendimiento del modelo.
  * reg_alpha y max_depth: Ambos tienen una importancia significativa (0.12 y 0.11, respectivamente), lo que indica que la regularización L1 y la profundidad máxima de los árboles son importantes para el modelo.
  * n_estimators y reg_lambda: También tienen una cierta importancia (0.05 y 0.03, respectivamente), aunque menor en comparación con los otros hiperparámetros.







## 6. Síntesis de resultados (0.3)

Finalmente:

1. Genere una tabla resumen del MAE obtenido en los 5 modelos entrenados desde Baseline hasta XGBoost con Constraints, Optuna y Prunning.
2. Compare los resultados de la tabla y responda, ¿qué modelo obtiene el mejor rendimiento?
3. Cargue el mejor modelo, prediga sobre el conjunto de **test** y reporte su MAE.
4. ¿Existen diferencias con respecto a las métricas obtenidas en el conjunto de validación? ¿Porqué puede ocurrir esto?

Resultados:
*   XGBRegressor_Baseline:	2433.32
*   XGBRegressor_Monotonic_Constraint:	2296.99
*   XGBRegressor_Optimizado_Sin_Pruners:	1961.69
*   XGBRegressor_Optimizado_Con_Pruners:	1972.16

El modelo que obtiene el mejor rendimiento es el XGBRegressor_Optimizado_Sin_Pruners con un MAE de 1961.69.
Aunque el modelo con pruning también tiene un rendimiento cercano, no supera al modelo optimizado sin pruning.





In [80]:
# Cargar el mejor modelo
best_model = joblib.load('best_model.pkl')

In [89]:
transformed_test_data = date_transformer.transform(test_data)
X_test = transformed_test_data.drop(columns=['quantity'])
y_test = transformed_test_data['quantity']
# Predecir sobre el conjunto de test
y_test_pred = best_model.predict(X_test)
y_val_pred = best_model.predict(X_val)

# Calcular el MAE en el conjunto de test
mae_eval = mean_absolute_error(y_val, y_val_pred)
mae_test = mean_absolute_error(y_test, y_test_pred)

print(f"MAE en el conjunto de validación: {mae_eval}")
print(f"MAE en el conjunto de test: {mae_test}")

MAE en el conjunto de validación: 1981.524438787674
MAE en el conjunto de test: 1990.437329494282


Las métricas para el conjunto de validación y test son bastante cercanas, pero hay una ligera diferencia.

Aunque el conjunto de validación y el conjunto de test son muestras del mismo conjunto de datos original, pueden tener distribuciones ligeramente diferentes, lo que puede llevar a diferencias en las métricas.

Por otro lado, el proceso de entrenamiento de XGBoost incluye elementos estocásticos como la inicialización de pesos y el muestreo de datos. Aunque estos efectos suelen ser pequeños, pueden contribuir a ligeras diferencias en el rendimiento entre conjuntos.

En conclusión, las diferencias observadas entre las métricas de validación y test son normales y esperadas hasta cierto punto, debido a los factores mencionados anteriormente. La proximidad de las métricas sugiere que el modelo generaliza bien a datos no vistos, aunque siempre existe un margen para mejoras adicionales.

# Conclusión
El proceso de optimización de modelos XGBoost, que incluyó restricciones monotónicas, ajuste de hiperparámetros con Optuna y técnicas de pruning, mejoró significativamente el rendimiento del modelo. XGBRegressor optimizado con pruning, obtuvo un MAE de 1981.52 en validación y 1990.44 en test, demostrando una buena capacidad de generalización. Las visualizaciones destacaron la importancia del hiperparámetro min_frequency. En resumen, el enfoque utilizado resultó efectivo para optimizar modelos de predicción en tareas de regresión.

---------------------


Eso ha sido todo para el lab de hoy, recuerden que el laboratorio tiene un plazo de entrega de una semana. Cualquier duda del laboratorio, no duden en contactarnos por mail o U-cursos.

<p align="center">
  <img src="https://media.tenor.com/8CT1AXElF_cAAAAC/gojo-satoru.gif">
</p>

<a style='text-decoration:none;line-height:16px;display:flex;color:#5B5B62;padding:10px;justify-content:end;' href='https://deepnote.com?utm_source=created-in-deepnote-cell&projectId=87110296-876e-426f-b91d-aaf681223468' target="_blank">
 </img>
Created in <span style='font-weight:600;margin-left:4px;'>Deepnote</span></a>