# Project description

Sweet Lift Taxi has collected historical data on taxi orders at airports. To attract more drivers during peak hours, we need to predict the number of taxi orders for the next hour. Build a model for this prediction.

The RMSE (root mean squared error) on the test set must not exceed 48.
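
As a reminder of what the threshold measures, here is a minimal sketch of the metric on made-up numbers. The helper name `rmse` and the toy values are assumptions for illustration and are not taken from the dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equally sized sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy values (hypothetical hourly order counts and predictions)
actual = [100, 120, 90, 110]
predicted = [110, 115, 95, 100]

score = rmse(actual, predicted)
print(f"RMSE: {score:.2f}")
print("Meets the threshold of 48:", score <= 48)
```

Because the errors are squared before averaging, a few badly missed peak hours weigh more heavily than many small misses.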

## Project instructions

1. Download the data and resample it to one-hour intervals.
2. Analyze the data.
3. Train different models with different hyperparameters. The test sample must be 10% of the initial dataset.
4. Evaluate the models on the test sample and provide a conclusion.

## Data description

The data is stored in the file `taxi.csv`.
The number of orders is in the `num_orders` column.

## Library setup

In [10]:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
import lightgbm as lgb
import xgboost as xgb
from sklearn.model_selection import GridSearchCV


## Data preparation

In [11]:
# Load the data
try:
    # Non-Windows path
    data = pd.read_csv('/datasets/taxi.csv')
    print("File loaded successfully from '/datasets/taxi.csv'")
except FileNotFoundError:
    try:
        # Windows path
        data = pd.read_csv('datasets/taxi.csv')
        print("File loaded successfully from 'datasets/taxi.csv'")
    except FileNotFoundError:
        print("Error: the file 'taxi.csv' was not found in any of the specified paths.")
        raise  # stop here: every later cell needs `data`

File loaded successfully from '/datasets/taxi.csv'


## Analysis

In [12]:
# Show the first rows of the dataset
print("First rows of the dataset:")
display(data.head())

# Get summary information about the dataset
# (info() prints its report and returns None, so it is not wrapped in display())
print("Dataset information:")
data.info()

# Describe the dataset to get basic statistics
print("Dataset description:")
display(data.describe(include='all'))

# Check the columns present in the DataFrame
print("Columns in the DataFrame:", data.columns)

First rows of the dataset:


Unnamed: 0,datetime,num_orders
0,2018-03-01 00:00:00,9
1,2018-03-01 00:10:00,14
2,2018-03-01 00:20:00,28
3,2018-03-01 00:30:00,20
4,2018-03-01 00:40:00,32


Dataset information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26496 entries, 0 to 26495
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   datetime    26496 non-null  object
 1   num_orders  26496 non-null  int64 
dtypes: int64(1), object(1)
memory usage: 414.1+ KB


Dataset description:


Unnamed: 0,datetime,num_orders
count,26496,26496.0
unique,26496,
top,2018-05-21 14:50:00,
freq,1,
mean,,14.070463
std,,9.21133
min,,0.0
25%,,8.0
50%,,13.0
75%,,19.0


Columns in the DataFrame: Index(['datetime', 'num_orders'], dtype='object')
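
Beyond the summary statistics, a time series is usually inspected for trend and daily seasonality. The following is a minimal sketch of that kind of check. It runs on a synthetic hourly series with made-up values standing in for the order counts; the variable names are assumptions and this is not part of the original analysis.

```python
import numpy as np
import pandas as pd

# Synthetic hourly series with a daily cycle and a slow upward trend
# (made-up values, not taken from taxi.csv)
idx = pd.date_range("2018-03-01", periods=24 * 14, freq="h")
hours = idx.hour.to_numpy()
values = 50 + 0.1 * np.arange(len(idx)) + 20 * np.sin(2 * np.pi * hours / 24)
series = pd.Series(values, index=idx, name="num_orders")

# A 24-hour rolling mean averages out the daily cycle and exposes the trend
trend = series.rolling(24).mean()

# The mean per hour of day exposes the within-day seasonality
by_hour = series.groupby(series.index.hour).mean()

print(trend.dropna().iloc[[0, -1]])
print("Busiest hour:", by_hour.idxmax(), "| quietest hour:", by_hour.idxmin())
```

On the real data the same two summaries could be plotted with `matplotlib`, which the notebook already imports.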


## Data preparation and resampling

In [13]:
# Convert the date column to datetime type
data['datetime'] = pd.to_datetime(data['datetime'])

# Resample the data to one-hour intervals
data_resampled = data.resample('H', on='datetime').sum().reset_index()

display(data_resampled.head())

Unnamed: 0,datetime,num_orders
0,2018-03-01 00:00:00,124
1,2018-03-01 01:00:00,85
2,2018-03-01 02:00:00,71
3,2018-03-01 03:00:00,66
4,2018-03-01 04:00:00,43
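
To make the resampling step concrete, here is a small sketch on a toy 10-minute series. The values are made up. The sketch sets the timestamp as the index and uses the lowercase `"1h"` alias, which newer pandas releases prefer over `'H'`; both choices are assumptions that differ slightly from the cell above.

```python
import pandas as pd

# Toy 10-minute series covering two hours (made-up values)
toy = pd.DataFrame({
    "datetime": pd.date_range("2018-03-01 00:00", periods=12, freq="10min"),
    "num_orders": [1, 2, 3, 4, 5, 6, 10, 10, 10, 10, 10, 10],
})
toy = toy.set_index("datetime").sort_index()

# Sanity check: the timestamps are in chronological order
print("Chronological:", toy.index.is_monotonic_increasing)

# Sum the six 10-minute counts that fall into each hour
hourly = toy.resample("1h").sum()
print(hourly)
```

Summing (rather than averaging) keeps the target in units of orders per hour, which is what the task asks to predict.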


## Dataset split

In [14]:
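# Split into features (X) and target (y)
# Note: after reset_index() the index is the row position (0, 1, 2, ...), so the
# single feature is the hour's ordinal position, not the timestamp itself
X = data_resampled.index.astype(int).values.reshape(-1, 1)
y = data_resampled['num_orders'].values

# Split the data into training and test sets (10% for testing)
# Note: train_test_split shuffles rows by default, so the test hours are
# interleaved with the training hours; pass shuffle=False for a chronological split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)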
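
Because the orders form a time series, a common alternative to a shuffled split on the row position is a chronological split with calendar and lag features. The sketch below illustrates the idea on synthetic data. The helper `make_features`, its parameters, and the data are hypothetical, and this is not the method used in this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic hourly series (made-up values standing in for num_orders)
idx = pd.date_range("2018-03-01", periods=200, freq="h")
df = pd.DataFrame({"num_orders": np.arange(200) % 24 + 50}, index=idx)

def make_features(frame, max_lag=3, rolling_window=6):
    """Hypothetical helper: add calendar, lag and rolling-mean features."""
    out = frame.copy()
    out["hour"] = out.index.hour
    out["dayofweek"] = out.index.dayofweek
    for lag in range(1, max_lag + 1):
        out[f"lag_{lag}"] = out["num_orders"].shift(lag)
    # shift() first so the rolling mean never includes the current target
    out["rolling_mean"] = out["num_orders"].shift().rolling(rolling_window).mean()
    # The first rows have no history yet, so they are dropped
    return out.dropna()

features = make_features(df)
X_lag = features.drop(columns="num_orders")
y_lag = features["num_orders"]

# shuffle=False keeps the most recent 10% of hours as the test set
Xl_train, Xl_test, yl_train, yl_test = train_test_split(
    X_lag, y_lag, test_size=0.1, shuffle=False
)
print(Xl_train.index.max(), "<", Xl_test.index.min())
```

With this layout every test hour lies strictly after every training hour, which mirrors how the model would be used to forecast the next hour.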

## Model training

In [15]:
# Define the hyperparameter grid to tune for Random Forest
param_grid_rf = {
    'n_estimators': [100, 200, 300],
    'max_depth': [10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Create the Random Forest model
rf_model = RandomForestRegressor(random_state=42)

# Set up the hyperparameter search
grid_search_rf = GridSearchCV(estimator=rf_model, param_grid=param_grid_rf, cv=3, scoring='neg_mean_squared_error', verbose=2, n_jobs=-1)

# Fit the search on the training set
grid_search_rf.fit(X_train, y_train)

# Get the best hyperparameters
best_params_rf = grid_search_rf.best_params_
print("Best parameters found for Random Forest: ", best_params_rf)

# Evaluate the model with the best hyperparameters on the test set
best_rf_model = grid_search_rf.best_estimator_
y_pred_rf = best_rf_model.predict(X_test)
rmse_rf = np.sqrt(mean_squared_error(y_test, y_pred_rf))
print(f'Random Forest RMSE with best parameters: {rmse_rf}')



Fitting 3 folds for each of 81 candidates, totalling 243 fits
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.4s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.3s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.3s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   0.7s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   0.7s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   0.7s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.0s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.0s
[CV] END max_depth=10, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.0s
[CV] END max_depth=10, min_sa

[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.5s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.5s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.5s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   1.0s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   0.9s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   0.9s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.4s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.4s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.4s
[CV] END max_depth=20, min_samples_leaf=1, min_samples_split=5, n_estimators=100; total tim

[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.5s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.5s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=100; total time=   0.5s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   1.0s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   1.0s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=200; total time=   1.0s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.5s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.5s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=2, n_estimators=300; total time=   1.5s
[CV] END max_depth=30, min_samples_leaf=1, min_samples_split=5, n_estimators=100; total tim

Best parameters found for Random Forest:  {'max_depth': 20, 'min_samples_leaf': 1, 'min_samples_split': 5, 'n_estimators': 300}
Random Forest RMSE with best parameters: 30.43997666766964
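
The search above uses `cv=3`, which for a regressor means three unshuffled K-fold splits, so some folds validate on hours that precede part of their training data. One possible alternative for time series is scikit-learn's `TimeSeriesSplit`, where each validation fold comes after its training fold. The sketch below uses synthetic data and a deliberately small grid; both are assumptions chosen to keep it quick to run.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic data: hour-of-day and hour-of-week stand-ins as features
rng = np.random.default_rng(0)
n = 300
t = np.arange(n)
X_demo = np.column_stack([t % 24, t % 168])
y_demo = 50 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, n)

# Each split trains on an initial segment and validates on the hours after it
tscv = TimeSeriesSplit(n_splits=3)
for train_idx, val_idx in tscv.split(X_demo):
    print(f"train: 0..{train_idx.max()}  validate: {val_idx.min()}..{val_idx.max()}")

search_demo = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid={"n_estimators": [20, 50], "max_depth": [5, 10]},
    cv=tscv,
    scoring="neg_root_mean_squared_error",  # reports RMSE directly (negated)
)
search_demo.fit(X_demo, y_demo)
print("Best parameters:", search_demo.best_params_)
print("Cross-validated RMSE:", -search_demo.best_score_)
```

Using `neg_root_mean_squared_error` avoids taking the square root by hand when comparing the cross-validated score with the project threshold.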


In [16]:
# Define the hyperparameter grid to tune for LightGBM
# Note: in LightGBM, subsample only takes effect when subsample_freq > 0;
# with the default subsample_freq=0 both subsample values train the same model
param_grid_lgb = {
    'num_leaves': [31, 50, 70],
    'learning_rate': [0.01, 0.05, 0.1],
    'n_estimators': [100, 200, 300],
    'max_depth': [-1, 10, 20],
    'min_child_samples': [20, 50, 100],
    'subsample': [0.8, 1.0]
}

# Create the LightGBM model
lgb_model = lgb.LGBMRegressor(random_state=42)

# Set up the hyperparameter search
grid_search_lgb = GridSearchCV(estimator=lgb_model, param_grid=param_grid_lgb, cv=3, scoring='neg_mean_squared_error', verbose=2, n_jobs=-1)

# Fit the search on the training set
grid_search_lgb.fit(X_train, y_train)

# Get the best hyperparameters
best_params_lgb = grid_search_lgb.best_params_
print("Best parameters found for LightGBM: ", best_params_lgb)

# Evaluate the model with the best hyperparameters on the test set
best_lgb_model = grid_search_lgb.best_estimator_
y_pred_lgb = best_lgb_model.predict(X_test)
rmse_lgb = np.sqrt(mean_squared_error(y_test, y_pred_lgb))
print(f'LightGBM RMSE with best parameters: {rmse_lgb}')

Fitting 3 folds for each of 486 candidates, totalling 1458 fits
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.4s
[CV] END lea

[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.4s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=50, n_estimator

[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.1s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=-1, min_child_samples=100, n_e

[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.8s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=0.8; total time=   1.2s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimators=200, num_leaves=70, subsample=0.8; total time=   1.0s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=20, n_estimator

[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=50, n_estimator

[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.01, max_depth=10, min_child_samples=100, n_e

[CV] END learning_rate=0.01, max_depth=20, min_child_samples=20, n_estimators=300, num_leaves=70, subsample=0.8; total time=   1.6s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=20, n_estimators=300, num_leaves=70, subsample=0.8; total time=   1.5s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=20, n_estimators=300, num_leaves=70, subsample=1.0; total time=   1.5s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=20, n_estimators=300, num_leaves=70, subsample=1.0; total time=   1.5s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=20, n_estimators=300, num_leaves=70, subsample=1.0; total time=   2.0s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=50, n_estimator

[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.1s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.1s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.01, max_depth=20, min_child_samples=100, n_e

[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=20, n_estimator

[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=50, n_estimator

[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=-1, min_child_samples=100, n_e

[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.9s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=20, n_estimator

[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=70, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=70, subsample=1.0; total time=   0.8s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=70, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=50, n_estimators=300, num_leaves=70, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=10, min_child_samples=100, n_estimato

[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=20, n_estimator

[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=50, n_estimator

[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.4s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.05, max_depth=20, min_child_samples=100, n_e

[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=1.0; total time=   1.1s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=70, subsample=0.8; total time=   1.2s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=70, subsample=0.8; total time=   1.1s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, num_leaves=70, subsample=0.8; total time=   1.3s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=20, n_estimators=200, n

[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=0.8; total time=   1.1s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.8s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=50, n_estimators=300, n

[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=70, subsample=0.8; total time=   0.7s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimators=300, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=-1, min_child_samples=100, n_estimator

[CV] END learning_rate=0.1, max_depth=10, min_child_samples=20, n_estimators=300, num_leaves=70, subsample=1.0; total time=   0.9s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.1s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=0.8; total time=   0.3s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, num_leaves=31, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=50, n_estimators=100, n

[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=0.8; total time=   0.1s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=50, subsample=1.0; total time=   0.1s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.1s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimators=100, num_leaves=70, subsample=0.8; total time=   0.2s
[CV] END learning_rate=0.1, max_depth=10, min_child_samples=100, n_estimator

[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=100, num_leaves=70, subsample=1.0; total time=   0.4s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, num_leaves=31, subsample=1.0; total time=   0.6s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=20, n_estimators=200, n

[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=50, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=0.8; total time=   0.6s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, num_leaves=70, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=50, n_estimators=200, n

[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=200, num_leaves=70, subsample=1.0; total time=   0.3s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=0.8; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.7s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimators=300, num_leaves=31, subsample=1.0; total time=   0.5s
[CV] END learning_rate=0.1, max_depth=20, min_child_samples=100, n_estimator

In [17]:
# Define the hyperparameter grid to search for XGBoost
param_grid_xgb = {
    'n_estimators': [100, 200, 300],
    'learning_rate': [0.01, 0.05, 0.1],
    'max_depth': [3, 6, 9],
    'min_child_weight': [1, 3, 5],
    'subsample': [0.8, 1.0]
}

# Create the XGBoost model
xgb_model = xgb.XGBRegressor(random_state=42)

# Set up the hyperparameter search (3-fold CV, scored by negative MSE)
grid_search_xgb = GridSearchCV(
    estimator=xgb_model,
    param_grid=param_grid_xgb,
    cv=3,
    scoring='neg_mean_squared_error',
    verbose=2,
    n_jobs=-1,
)

# Fit the search on the training set
grid_search_xgb.fit(X_train, y_train)

# Get the best hyperparameters
best_params_xgb = grid_search_xgb.best_params_
print("Best parameters found for XGBoost: ", best_params_xgb)

# Evaluate the best model on the test set
best_xgb_model = grid_search_xgb.best_estimator_
y_pred_xgb = best_xgb_model.predict(X_test)
rmse_xgb = np.sqrt(mean_squared_error(y_test, y_pred_xgb))
print(f'XGBoost RMSE with best parameters: {rmse_xgb}')

Fitting 3 folds for each of 162 candidates, totalling 486 fits
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   0.9s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   1.4s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   1.0s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   0.9s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.0s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   0.9s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   1.9s
[CV] END learning_rate=0.01, max_depth=3, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   1.9s
[CV] END learning_rate=0.

[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=100, subsample=0.8; total time=   1.2s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=100, subsample=0.8; total time=   1.1s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=100, subsample=0.8; total time=   1.2s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=100, subsample=1.0; total time=   1.2s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=100, subsample=1.0; total time=   1.2s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=100, subsample=1.0; total time=   1.4s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=200, subsample=0.8; total time=   2.4s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=200, subsample=0.8; total time=   2.3s
[CV] END learning_rate=0.01, max_depth=6, min_child_weight=3, n_estimators=200, subsampl

[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=100, subsample=0.8; total time=   1.4s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=100, subsample=0.8; total time=   1.3s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=100, subsample=0.8; total time=   1.4s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=100, subsample=1.0; total time=   1.7s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=100, subsample=1.0; total time=   1.3s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=100, subsample=1.0; total time=   1.3s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=200, subsample=0.8; total time=   2.7s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=200, subsample=0.8; total time=   2.7s
[CV] END learning_rate=0.01, max_depth=9, min_child_weight=5, n_estimators=200, subsampl

[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   1.2s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   1.2s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   1.8s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.7s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.3s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.1s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   2.2s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   2.4s
[CV] END learning_rate=0.05, max_depth=6, min_child_weight=1, n_estimators=200, subsampl

[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=100, subsample=0.8; total time=   1.8s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=100, subsample=0.8; total time=   1.4s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=100, subsample=0.8; total time=   1.4s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=100, subsample=1.0; total time=   1.5s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=100, subsample=1.0; total time=   1.3s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=100, subsample=1.0; total time=   1.4s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=200, subsample=0.8; total time=   3.2s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=200, subsample=0.8; total time=   2.8s
[CV] END learning_rate=0.05, max_depth=9, min_child_weight=3, n_estimators=200, subsampl

[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=100, subsample=0.8; total time=   1.0s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=100, subsample=0.8; total time=   1.0s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=100, subsample=1.0; total time=   0.9s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=100, subsample=1.0; total time=   1.3s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=100, subsample=1.0; total time=   1.0s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=200, subsample=0.8; total time=   1.9s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=200, subsample=0.8; total time=   1.8s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=200, subsample=0.8; total time=   1.9s
[CV] END learning_rate=0.1, max_depth=3, min_child_weight=5, n_estimators=200, subsample=1.0; to

[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=100, subsample=0.8; total time=   1.4s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.5s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.8s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=100, subsample=1.0; total time=   1.5s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   2.8s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   2.8s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=200, subsample=0.8; total time=   3.2s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=200, subsample=1.0; total time=   2.9s
[CV] END learning_rate=0.1, max_depth=9, min_child_weight=1, n_estimators=200, subsample=1.0; to
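
The search above passes `cv=3`, which for a regressor means three contiguous, unshuffled folds. Assuming the rows of `X_train` are in chronological order, some of those folds validate on hours that come before part of their training data. The sketch below is one possible alternative, not the method used in this notebook: it builds a time-ordered splitter with scikit-learn's `TimeSeriesSplit`, confirms the fit count shown in the log (162 candidates, 486 fits), and prints the fold boundaries on a toy index. The 12-row array is a hypothetical stand-in for the hourly data.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid, TimeSeriesSplit

# Same grid as in the XGBoost cell above.
param_grid_xgb = {
    'n_estimators': [100, 200, 300],
    'learning_rate': [0.01, 0.05, 0.1],
    'max_depth': [3, 6, 9],
    'min_child_weight': [1, 3, 5],
    'subsample': [0.8, 1.0],
}

# Time-ordered splitter: each fold trains on earlier rows
# and validates on the block that immediately follows them.
tscv = TimeSeriesSplit(n_splits=3)

# Number of fits = grid combinations x number of folds.
n_candidates = len(ParameterGrid(param_grid_xgb))
total_fits = n_candidates * tscv.get_n_splits()
print(f"{n_candidates} candidates x {tscv.get_n_splits()} folds = {total_fits} fits")

# Toy stand-in for the hourly rows (assumption: rows are sorted by time).
toy_rows = np.arange(12).reshape(-1, 1)
folds = list(tscv.split(toy_rows))
for train_idx, valid_idx in folds:
    print("train:", train_idx.tolist(), "validate:", valid_idx.tolist())
```

To try this in the search itself, the splitter can be passed as `cv=tscv` to `GridSearchCV`. The selected hyperparameters and scores may then differ from the ones reported in this notebook.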

## Evaluation and Conclusion

In [18]:
results = {
    "Model": ["Random Forest", "LightGBM", "XGBoost"],
    "RMSE": [rmse_rf, rmse_lgb, rmse_xgb]
}

results_df = pd.DataFrame(results)
display(results_df)

|   | Model | RMSE |
|---|---|---|
| 0 | Random Forest | 30.439977 |
| 1 | LightGBM | 37.487693 |
| 2 | XGBoost | 33.013619 |


Findings:<br><br>
Random Forest:<br>
RMSE: 30.44 <br>
This model has the lowest RMSE of the models evaluated, which indicates the best predictive performance on the current dataset. This may be due to the random forest's ability to handle variability and reduce overfitting by aggregating many decision trees.

<br>
LightGBM:<br>
RMSE: 37.49<br>
LightGBM has the highest RMSE of the three models, but it is still well below the project threshold of 48. Its ability to handle large datasets and its training speed make it a viable option, especially when a balance between performance and efficiency is needed.

<br><br>
XGBoost:<br>
RMSE: 33.01<br>
XGBoost falls between the other two models: its RMSE is about 2.6 higher than Random Forest's and about 4.5 lower than LightGBM's.

<br><br>
Random Forest is the best option in terms of RMSE. We recommend using this model in production to predict taxi orders.
LightGBM and XGBoost are also solid options and could be preferred where prediction speed is crucial or where a balance between performance and training time is needed.
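
The comparison above can be reproduced as a short check. This is a minimal sketch that hard-codes the RMSE values from the results table and the threshold of 48 from the project description. The names `reported_rmse` and `RMSE_THRESHOLD` are introduced here for illustration only.

```python
# RMSE values as reported in the results table above.
reported_rmse = {
    "Random Forest": 30.439977,
    "LightGBM": 37.487693,
    "XGBoost": 33.013619,
}

# Project requirement: the test-set RMSE must not exceed 48.
RMSE_THRESHOLD = 48

# Rank the models from lowest to highest RMSE.
ranking = sorted(reported_rmse.items(), key=lambda item: item[1])

for name, rmse in ranking:
    status = "meets" if rmse <= RMSE_THRESHOLD else "misses"
    print(f"{name}: RMSE {rmse:.2f} ({status} the threshold of {RMSE_THRESHOLD})")

best_model, best_rmse = ranking[0]
print(f"Lowest RMSE: {best_model} ({best_rmse:.2f})")
```

In the notebook itself, the same check could read the values from `results_df` instead of a hard-coded dictionary.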