#### This baseline builds one ARIMA model per product (selected with `auto_arima`), with minimal hyperparameter optimization, so that it can be compared against future experiments. If this approach works well, it would be advisable to add further parameters to the optimization.

#### Imports

In [28]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np

In [29]:
final_dataset = pd.read_csv('../../Datasets/final_dataset.csv', sep='\t')

In [30]:
final_dataset.head()

Unnamed: 0,periodo,product_id,plan_precios_cuidados,cust_request_qty,cust_request_tn,y,cat1,cat2,cat3,brand,sku_size,stock_final,close_quarter,age
0,201701,20001,0,479,937.72717,934.77222,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,0
1,201702,20001,0,432,833.72187,798.0162,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,1
2,201703,20001,0,509,1330.74697,1303.35771,HC,ROPA LAVADO,Liquido,ARIEL,3000,,1,2
3,201704,20001,0,279,1132.9443,1069.9613,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,3
4,201705,20001,0,701,1550.68936,1502.20132,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,4


In [32]:
# Make sure the 'periodo' column is of type string
final_dataset['periodo'] = final_dataset['periodo'].astype(str)
final_dataset = final_dataset[final_dataset['periodo'].str.startswith(('2019', '2018'))]
final_dataset.head()

Unnamed: 0,periodo,product_id,plan_precios_cuidados,cust_request_qty,cust_request_tn,y,cat1,cat2,cat3,brand,sku_size,stock_final,close_quarter,age
12,201801,20001,0,437,1256.01136,1169.07532,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,12
13,201802,20001,0,302,1150.37849,1043.7647,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,13
14,201803,20001,0,591,1902.79056,1856.83534,HC,ROPA LAVADO,Liquido,ARIEL,3000,,1,14
15,201804,20001,0,384,1286.12277,1251.28462,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,15
16,201805,20001,0,456,1303.62129,1293.89788,HC,ROPA LAVADO,Liquido,ARIEL,3000,,0,16
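
Because the filter keeps only 2018 and 2019, each product has at most 24 monthly observations, and some products may have fewer. A minimal sketch of a coverage check, assuming the `product_id` and `periodo` columns shown above; the helper name `period_coverage`, the threshold, and the toy data are hypothetical:

```python
import pandas as pd

def period_coverage(df: pd.DataFrame) -> pd.Series:
    """Count distinct periods per product (hypothetical helper)."""
    return df.groupby('product_id')['periodo'].nunique()

# Toy data standing in for final_dataset (illustrative values only)
toy = pd.DataFrame({
    'product_id': [20001, 20001, 20001, 20002],
    'periodo': ['201801', '201802', '201803', '201801'],
})

coverage = period_coverage(toy)

# Products with fewer periods than an assumed minimum of 3
short_series = coverage[coverage < 3].index.tolist()
print(short_series)
```

On the real data, the threshold would be chosen according to how many observations the model needs to produce a stable fit.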


In [33]:
columns = ['plan_precios_cuidados', 'cust_request_qty', 'cust_request_tn', 'close_quarter','y']

#### Function to create the models
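
The notebook fits the models inline rather than through a function. As a sketch only, the per-product step could be wrapped as follows. The name `fit_and_forecast` and the `fit_fn` parameter are assumptions introduced here, not part of the original notebook; `fit_fn` lets the helper be exercised with a stand-in model when `pmdarima` is not available.

```python
import numpy as np

def fit_and_forecast(y, n_periods=2, fit_fn=None):
    """Fit one model on a univariate series and return (model, forecast).

    fit_fn is a hypothetical injection point: any callable that takes the
    series and returns an object with a predict(n_periods=...) method.
    """
    y = np.asarray(y, dtype=float)
    if fit_fn is None:
        # Imported lazily so the helper can be used without pmdarima installed
        from pmdarima import auto_arima

        def fit_fn(series):
            return auto_arima(series, error_action='ignore', suppress_warnings=True)

    model = fit_fn(y)
    forecast = np.asarray(model.predict(n_periods=n_periods))
    return model, forecast


class MeanModel:
    """Stand-in model that always predicts the training mean."""

    def __init__(self, series):
        self.mean_ = float(np.mean(series))

    def predict(self, n_periods=1):
        return np.full(n_periods, self.mean_)


# Usage with the stand-in model; omit fit_fn to fall back to auto_arima
_, forecast = fit_and_forecast([1.0, 2.0, 3.0], n_periods=2, fit_fn=MeanModel)
print(forecast[-1])
```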

#### Building the models

In [35]:
import os
from pmdarima import auto_arima
import joblib

product_ids = final_dataset['product_id'].unique()
predictions = []
forecast_steps = 2

# Create the output directory once, using the same path the models are saved to
models_dir = 'Models_params'
os.makedirs(models_dir, exist_ok=True)

for product_id in product_ids:
    product_data = final_dataset[final_dataset['product_id'] == product_id].sort_values(by='periodo')[columns]

    # Univariate target series, ordered by period
    product_data_array = product_data['y'].values

    # Note: m is left at its default of 1, so no seasonal terms are fitted even with seasonal=True
    model = auto_arima(product_data_array, seasonal=True, trace=True, error_action='ignore', suppress_warnings=True)

    joblib.dump(model, os.path.join(models_dir, f'model_product_{product_id}.pkl'))

    forecast = model.predict(n_periods=forecast_steps)

    # Keep the last forecast step (two months ahead) as the prediction
    predicted_y = forecast[-1]
    predictions.append({'product_id': product_id, 'predicted_y': predicted_y})

    print(f'Model for product {product_id} trained and saved. 2-month-ahead prediction: {predicted_y}')

Performing stepwise search to minimize aic
 ARIMA(2,0,2)(0,0,0)[0] intercept   : AIC=350.325, Time=0.47 sec
 ARIMA(0,0,0)(0,0,0)[0] intercept   : AIC=343.021, Time=0.01 sec
 ARIMA(1,0,0)(0,0,0)[0] intercept   : AIC=344.984, Time=0.02 sec
 ARIMA(0,0,1)(0,0,0)[0] intercept   : AIC=344.984, Time=0.06 sec
 ARIMA(0,0,0)(0,0,0)[0]             : AIC=421.368, Time=0.01 sec
 ARIMA(1,0,1)(0,0,0)[0] intercept   : AIC=347.209, Time=0.04 sec

Best model:  ARIMA(0,0,0)(0,0,0)[0] intercept
Total fit time: 0.610 seconds
Model for product 20001 trained and saved. 2-month-ahead prediction: 1480.3078595833342
Performing stepwise search to minimize aic
 ARIMA(2,0,2)(0,0,0)[0] intercept   : AIC=345.793, Time=0.19 sec
 ARIMA(0,0,0)(0,0,0)[0] intercept   : AIC=341.578, Time=0.01 sec
 ARIMA(1,0,0)(0,0,0)[0] intercept   : AIC=342.294, Time=0.02 sec
 ARIMA(0,0,1)(0,0,0)[0] intercept   : AIC=341.607, Time=0.08 sec
 ARIMA(0,0,0)(0,0,0)[0]             : AIC=408.977, Time=0.00 sec
 ARIMA(1,0,1)(0,0,0)[0] inter
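
For product 20001 the search selects ARIMA(0,0,0) with an intercept, which is a constant-plus-noise model: its forecast is the fitted intercept at every horizon, and that intercept should be close to the sample mean of the series. A minimal sketch of this naive mean forecast, which can serve as a sanity check against the fitted models; the function name `mean_forecast` and the toy values are hypothetical:

```python
import numpy as np

def mean_forecast(y, n_periods=2):
    """Forecast every future step as the mean of the observed series."""
    y = np.asarray(y, dtype=float)
    return np.full(n_periods, y.mean())

# Toy series standing in for one product's 'y' values (illustrative only)
toy_series = [10.0, 12.0, 14.0]
forecast = mean_forecast(toy_series)
print(forecast[-1])
```

If many products end up with this model, the baseline is effectively a per-product historical average, which is worth keeping in mind when comparing against later experiments.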

In [36]:
predictions_df = pd.DataFrame(predictions)
predictions_df.to_csv('../../Datasets/predictions.csv', index=False)

print('All predictions have been generated and saved to predictions.csv.')

All predictions have been generated and saved to predictions.csv.
