## 🧱 1. Building the Dataset

The starting point is the original sell-in sales dataset.

Granularity: `<product_id, periodo>`

Transformation:
- Group by `product_id` and `periodo`
- Sum `tn` (tons)


In [37]:
import pandas as pd

# Load the sell-in file (adjust the path if needed)
df_raw = pd.read_csv("sell-in.txt", delimiter='\t')

# Group by product_id and periodo
df_agg = (
    df_raw.groupby(['product_id', 'periodo'], as_index=False)
           .agg({'tn': 'sum'})
           .sort_values(['product_id', 'periodo'])
)

df_agg.head()


Unnamed: 0,product_id,periodo,tn
0,20001,201701,934.77222
1,20001,201702,798.0162
2,20001,201703,1303.35771
3,20001,201704,1069.9613
4,20001,201705,1502.20132


In [38]:
df_agg.shape

(31243, 3)

## 🧮 2. Computing the Class (target)

A new field, `tn+2` (the class, `clase`), holds the value of `tn` at `periodo + 2`.

Notes:
- It is computed per `product_id` by shifting `tn` two rows (`shift(-2)`). This matches a merge shifted by two periods only when a product has one row for every consecutive period.
- Periods `201911` and `201912` have no target (NaN).
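
A merge on the period index is one way to build the same target without relying on every product having consecutive rows. The sketch below uses a hypothetical toy frame (`toy`, `target_shift` and `target_merge` are illustrative names, not from the notebook) in which product 1 has no row for month 3, and compares the row-based shift with a merge on `mes_abs + 2`.

```python
import pandas as pd

# Toy data (hypothetical): product 1 has no row for month 3.
toy = pd.DataFrame({
    "product_id": [1, 1, 1, 1, 2, 2, 2],
    "mes_abs":    [1, 2, 4, 5, 1, 2, 3],
    "tn":         [10.0, 11.0, 13.0, 14.0, 20.0, 21.0, 22.0],
})

# Row-based target: the value two ROWS ahead within each product.
toy["target_shift"] = toy.groupby("product_id")["tn"].shift(-2)

# Merge-based target: the value whose mes_abs is exactly two months ahead.
future = toy[["product_id", "mes_abs", "tn"]].copy()
future["mes_abs"] = future["mes_abs"] - 2   # align month m+2 with month m
future = future.rename(columns={"tn": "target_merge"})
toy = toy.merge(future, on=["product_id", "mes_abs"], how="left")

print(toy)
```

For product 1 at month 1, the row-based shift returns the value of month 4, while the merge leaves the target empty because month 3 has no row. Whether such gaps exist in the sell-in data is not shown here, so this is a check worth running rather than a confirmed problem.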


In [39]:
# Make sure periodo is an integer
df_agg['periodo'] = df_agg['periodo'].astype(int)

# Build a time index: mes_abs (absolute month)
periodos_ordenados = sorted(df_agg['periodo'].unique())
map_periodo_to_mesabs = {p: i + 1 for i, p in enumerate(periodos_ordenados)}

# Add the mes_abs column
df_agg['mes_abs'] = df_agg['periodo'].map(map_periodo_to_mesabs)

# Sort by product and time
df_agg = df_agg.sort_values(['product_id', 'mes_abs'])

# Create tn+2 (the class): tn two rows ahead within each product
df_agg['tn+2'] = df_agg.groupby('product_id')['tn'].shift(-2)

# Inspect the result
df_agg.head(40)


Unnamed: 0,product_id,periodo,tn,mes_abs,tn+2
0,20001,201701,934.77222,1,1303.35771
1,20001,201702,798.0162,2,1069.9613
2,20001,201703,1303.35771,3,1502.20132
3,20001,201704,1069.9613,4,1520.06539
4,20001,201705,1502.20132,5,1030.67391
5,20001,201706,1520.06539,6,1267.39462
6,20001,201707,1030.67391,7,1316.94604
7,20001,201708,1267.39462,8,1439.75563
8,20001,201709,1316.94604,9,1580.47401
9,20001,201710,1439.75563,10,1049.3886


## 🛠️ 3. Feature Engineering: Lags

23 lagged `tn` columns (`tn_1` to `tn_23`) are generated per `product_id`.

Each row then carries the full 24-month history when it is available:
- `tn`, `tn_1`, ..., `tn_23`

No other features are generated.
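
To make the lag construction concrete, here is a minimal sketch on hypothetical toy data, using two lags for brevity. It follows the same `groupby(...).shift(lag)` pattern as the notebook cell; the frame `toy` and its values are made up for illustration.

```python
import pandas as pd

# Toy data (hypothetical): two products, already sorted by time.
toy = pd.DataFrame({
    "product_id": [1, 1, 1, 2, 2],
    "tn":         [10.0, 11.0, 12.0, 50.0, 51.0],
})

# Same pattern as the notebook, with 2 lags instead of the full set.
for lag in range(1, 3):
    toy[f"tn_{lag}"] = toy.groupby("product_id")["tn"].shift(lag)

print(toy)
```

The first rows of each product come out as NaN because there is no earlier history, and product 2 never receives values from product 1 since the shift is applied within each group.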


In [40]:
# Create lags tn_1 to tn_23
for lag in range(1, 24):
    df_agg[f'tn_{lag}'] = df_agg.groupby('product_id')['tn'].shift(lag)

df_agg.head(40)


Unnamed: 0,product_id,periodo,tn,mes_abs,tn+2,tn_1,tn_2,tn_3,tn_4,tn_5,...,tn_14,tn_15,tn_16,tn_17,tn_18,tn_19,tn_20,tn_21,tn_22,tn_23
0,20001,201701,934.77222,1,1303.35771,,,,,,...,,,,,,,,,,
1,20001,201702,798.0162,2,1069.9613,934.77222,,,,,...,,,,,,,,,,
2,20001,201703,1303.35771,3,1502.20132,798.0162,934.77222,,,,...,,,,,,,,,,
3,20001,201704,1069.9613,4,1520.06539,1303.35771,798.0162,934.77222,,,...,,,,,,,,,,
4,20001,201705,1502.20132,5,1030.67391,1069.9613,1303.35771,798.0162,934.77222,,...,,,,,,,,,,
5,20001,201706,1520.06539,6,1267.39462,1502.20132,1069.9613,1303.35771,798.0162,934.77222,...,,,,,,,,,,
6,20001,201707,1030.67391,7,1316.94604,1520.06539,1502.20132,1069.9613,1303.35771,798.0162,...,,,,,,,,,,
7,20001,201708,1267.39462,8,1439.75563,1030.67391,1520.06539,1502.20132,1069.9613,1303.35771,...,,,,,,,,,,
8,20001,201709,1316.94604,9,1580.47401,1267.39462,1030.67391,1520.06539,1502.20132,1069.9613,...,,,,,,,,,,
9,20001,201710,1439.75563,10,1049.3886,1316.94604,1267.39462,1030.67391,1520.06539,1502.20132,...,,,,,,,,,,


## 🎯 4. Training Dataset

Only period `201812` (`mes_abs == 24`) is selected.

Strategic subset: the code below trains on the products in `Prods12Meses` that have complete data (491 rows), not on the 33 "magic" `product_id`s, whose list is defined but unused.

Fields used:
- Input: `tn`, `tn_1`, ..., `tn_23`
- Target: `tn+2` (month `201902`)
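
The correspondence between `periodo` and `mes_abs` is easy to get wrong by a year, so a quick check helps. This sketch assumes the data covers every month from 201701 to 201912 (36 periods), which is consistent with the notebook treating 201912 as `mes_abs == 36`; the names `to_mes_abs` and `to_periodo` are illustrative.

```python
import pandas as pd

# Assumption: one period per month from 201701 to 201912, with no gaps.
periods = [
    int(p.strftime("%Y%m"))
    for p in pd.period_range("2017-01", "2019-12", freq="M")
]

# Same construction as map_periodo_to_mesabs in the notebook.
to_mes_abs = {p: i + 1 for i, p in enumerate(periods)}
to_periodo = {m: p for p, m in to_mes_abs.items()}

train_month = to_mes_abs[201812]             # training row
target_period = to_periodo[train_month + 2]  # period its target comes from

print(train_month, target_period, to_mes_abs[201912])
```

Under that assumption the training row is `mes_abs == 24` and its target comes from `201902`, while the prediction row (`201912`, `mes_abs == 36`) points at `202002`, which lies outside the data.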


In [41]:
# List of magic product_ids (defined but not used below)
magicos = [20002, 20003, 20006, 20010, 20011, 20018, 20019, 20021,
   20026, 20028, 20035, 20039, 20042, 20044, 20045, 20046, 20049,
   20051, 20052, 20053, 20055, 20008, 20001, 20017, 20086, 20180,
   20193, 20320, 20532, 20612, 20637, 20807, 20838]


Prods12Meses = [20004,20005,20007,20009,20012,20013,20014,20015,20016,20020,20022,20023,20024,20025,20027,20029,20030,20031,20033,20037,20038,20041,20043,20047,20048,20050,20054,20056,20057,20058,20059,20061,20062,20063,20065,20066,20067,20068,20069,20070,20071,20072,20073,20074,20075,20076,20077,20078,20079,20080,20081,20082,20084,20085,20087,20088,20089,20090,20091,20092,20093,20094,20095,20096,20097,20099,20100,20101,20102,20103,20105,20106,20107,20108,20109,20111,20112,20113,20114,20116,20117,20118,20119,20120,20121,20122,20123,20124,20125,20126,20128,20129,20130,20132,20133,20134,20135,20136,20137,20138,20139,20140,20142,20143,20144,20145,20146,20148,20149,20150,20151,20152,20153,20155,20157,20158,20159,20160,20161,20162,20163,20164,20165,20166,20167,20168,20169,20170,20171,20173,20175,20176,20177,20178,20179,20181,20182,20183,20184,20185,20186,20187,20188,20189,20190,20191,20192,20194,20196,20197,20198,20200,20201,20202,20203,20205,20206,20207,20208,20209,20211,20212,20215,20216,20217,20218,20219,20220,20222,20224,20225,20226,20227,20228,20229,20230,20231,20232,20233,20234,20235,20237,20238,20239,20240,20241,20242,20244,20246,20249,20250,20251,20252,20253,20254,20255,20256,20259,20262,20263,20264,20265,20266,20267,20268,20269,20270,20271,20272,20273,20275,20276,20277,20278,20280,20281,20282,20283,20284,20285,20288,20289,20290,20291,20292,20295,20296,20297,20298,20299,20300,20301,20302,20303,20304,20305,20306,20307,20308,20309,20310,20311,20312,20313,20314,20315,20316,20317,20319,20321,20322,20323,20324,20325,20326,20327,20328,20329,20330,20332,20334,20335,20336,20337,20338,20340,20341,20342,20343,20344,20345,20346,20348,20349,20350,20351,20352,20353,20354,20356,20357,20358,20359,20360,20361,20362,20364,20365,20366,20367,20368,20372,20375,20376,20377,20378,20379,20380,20381,20382,20383,20384,20385,20386,20387,20388,20389,20390,20394,20395,20396,20398,20399,20400,20401,20402,20403,20404,20406,20407,20408,20409,20410,20411,20412,20413,20415,20416,20417,20418,
20419,20421,20422,20424,20426,20428,20429,20432,20433,20434,20435,20438,20443,20447,20449,20450,20453,20454,20456,20459,20460,20463,20464,20465,20466,20469,20470,20471,20473,20474,20477,20478,20479,20480,20481,20482,20483,20484,20485,20488,20490,20495,20496,20497,20500,20501,20502,20503,20505,20507,20508,20509,20512,20513,20514,20517,20520,20522,20523,20524,20527,20530,20536,20538,20539,20540,20541,20542,20544,20546,20547,20549,20551,20552,20553,20555,20556,20558,20559,20561,20563,20564,20565,20567,20568,20569,20570,20571,20572,20574,20576,20578,20579,20580,20583,20585,20586,20588,20589,20594,20595,20597,20599,20600,20601,20602,20604,20605,20606,20609,20611,20614,20617,20622,20624,20627,20628,20629,20632,20636,20638,20639,20640,20642,20644,20645,20646,20647,20651,20652,20653,20654,20655,20657,20658,20660,20661,20663,20664,20666,20667,20669,20670,20672,20676,20677,20678,20679,20680,20682,20684,20685,20689,20693,20696,20697,20699,20700,20701,20702,20705,20706,20708,20709,20710,20712,20713,20714,20715,20724,20725,20729,20730,20733,20735,20737,20739,20741,20742,20743,20744,20745,20749,20750,20751,20756,20758,20759,20761,20763,20765,20768,20771,20773,20775,20777,20778,20780,20781,20783,20786,20788,20789,20793,20796,20798,20800,20801,20802,20803,20809,20810,20811,20812,20817,20818,20820,20821,20823,20824,20826,20830,20831,20832,20835,20836,20840,20843,20846,20847,20849,20850,20852,20853,20855,20862,20863,20864,20865,20870,20873,20874,20877,20878,20882,20883,20885,20892,20894,20901,20902,20906,20908,20913,20914,20917,20922,20925,20931,20935,20937,20941,20945,20947,20948,20949,20951,20952,20956,20957,20960,20961,20965,20967,20970,20973,20974,20976,20977,20981,20982,20985,20986,20990,20991,20994,20996,20997,21001,21003,21005,21008,21013,21014,21016,21022,21024,21027,21028,21032,21034,21037,21038,21040,21048,21049,21055,21057,21063,21065,21071,21077,21080,21084,21088,21093,21102,21105,21118,21124,21126,21131,21133,21142,21155,21156,21157,21164,21167,21170,21176,21180,21181,21184,
21191,21192,21194,21195,21201,21207,21209,21212,21218,21224,21226,21233,21244,21245,21255,21257,21271]

Prods24Meses = [20004,20005,20007,20009,20012,20013,20014,20015,20016,20020,20022,20023,20024,20025,20027,20029,20030,20031,20033,20037,20038,20041,20043,20047,20048,20050,20054,20056,20057,20058,20059,20061,20062,20063,20065,20066,20067,20068,20069,20070,20071,20072,20073,20074,20075,20076,20077,20078,20079,20080,20081,20082,20084,20087,20088,20090,20091,20092,20093,20094,20095,20096,20097,20099,20100,20101,20102,20103,20105,20106,20107,20108,20109,20111,20112,20113,20114,20116,20117,20118,20119,20120,20121,20122,20123,20124,20125,20128,20129,20132,20133,20134,20137,20138,20139,20140,20142,20144,20145,20146,20148,20149,20151,20152,20153,20155,20157,20158,20160,20161,20162,20163,20164,20165,20166,20167,20168,20169,20171,20173,20175,20176,20177,20178,20179,20181,20182,20183,20184,20185,20187,20188,20189,20190,20191,20194,20196,20197,20198,20200,20201,20202,20203,20205,20206,20207,20208,20209,20211,20212,20215,20216,20217,20218,20219,20220,20222,20224,20225,20226,20227,20228,20230,20231,20232,20233,20234,20235,20238,20239,20240,20241,20242,20244,20246,20249,20250,20251,20252,20253,20254,20255,20256,20259,20263,20264,20265,20267,20268,20269,20270,20271,20272,20273,20275,20276,20277,20278,20280,20281,20282,20283,20284,20285,20288,20289,20290,20291,20292,20295,20296,20297,20298,20299,20300,20301,20302,20303,20304,20305,20306,20307,20308,20309,20310,20311,20312,20313,20314,20315,20316,20317,20319,20321,20322,20323,20324,20325,20326,20327,20328,20329,20330,20332,20334,20335,20336,20337,20338,20340,20341,20342,20343,20344,20346,20348,20349,20350,20352,20353,20354,20356,20357,20358,20359,20360,20361,20362,20365,20366,20367,20372,20375,20376,20377,20379,20380,20381,20382,20383,20384,20385,20386,20387,20388,20390,20394,20396,20398,20399,20400,20401,20402,20403,20404,20406,20407,20409,20410,20411,20412,20413,20415,20416,20417,20418,20419,20421,20422,20424,20428,20429,20432,20433,20434,20435,20438,20443,20447,20449,20450,20453,20454,20463,20464,20465,20466,20469,20470,20471,
20473,20474,20478,20479,20480,20482,20483,20484,20485,20490,20496,20497,20500,20501,20502,20505,20507,20508,20509,20512,20514,20517,20524,20530,20536,20538,20539,20542,20544,20549,20551,20552,20555,20561,20563,20564,20565,20567,20568,20570,20572,20574,20578,20579,20583,20585,20586,20588,20589,20594,20595,20597,20599,20600,20601,20602,20605,20606,20609,20614,20617,20622,20624,20628,20629,20632,20636,20639,20640,20642,20644,20645,20646,20647,20651,20652,20653,20654,20655,20657,20658,20660,20661,20663,20664,20667,20669,20670,20672,20676,20677,20678,20680,20684,20685,20693,20696,20697,20699,20701,20702,20705,20706,20708,20710,20713,20714,20715,20724,20725,20729,20730,20733,20735,20737,20739,20741,20742,20743,20744,20745,20749,20750,20751,20756,20758,20759,20761,20765,20768,20771,20773,20775,20777,20778,20780,20781,20786,20788,20789,20793,20796,20800,20801,20802,20803,20809,20810,20812,20818,20820,20821,20823,20826,20830,20831,20832,20840,20843,20846,20847,20849,20850,20855,20862,20863,20864,20865,20870,20873,20874,20877,20878,20882,20883,20885,20892,20894,20901,20906,20913,20914,20922,20925,20931,20935,20937,20941,20945,20947,20948,20949,20951,20952,20956,20957,20960,20961,20965,20970,20973,20974,20976,20977,20982,20985,20986,20991,20994,20996,21003,21005,21008,21014,21016,21024,21027,21028,21032,21038,21048,21055,21057,21071,21077,21080,21088,21118,21124,21131,21155,21156,21167,21170,21181,21184,21194,21195,21207,21212,21218,21224,21255,21257
]

Prods24MesesHC = [20007,20009,20012,20013,20014,20015,20016,20020,20022,20024,20025,20027,20029,20030,20031,20038,20041,20043,20050,20056,20057,20062,20063,20065,20066,20067,20068,20069,20070,20071,20072,20073,20074,20076,20082,20087,20088,20091,20092,20097,20099,20102,20103,20109,20112,20113,20114,20117,20124,20128,20129,20137,20138,20144,20148,20149,20151,20160,20162,20163,20164,20165,20166,20168,20171,20178,20183,20185,20190,20191,20196,20197,20201,20202,20203,20205,20206,20209,20217,20218,20219,20222,20233,20239,20246,20253,20254,20280,20281,20288,20304,20308,20311,20312,20313,20319,20332,20341,20357,20358,20361,20366,20376,20388,20390,20412,20413,20415,20421,20447,20473,20478,20479,20485,20507,20508,20524,20530,20564,20588,20595,20652,20653,20657,20705,20724,20733,20737,20741,20750,20780,20803,20855,20877,20941,20996,21003,21048,21155,21167,21184,21195,21212]

Seleccionados = Prods12Meses

# Feature and target selection
features = ['tn'] + [f'tn_{i}' for i in range(1, 24)]
target = 'tn+2'

# Filter the training set: mes_abs == 24 (i.e. periodo 201812)
df_train = df_agg[df_agg['mes_abs'] == 24].copy()

# Keep only the products in Seleccionados
df_train = df_train[df_train['product_id'].isin(Seleccionados)]

# Drop incomplete rows
df_train = df_train.dropna(subset=features + [target])

# Show results
print(f"Training rows: {df_train.shape[0]}")
display(df_train[features + [target]].head(10))


Training rows: 491


Unnamed: 0,tn,tn_1,tn_2,tn_3,tn_4,tn_5,tn_6,tn_7,tn_8,tn_9,...,tn_15,tn_16,tn_17,tn_18,tn_19,tn_20,tn_21,tn_22,tn_23,tn+2
131,585.56477,802.34669,809.67086,948.86342,936.42001,653.4231,447.84475,641.37063,611.51237,488.92473,...,1259.6456,1042.52979,569.88117,590.50779,543.3667,512.05402,489.91328,508.20044,555.91614,441.70332
167,372.63428,469.26344,893.74086,761.7752,874.88924,502.34077,547.62513,637.11135,496.41774,559.98671,...,1247.8888,1068.01865,625.84925,528.58883,515.58711,662.59032,563.89955,551.4306,494.27011,409.8995
239,361.82904,447.26564,547.65697,434.30577,694.49793,694.87111,778.84928,718.11211,670.18111,1205.48871,...,912.1132,874.86774,767.23749,850.75738,858.04498,741.17156,840.83303,638.62996,464.67137,368.79546
311,555.27622,551.96254,596.92913,587.6409,529.56178,383.58812,398.99459,392.31112,297.43841,449.68079,...,464.70505,411.07364,406.5181,548.52156,455.3711,338.7186,456.07282,475.242,378.08172,366.72969
419,325.60163,344.45169,422.68261,320.11756,582.80037,566.99689,509.51298,680.23243,590.48738,831.62216,...,731.15982,643.97083,699.41399,872.83606,718.24425,749.91517,649.49079,509.04048,476.39728,330.26012
455,333.70155,367.82928,469.93401,235.49526,437.85378,391.482,432.0225,494.79885,448.49259,593.35731,...,556.0191,381.76047,484.92171,527.88645,600.53175,515.65878,641.94039,343.98819,433.34928,377.10855
491,362.544,560.28336,325.1976,379.02228,555.83892,542.35272,473.01072,533.08164,441.25536,522.03441,...,531.66431,483.52381,667.88038,564.42105,410.7887,603.43722,410.33466,272.35644,248.65917,332.43756
527,452.26356,447.93655,554.465,498.47955,402.60881,318.59313,301.50633,373.35601,281.02337,509.0393,...,467.91974,579.14327,467.67899,515.15543,454.21516,399.85075,462.48432,386.11926,304.24755,424.16407
563,215.90478,326.01114,454.11912,465.0282,474.3648,409.92588,550.97406,535.00356,481.62114,533.72592,...,573.56208,466.37136,436.2813,361.44108,539.34426,459.7047,477.29682,249.36912,293.66064,292.64508
707,187.36972,190.00801,166.28102,179.85676,235.66237,317.11681,348.98573,459.19912,378.21636,354.92622,...,719.0078,338.83445,462.69354,469.64741,614.8222,420.6384,382.3223,296.98903,293.38983,210.74725


## 📈 5. Model Training (Linear Regression)

Model: Linear Regression, with no hyperparameters to tune

- X: `tn`, `tn_1`, ..., `tn_23`
- y: `tn+2` (the class)
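
By default, scikit-learn's `LinearRegression` fits ordinary least squares with an intercept. As a rough illustration of what that computes, the sketch below solves the same kind of problem with NumPy alone on synthetic data; the matrix `X`, the vector `true_coef` and the noise level are made up for the example and are unrelated to the sell-in data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the lag matrix: 200 rows, 3 features (hypothetical).
X = rng.normal(size=(200, 3))
true_coef = np.array([0.5, -0.2, 0.1])
y = 2.0 + X @ true_coef + rng.normal(scale=0.01, size=200)

# Ordinary least squares with an intercept:
# prepend a column of ones and solve the least-squares problem.
design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

intercept, coef = beta[0], beta[1:]
print(intercept, coef)
```

The recovered intercept and coefficients land close to the values used to generate the data. With 24 strongly correlated lag features and a few hundred rows, individual coefficients are much harder to interpret than in this toy case.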


In [42]:
from sklearn.linear_model import LinearRegression
import pandas as pd

# Define features and target
X = df_train[features]
y = df_train['tn+2']

# Fit the linear regression model
model = LinearRegression()
model.fit(X, y)

# Build a DataFrame of coefficients
coef = pd.DataFrame({
    'feature': ['intercept'] + features,
    'coeficiente': [model.intercept_] + list(model.coef_)
})

# Absolute value of each coefficient, for optional sorting
coef['abs'] = coef['coeficiente'].abs()
# coef = coef.sort_values(by='abs', ascending=False).drop(columns='abs').reset_index(drop=True)

# Show coefficients
display(coef)


Unnamed: 0,feature,coeficiente,abs
0,intercept,0.904621,0.904621
1,tn,0.211209,0.211209
2,tn_1,0.008272,0.008272
3,tn_2,0.150047,0.150047
4,tn_3,-0.038397,0.038397
5,tn_4,0.08321,0.08321
6,tn_5,0.138002,0.138002
7,tn_6,-0.033715,0.033715
8,tn_7,-0.058175,0.058175
9,tn_8,-0.109329,0.109329


## 📊 6. Applying the Model to the Final 780 Records

- The model is applied only to the 568 products with complete data.
- The remaining 212 are imputed with a weighted average of their available history.
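
The fallback for products with incomplete history is a weighted average that ignores missing months and renormalises the remaining weights. A minimal sketch of that masking step, on hypothetical toy values with four features instead of the full set (`weights`, `valid` and `w` are illustrative names):

```python
import numpy as np

# Hypothetical weights for 4 features, most recent month first.
weights = np.array([0.4, 0.3, 0.2, 0.1])

# Two toy rows; NaN marks a month with no history.
X = np.array([
    [10.0, 20.0, 30.0, 40.0],
    [10.0, 20.0, np.nan, np.nan],
])

valid = ~np.isnan(X)   # True where a value exists
w = weights * valid    # zero weight on missing entries

# nansum skips the NaN products; dividing by the surviving weights renormalises.
pred = np.nansum(X * w, axis=1) / w.sum(axis=1)
print(pred)
```

The second row is averaged over its two available months only, with weights 0.4 and 0.3 rescaled to sum to one, so a short history does not drag the estimate toward zero.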


In [43]:
# Load the 780 products to predict
# Note: header=None makes pandas treat the first line of the file as data
df_780 = pd.read_csv("ListadoIDS.txt", sep=';', header=None)
df_780.columns = ['product_id']
df_780.shape

(781, 1)
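
The shape above has one more row than the 780 products named in the section title. One possible cause, which the source does not confirm, is that the file starts with a header line that `header=None` reads as a data row. The sketch below reproduces that effect on hypothetical in-memory text rather than on `ListadoIDS.txt`.

```python
import io

import pandas as pd

# Hypothetical file contents: a header line followed by three ids.
text = "product_id\n20001\n20002\n20003\n"

# header=None: every line, including the header, is treated as data.
as_data = pd.read_csv(io.StringIO(text), sep=";", header=None)

# Default header handling: the first line becomes the column name.
as_header = pd.read_csv(io.StringIO(text), sep=";")

print(as_data.shape, as_header.shape)
```

If that is what happens here, the extra row would simply fail to match any `product_id` in the later `isin` filter, which would fit the 780 rows reported in the final breakdown.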

In [44]:
df_agg['product_id'] = df_agg['product_id'].astype(str)
df_780['product_id'] = df_780['product_id'].astype(str)

In [45]:
import numpy as np

# --- BASE FILTER ---
# Only products to predict, at month 201912 (mes_abs == 36)
df_pred = df_agg[
    (df_agg['mes_abs'] == 36) &
    (df_agg['product_id'].isin(df_780['product_id']))
].copy()


# Flag the products that have every feature available
df_pred['completos'] = df_pred[features].notnull().all(axis=1)
df_pred.head(10)

Unnamed: 0,product_id,periodo,tn,mes_abs,tn+2,tn_1,tn_2,tn_3,tn_4,tn_5,...,tn_15,tn_16,tn_17,tn_18,tn_19,tn_20,tn_21,tn_22,tn_23,completos
35,20001,201912,1504.68856,36,,1397.37231,1561.50552,1660.00561,1261.34529,1678.99318,...,1438.67455,1800.96168,1470.41009,1150.79169,1293.89788,1251.28462,1856.83534,1043.7647,1169.07532,True
71,20002,201912,1087.30855,36,,1423.57739,1979.53635,1090.18771,813.78215,1066.44999,...,954.23575,1161.8843,977.40239,1033.82845,1103.39191,999.20934,966.86044,712.00087,984.80167,True
107,20003,201912,892.50129,36,,948.29393,1081.36645,967.77116,635.59563,715.20314,...,912.34156,955.97079,656.227,660.73323,784.35885,765.47838,778.55594,788.30749,907.56304,True
143,20004,201912,637.90002,36,,723.94206,1064.69633,786.1714,482.13372,521.71519,...,948.86342,936.42001,653.4231,447.84475,641.37063,611.51237,488.92473,503.65326,415.52538,True
179,20005,201912,593.24443,36,,606.91173,996.78275,879.52808,536.668,745.74978,...,761.7752,874.88924,502.34077,547.62513,637.11135,496.41774,559.98671,399.20878,417.53208,True
215,20006,201912,417.23228,36,,399.6142,528.3263,409.95501,262.73593,343.11053,...,478.04388,615.70617,515.20419,468.1526,865.28861,748.44391,862.19361,588.56272,470.33785,True
251,20007,201912,390.43432,36,,357.85913,445.34884,369.74894,307.82899,573.37257,...,434.30577,694.49793,694.87111,778.84928,718.11211,670.18111,1205.48871,383.80253,635.25815,True
287,20008,201912,195.36854,36,,396.49833,452.77197,330.56343,233.00983,524.04994,...,436.96269,554.82147,526.38149,554.57063,707.59267,691.53246,765.98901,506.25385,469.29224,True
323,20009,201912,495.03574,36,,711.89025,556.15182,558.45719,520.41758,716.07987,...,587.6409,529.56178,383.58812,398.99459,392.31112,297.43841,449.68079,434.03086,264.48599,True
359,20010,201912,359.59998,36,,470.96658,448.82078,524.94628,199.86233,463.91662,...,480.60235,582.83104,331.96807,223.87746,227.24082,171.74107,653.77607,477.48363,298.25586,True


In [46]:
# Split complete and incomplete rows
df_completos = df_pred[df_pred['completos']].copy()
df_incompletos = df_pred[~df_pred['completos']].copy()

# --- PREDICTION FOR COMPLETE ROWS ---
df_completos['pred'] = model.predict(df_completos[features])
df_completos['pred_tipo'] = 'modelo'

In [47]:
import numpy as np
# --- PREDICTION FOR INCOMPLETE ROWS: weighted average with emphasis on the previous Februaries (tn_10, tn_22) ---

# Manual weights: last 3 months, previous Februaries, rest
# pesos = np.array([
#     3.0,  # tn
#     2.5,  # tn_1
#     2.0,  # tn_2
#     1.5,  # tn_3
#     1.5,  # tn_4
#     1.5,  # tn_5
#     1.0,  # tn_6
#     1.0,  # tn_7
#     1.0,  # tn_8
#     1.0,  # tn_9
#     4.0,  # tn_10 February 2019
#     1.0,  # tn_11
#     1.0,  # tn_12
#     1.0,  # tn_13
#     1.0,  # tn_14
#     1.0,  # tn_15
#     1.0,  # tn_16
#     1.0,  # tn_17
#     1.0,  # tn_18
#     1.0,  # tn_19
#     1.0,  # tn_20
#     1.0,  # tn_21
#     4.0,  # tn_22 February 2018
#     1.0,  # tn_23
# ])

# # Give more weight to tn_10 (mes_abs 26 = 201902) and tn_22 (mes_abs 14 = 201802)
# pesos[22] *= 12  # Increase the influence of February 2018
# pesos[10] *= 12  # Increase the influence of February 2019


# Exponential decay
base = 0.85
pesos = np.array([base**i for i in range(24)])

# Boost the Februaries
pesos[10] *= 2.5  # February 2019
pesos[22] *= 1.5  # February 2018

# Normalize
pesos = pesos / pesos.sum()

# Extract the feature matrix
X_incompletos = df_incompletos[features].values

# Mask of valid (non-NaN) values
máscara_validos = ~np.isnan(X_incompletos)

# Repeat the weights for every row and apply the mask
pesos_expandido = np.tile(pesos, (X_incompletos.shape[0], 1))
pesos_validos = pesos_expandido * máscara_validos

# Weighted average per row, renormalized over the available values
suma_ponderada = np.nansum(X_incompletos * pesos_validos, axis=1)
suma_pesos = np.nansum(pesos_validos, axis=1)
df_incompletos['pred'] = suma_ponderada / suma_pesos
df_incompletos['pred_tipo'] = 'promedio_ponderado_febrero'


# --- FINAL UNION ---
df_final = pd.concat([df_completos, df_incompletos], axis=0).sort_values('product_id')

# --- RESULT CHECKS ---
print("🔍 Sum of TN at 201912:")
print(f"Complete (model): {df_completos['tn'].sum():,.2f}")
print(f"Incomplete (history average): {df_incompletos['tn'].sum():,.2f}")
print(f"Final total: {df_final['tn'].sum():,.2f}")

print("🔍 Sum of predictions for 202002:")
print(f"Complete (model): {df_completos['pred'].sum():,.2f}")
print(f"Incomplete (history average): {df_incompletos['pred'].sum():,.2f}")
print(f"Final total: {df_final['pred'].sum():,.2f}")

print("\n📦 Breakdown:")
print(f"Total to predict: {df_final.shape[0]}  (expected: {df_780.shape[0]})")
print(f"Complete: {df_completos.shape[0]}  |  Incomplete: {df_incompletos.shape[0]}")
print(df_final['pred_tipo'].value_counts())

# --- QUICK LOOK ---
display(df_final[['product_id', 'periodo', 'pred', 'pred_tipo']].head())



🔍 Sum of TN at 201912:
Complete (model): 22,104.07
Incomplete (history average): 3,041.18
Final total: 25,145.25
🔍 Sum of predictions for 202002:
Complete (model): 23,956.95
Incomplete (history average): 3,904.14
Final total: 27,861.09

📦 Breakdown:
Total to predict: 780  (expected: 781)
Complete: 568  |  Incomplete: 212
pred_tipo
modelo                        568
promedio_ponderado_febrero    212
Name: count, dtype: int64


Unnamed: 0,product_id,periodo,pred,pred_tipo
35,20001,201912,1127.784583,modelo
71,20002,201912,1024.473409,modelo
107,20003,201912,783.242329,modelo
143,20004,201912,455.589237,modelo
179,20005,201912,401.141283,modelo


In [48]:
import os
from datetime import datetime

# Create the 'kaggle' folder if it does not exist
os.makedirs("kaggle", exist_ok=True)

# Current timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M")

# File name
filename = f"kaggle/predicciones_201912_{timestamp}.csv"

# Export CSV
df_final[['product_id', 'pred']].to_csv(filename, index=False)

print(f"File saved as: {filename}")


File saved as: kaggle/predicciones_201912_20250716_1930.csv
