# Tabular
In this notebook we explore fastai's "tabular" module, which builds neural networks for "tabular" data (that is, the kind of data you would find in a CSV file, for example).


> We will explore it on the Kaggle competition "Rossmann Store Sales".

# Notebook index
- [Loss function](#lostFunction)
- [Improvements](#improve)
- [Model](#model)
- [Adding more data](#data)
- [Even more data!](#moredata)


In [1]:
import fastai.tabular.all as ft
import pandas as pd 
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

In [8]:
!ls

Tabular_RN2.ipynb  data   images  mydata
archive		   faces  models  rossmann-store-sales.zip


In [9]:
pd.read_csv('mydata/train.csv')


Unnamed: 0,Store,DayOfWeek,Date,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday
0,1,5,2015-07-31,5263,555,1,1,0,1
1,2,5,2015-07-31,6064,625,1,1,0,1
2,3,5,2015-07-31,8314,821,1,1,0,1
3,4,5,2015-07-31,13995,1498,1,1,0,1
4,5,5,2015-07-31,4822,559,1,1,0,1
...,...,...,...,...,...,...,...,...,...
1017204,1111,2,2013-01-01,0,0,0,0,a,1
1017205,1112,2,2013-01-01,0,0,0,0,a,1
1017206,1113,2,2013-01-01,0,0,0,0,a,1
1017207,1114,2,2013-01-01,0,0,0,0,a,1


In [10]:
df = pd.read_csv('mydata/train.csv', low_memory=True, parse_dates=['Date'])


In [11]:
df.head(5)

Unnamed: 0,Store,DayOfWeek,Date,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday
0,1,5,2015-07-31,5263,555,1,1,0,1
1,2,5,2015-07-31,6064,625,1,1,0,1
2,3,5,2015-07-31,8314,821,1,1,0,1
3,4,5,2015-07-31,13995,1498,1,1,0,1
4,5,5,2015-07-31,4822,559,1,1,0,1


In [12]:
df.describe()

Unnamed: 0,Store,DayOfWeek,Sales,Customers,Open,Promo,SchoolHoliday
count,1017209.0,1017209.0,1017209.0,1017209.0,1017209.0,1017209.0,1017209.0
mean,558.4297,3.998341,5773.819,633.1459,0.8301067,0.3815145,0.1786467
std,321.9087,1.997391,3849.926,464.4117,0.3755392,0.4857586,0.3830564
min,1.0,1.0,0.0,0.0,0.0,0.0,0.0
25%,280.0,2.0,3727.0,405.0,1.0,0.0,0.0
50%,558.0,4.0,5744.0,609.0,1.0,0.0,0.0
75%,838.0,6.0,7856.0,837.0,1.0,1.0,0.0
max,1115.0,7.0,41551.0,7388.0,1.0,1.0,1.0


In [13]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1017209 entries, 0 to 1017208
Data columns (total 9 columns):
 #   Column         Non-Null Count    Dtype         
---  ------         --------------    -----         
 0   Store          1017209 non-null  int64         
 1   DayOfWeek      1017209 non-null  int64         
 2   Date           1017209 non-null  datetime64[ns]
 3   Sales          1017209 non-null  int64         
 4   Customers      1017209 non-null  int64         
 5   Open           1017209 non-null  int64         
 6   Promo          1017209 non-null  int64         
 7   StateHoliday   1017209 non-null  object        
 8   SchoolHoliday  1017209 non-null  int64         
dtypes: datetime64[ns](1), int64(7), object(1)
memory usage: 69.8+ MB


In [15]:
df = df.sort_values(['Date']).copy()

In [16]:
df

Unnamed: 0,Store,DayOfWeek,Date,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday
1017208,1115,2,2013-01-01,0,0,0,0,a,1
1016473,379,2,2013-01-01,0,0,0,0,a,1
1016472,378,2,2013-01-01,0,0,0,0,a,1
1016471,377,2,2013-01-01,0,0,0,0,a,1
1016470,376,2,2013-01-01,0,0,0,0,a,1
...,...,...,...,...,...,...,...,...,...
745,746,5,2015-07-31,9082,638,1,1,0,1
746,747,5,2015-07-31,10708,826,1,1,0,1
747,748,5,2015-07-31,7481,578,1,1,0,1
741,742,5,2015-07-31,10460,1016,1,1,0,1


<a id="lostFunction"><h1><strong>RMSPE</strong></h1></a>

According to the Kaggle page, submissions are evaluated with RMSPE, defined as:


> $\mathrm{RMSPE}(y', y) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\frac{y_i - y'_i}{y_i}\right)^2}$

where, as usual, $y'$ is the prediction and $y$ is the true value. Observations with $y = 0$ are ignored.
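As a quick sanity check of the definition, the following minimal sketch computes RMSPE in plain Python on three made-up observations, skipping the one whose true value is 0. The helper name `rmspe_ignore_zeros` and the numbers are illustrative assumptions, not part of the competition code.

```python
import math

def rmspe_ignore_zeros(preds, targets):
    """RMSPE over the pairs whose target is non-zero (hypothetical helper)."""
    # Keep only the observations with y != 0, as the metric definition requires
    pairs = [(p, t) for p, t in zip(preds, targets) if t != 0]
    squared = [((t - p) / t) ** 2 for p, t in pairs]
    return math.sqrt(sum(squared) / len(squared))

# Made-up data: the third observation has y = 0 and is ignored
targets = [100.0, 200.0, 0.0]
preds = [110.0, 180.0, 50.0]
score = rmspe_ignore_zeros(preds, targets)
print(round(score, 6))  # both remaining errors are 10% in magnitude
```

Because the two remaining predictions are each off by 10%, the score comes out at roughly 0.1 regardless of the size of the underlying sales figures.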



In [17]:
def rmspe(yp, y):
    # Root mean squared percentage error; assumes y contains no zeros
    u = (y-yp)/y
    return torch.sqrt((u*u).mean())

In [18]:
df = df[df['Sales'] > 0].copy()

df['Sales'] = df['Sales'].astype('float64')

In [19]:
df['Sales'].max()

41551.0

In [20]:
df['Sales'] /= 1000. # Rescale to thousands to get rid of the NaNs I was getting

In [21]:
df['Sales'].max()

41.551
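Dividing `Sales` by 1000 should not change the metric, because RMSPE only depends on the ratio between the error and the true value. The minimal check below uses made-up numbers and a hypothetical plain-Python helper `rmspe_list`; the one thing to remember is that the model's predictions are then expressed in thousands as well.

```python
import math

def rmspe_list(preds, targets):
    """Plain-Python RMSPE (hypothetical helper; assumes no zero targets)."""
    return math.sqrt(sum(((t - p) / t) ** 2 for p, t in zip(preds, targets)) / len(targets))

# Made-up sales and predictions in the original units
targets = [5263.0, 6064.0, 8314.0]
preds = [5000.0, 6500.0, 8000.0]

original = rmspe_list(preds, targets)
# Same data after dividing both predictions and targets by 1000
rescaled = rmspe_list([p / 1000 for p in preds], [t / 1000 for t in targets])

print(math.isclose(original, rescaled))  # the two values agree up to rounding
```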

Now we split the columns into categorical variables, continuous variables, and the target "y"...

In [22]:
# Names of the categorical variables
cat_names = ['Store', 'DayOfWeek', 'Promo', 'SchoolHoliday', 'StateHoliday']
# Names of the continuous variables
cont_names = [] 
# Variable to predict
y_names = ['Sales']

In [23]:
len(df)

844338

In [24]:
X = list(range(len(df)))

In [25]:
valid_cut = len(df) - 20000
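Because the dataframe was sorted by `Date` earlier, holding out the last 20,000 rows means validating on the most recent sales, which mimics forecasting future days. Here is a toy sketch of the same positional split, with made-up dates and a hypothetical `n_valid` of 2 standing in for 20000:

```python
from datetime import date

# Made-up rows, deliberately out of order
rows = [date(2015, 7, 31), date(2013, 1, 1), date(2014, 6, 15),
        date(2015, 7, 30), date(2013, 5, 20)]

rows_sorted = sorted(rows)              # same role as df.sort_values(['Date'])
n_valid = 2                             # stands in for the 20000 used in the notebook
valid_cut = len(rows_sorted) - n_valid

positions = list(range(len(rows_sorted)))
train_idx, valid_idx = positions[:valid_cut], positions[valid_cut:]

# Every validation date is at least as late as every training date
print(max(rows_sorted[i] for i in train_idx) <= min(rows_sorted[i] for i in valid_idx))
```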

In [26]:
# df is the dataframe
# procs are the preprocessing transforms
# cat_names are the names of the categorical variables
# cont_names are the names of the continuous variables
# y_names is what we want to predict
# splits defines the training/validation split
# FillMissing fills in missing values
src = ft.TabularPandas(df,
                      procs = [ft.Categorify, ft.FillMissing],
                      cat_names = cat_names,
                      cont_names = cont_names,
                      y_names = y_names, 
                      splits = (X[:valid_cut], X[valid_cut:])
                      )
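Roughly speaking, `Categorify` replaces each category with an integer code so that it can index an embedding table. The sketch below is a simplified plain-Python illustration of that idea, not fastai's implementation. Reserving code 0 for a `#na#` placeholder is my understanding of fastai's behaviour, and `build_vocab` and `encode` are hypothetical names.

```python
def build_vocab(values):
    """Map each distinct value to an integer code, keeping 0 for '#na#'."""
    classes = ['#na#'] + sorted(set(values))
    return {c: i for i, c in enumerate(classes)}

def encode(values, vocab):
    """Replace values by their codes; unseen values fall back to 0."""
    return [vocab.get(v, 0) for v in values]

# Made-up column resembling a store type
store_type = ['c', 'a', 'a', 'd', 'b']
vocab = build_vocab(store_type)
codes = encode(store_type + ['z'], vocab)   # 'z' was never seen

print(vocab)
print(codes)
```

Under this assumption, a column with 1115 distinct stores would need an embedding table with 1116 rows, one more than the number of distinct values.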

In [27]:
dls = src.dataloaders(bs=2048)

In [28]:
batch = dls.one_batch()

In [29]:
# Categorical variables, continuous variables, and the dependent variable
cat, cont, y = batch 

In [30]:
cat.shape # 5 is the number of columns in cat_names and 2048 is the batch size

torch.Size([2048, 5])

In [31]:
cont.shape

torch.Size([2048, 0])

In [32]:
cat

tensor([[325,   6,   1,   1,   2],
        [916,   3,   2,   1,   2],
        [109,   5,   2,   1,   1],
        ...,
        [ 22,   5,   1,   2,   2],
        [758,   6,   1,   1,   2],
        [474,   6,   1,   1,   2]], device='cuda:0')

In [33]:
y

tensor([[7.9240],
        [6.9670],
        [6.9590],
        ...,
        [4.0530],
        [2.7530],
        [4.3000]], device='cuda:0')

In [34]:
learn = ft.tabular_learner(dls, opt_func=ft.ranger, metrics=rmspe)

In [35]:
learn.model

TabularModel(
  (embeds): ModuleList(
    (0): Embedding(1116, 81)
    (1): Embedding(8, 5)
    (2): Embedding(3, 3)
    (3): Embedding(3, 3)
    (4): Embedding(6, 4)
  )
  (emb_drop): Dropout(p=0.0, inplace=False)
  (bn_cont): BatchNorm1d(0, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layers): Sequential(
    (0): LinBnDrop(
      (0): BatchNorm1d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): Linear(in_features=96, out_features=200, bias=False)
      (2): ReLU(inplace=True)
    )
    (1): LinBnDrop(
      (0): BatchNorm1d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): Linear(in_features=200, out_features=100, bias=False)
      (2): ReLU(inplace=True)
    )
    (2): LinBnDrop(
      (0): Linear(in_features=100, out_features=1, bias=True)
    )
  )
)
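The embedding widths printed above are consistent with the heuristic that, to my knowledge, fastai applies by default (`emb_sz_rule`): `min(600, round(1.6 * n_cat**0.56))`, where `n_cat` is the number of rows in the embedding table. The sketch re-implements that formula under this assumption (the function name `emb_width` is mine) and checks it against the five tables shown.

```python
def emb_width(n_cat):
    """Assumed default heuristic: width grows sub-linearly with cardinality."""
    return min(600, round(1.6 * n_cat ** 0.56))

# Table sizes taken from the printed model: Store, DayOfWeek, Promo,
# SchoolHoliday, StateHoliday
cardinalities = [1116, 8, 3, 3, 6]
widths = [emb_width(n) for n in cardinalities]

print(widths)       # widths of the five embeddings
print(sum(widths))  # size of the concatenated input to the first layer
```

The sum of the widths matches the `in_features=96` of the first linear layer, since there are no continuous columns in this first model.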

In [36]:
learn.summary()



TabularModel (Input shape: ['2048 x 5', '2048 x 0'])
Layer (type)         Output Shape         Param #    Trainable 
Embedding            2048 x 81            90,396     True      
________________________________________________________________
Embedding            2048 x 5             40         True      
________________________________________________________________
Embedding            2048 x 3             9          True      
________________________________________________________________
Embedding            2048 x 3             9          True      
________________________________________________________________
Embedding            2048 x 4             24         True      
________________________________________________________________
Dropout              2048 x 96            0          False     
________________________________________________________________
BatchNorm1d          2048 x 96            192        True      
_____________________________________________

In [37]:
learn.lr_find()

In [38]:
learn.fit_one_cycle(20, 1e-3, div=2, pct_start=0.5)

epoch,train_loss,valid_loss,rmspe,time
0,2.477104,1.53947,0.204244,00:07
1,1.729336,1.303543,0.161561,00:07
2,1.580386,1.238143,0.155651,00:07
3,1.553917,1.244746,0.155626,00:07
4,1.525184,1.209038,0.154815,00:07
5,1.526017,1.225854,0.15596,00:07
6,1.516794,1.185819,0.15192,00:07
7,1.503533,1.209811,0.15607,00:07
8,1.506392,1.20621,0.154842,00:07
9,1.507272,1.198135,0.153936,00:07


<a id="improve"><h1><strong>Improvements</strong></h1></a>

We can make several improvements:
1. The loss function
2. The model (more layers, dropout...)
3. More data (month, temperature, whether it rains, and so on; add EVERYTHING that could be relevant)

# Loss function

If we think about it, **we are not optimizing what we actually want to optimize**. We are training with MSE, but the competition asks us to minimize RMSPE. The square root is monotonic, so minimizing MSPE is equivalent and we can drop the sqrt in the loss.
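To make the difference concrete, here is a small plain-Python sketch with made-up numbers in which MSE and MSPE disagree about which of two predictions is better. The helpers `mse_list` and `mspe_list` are hypothetical stand-ins for the tensor versions used in the notebook.

```python
def mse_list(preds, targets):
    """Mean squared error on plain lists."""
    return sum((t - p) ** 2 for p, t in zip(preds, targets)) / len(targets)

def mspe_list(preds, targets):
    """Mean squared percentage error on plain lists (no zero targets)."""
    return sum(((t - p) / t) ** 2 for p, t in zip(preds, targets)) / len(targets)

targets = [1.0, 10.0]        # one small store, one large store (in thousands)
pred_a = [2.0, 10.0]         # off by 1.0 on the small store
pred_b = [1.0, 12.0]         # off by 2.0 on the large store

print(mse_list(pred_a, targets), mse_list(pred_b, targets))
print(mspe_list(pred_a, targets), mspe_list(pred_b, targets))
```

MSE prefers `pred_a` because its absolute error is smaller, while MSPE prefers `pred_b` because a 100% miss on a small store is penalized far more than a 20% miss on a large one. A model trained with MSE is therefore pulled towards the wrong trade-off for this competition.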



In [39]:
def mspe(yp, y):
    # Mean squared percentage error: RMSPE without the final square root
    u = (y-yp)/y
    return (u*u).mean()

In [40]:
learn = ft.tabular_learner(dls,
                          loss_func = mspe, 
                          opt_func = ft.ranger, 
                          metrics = rmspe)

In [41]:
learn.fit_one_cycle(20, 1e-3, div=2, pct_start = 0.5)

epoch,train_loss,valid_loss,rmspe,time
0,0.056346,0.035308,0.18504,00:07
1,0.040151,0.028105,0.166225,00:07
2,0.037116,0.025812,0.159788,00:07
3,0.034538,0.024632,0.156076,00:07
4,0.032985,0.024064,0.154154,00:07
5,0.035044,0.024842,0.156683,00:07
6,0.034641,0.025234,0.157746,00:07
7,0.032467,0.024162,0.154452,00:07
8,0.030319,0.023606,0.152511,00:07
9,0.040663,0.025652,0.159449,00:07


<a id="model"><h1><strong>Model</strong></h1></a>

fastai lets us configure the model through the **layers** argument, a list with the sizes of the hidden layers.

We can also adjust the embedding sizes with a dictionary called **emb_szs**.

And the **config** argument lets us set a few other options of the generated model.

Of course, we could write our own model, but fastai does a good job for us, and we no longer have to worry about choosing embedding sizes.
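To see how `emb_szs` and `layers` interact, the sketch below works out the shapes of the linear layers by hand. It assumes the model concatenates all embedding outputs with the continuous columns and ends in a single-output layer, which is what the printed `TabularModel` suggests. The default widths are copied from the earlier printout, and `linear_shapes` is a hypothetical helper.

```python
def linear_shapes(emb_widths, n_cont, layers, n_out=1):
    """Return (in_features, out_features) for each linear layer."""
    sizes = [sum(emb_widths.values()) + n_cont] + list(layers) + [n_out]
    return list(zip(sizes[:-1], sizes[1:]))

# Default widths from the earlier model printout
emb_widths = {'Store': 81, 'DayOfWeek': 5, 'Promo': 3,
              'SchoolHoliday': 3, 'StateHoliday': 4}
emb_widths.update({'Store': 16})          # same override as emb_szs = {'Store': 16}

shapes = linear_shapes(emb_widths, n_cont=0, layers=[256, 128, 128])
print(shapes)
```

Shrinking the `Store` embedding from 81 to 16 reduces the input of the first layer from 96 to 31 features, while `layers` fixes the widths of everything after it.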

In [42]:
# layers lists the number of neurons in each hidden layer;
# the default is [200, 100]
# emb_szs lets you override the embedding sizes,
# keyed by column name (here the *Store* column)
learn = ft.tabular_learner(dls,
                           loss_func = mspe,
                           emb_szs = {'Store': 16},
                           layers = [256, 128, 128],
                           config = ft.tabular_config(act_cls=nn.LeakyReLU(inplace=True)),
                           opt_func = ft.ranger,
                           metrics = rmspe)

In [43]:
learn.model

TabularModel(
  (embeds): ModuleList(
    (0): Embedding(1116, 16)
    (1): Embedding(8, 5)
    (2): Embedding(3, 3)
    (3): Embedding(3, 3)
    (4): Embedding(6, 4)
  )
  (emb_drop): Dropout(p=0.0, inplace=False)
  (bn_cont): BatchNorm1d(0, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layers): Sequential(
    (0): LinBnDrop(
      (0): BatchNorm1d(31, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): Linear(in_features=31, out_features=256, bias=False)
      (2): LeakyReLU(negative_slope=0.01, inplace=True)
    )
    (1): LinBnDrop(
      (0): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): Linear(in_features=256, out_features=128, bias=False)
      (2): LeakyReLU(negative_slope=0.01, inplace=True)
    )
    (2): LinBnDrop(
      (0): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): Linear(in_features=128, out_features=128, bias=False)
      (2): Leaky

In [44]:
learn.fit_one_cycle(20, 1e-3, div=2, pct_start = 0.5)

epoch,train_loss,valid_loss,rmspe,time
0,0.074428,0.034317,0.181945,00:08
1,0.053864,0.028431,0.167282,00:08
2,0.038603,0.026308,0.161168,00:08
3,0.030302,0.023811,0.152779,00:08
4,0.03072,0.024139,0.154024,00:08
5,0.043707,0.0252,0.157476,00:08
6,0.03552,0.024814,0.156059,00:08
7,0.045263,0.025522,0.158366,00:08
8,0.029267,0.023749,0.152757,00:08
9,0.041455,0.026195,0.160614,00:08


<a id="data"><h1><strong>Adding more data</strong></h1></a>

We have date information that we are not using at all.

In [45]:
df = pd.read_csv('mydata/train.csv', low_memory=True, parse_dates=['Date'])


In [46]:
df = df.sort_values(['Date']).copy()
# add_datepart takes a dataframe and the name of a date column,
# and adds columns derived from that date
# (year, month, week, day of week, ...)
ft.add_datepart(df, 'Date')

Unnamed: 0,Store,DayOfWeek,Week,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday,Year,...,Day,Dayofweek,Dayofyear,Is_month_end,Is_month_start,Is_quarter_end,Is_quarter_start,Is_year_end,Is_year_start,Elapsed
1017208,1115,2,1,0,0,0,0,a,1,2013,...,1,1,1,False,True,False,True,False,True,1356998400
1016473,379,2,1,0,0,0,0,a,1,2013,...,1,1,1,False,True,False,True,False,True,1356998400
1016472,378,2,1,0,0,0,0,a,1,2013,...,1,1,1,False,True,False,True,False,True,1356998400
1016471,377,2,1,0,0,0,0,a,1,2013,...,1,1,1,False,True,False,True,False,True,1356998400
1016470,376,2,1,0,0,0,0,a,1,2013,...,1,1,1,False,True,False,True,False,True,1356998400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
745,746,5,31,9082,638,1,1,0,1,2015,...,31,4,212,True,False,False,False,False,False,1438300800
746,747,5,31,10708,826,1,1,0,1,2015,...,31,4,212,True,False,False,False,False,False,1438300800
747,748,5,31,7481,578,1,1,0,1,2015,...,31,4,212,True,False,False,False,False,False,1438300800
741,742,5,31,10460,1016,1,1,0,1,2015,...,31,4,212,True,False,False,False,False,False,1438300800
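The derived columns are ordinary calendar quantities. As an illustration only (this is not fastai's code, and `date_parts` is a hypothetical helper covering a subset of the columns), the sketch below recomputes several of them with the standard library for 2015-07-31, the date of the last rows in the table above.

```python
import calendar
from datetime import date, timedelta

def date_parts(d):
    """Derive a few calendar features from a date (illustrative subset)."""
    return {
        'Year': d.year,
        'Month': d.month,
        'Week': d.isocalendar()[1],                 # ISO week number
        'Day': d.day,
        'Dayofweek': d.weekday(),                   # Monday = 0
        'Dayofyear': d.timetuple().tm_yday,
        'Is_month_start': d.day == 1,
        'Is_month_end': (d + timedelta(days=1)).day == 1,
        'Elapsed': calendar.timegm(d.timetuple()),  # seconds since 1970-01-01 UTC
    }

parts = date_parts(date(2015, 7, 31))
print(parts)
```

Note that `Dayofweek` counts from Monday = 0, whereas the original `DayOfWeek` column counts from Monday = 1, which is why a Friday appears as 4 in one and 5 in the other.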


In [47]:
df.drop('DayOfWeek', axis=1, inplace=True)

In [49]:
df = df[df['Sales'] > 0].copy()

df['Sales'] = df['Sales'].astype('float64')/1000

In [50]:
df

Unnamed: 0,Store,Week,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday,Year,Month,Day,Dayofweek,Dayofyear,Is_month_end,Is_month_start,Is_quarter_end,Is_quarter_start,Is_year_end,Is_year_start,Elapsed
1016447,353,1,3.139,820,1,0,a,1,2013,1,1,1,1,False,True,False,True,False,True,1356998400
1016429,335,1,2.401,482,1,0,a,1,2013,1,1,1,1,False,True,False,True,False,True,1356998400
1016606,512,1,2.646,625,1,0,a,1,2013,1,1,1,1,False,True,False,True,False,True,1356998400
1016588,494,1,3.113,527,1,0,a,1,2013,1,1,1,1,False,True,False,True,False,True,1356998400
1016624,530,1,2.907,532,1,0,a,1,2013,1,1,1,1,False,True,False,True,False,True,1356998400
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
745,746,31,9.082,638,1,1,0,1,2015,7,31,4,212,True,False,False,False,False,False,1438300800
746,747,31,10.708,826,1,1,0,1,2015,7,31,4,212,True,False,False,False,False,False,1438300800
747,748,31,7.481,578,1,1,0,1,2015,7,31,4,212,True,False,False,False,False,False,1438300800
741,742,31,10.460,1016,1,1,0,1,2015,7,31,4,212,True,False,False,False,False,False,1438300800


We also have data about the stores!

In [51]:
stores = pd.read_csv('mydata/store.csv')

In [52]:
stores

Unnamed: 0,Store,StoreType,Assortment,CompetitionDistance,CompetitionOpenSinceMonth,CompetitionOpenSinceYear,Promo2,Promo2SinceWeek,Promo2SinceYear,PromoInterval
0,1,c,a,1270.0,9.0,2008.0,0,,,
1,2,a,a,570.0,11.0,2007.0,1,13.0,2010.0,"Jan,Apr,Jul,Oct"
2,3,a,a,14130.0,12.0,2006.0,1,14.0,2011.0,"Jan,Apr,Jul,Oct"
3,4,c,c,620.0,9.0,2009.0,0,,,
4,5,a,a,29910.0,4.0,2015.0,0,,,
...,...,...,...,...,...,...,...,...,...,...
1110,1111,a,a,1900.0,6.0,2014.0,1,31.0,2013.0,"Jan,Apr,Jul,Oct"
1111,1112,c,c,1880.0,4.0,2006.0,0,,,
1112,1113,a,c,9260.0,,,0,,,
1113,1114,a,c,870.0,,,0,,,


In [53]:
stores.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1115 entries, 0 to 1114
Data columns (total 10 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   Store                      1115 non-null   int64  
 1   StoreType                  1115 non-null   object 
 2   Assortment                 1115 non-null   object 
 3   CompetitionDistance        1112 non-null   float64
 4   CompetitionOpenSinceMonth  761 non-null    float64
 5   CompetitionOpenSinceYear   761 non-null    float64
 6   Promo2                     1115 non-null   int64  
 7   Promo2SinceWeek            571 non-null    float64
 8   Promo2SinceYear            571 non-null    float64
 9   PromoInterval              571 non-null    object 
dtypes: float64(5), int64(2), object(3)
memory usage: 87.2+ KB


In [54]:
# First we set Store as the index of the stores table,
# so that the join matches each row of df
# to its store through that key
df_stores = df.join(stores.set_index('Store'), on='Store')
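`DataFrame.join` performs a left join by default: every sales row is kept and the store attributes are looked up by key. A plain-Python picture of that behaviour, using made-up miniature tables (the row with `Store` 9 is invented to show what happens when no store record matches):

```python
# Made-up miniature versions of the two tables
sales = [
    {'Store': 1, 'Sales': 5.263},
    {'Store': 2, 'Sales': 6.064},
    {'Store': 9, 'Sales': 4.100},   # no matching store record
]
stores = [
    {'Store': 1, 'StoreType': 'c', 'CompetitionDistance': 1270.0},
    {'Store': 2, 'StoreType': 'a', 'CompetitionDistance': 570.0},
]

# Index the store table by its key, like stores.set_index('Store')
by_store = {s['Store']: {k: v for k, v in s.items() if k != 'Store'} for s in stores}
extra_cols = ['StoreType', 'CompetitionDistance']

# Left join: keep every sales row, fill unmatched columns with None
joined = []
for row in sales:
    match = by_store.get(row['Store'], {})
    joined.append({**row, **{c: match.get(c) for c in extra_cols}})

for row in joined:
    print(row)
```

In pandas the unmatched cells would hold `NaN` rather than `None`, and the number of rows stays equal to the number of rows in the sales table.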

In [55]:
df_stores.head(4)

Unnamed: 0,Store,Week,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday,Year,Month,...,Elapsed,StoreType,Assortment,CompetitionDistance,CompetitionOpenSinceMonth,CompetitionOpenSinceYear,Promo2,Promo2SinceWeek,Promo2SinceYear,PromoInterval
1016447,353,1,3.139,820,1,0,a,1,2013,1,...,1356998400,b,b,900.0,,,1,14.0,2013.0,"Feb,May,Aug,Nov"
1016429,335,1,2.401,482,1,0,a,1,2013,1,...,1356998400,b,a,90.0,,,1,31.0,2013.0,"Jan,Apr,Jul,Oct"
1016606,512,1,2.646,625,1,0,a,1,2013,1,...,1356998400,b,b,590.0,,,1,5.0,2013.0,"Mar,Jun,Sept,Dec"
1016588,494,1,3.113,527,1,0,a,1,2013,1,...,1356998400,b,a,1260.0,6.0,2011.0,0,,,


In [56]:
# Transpose to see all the columns when there are too many to display
df_stores.head().T

Unnamed: 0,1016447,1016429,1016606,1016588,1016624
Store,353,335,512,494,530
Week,1,1,1,1,1
Sales,3.139,2.401,2.646,3.113,2.907
Customers,820,482,625,527,532
Open,1,1,1,1,1
Promo,0,0,0,0,0
StateHoliday,a,a,a,a,a
SchoolHoliday,1,1,1,1,1
Year,2013,2013,2013,2013,2013
Month,1,1,1,1,1


In [57]:
df_stores.columns

Index(['Store', 'Week', 'Sales', 'Customers', 'Open', 'Promo', 'StateHoliday',
       'SchoolHoliday', 'Year', 'Month', 'Day', 'Dayofweek', 'Dayofyear',
       'Is_month_end', 'Is_month_start', 'Is_quarter_end', 'Is_quarter_start',
       'Is_year_end', 'Is_year_start', 'Elapsed', 'StoreType', 'Assortment',
       'CompetitionDistance', 'CompetitionOpenSinceMonth',
       'CompetitionOpenSinceYear', 'Promo2', 'Promo2SinceWeek',
       'Promo2SinceYear', 'PromoInterval'],
      dtype='object')

In [58]:
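# NOTE: 'Dayofweek' is listed twice, and 'CompetitionOpenSinceMonth' appears in
# both cat_names and cont_names; the model printed below was built from these lists.
cat_names = ['Store', 'Dayofweek', 'Open', 'Promo', 'StateHoliday',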
            'SchoolHoliday', 'Year', 'Month', 'Week', 'Dayofweek',
            'Is_month_end', 'Is_month_start', 'Is_quarter_end',
            'Is_quarter_start', 'Is_year_end', 'Is_year_start', 'StoreType', 'Assortment',
            'CompetitionOpenSinceYear', 'CompetitionOpenSinceMonth', 'Promo2', 'PromoInterval']

cont_names = ['Day', 'Dayofyear', 'Elapsed', 'CompetitionDistance', 'CompetitionOpenSinceMonth', 'Promo2SinceWeek', 'Promo2SinceYear']

y_names = ['Sales']


In [59]:
X = list(range(len(df)))

In [60]:
valid_cut = len(df) - 20000

In [61]:
src = ft.TabularPandas(df_stores,
                      procs = [ft.Categorify, ft.FillMissing],
                      cat_names = cat_names,
                      cont_names = cont_names,
                      y_names = y_names, 
                      splits = (X[:valid_cut], X[valid_cut:])
                      )
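Several store columns have gaps, so `FillMissing` matters here. My understanding is that by default it fills a continuous column with its median and adds a boolean `<column>_na` indicator to the categorical variables. The sketch below is a simplified plain-Python version of that idea, with made-up values and a hypothetical helper `fill_missing`.

```python
from statistics import median

def fill_missing(values):
    """Fill None with the median of the observed values and flag the gaps."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]      # would become a '<col>_na' column
    return filled, was_missing

# Made-up CompetitionDistance values with two gaps
filled, was_missing = fill_missing([1270.0, None, 570.0, None, 14130.0])
print(filled)
print(was_missing)
```

If that assumption holds, it would account for the model below listing 26 embeddings for 22 categorical names: four of the continuous columns contain missing values, and each would contribute one extra indicator column.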

In [62]:
dls = src.dataloaders(bs=1024)

In [63]:
learn = ft.tabular_learner(dls,
                          loss_func = mspe,
                           emb_szs = {'Store': 16},
                           layers = [256, 128], 
                           config = ft.tabular_config(act_cls=nn.LeakyReLU(inplace=True)),
                          opt_func = ft.ranger, 
                          metrics = rmspe)

In [64]:
learn.model

TabularModel(
  (embeds): ModuleList(
    (0): Embedding(1116, 16)
    (1): Embedding(8, 5)
    (2): Embedding(2, 2)
    (3): Embedding(3, 3)
    (4): Embedding(6, 4)
    (5): Embedding(3, 3)
    (6): Embedding(4, 3)
    (7): Embedding(13, 7)
    (8): Embedding(53, 15)
    (9): Embedding(8, 5)
    (10): Embedding(3, 3)
    (11): Embedding(3, 3)
    (12): Embedding(3, 3)
    (13): Embedding(3, 3)
    (14): Embedding(3, 3)
    (15): Embedding(3, 3)
    (16): Embedding(5, 4)
    (17): Embedding(4, 3)
    (18): Embedding(24, 9)
    (19): Embedding(13, 7)
    (20): Embedding(3, 3)
    (21): Embedding(4, 3)
    (22): Embedding(3, 3)
    (23): Embedding(3, 3)
    (24): Embedding(3, 3)
    (25): Embedding(3, 3)
  )
  (emb_drop): Dropout(p=0.0, inplace=False)
  (bn_cont): BatchNorm1d(7, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layers): Sequential(
    (0): LinBnDrop(
      (0): BatchNorm1d(129, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (1): 

In [65]:
learn.fit_one_cycle(20, 1e-3, div=2, pct_start = 0.5)

epoch,train_loss,valid_loss,rmspe,time
0,0.038367,0.028384,0.165413,00:23
1,0.032822,0.021587,0.145773,00:23
2,0.028634,0.018507,0.134549,00:24
3,0.024698,0.01756,0.131181,00:24
4,0.017704,0.015369,0.123193,00:24
5,0.018725,0.015176,0.122288,00:24
6,0.018677,0.015483,0.123391,00:23
7,0.016527,0.015831,0.124597,00:24
8,0.015062,0.016429,0.126899,00:24
9,0.015736,0.019339,0.137957,00:24


<a id="moredata"><h1><strong>Even more data!</strong></h1></a>

Basically, the winners did very powerful feature engineering:
https://github.com/fastai/fastai/blob/master/dev_nbs/course/rossman_data_clean.ipynb