## Import libraries & data
- `fastai` releases updates frequently, so this notebook is not guaranteed to work with versions later than the ones pinned below.
- This notebook is a follow-up to my previous attempt: [**link**](https://www.kaggle.com/nguyncaoduy/fastai-tabular-regression-model-nn-xgb)
- This notebook demonstrates how to use **TabNet (an attention-based network for tabular data)** in `fastai`. The original paper is available at https://arxiv.org/pdf/1908.07442.pdf.

In [None]:
!pip install -q fastai==2.2.5 fastcore==1.3.19 fast-tabnet==0.2.0

In [None]:
from fastai.tabular.all import *
from fast_tabnet.core import *

SEED = 42
set_seed(SEED, reproducible=True)

In [None]:
path = Path('/kaggle/input/tabular-playground-series-jan-2021')
path.ls()

## Process data

In [None]:
train_df = pd.read_csv(path/'train.csv')
train_df.head()

In [None]:
y_names = ['target']
cont_names = list(train_df.columns.values)[1:-1]
cat_names = []
procs = [Categorify, FillMissing, Normalize]
splits = RandomSplitter(seed=SEED)(range_of(train_df))
bs = 256

In [None]:
db = TabularPandas(
    train_df, 
    procs=procs, 
    cat_names=cat_names, 
    cont_names=cont_names, 
    y_names=y_names, 
    y_block=RegressionBlock(),
    splits=splits,
)

In [None]:
dls = db.dataloaders(bs=bs)
dls.show_batch()

## Model Training

In [None]:
model_name = 'tabnet'

### TabNet architecture

`model = TabNetModel(emb_szs, n_cont, out_sz, embed_p=0., y_range=None, 
                     n_d=8, n_a=8,
                     n_steps=3, gamma=1.5, 
                     n_independent=2, n_shared=2, epsilon=1e-15,
                     virtual_batch_size=128, momentum=0.02)`

Parameters `emb_szs, n_cont, out_sz, embed_p, y_range` are the same as for fastai TabularModel.

- n_d : int
    Dimension of the prediction layer (usually between 4 and 64)
- n_a : int
    Dimension of the attention layer (usually between 4 and 64)
- n_steps : int
    Number of successive steps in the network (usually between 3 and 10)
- gamma : float
    Float above 1, scaling factor for attention updates (usually between 1.0 and 2.0)
- momentum : float
    Float value between 0 and 1, used as the momentum in all batch norm layers
- n_independent : int
    Number of independent GLU layers in each GLU block (default 2)
- n_shared : int
    Number of shared GLU layers in each GLU block (default 2)
- epsilon : float
    Avoids log(0); this should be kept very low
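
To make `gamma` and `epsilon` more concrete, here is a minimal NumPy sketch based on the description in the TabNet paper, not on the `fast_tabnet` source. After each step the paper multiplies a per-feature prior by `(gamma - mask)`: with `gamma = 1.0` a feature that received all the attention is blocked in later steps, while larger values leave it available. `epsilon` keeps the `log` in the sparsity (entropy) term finite when a mask entry is exactly zero. The helper names `update_prior` and `mask_entropy` and the toy mask are hypothetical.

```python
import numpy as np

def update_prior(prior, mask, gamma):
    """Scale the per-feature prior by (gamma - mask), as described in the paper."""
    return prior * (gamma - mask)

def mask_entropy(mask, epsilon=1e-15):
    """Mean over rows of sum(-mask * log(mask + epsilon))."""
    return float(np.mean(np.sum(-mask * np.log(mask + epsilon), axis=1)))

# Toy example: one sample, three features, all attention on feature 0.
mask = np.array([[1.0, 0.0, 0.0]])
prior = np.ones_like(mask)

prior_strict = update_prior(prior, mask, gamma=1.0)   # feature 0 is blocked
prior_relaxed = update_prior(prior, mask, gamma=1.5)  # feature 0 can be reused

print(prior_strict)
print(prior_relaxed)
print(mask_entropy(mask))  # finite even though the mask contains zeros
```

In this toy case the prior for feature 0 drops to 0 with `gamma=1.0` and to 0.5 with `gamma=1.5`, which is why values closer to 1 push each step toward different features.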

In [None]:
model = TabNetModel(get_emb_sz(db), len(db.cont_names), dls.c, n_d=64, n_a=64, n_steps=5, virtual_batch_size=256)

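Remember to try modifying the hyperparameters to get the best performance! The values used here (`n_d=64`, `n_a=64`, `n_steps=5`, `virtual_batch_size=256`) are untuned starting points; the remaining parameters keep their defaults.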

In [None]:
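# Save the best model so far: keep the checkpoint with the lowest validation RMSE.
# The `rmse` metric is recorded under the name '_rmse'; np.less means lower is better.
cbs = [SaveModelCallback(monitor='_rmse', comp=np.less, fname=model_name+'_best')]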

In [None]:
learn = Learner(dls, model, loss_func=MSELossFlat(), metrics=rmse, cbs=cbs)

In [None]:
learn.lr_find()

In [None]:
learn.fit_one_cycle(20, 5e-2)
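
Note that `fit_one_cycle(20, 5e-2)` treats `5e-2` as the peak learning rate rather than a constant. The pure-Python sketch below approximates the shape of that schedule, assuming fastai's documented defaults (`div=25.`, `div_final=1e5`, `pct_start=0.25`) and cosine annealing in both phases. `cos_anneal` and `one_cycle_lr` are hypothetical helpers written for illustration, not fastai functions, and the sketch ignores the momentum schedule.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from start (pos=0) to end (pos=1)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def one_cycle_lr(pct, lr_max=5e-2, div=25.0, div_final=1e5, pct_start=0.25):
    """Approximate learning rate at training progress pct in [0, 1]."""
    if pct < pct_start:
        # Warm-up: rise from lr_max / div to lr_max.
        return cos_anneal(lr_max / div, lr_max, pct / pct_start)
    # Annealing: fall from lr_max to lr_max / div_final.
    return cos_anneal(lr_max, lr_max / div_final, (pct - pct_start) / (1 - pct_start))

for pct in (0.0, 0.25, 0.5, 1.0):
    print(f"progress {pct:.2f} -> lr {one_cycle_lr(pct):.7f}")
```

Under these assumptions the rate starts at `2e-3`, peaks at `5e-2` a quarter of the way through training, and ends near `5e-7`.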

In [None]:
learn.show_results()

## Evaluate on validation data

In [None]:
learn.load(model_name+'_best')

In [None]:
preds, targs = learn.get_preds()
preds = preds.squeeze(1)

| Model    | Min RMSE (Validation) |
|----------|----------|
| tabnet    | 0.7147   |

In [None]:
rmse(preds, targs)
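
As a cross-check on the metric, RMSE is the square root of the mean squared difference between predictions and targets. The NumPy sketch below uses made-up numbers (they are not outputs of this notebook), and `rmse_np` is a hypothetical helper. Flattening both inputs first avoids accidental broadcasting between shapes `(n,)` and `(n, 1)`.

```python
import numpy as np

def rmse_np(preds, targs):
    """Root mean squared error over all elements, after flattening both inputs."""
    preds = np.asarray(preds, dtype=float).ravel()
    targs = np.asarray(targs, dtype=float).ravel()
    if preds.shape != targs.shape:
        raise ValueError("preds and targs must have the same number of elements")
    return float(np.sqrt(np.mean((preds - targs) ** 2)))

# Illustrative values only: shape (3,) predictions against shape (3, 1) targets.
toy_preds = np.array([7.0, 8.0, 9.0])
toy_targs = np.array([[7.0], [8.0], [11.0]])
print(rmse_np(toy_preds, toy_targs))
```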

## Make predictions on test data

In [None]:
test_df = pd.read_csv(path/'test.csv')
test_df.head()

In [None]:
test_dl = dls.test_dl(test_df)

In [None]:
preds, _ = learn.get_preds(dl=test_dl)
preds = preds.squeeze(1)

In [None]:
submit = pd.read_csv(path/'sample_submission.csv')
submit['target'] = preds.numpy()  # convert the tensor to a NumPy array before assigning
submit.head()

## Submit to Kaggle
- Download the `submission.csv` file written by the cell below and submit it to the competition

In [None]:
submit.to_csv('submission.csv', index=False)
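
Before uploading, it can help to sanity-check the file. The sketch below assumes the submission has exactly an `id` column and a `target` column, which `sample_submission.csv` suggests but this notebook does not display. `check_submission` is a hypothetical helper, and the toy frame stands in for `submit`.

```python
import pandas as pd

def check_submission(df, n_rows, columns=("id", "target")):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if list(df.columns) != list(columns):
        problems.append(f"unexpected columns: {list(df.columns)}")
    if len(df) != n_rows:
        problems.append(f"expected {n_rows} rows, found {len(df)}")
    if "target" in df.columns and df["target"].isna().any():
        problems.append("target contains missing values")
    return problems

# Toy stand-in for `submit`; the column names are an assumption.
toy = pd.DataFrame({"id": [0, 1, 2], "target": [7.9, 8.1, 7.5]})
print(check_submission(toy, n_rows=3))
```

In the notebook itself, the equivalent call would be `check_submission(submit, n_rows=len(test_df))`.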