## Metrics-6: fastai.Tabular NN on point tasks

This is really a tabular task, because we flatten the pixels and features before regressing. So we'll use fastai's Tabular learner for it and aim to achieve results on par with the basic ML algorithms.

We need to build the `pts` features for each record into a dataframe during pre-processing, so we do that in the setup.

Some themes we'll explore are:
 - Normalizing/Scaling Y
 - Restricted X feature sets
 
When Y is scaled, we'll need to inverse-transform the predictions before the results are on the same scale as the other algos.
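As a minimal sketch of that round trip, here is a standardize / inverse-transform pair on made-up point targets (not the notebook's data). It uses plain NumPy and mirrors the per-column formula that sklearn's `StandardScaler` applies, `(y - mean) / std` with the population standard deviation. The helper names `standardize` and `unstandardize` are hypothetical and exist only for this illustration.

```python
import numpy as np

def standardize(y):
    """Return (z, mean, std) where z has per-column zero mean and unit variance."""
    mean = y.mean(axis=0)
    std = y.std(axis=0)  # population std (ddof=0), the convention StandardScaler uses
    return (y - mean) / std, mean, std

def unstandardize(z, mean, std):
    """Inverse of standardize: map values back to the original scale."""
    return z * std + mean

# Made-up (x, y) point targets, for illustration only
y = np.array([[0.10, 0.20],
              [0.30, 0.10],
              [0.50, 0.60],
              [0.70, 0.30]])

z, mean, std = standardize(y)          # what the model would be trained on
y_back = unstandardize(z, mean, std)   # what we compare against other algos
```

Predictions made in the standardized space would go through the same `unstandardize` step (or `ss.inverse_transform` with a fitted scaler) before metrics are compared across models.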

In [78]:
import os, sys
import copy as copyroot
import pandas as pd
from IPython.display import display
from matplotlib import pyplot as plt

from fastai2.basics import *
from fastai2.tabular.all import *

from sklearn.preprocessing import Normalizer, StandardScaler
from sklearn.metrics import r2_score

%load_ext autoreload
%autoreload 2

from module.mnist_helpers import build_df, build_tabular_df, build_dls
from module.mnist_metrics import metrics_df



In [2]:
path = untar_data(URLs.MNIST_TINY)
X, Y = build_tabular_df(path)

In [3]:
X.shape, Y.shape

((709, 856), (709, 5))

In [4]:
restrict_cols = [
    "pts22_5",
    "pts22_29",
    "pts22_21",
    "pts11_0",
    "pts12_4",
    "pts11_2",
]

In [5]:
y_cols = ['point_topleft_x', 'point_topleft_y']
y = Y[y_cols]

In [64]:
data = pd.concat((X.loc[:,restrict_cols], y), axis=1)

#### Point Scaling

In [65]:
# from fastai2.vision.core import PointScaler

In [66]:
dls_tl = build_dls(target='topleft')
dls_cr = build_dls(target='center')

point_t = dls_cr.transform[1][1]
scale_t = dls_tl.after_item

point_t, scale_t

def my_scale(x): return scale_t(point_t(x))

In [67]:
y_sc = pd.DataFrame(my_scale(Y.loc[:,y_cols]))

y_sc_cols = [f'{e}_sc' for e in y_cols]

for a,b in zip(y_sc_cols, y_sc):
    data[a] = y_sc[b]

#### Normalization

In [68]:
ss = StandardScaler()

ss.fit(data[y_sc_cols])
y_norm = pd.DataFrame(ss.transform(data[y_sc_cols]))

In [69]:
y_norm_cols = [f'{e}_norm' for e in y_sc_cols]

for a,b in zip(y_norm_cols, y_norm):
    data[a] = y_norm[b]

In [70]:
data.columns

Index(['pts22_5', 'pts22_29', 'pts22_21', 'pts11_0', 'pts12_4', 'pts11_2',
       'point_topleft_x', 'point_topleft_y', 'point_topleft_x_sc',
       'point_topleft_y_sc', 'point_topleft_x_sc_norm',
       'point_topleft_y_sc_norm'],
      dtype='object')

#### Build Tabular DL, Learner

In [72]:
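# Note: cont_names is not passed, so from_df infers the continuous inputs from
# every column of `data` that is not in y_names, not only restrict_cols.
dls = TabularDataLoaders.from_df(data,
                                 path='.',
                                 y_names=['point_topleft_x_sc_norm',
                                          'point_topleft_y_sc_norm'],
                                 procs=[Normalize])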

In [73]:
learn = tabular_learner(dls)

#### Fit + Eval

In [74]:
with learn.no_logging(): learn.fit(40)

In [98]:
metrics_norm = metrics_df(learn,
                          "fastai.Tabular 1.1",
                          "epochs=40 | 6 features for X | y-norm",
                          "topleft")
metrics_norm

| | model | details | target | split | mse | mae | r2 | dist_avg | dist_r2 | sqdist_avg | sqdist_r2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | fastai.Tabular 1.1 | epochs=40 \| 6 features for X \| y-norm | topleft | valid | 0.003428 | 0.04378 | 0.996079 | 0.071832 | 0.939635 | 0.006855 | 0.996038 |
| 1 | fastai.Tabular 1.1 | epochs=40 \| 6 features for X \| y-norm | topleft | train | 0.003756 | 0.045848 | 0.996358 | 0.076084 | 0.941759 | 0.007512 | 0.996365 |


In [99]:
metrics_unnorm = metrics_df(learn, 
                            "fastai.Tabular 1.1",
                            "epochs=40 | 6 features for X | y-norm",
                            "topleft", 
                            y_scaler=ss)
metrics_unnorm

| | model | details | target | split | mse | mae | r2 | dist_avg | dist_r2 | sqdist_avg | sqdist_r2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | fastai.Tabular 1.1 | epochs=40 \| 6 features for X \| y-norm | topleft | valid | 0.000163 | 0.008432 | 0.996079 | 0.014859 | 0.930329 | 0.000326 | 0.994995 |
| 1 | fastai.Tabular 1.1 | epochs=40 \| 6 features for X \| y-norm | topleft | train | 0.000171 | 0.008702 | 0.996358 | 0.0154 | 0.934759 | 0.000342 | 0.995479 |
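The two result tables share the same `r2` while `mse` and `mae` shrink after the inverse transform. One way to read this, sketched below with synthetic numbers and hypothetical scaler statistics, is that undoing a per-column standardization multiplies each column's MSE by that column's squared standard deviation but leaves each column's R² unchanged. This assumes `r2` is computed per target column and averaged uniformly. How `metrics_df` actually computes it is not shown here, and `r2_per_column` is a helper written only for this illustration.

```python
import numpy as np

def r2_per_column(y_true, y_pred):
    """Coefficient of determination, computed separately for each column."""
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1 - ss_res / ss_tot

rng = np.random.default_rng(0)

# Synthetic targets and predictions in the standardized space
y_norm = rng.normal(size=(100, 2))
pred_norm = y_norm + 0.05 * rng.normal(size=(100, 2))

# Hypothetical scaler statistics (not the values fitted in this notebook)
mean = np.array([0.2, -0.1])
std = np.array([0.22, 0.21])

# Inverse transform of both targets and predictions
y_orig = y_norm * std + mean
pred_orig = pred_norm * std + mean

mse_norm = ((y_norm - pred_norm) ** 2).mean(axis=0)
mse_orig = ((y_orig - pred_orig) ** 2).mean(axis=0)
r2_norm = r2_per_column(y_norm, pred_norm)
r2_orig = r2_per_column(y_orig, pred_orig)
```

Metrics that mix the two coordinates, such as a Euclidean distance between predicted and true points, are not covered by this argument when the columns have different standard deviations, which would be consistent with `dist_r2` and `sqdist_r2` differing between the two tables.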


#### Save

In [100]:
metrics_unnorm.to_csv('assets/metrics-dfs/metrics6-fasttab-1x.csv',
                      index=False)