# Introduction to MetaTools

This tutorial shows how to quickly wrap a machine learning model with our GlueMetaTools,
which add assisting functions such as loss monitoring and logging to the model with little effort.

Before developing a machine learning model,
we briefly discuss the scenarios a model may be used in
and what needs to be done in each of them.
After analyzing the required procedures,
we summarize the needed components and propose our solution, i.e., `MetaTools`.

Without loss of generality, a machine learning model can be involved in these scenarios:
1. **Training**: In this scenario, the model is trained on datasets with or without labels.
The datasets may be stored in the local file system or a database, or may even come from interactive systems like `gym`.
During training, losses and metrics on the training or validation dataset are
expected to be printed to the console and saved in a log file, which helps people
analyze the model's effectiveness and find the best result.
After training, the model parameters should be saved.
In practice, people may save the model parameters every few epochs rather than only at the end.
In addition, automated machine learning is now widely used to search for the best learning strategy
(e.g., the optimization strategy and learning rate), the best hyper-parameters and the best network structure
(if you are using deep learning), so it is also expected
to be pluggable into the model.
All of this can also be done manually.
2. **Evaluation**: Also known as "validation".
People use the validation dataset to validate the effectiveness of the model based on several metrics.
3. **Testing**: In fact, this scenario is quite similar to "Evaluation" in most research work :-).
To better distinguish "Testing" from "Evaluation", we prefer to call it "Prediction" or "Submission".
In this scenario, people need to restore a model from existing parameters
(including hyper-parameters and model-parameters).

In summary, constructing and using a machine learning model involves the following steps:
1. Get data [Training, Evaluation, Testing]: Extract-Transform-Load (ETL) is a well-known data loading procedure, and it is needed whether we train or test a model.
Usually, "Training" and "Evaluation" can share the same `etl`,
while "Testing" may need some modification.
2. Persist the hyper-parameters and model-parameters [Training].
3. Log the losses and metrics to the console, the file system, etc. [Training, Evaluation].
4. Restore the stored hyper-parameters and model-parameters [Evaluation, Testing].
5. Show the necessary hints to let the user know the progress (which step the model is on) [Training, Evaluation, Testing].
6. Analyze the model's effectiveness based on the logged losses and metrics,
and find the best model based on this analysis.
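
To make steps 2 to 6 above more concrete, here is a minimal, framework-free sketch that uses only the Python standard library. It is not how `MetaTools` implements these steps. The file names, the `learning_rate` entry and the loss values are placeholders invented for illustration, and saving the model-parameters themselves is left out because it depends on the framework.

```python
import json
import logging
import os
import tempfile

# Hypothetical working directory and file names, used only for this sketch
work_dir = tempfile.mkdtemp()
cfg_path = os.path.join(work_dir, "hyper_params.json")
log_path = os.path.join(work_dir, "train.log")

# Step 2: persist the hyper-parameters (the values are assumptions)
hyper_params = {"batch_size": 16, "learning_rate": 0.01, "epochs": 3}
with open(cfg_path, "w") as f:
    json.dump(hyper_params, f)

# Step 3: log losses to both the console and a file
logger = logging.getLogger("sketch")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())
logger.addHandler(logging.FileHandler(log_path))

# Placeholder loss values, NOT results of a real training run
valid_losses = [0.9, 0.4, 0.6]
for epoch, loss in enumerate(valid_losses):
    # Step 5: a progress hint telling the user which epoch we are on
    logger.info("epoch %d/%d valid_loss %.2f", epoch + 1, hyper_params["epochs"], loss)

# Step 4: restore the stored hyper-parameters
with open(cfg_path) as f:
    restored = json.load(f)

# Step 6: pick the best epoch (lowest validation loss) from the logged values
best_epoch = min(range(len(valid_losses)), key=valid_losses.__getitem__)
```

Every one of these steps is boilerplate that is unrelated to the model itself, which is why it is worth factoring out into reusable tools.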

Therefore, we need to define these components in a `Configuration`
and have several assisting tools to help monitor and analyze the model:
1. The components in `Configuration`: (if you are not familiar with `Configuration`,
you can refer to this tutorial[].)


However, we notice that different machine learning frameworks still bring some variations.
Thus, for general frameworks,
we propose `DLMeta`, which can be used with most machine learning frameworks like `sklearn` and `lightgbm`
with a little adaptation,
and for specific frameworks, we propose `GLueMeta` for `mxnet.gluon` and `ToreMeta` for `pytorch`.

We use the well-known problem "Handwritten Digit Recognition" as an example.

First, we use `sklearn` to load the dataset and split it into `train` and `valid` sets (the held-out split is named `X_test` in the code and serves as the `valid` set):

In [5]:
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

dataset = load_digits()
dataset.data.shape, dataset.target.shape

((1797, 64), (1797,))

In [8]:
X_train, X_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.2, random_state=0)
X_train[0], X_train.shape, y_train.shape

(array([ 0.,  0.,  0.,  9., 15.,  2.,  0.,  0.,  0.,  0.,  5., 16., 11.,
         1.,  0.,  0.,  0.,  0., 13., 15.,  1.,  0.,  0.,  0.,  0.,  2.,
        16., 11.,  0.,  0.,  0.,  0.,  0.,  2., 16., 11.,  4.,  4.,  0.,
         0.,  0.,  2., 15., 16., 16., 14., 10.,  1.,  0.,  0.,  9., 16.,
         7.,  3., 15.,  6.,  0.,  0.,  0.,  7., 15., 16., 16.,  6.]),
 (1437, 64),
 (1437,))

In [7]:
X_test.shape, y_test.shape

((360, 64), (360,))

Then, we batchify the dataset using `DataLoader` in `mxnet`. The tutorial for `DataLoader` can be
found [here](https://mxnet.apache.org/versions/1.7/api/python/docs/tutorials/packages/gluon/data/datasets.html).

In [None]:
from mxnet.gluon.data import ArrayDataset, DataLoader

train_dataset = ArrayDataset(X_train, y_train)
valid_dataset = ArrayDataset(X_test, y_test)
# train_data_loader = DataLoader(train_dataset, batch_size=16)
# valid_data_loader = DataLoader(valid_dataset, batch_size=16)

Here, we find a configurable variable, `batch_size`, so a `Configuration` is needed.
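
As a rough illustration of why `batch_size` belongs in a configuration object, the sketch below keeps it in a small dataclass and derives the batch boundaries from it. The `Configuration` class and the `iter_batches` helper here are hypothetical stand-ins written for this example, not the library's own classes, and the sketch assumes that the final partial batch is kept.

```python
from dataclasses import dataclass, asdict


@dataclass
class Configuration:
    """Hypothetical stand-in for a configuration object (not the library's class)."""
    batch_size: int = 16


def iter_batches(n_samples, batch_size):
    """Yield (start, end) index pairs covering n_samples, keeping the last partial batch."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)


cfg = Configuration()
n_train = 1437  # size of the training split shown above
batches = list(iter_batches(n_train, cfg.batch_size))

print(len(batches))                      # 90 batches per epoch
print(batches[-1][1] - batches[-1][0])   # the last batch holds 13 samples
print(asdict(cfg))                       # {'batch_size': 16}
```

Because the value lives in one place, it can be saved together with the other hyper-parameters and restored later for evaluation or testing.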

In [None]:
import mxnet as mx
from mxnet import autograd
from mxnet.gluon.loss import SoftmaxCELoss


class FCNet(mx.gluon.HybridBlock):
    """A small fully connected classifier for the 64-dimensional digit features."""

    def __init__(self, prefix=None, params=None):
        super(FCNet, self).__init__(prefix=prefix, params=params)
        with self.name_scope():
            self.fc = mx.gluon.nn.HybridSequential()
            self.fc.add(
                mx.gluon.nn.Dense(256, "relu"),
                mx.gluon.nn.Dropout(0.5),
                mx.gluon.nn.Dense(10),  # one output per digit class (0-9)
            )

    def hybrid_forward(self, F, x, *args, **kwargs):
        return self.fc(x)


Loss = SoftmaxCELoss


def fit_f(net: FCNet, batch_data, bp_loss_f, trainer, *args, **kwargs):
    feature, label = batch_data
    # Record only the forward pass and the loss computation
    with autograd.record():
        pred = net(feature)
        bp_loss = bp_loss_f(pred, label)
    # Back-propagate and update the parameters outside the recording scope
    bp_loss.backward()
    trainer.step(len(feature))

In [None]:
import functools
from longling.ML.MxnetHelper.glue import MetaModule, MetaModel


class Module(MetaModule):
    @functools.wraps(FCNet.__init__)
    def sym_gen(self, *args, **kwargs):
        return FCNet(*args, **kwargs)

    @functools.wraps(fit_f)
    def fit_f(self, *args, **kwargs):
        return fit_f(*args, **kwargs)


class Model(MetaModel):
    pass

In [None]:
# for X_batch, y_batch in train_data_loader:
#     print("X_batch has shape {}, and y_batch has shape {}".format(X_batch.shape, y_batch.shape))