# Tabular Data Classification with NNI in AML

This example uses the NNI NAS 2.0 (Retiarii) framework to search for the best neural architecture for a tabular data classification task on the Azure Machine Learning (AML) training platform.

A video demo is available on [YouTube](https://www.youtube.com/watch?v=PDVqBmm7Cro) and [Bilibili](https://www.bilibili.com/video/BV1oy4y1W7GF).

## Step 1: Prepare the dataset

The first step is to prepare the dataset. Here we use the Titanic dataset as an example.

In [1]:
from utils import TitanicDataset
from nni.retiarii import serialize

train_dataset = serialize(TitanicDataset, root='./data', train=True)
test_dataset = serialize(TitanicDataset, root='./data', train=False)
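The ``utils.TitanicDataset`` class is not shown in this notebook. As a rough sketch of the interface the later cells appear to rely on (``train_dataset.__getitem__(0)[0]`` is used to get the feature vector), the hypothetical ``TinyTabularDataset`` below returns ``(features, label)`` pairs. The class name, the feature values, and the labels are made up for illustration and are not the real Titanic loader.

```python
from typing import List, Tuple


class TinyTabularDataset:
    """Hypothetical stand-in for a map-style tabular dataset (not utils.TitanicDataset)."""

    def __init__(self, rows: List[List[float]], labels: List[int]):
        if len(rows) != len(labels):
            raise ValueError("rows and labels must have the same length")
        self.rows = rows
        self.labels = labels

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, index: int) -> Tuple[List[float], int]:
        # Each sample is a (features, label) pair.
        return self.rows[index], self.labels[index]


# Two made-up samples with three numeric features each.
toy = TinyTabularDataset(rows=[[3.0, 22.0, 7.25], [1.0, 38.0, 71.28]], labels=[0, 1])
features, label = toy[0]
input_size = len(features)  # number of feature columns, used to size the first layer
print(len(toy), input_size, label)
```

With a dataset of this shape, the length of the first sample's feature vector is what gets passed as ``input_size`` when the model is built in Step 2.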

## Step 2: Define the Model Space

A model space is the set of models that the user wants to explore; it should contain potentially well-performing models. In the Retiarii (NNI NAS 2.0) framework, a model space is defined in two parts: a base model and the possible mutations of that base model.

### Step 2.1: Define the Base Model

Defining a base model is almost the same as defining a PyTorch (or TensorFlow) model. Usually, you only need to replace ``import torch.nn as nn`` with ``import nni.retiarii.nn.pytorch as nn`` to use the NNI-wrapped PyTorch modules. Below is a very simple base model.

In [None]:
import nni.retiarii.nn.pytorch as nn
import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self, input_size):
        super().__init__()

        self.fc1 = nn.Linear(input_size, 16)
        self.bn1 = nn.BatchNorm1d(16)
        self.dropout1 = nn.Dropout(0.0)

        self.fc2 = nn.Linear(16, 16)
        self.bn2 = nn.BatchNorm1d(16)
        self.dropout2 = nn.Dropout(0.0)

        self.fc3 = nn.Linear(16, 2)

    def forward(self, x):

        x = self.dropout1(F.relu(self.bn1(self.fc1(x))))
        x = self.dropout2(F.relu(self.bn2(self.fc2(x))))
        x = F.sigmoid(self.fc3(x))
        return x
    
model_space = Net(len(train_dataset.__getitem__(0)[0]))

### Step 2.2: Define the Model Mutations

A base model is only one concrete model, not a model space. NNI provides APIs and primitives for expressing how the base model can be mutated, which turns it into a model space containing many models. The example below uses the inline mutation APIs.

In [2]:
import nni.retiarii.nn.pytorch as nn
import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self, input_size):
        super().__init__()

        self.hidden_dim1 = nn.ValueChoice(
            [16, 32, 64, 128, 256, 512, 1024], label='hidden_dim1')
        self.hidden_dim2 = nn.ValueChoice(
            [16, 32, 64, 128, 256, 512, 1024], label='hidden_dim2')

        self.fc1 = nn.Linear(input_size, self.hidden_dim1)
        self.bn1 = nn.BatchNorm1d(self.hidden_dim1)
        self.dropout1 = nn.Dropout(nn.ValueChoice([0.0, 0.25, 0.5]))

        self.fc2 = nn.Linear(self.hidden_dim1, self.hidden_dim2)
        self.bn2 = nn.BatchNorm1d(self.hidden_dim2)
        self.dropout2 = nn.Dropout(nn.ValueChoice([0.0, 0.25, 0.5]))

        self.fc3 = nn.Linear(self.hidden_dim2, 2)

    def forward(self, x):

        x = self.dropout1(F.relu(self.bn1(self.fc1(x))))
        x = self.dropout2(F.relu(self.bn2(self.fc2(x))))
        x = F.sigmoid(self.fc3(x))
        return x

model_space = Net(len(train_dataset.__getitem__(0)[0]))

Besides inline mutations, Retiarii also provides ``mutator``, a more general way to express complex model spaces.
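To get a feel for how large the inline-mutation space in ``Net`` is, the sketch below enumerates every combination of the four choices with the standard library only. It assumes that the two unlabelled dropout choices are sampled independently of each other; the keys ``dropout1`` and ``dropout2`` are illustrative names, not labels from the source.

```python
from itertools import product

hidden_dims = [16, 32, 64, 128, 256, 512, 1024]
dropout_rates = [0.0, 0.25, 0.5]

# One entry per independent choice in the model space.
choices = {
    "hidden_dim1": hidden_dims,
    "hidden_dim2": hidden_dims,
    "dropout1": dropout_rates,  # illustrative key; this choice has no label in the source
    "dropout2": dropout_rates,  # illustrative key; this choice has no label in the source
}

# Cartesian product of all choices: every concrete architecture in the space.
candidates = [dict(zip(choices, combo)) for combo in product(*choices.values())]
print(len(candidates))
print(candidates[0])
```

Under that assumption the space holds 7 x 7 x 3 x 3 = 441 concrete models, so an experiment limited to a few dozen trials samples only a small fraction of it.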

## Step 3: Explore the Defined Model Space

In the NAS process, the search strategy repeatedly generates new models, and the model evaluator trains and validates each generated model. The measured performance of each model is sent back to the search strategy, which uses it to generate better models.

Users can choose a proper search strategy to explore the model space, and use a chosen or user-defined model evaluator to evaluate the performance of each sampled model.
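The interplay between strategy and evaluator can be sketched in plain Python. This is not NNI code and not TPE: the hypothetical ``random_search`` proposes candidates uniformly at random and ignores past scores, and ``toy_evaluate`` is a made-up scoring function that stands in for real training and validation.

```python
import random
from typing import Callable, Dict, List, Tuple

SearchSpace = Dict[str, List[float]]
Sample = Dict[str, float]


def random_search(space: SearchSpace,
                  evaluate: Callable[[Sample], float],
                  max_trials: int,
                  seed: int = 0) -> Tuple[Sample, float]:
    """Propose a sample, score it, remember the best one; repeat."""
    rng = random.Random(seed)
    best_sample, best_score = None, float("-inf")
    for _ in range(max_trials):
        # Strategy step: propose a new candidate (uniformly at random here).
        sample = {name: rng.choice(values) for name, values in space.items()}
        # Evaluator step: score the candidate (a real evaluator would train and validate it).
        score = evaluate(sample)
        if score > best_score:
            best_sample, best_score = sample, score
    return best_sample, best_score


def toy_evaluate(sample: Sample) -> float:
    # Made-up score that peaks at hidden_dim=128 and dropout=0.25; no training happens.
    return -abs(sample["hidden_dim"] - 128) / 1024 - abs(sample["dropout"] - 0.25)


space = {"hidden_dim": [16, 32, 64, 128, 256, 512, 1024], "dropout": [0.0, 0.25, 0.5]}
best, score = random_search(space, toy_evaluate, max_trials=20)
print(best, score)
```

A feedback-driven strategy differs only in the proposal step: it would use the scores collected so far to bias which candidate is proposed next.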

### Step 3.1: Choose a Search Strategy

In [3]:
import nni.retiarii.strategy as strategy

simple_strategy = strategy.TPEStrategy()

### Step 3.2: Choose or Write a Model Evaluator

For PyTorch, Retiarii provides two built-in model evaluators designed for simple use cases: classification and regression. Both are built on the PyTorch-Lightning library.

In [4]:
import nni.retiarii.evaluator.pytorch.lightning as pl

trainer = pl.Classification(
    train_dataloader=pl.DataLoader(train_dataset, batch_size=16),
    val_dataloaders=pl.DataLoader(test_dataset, batch_size=16),
    max_epochs=20,
)

[2021-06-08 09:56:10] INFO (lightning/MainThread) GPU available: True, used: False
[2021-06-08 09:56:10] INFO (lightning/MainThread) TPU available: None, using: 0 TPU cores

## Step 4: Configure the Experiment

With everything above in place, it is time to configure an experiment for the model search. A basic experiment configuration is shown below; see [this page](https://nni.readthedocs.io/en/stable/reference/experiment_config.html) for the advanced configuration reference.

In [5]:
from nni.retiarii.experiment.pytorch import RetiariiExeConfig, RetiariiExperiment

exp = RetiariiExperiment(model_space, trainer, [], simple_strategy)

exp_config = RetiariiExeConfig('aml')
exp_config.experiment_name = 'titanic_example'
exp_config.trial_concurrency = 2
exp_config.max_trial_number = 20
exp_config.max_experiment_duration = '2h'
exp_config.nni_manager_ip = '' # your nni_manager_ip

Before running experiments on the AML (Azure Machine Learning) training service, you need to set up the corresponding environment (refer to the [AML mode doc](https://nni.readthedocs.io/en/stable/TrainingService/AMLMode.html)) and configure the following additional fields:

In [None]:
# Authenticate to your Azure subscription from the CLI.
# Skip this cell if you are already logged in.
!az login

In [7]:
exp_config.training_service.subscription_id = '' # your subscription id
exp_config.training_service.resource_group = '' # your resource group
exp_config.training_service.workspace_name = '' # your workspace name
exp_config.training_service.compute_target = '' # your compute target
exp_config.training_service.docker_image = 'msranni/nni:latest'  # your docker image
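Since the placeholders above are empty strings, it can help to check that they have all been filled in before launching. The sketch below is a hypothetical helper, not part of NNI: ``missing_fields`` and ``REQUIRED_FIELDS`` are names invented here, and the code assumes the fields are plain attributes, as the assignments above suggest. A ``SimpleNamespace`` with placeholder values stands in for ``exp_config.training_service`` so that the example runs on its own.

```python
from types import SimpleNamespace
from typing import List

# Field names taken from the assignments in the cell above.
REQUIRED_FIELDS = [
    "subscription_id",
    "resource_group",
    "workspace_name",
    "compute_target",
    "docker_image",
]


def missing_fields(training_service: object, required: List[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the names of required attributes that are absent or empty."""
    return [name for name in required if not getattr(training_service, name, None)]


# Stand-in object for illustration; in the notebook you would pass exp_config.training_service.
example = SimpleNamespace(
    subscription_id="",
    resource_group="my-resource-group",  # placeholder value
    workspace_name="my-workspace",       # placeholder value
    compute_target="",
    docker_image="msranni/nni:latest",
)
print(missing_fields(example))
```

An empty result means every field has a value; it does not verify that the values are valid for your Azure subscription.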

## Step 5: Run and View the Experiment

You can now launch the experiment. The second argument of ``exp.run`` is the port on which NNI serves its web UI, where you can view the experiment results and perform more advanced analysis.

In [8]:
exp.run(exp_config, 8745)

[2021-06-08 09:56:54] INFO (nni.experiment/MainThread) Creating experiment, Experiment ID: 46den9qr
[2021-06-08 09:56:55] INFO (nni.experiment/MainThread) Connecting IPC pipe...
[2021-06-08 09:56:58] INFO (nni.experiment/MainThread) Starting web server...
[2021-06-08 09:57:00] INFO (nni.experiment/MainThread) Setting up...
[2021-06-08 09:57:05] INFO (nni.runtime.msg_dispatcher_base/Thread-8) Dispatcher started
[2021-06-08 09:57:05] INFO (nni.retiarii.experiment.pytorch/MainThread) Web UI URLs: http://127.0.0.1:8745
[2021-06-08 09:57:05] INFO (nni.retiarii.experiment.pytorch/MainThread) Start strategy...
[2021-06-08 09:57:05] INFO (nni.retiarii.strategy.tpe_strategy/MainThread) TPE strategy has been started.
[2021-06-08 09:57:05] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.001999 seconds
[2021-06-08 09:57:05] INFO (hyperopt.tpe/MainThread) TPE using 0 trials
[2021-06-08 09:57:10] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002029 seconds
[2021-06-08 09:57:10] INFO (hyper

[2021-06-08 10:03:55] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:03:55] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:03:56] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.000000 seconds
[2021-06-08 10:03:56] INFO (hyperopt.tpe/MainThread) TPE using 1/1 trials with best loss 0.795455
[2021-06-08 10:04:46] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:04:46] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:04:46] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.000000 seconds
[2021-06-08 10:04:46] INFO (hyperopt.tpe/MainThread) TPE using 2/2 trials with best loss 0.795455
[2021-06-08 10:04:51] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:04:51] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:04:52] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.000000 seconds
[2021-06-08 10:04:52] INFO (hyperopt.tpe/MainThread) TPE using 3/3 trials with best loss 0.795455
[2021-06-08 10:05:46] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:05:46] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:05:48] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002999 seconds
[2021-06-08 10:05:48] INFO (hyperopt.tpe/MainThread) TPE using 4/4 trials with best loss 0.791667
[2021-06-08 10:05:56] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:05:56] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:05:56] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.000000 seconds
[2021-06-08 10:05:56] INFO (hyperopt.tpe/MainThread) TPE using 5/5 trials with best loss 0.791667
[2021-06-08 10:06:26] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:06:26] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:06:27] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.004991 seconds
[2021-06-08 10:06:27] INFO (hyperopt.tpe/MainThread) TPE using 6/6 trials with best loss 0.791667
[2021-06-08 10:07:06] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:07:06] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:07:07] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.006043 seconds
[2021-06-08 10:07:07] INFO (hyperopt.tpe/MainThread) TPE using 7/7 trials with best loss 0.784091
[2021-06-08 10:07:56] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:07:56] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:07:57] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.006004 seconds
[2021-06-08 10:07:57] INFO (hyperopt.tpe/MainThread) TPE using 8/8 trials with best loss 0.731061
[2021-06-08 10:08:01] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:08:01] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:08:01] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.005000 seconds
[2021-06-08 10:08:01] INFO (hyperopt.tpe/MainThread) TPE using 9/9 trials with best loss 0.731061
[2021-06-08 10:08:56] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:08:56] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:08:58] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.004962 seconds
[2021-06-08 10:08:58] INFO (hyperopt.tpe/MainThread) TPE using 10/10 trials with best loss 0.731061
[2021-06-08 10:09:01] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:09:01] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:09:03] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.003043 seconds
[2021-06-08 10:09:03] INFO (hyperopt.tpe/MainThread) TPE using 11/11 trials with best loss 0.731061
[2021-06-08 10:10:27] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:10:27] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:10:28] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002005 seconds
[2021-06-08 10:10:28] INFO (hyperopt.tpe/MainThread) TPE using 12/12 trials with best loss 0.731061
[2021-06-08 10:10:52] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:10:52] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:10:53] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.122046 seconds
[2021-06-08 10:10:53] INFO (hyperopt.tpe/MainThread) TPE using 13/13 trials with best loss 0.731061
[2021-06-08 10:14:52] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:14:52] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:14:53] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002038 seconds
[2021-06-08 10:14:53] INFO (hyperopt.tpe/MainThread) TPE using 14/14 trials with best loss 0.731061
[2021-06-08 10:14:57] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:14:57] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:14:58] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.005870 seconds
[2021-06-08 10:14:58] INFO (hyperopt.tpe/MainThread) TPE using 15/15 trials with best loss 0.731061
[2021-06-08 10:16:07] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:16:07] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:16:08] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.004999 seconds
[2021-06-08 10:16:08] INFO (hyperopt.tpe/MainThread) TPE using 16/16 trials with best loss 0.712121
[2021-06-08 10:16:48] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:16:48] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:16:48] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002000 seconds
[2021-06-08 10:16:48] INFO (hyperopt.tpe/MainThread) TPE using 17/17 trials with best loss 0.712121
[2021-06-08 10:16:53] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:16:53] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:16:55] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002010 seconds
[2021-06-08 10:16:55] INFO (hyperopt.tpe/MainThread) TPE using 18/18 trials with best loss 0.712121
[2021-06-08 10:17:43] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:17:43] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:17:44] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.006001 seconds
[2021-06-08 10:17:44] INFO (hyperopt.tpe/MainThread) TPE using 19/19 trials with best loss 0.712121
[2021-06-08 10:18:03] INFO (lightning/Thread-5) GPU available: True, used: False
[2021-06-08 10:18:03] INFO (lightning/Thread-5) TPU available: None, using: 0 TPU cores
[2021-06-08 10:18:04] INFO (hyperopt.tpe/MainThread) tpe_transform took 0.002009 seconds
[2021-06-08 10:18:04] INFO (hyperopt.tpe/MainThread) TPE using 20/20 trials with best loss 0.712121
[2021-06-08 10:18:12] INFO (nni.retiarii.experiment.pytorch/Thread-9) Stopping experiment, please wait...
[2021-06-08 10:18:14] INFO (nni.runtime.msg_dispatcher_base/Thread-8) Dispatcher exiting...
[2021-06-08 10:18:14] INFO (nni.retiarii.experiment.pytorch/MainThread) Strategy exit
[2021-06-08 10:18:14] INFO (nni.retiarii.experiment.pytorch/MainThread) Waiting for experiment to become DONE (you can ctrl+c if there is no running trial jobs)...
[2021-06-08 10:18:15] INFO (nni.retiarii.experiment.pytorch/Thread-9) Experiment stopped
[2021-06-08 10:18:16] INFO (nni.runtime.msg_dispatcher_base/Thread-8) Dispatcher terminiated
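The log above is long, and the lines that track search progress are the ``TPE using N/N trials with best loss ...`` entries. As a small sketch, the hypothetical helper ``best_loss_history`` below extracts them with a regular expression. It is not an NNI utility, and the sample text is a handful of lines copied from the output above.

```python
import re
from typing import List, Tuple

# Matches e.g. "TPE using 4/4 trials with best loss 0.791667".
BEST_LOSS = re.compile(r"TPE using (\d+)/\d+ trials with best loss ([0-9.]+)")


def best_loss_history(log_text: str) -> List[Tuple[int, float]]:
    """Extract (finished_trials, best_loss) pairs from hyperopt TPE log lines."""
    return [(int(n), float(loss)) for n, loss in BEST_LOSS.findall(log_text)]


sample_log = """\
[2021-06-08 10:03:56] INFO (hyperopt.tpe/MainThread) TPE using 1/1 trials with best loss 0.795455
[2021-06-08 10:05:48] INFO (hyperopt.tpe/MainThread) TPE using 4/4 trials with best loss 0.791667
[2021-06-08 10:07:57] INFO (hyperopt.tpe/MainThread) TPE using 8/8 trials with best loss 0.731061
[2021-06-08 10:16:08] INFO (hyperopt.tpe/MainThread) TPE using 16/16 trials with best loss 0.712121
"""

history = best_loss_history(sample_log)
print(history)
```

Applied to the full log, the same helper yields one pair per finished trial, which is enough to plot how the value reported by hyperopt changes over the 20 trials.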


## Step 6: Export the Top Model

The code of the top-performing model can be exported with ``exp.export_top_models()``.

In [9]:
print('Final model:')
for model_code in exp.export_top_models():
    print(model_code)

Final model:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import nni.retiarii.nn.pytorch

import torch


class _model(nn.Module):
    def __init__(self):
        super().__init__()
        self.__fc1 = torch.nn.modules.linear.Linear(in_features=9, out_features=512)
        self.__bn1 = torch.nn.modules.batchnorm.BatchNorm1d(num_features=512)
        self.__dropout1 = torch.nn.modules.dropout.Dropout(p=0.0)
        self.__fc2 = torch.nn.modules.linear.Linear(in_features=512, out_features=128)
        self.__bn2 = torch.nn.modules.batchnorm.BatchNorm1d(num_features=128)
        self.__dropout2 = torch.nn.modules.dropout.Dropout(p=0.25)
        self.__fc3 = torch.nn.modules.linear.Linear(in_features=128, out_features=2)

    def forward(self, x__1):
        __Constant3 = False
        __fc1 = self.__fc1(x__1)
        __bn1 = self.__bn1(__fc1)
        __relu7 = F.relu(__bn1, __Constant3)
        __dropout1 = self.__dropout1(__relu7)
      