# Tutorial 9: Neural Networks

<img src="../../imgs/lightautoml_logo_color.png" alt="LightAutoML logo" style="width:100%;"/>

The official LightAutoML GitHub repository is [here](https://github.com/AILab-MLTools/LightAutoML).


In this tutorial you will learn how to:
* train neural networks (nn) with LightAutoML on tabular data
* customize model architecture and pipelines

## 0. Prerequisites

### 0.0 Install LightAutoML

In [1]:
# !pip install -U lightautoml[all]

### 0.1 Import libraries

Here we will import the libraries we use in this kernel:
- Standard python libraries for timing, working with OS etc.
- Essential python DS libraries like numpy, pandas, scikit-learn and torch (the last we will use in the next cell)
- LightAutoML modules: presets for AutoML, task and report generation module

In [2]:
# Standard python libraries
import os

# Essential DS libraries
import optuna
import requests
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
import torch
from copy import deepcopy as copy
import torch.nn as nn
from collections import OrderedDict

# LightAutoML presets, task and report generation
from lightautoml.automl.presets.tabular_presets import TabularAutoML
from lightautoml.tasks import Task

### 0.2 Constants

Here we set up the constants used in the kernel:
- `N_THREADS` - number of vCPUs for LightAutoML model creation
- `N_FOLDS` - number of folds in LightAutoML inner CV
- `RANDOM_STATE` - random seed for better reproducibility
- `TEST_SIZE` - holdout data part size
- `TIMEOUT` - time limit in seconds for model training
- `TARGET_NAME` - target column name in dataset

In [3]:
N_THREADS = 4
N_FOLDS = 5
RANDOM_STATE = 42
TEST_SIZE = 0.2
TIMEOUT = 300
TARGET_NAME = 'TARGET'

np.random.seed(RANDOM_STATE)
torch.set_num_threads(N_THREADS)

### 0.3 Data loading

In [4]:
DATASET_DIR = '../data/'
DATASET_NAME = 'sampled_app_train.csv'
DATASET_FULLNAME = os.path.join(DATASET_DIR, DATASET_NAME)
DATASET_URL = 'https://raw.githubusercontent.com/AILab-MLTools/LightAutoML/master/examples/data/sampled_app_train.csv'

if not os.path.exists(DATASET_FULLNAME):
    os.makedirs(DATASET_DIR, exist_ok=True)

    dataset = requests.get(DATASET_URL).text
    with open(DATASET_FULLNAME, 'w') as output:
        output.write(dataset)

data = pd.read_csv(DATASET_FULLNAME)
data.head()

tr_data, te_data = train_test_split(
    data, 
    test_size=TEST_SIZE, 
    stratify=data[TARGET_NAME], 
    random_state=RANDOM_STATE
)

## 1. Available built-in models

To use a different model, pass its name in the `"use_algos"` list. Custom models inherited from the `torch.nn.Module` class are supported as well. The parameters of every built-in model are listed below.

### 1.1 MLP (`"mlp"`)
- `hidden_size` - define hidden layer dimensions

### 1.2 Dense Light (`"denselight"`)
<img src="../../imgs/denselight.png" style="width:25%;"/>

- `hidden_size` - define hidden layer dimensions

### 1.3 Dense (`"dense"`)
<img src="../../imgs/densenet.png" style="width:60%;"/>

- `block_config` - number of blocks and number of layers within each block
- `compression` - share of neurons to drop after each `DenseBlock`
- `growth_size` - output dimension of every `DenseLayer`
- `bn_factor` - factor by which the intermediate fully connected layer inside each layer is widened
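
To make the interplay of `block_config`, `growth_size` and `compression` more concrete, here is a rough width calculation. It is only a sketch under DenseNet-style assumptions (each layer's output is concatenated to its input, and a transition after each block keeps a `1 - compression` share of the features); the actual LightAutoML implementation may differ in details, and `dense_widths` is a hypothetical helper, not part of the library.

```python
def dense_widths(n_in, block_config, growth_size, compression):
    """Feature width after each block, under DenseNet-style assumptions."""
    widths = []
    width = n_in
    for n_layers in block_config:
        # Assumption: every layer appends `growth_size` features to its input.
        width += n_layers * growth_size
        # Assumption: the transition drops a `compression` share of the features.
        width = int(width * (1 - compression))
        widths.append(width)
    return widths


print(dense_widths(n_in=100, block_config=[2, 2], growth_size=256, compression=0.5))
# [306, 409]
```

Under these assumptions, a larger `growth_size` widens every block, while a larger `compression` shrinks what is passed on to the next one.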

### 1.4 Resnet (`"resnet"`)
<img src="../../imgs/resnet.png" style="width:50%;"/>

- `hid_factor` - factor by which the intermediate fully connected layer inside each layer is widened

### 1.5 SNN (`"snn"`)
- `hidden_size` - define hidden layer dimensions

### 1.6 NODE (`"node"`)
<img src="../../imgs/node.png" style="width:80%;"/>

### 1.7 AutoInt (`"autoint"`)
<img src="../../imgs/autoint.png" style="width:80%;"/>

### 1.8 FTTransformer (`"fttransformer"`)
<img src="../../imgs/fttransformer.png" style="width:80%;"/>

- `pooling` - Pooling used for the last step.
- `n_out` - Output dimension, 1 for binary prediction.
- `embedding_size` - Embeddings size.
- `depth` - Number of Attention Blocks inside Transformer.
- `heads` - Number of heads in Attention.
- `attn_dropout` - Post-Attention dropout.
- `ff_dropout` - Feed-Forward Dropout.
- `dim_head` - Attention head dimension.
- `return_attn` - Return attention scores or not.
- `num_enc_layers` - Number of Transformer layers.
- `device` - Device to compute on.


## 2. Example of usage
### 2.1 Task definition

In [5]:
task = Task('binary')
roles = {
    'target': TARGET_NAME,
    'drop': ['SK_ID_CURR']
}

### 2.2 LightAutoML model creation - TabularAutoML preset with neural network

In the next cell we create a LightAutoML model with the `TabularAutoML` class in just several lines. Let's discuss the params we can set up:
- `task` - the type of the ML task (the only **must have** parameter)
- `timeout` - time limit in seconds for model to train
- `cpu_limit` - vCPU count for model to use
- `nn_params` - network and training params, for example, `"hidden_size"`, `"batch_size"`, `"lr"`, etc.
- `nn_pipeline_params` - data preprocessing params that affect how data is fed to the model: embeddings or target encoding for categorical columns, standard scaler or quantile transformer for numerical columns
- `reader_params` - params of the Reader object inside the preset, which handles the first step of data preparation: automatic feature typing, preliminary filtering of almost-constant features, correct CV setup, etc.

In [6]:
automl = TabularAutoML(
    task = task, 
    timeout = TIMEOUT,
    cpu_limit = N_THREADS,
    general_params = {"use_algos": [["mlp"]]}, # ['nn', 'mlp', 'dense', 'denselight', 'resnet', 'snn', 'node', 'autoint', 'fttransformer'] or custom torch model
    nn_params = {"n_epochs": 10, "bs": 512, "num_workers": 0, "path_to_save": None, "freeze_defaults": True},
    nn_pipeline_params = {"use_qnt": True, "use_te": False},
    reader_params = {'n_jobs': N_THREADS, 'cv': N_FOLDS, 'random_state': RANDOM_STATE}
)

### 2.3 AutoML training

To run AutoML training, use the `fit_predict` method with the following arguments:

- `train_data` - dataset to train on.
- `roles` - roles dict.
- `verbose` - controls the verbosity: the higher, the more messages.
    - `<1`: messages are not displayed;
    - `>=1`: the computation process for layers is displayed;
    - `>=2`: the information about folds processing is also displayed;
    - `>=3`: the hyperparameters optimization process is also displayed;
    - `>=4`: the training process for every algorithm is displayed.

Note: out-of-fold predictions are calculated during training and returned by the `fit_predict` method.
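
The idea of out-of-fold (OOF) prediction can be sketched in plain Python. This is not how LightAutoML is implemented internally; the "model" here is simply the target mean of the training folds, and `oof_predictions` is a hypothetical helper used only to show that every row is predicted by a model that never saw it.

```python
def oof_predictions(targets, n_folds):
    """Out-of-fold predictions with a trivial 'mean of the training folds' model."""
    n = len(targets)
    oof = [None] * n
    for fold in range(n_folds):
        # Rows whose index falls into this fold are held out.
        val_idx = [i for i in range(n) if i % n_folds == fold]
        train = [targets[i] for i in range(n) if i % n_folds != fold]
        fold_model = sum(train) / len(train)  # "fit" on the remaining folds
        for i in val_idx:
            oof[i] = fold_model  # predict only rows the model has not seen
    return oof


targets = [0, 0, 1, 0, 1, 0]
oof = oof_predictions(targets, n_folds=3)
print(oof)  # [0.5, 0.25, 0.25, 0.5, 0.25, 0.25]
```

Because each prediction comes from a model trained without that row, scoring the OOF vector gives an estimate of generalization quality without touching the holdout set.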

In [7]:
%%time 
oof_pred = automl.fit_predict(tr_data, roles = roles, verbose = 1)

[14:56:04] Stdout logging level is INFO.
[14:56:04] Copying TaskTimer may affect the parent PipelineTimer, so copy will create new unlimited TaskTimer
[14:56:04] Task: binary

[14:56:04] Start automl preset with listed constraints:
[14:56:04] - time: 300.00 seconds
[14:56:04] - CPU: 4 cores
[14:56:04] - memory: 16 GB

[14:56:04] Train data shape: (8000, 122)

[14:56:08] Layer 1 train process start. Time left 296.45 secs
[14:56:08] Start fitting Lvl_0_Pipe_0_Mod_0_TorchNN_mlp_0 ...
[14:56:15] Fitting Lvl_0_Pipe_0_Mod_0_TorchNN_mlp_0 finished. score = 0.6035621265821923
[14:56:15] Lvl_0_Pipe_0_Mod_0_TorchNN_mlp_0 fitting and predicting completed
[14:56:15] Time left 289.10 secs

[14:56:15] Layer 1 training completed.

[14:56:15] Automl preset training completed in 10.90 seconds

[14:56:15] Model description:
Final prediction for new objects (level 0) = 
	 1.00000 * (5 averaged models Lvl_0_Pipe_0_Mod_0_TorchNN_mlp_0) 

CPU t

### 2.4 Prediction on holdout and model evaluation

In [8]:
%%time

te_pred = automl.predict(te_data)
print(f'Prediction for te_data:\n{te_pred}\nShape = {te_pred.shape}')

Prediction for te_data:
array([[0.09815434],
       [0.08660936],
       [0.060364  ],
       ...,
       [0.09103375],
       [0.05593849],
       [0.09817966]], dtype=float32)
Shape = (2000, 1)
CPU times: user 1.39 s, sys: 59.4 ms, total: 1.45 s
Wall time: 1.35 s


In [9]:
print(f'OOF score: {roc_auc_score(tr_data[TARGET_NAME].values, oof_pred.data[:, 0])}')
print(f'HOLDOUT score: {roc_auc_score(te_data[TARGET_NAME].values, te_pred.data[:, 0])}')

OOF score: 0.6035621265821923
HOLDOUT score: 0.5970482336956522
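
For intuition about the reported scores: ROC AUC is the probability that a randomly chosen positive row gets a higher prediction than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by brute force over all pairs; `roc_auc` is an illustrative helper, and `roc_auc_score` from scikit-learn remains the function to use in practice.

```python
def roc_auc(y_true, y_score):
    """Pairwise definition of ROC AUC (quadratic time, for illustration only)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

A score of 0.5 corresponds to random ranking, so the values around 0.6 above indicate a weak but non-trivial model.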


You can obtain the description of the resulting pipeline:

In [10]:
print(automl.create_model_str_desc())

Final prediction for new objects (level 0) = 
	 1.00000 * (5 averaged models Lvl_0_Pipe_0_Mod_0_TorchNN_mlp_0) 


## 3. Main training loop and pipeline params

### 3.1 Training loop params

<img src="../../imgs/swa.png" style="width:70%;"/>

- `bs` - batch size
- `snap_params` - params of early stopping and checkpoint averaging, i.e. stochastic weight averaging (SWA)
- `opt` - optimizer
- `opt_params` - optimizer params
- `clip_grad` - use gradient clipping for regularization
- `clip_grad_params` - gradient clipping params
- `emb_dropout` - embedding dropout for categorical columns

These params are passed in `nn_params` as well.
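
The idea behind checkpoint averaging can be sketched as follows. This is a plain-Python illustration, not the LightAutoML training loop: weights are represented as lists of floats, and `average_checkpoints` is a hypothetical helper.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of several weight snapshots (name -> list of floats)."""
    n = len(checkpoints)
    averaged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all snapshots.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(values) / n for values in columns]
    return averaged


snapshots = [
    {"dense1.weight": [0.0, 2.0], "dense1.bias": [1.0]},
    {"dense1.weight": [1.0, 4.0], "dense1.bias": [3.0]},
]
swa_weights = average_checkpoints(snapshots)
print(swa_weights)  # {'dense1.weight': [0.5, 3.0], 'dense1.bias': [2.0]}
```

Averaging several good snapshots tends to land in a flatter region of the loss surface than any single snapshot, which is the motivation for SWA.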

### 3.2 Pipeline params

Transformation for numerical columns

- `use_qnt` - uses quantile transformation for numerical columns
- `output_distribution` - type of feature distribution after the quantile transformer
- `n_quantiles` - number of quantiles used to build the feature distribution
- `qnt_factor` - decreases `n_quantiles` depending on the train data shape
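
As a rough illustration of what a quantile transformation with a uniform output distribution does, the sketch below maps each value to the share of training values that do not exceed it. It is a simplification: a real quantile transformer uses `n_quantiles` reference points and interpolates between them, while this hypothetical `fit_ecdf` helper uses every training value.

```python
from bisect import bisect_right


def fit_ecdf(train_values):
    """Return a function mapping a value to its empirical CDF on the train data."""
    ordered = sorted(train_values)

    def transform(x):
        # Share of training values that are <= x, always within [0, 1].
        return bisect_right(ordered, x) / len(ordered)

    return transform


to_uniform = fit_ecdf([1.0, 2.0, 2.5, 10.0, 1000.0])
print([to_uniform(v) for v in (1.0, 2.5, 1000.0, 5000.0)])
# [0.2, 0.6, 1.0, 1.0]
```

Note how the outlier `1000.0` and the unseen value `5000.0` both end up at `1.0`: the transformation removes the influence of scale and outliers, which usually helps neural networks.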

Transformation for categorical columns

- `use_te` - uses target encoding
- `top_intersections` - number of intersections of cat columns to use
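
Target encoding replaces each category with a statistic of the target for that category. The sketch below is a simplified illustration, not the LightAutoML encoder: `fit_target_encoding` and its smoothing strength `alpha` are hypothetical, and production encoders are typically computed out-of-fold to avoid target leakage.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, alpha=1.0):
    """Map each category to a smoothed mean of the target.

    `alpha` (hypothetical) pulls rare categories toward the global target mean.
    """
    prior = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1
    return {
        cat: (sums[cat] + alpha * prior) / (counts[cat] + alpha)
        for cat in counts
    }


cats = ["a", "a", "b", "b", "b", "c"]
y = [1, 0, 0, 0, 0, 1]
plain = fit_target_encoding(cats, y, alpha=0.0)     # plain per-category means
smoothed = fit_target_encoding(cats, y, alpha=1.0)  # shrunk toward the global mean
print(plain)  # {'a': 0.5, 'b': 0.0, 'c': 1.0}
```

With smoothing, the single-row category `"c"` no longer gets the extreme value `1.0`, which makes the encoding less prone to overfitting.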

The full list of default parameters can be found here:
- [nn_params](../../lightautoml/automl/presets/tabular_config.yml)
- [nn_pipeline_params](../../lightautoml/automl/presets/tabular_config.yml)

## 4. More use cases

Let's store the default LightAutoML params in dicts to keep the following examples compact.

In [11]:
default_lama_params = {
    "task": task, 
    "timeout": TIMEOUT,
    "cpu_limit": N_THREADS,
    "reader_params": {'n_jobs': N_THREADS, 'cv': N_FOLDS, 'random_state': RANDOM_STATE}
}

default_nn_params = {
    "bs": 512, "num_workers": 0, "path_to_save": None, "n_epochs": 10, "freeze_defaults": True
}

### 4.1 Custom model

Consider a simple neural network that we want to train.

In [12]:
class SimpleNet(nn.Module):
    def __init__(
        self,
        n_in,
        n_out,
        hidden_size,
        drop_rate,
        **kwargs,  # kwargs is required: it absorbs parameters this model does not use
    ):
        super(SimpleNet, self).__init__()
        self.features = nn.Sequential(OrderedDict([]))

        self.features.add_module("norm", nn.BatchNorm1d(n_in))
        self.features.add_module("dense1", nn.Linear(n_in, hidden_size))
        self.features.add_module("act", nn.SiLU())
        self.features.add_module("dropout", nn.Dropout(p=drop_rate))
        self.features.add_module("dense2", nn.Linear(hidden_size, n_out))

    def forward(self, x):
        """
        Args:
            x: data after feature pipeline transformation
            (by default concatenation of columns)
        """
        for layer in self.features:
            x = layer(x)
        return x
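
As a quick sanity check on the size of `SimpleNet`, the number of trainable parameters can be computed by hand. The helper below is hypothetical and only mirrors the layers defined above: `BatchNorm1d` has a weight and a bias per feature, each `Linear` layer has a weight matrix plus a bias vector, and `SiLU` and `Dropout` have no parameters.

```python
def simple_net_param_count(n_in, hidden_size, n_out):
    """Trainable parameters of BatchNorm1d -> Linear -> SiLU -> Dropout -> Linear."""
    batch_norm = 2 * n_in                       # weight and bias per input feature
    dense1 = n_in * hidden_size + hidden_size   # weight matrix plus bias
    dense2 = hidden_size * n_out + n_out        # weight matrix plus bias
    return batch_norm + dense1 + dense2


print(simple_net_param_count(n_in=10, hidden_size=256, n_out=1))  # 3093
```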

In [13]:
automl = TabularAutoML(
    **default_lama_params,
    general_params={"use_algos": [[SimpleNet]]},
    nn_params={
        **default_nn_params,
        "hidden_size": 256,
        "drop_rate": 0.1
    },
)
automl.fit_predict(tr_data, roles=roles, verbose=1)


[14:56:17] Stdout logging level is INFO.
[14:56:17] Task: binary

[14:56:17] Start automl preset with listed constraints:
[14:56:17] - time: 300.00 seconds
[14:56:17] - CPU: 4 cores
[14:56:17] - memory: 16 GB

[14:56:17] Train data shape: (8000, 122)

[14:56:17] Layer 1 train process start. Time left 299.22 secs
[14:56:18] Start fitting Lvl_0_Pipe_0_Mod_0_TorchNN_0 ...
[14:56:23] Fitting Lvl_0_Pipe_0_Mod_0_TorchNN_0 finished. score = 0.70579837612218
[14:56:23] Lvl_0_Pipe_0_Mod_0_TorchNN_0 fitting and predicting completed
[14:56:23] Time left 293.15 secs

[14:56:23] Layer 1 training completed.

[14:56:23] Automl preset training completed in 6.86 seconds

[14:56:23] Model description:
Final prediction for new objects (level 0) = 
	 1.00000 * (5 averaged models Lvl_0_Pipe_0_Mod_0_TorchNN_0) 



array([[0.04888836],
       [0.02840128],
       [0.04246276],
       ...,
       [0.05778075],
       [0.17132443],
       [0.20606528]], dtype=float32)

#### 4.1.1 Define the pipeline by yourself

In [14]:
from typing import Any, Callable, Dict, Optional, Sequence, Union


class CatEmbedder(nn.Module):
    """Category data model.

    Args:
        cat_dims: Sequence with number of unique categories
            for category features
    """

    def __init__(
        self,
        cat_dims: Sequence[int],
        **kwargs
    ):
        super(CatEmbedder, self).__init__()
        emb_dims = [
            (int(x), 5)
            for x in cat_dims
        ]
        self.no_of_embs = sum([y for x, y in emb_dims])
        self.emb_layers = nn.ModuleList([nn.Embedding(x, y) for x, y in emb_dims])
    
    def get_out_shape(self) -> int:
        """Output shape.

        Returns:
            Int with module output shape.

        """
        return self.no_of_embs

    def forward(self, inp: Dict[str, torch.Tensor]) -> torch.Tensor:
        """Concat all categorical embeddings
        """
        output = torch.cat(
            [
                emb_layer(inp["cat"][:, i])
                for i, emb_layer in enumerate(self.emb_layers)
            ],
            dim=1,
        )
        return output


class ContEmbedder(nn.Module):
    """Numeric data model.

    Class for working with numeric data.

    Args:
        num_dims: Number of numeric features.
        **kwargs: Other parameters (ignored).

    """

    def __init__(self, num_dims: int,  **kwargs):
        super(ContEmbedder, self).__init__()
        self.n_out = num_dims
    
    def get_out_shape(self) -> int:
        """Output shape.

        Returns:
            int with module output shape.

        """
        return self.n_out
        
    def forward(self, inp: Dict[str, torch.Tensor]) -> torch.Tensor:
        """Forward-pass."""
        return (inp["cont"] - inp["cont"].mean(axis=0)) / (inp["cont"].std(axis=0) + 1e-6)
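
The `ContEmbedder.forward` pass above standardizes each numeric column within the current batch. The same arithmetic for a single column can be sketched in plain Python; `standardize_column` is a hypothetical helper, and `statistics.stdev` is used because it is the sample standard deviation (with `n - 1` in the denominator), which matches the default behaviour of `torch.std`.

```python
from statistics import mean, stdev


def standardize_column(values, eps=1e-6):
    """(x - mean) / (std + eps) for one column of a batch."""
    mu = mean(values)
    sigma = stdev(values)  # sample std, n - 1 in the denominator
    return [(v - mu) / (sigma + eps) for v in values]


scaled = standardize_column([1.0, 2.0, 3.0])
print(scaled)  # approximately [-1.0, 0.0, 1.0]
```

Because the statistics come from the batch itself, the output for a given row depends on the other rows in the batch, and a batch with a single row has no defined sample standard deviation.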

In [15]:
from lightautoml.text.nn_model import TorchUniversalModel

class SimpleNet_plus(TorchUniversalModel):
    """Mixed data model.

    Class for preparing input for DL model with mixed data.

    Args:
        n_out: Number of output dimensions.
        cont_params: Dict with numeric model params.
        cat_params: Dict with category model params.
        **kwargs: Loss, task and other parameters.

    """

    def __init__(
            self,
            n_out: int = 1,
            cont_params: Optional[Dict] = None,
            cat_params: Optional[Dict] = None,
            **kwargs,
    ):
        # init the parent class (we need some of its helper functions)
        super(SimpleNet_plus, self).__init__(**{
                **kwargs,
                "cont_params": cont_params,
                "cat_params": cat_params,
                "torch_model": None,  # the parent class does not need a model of its own
        })
        
        n_in = 0
        
        # add cont columns processing
        self.cont_embedder = ContEmbedder(**cont_params)
        n_in += self.cont_embedder.get_out_shape()
        
        # add cat columns processing
        self.cat_embedder = CatEmbedder(**cat_params)
        n_in += self.cat_embedder.get_out_shape()
        
        self.torch_model = SimpleNet(
                **{
                    **kwargs,
                    **{"n_in": n_in, "n_out": n_out},
                }
        )
    
    def get_logits(self, inp: Dict[str, torch.Tensor]) -> torch.Tensor:
        outputs = []
        outputs.append(self.cont_embedder(inp))
        outputs.append(self.cat_embedder(inp))
        
        if len(outputs) > 1:
            output = torch.cat(outputs, dim=1)
        else:
            output = outputs[0]
        
        logits = self.torch_model(output)
        return logits

In [16]:
automl = TabularAutoML(
    **default_lama_params,
    general_params={"use_algos": [[SimpleNet_plus]]},
    nn_params={
        **default_nn_params,
        "hidden_size": 256,
        "drop_rate": 0.1,
        "model_with_emb": True,
    },
    debug=True
)
automl.fit_predict(tr_data, roles = roles, verbose = 1)

[14:56:24] Stdout logging level is INFO.
[14:56:24] Task: binary

[14:56:24] Start automl preset with listed constraints:
[14:56:24] - time: 300.00 seconds
[14:56:24] - CPU: 4 cores
[14:56:24] - memory: 16 GB

[14:56:24] Train data shape: (8000, 122)

[14:56:24] Layer 1 train process start. Time left 299.21 secs
[14:56:25] Start fitting Lvl_0_Pipe_0_Mod_0_TorchNN_0 ...
[14:56:30] Fitting Lvl_0_Pipe_0_Mod_0_TorchNN_0 finished. score = 0.6600159152016962
[14:56:30] Lvl_0_Pipe_0_Mod_0_TorchNN_0 fitting and predicting completed
[14:56:30] Time left 293.19 secs

[14:56:30] Layer 1 training completed.

[14:56:30] Automl preset training completed in 6.82 seconds

[14:56:30] Model description:
Final prediction for new objects (level 0) = 
	 1.00000 * (5 averaged models Lvl_0_Pipe_0_Mod_0_TorchNN_0) 



array([[0.07509199],
       [0.06439159],
       [0.04291169],
       ...,
       [0.11671165],
       [0.2381251 ],
       [0.04382631]], dtype=float32)

### 4.2 Tuning network

You can try to optimize the metric with the help of Optuna. The following validation strategies are available:
- `fit_on_holdout = True` - holdout
- `fit_on_holdout = False` - cross-validation.

#### 4.2.1 Built-in models

Append `"_tuned"` to the model name to tune it.

In [17]:
automl = TabularAutoML(
    **default_lama_params,
    general_params={"use_algos": [["denselight_tuned"]]},
    nn_params={
        **default_nn_params,
        "n_epochs": 3,
        "tuning_params": {
            "max_tuning_iter": 5,
            "max_tuning_time": 100,
            "fit_on_holdout": True
        }
    },
)
automl.fit_predict(tr_data, roles = roles, verbose = 3)

[14:56:31] Stdout logging level is INFO3.
[14:56:31] Task: binary

[14:56:31] Start automl preset with listed constraints:
[14:56:31] - time: 300.00 seconds
[14:56:31] - CPU: 4 cores
[14:56:31] - memory: 16 GB

[14:56:31] Train data shape: (8000, 122)

[14:56:31] Feats was rejected during automatic roles guess: []
[14:56:31] Layer 1 train process start. Time left 299.23 secs
[14:56:31] Start hyperparameters optimization for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_denselight_tuned_0 ... Time budget is 100.00 secs
[14:56:32] Epoch: 0, train loss: 0.307483434677124, val loss: 0.2785775661468506, val metric: 0.6090575236140289
[14:56:32] Epoch: 1, train loss: 0.27614495158195496, val loss: 0.2799951434135437, val metric: 0.585088014710992
[14:56:32] Epoch: 2, train loss: 0.27499067783355713, val loss: 0.28620871901512146, val metric: 0.626435952125129
[14:56:32] Early stopping: val loss: 0.27636581659317017, val metric: 0.6215073421321315
[14:56:33] Trial 1 with hy

array([[0.07784943],
       [0.04554275],
       [0.05328501],
       ...,
       [0.07100379],
       [0.09577154],
       [0.07620702]], dtype=float32)

#### 4.2.2 Custom model

There is a special flag `tuned` that marks the model's parameters for optimization.

In [18]:
automl = TabularAutoML(
    **default_lama_params,
    general_params={"use_algos": [[SimpleNet]]},
    nn_params={
        **default_nn_params,
        "hidden_size": 256,
        "drop_rate": 0.1,
        
        "tuned": True,
        "tuning_params": {
            "max_tuning_iter": 5,
            "max_tuning_time": 100,
            "fit_on_holdout": True
        }
    },
)
automl.fit_predict(tr_data, roles = roles, verbose = 2)

[14:56:43] Stdout logging level is INFO2.
[14:56:43] Task: binary

[14:56:43] Start automl preset with listed constraints:
[14:56:43] - time: 300.00 seconds
[14:56:43] - CPU: 4 cores
[14:56:43] - memory: 16 GB

[14:56:43] Train data shape: (8000, 122)

[14:56:43] Layer 1 train process start. Time left 299.22 secs
[14:56:43] Start hyperparameters optimization for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 ... Time budget is 100.00 secs


Optimization Progress: 100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:12<00:00,  2.42s/it, best_trial=0, best_value=0.767]

[14:56:56] Hyperparameters optimization for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 completed
[14:56:56] The set of hyperparameters {'bs': 128, 'weight_decay_bin': 0, 'lr': 0.029154431891537533}
 achieve 0.7667 auc
[14:56:56] Start fitting Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 ...
[14:56:56] ===== Start working with fold 0 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 =====
[14:56:58] ===== Start working with fold 1 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 =====
[14:57:01] ===== Start working with fold 2 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 =====
[14:57:04] ===== Start working with fold 3 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 =====
[14:57:06] ===== Start working with fold 4 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 =====
[14:57:09] Fitting Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 finished. score = 0.7271980081974132
[14:57:09] Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 fitting and predicting completed
[14:57:09] Time left 273.65 secs

[14:57:09] Layer 1 training completed.

[14:57:09] Automl preset training completed in 26.35 seconds

[14:57:09] Model description:
Final prediction for new objects (level 0) = 
	 1.00000 * (5 averaged models Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0) 



array([[0.03879727],
       [0.02351108],
       [0.0386253 ],
       ...,
       [0.04145308],
       [0.182652  ],
       [0.28383675]], dtype=float32)

Sometimes we need to tune parameters that we define ourselves. For this purpose there is `optimization_search_space`, a function that describes the search space. See the example below.  
Here is the search space:  
- `bs` in `[64, 128, 256, 512, 1024]`
- `hidden_size` in `[64, 128, 256, 512, 1024]`
- `drop_rate` sampled uniformly from the interval `[0.0, 0.3]`


In [19]:
def my_opt_space(trial: optuna.trial.Trial, estimated_n_trials, suggested_params):
    '''
        Defines the hyperparameter search space for tuning.
    '''
    # start from the suggested defaults (optional)
    trial_values = copy(suggested_params)

    trial_values["bs"] = trial.suggest_categorical(
        "bs", [2 ** i for i in range(6, 11)]
    )
    trial_values["hidden_size"] = trial.suggest_categorical(
        "hidden_size", [2 ** i for i in range(6, 11)]
    )
    trial_values["drop_rate"] = trial.suggest_float(
        "drop_rate", 0.0, 0.3
    )
    return trial_values
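
To see what such a search-space function returns without running a full tuning session, it can be called with a stand-in for the Optuna trial. The sketch below is self-contained: `StubTrial` is a hypothetical class that only mimics the two `suggest_*` methods used here by sampling at random, and `search_space` repeats the logic of the function above.

```python
import random
from copy import deepcopy


class StubTrial:
    """Hypothetical stand-in for optuna.trial.Trial: samples uniformly at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def suggest_categorical(self, name, choices):
        return self.rng.choice(list(choices))

    def suggest_float(self, name, low, high):
        return self.rng.uniform(low, high)


def search_space(trial, estimated_n_trials, suggested_params):
    # Copy the defaults so the caller's dict is left untouched.
    trial_values = deepcopy(suggested_params)
    trial_values["bs"] = trial.suggest_categorical("bs", [2 ** i for i in range(6, 11)])
    trial_values["hidden_size"] = trial.suggest_categorical(
        "hidden_size", [2 ** i for i in range(6, 11)]
    )
    trial_values["drop_rate"] = trial.suggest_float("drop_rate", 0.0, 0.3)
    return trial_values


defaults = {"n_epochs": 3, "bs": 512}
sampled = search_space(StubTrial(seed=0), estimated_n_trials=5, suggested_params=defaults)
print(sorted(sampled))  # ['bs', 'drop_rate', 'hidden_size', 'n_epochs']
```

Parameters that the function does not override (here `n_epochs`) keep their suggested values, while the sampled ones replace the defaults for that trial.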

In [20]:
automl = TabularAutoML(
    **default_lama_params,
    general_params={"use_algos": [[SimpleNet]]},
    nn_params={
        **default_nn_params,
        "n_epochs": 3,
        "tuned": True,
        "tuning_params": {
            "max_tuning_iter": 5,
            "max_tuning_time": 3600,
            "fit_on_holdout": True
        },
        "optimization_search_space": my_opt_space,
    },
)
automl.fit_predict(tr_data, roles = roles, verbose = 3)

[14:57:09] Stdout logging level is INFO3.
[14:57:09] Task: binary

[14:57:09] Start automl preset with listed constraints:
[14:57:09] - time: 300.00 seconds
[14:57:09] - CPU: 4 cores
[14:57:09] - memory: 16 GB

[14:57:09] Train data shape: (8000, 122)

[14:57:10] Feats was rejected during automatic roles guess: []
[14:57:10] Layer 1 train process start. Time left 299.19 secs
[14:57:10] Start hyperparameters optimization for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_0 ... Time budget is 156.97 secs
[14:57:10] Epoch: 0, train loss: 0.27667880058288574, val loss: 0.2776942551136017, val metric: 0.6380251348418515
[14:57:10] Epoch: 1, train loss: 0.2685483694076538, val loss: 0.2682102620601654, val metric: 0.6962383266246506
[14:57:11] Epoch: 2, train loss: 0.2586376965045929, val loss: 0.259479820728302, val metric: 0.741352213865324
[14:57:11] Early stopping: val loss: 0.2688170075416565, val metric: 0.7005254689396005
[14:57:11] Trial 1 with hyperparameters {'bs'

array([[0.04496425],
       [0.03032025],
       [0.03665409],
       ...,
       [0.05365612],
       [0.16432838],
       [0.1691863 ]], dtype=float32)

#### 4.2.3 One more example: tuning NODE params

In [21]:
TIMEOUT = 3000

In [22]:
default_lama_params = {
    "task": task, 
    "timeout": TIMEOUT,
    "cpu_limit": N_THREADS,
    "reader_params": {'n_jobs': N_THREADS, 'cv': N_FOLDS, 'random_state': RANDOM_STATE}
}

default_nn_params = {
    "bs": 512, "num_workers": 0, "path_to_save": None, "n_epochs": 10, "freeze_defaults": True
}

In [23]:
def my_opt_space_NODE(trial: optuna.trial.Trial, estimated_n_trials, suggested_params):
    '''
        Defines the hyperparameter search space for NODE tuning.
    '''
    # start from the suggested defaults (optional)
    trial_values = copy(suggested_params)

    trial_values["layer_dim"] = trial.suggest_categorical(
        "layer_dim", [2 ** i for i in range(8, 10)]
    )
    trial_values["use_original_head"] = trial.suggest_categorical(
        "use_original_head", [True, False]
    )
    trial_values["num_layers"] = trial.suggest_int(
        "num_layers", 1, 3
    )
    trial_values["drop_rate"] = trial.suggest_float(
        "drop_rate", 0.0, 0.3
    )
    trial_values["tree_dim"] = trial.suggest_int(
        "tree_dim", 1, 3
    )
    return trial_values

In [24]:
automl = TabularAutoML(
    task = task, 
    timeout = TIMEOUT,
    cpu_limit = N_THREADS,
    general_params = {"use_algos": [["node_tuned"]]}, # ['nn', 'mlp', 'dense', 'denselight', 'resnet', 'snn'] or custom torch model
    nn_params = {"n_epochs": 10, "bs": 512, "num_workers": 0, "path_to_save": None, "freeze_defaults": True, "optimization_search_space": my_opt_space_NODE,},
    nn_pipeline_params = {"use_qnt": True, "use_te": False},
    reader_params = {'n_jobs': N_THREADS, 'cv': N_FOLDS, 'random_state': RANDOM_STATE}
)

In [25]:
oof_pred = automl.fit_predict(tr_data, roles = roles, verbose = 2)

[14:57:24] Stdout logging level is INFO2.
[14:57:24] Task: binary

[14:57:24] Start automl preset with listed constraints:
[14:57:24] - time: 3000.00 seconds
[14:57:24] - CPU: 4 cores
[14:57:24] - memory: 16 GB

[14:57:24] Train data shape: (8000, 122)

[14:57:25] Layer 1 train process start. Time left 2999.22 secs
[14:57:25] Start hyperparameters optimization for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 ... Time budget is 1574.34 secs


Optimization Progress: 100%|█████████████████████████████████████████████████████████████████████████████████| 25/25 [03:48<00:00,  9.14s/it, best_trial=13, best_value=0.732]

[15:01:14] Hyperparameters optimization for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 completed
[15:01:14] The set of hyperparameters {'layer_dim': 512, 'use_original_head': False, 'num_layers': 3, 'drop_rate': 0.1310638585198816, 'tree_dim': 3}
 achieve 0.7315 auc
[15:01:14] Start fitting Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 ...
[15:01:14] ===== Start working with fold 0 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 =====
[15:01:27] ===== Start working with fold 1 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 =====
[15:01:41] ===== Start working with fold 2 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 =====
[15:01:54] ===== Start working with fold 3 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 =====
[15:02:07] ===== Start working with fold 4 for Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 =====
[15:02:20] Fitting Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 finished. score = 0.6942477367184283
[15:02:20] Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0 fitting and predicting completed
[15:02:20] Time left 2703.48 secs

[15:02:20] Layer 1 training completed.

[15:02:20] Automl preset training completed in 296.53 seconds

[15:02:20] Model description:
Final prediction for new objects (level 0) = 
	 1.00000 * (5 averaged models Lvl_0_Pipe_0_Mod_0_Tuned_TorchNN_node_tuned_0) 



### 4.3 Several models

If you have several neural networks, you can either define one set of parameters for all of them or give each one its own set, as below.  
**Note:** numbering starts at 0. Each id (a number written as a string) corresponds to the position of the model in *the list of used neural networks*.
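
The indexing rule can be sketched as follows. This is an illustration only, not the library's resolution logic: `NN_NAMES` is taken from the list of built-in names mentioned earlier in this tutorial, and `nn_param_ids` is a hypothetical helper that shows which model each id refers to when non-neural algorithms are mixed in.

```python
NN_NAMES = {
    "nn", "mlp", "dense", "denselight", "resnet",
    "snn", "node", "autoint", "fttransformer",
}


def nn_param_ids(use_algos_level):
    """Map string ids to the neural networks of one `use_algos` level."""
    nets = [
        algo
        for algo in use_algos_level
        # Assumption: built-in nets are known names (optionally with "_tuned"),
        # and anything that is not a string is a custom torch model.
        if not isinstance(algo, str) or algo.replace("_tuned", "") in NN_NAMES
    ]
    return {str(i): algo for i, algo in enumerate(nets)}


print(nn_param_ids(["lgb", "mlp", "dense"]))  # {'0': 'mlp', '1': 'dense'}
```

So with `["lgb", "mlp", "dense"]`, the key `"0"` configures `mlp` and `"1"` configures `dense`; `lgb` is skipped because it is not a neural network.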

In [26]:
automl = TabularAutoML(
    **default_lama_params,
    general_params = {"use_algos": [["lgb", "mlp", "dense"]]},
    nn_params = {"0": {**default_nn_params, "n_epochs": 2},
                 "1": {**default_nn_params, "n_epochs": 5}},
)
automl.fit_predict(tr_data, roles = roles, verbose = 3)

[15:02:20] Stdout logging level is INFO3.
[15:02:20] Task: binary

[15:02:20] Start automl preset with listed constraints:
[15:02:20] - time: 3000.00 seconds
[15:02:20] - CPU: 4 cores
[15:02:20] - memory: 16 GB

[15:02:20] Train data shape: (8000, 122)

[15:02:21] Feats was rejected during automatic roles guess: []
[15:02:21] Layer 1 train process start. Time left 2999.21 secs
[15:02:21] Training until validation scores don't improve for 200 rounds
[15:02:24] Selector_LightGBM fitting and predicting completed
[15:02:25] Start fitting Lvl_0_Pipe_0_Mod_0_LightGBM ...
[15:02:25] ===== Start working with fold 0 for Lvl_0_Pipe_0_Mod_0_LightGBM =====
[15:02:25] Training until validation scores don't improve for 200 rounds
[15:02:27] ===== Start working with fold 1 for Lvl_0_Pipe_0_Mod_0_LightGBM =====
[15:02:27] Training until validation scores don't improve for 200 rounds
[15:02:31] ===== Start working with fold 2 for

array([[0.06395157],
       [0.04285344],
       [0.04808115],
       ...,
       [0.04276791],
       [0.19339147],
       [0.10395089]], dtype=float32)