# Build your own model upon others

Models can be built based on other trained models in the current model base or in other model bases. Both `AbstractModel` and `TorchModel` support this feature.

## For `AbstractModel`

In [1]:
import tabensemb
import numpy as np
import torch
import os
from tempfile import TemporaryDirectory
from tabensemb.model import WideDeep, AbstractModel

temp_path = TemporaryDirectory()
tabensemb.setting["default_output_path"] = os.path.join(temp_path.name, "output")
tabensemb.setting["default_config_path"] = os.path.join(temp_path.name, "configs")
tabensemb.setting["default_data_path"] = os.path.join(temp_path.name, "data")

device = "cuda" if torch.cuda.is_available() else "cpu"

Suppose that we want to call the TabMlp model of WideDeep from another model base, `CallTabMlp`.

```python
class CallTabMlp(AbstractModel):
    def _get_program_name(self):
        return "CallTabMlp"

    def _get_model_names(self):
        return ["TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}
```

Another model is extracted by returning its name from `required_models` in a specific format. In the following code, "EXTERN" means that the model comes from another model base, "WideDeep" is the name of the model base that contains the required model, and "TabMlp" is the required model in that model base. If the model comes from the current model base, only the name of the required model is needed (`return ["TabMlp"]`). Multiple required models can be specified in the returned list.

```python
    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]
```
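The naming format described above can be illustrated with a small parser. This is only a sketch of the format as stated in this tutorial, not tabensemb's own parsing code: `RequiredModelName` and `parse_required_name` are hypothetical names, and the sketch assumes that the model base name itself contains no underscore. The `_WRAP` suffix introduced later in this tutorial is not handled here.

```python
from typing import NamedTuple, Optional


class RequiredModelName(NamedTuple):
    """Parts of one entry returned by `required_models` (hypothetical helper type)."""

    model_base: Optional[str]  # None means "the current model base"
    model: str


def parse_required_name(name: str) -> RequiredModelName:
    """Split a required-model name following the format described above.

    Assumption: the model base name contains no underscore, so the first
    two underscores delimit "EXTERN" and the model base name.
    """
    if name.startswith("EXTERN_"):
        parts = name.split("_", 2)
        if len(parts) != 3 or not parts[1] or not parts[2]:
            raise ValueError(f"Expected 'EXTERN_<ModelBase>_<Model>', got {name!r}")
        return RequiredModelName(model_base=parts[1], model=parts[2])
    # No prefix: the model lives in the current model base.
    return RequiredModelName(model_base=None, model=name)


print(parse_required_name("EXTERN_WideDeep_TabMlp"))
print(parse_required_name("TabMlp"))
```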

As usual, `_train_data_preprocess`, `_data_preprocess`, `_new_model`, `_train_single_model`, and `_pred_single_model` should be implemented. `_train_data_preprocess` is called first, and inside it `_get_required_models` is used to extract the external model. In this case, a `WideDeep` instance containing the trained TabMlp model is returned. If the model is from the current model base, calling `self._get_required_models("TabMlp")` is equivalent to calling `self.model["TabMlp"]`.

Then the `_train_data_preprocess` method from `WideDeep` is directly used to process the dataset to get compatible processed data.

```python
    def _train_data_preprocess(self, model_name):
        if not hasattr(self, "net"):
            self.net = self._get_required_models("TabMlp")["EXTERN_WideDeep_TabMlp"]
            self.net.trainer = self.trainer
        return self.net._train_data_preprocess("TabMlp")
```

Likewise, `_data_preprocess` delegates to the same method of `WideDeep` to get compatible processed data.

```python
    def _data_preprocess(self, df, derived_data, model_name):
        return self.net._data_preprocess(df, derived_data, "TabMlp")
```

In `_new_model`, the extracted model is directly returned.

```python
    def _new_model(self, model_name, verbose, **kwargs):
        return self.net
```

`_pred_single_model` calls the same method from `WideDeep` to make predictions based on the extracted model.

```python
    def _pred_single_model(self, model, X_test, verbose, **kwargs):
        return model._pred_single_model(model.model["TabMlp"], X_test, verbose, **kwargs)
```

In this example, we do not train the extracted model any further, so `_train_single_model` does nothing. It is nevertheless straightforward to perform other operations on the predictions of the extracted model returned by `model._pred_single_model`, as shown above.

```python
    def _train_single_model(self, *args, **kwargs):
        pass
```
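As a sketch of what "other operations on the predictions" could look like, the snippet below delegates the prediction and then clips the result to a range. `StubModelBase`, `predict_with_clipping`, and the clipping bounds are hypothetical and exist only so that the example runs without tabensemb. In `CallTabMlp`, the same post-processing would sit inside `_pred_single_model`, after the call to `model._pred_single_model`.

```python
from typing import List


class StubModelBase:
    """Hypothetical stand-in for the extracted model base (not a tabensemb class)."""

    def __init__(self):
        self.model = {"TabMlp": "trained-model-placeholder"}

    def _pred_single_model(self, model, X_test, verbose, **kwargs) -> List[float]:
        # Pretend prediction: the mean of each row's features.
        return [sum(row) / len(row) for row in X_test]


def predict_with_clipping(base, X_test, lower: float, upper: float, verbose=False):
    """Delegate to the extracted model base, then post-process its predictions."""
    raw = base._pred_single_model(base.model["TabMlp"], X_test, verbose)
    # Any operation on `raw` can go here; clipping is just one example.
    return [min(max(p, lower), upper) for p in raw]


X = [[1.0, 3.0], [10.0, 30.0], [-4.0, -6.0]]
print(predict_with_clipping(StubModelBase(), X, lower=0.0, upper=15.0))
# → [2.0, 15.0, 0.0]
```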

In [2]:
class CallTabMlp(AbstractModel):
    def _get_program_name(self):
        return "CallTabMlp"

    def _get_model_names(self):
        return ["TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]

    def _train_data_preprocess(self, model_name):
        if not hasattr(self, "net"):
            self.net = self._get_required_models("TabMlp")["EXTERN_WideDeep_TabMlp"]
            self.net.trainer = self.trainer
        return self.net._train_data_preprocess("TabMlp")

    def _data_preprocess(self, df, derived_data, model_name):
        return self.net._data_preprocess(df, derived_data, "TabMlp")

    def _new_model(self, model_name, verbose, **kwargs):
        return self.net

    def _train_single_model(self, *args, **kwargs):
        pass

    def _pred_single_model(self, model, X_test, verbose, **kwargs):
        return model._pred_single_model(model.model["TabMlp"], X_test, verbose, **kwargs)

## For `TorchModel`

It is easier to build a model based on others in `TorchModel` because we have already implemented complex dataset-building operations internally.

As in the implementation above, we implement the same methods, except for `_train_data_preprocess` and `_data_preprocess`, which are no longer needed.

```python
class CallTabMlpTorch(TorchModel):
    def _get_program_name(self):
        return "CallTabMlpTorch"

    def _get_model_names(self):
        return ["TabMlp"]

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}
```

We build our model `CallTabMlpNN` on top of TabMlp from WideDeep. In this tutorial, we will not train anything.

```python
    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNN(datamodule=self.trainer.datamodule, **kwargs)

    def _train_single_model(self, *args, **kwargs):
        pass
```

Now comes `CallTabMlpNN`. Its `__init__` receives an argument `required_models`, a dictionary containing all required models specified in `CallTabMlpTorch.required_models`, keyed by the names given there.

```python
class CallTabMlpNN(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNN, self).__init__(datamodule, **kwargs)
        self.net = required_models["EXTERN_WideDeep_TabMlp"]
```

To get results from the extracted model, use `self.call_required_model`.

```python
    def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
        return self.call_required_model(self.net, x, derived_tensors)
```

**Remark**: The output of the required model is in fact already calculated when preparing the dataset and is stored in `derived_tensors["data_required_models"]["MODELNAME_pred"]`. `self.call_required_model` first looks for this pre-calculated output. If it is not found, the output is calculated from the dataset for the model base, which is stored in `derived_tensors["data_required_models"]["MODELNAME"]`. Therefore, if you want the output to actually be calculated during `forward`, just remove the stored predictions from `derived_tensors`.
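The lookup order described in the remark can be sketched in plain Python. This illustrates the described behavior only and is not the library's code: `lookup_or_compute` and `predict_fn` are hypothetical names, and plain lists stand in for tensors.

```python
from typing import Any, Callable, Dict


def lookup_or_compute(
    derived_tensors: Dict[str, Any],
    model_name: str,
    predict_fn: Callable[[Any], Any],
):
    """Return stored predictions if present; otherwise compute them from stored data."""
    stored = derived_tensors["data_required_models"]
    pred_key = f"{model_name}_pred"
    if pred_key in stored:
        # Fast path: predictions prepared together with the dataset.
        return stored[pred_key]
    # Fallback: run the model on the data prepared for its model base.
    return predict_fn(stored[model_name])


def double(data):
    """Toy stand-in for the required model's prediction function."""
    return [2 * v for v in data]


name = "EXTERN_WideDeep_TabMlp"
derived = {"data_required_models": {name: [1.0, 2.0], f"{name}_pred": [0.5, 0.7]}}

print(lookup_or_compute(derived, name, double))  # stored predictions: [0.5, 0.7]

# Removing the stored predictions forces an actual computation.
del derived["data_required_models"][f"{name}_pred"]
print(lookup_or_compute(derived, name, double))  # computed: [2.0, 4.0]
```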

In [3]:
from tabensemb.model import TorchModel, AbstractNN

class CallTabMlpNN(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNN, self).__init__(datamodule, **kwargs)
        self.net = required_models["EXTERN_WideDeep_TabMlp"]

    def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
        return self.call_required_model(self.net, x, derived_tensors)

class CallTabMlpTorch(TorchModel):
    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNN(datamodule=self.trainer.datamodule, **kwargs)

    def _get_program_name(self):
        return "CallTabMlpTorch"

    def _get_model_names(self):
        return ["TabMlp"]

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp"]

    def _space(self, model_name):
        return []

    def _initial_values(self, model_name):
        return {}

    def _train_single_model(self, *args, **kwargs):
        pass

We can compare the results of the original model with those of the extracted models. All three model bases produce exactly the same metrics.

In [4]:
from tabensemb.trainer import Trainer
from tabensemb.config import UserConfig

trainer = Trainer(device=device)
cfg = UserConfig.from_uci("Auto MPG", sep=r"\s+")
trainer.load_config(cfg)
trainer.load_data()
trainer.add_modelbases(
    [
        WideDeep(trainer, model_subset=["TabMlp"]),
        CallTabMlp(trainer),
        CallTabMlpTorch(trainer),
    ]
)
trainer.train(stderr_to_stdout=True)
trainer.get_leaderboard()

Downloading https://archive.ics.uci.edu/static/public/9/auto+mpg.zip to /tmp/tmpruwv_gw_/data/Auto MPG.zip
cylinders is Integer and will be treated as a continuous feature.
model_year is Integer and will be treated as a continuous feature.
origin is Integer and will be treated as a continuous feature.
Unknown values are detected in ['horsepower']. They will be treated as np.nan.
Project will be saved to /tmp/tmpruwv_gw_/output/auto-mpg/2023-08-27-13-08-57-0_UserInputConfig




Dataset size: 238 80 80
Data saved to /tmp/tmpruwv_gw_/output/auto-mpg/2023-08-27-13-08-57-0_UserInputConfig (data.csv and tabular_data.csv).

-------------Run WideDeep-------------

Training TabMlp
Epoch: 1/300, Train loss: 31.3699, Val loss: 31.3902, Min val loss: 31.3902
Epoch: 21/300, Train loss: 4.7284, Val loss: 4.3482, Min val loss: 4.3482
Epoch: 41/300, Train loss: 2.1918, Val loss: 2.1015, Min val loss: 2.1015
Epoch: 61/300, Train loss: 1.2278, Val loss: 1.2459, Min val loss: 1.2459
Epoch: 81/300, Train loss: 1.0096, Val loss: 0.9937, Min val loss: 0.9937
Epoch: 101/300, Train loss: 0.8193, Val loss: 0.7567, Min val loss: 0.7567
Epoch: 121/300, Train loss: 0.7792, Val loss: 0.7089, Min val loss: 0.7049
Epoch: 141/300, Train loss: 0.7625, Val loss: 0.6014, Min val loss: 0.6014
Epoch: 161/300, Train loss: 0.7034, Val loss: 0.5495, Min val loss: 0.5495
Epoch: 181/300, Train loss: 0.6561, Val loss: 0.5199, Min val loss: 0.5059
Epoch: 201/300, Train loss: 0.6215, Val loss: 0.4681, 

| | Program | Model | Training RMSE | Training MSE | Training MAE | Training MAPE | Training R2 | Training MEDIAN_ABSOLUTE_ERROR | Training EXPLAINED_VARIANCE_SCORE | Testing RMSE | ... | Testing R2 | Testing MEDIAN_ABSOLUTE_ERROR | Testing EXPLAINED_VARIANCE_SCORE | Validation RMSE | Validation MSE | Validation MAE | Validation MAPE | Validation R2 | Validation MEDIAN_ABSOLUTE_ERROR | Validation EXPLAINED_VARIANCE_SCORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | WideDeep | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |
| 1 | CallTabMlp | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |
| 2 | CallTabMlpTorch | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |


## Extract learned hidden representation from models

The relationship between input features and targets can be complex, especially for high-dimensional or multimodal inputs, which is why we rely on deep learning models to extract the internal relations and reduce the dimension. For most deep learning models, whatever the backbone structure, the output of the backbone is a low-dimensional tensor (for instance, of shape `(batch_size, 16)`) that contains the information learned by the model. We call it the "hidden representation" of the model. The hidden representation is then projected to the output dimension through a linear layer, an MLP, etc.

Hidden representations can be extracted in an `AbstractNN` for most models in two model bases: `pytorch_widedeep` (WideDeep) and `pytorch_tabular` (PyTorchTabular).

To use this functionality, first change the name in `required_models` by appending the suffix "_WRAP".

```python
class CallTabMlpTorchWrapped(CallTabMlpTorch):
    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp_WRAP"]

    def _get_program_name(self):
        return "CallTabMlpTorchWrapped"

    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNNWrapped(datamodule=self.trainer.datamodule, **kwargs)
```

Now more operations can be done in the `AbstractNN`. In `__init__`, `_test_required_model` can be used to check whether the hidden representation is available and to get its dimension, which can then be used to build `nn.Module`s such as a linear layer or an MLP.

```python
class CallTabMlpNNWrapped(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNNWrapped, self).__init__(datamodule, **kwargs)
        print(required_models)
        self.net = required_models["EXTERN_WideDeep_TabMlp_WRAP"]
        self.use_hidden_rep, hidden_rep_dim = self._test_required_model(
            self.n_inputs, self.net
        )
        print(f"Does the model support extracting hidden representation?: {self.use_hidden_rep}")
        print(f"The dimension of the hidden representation: {hidden_rep_dim}")
```

When doing forward propagation, the hidden representation can be extracted using `get_hidden_state`.

```python
    def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
        print(derived_tensors["data_required_models"].keys())
        output = self.call_required_model(self.net, x, derived_tensors)
        hidden = self.get_hidden_state(self.net, x, derived_tensors)
        print(f"The dimensions of the batched hidden representation: {hidden.shape}")
        return output
```

**Remark**: Like the output, the hidden representation is calculated when preparing the dataset and is stored in `derived_tensors["data_required_models"]["MODELNAME_hidden"]`.
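Once the hidden representation and its dimension are available, a small trainable head can be stacked on top of it, in the spirit of the linear layer or MLP mentioned above. The following is a minimal standalone sketch in plain PyTorch: `HiddenHead` and its layer width of 32 are hypothetical, and the random tensor merely stands in for the result of `get_hidden_state` (the sizes 238 and 100 match the run shown later in this tutorial).

```python
import torch
from torch import nn


class HiddenHead(nn.Module):
    """Hypothetical MLP head mapping a hidden representation to the output."""

    def __init__(self, hidden_rep_dim: int, n_outputs: int = 1):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(hidden_rep_dim, 32),
            nn.ReLU(),
            nn.Linear(32, n_outputs),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.head(hidden)


# Stand-in for `self.get_hidden_state(self.net, x, derived_tensors)`.
hidden = torch.randn(238, 100)
head = HiddenHead(hidden_rep_dim=hidden.shape[1])
out = head(hidden)
print(out.shape)  # torch.Size([238, 1])
```

In an `AbstractNN`, such a head would presumably be created in `__init__` from `hidden_rep_dim` and applied to `hidden` in `_forward`. That placement is an assumption about how one might use it, not a pattern taken from the library.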

In [5]:
from tabensemb.model import TorchModel, AbstractNN

class CallTabMlpNNWrapped(AbstractNN):
    def __init__(self, datamodule, required_models, **kwargs):
        super(CallTabMlpNNWrapped, self).__init__(datamodule, **kwargs)
        print(required_models)
        self.net = required_models["EXTERN_WideDeep_TabMlp_WRAP"]
        self.use_hidden_rep, hidden_rep_dim = self._test_required_model(
            self.n_inputs, self.net
        )
        print(f"Does the model support extracting hidden representation?: {self.use_hidden_rep}")
        print(f"The dimension of the hidden representation: {hidden_rep_dim}")

    def _forward(self, x: torch.Tensor, derived_tensors) -> torch.Tensor:
        print(derived_tensors["data_required_models"].keys())
        output = self.call_required_model(self.net, x, derived_tensors)
        hidden = self.get_hidden_state(self.net, x, derived_tensors)
        print(f"The dimensions of the batched hidden representation: {hidden.shape}")
        return output

class CallTabMlpTorchWrapped(CallTabMlpTorch):
    def _get_program_name(self):
        return "CallTabMlpTorchWrapped"

    def _new_model(self, model_name, verbose, **kwargs):
        return CallTabMlpNNWrapped(datamodule=self.trainer.datamodule, **kwargs)

    def required_models(self, model_name: str):
        return ["EXTERN_WideDeep_TabMlp_WRAP"]

We can show the information of the extracted model, the hidden representation, and the stored data and predictions.

In [6]:
trainer.add_modelbases([CallTabMlpTorchWrapped(trainer, store_in_harddisk=False)])
trainer.get_modelbase("CallTabMlpTorchWrapped").train(stderr_to_stdout=True)
trainer.get_leaderboard()


-------------Run CallTabMlpTorchWrapped-------------

Training TabMlp
{'EXTERN_WideDeep_TabMlp_WRAP': <tabensemb.model.widedeep.WideDeepWrapper object at 0x7f4e0df990a0>}
Does the model support extracting hidden representation?: True
The dimension of the hidden representation: 100
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([238, 100])
Training mse loss: 0.25748
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([80, 100])
Validation mse loss: 0.28530
dict_keys(['EXTERN_WideDeep_TabMlp', 'EXTERN_WideDeep_TabMlp_pred', 'EXTERN_WideDeep_TabMlp_hidden'])
The dimensions of the batched hidden representation: torch.Size([80, 100])
Testing mse loss: 0.27732
Trainer saved. To load the trainer, run trainer = load_trainer(path='/tmp/tmpruwv_gw_/output/auto-mpg/2

| | Program | Model | Training RMSE | Training MSE | Training MAE | Training MAPE | Training R2 | Training MEDIAN_ABSOLUTE_ERROR | Training EXPLAINED_VARIANCE_SCORE | Testing RMSE | ... | Testing R2 | Testing MEDIAN_ABSOLUTE_ERROR | Testing EXPLAINED_VARIANCE_SCORE | Validation RMSE | Validation MSE | Validation MAE | Validation MAPE | Validation R2 | Validation MEDIAN_ABSOLUTE_ERROR | Validation EXPLAINED_VARIANCE_SCORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | WideDeep | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |
| 1 | CallTabMlp | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |
| 2 | CallTabMlpTorch | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |
| 3 | CallTabMlpTorchWrapped | TabMlp | 0.507421 | 0.257476 | 0.370808 | 0.070433 | 0.908686 | 0.272293 | 0.915921 | 0.526612 | ... | 0.91047 | 0.32388 | 0.921738 | 0.53413 | 0.285295 | 0.419972 | 0.081513 | 0.899874 | 0.35455 | 0.905366 |


**Remark**: If a model from the same `TorchModel` is required, the corresponding `AbstractNN` is extracted and passed in `required_models`. In that case, you must pass the `model_name` argument when calling `call_required_model` and `get_hidden_state`.