# Teras Pretraining Tutorial

## Introduction

In this tutorial, I'll walk you through how to pretrain a model using Teras.

**Model**: `TabNetRegressor`

**Dataset**: Gemstone dataset (from Kaggle)

**Task**: Regression using pretrained model


**NOTE**: This tutorial applies only to architectures that offer pretraining and to their custom pretraining architectures.
There is no one-size-fits-all pretraining class in Teras, as every architecture that supports pretraining has its own pretraining approach.

## Data Loading and Preprocessing

In [1]:
import pandas as pd

# We'll use the first 10000 instances
# We also drop the id column since it is just a row identifier
gem_df = pd.read_csv("./datasets/gemstone_dataset.csv").drop("id", axis=1)[:10000]
gem_df.head(3)

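|   | carat | cut | color | clarity | depth | table | x | y | z | price |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.52 | Premium | F | VS2 | 62.2 | 58.0 | 7.27 | 7.33 | 4.55 | 13619 |
| 1 | 2.03 | Very Good | J | SI2 | 62.0 | 58.0 | 8.06 | 8.12 | 5.05 | 13387 |
| 2 | 0.7 | Ideal | G | VS1 | 61.2 | 57.0 | 5.69 | 5.73 | 3.5 | 2772 |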


In [2]:
categorical_feats = ["cut", "color", "clarity"]
numerical_feats = ["carat", "depth", "table", "x", "y", "z"]

### Splitting the dataset for pretraining and training
`TabNetPretrainer` is designed for scenarios where there is a lot of unlabeled data and relatively little labeled data.
Our dataset is fully labeled, but for the sake of this tutorial we'll split it in two: one split, with its labels removed,
for pretraining and the other, labeled split for training.

In [3]:
from sklearn.model_selection import train_test_split


training_df, pretraining_df = train_test_split(gem_df,
                                            test_size=0.5,
                                            shuffle=True,
                                            random_state=1337)  # 1337 gang :D
# Reassign instead of dropping in place to avoid pandas' SettingWithCopyWarning
pretraining_df = pretraining_df.drop("price", axis=1)  # make it unlabeled for pretraining

In [4]:
# Generate features metadata
# Read more about what it is and why we need it in
# Section 1 of the General Guidelines and FAQs notebook in the tutorials directory
from teras.utils import get_features_metadata_for_embedding

metadata_pretraining = get_features_metadata_for_embedding(pretraining_df,
                                                           categorical_features=categorical_feats,
                                                           numerical_features=numerical_feats)

metadata_training = get_features_metadata_for_embedding(training_df,
                                                        categorical_features=categorical_feats,
                                                        numerical_features=numerical_feats)
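
To make the idea of "features metadata" more concrete, here is a minimal sketch of the kind of information such metadata could hold: the column index of every feature, plus the vocabulary of each categorical feature so that it can be embedded. This is an illustration only. `sketch_features_metadata` is a hypothetical helper, not Teras's implementation, and the structure that `get_features_metadata_for_embedding` actually returns may differ.

```python
# Hypothetical illustration only: NOT Teras's implementation, and the real
# metadata structure returned by Teras may differ.
def sketch_features_metadata(rows, columns, categorical_features, numerical_features):
    """Map each feature to its column index (plus its vocabulary if categorical)."""
    metadata = {"categorical": {}, "numerical": {}}
    for idx, name in enumerate(columns):
        if name in categorical_features:
            # The sorted unique values act as the vocabulary for an embedding lookup
            vocabulary = sorted({row[idx] for row in rows})
            metadata["categorical"][name] = (idx, vocabulary)
        elif name in numerical_features:
            metadata["numerical"][name] = idx
    return metadata


# A three-row toy table borrowing a few gemstone columns
columns = ["carat", "cut", "color"]
rows = [
    (1.52, "Premium", "F"),
    (2.03, "Very Good", "J"),
    (0.70, "Ideal", "G"),
]
toy_metadata = sketch_features_metadata(rows, columns,
                                        categorical_features=["cut", "color"],
                                        numerical_features=["carat"])
print(toy_metadata)
```

Plain tuples are used instead of a dataframe only to keep the sketch dependency-free.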


In [5]:
# Convert the dataframes to TensorFlow datasets in dictionary format
# Read more about why and when we need the dictionary format in
# Section 2 of the General Guidelines and FAQs notebook in the tutorials directory
from teras.utils import dataframe_to_tf_dataset

pretraining_ds = dataframe_to_tf_dataset(pretraining_df, as_dict=True)
training_ds = dataframe_to_tf_dataset(training_df, target="price", as_dict=True)
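
As a rough picture of what "dictionary format" means, the sketch below turns row records into a column-wise mapping from feature name to values, optionally splitting off the target. It is purely conceptual: `to_feature_dict` is a hypothetical helper written in plain Python, whereas `dataframe_to_tf_dataset` returns a TensorFlow dataset whose exact element structure is not shown in this tutorial.

```python
# Hypothetical illustration of a "dictionary format"; not part of Teras.
def to_feature_dict(records, target=None):
    """Turn row records into a column-wise dict, optionally splitting off the target."""
    feature_names = [name for name in records[0] if name != target]
    features = {name: [row[name] for row in records] for name in feature_names}
    if target is None:
        return features
    labels = [row[target] for row in records]
    return features, labels


records = [
    {"carat": 1.52, "cut": "Premium", "price": 13619},
    {"carat": 2.03, "cut": "Very Good", "price": 13387},
]

# Unlabeled data (as used for pretraining): features only
unlabeled_records = [{k: v for k, v in row.items() if k != "price"} for row in records]
unlabeled = to_feature_dict(unlabeled_records)

# Labeled data (as used for training): features and target kept separate
features, labels = to_feature_dict(records, target="price")
print(features, labels)
```

Keying values by feature name lets string-valued and numeric columns sit side by side without being forced into one common type.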

## Pretraining the base model

Usually we use an architecture's Classifier or Regressor model for the task at hand, but for pretraining we instead use that architecture's base model class.

In this case, we want to pretrain and then fine-tune a `TabNetRegressor` model, so we'll first pretrain a base `TabNet` model. This may sound like an extra step, but you'll soon see why we do it and how it helps us avoid pitfalls.

#### Import the base TabNet model

In [6]:
from teras.models import TabNet

# Our categorical features are strings that need encoding,
# so we set the encode_categorical_values flag to True.
# Apart from virtual_batch_size, we leave everything else at its default value.
base_model = TabNet(features_metadata=metadata_pretraining,
                    encode_categorical_values=True,
                    virtual_batch_size=2)



#### Import the TabNetPretrainer class

To pretrain a base `TabNet` model, Teras provides a `TabNetPretrainer` model. It accepts the base model as an argument, along with any architecture-specific pretraining parameters, and is itself just a Keras model that you can `compile` and `fit` like any other Keras model.

In [7]:
from teras.models import TabNetPretrainer

# We'll leave other parameters values as default
# for this tutorial but feel free to experiment
pretrainer = TabNetPretrainer(model=base_model)

# Custom architectures like this one employ custom loss
# functions, for which `Teras` has default values in place.
# So unless you understand the underlying structure of
# the pretrainer architecture, it's better to just use
# the default loss.
# You can, however, specify any optimizer.
# Read more in Section 4 of the General Guidelines and FAQs
# notebook in the tutorials directory.
pretrainer.compile()

history = pretrainer.fit(pretraining_ds, epochs=1)
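
For intuition about what happens during pretraining: TabNet's published self-supervised objective hides a random subset of the input features and trains the network to reconstruct them from the visible ones. The sketch below illustrates that idea in plain Python under simplifying assumptions. `masked_reconstruction_loss`, `mask_probability`, and `predict_zeros` are hypothetical names, the error is left unnormalised, and none of this is Teras's actual implementation, whose default loss may differ in detail.

```python
import random


# Hypothetical illustration of masked-feature reconstruction; not Teras code.
def masked_reconstruction_loss(batch, reconstruct, mask_probability=0.3, seed=0):
    """Mean squared error computed only on the features that were hidden."""
    rng = random.Random(seed)
    squared_error, num_masked = 0.0, 0
    for row in batch:
        # 1 marks a hidden feature, 0 a visible one
        mask = [1 if rng.random() < mask_probability else 0 for _ in row]
        # The model only ever sees the visible features; hidden ones are zeroed
        visible = [0.0 if m else x for x, m in zip(row, mask)]
        prediction = reconstruct(visible)
        for x, x_hat, m in zip(row, prediction, mask):
            if m:
                squared_error += (x - x_hat) ** 2
                num_masked += 1
    # Guard against division by zero when nothing was masked
    return squared_error / max(num_masked, 1)


def predict_zeros(visible):
    """A deliberately poor stand-in 'model' that always predicts zero."""
    return [0.0 for _ in visible]


batch = [[1.0, 2.0], [3.0, 4.0]]
loss_all_hidden = masked_reconstruction_loss(batch, predict_zeros, mask_probability=1.0)
loss_none_hidden = masked_reconstruction_loss(batch, predict_zeros, mask_probability=0.0)
print(loss_all_hidden, loss_none_hidden)
```

Because only hidden entries contribute to the loss, the model gets no credit for copying visible inputs, which is what pushes it to learn relationships between features from unlabeled data.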

## Creating TabNetRegressor instance using a pretrained base model

To create a `TabNetRegressor` model instance using the pretrained base `TabNet` model, we first need to retrieve that pretrained model from the `TabNetPretrainer` instance.

In [8]:
pretrained_base = pretrainer.pretrained_model

`TabNetRegressor` (like any other model belonging to an architecture that offers pretraining) provides a `.from_pretrained` class method that takes in the pretrained base and returns a `TabNetRegressor` instance built on that base.

In [9]:
from teras.models import TabNetRegressor

regressor = TabNetRegressor.from_pretrained(pretrained_model=pretrained_base,
                                            num_outputs=1)

## Finetuning

The returned instance is **NOT** compiled automatically. This gives you the flexibility to follow the typical fine-tuning workflow: freeze, compile, and train; then unfreeze, compile, and train again.


### Training the regression head only!
Let's first freeze the pretrained base and train only the regression head of the `TabNetRegressor` for a few epochs before we train the model as a whole.

In [10]:
# Freeze the pretrained base so that only the regression head is updated
pretrained_base.trainable = False
regressor.compile(loss="mse", metrics=["mae"])
regressor.fit(training_ds, epochs=2)

Epoch 1/2



Epoch 2/2


<keras.callbacks.History at 0x7f5ab6315ab0>

### Training the model as a whole!

In [11]:
# Unfreeze the base, then recompile so the change to `trainable` takes effect
pretrained_base.trainable = True
regressor.compile(loss="mse", metrics=["mae"])
regressor.fit(training_ds, epochs=2)

Epoch 1/2


Epoch 2/2


<keras.callbacks.History at 0x7f5ab6315570>

## Wrapping it up!

And that wraps up our pretraining tutorial using Teras.

If you need more help, consult the documentation and other available resources. If that still leaves you with questions, feel free to raise an issue or email me at khawaja.abaid@gmail.com.

If you find `Teras` useful, please consider giving it a star on GitHub and sharing it with others!

Thank you!