# Training a model in Hezar

This notebook walks through training a model in Hezar. Training in Hezar works much like in other libraries, and is often simpler. As mentioned before, every Hezar model is also a PyTorch module, so training a Hezar model is really training a PyTorch model with some extra features on top. Let's dive in.

In [1]:
from hezar import (
    TrainConfig,
    Trainer,
    TextClassificationDatasetConfig,
    build_dataset,
    build_model,
    build_optimizer,
    build_scheduler,
)



### Build the datasets

First things first: let's build our datasets. A dataset can be either a regular PyTorch `Dataset` or a Hezar `Dataset`. Here we use a simple `TextClassificationDataset` from Hezar.


Hezar datasets are built from a `DatasetConfig`, so let's define our dataset parameters.

In [2]:
dataset_config = TextClassificationDatasetConfig(
    path="hezar-ai/sentiment_digikala_snappfood",
    text_field="text",
    label_field="label",
    tokenizer_path="hezar-ai/bert-base-fa",
)
dataset_config

TextClassificationDatasetConfig(name='text_classification', config_type='dataset', task='text_classification', path='hezar-ai/sentiment_digikala_snappfood', normalizers=None, tokenizer_path='hezar-ai/bert-base-fa', label_field='label', text_field='text', max_length=None)

Now create the train and evaluation datasets. The `test` split serves as the evaluation set here.

In [3]:
train_dataset = build_dataset(name="text_classification", split="train", config=dataset_config)
eval_dataset = build_dataset(name="text_classification", split="test", config=dataset_config)

Found cached dataset sentiment_digikala_snappfood (/home/aryan/.cache/huggingface/datasets/hezar-ai___sentiment_digikala_snappfood/default/0.0.0/1302e757606fe651f42166af308f6002a67f0f78beab10903a743bfa615150c2)
Found cached dataset sentiment_digikala_snappfood (/home/aryan/.cache/huggingface/datasets/hezar-ai___sentiment_digikala_snappfood/default/0.0.0/1302e757606fe651f42166af308f6002a67f0f78beab10903a743bfa615150c2)


### Build the model

Choose a model for this task and build it as you normally would in Hezar (see the [models overview](01_models_overview.ipynb)).

In [4]:
# id2label is passed as a config parameter because a classification model needs it
model = build_model("bert_text_classification", id2label=train_dataset.id2label)

### Define optimizer

We use the typical `Adam` optimizer with a reduce-on-plateau learning rate scheduler.

In [5]:
optimizer = build_optimizer("adam", model.parameters(), lr=2e-5)
lr_scheduler = build_scheduler("reduce_on_plateau", optimizer=optimizer)

# Alternatively, configure the optimizer and scheduler directly in PyTorch:
# from torch import optim
# optimizer = optim.Adam(model.parameters(), lr=2e-5)
# lr_scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer=optimizer)
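
To make the scheduler's behavior concrete, here is a minimal pure-Python sketch of the reduce-on-plateau rule: when the monitored loss has not improved for more than `patience` consecutive epochs, the learning rate is multiplied by `factor`. This is a simplification (no improvement threshold, cooldown, or minimum learning rate), not Hezar's or PyTorch's actual implementation. The function name and the `patience=1` setting are illustrative assumptions (PyTorch's `ReduceLROnPlateau` defaults to `patience=10`), and the losses are the rounded evaluation losses from the training log later in this notebook.

```python
def reduce_on_plateau(losses, lr=2e-5, factor=0.1, patience=1):
    """Return the learning rate in effect after each epoch's loss is observed.

    Simplified rule: no threshold, cooldown, or minimum learning rate.
    """
    best = float("inf")
    bad_epochs = 0  # consecutive epochs without improvement
    lr_history = []
    for loss in losses:
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        # Only reduce once the loss has stalled for MORE than `patience` epochs
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
        lr_history.append(lr)
    return lr_history


# Rounded evaluation losses from this notebook's training run
eval_losses = [0.473, 0.419, 0.346, 0.381, 0.446]
lr_history = reduce_on_plateau(eval_losses)
for epoch, (loss, lr) in enumerate(zip(eval_losses, lr_history), start=1):
    print(f"epoch {epoch}: eval loss={loss:.3f}, lr afterwards={lr:.1e}")
```

With these illustrative settings the learning rate stays at 2e-5 for the first four epochs and drops tenfold after the fifth, once the loss has failed to improve twice in a row. The scheduler settings used in the actual run are not shown in this notebook.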

### Training

Hezar comes with a built-in `Trainer` to make model training as easy and straightforward as possible. As you might have guessed, to use a `Trainer` we first need to set up its config.

In [6]:
train_config = TrainConfig(
    name="text_classification",
    device="cuda",
    init_weights_from="hezar-ai/bert-base-fa",
    batch_size=8,
    num_train_epochs=5,
    checkpoints_dir="checkpoints/",
    metrics={"f1": {"task": "multiclass"}},
)

Notice that our model is a BERT model with randomly initialized weights, but we want to fine-tune it for a simple task, so we need to load the pretrained language model weights. To do this, provide the `init_weights_from` parameter, which takes the Hub ID of a model and loads its weights into ours. (The missing classification head is ignored automatically.)
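
Conceptually, this kind of partial loading copies every weight whose name exists in both the model and the pretrained checkpoint and reports the rest. In plain PyTorch, the same effect is usually obtained with `model.load_state_dict(state_dict, strict=False)`, which returns the missing and unexpected keys. The sketch below mimics that bookkeeping with ordinary dictionaries standing in for state dicts. The function name, the `embeddings.weight` key, and the string values are hypothetical, and this is not Hezar's actual loading code.

```python
def load_matching_weights(model_state, pretrained_state):
    """Copy weights whose names match and report the ones that do not.

    Both arguments are plain dicts standing in for PyTorch state dicts.
    """
    # Keys the model needs but the checkpoint lacks (they keep their random init)
    missing = [k for k in model_state if k not in pretrained_state]
    # Keys the checkpoint provides but the model does not use
    unexpected = [k for k in pretrained_state if k not in model_state]
    for key in model_state:
        if key in pretrained_state:
            model_state[key] = pretrained_state[key]
    return missing, unexpected


# Hypothetical, tiny stand-ins for real tensors
model_state = {
    "embeddings.weight": "random",
    "classifier.weight": "random",
    "classifier.bias": "random",
}
pretrained_state = {"embeddings.weight": "pretrained"}

missing, unexpected = load_matching_weights(model_state, pretrained_state)
print("Missing keys:", missing)
print("Unexpected keys:", unexpected)
```

Here the two `classifier.*` entries keep their random initialization, which mirrors the `Missing keys` message printed when the `Trainer` is built.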

Now that we have our config, let's build the `Trainer`.

In [7]:
trainer = Trainer(
    config=train_config,
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=train_dataset.data_collator,
    optimizer=optimizer,
    lr_scheduler=lr_scheduler,
)

Incompatible keys: []
Missing keys: ['classifier.weight', 'classifier.bias']



And now, let's train!

In [8]:
trainer.train()




Epoch: 1/5      100%|######################################################################| 3576/3576 [07:07<00:00,  8.37batch/s, f1=0.732, loss=0.619]
Evaluating...   100%|######################################################################| 290/290 [00:07<00:00, 38.64batch/s, f1=0.8, loss=0.473]  





Epoch: 2/5      100%|######################################################################| 3576/3576 [07:00<00:00,  8.50batch/s, f1=0.807, loss=0.47] 
Evaluating...   100%|######################################################################| 290/290 [00:07<00:00, 39.87batch/s, f1=0.838, loss=0.419]





Epoch: 3/5      100%|######################################################################| 3576/3576 [07:01<00:00,  8.48batch/s, f1=0.864, loss=0.348]
Evaluating...   100%|######################################################################| 290/290 [00:07<00:00, 39.97batch/s, f1=0.875, loss=0.346]





Epoch: 4/5      100%|######################################################################| 3576/3576 [06:57<00:00,  8.56batch/s, f1=0.919, loss=0.227]
Evaluating...   100%|######################################################################| 290/290 [00:07<00:00, 38.84batch/s, f1=0.875, loss=0.381]





Epoch: 5/5      100%|######################################################################| 3576/3576 [07:02<00:00,  8.46batch/s, f1=0.943, loss=0.156]
Evaluating...   100%|######################################################################| 290/290 [00:07<00:00, 39.71batch/s, f1=0.887, loss=0.446]
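
The progress bars also allow a quick sanity check on the training set size and the total number of training steps. The sketch below assumes one optimizer step per batch and that the last, possibly smaller, batch is kept; neither is stated in this notebook, and the helper name is hypothetical.

```python
import math


def batches_per_epoch(num_examples, batch_size):
    """Number of batches per epoch when the final partial batch is kept."""
    return math.ceil(num_examples / batch_size)


batch_size = 8        # from TrainConfig
train_batches = 3576  # from the training progress bar
num_epochs = 5        # from TrainConfig

# Any training set size in this range yields exactly `train_batches` batches
min_train = (train_batches - 1) * batch_size + 1
max_train = train_batches * batch_size
print(f"training examples: between {min_train} and {max_train}")

# Total batches processed over the whole run
total_steps = train_batches * num_epochs
print(f"total training steps: {total_steps}")
```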


Training is done! Let's re-evaluate the model.

In [19]:
trainer.evaluate()

Evaluating...   100%|######################################################################| 290/290 [00:07<00:00, 39.46batch/s, f1=0.887, loss=0.445]


{'loss': 0.4447593633920468, 'f1': 0.8866379310344827}
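
The `f1` value comes from the `metrics={"f1": {"task": "multiclass"}}` entry in the `TrainConfig`. The notebook does not say which averaging is applied, so the following is only a sketch of how per-class, macro-averaged, and micro-averaged F1 are computed from predictions, written in pure Python. The function name and the toy labels are made up for illustration and are unrelated to the dataset used here.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (per-class F1 dict, macro F1, micro F1) for single-label data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(y_true, y_pred):
        if true == pred:
            tp[true] += 1
        else:
            fp[pred] += 1  # predicted class gets a false positive
            fn[true] += 1  # true class gets a false negative

    per_class = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class[c] = 2 * tp[c] / denom if denom else 0.0

    # Macro: unweighted mean of the per-class scores
    macro = sum(per_class.values()) / len(labels)
    # Micro: pool the counts over all classes first
    total_tp, total_fp, total_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * total_tp / (2 * total_tp + total_fp + total_fn)
    return per_class, macro, micro


# Toy labels, purely illustrative
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
per_class, macro, micro = f1_scores(y_true, y_pred)
print(per_class, round(macro, 3), round(micro, 3))
```

For single-label multiclass data, micro-averaged F1 equals plain accuracy (4 of 6 correct in the toy example), while macro-averaged F1 gives every class equal weight regardless of its frequency.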

So we trained the model for 5 epochs. As you can see, progress is logged verbosely throughout the process. After each epoch, all metrics are logged and the weights are saved. TensorBoard logs are written to a folder called `runs` (you can change this default), and you can inspect them as usual:

In [None]:
%load_ext tensorboard
%tensorboard --logdir runs/

The weights are saved to `checkpoints/`, as set by `checkpoints_dir` in the `TrainConfig` above (change it there if needed).

### Push to Hub

Now we can push our model, along with some training-specific configs, to the Hub!

In [None]:
trainer.push_to_hub("hezar-ai/bert-fa-sentiment-digikala-snappfood")