# Example: FLAVA training and inference on FHM dataset

This notebook provides a comprehensive guide on using the MATK (Multimodal AI Toolkit) library to train the FLAVA model on the Facebook Hateful Memes (FHM) dataset and run inference with it.

We kindly request that interested researchers acknowledge and adhere to the license agreement for Facebook AI's Hateful Memes dataset. This means that you must download the original dataset from Facebook AI yourself.

## Step 1: Review and Accept Facebook AI's Dataset License Agreement
Researchers can access the Hateful Memes dataset license agreement on the official website at https://hatefulmemeschallenge.com/. After reviewing and accepting its terms, you can download the Hateful Memes dataset, which includes:

* train, dev, dev_seen and test annotations
* images (critical for vision-language multimodal models)
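
Once the files are downloaded, it may help to sanity-check an annotation file before configuring anything. The sketch below assumes the annotations are JSON Lines files in which each record may carry a `label` field (a test split might omit it). The function name is illustrative and not part of MATK.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_annotations(path):
    """Count records per label in a JSON Lines annotation file.

    Assumes one JSON object per line; records without a "label"
    field are counted under the key None.
    """
    counts = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            record = json.loads(line)
            counts[record.get("label")] += 1
    return counts
```

Calling `summarize_annotations("/path/to/train.jsonl")` (with your own path) returns a `Counter` that maps each label value to the number of records carrying it.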

## Step 2: Configuring the Dataset

Locate the **[configs/fhm/normal](https://github.com/Social-AI-Studio/MATK/tree/main/configs/fhm/normal)** folder. We will use the **[flava.yaml](https://github.com/Social-AI-Studio/MATK/blob/main/configs/fhm/normal/flava.yaml)** config for our model.

Everything related to the dataset configuration is stored inside the **data** key:
1. **class_path**: specifies which datamodule to use from **[datamodules/modules](https://github.com/Social-AI-Studio/MATK/blob/merge-preprocessing-to-dataloaders/datamodules/modules.py)**. This goes hand in hand with the **dataset_class** key:

| DataModule           | Dataset           | Usage                      |
|----------------------|-------------------|----------------------------|
| FasterRCNNDataModule | FasterRCNNDataset | For vision-language models |
| ImagesDataModule     | ImagesDataset     | For vision-language models |
| TextDataModule       | TextDataset       | For language models        |


2. **tokenizer_class_or_path**: specifies the tokenizer or processor class/path for the model
3. **frcnn_class_or_path**: specifies the class/path of the Faster R-CNN model used for feature extraction
4. **image_dirs** or **feats_dirs**: specifies the paths to the dataset images or to the extracted image features, respectively
    * Sometimes, you may wish to extract features from the dataset images for use with a model such as LXMERT or VisualBERT. You can use the script provided under **tools/features/extract_features_frcnn.py**.
5. **annotation_filepaths**: paths to the files containing annotations (typically train.jsonl, test.jsonl, etc.)
6. **auxiliary_dicts**: paths to .pkl files containing auxiliary information such as captions for images
7. **labels**: names of the labels for classification
8. **num_workers**: number of worker processes used for data loading; set it to a positive integer to enable multi-process data loading


### Modification

1. Modify the keys inside **image_dirs** and **annotation_filepaths** to match the locations of the downloaded images and annotation files, respectively.
2. You can also adjust the **batch_size**, **num_workers** and **shuffle_train** arguments as needed.
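
The path changes in item 1 can also be applied programmatically. The following is a minimal sketch that works on a plain Python dictionary mirroring the **data** section. The nesting under `init_args`, the split names and all paths are assumptions made for illustration, so check them against the actual flava.yaml before relying on them.

```python
import copy


def set_dataset_paths(data_cfg, image_dir, annotation_files):
    """Return a copy of the data config with updated dataset paths.

    data_cfg         -- dict mirroring the `data` key of the config
    image_dir        -- directory holding the downloaded images
    annotation_files -- mapping of split name to annotation file path
    """
    updated = copy.deepcopy(data_cfg)  # leave the caller's dict untouched
    args = updated.setdefault("init_args", {})
    # Use the same image folder for every split named in annotation_files.
    args["image_dirs"] = {split: image_dir for split in annotation_files}
    args["annotation_filepaths"] = dict(annotation_files)
    return updated


# Hypothetical example values; replace them with your own locations.
example_cfg = {"init_args": {"batch_size": 32, "num_workers": 8}}
new_cfg = set_dataset_paths(
    example_cfg,
    image_dir="/data/fhm/img",
    annotation_files={
        "train": "/data/fhm/train.jsonl",
        "validate": "/data/fhm/dev_seen.jsonl",
    },
)
```

Working on a copy keeps the original dictionary available for comparison, which is handy when preparing several config variants.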

## Step 2 (continued): Configuring the Model

Everything related to the model configuration is stored inside the **model** key:

1. **class_path**: specifies path to file under **[models/](https://github.com/Social-AI-Studio/MATK/tree/main/models)**
2. **model_class_or_path**: specifies the pretrained model to be used
3. **cls_dict**: specifies each label and the number of different values each label can have - this is useful in metric instantiation and logging.

No modification is required for this section.
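
To illustrate why **cls_dict** is useful for metrics, the sketch below uses a hypothetical `cls_dict` with a single binary label to set up one confusion matrix per label and derive accuracy from it. It is plain Python written for this guide and makes no claim about how MATK instantiates its metrics internally.

```python
# Hypothetical cls_dict: one label named "label" with two possible values.
cls_dict = {"label": 2}


def make_confusion_matrices(cls_dict):
    """Create an all-zero num_classes x num_classes matrix for each label."""
    return {
        name: [[0] * num_classes for _ in range(num_classes)]
        for name, num_classes in cls_dict.items()
    }


def update(matrices, name, target, prediction):
    """Record one (target, prediction) pair for the given label."""
    matrices[name][target][prediction] += 1


def accuracy(matrix):
    """Fraction of recorded pairs that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


matrices = make_confusion_matrices(cls_dict)
for target, prediction in [(0, 0), (1, 1), (1, 0), (0, 0)]:
    update(matrices, "label", target, prediction)
print(accuracy(matrices["label"]))  # 3 of 4 pairs are correct: 0.75
```

Knowing the number of values per label up front is what allows the matrices to be sized before any batch is seen.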


## Step 3: Configuring the Trainer

The Trainer helps automate several aspects of training. As the Lightning documentation puts it, the Trainer handles all loop details for you, for example:
* Automatically enabling/disabling grads
* Running the training, validation and test dataloaders
* Calling the Callbacks at the appropriate times
* Putting batches and computations on the correct devices

Everything related to the trainer is specified under the **trainer** key.

### Modification
1. Modify the **dirpath** and **name** arguments under callbacks to choose where your checkpoints are stored and what they are named.
2. Modify the **save_dir** and **name** arguments under logger to choose where your Lightning logs are stored and what they are named.
3. You can also modify other hyperparameters such as **max_epochs**, or tweak the trainer further by adding any of the keys described here: https://lightning.ai/docs/pytorch/stable/common/trainer.html#


## Step 4: Checking your config (optional)

To check that your model, dataset and trainer have been configured correctly, you can use the script located here: **[scripts/fhm/test/test_classifications](https://github.com/Social-AI-Studio/MATK/blob/main/scripts/fhm/test/test_classifications.sh)**. This is particularly useful when you are getting ready to configure and train multiple models.
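
Before launching the test script, a quick check that every configured path exists can catch typos early. The sketch below is independent of MATK: it walks an arbitrary nested dictionary (for example, a parsed config) and reports string values under selected keys that do not exist on disk. The key names come from the dataset section above, while the function names are made up for illustration.

```python
from pathlib import Path

# Config keys whose values are expected to be filesystem paths.
PATH_KEYS = ("image_dirs", "feats_dirs", "annotation_filepaths", "auxiliary_dicts")


def iter_strings(value):
    """Yield every string found in a nested dict/list structure."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def find_missing_paths(config):
    """Return the configured paths that do not exist on disk."""
    missing = []
    if isinstance(config, dict):
        for key, value in config.items():
            if key in PATH_KEYS:
                missing.extend(p for p in iter_strings(value) if not Path(p).exists())
            else:
                missing.extend(find_missing_paths(value))
    elif isinstance(config, (list, tuple)):
        for item in config:
            missing.extend(find_missing_paths(item))
    return missing
```

An empty list means every path under the listed keys was found. Anything else is worth fixing before a long training run starts.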

## Step 5: Model Training

1. Run the training script under **[scripts/fhm/train/flava.sh](https://github.com/Social-AI-Studio/MATK/blob/main/scripts/fhm/train/flava.sh)**

## Step 6: Inference

1. If you wish to use a checkpoint, uncomment the **ckpt_path** key and enter the path to the required model checkpoint 
2. Run the inference script under **scripts/fhm/infer/flava.sh**
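
If several checkpoints were saved during training, picking the most recent one for **ckpt_path** can be scripted. This sketch assumes checkpoints are files ending in `.ckpt` stored somewhere under the **dirpath** set in the trainer configuration. The function name is illustrative, and the most recent file is not necessarily the best-performing one.

```python
from pathlib import Path


def latest_checkpoint(checkpoint_dir):
    """Return the most recently modified .ckpt file, or None if there is none."""
    candidates = list(Path(checkpoint_dir).rglob("*.ckpt"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

The returned path can then be copied into the **ckpt_path** key before running the inference script.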