# This notebook is created based on the MMF Colab Demo template - https://colab.research.google.com/github/facebookresearch/mmf/blob/notebooks/notebooks/mmf_hm_example.ipynb

## Download MMF

In this section, we will download the MMF package and required dependencies.

### Prerequisites 
Please enable GPU in this notebook: Runtime > Change runtime type > Hardware Accelerator > Set to GPU

First, install the MMF package and its dependencies.

In [1]:
#!pip install --pre --upgrade mmf

# Remove the pre-installed mmf
!rm -rf /content/mmf/
!rm -rf /root/.cache/torch/mmf/

# Update PyYAML and imgaug
!pip uninstall -y PyYAML
!pip install PyYAML==5.1.2
!pip uninstall -y imgaug && pip uninstall -y albumentations && pip install git+https://github.com/aleju/imgaug.git

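# Install mmf from source
!git clone https://github.com/facebookresearch/mmf.git
# `!cd` runs in its own subshell and does not change the notebook's working
# directory, so the install below uses the absolute path instead
!pip install --editable /content/mmf/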

Uninstalling PyYAML-3.13:
  Successfully uninstalled PyYAML-3.13
Collecting PyYAML==5.1.2
  Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
     |████████████████████████████████| 266kB 11.4MB/s 
Building wheels for collected packages: PyYAML
  Building wheel for PyYAML (setup.py) ... done
  Created wheel for PyYAML: filename=PyYAML-5.1.2-cp37-cp37m-linux_x86_64.whl size=44103 sha256=d73d574bd620659f00dc6edcbbd45968761cf7436ea2c85a62bae0162ac1ecf2
  Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
Successfully built PyYAML
Installing collected packages: PyYAML
Successfully installed PyYAML-5.1.2
Uninstalling imgaug-0.2.9:
  Successfully uninstalled imgaug-0.2.9
Uninstalling albumentations-0.1.12:
  Successfully uninstalled albumentations-0.1.12
Collecting git+https://github.com/aleju/imgaug.git
  Cloning ht

## Download dataset

We will now download the Hateful Memes dataset. You need two things to download it: (i) the URL of the zip file and (ii) the password for the zip file. To get both, follow these steps:

1. Go to [DrivenData challenge page](https://www.drivendata.org/competitions/64/hateful-memes/)
2. Register, read and acknowledge the agreements for data access.
3. Go to the [data page](https://www.drivendata.org/competitions/64/hateful-memes/data), right click on the "Hateful Memes challenge dataset" link and "Copy Link Address" as shown in the image. This will copy the URL for the zip file to your clipboard which you will use in the next step.
![data](https://i.imgur.com/JQx2hPm.png)
4. Also, note the password provided in the description.
5. Run the next code block and fill in the URL and the zip file's password when prompted.

The code blocks after that will download, convert and visualize the dataset.

URL of Dataset: https://drivendata-competition-fb-hateful-memes-data.s3.amazonaws.com/XjiOc5ycDBRRNwbhRlgH.zip?AWSAccessKeyId=AKIARVBOBDCY4MWEDJKS&Signature=FpmkioFlEFPvW%2FMtmwfZIgJ%2BGCE%3D&Expires=1618941090

Password: EWryfbZyNviilcDF

In [1]:
from getpass import getpass, getuser
#url = getpass("Enter the Hateful Memes data URL:")
#password = getpass("Enter ZIP file's Password:")
#url = 'https://drivendata-competition-fb-hateful-memes-data.s3.amazonaws.com/XjiOc5ycDBRRNwbhRlgH.zip?AWSAccessKeyId=AKIARVBOBDCY4MWEDJKS&Signature=FpmkioFlEFPvW%2FMtmwfZIgJ%2BGCE%3D&Expires=1618941090'
password = 'EWryfbZyNviilcDF'
url = 'https://drivendata-competition-fb-hateful-memes-data.s3.amazonaws.com/XjiOc5ycDBRRNwbhRlgH.zip?AWSAccessKeyId=AKIARVBOBDCY4MWEDJKS&Signature=ey9vLRX9%2FMRFZRKyFOIlJiJtjmo%3D&Expires=1620143289'
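
The cell above hardcodes the URL and password after commenting out the `getpass` prompts. As an alternative, here is a minimal sketch that reads each value from an environment variable and falls back to a hidden prompt. The helper `read_secret` and the variable names `HM_DATA_URL` and `HM_ZIP_PASSWORD` are hypothetical, chosen only for this illustration.

```python
import os
from getpass import getpass


def read_secret(env_name, prompt):
    """Return the value of env_name if it is set, otherwise prompt without echo."""
    value = os.environ.get(env_name)
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input
    return getpass(prompt)


# Usage (hypothetical variable names, not defined by MMF or DrivenData):
# url = read_secret("HM_DATA_URL", "Enter the Hateful Memes data URL: ")
# password = read_secret("HM_ZIP_PASSWORD", "Enter ZIP file's password: ")
```

With this approach the values stay out of the saved notebook.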

The next cell downloads the data.

In [2]:
!curl -o /content/hm.zip "$url" -H 'Referer: https://www.drivendata.org/competitions/64/hateful-memes/data/' --compressed

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   295    0   295    0     0    470      0 --:--:-- --:--:-- --:--:--   470
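
The transfer summary above reports only 295 bytes received, which is far smaller than an image dataset. This may mean the server returned an error response (for example, because the signed URL had expired) rather than the archive. Here is a minimal sketch of a sanity check to run before conversion. The helper `check_download` and the `min_bytes` threshold are assumptions made for illustration.

```python
import os
import zipfile


def check_download(path, min_bytes=1_000_000):
    """Return (ok, reason) for a downloaded archive.

    min_bytes is an arbitrary threshold chosen for this sketch.
    """
    if not os.path.exists(path):
        return False, "file not found"
    size = os.path.getsize(path)
    if size < min_bytes:
        return False, f"file is only {size} bytes"
    # is_zipfile checks for the zip signature without extracting anything
    if not zipfile.is_zipfile(path):
        return False, "file is not a zip archive"
    return True, "ok"


# Usage in the notebook:
# print(check_download("/content/hm.zip"))
```

If the check fails, copying a fresh link from the data page and repeating the download is a reasonable first step.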


The next command will convert the zip file into the required MMF format.

In [3]:
!mmf_convert_hm --zip_file /content/hm.zip --password $password --bypass_checksum=1

2021-04-30 19:08:28.644195: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Data folder is /root/.cache/torch/mmf/data
Zip path is /content/hm.zip
Copying /content/hm.zip
Unzipping /content/hm.zip
Extracting the zip can take time. Sit back and relax.
Moving train.jsonl
Moving dev_seen.jsonl
Moving test_seen.jsonl
Moving dev_unseen.jsonl
Moving test_unseen.jsonl
Moving img


Remove hm.zip to save some space

In [None]:
!rm /content/hm.zip
!rm /root/.cache/torch/mmf/data/datasets/hateful_memes/defaults/images/hm.zip

rm: cannot remove '/content/hm.zip': No such file or directory
rm: cannot remove '/root/.cache/torch/mmf/data/datasets/hateful_memes/defaults/images/hm.zip': No such file or directory


Test/evaluate trained models (.pth checkpoints)

In [7]:
#!ls /content/drive/MyDrive/pth/BachSize32_MaxUpdates15000/
#!rm -rf /content/save/hateful*
#!cp -rf /content/save/visual_bert_final_VB_lr5e6.pth /content/drive/MyDrive/pth/b/
#!cp -rf /content/save/VB_b32_m15000_4LayerBert_hateful_memes_visual_bert_1909805/ /content/drive/MyDrive/pth/b/

# Set the dir for pth files
#dir = '/content/drive/MyDrive/pth/BachSize32_MaxUpdates15000/'
#dir = '/content/drive/MyDrive/pth/BachSize32_MaxUpdates15000_ClsNumLayers4_UnimodalOnly/' 
#=========Image-Grid=======================
# Predict val/test
#!mmf_predict config=projects/hateful_memes/configs/unimodal/image.yaml model=unimodal_image dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'unimodal_image_final.pth' checkpoint.resume_pretrained=False
#!mmf_predict config=projects/hateful_memes/configs/unimodal/image.yaml model=unimodal_image dataset=hateful_memes run_type=test checkpoint.resume_file=$dir'unimodal_image_final.pth' checkpoint.resume_pretrained=False

# Run on val with trained model
#!mmf_run config=projects/hateful_memes/configs/unimodal/image.yaml model=unimodal_image dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'unimodal_image_final.pth' checkpoint.resume_pretrained=False

#==========Text BERT ======================
#!mmf_predict config=projects/hateful_memes/configs/unimodal/bert.yaml model=unimodal_text dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'unimodal_text_final.pth' checkpoint.resume_pretrained=False
#!mmf_predict config=projects/hateful_memes/configs/unimodal/bert.yaml model=unimodal_text dataset=hateful_memes run_type=test checkpoint.resume_file=$dir'unimodal_text_final.pth' checkpoint.resume_pretrained=False
#!mmf_run config=projects/hateful_memes/configs/unimodal/bert.yaml model=unimodal_text dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'unimodal_text_final.pth' checkpoint.resume_pretrained=False

#==========Visual BERT ======================
#!mmf_predict config=projects/hateful_memes/configs/visual_bert/direct.yaml model=visual_bert dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'visual_bert_final.pth' checkpoint.resume_pretrained=False dataset_config.hateful_memes.annotations.val[0]=hateful_memes/defaults/annotations/dev_seen.jsonl dataset_config.hateful_memes.annotations.test[0]=hateful_memes/defaults/annotations/test_seen.jsonl
#!mmf_predict config=projects/hateful_memes/configs/visual_bert/direct.yaml model=visual_bert dataset=hateful_memes run_type=test checkpoint.resume_file=$dir'visual_bert_final.pth' checkpoint.resume_pretrained=False dataset_config.hateful_memes.annotations.val[0]=hateful_memes/defaults/annotations/dev_seen.jsonl dataset_config.hateful_memes.annotations.test[0]=hateful_memes/defaults/annotations/test_seen.jsonl
#!mmf_run config=projects/hateful_memes/configs/visual_bert/direct.yaml model=visual_bert dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'visual_bert_final.pth' checkpoint.resume_pretrained=False

#==========Visual BERT COCO ======================
#!mmf_predict config=projects/hateful_memes/configs/visual_bert/from_coco.yaml model=visual_bert dataset=hateful_memes run_type=val checkpoint.resume_file=$dir'visual_bert_COCO_final.pth' checkpoint.resume_pretrained=False
#!mmf_predict config=projects/hateful_memes/configs/visual_bert/from_coco.yaml model=visual_bert dataset=hateful_memes run_type=test checkpoint.resume_file=$dir'visual_bert_COCO_final.pth' checkpoint.resume_pretrained=False dataset_config.hateful_memes.annotations.val[0]=hateful_memes/defaults/annotations/dev_seen.jsonl dataset_config.hateful_memes.annotations.test[0]=hateful_memes/defaults/annotations/test_seen.jsonl
#!mmf_run config=projects/hateful_memes/configs/visual_bert/from_coco.yaml model=visual_bert dataset=hateful_memes run_type=val checkpoint.resume_file=/content/save/visual_bert_final_coco5e6.pth checkpoint.resume_pretrained=False training.max_updates=6000

# =========Pretrained baseline=====================
#!mmf_run config=projects/hateful_memes/configs/unimodal/image.yaml model=unimodal_image dataset=hateful_memes run_type=val checkpoint.resume_zoo=unimodal_image.hateful_memes.images checkpoint.resume_pretrained=False
#!mmf_run config=projects/hateful_memes/configs/unimodal/bert.yaml model=unimodal_text dataset=hateful_memes run_type=val checkpoint.resume_zoo=unimodal_text.hateful_memes.bert checkpoint.resume_pretrained=False
#!mmf_run config=projects/hateful_memes/configs/visual_bert/direct.yaml model=visual_bert dataset=hateful_memes run_type=val checkpoint.resume_zoo=visual_bert.finetuned.hateful_memes.direct checkpoint.resume_pretrained=False
#!mmf_run config=projects/hateful_memes/configs/visual_bert/from_coco.yaml model=visual_bert dataset=hateful_memes run_type=val checkpoint.resume_zoo=visual_bert.finetuned.hateful_memes.from_coco checkpoint.resume_pretrained=False




Dataset analysis

In [6]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
#import sys
#sys.path.append("/content/mmf/mmf")
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dataset_loc = '/root/.cache/torch/mmf/data/datasets/'
train_ds = dataset_loc + 'hateful_memes/defaults/annotations/train.jsonl'
val_ds = dataset_loc + 'hateful_memes/defaults/annotations/dev_unseen.jsonl'
test_ds = dataset_loc +'hateful_memes/defaults/annotations/test_unseen.jsonl'

val_ds2 = dataset_loc + 'hateful_memes/defaults/annotations/dev_seen.jsonl'
test_ds2 = dataset_loc +'hateful_memes/defaults/annotations/test_seen.jsonl'
print(train_ds)
#!cat $val_ds
train_df = pd.read_json(train_ds, lines=True)
val_df = pd.read_json(val_ds, lines=True)
test_df = pd.read_json(test_ds, lines=True)

val_df2 = pd.read_json(val_ds2, lines=True)
test_df2 = pd.read_json(test_ds2, lines=True)

print(train_df.shape, val_df.shape, test_df.shape, val_df2.shape, test_df2.shape)
print(val_df.iloc[0:10,0:3].values)


/root/.cache/torch/mmf/data/datasets/hateful_memes/defaults/annotations/train.jsonl
(8500, 4) (540, 4) (2000, 3) (500, 4) (1000, 3)
[[76432 'img/76432.png' 0]
 [14270 'img/14270.png' 0]
 [56947 'img/56947.png' 0]
 [35174 'img/35174.png' 0]
 [39264 'img/39264.png' 0]
 [18564 'img/18564.png' 0]
 [42361 'img/42361.png' 0]
 [29067 'img/29067.png' 0]
 [86471 'img/86471.png' 0]
 [51940 'img/51940.png' 0]]
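
The shapes above show four columns for the train and dev splits and three for the test splits, presumably because the test annotations carry no label. Here is a minimal standard-library sketch for checking the class balance of an annotation file. It assumes the label is stored under a `label` key; the helper name `label_counts` is hypothetical.

```python
import json
from collections import Counter


def label_counts(jsonl_path):
    """Count the values of the "label" field in a JSON Lines annotation file."""
    counts = Counter()
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Records without a "label" key are counted under None
            counts[record.get("label")] += 1
    return counts


# Usage in the notebook:
# print(label_counts(train_ds))
```

A strongly skewed count would be worth knowing before interpreting accuracy on the validation split.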


## Configure and tune pretrained model

Refer to https://github.com/facebookresearch/mmf/tree/master/projects/hateful_memes 


In [8]:

#===Type: Unimodal===
#MODEL = 'Text BERT'
#REPLACE_WITH_BASELINE_CONFIG = 'projects/hateful_memes/configs/unimodal/bert.yaml'
#REPLACE_WITH_MODEL_KEY = 'unimodal_text'
#REPLACE_WITH_PRETRAINED_ZOO_KEY = 'unimodal_text.hateful_memes.bert'

#MODEL = 'Image-Grid'
#REPLACE_WITH_BASELINE_CONFIG = 'projects/hateful_memes/configs/unimodal/image.yaml'
#REPLACE_WITH_MODEL_KEY = 'unimodal_image'
#REPLACE_WITH_PRETRAINED_ZOO_KEY = 'unimodal_image.hateful_memes.images'

#===Type: Multimodal (Unimodal Pretraining)===
#MODEL = 'Visual BERT'
#REPLACE_WITH_BASELINE_CONFIG = 'projects/hateful_memes/configs/visual_bert/direct.yaml'
#REPLACE_WITH_MODEL_KEY = 'visual_bert'
#REPLACE_WITH_PRETRAINED_ZOO_KEY = 'visual_bert.finetuned.hateful_memes.direct'

#===Type: Multimodal (Multimodal Pretraining)===
#MODEL = 'Visual BERT COCO'
REPLACE_WITH_BASELINE_CONFIG = 'projects/hateful_memes/configs/visual_bert/from_coco.yaml'
REPLACE_WITH_MODEL_KEY = 'visual_bert'
REPLACE_WITH_PRETRAINED_ZOO_KEY = 'visual_bert.finetuned.hateful_memes.from_coco'
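
Switching baselines in the cell above means commenting and uncommenting groups of lines. As one possible alternative, the same four configurations can be kept in a dictionary and selected by name. The values below are copied from the cell above; the names `BASELINES` and `select_baseline` are hypothetical.

```python
# Config path, model key and zoo key for each baseline listed above
BASELINES = {
    "Text BERT": {
        "config": "projects/hateful_memes/configs/unimodal/bert.yaml",
        "model": "unimodal_text",
        "zoo": "unimodal_text.hateful_memes.bert",
    },
    "Image-Grid": {
        "config": "projects/hateful_memes/configs/unimodal/image.yaml",
        "model": "unimodal_image",
        "zoo": "unimodal_image.hateful_memes.images",
    },
    "Visual BERT": {
        "config": "projects/hateful_memes/configs/visual_bert/direct.yaml",
        "model": "visual_bert",
        "zoo": "visual_bert.finetuned.hateful_memes.direct",
    },
    "Visual BERT COCO": {
        "config": "projects/hateful_memes/configs/visual_bert/from_coco.yaml",
        "model": "visual_bert",
        "zoo": "visual_bert.finetuned.hateful_memes.from_coco",
    },
}


def select_baseline(name):
    """Return the settings for a baseline, with a readable error for typos."""
    if name not in BASELINES:
        raise ValueError(f"unknown baseline {name!r}; choose from {sorted(BASELINES)}")
    return BASELINES[name]


selected = select_baseline("Visual BERT COCO")
REPLACE_WITH_BASELINE_CONFIG = selected["config"]
REPLACE_WITH_MODEL_KEY = selected["model"]
REPLACE_WITH_PRETRAINED_ZOO_KEY = selected["zoo"]
```

The three variables keep their original names, so the `$REPLACE_WITH_...` references in the later cells continue to work.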

Train the model

In [9]:
!mmf_run config=$REPLACE_WITH_BASELINE_CONFIG model=$REPLACE_WITH_MODEL_KEY dataset=hateful_memes training.log_interval=50 \
  training.max_updates=10000 \
  training.batch_size=32 \
  training.evaluation_interval=500 evaluation.predict=true training.fp16=True

2021-05-01 03:31:25.961621: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option config to projects/hateful_memes/configs/visual_bert/from_coco.yaml
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option model to visual_bert
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option datasets to hateful_memes
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option training.log_interval to 50
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option training.max_updates to 10000
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option training.batch_size to 32
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option training.evaluation_interval to 500
2021-05-01T03:31:33 | mmf.utils.configuration: Overriding option evaluation.predict to true


Evaluating Pretrained model on Validation set

In [None]:
# Evaluate the pretrained model on the validation set
# (`!cd` does not persist between notebook commands, so it is omitted here)
!mmf_run config=$REPLACE_WITH_BASELINE_CONFIG model=$REPLACE_WITH_MODEL_KEY dataset=hateful_memes run_type=val checkpoint.resume_zoo=$REPLACE_WITH_PRETRAINED_ZOO_KEY checkpoint.resume_pretrained=False


2021-04-21 20:12:27.771581: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-21T20:12:30 | mmf.utils.configuration: Overriding option config to projects/hateful_memes/configs/unimodal/bert.yaml
2021-04-21T20:12:30 | mmf.utils.configuration: Overriding option model to unimodal_text
2021-04-21T20:12:30 | mmf.utils.configuration: Overriding option datasets to hateful_memes
2021-04-21T20:12:30 | mmf.utils.configuration: Overriding option run_type to val
2021-04-21T20:12:30 | mmf.utils.configuration: Overriding option checkpoint.resume_zoo to unimodal_text.hateful_memes.bert
2021-04-21T20:12:30 | mmf.utils.configuration: Overriding option checkpoint.resume_pretrained to False
2021-04-21T20:12:30 | mmf: Logging to: ./save/train.log
2021-04-21T20:12:30 | mmf_cli.run: Namespace(config_override=None, local_rank=None, opts=['config=projects/hateful_

Evaluating Pretrained model on Test set

In [None]:

# Evaluate the pretrained model on the test set
# (`!cd` does not persist between notebook commands, so it is omitted here)
!mmf_predict config=$REPLACE_WITH_BASELINE_CONFIG model=$REPLACE_WITH_MODEL_KEY dataset=hateful_memes run_type=test checkpoint.resume_zoo=$REPLACE_WITH_PRETRAINED_ZOO_KEY checkpoint.resume_pretrained=False



2021-04-21 20:05:55.082562: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option config to projects/hateful_memes/configs/unimodal/bert.yaml
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option model to unimodal_text
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option datasets to hateful_memes
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option run_type to test
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option checkpoint.resume_zoo to unimodal_text.hateful_memes.bert
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option checkpoint.resume_pretrained to False
2021-04-21T20:05:58 | mmf.utils.configuration: Overriding option evaluation.predict to true
2021-04-21T20:05:58 | mmf: Logging to: ./save/train.log
2021-04-21T20:05:

## Submit a prediction

Now, we will use a pretrained model from MMF to submit a prediction to DrivenData. Run the command in the next block; when it finishes, it prints the path to the generated CSV file. Download that file and upload it to [DrivenData's submission page](https://www.drivendata.org/competitions/64/hateful-memes/submissions/).

In [None]:
!mmf_predict config=projects/hateful_memes/configs/mmbt/defaults.yaml \
  model=mmbt \
  dataset=hateful_memes \
  run_type=test \
  checkpoint.resume_zoo=mmbt.hateful_memes.images \
  training.batch_size=16

## Train an existing model

We will use MMF to train an existing baseline from MMF's model zoo on the Hateful Memes dataset. Run the next code cell to start training the MMBT-Grid model on the dataset. You can adjust the batch size, the maximum number of updates, and the log and evaluation intervals, among other things, by using command line overrides. Read more about MMF's configuration system at https://mmf.readthedocs.io/en/latest/notes/configuration.html.
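
To get a feel for what these overrides imply, here is a small arithmetic sketch. It assumes one optimizer update per batch on a single device (no gradient accumulation) and uses the 8,500 training examples reported by the dataset analysis earlier in this notebook. The helper `schedule_summary` is hypothetical and not part of MMF.

```python
import math


def schedule_summary(num_train, batch_size, max_updates, evaluation_interval):
    """Summarize a training schedule, assuming one update per batch."""
    updates_per_epoch = math.ceil(num_train / batch_size)
    return {
        "updates_per_epoch": updates_per_epoch,
        "approx_epochs": round(max_updates / updates_per_epoch, 1),
        "num_evaluations": max_updates // evaluation_interval,
    }


summary = schedule_summary(
    num_train=8500, batch_size=32, max_updates=10000, evaluation_interval=500
)
print(summary)
```

Under these assumptions, the overrides used in the next cell correspond to 266 updates per epoch, roughly 37.6 passes over the training set, and 20 evaluation runs.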

In [None]:
!mmf_run config=projects/hateful_memes/configs/mmbt/defaults.yaml \
  model=mmbt \
  dataset=hateful_memes \
  training.log_interval=50 \
  training.max_updates=10000 \
  training.batch_size=32 \
  training.evaluation_interval=500

2021-04-21 23:55:35.715105: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option config to projects/hateful_memes/configs/mmbt/defaults.yaml
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option model to mmbt
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option datasets to hateful_memes
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option training.log_interval to 50
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option training.max_updates to 10000
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option training.batch_size to 32
2021-04-21T23:55:38 | mmf.utils.configuration: Overriding option training.evaluation_interval to 500
2021-04-21T23:55:38 | mmf: Logging to: ./save/train.log
2021-04-21T23:55:38 | mmf_cli.run: Namespa

## Build your own model

Using MMF's encoders, modules and utilities, we can easily build a custom model. In this example, we are building a fusion model which fuses ResNet pooled grid features with fasttext embedding vectors to classify a meme as hateful or not hateful. 

Steps involved in building the model are:

1. Create a new processor to get fasttext sentence embeddings (see the MMF documentation for more on processors).
2. Create a new model using encoders from MMF.
3. Move hardcoded values from the model to the configuration.

In [None]:
import torch 

# We will inherit the FastText Processor already present in MMF
from mmf.datasets.processors import FastTextProcessor
# registry is needed to register processor and model to be MMF discoverable
from mmf.common.registry import registry

# Register the processor so that MMF can discover it
@registry.register_processor("fasttext_sentence_vector")
class FastTextSentenceVectorProcessor(FastTextProcessor):
    # Override the call method
    def __call__(self, item):
        # This function is present in FastTextProcessor class and loads
        # fasttext bin
        self._load_fasttext_model(self.model_file)
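        if "text" in item:
            text = item["text"]
        elif "tokens" in item:
            text = " ".join(item["tokens"])
        else:
            # Fail early with a clear message instead of an UnboundLocalError
            raise KeyError("item must contain either 'text' or 'tokens'")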

        # Get a sentence vector for sentence and convert it to torch tensor
        sentence_vector = torch.tensor(
            self.model.get_sentence_vector(text),
            dtype=torch.float
        )

        # Return back a dict
        return {
            "text": sentence_vector
        }
    
    # Make dataset builder happy; no vocabulary is used here, so return None
    def get_vocab_size(self):
        return None

In [None]:
import torch

# registry is needed to register our new model so as to be MMF discoverable
from mmf.common.registry import registry
# All models using MMF need to inherit BaseModel
from mmf.models.base_model import BaseModel
# ProjectionEmbedding will act as proxy encoder for FastText Sentence Vector
from mmf.modules.embeddings import ProjectionEmbedding
# Builder methods for image encoder and classifier
from mmf.utils.build import build_classifier_layer, build_image_encoder

# Register the model for MMF, "concat_vl" key would be used to find the model
@registry.register_model("concat_vl")
class LanguageAndVisionConcat(BaseModel):
    # All models in MMF get first argument as config which contains all
    # of the information you stored in this model's config (hyperparameters)
    def __init__(self, config, *args, **kwargs):
        # This is not needed in most cases as it is just calling the parent's
        # init with the same parameters. But to explain how config is
        # initialized we have kept this
        super().__init__(config, *args, **kwargs)
    
    # This classmethod tells MMF where to look for default config of this model
    @classmethod
    def config_path(cls):
        # Relative to user dir root
        return "/content/hm_example_mmf/configs/models/concat_vl.yaml"
    
    # Each model needs to define a build method where the model's modules
    # are actually built and assigned to the model
    def build(self):
        """
        Config's image_encoder attribute will be used to build an MMF image
        encoder. This config in yaml will look like:

        # "type" parameter specifies the type of encoder we are using here. 
        # In this particular case, we are using resnet152
        type: resnet152
      
        # Parameters are passed to underlying encoder class by 
        # build_image_encoder
        params:
          # Specifies whether to use a pretrained version
          pretrained: true 
          # Pooling type, use max to use AdaptiveMaxPool2D
          pool_type: avg 
      
          # Number of output features from the encoder, -1 for original
          # otherwise, supports between 1 to 9
          num_output_features: 1 
        """
        self.vision_module = build_image_encoder(self.config.image_encoder)

        """
        For the classifier, configuration would look like:
        # Specifies the type of the classifier, in this case mlp
        type: mlp
        # Parameter to the classifier passed through build_classifier_layer
        params:
          # Dimension of the tensor coming into the classifier
          in_dim: 512
          # Dimension of the tensor going out of the classifier
          out_dim: 2
          # Number of MLP layers in the classifier
          num_layers: 0
        """
        self.classifier = build_classifier_layer(self.config.classifier)
        
        # ProjectionEmbedding takes in params directly as it is a module
        # So, pass in kwargs, which are in_dim, out_dim and module
        # whose value would be "linear" as we want a linear layer
        self.language_module = ProjectionEmbedding(
            **self.config.text_encoder.params
        )
        # Dropout value will come from config now
        self.dropout = torch.nn.Dropout(self.config.dropout)
        # Same as Projection Embedding, fusion's layer params (which are param 
        # for linear layer) will come from config now
        self.fusion = torch.nn.Linear(**self.config.fusion.params)
        self.relu = torch.nn.ReLU()

    # Each model in MMF gets a dict called sample_list which contains
    # all of the necessary information returned from the dataset
    def forward(self, sample_list):
        # Text input features will be in "text" key
        text = sample_list["text"]
        # Similarly, image input will be in "image" key
        image = sample_list["image"]

        text_features = self.relu(self.language_module(text))
        image_features = self.relu(self.vision_module(image))

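        # Concatenate the features returned from two modality encoders.
        # Flatten from dim 1 instead of calling squeeze() so that the batch
        # dimension survives when the batch size is 1
        combined = torch.cat(
            [text_features, torch.flatten(image_features, start_dim=1)], dim=1
        )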

        # Pass through the fusion layer, relu and dropout
        fused = self.dropout(self.relu(self.fusion(combined)))

        # Pass final tensor from classifier to get scores
        logits = self.classifier(fused)

        # For loss calculations (automatically done by MMF based on loss defined
        # in the config), we need to return a dict with "scores" key as logits
        output = {"scores": logits}

        # MMF will automatically calculate loss
        return output
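
The model above only works if the dimensions in its configuration agree: the text projection output and the image feature size must add up to the fusion layer's input, and the fusion output must match the classifier's input. Here is a minimal sketch of that check in plain Python. All numeric values except the classifier's `in_dim: 512` and `out_dim: 2` (taken from the docstring above) are assumptions for illustration, and `image_feature_dim` is a hypothetical key, not part of the MMF config.

```python
# Hypothetical configuration values, for illustration only
config = {
    # Assumes 300-dimensional fastText vectors
    "text_encoder": {"params": {"module": "linear", "in_dim": 300, "out_dim": 300}},
    # Assumes 2048 pooled features from ResNet-152 with num_output_features: 1
    "image_feature_dim": 2048,
    "fusion": {"params": {"in_features": 2348, "out_features": 512}},
    "classifier": {"params": {"in_dim": 512, "out_dim": 2}},
}


def check_dims(cfg):
    """Return a list of dimension mismatches; an empty list means consistent."""
    problems = []
    concat_dim = cfg["text_encoder"]["params"]["out_dim"] + cfg["image_feature_dim"]
    fusion = cfg["fusion"]["params"]
    if fusion["in_features"] != concat_dim:
        problems.append(
            f"fusion.in_features is {fusion['in_features']}, "
            f"but the concatenated features have size {concat_dim}"
        )
    classifier_in = cfg["classifier"]["params"]["in_dim"]
    if fusion["out_features"] != classifier_in:
        problems.append(
            f"classifier.in_dim is {classifier_in}, "
            f"but fusion.out_features is {fusion['out_features']}"
        )
    return problems


print(check_dims(config) or "dimensions are consistent")
```

Running a check like this before training can surface a shape mismatch that would otherwise only appear as a runtime error in `forward`.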

Now, we will clone the example repo that we have already created on top of MMF; it contains the code from this colab. We do this so that we don't have to build the configs again from scratch.

In [None]:
!git clone https://github.com/apsdehal/hm_example_mmf /content/hm_example_mmf

Cloning into '/content/hm_example_mmf'...
remote: Enumerating objects: 25, done.
remote: Counting objects: 100% (25/25), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 25 (delta 5), reused 22 (delta 3), pack-reused 0
Unpacking objects: 100% (25/25), done.


## Train your model

In this step, we will train the model we just built. Overrides in dot-list form (for example `training.num_workers=0`) can be passed to `run` as either a dict or a list to override the configuration parameters.
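
To illustrate how a dot list addresses nested configuration keys, here is a minimal pure-Python sketch that expands `key.subkey=value` strings into a nested dictionary. It only illustrates the idea and is not MMF's implementation: the helper `dotlist_to_dict` is hypothetical, and unlike a real configuration system it keeps every value as a string.

```python
def dotlist_to_dict(dotlist):
    """Expand ["a.b=1", "c=x"] into {"a": {"b": "1"}, "c": "x"}."""
    result = {}
    for entry in dotlist:
        key, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {entry!r}")
        parts = key.split(".")
        node = result
        # Walk (and create) the intermediate dictionaries for each prefix
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result


overrides = dotlist_to_dict(
    ["model=concat_vl", "dataset=hateful_memes", "training.num_workers=0"]
)
print(overrides)
```

Here `training.num_workers=0` ends up under the `training` key, which is the sense in which a dotted name selects a nested option.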

In [None]:
!ls /content/hm_example_mmf/configs/experiments/defaults.yaml

ls: cannot access '/content/hm_example_mmf/configs/experiments/defaults.yaml': No such file or directory


In [None]:
import sys
from mmf_cli.run import run
opts = [
    "config='/content/hm_example_mmf/configs/experiments/defaults.yaml'", 
    "model=concat_vl", 
    "dataset=hateful_memes", 
    "training.num_workers=0"
]
run(opts=opts)

2021-04-17T23:20:14 | mmf.utils.configuration: Overriding option config to '/content/hm_example_mmf/configs/experiments/defaults.yaml'
2021-04-17T23:20:14 | mmf.utils.configuration: Overriding option model to concat_vl
2021-04-17T23:20:14 | mmf.utils.configuration: Overriding option datasets to hateful_memes
2021-04-17T23:20:14 | mmf.utils.configuration: Overriding option training.num_workers to 0
2021-04-17T23:20:14 | mmf: Logging to: ./save/train.log
2021-04-17T23:20:14 | mmf_cli.run: Namespace(config_override=None, opts=["config='/content/hm_example_mmf/configs/experiments/defaults.yaml'", 'model=concat_vl', 'dataset=hateful_memes', 'training.num_workers=0'])
2021-04-17T23:20:14 | mmf_cli.run: Torch version: 1.8.1+cu102
2021-04-17T23:20:14 | mmf.utils.general: CUDA Device 0 is: Tesla P100-PCIE-16GB
2021-04-17T23:20:14 | mmf_cli.run: Using seed 14307354
2021-04-17T23:20:14 | mmf.trainers.mmf_trainer

169876453it [03:16, 866696.08it/s]

2021-04-17T23:23:32 | mmf.datasets.processors.processors: fastText bin downloaded at /root/.cache/torch/mmf/wiki.en.bin.
2021-04-17T23:23:32 | mmf.datasets.multi_datamodule: Multitasking disabled by default for single dataset training
2021-04-17T23:23:32 | mmf.datasets.multi_datamodule: Multitasking disabled by default for single dataset training
2021-04-17T23:23:32 | mmf.datasets.multi_datamodule: Multitasking disabled by default for single dataset training
2021-04-17T23:23:32 | mmf.trainers.mmf_trainer: Loading model



Downloading: "https://download.pytorch.org/models/resnet152-b121ed2d.pth" to /root/.cache/torch/hub/checkpoints/resnet152-b121ed2d.pth


HBox(children=(FloatProgress(value=0.0, max=241530880.0), HTML(value='')))


2021-04-17T23:23:45 | mmf.trainers.mmf_trainer: Loading optimizer
2021-04-17T23:23:45 | mmf.trainers.mmf_trainer: Loading metrics
Use OmegaConf.to_yaml(cfg)



2021-04-17T23:23:45 | mmf.trainers.mmf_trainer: ===== Model =====
2021-04-17T23:23:45 | mmf.trainers.mmf_trainer: LanguageAndVisionConcat(
  (vision_module): ResNet152ImageEncoder(
    (model): Sequential(
      (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace=True)
      (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
      (4): Sequential(
        (0): Bottleneck(
          (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (conv2): Conv2d(64, 64, kernel_size=(3




2021-04-17T23:29:04 | mmf.trainers.callbacks.logistics: progress: 100/22000, train/hateful_memes/cross_entropy: 0.6420, train/hateful_memes/cross_entropy/avg: 0.6420, train/total_loss: 0.6420, train/total_loss/avg: 0.6420, max mem: 11928.0, experiment: run, epoch: 1, num_updates: 100, iterations: 100, max_updates: 22000, lr: 0., ups: 0.31, time: 05m 19s 047ms, time_since_start: 05m 19s 067ms, eta: 19h 35m 299ms
2021-04-17T23:32:11 | mmf.trainers.callbacks.logistics: progress: 200/22000, train/hateful_memes/cross_entropy: 0.6322, train/hateful_memes/cross_entropy/avg: 0.6371, train/total_loss: 0.6322, train/total_loss/avg: 0.6371, max mem: 11928.0, experiment: run, epoch: 2, num_updates: 200, iterations: 200, max_updates: 22000, lr: 0.00001, ups: 0.54, time: 03m 06s 487ms, time_since_start: 08m 25s 555ms, eta: 11h 23m 40s 267ms
2021-04-17T23:35:16 | mmf.trainers.callbacks.logistics: progress: 300/22000, train/hateful_memes/cross_entropy: 0.6322, train/hateful_

  0%|          | 0/32 [00:00<?, ?it/s]

  "Sample list has not field 'targets', are you "

  "Sample list has not field 'targets', are you "



  3%|▎         | 1/32 [00:01<00:34,  1.11s/it]

  + "might not work as expected."

  + "might not work as expected."



100%|██████████| 32/32 [00:28<00:00,  1.11it/s]


KeyError: ignored

## Using your module

Since we have cloned the repo that contains the example built in this colab notebook, we can also use it to run the training from the command line, either with the `env.user_dir` option or by setting the environment variable `MMF_USER_DIR`. The next code cell uses the `MMF_USER_DIR` variant.

In [None]:
!MMF_USER_DIR="/content/hm_example_mmf" mmf_run \
  config="configs/experiments/defaults.yaml" \
  model=concat_vl \
  dataset=hateful_memes \
  training.num_workers=0

## Conclusion and Further Steps

In this colab notebook, we learned how to use MMF to train existing models from MMF's zoo and to generate predictions with them. We also learned how to build custom models using the modules and utilities provided in MMF.

If you have any issues, feedback or comments, please reach out to us at mmf@fb.com or open an issue on [GitHub](https://github.com/facebookresearch/mmf/issues/new/choose). We also accept PRs if you want to add your model to MMF, and we are always open to community contributions.

At Facebook AI, we’ll continuously improve and expand on the multimodal capabilities available through MMF, and we welcome contributions from the community as well to build this resource. We hope MMF will be the framework of choice and be a catalyst for research in this area by providing a powerful, versatile platform for multimodal research. 