## Goal: test active sampler implementation

This run uses a placeholder uncertainty of 1 for every task.

Key difference between this and `train-baseline.ipynb`: `config.uncertainty_based_sampling = True` (under "Define a Configuration Object")
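
To make the placeholder concrete, here is a minimal sketch of uncertainty-proportional task sampling. It is not the sampler inside `mtdnn`; the helper names `task_probabilities` and `sample_task` are hypothetical. It only illustrates that when every task has uncertainty 1, the resulting distribution over tasks is uniform.

```python
import random
from collections import Counter


def task_probabilities(uncertainties):
    """Normalize per-task uncertainties into sampling probabilities."""
    total = sum(uncertainties.values())
    return {name: value / total for name, value in uncertainties.items()}


def sample_task(uncertainties, rng):
    """Draw one task name with probability proportional to its uncertainty."""
    names = list(uncertainties)
    weights = [uncertainties[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Placeholder setting used in this notebook: uncertainty = 1 for all tasks.
placeholder = {"mnli": 1.0, "mrpc": 1.0, "sst": 1.0, "qqp": 1.0}
probs = task_probabilities(placeholder)

rng = random.Random(0)  # fixed seed so the draw counts are repeatable
draws = Counter(sample_task(placeholder, rng) for _ in range(1000))

print(probs)
print(draws)
```

With real uncertainty estimates in place of the placeholder, the same code would draw tasks with larger values more often.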

## PyTorch Setup 

In [1]:
# uninstall the default 1.4.0 and run the following command (due to the config of our dev box)
# !pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html


### Multi-Task Text Classification (MNLI, MRPC, SST-2, QQP) using MT-DNN

This notebook uses the pip-installable package that implements the Multi-Task Deep Neural Network (MT-DNN) toolkit for natural language understanding. Training is computationally intensive, so running this notebook on a GPU machine is recommended.

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import json
import os
import shutil
import sys
from datetime import datetime

import pandas as pd
import torch
import tensorflow.io.gfile as gfile

from mtdnn.common.types import EncoderModelType
from mtdnn.configuration_mtdnn import MTDNNConfig
from mtdnn.data_builder_mtdnn import MTDNNDataBuilder
from mtdnn.modeling_mtdnn import MTDNNModel
from mtdnn.process_mtdnn import MTDNNDataProcess
from mtdnn.tasks.config import MTDNNTaskDefs
from mtdnn.tokenizer_mtdnn import MTDNNTokenizer

## Define Configuration, Tasks and Model Objects

In [4]:
DEBUG = True

In [5]:
# Define Configuration, Tasks and Model Objects
ROOT_DIR = 'gs://cs330'
MODEL_ID = datetime.now().strftime('%m%d%H%M')
OUTPUT_DIR = os.path.join(ROOT_DIR, 'checkpoint', MODEL_ID)

DATA_DIR = "../../../glue_data/" if not DEBUG else "../../../debug_data/"
TASK_DATA_DIRS = {
    'qqp': os.path.join(DATA_DIR, "QQP"),
    'mnli': os.path.join(DATA_DIR, "MNLI"),
    'sst': os.path.join(DATA_DIR, "SST-2"),
    'mrpc': os.path.join(DATA_DIR, "MRPC")
    }

In [6]:
# Training parameters
BATCH_SIZE = 16
MULTI_GPU_ON = False
MAX_SEQ_LEN = 128
NUM_EPOCHS = 2 if DEBUG else 5

UNCERTAINTY_BASED_SAMPLING = True
LOG_PER_UPDATES = 4 if DEBUG else 500

In [7]:
TASK_DATA_DIRS

{'qqp': '../../../debug_data/QQP',
 'mnli': '../../../debug_data/MNLI',
 'sst': '../../../debug_data/SST-2',
 'mrpc': '../../../debug_data/MRPC'}

Check the location where the data will be downloaded, the model checkpointed and the logs written.

In [8]:
print(OUTPUT_DIR)

gs://cs330/checkpoint/11120706
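
The printed path follows from the `'%m%d%H%M'` format used for `MODEL_ID`. The sketch below rebuilds it for a fixed timestamp. The function name `make_output_dir` is hypothetical, and `posixpath.join` is used here instead of `os.path.join` so that the `gs://` URI keeps forward slashes on any operating system.

```python
import posixpath
from datetime import datetime


def make_output_dir(root_dir, now):
    """Build a checkpoint directory named after month, day, hour and minute."""
    model_id = now.strftime("%m%d%H%M")
    return posixpath.join(root_dir, "checkpoint", model_id)


# A run started on 12 November 2020 at 07:06 yields the ID 11120706.
example_dir = make_output_dir("gs://cs330", datetime(2020, 11, 12, 7, 6))
print(example_dir)
```

Because the ID has no year or seconds, two runs started within the same minute would share an output directory.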


### Define a Configuration Object 

Create a model configuration object, `MTDNNConfig`, with the parameters needed to initialize the MT-DNN model. Initializing it without any parameters yields a default configuration similar to the one used to initialize a BERT model.

In [9]:
config = MTDNNConfig(batch_size=BATCH_SIZE, 
                     max_seq_len=MAX_SEQ_LEN, 
                     multi_gpu_on=MULTI_GPU_ON,
                     log_per_updates=LOG_PER_UPDATES,
                     uncertainty_based_sampling=UNCERTAINTY_BASED_SAMPLING,
                    )


### Create Task Definition Object  

Define the task parameters to train for and initialize an `MTDNNTaskDefs` object from a task parameter dictionary. The definition can cover a single task or multiple tasks. `MTDNNTaskDefs` accepts a Python dict or a YAML or JSON file containing the task definition(s).

Each task's `data_source_dir` is its entry in `TASK_DATA_DIRS`, for example the `MNLI` directory under `DATA_DIR`.

Each task also carries a set of options, `data_process_opts`, that drive its pre-processing.
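
Since the task definitions can also come from a JSON file, the following sketch writes a task parameter dictionary to disk and reads it back. The file name `tasks.json` and the temporary directory are hypothetical, and the sketch stops at producing the file because the exact way `MTDNNTaskDefs` is given a file path is not shown in this notebook.

```python
import json
import os
import tempfile

# A single-task definition with the same keys used in this notebook.
task_params = {
    "mrpc": {
        "task_name": "mrpc",
        "data_format": "PremiseAndOneHypothesis",
        "encoder_type": "BERT",
        "enable_san": True,
        "metric_meta": ["ACC", "F1"],
        "loss": "CeCriterion",
        "kd_loss": "MseCriterion",
        "n_class": 2,
        "split_names": ["train", "dev", "test"],
        "data_source_dir": "../../../debug_data/MRPC",
        "data_process_opts": {"header": True, "is_train": True, "multi_snli": False},
        "task_type": "Classification",
    }
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "tasks.json")
    # Serialize the definition so it can be versioned outside the notebook.
    with open(json_path, "w") as handle:
        json.dump(task_params, handle, indent=2)
    # Read it back to confirm nothing was lost in the round trip.
    with open(json_path) as handle:
        reloaded = json.load(handle)

print(sorted(reloaded))
```

Keeping the definitions in a file makes it easier to share one task setup between this notebook and `train-baseline.ipynb`.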


In [10]:
default_data_process_opts = {"header": True, "is_train": True, "multi_snli": False,}
default_split_names = ["train", "dev", "test"]
tasks_params = {
    "mnli": {
        "data_format": "PremiseAndOneHypothesis",
        "encoder_type": "BERT",
        "dropout_p": 0.3,
        "enable_san": True,
        "labels": ["contradiction", "neutral", "entailment"],
        "metric_meta": ["ACC"],
        "loss": "CeCriterion",
        "kd_loss": "MseCriterion",
        "n_class": 3,
        "split_names": [
            "train",
            "dev_matched",
            "dev_mismatched",
            "test_matched",
            "test_mismatched",
        ],
        "data_source_dir": TASK_DATA_DIRS['mnli'],
        "data_process_opts": default_data_process_opts,
        "task_type": "Classification",
    },
    "mrpc": {
        "task_name": "mrpc",
        "data_format": "PremiseAndOneHypothesis",
        "encoder_type": "BERT",
        "enable_san": True,
        "metric_meta": ["ACC", "F1"],
        "loss": "CeCriterion",
        "kd_loss": "MseCriterion",
        "n_class": 2,
        "split_names": default_split_names,
        "data_source_dir": TASK_DATA_DIRS['mrpc'],
        "data_process_opts": default_data_process_opts,
        "task_type": "Classification",
    },
    "sst": {
        "task_name": "sst",
        "data_format": "PremiseOnly",
        "encoder_type": "BERT",
        "enable_san": False,
        "metric_meta": ["ACC"],
        "loss": "CeCriterion",
        "kd_loss": "MseCriterion",
        "n_class": 2,
        "split_names": default_split_names,
        "data_source_dir": TASK_DATA_DIRS['sst'],
        "data_process_opts": default_data_process_opts,
        "task_type": "Classification",
    },
    "qqp": {
        "task_name": "qqp",
        "data_format": "PremiseAndOneHypothesis",
        "encoder_type": "BERT",
        "enable_san": True,
        "metric_meta": ["ACC", "F1"],
        "loss": "CeCriterion",
        "kd_loss": "MseCriterion",
        "n_class": 2,
        "split_names": default_split_names,
        "data_source_dir": TASK_DATA_DIRS['qqp'],
        "data_process_opts": default_data_process_opts,
        "task_type": "Classification",
    }
}

# Define the tasks
task_defs = MTDNNTaskDefs(tasks_params)

11/12/2020 07:06:21 - mtdnn.tasks.config - INFO - Mapping Task attributes
11/12/2020 07:06:21 - mtdnn.tasks.config - INFO - Mapping Task attributes
11/12/2020 07:06:21 - mtdnn.tasks.config - INFO - Mapping Task attributes
11/12/2020 07:06:21 - mtdnn.tasks.config - INFO - Mapping Task attributes
11/12/2020 07:06:21 - mtdnn.tasks.config - INFO - Configured task definitions - ['mnli', 'mrpc', 'sst', 'qqp']



### Create the MTDNN Data Tokenizer Object  

Create a tokenizer object, `MTDNNTokenizer`. Based on the model's initial checkpoint, it wraps the Hugging Face transformers library to encode the data into the MT-DNN format. The encoded data becomes the input to the data building stage.


In [11]:
tokenizer = MTDNNTokenizer(do_lower_case=True)

#### Testing out the Tokenizer encode function on a sample text
`tokenizer.encode("What NLP toolkit do you recommend", "MT-DNN is a fantastic toolkit")`

In [12]:
# single sentence
tokenizer.encode("What NLP toolkit do you recommend")

([101, 2054, 17953, 2361, 6994, 23615, 2079, 2017, 16755, 102, 102],
 None,
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1])

In [13]:
# sentence pair
print(tokenizer.encode("What NLP toolkit do you recommend", "MT-DNN is a fantastic toolkit"))

([101, 2054, 17953, 2361, 6994, 23615, 2079, 2017, 16755, 102, 11047, 1011, 1040, 10695, 2003, 1037, 10392, 6994, 23615, 102], None, [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
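
The two outputs above suggest a simple layout: `101` opens the sequence, `102` closes each segment, and the last list marks which segment each token belongs to. The sketch below rebuilds both outputs from the raw token IDs. It is inferred from the printed results only, not taken from the `MTDNNTokenizer` source, the name `encode_pair` is hypothetical, and the middle `None` element of the tuple is ignored.

```python
# Special token IDs as they appear in the outputs above.
CLS_ID = 101
SEP_ID = 102


def encode_pair(ids_a, ids_b):
    """Join two token ID lists and mark each position with its segment (0 or 1)."""
    input_ids = [CLS_ID] + ids_a + [SEP_ID] + ids_b + [SEP_ID]
    # Segment 0 covers the opening token, the first text and its separator.
    # Segment 1 covers the second text and the closing separator.
    type_ids = [0] * (len(ids_a) + 2) + [1] * (len(ids_b) + 1)
    return input_ids, type_ids


# Token IDs copied from the outputs above, without the special tokens.
premise = [2054, 17953, 2361, 6994, 23615, 2079, 2017, 16755]
hypothesis = [11047, 1011, 1040, 10695, 2003, 1037, 10392, 6994, 23615]

pair_ids, pair_types = encode_pair(premise, hypothesis)
single_ids, single_types = encode_pair(premise, [])

print(pair_types)
print(single_ids)
```

Passing an empty second segment reproduces the trailing `102` with type ID 1 seen in the single-sentence output.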


## Data Preprocessing

### Create the Data Builder Object  

Create a data preprocessing object, `MTDNNDataBuilder`. This class converts each task's data into the MT-DNN format.

The data builder uses the model tokenizer to produce each task's vectorized data, which is then used to create the training, development and test PyTorch dataloaders.

In [14]:
task_defs.data_paths_map

{'mnli': {'data_paths': ['../../../debug_data/MNLI/train.tsv',
   '../../../debug_data/MNLI/dev_matched.tsv',
   '../../../debug_data/MNLI/dev_mismatched.tsv',
   '../../../debug_data/MNLI/test_matched.tsv',
   '../../../debug_data/MNLI/test_mismatched.tsv'],
  'data_opts': {'header': True, 'is_train': True, 'multi_snli': False}},
 'mrpc': {'data_paths': ['../../../debug_data/MRPC/train.tsv',
   '../../../debug_data/MRPC/dev.tsv',
   '../../../debug_data/MRPC/test.tsv'],
  'data_opts': {'header': True, 'is_train': True, 'multi_snli': False}},
 'sst': {'data_paths': ['../../../debug_data/SST-2/train.tsv',
   '../../../debug_data/SST-2/dev.tsv',
   '../../../debug_data/SST-2/test.tsv'],
  'data_opts': {'header': True, 'is_train': True, 'multi_snli': False}},
 'qqp': {'data_paths': ['../../../debug_data/QQP/train.tsv',
   '../../../debug_data/QQP/dev.tsv',
   '../../../debug_data/QQP/test.tsv'],
  'data_opts': {'header': True, 'is_train': True, 'multi_snli': False}}}
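
The paths in this map follow a regular pattern: one `.tsv` file per split name inside the task's data source directory. A minimal sketch of that pattern is given below. The helper `build_data_paths` is hypothetical and is not the code `MTDNNTaskDefs` uses internally.

```python
import posixpath


def build_data_paths(data_source_dir, split_names):
    """Return one '<split>.tsv' path per split under the task directory."""
    return [posixpath.join(data_source_dir, f"{split}.tsv") for split in split_names]


mnli_splits = ["train", "dev_matched", "dev_mismatched", "test_matched", "test_mismatched"]
mnli_paths = build_data_paths("../../../debug_data/MNLI", mnli_splits)
mrpc_paths = build_data_paths("../../../debug_data/MRPC", ["train", "dev", "test"])

for path in mnli_paths:
    print(path)
```

A check like this is a quick way to confirm that every expected file exists before starting the slower vectorization step.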

In [15]:
## Load and build data
data_builder = MTDNNDataBuilder(
    tokenizer=tokenizer,
    task_defs=task_defs,
    data_dir='.', #DATA_SOURCE_DIR,
    canonical_data_suffix="canonical_data",
    dump_rows=True,
)

## Build data to MTDNN Format
## Iterable of each specific task and processed data
vectorized_data = data_builder.vectorize()

11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mnli at ./canonical_data/mnli_train.tsv
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mnli at ./canonical_data/mnli_dev_matched.tsv
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mnli at ./canonical_data/mnli_dev_mismatched.tsv
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mnli at ./canonical_data/mnli_test_matched.tsv
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mnli at ./canonical_data/mnli_test_mismatched.tsv
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mrpc at ./canonical_data/mrpc_train.tsv
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Sucessfully loaded and built 64 samples for mrpc at ./canonical_data/mrp

Building Data For Premise and One Hypothesis: 64it [00:00, 1137.07it/s]

11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mnli_train.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 29382.16it/s]

mnli_dev_matched
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MNLI DEV MATCHED' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1114.05it/s]

11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mnli_dev_matched.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 29976.04it/s]

mnli_dev_mismatched
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MNLI DEV MISMATCHED' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1249.50it/s]

11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mnli_dev_mismatched.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 34160.79it/s]

mnli_test_matched
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MNLI TEST MATCHED' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1255.16it/s]

11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mnli_test_matched.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 31907.22it/s]

mnli_test_mismatched
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MNLI TEST MISMATCHED' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1194.15it/s]

11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mnli_test_mismatched.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 30321.41it/s]

mrpc_train
11/12/2020 07:06:21 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MRPC TRAIN' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 996.99it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mrpc_train.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 22304.57it/s]

mrpc_dev
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MRPC DEV' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 943.60it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mrpc_dev.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 21730.39it/s]

mrpc_test
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'MRPC TEST' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 993.50it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/mrpc_test.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 13789.96it/s]

sst_train
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'SST TRAIN' Task



Building Data For Premise Only: 64it [00:00, 3184.70it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/sst_train.json



Saving Data For PremiseOnly: 100%|██████████| 64/64 [00:00<00:00, 32796.02it/s]

sst_dev
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'SST DEV' Task



Building Data For Premise Only: 64it [00:00, 1922.16it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/sst_dev.json



Saving Data For PremiseOnly: 100%|██████████| 64/64 [00:00<00:00, 20862.32it/s]

sst_test
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'SST TEST' Task



Building Data For Premise Only: 64it [00:00, 2062.00it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/sst_test.json



Saving Data For PremiseOnly: 100%|██████████| 64/64 [00:00<00:00, 21289.19it/s]

qqp_train
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'QQP TRAIN' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1681.92it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/qqp_train.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 16003.07it/s]

qqp_dev
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'QQP DEV' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1753.48it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/qqp_dev.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 20323.70it/s]

qqp_test
11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Building Data For 'QQP TEST' Task



Building Data For Premise and One Hypothesis: 64it [00:00, 1692.33it/s]

11/12/2020 07:06:22 - mtdnn.data_builder_mtdnn - INFO - Saving data to ./canonical_data/bert_base_uncased/qqp_test.json



Saving Data For PremiseAndOneHypothesis: 100%|██████████| 64/64 [00:00<00:00, 20460.02it/s]


### Create the Data Processing Object  

Create a data processing object, `MTDNNDataProcess`. It creates the training, development and test PyTorch dataloaders needed for training and testing, and it retrieves the training options required to initialize the model correctly for all tasks.

In [16]:
# Make the Data Preprocess step and update the config with training data updates
data_processor = MTDNNDataProcess(
    config=config, task_defs=task_defs, vectorized_data=vectorized_data
)

11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Starting to process the training data sets
11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Loading mnli_train as task 0
11/12/2020 07:06:22 - mtdnn.dataset_mtdnn - INFO - Loaded 63 samples out of 64
11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Loading mrpc_train as task 1
11/12/2020 07:06:22 - mtdnn.dataset_mtdnn - INFO - Loaded 64 samples out of 64
11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Loading sst_train as task 2
11/12/2020 07:06:22 - mtdnn.dataset_mtdnn - INFO - Loaded 64 samples out of 64
11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Loading qqp_train as task 3
11/12/2020 07:06:22 - mtdnn.dataset_mtdnn - INFO - Loaded 64 samples out of 64
11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Starting to process the testing data sets
11/12/2020 07:06:22 - mtdnn.process_mtdnn - INFO - Loading mnli_dev_matched as task 0
11/12/2020 07:06:22 - mtdnn.dataset_mtdnn - INFO - Loaded 64 samples out of 64
11/12/2020 07:
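
The log above reports 63 usable MNLI training samples and 64 for each of the other tasks. Assuming each task is batched separately with `BATCH_SIZE = 16` and the final partial batch is kept (both are assumptions, not confirmed by the source), the number of training batches per epoch can be worked out as follows.

```python
import math

BATCH_SIZE = 16

# Training set sizes as reported by the "Loaded N samples out of 64" log lines.
train_sizes = {"mnli": 63, "mrpc": 64, "sst": 64, "qqp": 64}

# Rounding up keeps the last, smaller batch instead of dropping it.
batches_per_task = {
    task: math.ceil(size / BATCH_SIZE) for task, size in train_sizes.items()
}
total_batches = sum(batches_per_task.values())

print(batches_per_task)
print(total_batches)
```

The total of 16 agrees with the `Amount of data to go over: 16` line in the training log, which suggests that figure counts batches rather than samples.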

### Instantiate the MTDNN Model

Now we can create an `MTDNNModel`.

In [17]:
model = MTDNNModel(
    config,
    task_defs,
    data_processor=data_processor,
    pretrained_model_name="bert-base-uncased",
    output_dir=OUTPUT_DIR,
)

[34m[1mwandb[0m: Currently logged in as: [33mfeifang24[0m (use `wandb login --relogin` to force relogin)
[34m[1mwandb[0m: wandb version 0.10.10 is available!  To upgrade, please run:
[34m[1mwandb[0m:  $ pip install wandb --upgrade


idx: 0, number of task labels: 3
idx: 1, number of task labels: 2
idx: 2, number of task labels: 2
idx: 3, number of task labels: 2


In [18]:
for idx, dataset in enumerate(model.test_datasets_list):
    print(idx, dataset)

0 mnli_matched
1 mnli_mismatched
2 mrpc
3 sst
4 qqp


In [19]:
for idx, task in enumerate(model.tasks):
    print(idx, task)

0 mnli
1 mrpc
2 sst
3 qqp


## Model Finetuning, Prediction and Evaluation

### Fit and finetune the model for `NUM_EPOCHS` epochs (2 in debug mode, 5 otherwise) and predict on the dev sets

At this point the MT-DNN model can be fit and used to create predictions. `fit` takes an optional `epochs` parameter that overrides the number of epochs set in the `MTDNNConfig` object.

In [20]:
model.fit(epochs=NUM_EPOCHS)

11/12/2020 07:06:32 - mtdnn.modeling_mtdnn - INFO - Total number of params: 109489161
11/12/2020 07:06:32 - mtdnn.modeling_mtdnn - INFO - At epoch 1
11/12/2020 07:06:32 - mtdnn.modeling_mtdnn - INFO - Amount of data to go over: 16
11/12/2020 07:06:32 - mtdnn.modeling_mtdnn - INFO - Updates - [     1] Training Loss - [0.75243] Time Remaining - [0:00:04]
11/12/2020 07:06:32 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 0: MNLI_MATCHED
11/12/2020 07:06:32 - mtdnn.modeling_mtdnn - INFO - predicting 0


	add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
	add_(Tensor other, *, Number alpha) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
  exp_avg.mul_(beta1).add_(1 - beta1, grad)


11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mnli_matched -- Dev loss: 1.283
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mnli_matched -- Dev ACC: 25.000
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 1: MNLI_MISMATCHED
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mnli_mismatched -- Dev loss: 1.165
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mnli_mismatched -- Dev ACC: 35.938
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 2: MRPC
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev loss: 0.634
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev ACC: 67.188
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev F1: 80.374
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 3: SST
11/12/2020 07

11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Task mnli_mismatched -- Dev ACC: 35.938
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 2: MRPC
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev loss: 0.629
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev ACC: 67.188
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev F1: 80.374
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 3: SST
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Task sst -- Dev loss: 0.706
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Task sst -- Dev ACC: 39.062
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 4: QQP
11/12/2020 07:06:42 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:06:43 - mtdnn.modeling_mtdnn - INFO - Task qqp -- Dev lo

11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Task mnli_mismatched -- Dev ACC: 35.938
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 2: MRPC
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev loss: 0.628
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev ACC: 67.188
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev F1: 80.374
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 3: SST
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Task sst -- Dev loss: 0.676
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Task sst -- Dev ACC: 60.938
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 4: QQP
11/12/2020 07:07:06 - mtdnn.modeling_mtdnn - INFO - predicting 0
11/12/2020 07:07:07 - mtdnn.modeling_mtdnn - INFO - Task qqp -- Dev lo
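
The dev metrics in this log are easier to compare across evaluations once they are pulled out of the text. The sketch below parses lines of the form `Task <name> -- Dev <metric>: <value>` with a regular expression. The pattern and the helper `parse_dev_metrics` are assumptions based on the log lines shown above, not part of the `mtdnn` package.

```python
import re

# Matches e.g. "Task mrpc -- Dev ACC: 67.188" anywhere in a log line.
METRIC_RE = re.compile(
    r"Task (?P<task>\w+) -- Dev (?P<metric>\w+): (?P<value>-?\d+(?:\.\d+)?)"
)


def parse_dev_metrics(log_text):
    """Return (task, metric, value) tuples in the order they appear in the log."""
    rows = []
    for line in log_text.splitlines():
        match = METRIC_RE.search(line)
        if match:
            rows.append((match["task"], match["metric"], float(match["value"])))
    return rows


# Lines copied from the output above, including one truncated line.
sample_log = """\
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Evaluating on dev ds 2: MRPC
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev loss: 0.634
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev ACC: 67.188
11/12/2020 07:06:33 - mtdnn.modeling_mtdnn - INFO - Task mrpc -- Dev F1: 80.374
11/12/2020 07:07:07 - mtdnn.modeling_mtdnn - INFO - Task qqp -- Dev lo
"""

rows = parse_dev_metrics(sample_log)
# Keep only the most recent value for each (task, metric) pair.
latest = {(task, metric): value for task, metric, value in rows}

print(rows)
```

Truncated lines such as the final `Task qqp -- Dev lo` carry no value and are skipped rather than guessed at.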