Add MCTest and MCTACO #1197

HaokunLiu · 2020-10-15T04:27:58Z

This PR adds MCTest and MCTACO tasks.
A few worth noting points:

MCTACO introduces an evaluation scheme that measures the EM and F1 on the potential answers of each question, then take the average on all questions.
The new evaluation scheme uses ConcatenateLogitsAccumulator, however, due to this accumulator does not output anything based on guid_list. It was updating self.guid_list incorrectly and not noticed. I fixed ConcatenateLogitsAccumulator here.

Using Roberta-base model, MCTest160 gets 50 accuracy, MCTest500 gets 75 accuracy, MCTACO gets 49 EM and 70 F1. The data splits are more or less different from their original work, for some project design reasons. Nevertheless, the results are comparable or better than what's reported.

jeswan · 2020-10-15T16:51:53Z

Thanks for this PR. Can you please attach/reference the commands to reproduce these results? I want to start collecting these in tests or nyu-jiant. Also, can you add these tasks to the task chart (new adding tasks steps here: https://github.com/nyu-mll/jiant/pull/1154/files)?

zphang

Are these tasks available in NLP/datasets?

zphang · 2020-10-16T04:50:58Z

jiant/tasks/lib/mctaco.py

+                last_question = question
+            examples.append(
+                Example(
+                    # NOTE: get_glue_preds() is dependent on this guid format.


Remove this comment: not a GLUE task.

jeswan

Please add tasks to https://github.com/nyu-mll/jiant/blob/master/guides/tasks/supported_tasks.md

HaokunLiu · 2020-10-20T15:35:52Z

Thanks for this PR. Can you please attach/reference the commands to reproduce these results? I want to start collecting these in tests or nyu-jiant. Also, can you add these tasks to the task chart (new adding tasks steps here: https://github.com/nyu-mll/jiant/pull/1154/files)?

···
WORKING_DIR=... # Choose a working dir

MODELS_DIR=${WORKING_DIR}/models
DATA_DIR=${WORKING_DIR}/data
CACHE_DIR=${WORKING_DIR}/cache
RUN_CONFIG_DIR=${WORKING_DIR}/run_config_dir
OUTPUT_DIR=${WORKING_DIR}/output_dir
MODEL_TYPE=roberta-base

python jiant/scripts/preproc/export_model.py
--model_type ${MODEL_TYPE}
--output_base_path ${MODELS_DIR}/${MODEL_TYPE}

for TASK_NAME in mctest160 mctest500 mctaco
do
srun --mem=24000 --nodes=1 python jiant/proj/main/tokenize_and_cache.py
--task_config_path ${DATA_DIR}/configs/${TASK_NAME}_config.json
--model_type ${MODEL_TYPE}
--model_tokenizer_path ${MODELS_DIR}/${MODEL_TYPE}/tokenizer
--phases train,val
--max_seq_length 256
--do_iter
--smart_truncate
--output_dir ${CACHE_DIR}/${MODEL_TYPE}/${TASK_NAME}
done

for TASK_NAME in mctest160 mctest500 mctaco
do
python /home/hl3236/misc/make_config.py
--task_config_path ${DATA_DIR}/configs/${TASK_NAME}_config.json
--task_cache_base_path ${CACHE_DIR}/${MODEL_TYPE}/${TASK_NAME}
--train_batch_size 16
--accumulation 1
--epochs 30
--output_path ${RUN_CONFIG_DIR}/${TASK_NAME}.json
done

for TASK_NAME in mctest160 mctest500 mctaco
do
for lr in 1e-5
do
srun --mem=24000 --gres=gpu:p40:1 --time=3:00:00 python
jiant/proj/main/runscript.py
run
--ZZsrc ${MODELS_DIR}/${MODEL_TYPE}/config_master.json
--jiant_task_container_config_path ${RUN_CONFIG_DIR}/${TASK_NAME}.json
--model_load_mode from_transformers
--model_path ${MODELS_DIR}/${MODEL_TYPE}/model/${MODEL_TYPE}.p
--learning_rate ${lr}
--force_overwrite
--do_train --do_val
--do_save
--eval_every_steps 500
--no_improvements_for_n_evals 30
--save_checkpoint_every_steps 10000
--output_dir ${OUTPUT_DIR}/task/${TASK_NAME}_${lr}
done
done

grep major ${OUTPUT_DIR}/mctest*/val_metrics.json
···

I use these

codecov · 2020-10-20T15:45:00Z

Codecov Report

Merging #1197 into master will increase coverage by 0.06%.
The diff coverage is 61.94%.

@@            Coverage Diff             @@
##           master    #1197      +/-   ##
==========================================
+ Coverage   56.41%   56.48%   +0.06%     
==========================================
  Files         142      144       +2     
  Lines       10231    10364     +133     
==========================================
+ Hits         5772     5854      +82     
- Misses       4459     4510      +51

Impacted Files	Coverage Δ
jiant/tasks/evaluate/core.py	`36.17% <17.85%> (-0.99%)`	⬇️
jiant/tasks/lib/mctest.py	`68.29% <68.29%> (ø)`
jiant/tasks/lib/mctaco.py	`76.19% <76.19%> (ø)`
jiant/tasks/retrieval.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 76e2826...fcfd058. Read the comment docs.

HaokunLiu · 2020-10-21T14:15:01Z

Are these tasks available in NLP/datasets?

no, they are not.

zphang

Resolve merge conflicts and should be good to go.

HaokunLiu · 2020-10-28T20:04:44Z

@mattr1 @Slash0BZ
We are implementing two datasets of yours, if you have any suggestions, it will be very helpful.

* add mctest and mctaco * Update mctaco.py * add task to suppported tasks

add mctest and mctaco

04fe269

HaokunLiu requested review from jeswan and zphang as code owners October 15, 2020 04:27

zphang requested changes Oct 16, 2020

View reviewed changes

jeswan requested changes Oct 19, 2020

View reviewed changes

Update mctaco.py

bd1f384

add task to suppported tasks

f730da6

zphang approved these changes Oct 21, 2020

View reviewed changes

Merge branch 'master' into mctest_and_mctaco

fcfd058

HaokunLiu requested a review from jeswan October 28, 2020 05:56

jeswan approved these changes Oct 29, 2020

View reviewed changes

zphang merged commit c00360f into master Oct 29, 2020

zphang deleted the mctest_and_mctaco branch October 29, 2020 17:09

jeswan mentioned this pull request Nov 2, 2020

Fix guids handling #1225

Merged

leo-liuzy pushed a commit to leo-liuzy/dynamic_jiant that referenced this pull request Nov 11, 2020

Add MCTest and MCTACO (nyu-mll#1197)

7e2d5be

* add mctest and mctaco * Update mctaco.py * add task to suppported tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add MCTest and MCTACO #1197

Add MCTest and MCTACO #1197

HaokunLiu commented Oct 15, 2020 •

edited

jeswan commented Oct 15, 2020 •

edited

zphang left a comment

zphang Oct 16, 2020

HaokunLiu Oct 20, 2020

jeswan left a comment

HaokunLiu commented Oct 20, 2020

codecov bot commented Oct 20, 2020 •

edited

HaokunLiu commented Oct 21, 2020

zphang left a comment

HaokunLiu commented Oct 28, 2020

Add MCTest and MCTACO #1197

Add MCTest and MCTACO #1197

Conversation

HaokunLiu commented Oct 15, 2020 • edited

jeswan commented Oct 15, 2020 • edited

zphang left a comment

Choose a reason for hiding this comment

zphang Oct 16, 2020

Choose a reason for hiding this comment

HaokunLiu Oct 20, 2020

Choose a reason for hiding this comment

jeswan left a comment

Choose a reason for hiding this comment

HaokunLiu commented Oct 20, 2020

codecov bot commented Oct 20, 2020 • edited

Codecov Report

HaokunLiu commented Oct 21, 2020

zphang left a comment

Choose a reason for hiding this comment

HaokunLiu commented Oct 28, 2020

HaokunLiu commented Oct 15, 2020 •

edited

jeswan commented Oct 15, 2020 •

edited

codecov bot commented Oct 20, 2020 •

edited