
Complete pytorch transformers interface, deprecate old GPT implementation #881

Merged
merged 163 commits into master from pytorch_transformers on Aug 26, 2019

Conversation

HaokunLiu
Member

  • Replace the old GPT implementation with the one from huggingface pytorch-transformers
  • Add GPT2, Transformer-XL, and XLM to the pytorch transformers interface (a minimal usage sketch follows below)
  • Refactor the pytorch transformers interface a little to reduce duplicated / outdated code & comments

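For context, a minimal sketch (assuming the standard pytorch_transformers 1.x API and huggingface model shortcut names; this is not jiant code) of the kind of call the new interface wraps:

```python
import torch
from pytorch_transformers import GPT2Model, GPT2Tokenizer

# Load one of the newly supported models through huggingface pytorch_transformers.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Encode a toy sentence and pull out the final-layer hidden states.
input_ids = torch.tensor([tokenizer.encode("jiant now wraps pytorch_transformers models")])
with torch.no_grad():
    hidden_states = model(input_ids)[0]  # (batch, seq_len, hidden_size)
```

The interface then exposes hidden states like these to jiant's task-specific heads, in the same spirit as the existing BERT path.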
sleepinyourhat and others added 30 commits July 12, 2019 15:01
@HaokunLiu
Member Author

> Generally, this PR looks good. Thanks for taking the time to do this Haokun! My only concern is about TaskModulator. I do see what you mean in that some of the preprocessing belongs to the model, but I think we need to be careful about introducing any more complexity/abstraction to jiant (and making functions that are already long even longer and harder to parse). I'd have to think a bit harder about how to make this abstraction as clear as it can be, but a start would be renaming TaskModulator to something like ModelPreprocessingInterface.

I was thinking about including the tokenizer and indexer inside model_preprocessing_interface as well, and changing the main pipeline from create_tasks -> preprocess -> create model to create_tasks -> create model -> preprocess, so that the members of model_preprocessing_interface can be passed in from the model, instead of being created from args once in preprocess and again in the model. But that would be unnecessarily radical for the sake of #881.

Since you are one of the major developers of jiant, maybe you can consider this idea, and when the time comes, figure out a well-rounded overall architecture for jiant.
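For concreteness, a rough sketch of the shape this abstraction could take (the flag name input_module and the helper method names are illustrative, not the final jiant API):

```python
# Illustrative sketch only: the model owns how inputs are assembled
# (boundary tokens, pair concatenation), and preprocessing calls into it.
class ModelPreprocessingInterface:
    def __init__(self, args):
        # `args.input_module` is a hypothetical flag naming the encoder.
        if args.input_module.startswith(("openai-gpt", "gpt2")):
            self.boundary_token_fn = self._gpt_style
        else:
            self.boundary_token_fn = self._bert_style

    def _bert_style(self, s1, s2=None):
        # BERT-style: [CLS] s1 [SEP] (s2 [SEP])
        return ["[CLS]"] + s1 + ["[SEP]"] + (s2 + ["[SEP]"] if s2 else [])

    def _gpt_style(self, s1, s2=None):
        # GPT-style: <start> s1 (<delim> s2) <extract>
        return ["<start>"] + s1 + ((["<delim>"] + s2) if s2 else []) + ["<extract>"]
```

Preprocessing code would then call model_preprocessing_interface.boundary_token_fn(sent1, sent2) without needing to know which encoder is active.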

@HaokunLiu
Member Author

HaokunLiu commented Aug 26, 2019

> This looks good to me. Do experiments with GPT1 match the old implementation?

| Task | openai-gpt (old) | openai-gpt (new) |
| --- | --- | --- |
| cola | cola_mcc: 0.57882, cola_accuracy: 0.82838 | cola_mcc: 0.554, cola_accuracy: 0.818 |
| sts-b | sts-b_corr: 0.748, sts-b_pearsonr: 0.738, sts-b_spearmanr: 0.758 | sts-b_corr: 0.857, sts-b_pearsonr: 0.859, sts-b_spearmanr: 0.856 |
| mnli | mnli_accuracy: 0.71000 | mnli_accuracy: 0.779 |
| mrpc | mrpc_acc_f1: 0.76916, mrpc_accuracy: 0.71569, mrpc_f1: 0.82263, mrpc_precision: 0.71733, mrpc_recall: 0.96416 | mrpc_acc_f1: 0.825, mrpc_accuracy: 0.794, mrpc_f1: 0.855, mrpc_precision: 0.824, mrpc_recall: 0.889 |
| qnli | qnli_accuracy: 0.750 | qnli_accuracy: 0.846 |
| qqp | qqp_acc_f1: 0.75931, qqp_accuracy: 0.78620, qqp_f1: 0.73242, qqp_precision: 0.66895, qqp_recall: 0.80918 | qqp_acc_f1: 0.824, qqp_accuracy: 0.847, qqp_f1: 0.801, qqp_precision: 0.771, qqp_recall: 0.833 |
| rte | rte_accuracy: 0.560 | rte_accuracy: 0.603 |
| sst | sst_accuracy: 0.928 | sst_accuracy: 0.939 |
| wnli | wnli_accuracy: 0.437 | wnli_accuracy: 0.493 |

Some results differ, but given how the old and new GPT implementations differ, this meets expectations.
The old GPT did not concatenate the two sentences of a pair: when processing pairwise tasks, it embedded the two sentences separately and then combined the two embeddings to make a prediction.
As the table shows, on single-sentence tasks (i.e. cola and sst) the old and new GPT get similar results, while on pair tasks the new GPT is better.
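To make that difference concrete, a simplified sketch of the two pair-handling strategies (`encode` stands in for the sentence encoder, the boundary-token arguments are placeholders, and the feature combination shown for the old path is just one common choice; this is not the actual jiant code):

```python
import torch

def old_gpt_pair(encode, s1_ids, s2_ids):
    # Old implementation: encode each sentence separately, then combine the
    # two sentence embeddings before the classifier.
    e1, e2 = encode(s1_ids), encode(s2_ids)
    return torch.cat([e1, e2, torch.abs(e1 - e2), e1 * e2], dim=-1)

def new_gpt_pair(encode, s1_ids, s2_ids, start, delim, extract):
    # New interface: concatenate the pair with GPT boundary tokens and encode
    # it as one sequence, as in the original OpenAI GPT setup.
    joint = torch.cat([start, s1_ids, delim, s2_ids, extract], dim=-1)
    return encode(joint)
```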

@sleepinyourhat
Contributor

Thanks! That's not that informative of a comparison, but since you're getting numbers in the same ballpark as what OpenAI published, I think that's enough. I agree that it's okay to leave in some awkward abstractions for now—better to get this out there and refactor later than to put too much burden on you for doing it.

There are some new merge conflicts (we moved the config dir), BTW.

Contributor

@pruksmhc pruksmhc left a comment

LGTM!

@sleepinyourhat
Contributor

Ready to merge?

@HaokunLiu
Member Author

Yes, it’s ready.

@sleepinyourhat sleepinyourhat merged commit 3bf415c into master Aug 26, 2019
@HaokunLiu HaokunLiu deleted the pytorch_transformers branch August 26, 2019 19:16
@sleepinyourhat
Contributor

Great, I'll make a proper release tomorrow unless someone beats me to it.

phu-pmh pushed a commit that referenced this pull request Apr 17, 2020
…881)

* Rename namespaces to suppress warnings.

* Revert "Rename namespaces to suppress warnings."

This reverts commit 0cf7b23.

* Initial working-ish attempt.

* Intermediate check-in...

* More partial progress.

* Another pass...

* Fix sep/cls handling, cleanup.

* Further cleanup.

* Keyword name fix.

* Another flag fix.

* Pull debug print.

* Line length cleanup.

* WiC fix.

* Two task setup bugs.

* BoolQ typo

* Improved segment handling.

* Delete unused is_pair_task, other cleanup/fixes.

* Fix deleted path from merge.

* Fix cache path.

* relocate tasks from seminar

* add linguistic phenomena benchmark tasks

* Address (spurious?) tokenization warning.

* Select pool_type automatically to match model.

h/t Haokun Liu

* Config updates.

* Path fix

* add two prefix method and simple LM

* Fix XLNet UNK handling.

* Internal temporary MNLI alternate.

* Revert "Internal temporary MNLI alternate."

This reverts commit 455792a.

* refactor tags in data loader

* Add helper fn tests

* Finish merge

* Remove unused argument.

* update task init

* Possible ReCoRD bug fix

* Cleanup

* Fix merge issues.

* Revert "Remove unused argument."

This reverts commit 96a7c37.

* Assorted responses to Alex's comments.

* Further ReCoRD fix.

* @iftenney's comments.

* Fix/simplify segment logic.

* @W4ngatang's comments

* Cleanup.

* add forward functions

* bugfix

* merge pytorch transformer

* update old process split

* add gpt2

* add get_pretrained_lm_head for transformers

* update filename

* add config

* debug

* update config

* allow evaluate with raw parameter

* debug

* Cleanup

* Fix issues with alternative embeddings_mode settings, max_layer.

* More mix cleanup.

* Masking fix.

* cleanup

* simplify get_seg_ids

* debug

* related adjustments to add pytorch transformers

* pytorch transformer refactor

* formatting

* formatting

* debug

* TransformerXL fix

* update test script

* formatting again

* add note to transfo-xl

* debug

* update test script

* update test script

* tokenized_name change

* cleanup

* pool type fix

* config update

* Update defaults.conf

* rename use_pytorch_transformer

* cleanup

* Update test_preprocess.py

* Update test_checkpointing.py

* Update test_write_preds.py

* clean up

* debug

* name changes

* name changes

* update message

* name changes

* tokenizer name fix

* docstring changes

* name changes

* restore asserts

* add pair embedding for pytorch_transformers

* add max position embedding assert

* deal with gpt-like boundary fn

* roberta tokenizer support

* roberta model support

* roberta embedder

* fix roberta seg_id

* change unused_task_name message

* more test cases for pytorch_transformers_interface

* gpt-style mirrored pair forward func for similarity tasks

* Update environment.yml

* adjust import location

* black

* move import location

* update test script

* add comments to test script

* update test script

* pool type fix

* tokenizer fix

* debug

* special tokens fix

* roberta vocab fix

* roberta tokenizer fix

* clean up

* Update test_pytorch_transformers_interface.py

* add_special_token fix

* black

* fix roberta message logic

* fix embedding extend bug

* black

* clean up

* simplify add_special_token fix

* add assert for lm task & pytorch_transformers

* black

* relocate task_modulator initialization

* minor changes

* rename task_modulator -> model_preprocessing_interface

* change lm_parsing process_split docstring

* black

* add gpt2-large

* update dependency

* update dependency for real

* clean up

* add a forgotten similarity task for gpt

* update setup

* update setup
@jeswan jeswan added the jiant-v1-legacy (Relevant to versions <= v1.3.2) label Sep 17, 2020
Labels: 0.x.0 release on fix (Put out a new 0.x.0 release when this is fixed.), jiant-v1-legacy (Relevant to versions <= v1.3.2)