
Complete pytorch transformers interface, deprecate old GPT implementation #881

Merged
merged 163 commits into master from pytorch_transformers on Aug 26, 2019

Conversation

HaokunLiu
Member

  • Replace the old GPT implementation with the one from huggingface pytorch-transformers
  • Add GPT2, Transformer-XL, and XLM to the pytorch transformers interface (a minimal usage sketch follows below)
  • Refactor the pytorch transformers interface a little to reduce duplicated / outdated code & comments

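For context, a minimal sketch (assuming the standard pytorch_transformers 1.x API and huggingface model shortcut names; this is not jiant code) of the kind of call the new interface wraps:

```python
import torch
from pytorch_transformers import GPT2Model, GPT2Tokenizer

# Load one of the newly supported models through huggingface pytorch_transformers.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Encode a toy sentence and pull out the final-layer hidden states.
input_ids = torch.tensor([tokenizer.encode("jiant now wraps pytorch_transformers models")])
with torch.no_grad():
    hidden_states = model(input_ids)[0]  # (batch, seq_len, hidden_size)
```

The interface then exposes hidden states like these to jiant's task-specific heads, in the same spirit as the existing BERT path.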
sleepinyourhat and others added 30 commits July 12, 2019 15:01
@HaokunLiu
Member Author

> Generally, this PR looks good. Thanks for taking the time to do this Haokun! My only concern is about TaskModulator. I do see what you mean in that some of the preprocessing belongs to the model, but I think we need to be careful about introducing any more complexity/abstraction to jiant (and making functions that are already long even longer and harder to parse). I'd have to think a bit harder about how to make this abstraction as clear as it can be, but a start would be renaming TaskModulator to something like ModelPreprocessingInterface.

I was thinking about including the tokenizer and indexer inside model_preprocessing_interface as well, and changing the main pipeline from create_tasks -> preprocess -> create model to create_tasks -> create model -> preprocess, so that the members of model_preprocessing_interface can be passed in from the model, instead of being created from args once in preprocess and again in the model. But that would be unnecessarily radical for the sake of #881.

Since you are one of the major developers of jiant, maybe you can consider this idea, and when the time comes, figure out a well-rounded overall architecture for jiant.
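For concreteness, a rough sketch of the shape this abstraction could take (the flag name input_module and the helper method names are illustrative, not the final jiant API):

```python
# Illustrative sketch only: the model owns how inputs are assembled
# (boundary tokens, pair concatenation), and preprocessing calls into it.
class ModelPreprocessingInterface:
    def __init__(self, args):
        # `args.input_module` is a hypothetical flag naming the encoder.
        if args.input_module.startswith(("openai-gpt", "gpt2")):
            self.boundary_token_fn = self._gpt_style
        else:
            self.boundary_token_fn = self._bert_style

    def _bert_style(self, s1, s2=None):
        # BERT-style: [CLS] s1 [SEP] (s2 [SEP])
        return ["[CLS]"] + s1 + ["[SEP]"] + (s2 + ["[SEP]"] if s2 else [])

    def _gpt_style(self, s1, s2=None):
        # GPT-style: <start> s1 (<delim> s2) <extract>
        return ["<start>"] + s1 + ((["<delim>"] + s2) if s2 else []) + ["<extract>"]
```

Preprocessing code would then call model_preprocessing_interface.boundary_token_fn(sent1, sent2) without needing to know which encoder is active.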

@HaokunLiu
Member Author

HaokunLiu commented Aug 26, 2019

> This looks good to me. Do experiments with GPT1 match the old implementation?

| Task | openai-gpt (old) | openai-gpt (new) |
| --- | --- | --- |
| cola | cola_mcc: 0.57882, cola_accuracy: 0.82838 | cola_mcc: 0.554, cola_accuracy: 0.818 |
| sts-b | sts-b_corr: 0.748, sts-b_pearsonr: 0.738, sts-b_spearmanr: 0.758 | sts-b_corr: 0.857, sts-b_pearsonr: 0.859, sts-b_spearmanr: 0.856 |
| mnli | mnli_accuracy: 0.71000 | mnli_accuracy: 0.779 |
| mrpc | mrpc_acc_f1: 0.76916, mrpc_accuracy: 0.71569, mrpc_f1: 0.82263, mrpc_precision: 0.71733, mrpc_recall: 0.96416 | mrpc_acc_f1: 0.825, mrpc_accuracy: 0.794, mrpc_f1: 0.855, mrpc_precision: 0.824, mrpc_recall: 0.889 |
| qnli | qnli_accuracy: 0.750 | qnli_accuracy: 0.846 |
| qqp | qqp_acc_f1: 0.75931, qqp_accuracy: 0.78620, qqp_f1: 0.73242, qqp_precision: 0.66895, qqp_recall: 0.80918 | qqp_acc_f1: 0.824, qqp_accuracy: 0.847, qqp_f1: 0.801, qqp_precision: 0.771, qqp_recall: 0.833 |
| rte | rte_accuracy: 0.560 | rte_accuracy: 0.603 |
| sst | sst_accuracy: 0.928 | sst_accuracy: 0.939 |
| wnli | wnli_accuracy: 0.437 | wnli_accuracy: 0.493 |

Some results differ, but given how the old and new GPT implementations differ, this meets expectations.
The old GPT did not concatenate the two sentences of a pair: when processing pairwise tasks, it embedded the two sentences separately and then combined the two embeddings to make a prediction.
As the table shows, on single-sentence tasks (i.e. cola and sst) the old and new GPT get similar results, while on pair tasks the new GPT is better.
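To make that difference concrete, a simplified sketch of the two pair-handling strategies (`encode` stands in for the sentence encoder, the boundary-token arguments are placeholders, and the feature combination shown for the old path is just one common choice; this is not the actual jiant code):

```python
import torch

def old_gpt_pair(encode, s1_ids, s2_ids):
    # Old implementation: encode each sentence separately, then combine the
    # two sentence embeddings before the classifier.
    e1, e2 = encode(s1_ids), encode(s2_ids)
    return torch.cat([e1, e2, torch.abs(e1 - e2), e1 * e2], dim=-1)

def new_gpt_pair(encode, s1_ids, s2_ids, start, delim, extract):
    # New interface: concatenate the pair with GPT boundary tokens and encode
    # it as one sequence, as in the original OpenAI GPT setup.
    joint = torch.cat([start, s1_ids, delim, s2_ids, extract], dim=-1)
    return encode(joint)
```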

@sleepinyourhat
Contributor

Thanks! That's not that informative of a comparison, but since you're getting numbers in the same ballpark as what OpenAI published, I think that's enough. I agree that it's okay to leave in some awkward abstractions for now—better to get this out there and refactor later than to put too much burden on you for doing it.

There are some new merge conflicts (we moved the config dir), BTW.

Contributor

@pruksmhc pruksmhc left a comment

LGTM!

@sleepinyourhat
Contributor

Ready to merge?

@HaokunLiu
Member Author

Yes, it’s ready.

@sleepinyourhat sleepinyourhat merged commit 3bf415c into master Aug 26, 2019
@HaokunLiu HaokunLiu deleted the pytorch_transformers branch August 26, 2019 19:16
@sleepinyourhat
Contributor

Great, I'll make a proper release tomorrow unless someone beats me to it.

phu-pmh pushed a commit that referenced this pull request Apr 17, 2020
…881)

* Rename namespaces to suppress warnings.

* Revert "Rename namespaces to suppress warnings."

This reverts commit 0cf7b23.

* Initial working-ish attempt.

* Intermediate check-in...

* More partial progress.

* Another pass...

* Fix sep/cls handling, cleanup.

* Further cleanup.

* Keyword name fix.

* Another flag fix.

* Pull debug print.

* Line length cleanup.

* WiC fix.

* Two task setup bugs.

* BoolQ typo

* Improved segment handling.

* Delete unused is_pair_task, other cleanup/fixes.

* Fix deleted path from merge.

* Fix cache path.

* relocate tasks from seminar

* add linguistic phenomena benchmark tasks

* Address (spurious?) tokenization warning.

* Select pool_type automatically to match model.

h/t Haokun Liu

* Config updates.

* Path fix

* add two prefix method and simple LM

* Fix XLNet UNK handling.

* Internal temporary MNLI alternate.

* Revert "Internal temporary MNLI alternate."

This reverts commit 455792a.

* refactor tags in data loader

* Add helper fn tests

* Finish merge

* Remove unused argument.

* update task init

* Possible ReCoRD bug fix

* Cleanup

* Fix merge issues.

* Revert "Remove unused argument."

This reverts commit 96a7c37.

* Assorted responses to Alex's comments.

* Further ReCoRD fix.

* @iftenney's comments.

* Fix/simplify segment logic.

* @W4ngatang's comments

* Cleanup.

* add forward functions

* bugfix

* merge pytorch transformer

* update old process split

* add gpt2

* add get_pretrained_lm_head for transformers

* update filename

* add config

* debug

* update config

* allow evaluate with raw parameter

* debug

* Cleanup

* Fix issues with alternative embeddings_mode settings, max_layer.

* More mix cleanup.

* Masking fix.

* cleanup

* simplify get_seg_ids

* debug

* related adjustments to add pytorch transformers

* pytorch transformer refactor

* formatting

* formatting

* debug

* TransformerXL fix

* update test script

* formatting again

* add note to transfo-xl

* debug

* update test script

* update test script

* tokenized_name change

* cleanup

* pool type fix

* config update

* Update defaults.conf

* rename use_pytorch_transformer

* cleanup

* Update test_preprocess.py

* Update test_checkpointing.py

* Update test_write_preds.py

* clean up

* debug

* name changes

* name changes

* update message

* name changes

* tokenizer name fix

* docstring changes

* name changes

* restore asserts

* add pair embedding for pytorch_transformers

* add max position embedding assert

* deal with gpt-like boundary fn

* roberta tokenizer support

* roberta model support

* roberta embedder

* fix roberta seg_id

* change unused_task_name message

* more test cases for pytorch_transformers_interface

* gpt-style mirrored pair forward func for similarity tasks

* Update environment.yml

* adjust import location

* black

* move import location

* update test script

* add comments to test script

* update test script

* pool type fix

* tokenizer fix

* debug

* special tokens fix

* roberta vocab fix

* roberta tokenizer fix

* clean up

* Update test_pytorch_transformers_interface.py

* add_special_token fix

* black

* fix roberta message logic

* fix embedding extend bug

* black

* clean up

* simplify add_special_token fix

* add assert for lm task & pytorch_transformers

* black

* relocate task_modulator initialization

* minor changes

* rename task_modulator -> model_preprocessing_interface

* change lm_parsing process_split docstring

* black

* add gpt2-large

* update dependency

* update dependency for real

* clean up

* add a forgotten similarity task for gpt

* update setup

* update setup
@jeswan jeswan added the jiant-v1-legacy (Relevant to versions <= v1.3.2) label Sep 17, 2020
Labels: 0.x.0 release on fix (Put out a new 0.x.0 release when this is fixed.), jiant-v1-legacy (Relevant to versions <= v1.3.2)