Add new GLUE example with no Trainer. #10555
Conversation
@sgugger Very impressive! `accelerate` seems very user-friendly and I definitely like it.

I think the arguments should be consistent with `transformers==2.11.0` (which is pretty close to Google's old `run_glue.py`). Not because I'm nostalgic, but because we should respect researchers' habits and try not to go against them. Also, I think all the features in the old script are necessary and we shouldn't cut them out. Having something compatible with the latest version of `transformers` but with the same parameters as the old `run_glue.py` would be best, IMO. Researchers typically don't care about fancy features like picking the best checkpoint automatically, but let's at least not have fewer features than Google's `run_glue`.

Also, making predictions (`--do_predict`) is critical for researchers, so I don't really understand why it's supported in the Trainer version of `run_glue` but not here.
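For reference, a prediction pass along these lines would already cover it (just a sketch, not code from this PR: the `--do_predict` flag, `processed_datasets["test"]` and the output format are assumptions layered on top of the objects the example already creates):

```python
# Hypothetical --do_predict pass, reusing accelerator/model/data_collator/is_regression
# from the example script; the flag name and output format are assumptions.
import os

import torch
from torch.utils.data import DataLoader

if args.do_predict:
    test_dataloader = DataLoader(
        processed_datasets["test"], collate_fn=data_collator, batch_size=args.per_device_eval_batch_size
    )
    test_dataloader = accelerator.prepare(test_dataloader)

    model.eval()
    all_predictions = []
    for batch in test_dataloader:
        with torch.no_grad():
            outputs = model(**batch)
        predictions = outputs.logits.argmax(dim=-1) if not is_regression else outputs.logits.squeeze()
        # Gather predictions from all processes before writing them out.
        all_predictions.append(accelerator.gather(predictions).cpu())

    if accelerator.is_main_process:
        all_predictions = torch.cat(all_predictions)
        with open(os.path.join(args.output_dir, f"predictions_{args.task_name}.txt"), "w") as writer:
            for index, item in enumerate(all_predictions.tolist()):
                writer.write(f"{index}\t{item}\n")
```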
    required=True,
)
parser.add_argument(
    "--use_slow_tokenizer",
`--use_native/traditional/python/legacy_tokenizer`? "slow tokenizer" sounds terrible, but it's actually not that slow.

BTW, some users told me the fast tokenizers occasionally output different results from the native ones, but I can't really reproduce that. Should we use the native tokenizers by default just to be safe?
I believe the different results are actually bug fixes, but it's hard to say without an example.
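If someone wants to check on a concrete task, a comparison can be as simple as the following (a sketch; the checkpoint and column names are placeholders, not tied to this PR):

```python
# Compare fast and slow tokenizer outputs on a GLUE task to spot discrepancies.
from datasets import load_dataset
from transformers import AutoTokenizer

checkpoint = "bert-base-cased"  # placeholder checkpoint
fast_tok = AutoTokenizer.from_pretrained(checkpoint, use_fast=True)
slow_tok = AutoTokenizer.from_pretrained(checkpoint, use_fast=False)

dataset = load_dataset("glue", "mrpc", split="validation")
mismatches = 0
for example in dataset:
    fast_ids = fast_tok(example["sentence1"], example["sentence2"])["input_ids"]
    slow_ids = slow_tok(example["sentence1"], example["sentence2"])["input_ids"]
    if fast_ids != slow_ids:
        mismatches += 1
print(f"{mismatches} mismatching examples out of {len(dataset)}")
```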
# In distributed training, the .from_pretrained methods guarantee that only one local process can concurrently
# download model & vocab.
config = AutoConfig.from_pretrained(args.model_name_or_path, num_labels=num_labels, finetuning_task=args.task_name)
tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, use_fast=not args.use_slow_tokenizer)
I personally prefer `use_slow` by default.
Fast tokenizers have been the default of the library since 4.0.0.
optimizer.zero_grad()
progress_bar.update(1)
completed_steps += 1
Logging/saving during training is a necessity for researchers. We should have a `--logging_steps` and a `--saving_steps` here. This kind of logging is very useful; let's add them back? https://github.com/google-research/bert/blob/master/run_classifier.py#L871
In a nutshell, I'll burst into tears if we can just have Google's …
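For what it's worth, wiring those flags into the existing loop could look roughly like this (a sketch only: `--logging_steps`/`--saving_steps` are hypothetical flags, and `loss`, `completed_steps`, `accelerator`, `lr_scheduler`, `logger` and `os` are assumed to come from the example script):

```python
# Inside the training loop, right after completed_steps is incremented.
if args.logging_steps > 0 and completed_steps % args.logging_steps == 0:
    logger.info(f"step {completed_steps}: loss = {loss.item():.4f}, lr = {lr_scheduler.get_last_lr()[0]:.2e}")

if args.saving_steps > 0 and completed_steps % args.saving_steps == 0:
    accelerator.wait_for_everyone()
    unwrapped_model = accelerator.unwrap_model(model)
    if accelerator.is_main_process:
        unwrapped_model.save_pretrained(
            os.path.join(args.output_dir, f"step_{completed_steps}"), save_function=accelerator.save
        )
```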
Some additional comments.
parser.add_argument(
    "--no_correct_bias_in_adam", action="store_true", help="If passed, no bias correction is applied in AdamW."
)
Not really necessary, in my opinion; researchers can just tweak it manually if they need that.
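For illustration, the manual tweak could be as simple as this (a sketch assuming `transformers.AdamW`, which exposes a `correct_bias` argument; `optimizer_grouped_parameters` is the parameter grouping from the example):

```python
# Sketch: disable bias correction by editing the optimizer construction directly,
# instead of adding a --no_correct_bias_in_adam flag.
from transformers import AdamW

optimizer = AdamW(
    optimizer_grouped_parameters,  # same parameter groups as in the example
    lr=args.learning_rate,
    correct_bias=False,  # original BERT-style AdamW without bias correction
)
```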
# Some have all caps in their config, some don't.
label_name_to_id = {k.lower(): v for k, v in model.config.label2id.items()}
if list(sorted(label_name_to_id.keys())) == list(sorted(label_list)):
    label_to_id = {i: label_name_to_id[label_list[i]] for i in range(num_labels)}
Let's add a warning too? Users should know their label assignment is not the default config in `datasets`.
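Such a warning could look roughly like this (a sketch only; the wording is an assumption, `logger` is the one from the example):

```python
# Sketch: warn when the label mapping from the model config is reused instead of
# the default label order of the dataset.
logger.warning(
    f"Reusing the label mapping from the model config: {label_name_to_id}. "
    "Note that this may differ from the default label order of the dataset."
)
```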
if (
    model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id
    and args.task_name is not None
    and is_regression
`and is_regression`? Do you mean `and not is_regression`?
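If that reading is right, the fixed block would be (a sketch combining the lines quoted above, assuming the intent is to skip the remapping for regression tasks such as STS-B):

```python
if (
    model.config.label2id != PretrainedConfig(num_labels=num_labels).label2id
    and args.task_name is not None
    and not is_regression  # only difference: regression tasks keep the default behavior
):
    # Some have all caps in their config, some don't.
    label_name_to_id = {k.lower(): v for k, v in model.config.label2id.items()}
    if list(sorted(label_name_to_id.keys())) == list(sorted(label_list)):
        label_to_id = {i: label_name_to_id[label_list[i]] for i in range(num_labels)}
```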
"Your model seems to have been trained with labels, but they don't match the dataset: ", | ||
f"model labels: {list(sorted(label_name_to_id.keys()))}, dataset labels: {list(sorted(label_list))}." | ||
"\nIgnoring the model labels as a result.", |
Add `Please make sure the label assignment is consistent with {list(sorted(label_list))}.`? Then we should let the users know how to override the label assignment?
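In other words, the full warning could end with something like this (a sketch of the suggested wording only, not the code in the PR):

```python
# Sketch: append the suggested hint to the existing warning message.
logger.warning(
    "Your model seems to have been trained with labels, but they don't match the dataset: "
    f"model labels: {list(sorted(label_name_to_id.keys()))}, dataset labels: {list(sorted(label_list))}."
    "\nIgnoring the model labels as a result. "
    f"Please make sure the label assignment is consistent with {list(sorted(label_list))}."
)
```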
Maybe we should tag other researchers (even external) to give some feedback. cc @VictorSanh @TevenLeScao
Addressed most of your comments except the logging/saving steps. I do not have time to add this right now, so I suggest we merge the current version and someone from the community can finish it.
LGTM! I can make some improvements incl. logging/saving steps.
LGTM, thanks @sgugger. Looking forward to future improvements by @JetRunner too!
* Add new GLUE example with no Trainer.
* Style
* Address review comments
What does this PR do?

This PR adds a new GLUE example that does not use the `Trainer`, leveraging accelerate for distributed training. The necessary instructions are added in the text-classification README.

cc @JetRunner as it should be of interest to you.
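For readers who have not used accelerate before, the core pattern the new example relies on looks roughly like this (a simplified sketch, not the actual script; model, dataloaders, optimizer and scheduler construction are elided):

```python
# Simplified sketch of the accelerate training pattern used by the no-Trainer example.
from accelerate import Accelerator

accelerator = Accelerator()

# prepare() wraps the objects so the same loop runs on CPU, a single GPU or multiple GPUs.
model, optimizer, train_dataloader, eval_dataloader = accelerator.prepare(
    model, optimizer, train_dataloader, eval_dataloader
)

model.train()
for epoch in range(args.num_train_epochs):
    for batch in train_dataloader:
        outputs = model(**batch)
        loss = outputs.loss
        accelerator.backward(loss)  # replaces loss.backward()
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
```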