XLNet evaluation on SQuAD #9351
Comments
Pinging @sgugger here. I think he has more knowledge about the training script than I do.
This is linked to this issue in the tokenizers repo. Until this is solved, the script
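For context on why a tokenizer bug breaks evaluation: the QA post-processing step relies on the fast tokenizer's offset mapping (character spans per token) to turn a predicted token span back into answer text. A minimal hand-built sketch of that mechanism (the offsets below are illustrative, not produced by the real XLNet tokenizer):

```python
# Sketch of how QA post-processing recovers answer text from offsets.
context = "XLNet was released in 2019."

# (start_char, end_char) per token -- the shape of what a fast tokenizer's
# return_offsets_mapping=True yields. Hand-built here for illustration.
offsets = [(0, 5), (6, 9), (10, 18), (19, 21), (22, 26), (26, 27)]

# Suppose the model predicts the span covering the token "2019".
start_tok, end_tok = 4, 4
answer = context[offsets[start_tok][0] : offsets[end_tok][1]]
print(answer)  # "2019"

# If the offsets are degenerate (e.g. all (0, 0), as reported for the XLNet
# fast tokenizer), every candidate answer becomes "" and post-processing
# has no non-empty prediction to fall back on.
bad_offsets = [(0, 0)] * len(offsets)
bad_answer = context[bad_offsets[start_tok][0] : bad_offsets[end_tok][1]]
print(bad_answer == "")  # True
```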
Hi @sgugger, thanks for your answer. However, I'm trying to do a (fair) comparison between models, so using beam search is not an option. I might install another package version that works well with XLNet on SQuAD (I've seen, for example, that v. 3.10 also has some problems in evaluation). Do you know if any previous version is ok, at the moment?
You can always use the legacy script if you can't wait for the fix.
Thank you very much, I was unaware of the legacy scripts. Do I need a particular transformers version to run them? When I run run_squad.py at the moment I get (errors in bold):

01/05/2021 15:51:31 - WARNING - main - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
[INFO|configuration_utils.py:431] 2021-01-05 15:51:31,607 >> loading configuration file https://huggingface.co/xlnet-base-cased/resolve/main/config.json from cache at /home/scasola/.cache/huggingface/transformers/06bdb0f5882dbb833618c81c3b4c996a0c79422fa2c95ffea3827f92fc2dba6b.da982e2e596ec73828dbae86525a1870e513bd63aae5a2dc773ccc840ac5c346
[INFO|tokenization_utils_base.py:1802] 2021-01-05 15:51:32,221 >> loading file https://huggingface.co/xlnet-base-cased/resolve/main/spiece.model from cache at /home/scasola/.cache/huggingface/transformers/df73bc9f8d13bf2ea4dab95624895e45a550a0f0a825e41fc25440bf367ee3c8.d93497120e3a865e2970f26abdf7bf375896f97fde8b874b70909592a6c785c9
**The above exception was the direct cause of the following exception: Traceback (most recent call last):**

This might be related to the tokenizer, as in #7735.
This issue has been automatically marked as stale and been closed because it has not had recent activity. Thank you for your contributions. If you think this still needs to be addressed please comment on this thread. |
I am having the same issue, and a fix would be really nice...
Thank you for opening an issue - Unfortunately, we're limited on bandwidth and fixing QA for XLNet is quite low on our priority list. If you would like to go ahead and fix this issue, we would love to review a PR, but we won't find the time to get to it right away. |
Environment info
transformers version: 4.2.0.dev0
Who can help
XLNet @LysandreJik
Information
Model I am using (Bert, XLNet ...): XLNet
The problem arises when using: the official example scripts (run_qa.py)
The task I am working on is: an official task (SQuAD v2)
To reproduce
I installed the transformers package from source, as required.
When I try to evaluate XLNet on the SQuAD dataset, however, I run into a problem.
In particular, I run the official script as:
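(The exact command was lost in formatting. Based on the TrainingArguments logged below — do_eval=True, per_device_eval_batch_size=1, learning_rate=1e-05, seed=1, output_dir=../../../../squad_results, dataset squad_v2 — it was presumably of roughly this shape; the flag set is a reconstruction, not the original command:)

```shell
python run_qa.py \
  --model_name_or_path xlnet-base-cased \
  --dataset_name squad_v2 \
  --version_2_with_negative \
  --do_eval \
  --per_device_eval_batch_size 1 \
  --learning_rate 1e-5 \
  --seed 1 \
  --output_dir ../../../../squad_results
```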
This is the whole output for reference, most of which is probably not relevant (error in bold):
12/29/2020 22:41:21 - WARNING - main - Process rank: -1, device: cuda:0, n_gpu: 2, distributed training: False, 16-bits training: False
12/29/2020 22:41:21 - INFO - main - Training/evaluation parameters TrainingArguments(output_dir=../../../../squad_results, overwrite_output_dir=False, do_train=False, do_eval=True, do_predict=False, model_parallel=False, evaluation_strategy=EvaluationStrategy.NO, prediction_loss_only=False, per_device_train_batch_size=8, per_device_eval_batch_size=1, gradient_accumulation_steps=1, eval_accumulation_steps=None, learning_rate=1e-05, weight_decay=0.0, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, max_grad_norm=1.0, num_train_epochs=3.0, max_steps=-1, lr_scheduler_type=SchedulerType.LINEAR, warmup_steps=0, logging_dir=runs/Dec29_22-41-21_HLTNLP-GPU-B, logging_first_step=False, logging_steps=500, save_steps=500, save_total_limit=None, no_cuda=False, seed=1, fp16=False, fp16_opt_level=O1, local_rank=-1, tpu_num_cores=None, tpu_metrics_debug=False, debug=False, dataloader_drop_last=False, eval_steps=500, dataloader_num_workers=0, past_index=-1, run_name=../../../../squad_results, disable_tqdm=False, remove_unused_columns=True, label_names=None, load_best_model_at_end=False, metric_for_best_model=None, greater_is_better=None, ignore_data_skip=False, fp16_backend=auto, sharded_ddp=False, label_smoothing_factor=0.0, adafactor=False)
Reusing dataset squad_v2 (/home/scasola/.cache/huggingface/datasets/squad_v2/squad_v2/2.0.0/0e44b51f4035c15e218d53dc9eea5fe7123341982e524818b8500e4094fffb7b)
loading configuration file https://huggingface.co/xlnet-base-cased/resolve/main/config.json from cache at /home/scasola/.cache/huggingface/transformers/06bdb0f5882dbb833618c81c3b4c996a0c79422fa2c95ffea3827f92fc2dba6b.da982e2e596ec73828dbae86525a1870e513bd63aae5a2dc773ccc840ac5c346
Model config XLNetConfig {
"architectures": [
"XLNetLMHeadModel"
],
"attn_type": "bi",
"bi_data": false,
"bos_token_id": 1,
"clamp_len": -1,
"d_head": 64,
"d_inner": 3072,
"d_model": 768,
"dropout": 0.1,
"end_n_top": 5,
"eos_token_id": 2,
"ff_activation": "gelu",
"initializer_range": 0.02,
"layer_norm_eps": 1e-12,
"mem_len": null,
"model_type": "xlnet",
"n_head": 12,
"n_layer": 12,
"pad_token_id": 5,
"reuse_len": null,
"same_length": false,
"start_n_top": 5,
"summary_activation": "tanh",
"summary_last_dropout": 0.1,
"summary_type": "last",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 250
}
},
"untie_r": true,
"use_mems_eval": true,
"use_mems_train": false,
"vocab_size": 32000
}
loading file https://huggingface.co/xlnet-base-cased/resolve/main/spiece.model from cache at /home/scasola/.cache/huggingface/transformers/df73bc9f8d13bf2ea4dab95624895e45a550a0f0a825e41fc25440bf367ee3c8.d93497120e3a865e2970f26abdf7bf375896f97fde8b874b70909592a6c785c9
loading file https://huggingface.co/xlnet-base-cased/resolve/main/tokenizer.json from cache at /home/scasola/.cache/huggingface/transformers/46f47734f3dcaef7e236b9a3e887f27814e18836a8db7e6a49148000058a1a54.2a683f915238b4f560dab0c724066cf0a7de9a851e96b0fb3a1e7f0881552f53
loading weights file https://huggingface.co/xlnet-base-cased/resolve/main/pytorch_model.bin from cache at /home/scasola/.cache/huggingface/transformers/9461853998373b0b2f8ef8011a13b62a2c5f540b2c535ef3ea46ed8a062b16a9.3e214f11a50e9e03eb47535b58522fc3cc11ac67c120a9450f6276de151af987
Some weights of the model checkpoint at xlnet-base-cased were not used when initializing XLNetForQuestionAnsweringSimple: ['lm_loss.weight', 'lm_loss.bias']
Some weights of XLNetForQuestionAnsweringSimple were not initialized from the model checkpoint at xlnet-base-cased and are newly initialized: ['qa_outputs.weight', 'qa_outputs.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Loading cached processed dataset at /home/scasola/.cache/huggingface/datasets/squad_v2/squad_v2/2.0.0/0e44b51f4035c15e218d53dc9eea5fe7123341982e524818b8500e4094fffb7b/cache-c46fe459ef8061d5.arrow
The following columns in the evaluation set don't have a corresponding argument in XLNetForQuestionAnsweringSimple.forward and have been ignored: example_id, offset_mapping.
12/29/2020 22:41:30 - INFO - main - *** Evaluate ***
The following columns in the evaluation set don't have a corresponding argument in XLNetForQuestionAnsweringSimple.forward and have been ignored: example_id, offset_mapping.
***** Running Evaluation *****
Num examples = 12231
Batch size = 2
100%|██████████| 6116/6116 [38:14<00:00, 3.32it/s]
12/29/2020 23:19:57 - INFO - utils_qa - Post-processing 11873 example predictions split into 12231 features.
0%| | 0/11873 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "run_qa.py", line 480, in <module>
    main()
  File "run_qa.py", line 461, in main
    results = trainer.evaluate()
  File "/home/scasola/survey/squad/xlnet/transformers/examples/question-answering/trainer_qa.py", line 62, in evaluate
    eval_preds = self.post_process_function(eval_examples, eval_dataset, output.predictions)
  File "run_qa.py", line 407, in post_processing_function
    is_world_process_zero=trainer.is_world_process_zero(),
  File "/home/scasola/survey/squad/xlnet/transformers/examples/question-answering/utils_qa.py", line 195, in postprocess_qa_predictions
    while predictions[i]["text"] == "":
IndexError: list index out of range
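To make the crash concrete: the loop at utils_qa.py line 195 scans for the best non-empty candidate and assumes one exists. A minimal reproduction of that failure mode, with a defensive variant for comparison (names and the fallback behavior are illustrative, not the script's actual fix):

```python
# Minimal sketch of the failure in postprocess_qa_predictions.
# Each prediction is a dict like {"text": ..., "score": ...}.

def pick_best_non_empty(predictions):
    """Scan as the original loop does; raises IndexError when every
    candidate for an example has empty text (the crash seen above)."""
    i = 0
    while predictions[i]["text"] == "":
        i += 1
    return predictions[i]

def pick_best_non_empty_safe(predictions):
    """Defensive variant: fall back to an empty answer instead of
    running off the end of the candidate list."""
    for pred in predictions:
        if pred["text"]:
            return pred
    return {"text": "", "score": 0.0}

# When the tokenizer's offset mappings are unusable, every candidate
# text ends up empty, reproducing the IndexError:
preds = [{"text": "", "score": 0.9}, {"text": "", "score": 0.1}]
try:
    pick_best_non_empty(preds)
except IndexError:
    print("IndexError: list index out of range")
print(pick_best_non_empty_safe(preds)["text"])  # empty fallback, no crash
```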
Expected behavior
Evaluation of the model saved in the output dir.