scores_for_ground_truths Error for deepset/roberta-base-squad2 model and squad_v2 dataset #30856

rahuljauhari3 · 2024-05-16T12:06:06Z

System Info

I am testing the run_qa.py file on native settings, and it gives me the following error.

2024-05-16 11:46:37.772374: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-05-16 11:46:37.832999: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-16 11:46:38.990959: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/rahul/anaconda3/envs/qwerty/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFRobertaForQuestionAnswering: ['roberta.embeddings.position_ids']

This IS expected if you are initializing TFRobertaForQuestionAnswering from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing TFRobertaForQuestionAnswering from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFRobertaForQuestionAnswering were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFRobertaForQuestionAnswering for predictions without further training.
2024-05-16 11:46:44.559942: W tensorflow/core/framework/dataset.cc:959] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2024-05-16 11:47:03.841362: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node MultiDeviceIteratorGetNextFromShard}}]]
1/1 [==============================] - 19s 19s/step
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11873/11873 [00:01<00:00, 10079.43it/s]
Traceback (most recent call last):
File "/home/rahul/test/transformers/examples/tensorflow/question-answering/run_qa.py", line 846, in <module>
main()
File "/home/rahul/test/transformers/examples/tensorflow/question-answering/run_qa.py", line 803, in main
metrics = compute_metrics(post_processed_eval)
File "/home/rahul/test/transformers/examples/tensorflow/question-answering/run_qa.py", line 653, in compute_metrics
return metric.compute(predictions=p.predictions, references=p.label_ids)
File "/home/rahul/anaconda3/envs/qwerty/lib/python3.10/site-packages/evaluate/module.py", line 467, in compute
output = self._compute(**inputs, **compute_kwargs)
File "/home/rahul/xyz/modules/evaluate_modules/metrics/evaluate-metric--squad/b4e2dbca455821c7367faa26712f378254b69040ebaab90b64bdeb465e4a304d/squad.py", line 110, in _compute
score = compute_score(dataset=dataset, predictions=pred_dict)
File "/home/rahul/xyz/modules/evaluate_modules/metrics/evaluate-metric--squad/b4e2dbca455821c7367faa26712f378254b69040ebaab90b64bdeb465e4a304d/compute_score.py", line 67, in compute_score
exact_match += metric_max_over_ground_truths(exact_match_score, prediction, ground_truths)
File "/home/rahul/xyz/modules/evaluate_modules/metrics/evaluate-metric--squad/b4e2dbca455821c7367faa26712f378254b69040ebaab90b64bdeb465e4a304d/compute_score.py", line 52, in metric_max_over_ground_truths
return max(scores_for_ground_truths)
ValueError: max() arg is an empty sequence

The RoBERTa model works with the squad dataset but not with the squad_v2 dataset.
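The last frame of the traceback can be reproduced without the model or the dataset. The snippet below is a minimal sketch, not the metric's actual source: it reuses the function and variable names visible in the traceback, swaps in a simplified `exact_match_score` (the real one normalizes text more thoroughly), and assumes the failing example is one with no reference answers, which squad_v2 is known to contain for its unanswerable questions.

```python
from typing import Callable, Sequence


def exact_match_score(prediction: str, ground_truth: str) -> bool:
    # Simplified stand-in (assumption): compare after trimming and lowercasing.
    return prediction.strip().lower() == ground_truth.strip().lower()


def metric_max_over_ground_truths(
    metric_fn: Callable[[str, str], bool],
    prediction: str,
    ground_truths: Sequence[str],
) -> bool:
    # Score the prediction against every reference answer and keep the best.
    scores_for_ground_truths = [metric_fn(prediction, gt) for gt in ground_truths]
    # max() raises ValueError when the list of references is empty.
    return max(scores_for_ground_truths)


# An answerable question: at least one reference answer exists.
answerable = metric_max_over_ground_truths(
    exact_match_score, "Paris", ["Paris", "Paris, France"]
)

# An unanswerable question: the list of reference answers is empty.
try:
    metric_max_over_ground_truths(exact_match_score, "", [])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With one or more reference answers the call returns normally; with an empty list it raises `ValueError`, matching the final line of the traceback (the exact message text depends on the Python version).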

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

pip install git+https://github.com/huggingface/transformers
pip install tensorflow
cd transformers/examples/tensorflow/question-answering
python run_qa.py --model_name_or_path deepset/roberta-base-squad2 --output_dir ~/tf_accuracy/results/roberta --dataset_name squad_v2 --do_eval --overwrite_output_dir

Expected behavior

It should run to completion and report the F1 and exact-match scores.


amyeroberts · 2024-05-16T12:33:36Z

cc @Rocketknight1

rahuljauhari3 · 2024-05-28T04:21:21Z

Hi, any updates?

Rocketknight1 · 2024-05-28T17:02:05Z

I reproduced the issue - it seems to only occur with squad_v2, rather than squad. It also occurs in both the PyTorch and TF examples, so my guess is that there's something odd in that dataset that's throwing the example off. I'll see if I can find time to investigate, but it's fairly low priority.
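One candidate for the "something odd" is that squad_v2 includes unanswerable questions whose list of reference answers is empty, while the paths in the traceback point at the `squad` metric module, which takes a max over those references. The sketch below shows one way such a case could be guarded: an empty reference list is treated as a single empty-string answer, the convention used by the official SQuAD 2.0 evaluation script. `safe_max_over_ground_truths` is a hypothetical helper written for illustration, not part of `evaluate` or `run_qa.py`, and `exact_match_score` is again a simplified stand-in.

```python
from typing import Callable, Sequence


def exact_match_score(prediction: str, ground_truth: str) -> bool:
    # Simplified stand-in (assumption): compare after trimming and lowercasing.
    return prediction.strip().lower() == ground_truth.strip().lower()


def safe_max_over_ground_truths(
    metric_fn: Callable[[str, str], bool],
    prediction: str,
    ground_truths: Sequence[str],
) -> bool:
    # Hypothetical helper: for an unanswerable question, the only correct
    # prediction is the empty string, so fall back to [""] as the reference.
    references = list(ground_truths) or [""]
    return max(metric_fn(prediction, gt) for gt in references)


# Unanswerable question, model correctly predicts "no answer".
correct_abstain = safe_max_over_ground_truths(exact_match_score, "", [])

# Unanswerable question, model wrongly predicts a span.
wrong_span = safe_max_over_ground_truths(exact_match_score, "Paris", [])

# Answerable question behaves as before.
answerable = safe_max_over_ground_truths(exact_match_score, "Paris", ["Paris"])
```

This only illustrates the failure mode and a possible guard; whether the example script should instead select a metric that already handles unanswerable questions is a separate question that the thread leaves open.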

amyeroberts added the TensorFlow and Examples labels on May 16, 2024
