scores_for_ground_truths Error for deepset/roberta-base-squad2 model and squad_v2 dataset #30856

Open
2 of 4 tasks
rahuljauhari3 opened this issue May 16, 2024 · 3 comments
Labels
Examples (Which is related to examples in general), TensorFlow (Anything TensorFlow)

Comments

@rahuljauhari3

System Info

I am testing the run_qa.py file with native settings, and it gives me the following error.

2024-05-16 11:46:37.772374: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-05-16 11:46:37.832999: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-16 11:46:38.990959: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/home/rahul/anaconda3/envs/qwerty/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFRobertaForQuestionAnswering: ['roberta.embeddings.position_ids']
- This IS expected if you are initializing TFRobertaForQuestionAnswering from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFRobertaForQuestionAnswering from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFRobertaForQuestionAnswering were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFRobertaForQuestionAnswering for predictions without further training.
2024-05-16 11:46:44.559942: W tensorflow/core/framework/dataset.cc:959] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2024-05-16 11:47:03.841362: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
     [[{{node MultiDeviceIteratorGetNextFromShard}}]]
1/1 [==============================] - 19s 19s/step
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11873/11873 [00:01<00:00, 10079.43it/s]
Traceback (most recent call last):
  File "/home/rahul/test/transformers/examples/tensorflow/question-answering/run_qa.py", line 846, in <module>
    main()
  File "/home/rahul/test/transformers/examples/tensorflow/question-answering/run_qa.py", line 803, in main
    metrics = compute_metrics(post_processed_eval)
  File "/home/rahul/test/transformers/examples/tensorflow/question-answering/run_qa.py", line 653, in compute_metrics
    return metric.compute(predictions=p.predictions, references=p.label_ids)
  File "/home/rahul/anaconda3/envs/qwerty/lib/python3.10/site-packages/evaluate/module.py", line 467, in compute
    output = self._compute(**inputs, **compute_kwargs)
  File "/home/rahul/xyz/modules/evaluate_modules/metrics/evaluate-metric--squad/b4e2dbca455821c7367faa26712f378254b69040ebaab90b64bdeb465e4a304d/squad.py", line 110, in _compute
    score = compute_score(dataset=dataset, predictions=pred_dict)
  File "/home/rahul/xyz/modules/evaluate_modules/metrics/evaluate-metric--squad/b4e2dbca455821c7367faa26712f378254b69040ebaab90b64bdeb465e4a304d/compute_score.py", line 67, in compute_score
    exact_match += metric_max_over_ground_truths(exact_match_score, prediction, ground_truths)
  File "/home/rahul/xyz/modules/evaluate_modules/metrics/evaluate-metric--squad/b4e2dbca455821c7367faa26712f378254b69040ebaab90b64bdeb465e4a304d/compute_score.py", line 52, in metric_max_over_ground_truths
    return max(scores_for_ground_truths)
ValueError: max() arg is an empty sequence

The RoBERTa model works with the squad dataset but not with the squad_v2 dataset.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

pip install git+https://github.com/huggingface/transformers
pip install tensorflow
cd transformers/examples/tensorflow/question-answering
python run_qa.py --model_name_or_path deepset/roberta-base-squad2 --output_dir ~/tf_accuracy/results/roberta --dataset_name squad_v2 --do_eval --overwrite_output_dir

Expected behavior

It should pass and produce the F1 score and exact match.

@amyeroberts
Collaborator

cc @Rocketknight1

@amyeroberts added the TensorFlow and Examples labels on May 16, 2024
@rahuljauhari3
Author

Hi, any updates?

@Rocketknight1
Member

I reproduced the issue. It seems to occur only with squad_v2, not with squad. It also occurs in both the PyTorch and TF examples, so my guess is that something odd in that dataset is throwing the example off. I'll see if I can find time to investigate, but it's fairly low priority.
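
For anyone landing here, a minimal sketch of one way the dataset could trigger this error. It assumes, which this thread does not confirm, that squad_v2 marks unanswerable questions with an empty list of reference answers. The traceback path contains `evaluate-metric--squad`, so the scoring code that ran appears to be the one for squad. `metric_max_over_ground_truths` below only mirrors the shape of the function named in the traceback, `exact_match_score` is a simplified stand-in, and `safe_max_over_ground_truths` is a hypothetical guard for illustration, not library code.

```python
from typing import Callable, List


def exact_match_score(prediction: str, ground_truth: str) -> bool:
    # Simplified stand-in: the real metric normalizes text more thoroughly.
    return prediction.strip().lower() == ground_truth.strip().lower()


def metric_max_over_ground_truths(
    metric_fn: Callable[[str, str], bool],
    prediction: str,
    ground_truths: List[str],
) -> bool:
    # Same shape as the function in the traceback: score the prediction
    # against every reference answer and keep the best score.
    scores_for_ground_truths = [metric_fn(prediction, gt) for gt in ground_truths]
    return max(scores_for_ground_truths)


def safe_max_over_ground_truths(
    metric_fn: Callable[[str, str], bool],
    prediction: str,
    ground_truths: List[str],
) -> bool:
    # Hypothetical guard: with no reference answers, count the prediction
    # as correct only if it is empty (i.e. the model also said "no answer").
    if not ground_truths:
        return prediction.strip() == ""
    return metric_max_over_ground_truths(metric_fn, prediction, ground_truths)


answerable = ["Paris", "paris, France"]
unanswerable: List[str] = []  # assumed shape of an unanswerable reference

print(metric_max_over_ground_truths(exact_match_score, "Paris", answerable))

try:
    metric_max_over_ground_truths(exact_match_score, "", unanswerable)
except ValueError as err:
    # max() over an empty list raises ValueError, as in the report above.
    print(f"ValueError: {err}")

print(safe_max_over_ground_truths(exact_match_score, "", unanswerable))
```

If that assumption holds, the fix is more likely to be scoring squad_v2 with a metric that understands unanswerable questions than patching `max()`. As far as I know the example script has a `--version_2_with_negative` option for this purpose, but that has not been verified against this report.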
