Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug in eval_batch_retrieval of eval_rag.py #9089

Merged
merged 1 commit into from
Dec 15, 2020

Conversation

yoshitomo-matsubara
Copy link
Contributor

What does this PR do?

Following the instructions in RAG example, I was trying to evaluate retrieval against DPR evaluation data.

pipenv run python eval_rag.py --model_name_or_path facebook/rag-sequence-nq --model_type rag_sequence --evaluation_set output/biencoder-nq-dev.questions --gold_data_path output/biencoder-nq-dev.pages --predictions_path output/retrieval_preds.tsv --eval_mode retrieval --k 1

With the above command, I faced the following error and confirmed that question_enc_outputs is a tuple whose length is 1.

...
loading weights file https://huggingface.co/facebook/rag-sequence-nq/resolve/main/pytorch_model.bin from cache at /home/ubuntu/.cache/huggingface/transformers/9456ce4ba210322153f704e0f26c6228bd6c0caad60fe1b3bdca001558adbeca.ee816b8e716f9741a2ac602bb9c6f4d84eff545b0b00a6c5353241bea6dec221
All model checkpoint weights were used when initializing RagSequenceForGeneration.

All the weights of RagSequenceForGeneration were initialized from the model checkpoint at facebook/rag-sequence-nq.
If your task is similar to the task the model of the checkpoint was trained on, you can already use RagSequenceForGeneration for predictions without further training.
initializing retrieval
Loading index from https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/
loading file https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/hf_bert_base.hnswSQ8_correct_phi_128.c_index.index.dpr from cache at /home/ubuntu/.cache/huggingface/transformers/a481b3aaed56325cb8901610e03e76f93b47f4284a1392d85e2ba5ce5d40d174.a382b038f1ea97c4fbad3098cd4a881a7cd4c5f73902c093e0c560511655cc0b
loading file https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/hf_bert_base.hnswSQ8_correct_phi_128.c_index.index_meta.dpr from cache at /home/ubuntu/.cache/huggingface/transformers/bb9560964463bc761c682818cbdb4e1662e91d25a9407afb102970f00445678c.f8cbe3240b82ffaad54506b5c13c63d26ff873d5cfabbc30eef9ad668264bab4
7it [00:00, 212.77it/s]
Traceback (most recent call last):
  File "eval_rag.py", line 315, in <module>
    main(args)
  File "eval_rag.py", line 301, in main
    answers = evaluate_batch_fn(args, model, questions)
  File "eval_rag.py", line 99, in evaluate_batch_retrieval
    question_enc_pool_output = question_enc_outputs.pooler_output
AttributeError: 'tuple' object has no attribute 'pooler_output'

With this simple change (question_enc_outputs.pooler_output -> question_enc_outputs[0]), I got to run the evaluation code and confirmed
INFO:__main__:Precision@1: 70.74

Environments

  • Ubuntu 18.04 LTS
  • Python 3.7.7
  • transformers 4.0.1
  • torch: 1.7.1

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors which may be interested in your PR.

@ola13 (confirmed by git blame) @patrickvonplaten @lhoestq

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. It's saver this way

@patrickvonplaten
Copy link
Contributor

@lhoestq - feel free to merge if you're ok with the PR

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right ! thanks for fixing it

@lhoestq lhoestq merged commit 44c340f into huggingface:master Dec 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants