Fix a bug in eval_batch_retrieval of eval_rag.py #9089

yoshitomo-matsubara · 2020-12-14T01:49:45Z

What does this PR do?

Following the instructions in RAG example, I was trying to evaluate retrieval against DPR evaluation data.

pipenv run python eval_rag.py --model_name_or_path facebook/rag-sequence-nq --model_type rag_sequence --evaluation_set output/biencoder-nq-dev.questions --gold_data_path output/biencoder-nq-dev.pages --predictions_path output/retrieval_preds.tsv --eval_mode retrieval --k 1

With the above command, I faced the following error and confirmed that question_enc_outputs is a tuple whose length is 1.

...
loading weights file https://huggingface.co/facebook/rag-sequence-nq/resolve/main/pytorch_model.bin from cache at /home/ubuntu/.cache/huggingface/transformers/9456ce4ba210322153f704e0f26c6228bd6c0caad60fe1b3bdca001558adbeca.ee816b8e716f9741a2ac602bb9c6f4d84eff545b0b00a6c5353241bea6dec221
All model checkpoint weights were used when initializing RagSequenceForGeneration.

All the weights of RagSequenceForGeneration were initialized from the model checkpoint at facebook/rag-sequence-nq.
If your task is similar to the task the model of the checkpoint was trained on, you can already use RagSequenceForGeneration for predictions without further training.
initializing retrieval
Loading index from https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/
loading file https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/hf_bert_base.hnswSQ8_correct_phi_128.c_index.index.dpr from cache at /home/ubuntu/.cache/huggingface/transformers/a481b3aaed56325cb8901610e03e76f93b47f4284a1392d85e2ba5ce5d40d174.a382b038f1ea97c4fbad3098cd4a881a7cd4c5f73902c093e0c560511655cc0b
loading file https://storage.googleapis.com/huggingface-nlp/datasets/wiki_dpr/hf_bert_base.hnswSQ8_correct_phi_128.c_index.index_meta.dpr from cache at /home/ubuntu/.cache/huggingface/transformers/bb9560964463bc761c682818cbdb4e1662e91d25a9407afb102970f00445678c.f8cbe3240b82ffaad54506b5c13c63d26ff873d5cfabbc30eef9ad668264bab4
7it [00:00, 212.77it/s]
Traceback (most recent call last):
  File "eval_rag.py", line 315, in <module>
    main(args)
  File "eval_rag.py", line 301, in main
    answers = evaluate_batch_fn(args, model, questions)
  File "eval_rag.py", line 99, in evaluate_batch_retrieval
    question_enc_pool_output = question_enc_outputs.pooler_output
AttributeError: 'tuple' object has no attribute 'pooler_output'

With this simple change (question_enc_outputs.pooler_output -> question_enc_outputs[0]), I got to run the evaluation code and confirmed
INFO:__main__:Precision@1: 70.74

Environments

Ubuntu 18.04 LTS
Python 3.7.7
transformers 4.0.1
torch: 1.7.1

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors which may be interested in your PR.

@ola13 (confirmed by git blame) @patrickvonplaten @lhoestq

patrickvonplaten

This looks good to me. It's saver this way

patrickvonplaten · 2020-12-14T11:38:04Z

@lhoestq - feel free to merge if you're ok with the PR

lhoestq

Right ! thanks for fixing it

fix a bug in eval_batch_retrieval

b00f54a

LysandreJik requested review from lhoestq and patrickvonplaten December 14, 2020 03:33

patrickvonplaten approved these changes Dec 14, 2020

View reviewed changes

lhoestq approved these changes Dec 15, 2020

View reviewed changes

lhoestq merged commit 44c340f into huggingface:master Dec 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix a bug in eval_batch_retrieval of eval_rag.py #9089

Fix a bug in eval_batch_retrieval of eval_rag.py #9089

yoshitomo-matsubara commented Dec 14, 2020

patrickvonplaten left a comment

patrickvonplaten commented Dec 14, 2020

lhoestq left a comment

Fix a bug in eval_batch_retrieval of eval_rag.py #9089

Fix a bug in eval_batch_retrieval of eval_rag.py #9089

Conversation

yoshitomo-matsubara commented Dec 14, 2020

What does this PR do?

Environments

Before submitting

Who can review?

patrickvonplaten left a comment

Choose a reason for hiding this comment

patrickvonplaten commented Dec 14, 2020

lhoestq left a comment

Choose a reason for hiding this comment