Thank you for the great work! I would like to reproduce the evaluation on GQA, but I am not sure how to do that.
I am working with the 1000 samples of GQA that you provided with the code and used gpt-3.5-turbo-0613.
However, I got an accuracy of 33.2, which is more than 10 percentage points lower than the reported accuracy.
I used 'results/craft_tools/5_deduplicated_tool.csv' as the toolset and kept the default configuration in retrieval_gqa_config.yaml.
Can you help me reproduce the results?
Also, if possible, could you share the outputs you obtained when running the code with gpt-3.5?
Thanks in advance!