Evaluation error on PubMedQA dataset #760
Comments
Thanks for your interest in LMFlow! The LMFlow benchmark hasn't supported automatic PubMedQA evaluation yet, but modifying it should not be that difficult. @2003pro, I am wondering if you could take a look?
Here is the key regex for extracting the answer from responses generated by the LoRA-tuned model. You can check it against your evaluation script:
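(The regex itself was not captured in this thread. As a hedged illustration only, an answer-extraction helper for PubMedQA-style yes/no/maybe responses might look like the sketch below; the pattern, function name, and fallback value are all assumptions, not LMFlow's actual code.)

```python
import re

# Hypothetical sketch: PubMedQA answers are yes/no/maybe, so one plausible
# extraction strategy is to take the first such token in the response.
# This is NOT the regex referenced above, which was not shown in the thread.
ANSWER_PATTERN = re.compile(r"\b(yes|no|maybe)\b", re.IGNORECASE)

def extract_answer(response: str) -> str:
    """Return the first yes/no/maybe token found, lowercased, else 'unknown'."""
    match = ANSWER_PATTERN.search(response)
    return match.group(1).lower() if match else "unknown"
```

A stricter pattern (e.g. anchored to a "the answer is ..." phrase) may be needed in practice, since a free-form response can contain incidental "no" tokens.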
Thanks for your reply. I can use
I can't evaluate a model on the PubMedQA dataset. I use a command such as
The error is "NotImplementedError: benchmarking dataset PubMedQA is not supported".
Simply adding the PubMedQA dataset to LOCAL_DATSET_GROUP_MAP in benchmarking.py does not solve it. The new problem is
It seems that something goes wrong while tokenizing the test dataset.
Could you please tell me what I should do to fix this? Thanks very much.
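(For context, the NotImplementedError described above is consistent with the benchmark doing a lookup in a dataset-to-group map before evaluation. The sketch below is a hedged guess at that pattern; the map contents and the function name are hypothetical, not LMFlow's actual benchmarking.py code.)

```python
# Hypothetical sketch of the lookup that would raise the reported error,
# assuming LOCAL_DATSET_GROUP_MAP maps each supported dataset name to a
# group label. The "medical" entry here is illustrative only.
LOCAL_DATSET_GROUP_MAP = {"PubMedQA": "medical"}

def get_dataset_group(name: str) -> str:
    if name not in LOCAL_DATSET_GROUP_MAP:
        raise NotImplementedError(
            f"benchmarking dataset {name} is not supported")
    return LOCAL_DATSET_GROUP_MAP[name]
```

As the thread notes, registering the dataset name in such a map only clears the first error; the downstream tokenization of the test split must also handle the dataset's format.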