[Docs] GSM8K Accuracy Evaluation doc update #25360
Signed-off-by: David Chen <530634352@qq.com>
Code Review
This pull request updates a command in the `README.md` for the GSM8K evaluation, correcting the path to the `gsm8k_eval.py` script. The change is a good improvement. I've added a suggestion to use `python3` explicitly to avoid potential issues on systems where `python` might refer to Python 2. Additionally, please note that the path in the `pytest` command on line 10 of the same README file also appears to be incorrect and could be fixed for consistency.
# Run evaluation
- python tests/gsm8k/gsm8k_eval.py --port 8000
+ python tests/evals/gsm8k/gsm8k_eval.py --port 8000
The associated script `gsm8k_eval.py` uses a `python3` shebang (`#!/usr/bin/env python3`). To ensure this command runs reliably across different user environments, it's best practice to use `python3` explicitly. On some systems, `python` may still point to an older, incompatible Python 2 installation, which would cause the script to fail.
- python tests/evals/gsm8k/gsm8k_eval.py --port 8000
+ python3 tests/evals/gsm8k/gsm8k_eval.py --port 8000
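As a complement to the suggestion above, here is a minimal sketch of an interpreter guard that a script could run at startup so that a Python 2 invocation fails with a clear message. It is not taken from `gsm8k_eval.py`; the helper name `require_python3` is hypothetical, and the guard only helps if the rest of the file still parses under Python 2.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3 or newer.

    Hypothetical helper for illustration; not part of gsm8k_eval.py.
    """
    return version_info[0] >= 3


# Fail early with a readable hint instead of an obscure error later on.
if not require_python3():
    sys.exit("This script requires Python 3; try running it with `python3`.")
```

Invoking the command with `python3` explicitly, as suggested, remains the simpler fix; a guard like this only improves the error message when someone runs the wrong interpreter anyway.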
Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Purpose
Fix the GSM8K eval doc.
Run standalone evaluation script
Test Plan
Test Result