feat: add auto-budget option #33
Pull request overview
This PR adds an --auto_budget flag that enables automatic determination of reasoning steps needed for each question during MCQA evaluation. When enabled, the system uses a model call with a single example to predict the appropriate reasoning budget for each question, overriding the manual --reason_budget setting.
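The override behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/gimbench/arguments.py`: the `--auto_budget` and `--reason_budget` flag names come from the PR, while the `int` type, the default of `0`, the help strings, and the `build_parser` and `resolve_budget` helpers are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two budget-related flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="MCQA evaluation (sketch)")
    # Assumed type and default; the real definition may differ.
    parser.add_argument(
        "--reason_budget",
        type=int,
        default=0,
        help="Manually chosen number of reasoning steps.",
    )
    # Boolean switch: present means auto-determination is enabled.
    parser.add_argument(
        "--auto_budget",
        action="store_true",
        help="Predict the reasoning budget per question; overrides --reason_budget.",
    )
    return parser


def resolve_budget(args: argparse.Namespace, predicted_budget: int) -> int:
    """Hypothetical helper: the predicted budget wins when --auto_budget is set."""
    return predicted_budget if args.auto_budget else args.reason_budget
```

With `--reason_budget 3 --auto_budget` and a predicted budget of 5, `resolve_budget` returns 5; without `--auto_budget` it returns the manual value 3.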
Changes:
- Added `--auto_budget` command-line flag for automatic reasoning budget determination
- Modified `_form_cot_query` method in GIMEvaluator to dynamically determine reasoning budget via model inference
- Added error handling with fallback to budget of 1 when auto-determination fails
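The determine-then-fall-back pattern listed above might look something like the sketch below. It is not the actual `_form_cot_query` implementation: the `determine_budget` function, the `ask_model` callable, the prompt wording, and the integer parsing are all hypothetical stand-ins. Only the single-example prompt and the fallback to a budget of 1 come from the PR description.

```python
import re
from typing import Callable

# Fallback value stated in the PR description.
FALLBACK_BUDGET = 1


def determine_budget(question: str, ask_model: Callable[[str], str]) -> int:
    """Ask the model how many reasoning steps a question needs.

    `ask_model` is a hypothetical callable that sends a prompt to the model
    and returns its text reply. Any failure yields FALLBACK_BUDGET.
    """
    # One worked example followed by the real question (assumed wording).
    prompt = (
        "Example:\n"
        "Question: What is 2 + 2?\n"
        "Reasoning steps needed: 1\n\n"
        f"Question: {question}\n"
        "Reasoning steps needed:"
    )
    try:
        reply = ask_model(prompt)
        match = re.search(r"\d+", reply)
        if match is None:
            # The reply contained no number at all.
            return FALLBACK_BUDGET
        budget = int(match.group())
        # Treat a non-positive budget as a failed prediction.
        return budget if budget >= 1 else FALLBACK_BUDGET
    except Exception:
        # Model call failed (network error, timeout, bad response, ...).
        return FALLBACK_BUDGET
```

Catching a broad `Exception` keeps one bad model call from aborting a whole evaluation run, at the cost of hiding the cause. Logging the exception before returning the fallback would be a reasonable refinement.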
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/gimbench/arguments.py | Added --auto_budget flag argument definition |
| src/gimbench/mcqa/evaluators.py | Implemented auto-budget logic with model-based determination and error handling |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
for more information, see https://pre-commit.ci
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
Fixes #31