refactor(graders): fix deprecated parameters and improve input validation #56
Conversation
…tion

- Fix import paths in docstrings to use correct module paths
- Rename deprecated 'grader_mode' parameter to 'mode' in multiple graders
- Add input type validation to format graders (json_validator, length_penalty, ngram_repetition_penalty, reasoning_format, reasoning_tool_format)
- Fix think pattern regex in ngram_repetition_penalty to include <think></think> tags
- Fix error handling in relevance.py to return GraderScore instead of GraderError
- Fix parameter name 'output' to 'response' in instruction_following docstring example
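For context, here is a rough illustration of the kind of regex fix described for the think pattern; the actual pattern and variable name in ngram_repetition_penalty may differ, and `THINK_PATTERN` is a hypothetical name:

```python
import re

# A non-greedy body that may be empty, so a bare "<think></think>" pair
# is matched as well as a populated reasoning block.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)

text = "<think></think>The capital of France is Paris."
match = THINK_PATTERN.search(text)
if match:
    reasoning = match.group(1)  # empty string for an empty think block
```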
Summary of Changes

Hello @XiaoBoAI, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request focuses on enhancing the stability and maintainability of the grading system by addressing several technical debts. It standardizes parameter names, strengthens input validation across various format graders, and refines error handling mechanisms. These changes collectively contribute to a more robust and predictable grading process, ensuring that graders operate correctly with expected inputs and provide consistent outputs even in error scenarios.
…ndling

GraderScore does not have an 'error' attribute. When RelevanceGrader catches an exception, it stores the error message in the 'reason' field.
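A minimal sketch of the handling this commit describes; the base class name `LLMGrader` and the exact GraderScore fields are assumptions, not taken from the codebase:

```python
class RelevanceGrader(LLMGrader):  # assumed base class name
    async def aevaluate(self, messages) -> GraderScore:
        try:
            return await super().aevaluate(messages)
        except Exception as e:
            # GraderScore has no 'error' attribute, so the failure
            # message travels in the 'reason' field instead.
            return GraderScore(score=0.0, reason=f"Evaluation failed: {e}")
```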
Code Review
This pull request introduces several valuable refactorings and fixes across the grader modules. The changes include updating deprecated parameters, fixing incorrect import paths in docstrings, and improving input validation for several format graders. Key bug fixes include correcting a regex pattern in ngram_repetition_penalty and fixing the error handling in relevance.py to return the correct type.
My review includes a couple of suggestions for further improvement:
- In relevance.py, I've noted that metadata from the super-class evaluation is being discarded, which could be preserved.
- For the format graders, I've suggested refactoring the duplicated input validation logic into a reusable decorator to improve maintainability (a sketch follows below).
Overall, these are solid improvements that enhance the robustness and consistency of the codebase.
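A minimal sketch of what such a decorator could look like. The `require_string_response` name is taken from a later commit in this PR; the signature and the choice to raise a TypeError (rather than return a GraderError) are assumptions:

```python
from functools import wraps

def require_string_response(func):
    """Reject non-string responses before the grader runs (illustrative sketch)."""
    @wraps(func)
    def wrapper(self, response, *args, **kwargs):
        if not isinstance(response, str):
            # Assumption: invalid input surfaces as a TypeError; the real
            # graders might instead return an error object.
            raise TypeError(
                f"{type(self).__name__} expects a str response, "
                f"got {type(response).__name__}"
            )
        return func(self, response, *args, **kwargs)
    return wrapper
```

Applied as `@require_string_response` on each format grader's evaluate method, this removes the need to repeat the same isinstance check in json_validator, length_penalty, and the other format graders.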
…ceGrader

Address review feedback: preserve result.metadata from super().aevaluate() to avoid losing important evaluation details from the LLM.
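A sketch of the metadata pass-through this commit describes; the base class name and the GraderScore fields are assumptions based on the discussion above:

```python
class RelevanceGrader(LLMGrader):  # assumed base class name
    async def aevaluate(self, messages) -> GraderScore:
        result = await super().aevaluate(messages)
        return GraderScore(
            score=result.score,
            reason=result.reason,
            # Pass the parent's metadata through instead of discarding it
            # when re-wrapping the result.
            metadata=result.metadata,
        )
```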
…ed params

- Replace grader_mode with mode parameter in graders (deprecated param)
- Return GraderError instead of GraderScore on evaluation errors
- Add input validation for response type in format graders
- Update tests to check error field instead of reason for errors
- Fix docstring import paths in common graders
- Add require_string_response decorator for format graders
- Fix mode parameter naming (grader_mode -> mode)
- Add threshold param to RelevanceGrader
- Fix think pattern regex in ngram_repetition_penalty
- Remove require_string_response decorator from base_grader.py
- Rely on Python type annotations instead of runtime type checks
- Follow project guidelines: trust internal code, validate only at system boundaries
```python
    score=score,
    reason=reason,
)
return GraderError(
```
It's supposed to return the same data type, with the error message carried in the return object.
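A rough sketch of that convention; the 'error' field name matches what the updated tests check per the commit above, while the `_grade` helper is hypothetical:

```python
def evaluate(self, response: str):
    try:
        score, reason = self._grade(response)  # hypothetical grading step
    except Exception as e:
        # Same return pathway as the success case; the message rides
        # in the 'error' field of the returned object.
        return GraderError(error=str(e))
    return GraderScore(score=score, reason=reason)
```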
ployts left a comment
LGTM
OpenJudge Version
0.2.0
Description
Checklist
Please check the following items before the code is ready to be reviewed.
- Run the `pre-commit run --all-files` command