Improve Text2SQL Metrics: Refactoring, New Execution Metric, and Bug Fixes #1841
Signed-off-by: Oktie Hassanzadeh <hassanzadeh@us.ibm.com>
* Add Multi Turn Metrics Support
* Add multi-turn metrics and templates support
* Refactor MultiTurnMetric into GroupMetric for improved grouping and item identification
* Remove duplicate import of dict_get and update line number in secrets baseline
* Implement sequential success accuracy metric and refactor group reduction logic
* Format
* Some fixes
* Rename

Signed-off-by: elronbandel <elronbandel@gmail.com>
Signed-off-by: Yoav Katz <katz@il.ibm.com>
* Increase token limit so the judge can get to the actual answer.
* now with the json file

Signed-off-by: Jonathan Bnayahu <bnayahu@il.ibm.com>
…enabled. (#1834)
* Added example of running inference with log probability
* Initial changes to support generated_text in meta data
* Added missing generated_text in llava models
* Fixed WMLInferenceEngineChat and improved tests
* Added print header to example
* Added "text" and "logprob" to OpenAiInferenceEngine
* Reverted test question change
* Updated tests

Signed-off-by: Yoav Katz <katz@il.ibm.com>
Co-authored-by: Elron Bandel <elronbandel@gmail.com>
Signed-off-by: Yoav Katz <katz@il.ibm.com> Co-authored-by: Elron Bandel <elronbandel@gmail.com>
Signed-off-by: Martín Santillán Cooper <msantillancooper@ibm.com>
The patches in the tests still point to the old files. For example, at line 108 in the tests:
    @patch(
        "unitxt.sql_utils.LocalSQLiteConnector.get_db_file_path",
Other than that it looks great and better organized.
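To illustrate the point about stale patch targets: `unittest.mock.patch` resolves its string target by importing the dotted path, so the string has to name the module where the class currently lives. The sketch below is self-contained and does not use the real unitxt modules. The module names `hypothetical_new_utils` and `hypothetical_old_utils` and the stand-in `LocalSQLiteConnector` class are assumptions made up for this example, not the actual post-refactor layout.

```python
import sys
import types
from unittest.mock import patch


class LocalSQLiteConnector:
    """Stand-in class for illustration only, not the unitxt implementation."""

    def get_db_file_path(self, db_id):
        raise RuntimeError("a real connector would touch the filesystem here")


# Register a hypothetical module that plays the role of the connector's new home.
new_module = types.ModuleType("hypothetical_new_utils")
new_module.LocalSQLiteConnector = LocalSQLiteConnector
sys.modules["hypothetical_new_utils"] = new_module


def patched_path():
    """Patch the method through the module path where the class now lives."""
    with patch(
        "hypothetical_new_utils.LocalSQLiteConnector.get_db_file_path",
        return_value=":memory:",
    ):
        return LocalSQLiteConnector().get_db_file_path("any_db")


def stale_target_error():
    """Return the name of the error raised when the target path is stale."""
    try:
        # "hypothetical_old_utils" does not exist, so entering the patch fails.
        with patch("hypothetical_old_utils.LocalSQLiteConnector.get_db_file_path"):
            pass
    except (ImportError, AttributeError) as exc:
        return type(exc).__name__
    return None
```

If the old module still exists but no longer defines the class, the same stale target would typically fail with an `AttributeError` instead, which is why both exception types are caught here.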
Thanks @elronbandel - I believe I've fixed the tests. I also added some more docs to the text2sql util functions.
Thanks. LGTM.
This PR introduces improvements to the Text2SQL evaluation framework, focusing on flexibility, accuracy, and robustness of execution-based metrics.
Changes: