Refactor EhrSqlAnnotator for improved error handling and query management #7
Merged
chakravarthik27 merged 2 commits on Apr 20, 2026
…mprove query handling
iulianigas approved these changes on Apr 15, 2026
This pull request updates the EHR SQL annotation and scenario components to improve SQL evaluation consistency, clarify instructions for SQL generation, and update data sources. The most significant changes ensure that SQL queries using current time are handled deterministically, update dataset and database URLs, and refine the logic for processing input data.
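The deterministic handling of current time described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper names `pin_current_time` and `run_query` are hypothetical, and the use of a word-boundary regex and an SQLite connection are assumptions. Only the fixed timestamp and the `List[Tuple[Any, ...]]` result type come from the description.

```python
import re
import sqlite3
from typing import Any, List, Tuple

# Fixed timestamp named in the PR description.
FIXED_NOW = "'2105-12-31 23:59:00'"

# Assumption: match the bare identifier case-insensitively; the real
# annotator's matching rules are not shown in this PR text.
_CURRENT_TIME = re.compile(r"\bcurrent_time\b", flags=re.IGNORECASE)


def pin_current_time(sql: str) -> str:
    """Replace every current_time token with the fixed timestamp literal."""
    return _CURRENT_TIME.sub(FIXED_NOW, sql)


def run_query(conn: sqlite3.Connection, sql: str) -> List[Tuple[Any, ...]]:
    """Execute the pinned query and return rows as tuples (hypothetical helper)."""
    return conn.execute(pin_current_time(sql)).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_query(conn, "SELECT datetime(current_time, '-1 day')"))
```

Because the substitution is purely textual, it would also rewrite `current_time` inside a quoted string literal; a parser-based rewrite would avoid that at the cost of more code.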
SQL Evaluation Improvements:
Updated ehr_sql_annotator.py to replace any usage of current_time in SQL queries with the fixed timestamp '2105-12-31 23:59:00', ensuring deterministic evaluation results. Also updated the expected result types from List[str] to List[Tuple[Any, ...]] for more accurate result handling.
Prompt and Adapter Specification Updates:
Updated the instructions in medhelm_run_specs.py to require the use of the fixed timestamp for current time, enforce stricter output formatting, and clarify that only a single SQL statement should be returned.
Dataset and Database Source Updates:
Updated the URLs in ehr_sql_scenario.py to point to the latest dataset, schema, and database sources, ensuring the scenario uses current and accessible resources.
Data Processing Logic:
Updated the logic in ehr_sql_scenario.py to skip entries where the query is "null", preventing invalid data from being included in the scenario.
Minor Metadata Correction:
Corrected the display name of the upstage/solar-pro-241126 model in the model_metadata.yaml file.
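The "null" query filter described under Data Processing Logic might look something like the sketch below. The function name `iter_valid_entries`, the `query` and `question` keys, and the list-of-dicts input shape are all assumptions made for illustration; the PR text only states that entries whose query is "null" are skipped.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_valid_entries(entries: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield only entries whose query is not the literal string "null".

    Hypothetical helper: the key name "query" is an assumption about the
    dataset layout, not something confirmed by the PR description.
    """
    for entry in entries:
        if entry.get("query") == "null":
            continue  # skip entries without a usable SQL query
        yield entry


if __name__ == "__main__":
    sample = [
        {"question": "How many patients were admitted?", "query": "SELECT 1"},
        {"question": "Unanswerable question", "query": "null"},
    ]
    print(list(iter_valid_entries(sample)))
```

Filtering while iterating, rather than after building the full instance list, keeps invalid rows from ever reaching the later processing steps.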