feat(scorer/llmrater): add fallback to SQL logic comparison for empty results by wangauone · Pull Request #326 · GoogleCloudPlatform/evalbench

wangauone · 2026-04-14T22:55:53Z

This change implements a fallback mechanism in LLMRater to handle cases where both the golden and generated queries return empty datasets. Instead of defaulting to a 100% score (false positive), it now invokes the LLM to compare the SQL logic of the queries. It also refactors the hardcoded prompts into file-level constants for better maintainability and adds unit tests to verify the new behavior.
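The fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the function name `rate_pair`, the constant `SQL_LOGIC_COMPARISON_PROMPT`, the `EQUIVALENT`/`DIFFERENT` verdict format, and the two injected callables are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical file-level prompt constant, standing in for the
# refactored prompts mentioned in the description.
SQL_LOGIC_COMPARISON_PROMPT = (
    "Both queries returned no rows, so their results cannot be compared.\n"
    "Compare the SQL logic instead.\n\n"
    "Golden SQL:\n{golden_sql}\n\n"
    "Generated SQL:\n{generated_sql}\n\n"
    "Answer with a single word: EQUIVALENT or DIFFERENT."
)


def rate_pair(
    golden_rows: Sequence,
    generated_rows: Sequence,
    golden_sql: str,
    generated_sql: str,
    ask_llm: Callable[[str], str],
    compare_results: Callable[[Sequence, Sequence], float],
) -> float:
    """Return a score in [0, 100] for a golden/generated query pair.

    ask_llm and compare_results are assumed callables: the first sends a
    prompt to a model and returns its text reply, the second is whatever
    result-based comparison the rater normally performs.
    """
    if not golden_rows and not generated_rows:
        # Two empty result sets match trivially, so a result-based score
        # would be a false positive. Ask the model about the SQL instead.
        prompt = SQL_LOGIC_COMPARISON_PROMPT.format(
            golden_sql=golden_sql, generated_sql=generated_sql
        )
        verdict = ask_llm(prompt).strip().upper()
        return 100.0 if verdict.startswith("EQUIVALENT") else 0.0

    # At least one side returned rows: use the normal comparison path.
    return compare_results(golden_rows, generated_rows)
```

Passing the model call in as a callable keeps the sketch testable with a stub, which is the same idea as the unit tests the description mentions.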


IsmailMehdi · 2026-04-14T23:24:34Z

/gcbrun

feat(scorer/llmrater): add fallback to SQL logic comparison for empty results

6ae05ed

wangauone requested a review from IsmailMehdi as a code owner April 14, 2026 22:55

style(test): fix whitespace issues in llmrater_test.py

6229c6f

IsmailMehdi approved these changes Apr 14, 2026


Merge branch 'main' into feat/llmrater-empty-results-fallback

648338c

IsmailMehdi merged commit d168ac0 into main Apr 14, 2026
4 of 5 checks passed

release-please bot mentioned this pull request Apr 14, 2026

chore(main): release 1.4.0 #327

Merged

wangauone deleted the feat/llmrater-empty-results-fallback branch April 15, 2026 01:15

IsmailMehdi merged 3 commits into main from feat/llmrater-empty-results-fallback
