
Fix inverse metric adjustment to skip string labels from code-based evaluators#46663

Merged
imatiach-msft merged 1 commit into Azure:main from imatiach-msft:fix/inverse-metric-string-labels
May 1, 2026

Conversation


@imatiach-msft imatiach-msft commented May 1, 2026

Fix inverse metric adjustment to skip string labels from code-based evaluators

Bug

Bug #5240742 - deflection_rate evaluator shows incorrect pass/fail labels in AppInsights (score=1.0 labeled "pass" instead of "fail").

Root Cause

_create_result_object() calls _adjust_for_inverse_metric(label) for all inverse (decrease+boolean) metrics. Code-based evaluators like deflection_rate return string labels ("pass"/"fail") that already reflect direction-aware semantics. But _adjust_for_inverse_metric only handles bool - it treats any non-bool (including strings) as False, mapping everything to "pass".

Fix

Skip _adjust_for_inverse_metric entirely when the label is already a string, since string labels mean the evaluator already computed the correct direction-aware pass/fail.

Before (buggy):

if is_inverse:
    score, label, passed = _adjust_for_inverse_metric(label)

After (fix):

if is_inverse and not isinstance(label, str):
    score, label, passed = _adjust_for_inverse_metric(label)

Boolean labels (from safety evaluators like indirect_attack, code_vulnerability) continue to be inverted as before.

Tests

  • TestAdjustForInverseMetric (3 tests): boolean True/False and None handling
  • TestIsInverseMetric (4 tests): hardcoded, configured, non-inverse, deflection_rate
  • TestCreateResultObjectInverseMetric (4 tests): integration tests verifying string labels preserved, boolean labels adjusted, non-inverse unmodified

Affected Evaluators

Only deflection_rate is affected - it is the only evaluator that is both code-based (string labels) AND decrease+boolean (triggers inverse path). All other inverse metrics return boolean labels and are unaffected.
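Why only one evaluator is affected can be made concrete with an illustrative audit (the metadata table below is hypothetical, not the SDK's schema; evaluator names are from this PR):

```python
# Among the inverse (decrease+boolean) metrics, only deflection_rate's
# evaluator emits string labels; the safety evaluators emit booleans.
INVERSE_EVALUATORS = {
    "deflection_rate":    str,   # code-based: emits "pass"/"fail" strings
    "indirect_attack":    bool,  # safety: emits boolean labels
    "code_vulnerability": bool,  # safety: emits boolean labels
}

affected = [name for name, label_type in INVERSE_EVALUATORS.items()
            if label_type is str]
print(affected)  # ['deflection_rate']
```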

Copilot AI review requested due to automatic review settings May 1, 2026 04:56
@imatiach-msft imatiach-msft requested a review from a team as a code owner May 1, 2026 04:56
@github-actions github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) May 1, 2026

Copilot AI left a comment


Pull request overview

Fixes inverse-metric adjustment logic so evaluators that emit string pass/fail labels (not booleans) don’t get their label incorrectly overwritten when is_inverse=True, addressing incorrect label="pass" reporting for deflection-rate-like metrics.

Changes:

  • Extend _adjust_for_inverse_metric to recognize string "pass" / "fail" labels (case/whitespace-insensitive).
  • Update _adjust_for_inverse_metric docstring to document string-label behavior and examples.

Comment thread sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py Outdated
@imatiach-msft imatiach-msft force-pushed the fix/inverse-metric-string-labels branch 4 times, most recently from 5a574e6 to 740e8c2 Compare May 1, 2026 16:14
…valuators

Code-based evaluators like deflection_rate return string pass/fail labels
that already reflect direction-aware semantics. The inverse metric
adjustment was incorrectly treating these strings as boolean False
(since isinstance('fail', bool) is False), flipping 'fail' to 'pass'.

Fix: skip _adjust_for_inverse_metric entirely when the label is a string,
since string labels mean the evaluator already computed the correct
direction-aware pass/fail. Boolean labels (from safety evaluators) still
get inverted as before.

Fixes Bug #5240742

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@imatiach-msft imatiach-msft force-pushed the fix/inverse-metric-string-labels branch from 740e8c2 to fe204e7 Compare May 1, 2026 16:15
@imatiach-msft imatiach-msft changed the title from "Fix _adjust_for_inverse_metric to handle string pass/fail labels" to "Fix inverse metric adjustment to skip string labels from code-based evaluators" May 1, 2026

@YoYoJa YoYoJa left a comment


It might be better to have a consistent value type across evaluators in the future?

@imatiach-msft imatiach-msft merged commit a4541c2 into Azure:main May 1, 2026
19 checks passed
imatiach-msft added a commit that referenced this pull request May 6, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
imatiach-msft added a commit that referenced this pull request May 6, 2026
…46763)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
