Skip to content

Per-metric relativization for Summary with preference metrics#5218

Closed
ItsMrLin wants to merge 1 commit into
facebook:mainfrom
ItsMrLin:export-D99149923
Closed

Per-metric relativization for Summary with preference metrics#5218
ItsMrLin wants to merge 1 commit into
facebook:mainfrom
ItsMrLin:export-D99149923

Conversation

@ItsMrLin
Copy link
Copy Markdown
Contributor

@ItsMrLin ItsMrLin commented Jun 3, 2026

Summary:

Summary

Supersedes D97533888. Addresses drfreund's review comment about deduping with Data.relativize by placing the per-metric scoping in the data layer rather than adding analysis-specific logic.

When an experiment has both preference metrics (e.g., pairwise_pref_query) and standard tracking metrics, the Summary should relativize tracking metrics normally while skipping the preference metric (whose binary 0/1 labels have SQ mean near zero, causing "mean_control too small" crash).

Previously D99037272 applied a blanket guard that skipped ALL relativization when any objective was a preference metric. This diff replaces that with per-metric scoping: non-preference metrics are relativized and %-formatted, while preference metrics are excluded from relativization and their columns are dropped from the summary table (binary 0/1 labels are not informative in a tabular summary). Labeling-only trial rows (with no tracking metric data) are also dropped.

Changes:

  • Data.relativize() and relativize_dataframe() in ax/core/data.py: add metric_names parameter to scope which metrics get relativized. Unscoped metrics pass through with raw values. SEM zeroing for status quo rows is also scoped -- non-relativized metrics retain their original SEM.
  • Experiment.to_df() in ax/core/experiment.py: add metric_names_to_relativize parameter, threaded to Data.relativize(). Percentage formatting also scoped to only relativized metrics.
  • Summary.compute() in ax/analysis/summary.py: replace blanket not has_preference_objective guard with per-metric scoping. Builds a list of non-preference metric names, passes it as metric_names_to_relativize, then drops preference columns and labeling-only rows.

Differential Revision: D99149923

Summary:
## Summary

Supersedes D97533888. Addresses drfreund's review comment about deduping with `Data.relativize` by placing the per-metric scoping in the data layer rather than adding analysis-specific logic.

When an experiment has both preference metrics (e.g., `pairwise_pref_query`) and standard tracking metrics, the Summary should relativize tracking metrics normally while skipping the preference metric (whose binary 0/1 labels have SQ mean near zero, causing "mean_control too small" crash).

Previously D99037272 applied a blanket guard that skipped ALL relativization when any objective was a preference metric. This diff replaces that with per-metric scoping: non-preference metrics are relativized and %-formatted, while preference metrics are excluded from relativization and their columns are dropped from the summary table (binary 0/1 labels are not informative in a tabular summary). Labeling-only trial rows (with no tracking metric data) are also dropped.

Changes:
- `Data.relativize()` and `relativize_dataframe()` in `ax/core/data.py`: add `metric_names` parameter to scope which metrics get relativized. Unscoped metrics pass through with raw values. SEM zeroing for status quo rows is also scoped -- non-relativized metrics retain their original SEM.
- `Experiment.to_df()` in `ax/core/experiment.py`: add `metric_names_to_relativize` parameter, threaded to `Data.relativize()`. Percentage formatting also scoped to only relativized metrics.
- `Summary.compute()` in `ax/analysis/summary.py`: replace blanket `not has_preference_objective` guard with per-metric scoping. Builds a list of non-preference metric names, passes it as `metric_names_to_relativize`, then drops preference columns and labeling-only rows.

Differential Revision: D99149923
@meta-cla meta-cla Bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Jun 3, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented Jun 3, 2026

@ItsMrLin has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99149923.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.59155% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 96.50%. Comparing base (d0ae700) to head (05bb9a8).

Files with missing lines Patch % Lines
ax/core/data.py 92.30% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5218   +/-   ##
=======================================
  Coverage   96.50%   96.50%           
=======================================
  Files         617      617           
  Lines       69776    69833   +57     
=======================================
+ Hits        67339    67395   +56     
- Misses       2437     2438    +1     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@meta-codesync
Copy link
Copy Markdown

meta-codesync Bot commented Jun 4, 2026

This pull request has been merged in c40dfae.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed Do not delete this pull request or issue due to inactivity. fb-exported Merged meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants