Fix SFT evaluation metric logging by adding avg_perplexity #3691

Merged
copybara-service[bot] merged 1 commit into main from fix-sft-eval-metrics
Apr 17, 2026

@igorts-git
Collaborator

Description

Fix SFT (and DPO) eval, which started failing recently due to changes in PR #3664.
Eval must now report eval/avg_perplexity to the metric_logger; otherwise it fails with an Exception.
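
For readers unfamiliar with the failure mode, here is a minimal sketch of the idea, not the actual change in this PR. It assumes the logger rejects an eval metrics dict that lacks `eval/avg_perplexity`, and that average perplexity is taken as `exp` of the average loss. The helper names (`summarize_eval`, `check_required`), the `eval/avg_loss` key, and the `REQUIRED_EVAL_KEYS` tuple are hypothetical and only illustrate the shape of the fix.

```python
import math

# Hypothetical list of keys a strict metric logger might insist on.
REQUIRED_EVAL_KEYS = ("eval/avg_loss", "eval/avg_perplexity")


def summarize_eval(batch_losses):
    """Aggregate per-batch eval losses into the metrics dict to be logged.

    Assumption: perplexity is exp(average loss); the real code may
    aggregate differently (e.g. weighting by token count).
    """
    if not batch_losses:
        raise ValueError("no eval batches were processed")
    avg_loss = sum(batch_losses) / len(batch_losses)
    return {
        "eval/avg_loss": avg_loss,
        "eval/avg_perplexity": math.exp(avg_loss),
    }


def check_required(metrics):
    """Mimic a logger that raises when an expected eval metric is missing."""
    missing = [key for key in REQUIRED_EVAL_KEYS if key not in metrics]
    if missing:
        raise KeyError(f"missing eval metrics: {missing}")


metrics = summarize_eval([2.0, 2.5, 3.0])
check_required(metrics)  # passes because eval/avg_perplexity is present
```

Under these assumptions, a dict that carries only the loss would make `check_required` raise, which mirrors the Exception described above.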

Tests

Manual execution of a local SFT run with eval enabled for a few steps with and without the fix.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@codecov

codecov Bot commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

igorts-git force-pushed the fix-sft-eval-metrics branch from 55e5ca1 to c32c023 on April 17, 2026 20:25
igorts-git force-pushed the fix-sft-eval-metrics branch from c32c023 to 3797c23 on April 17, 2026 20:32
copybara-service[bot] merged commit 1907615 into main on Apr 17, 2026
37 checks passed
copybara-service[bot] deleted the fix-sft-eval-metrics branch on April 17, 2026 23:19
3 participants