
[codex] Restore PR 607 KL loss removal #639

Merged
FurtherAI merged 1 commit into main from austin/fix_pr_607_regression_from_pr_619
Apr 2, 2026

Conversation

@FurtherAI
Collaborator

Summary

Removes the Schulman KL term that PR 619 accidentally reintroduced into art.loss.loss_fn, restoring the behavior from PR 607.

Root Cause

The LoRA correctness work in PR 619 refactored Loss and brought back the direct KL-divergence term and Loss.kl field that PR 607 had intentionally removed.
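For context, the term in question is the Schulman-style KL approximator commonly used in RLHF-style losses (the "k3" estimator from Schulman's "Approximating KL Divergence" note). The sketch below is illustrative only — the names `schulman_kl`, `logprob`, and `ref_logprob` are hypothetical and not the actual `art.loss` code — but it shows the shape of the term that PR 607 removed and PR 619 accidentally accumulated back into the loss:

```python
import math

def schulman_kl(logprob: float, ref_logprob: float) -> float:
    # k3 estimator: exp(log r) - log r - 1, where log r = ref_logprob - logprob.
    # It is always non-negative and is exactly zero when the policy and the
    # reference assign the same log-probability to the token.
    log_ratio = ref_logprob - logprob
    return math.exp(log_ratio) - log_ratio - 1.0
```

Because this term is strictly non-negative, silently re-adding it to the loss biases training toward the reference policy, which is why restoring PR 607's removal matters.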

Changes

  • remove the reintroduced KL-divergence accumulation from src/art/loss.py
  • remove the now-invalid loss.kl scaling path from the Megatron oracle worker helper
  • add a focused regression assertion so Loss does not grow the kl field back silently
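The regression assertion in the last bullet can be sketched as follows. This is a minimal illustration, not the actual test in `src/art/test/test_kl_advantage.py`; the `Loss` dataclass and its fields here are hypothetical stand-ins:

```python
from dataclasses import dataclass, fields

# Hypothetical stand-in for art.loss.Loss after the KL removal.
@dataclass
class Loss:
    policy: float
    entropy: float

def test_loss_has_no_kl_field() -> None:
    # Fails loudly if a future refactor silently re-adds a `kl` field,
    # which is exactly the regression PR 619 introduced.
    assert "kl" not in {f.name for f in fields(Loss)}

test_loss_has_no_kl_field()
```

Asserting on the dataclass's field set (rather than on a loss value) catches the structural regression directly, independent of any particular training batch.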

Validation

  • uv sync --all-extras
  • uv run pytest src/art/test/test_kl_advantage.py -q
  • uv run python -m py_compile src/art/loss.py tests/integration/megatron_oracle_worker.py src/art/test/test_kl_advantage.py

@FurtherAI force-pushed the austin/fix_pr_607_regression_from_pr_619 branch from f9d79b2 to 348410b on April 2, 2026 at 08:40
@FurtherAI marked this pull request as ready for review on April 2, 2026 at 17:22
@FurtherAI merged commit 75a81e9 into main on Apr 2, 2026
5 checks passed
