fix: configurable max_grad_norm, lower default lr, remove premature deprecation by abrichr · Pull Request #255 · OpenAdaptAI/openadapt-evals

abrichr · 2026-03-30T16:55:15Z

Summary

Based on client training results: grad_norm=101 with lr=5e-6, 0.00 eval delta after 20 steps.

- max_grad_norm in TrainingConfig: was hardcoded to 1.0, now configurable. Warns when grad_norm > 10x the clip threshold (gradients clipped to a near-random direction).
- Default learning_rate 5e-6 → 1e-6: with raw grad_norm=101, the effective step at lr=5e-6 overshoots. lr=1e-6 gives stable updates after clipping.
- Remove deprecation warning: the standalone trainer is the production path for VLM GRPO. TRL's rollout_func doesn't support multimodal (issue #5120). Replaced with an info log noting that TRL migration is pending PR #5323.
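The first two changes can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the real TrainingConfig has other fields, and the helper name `grad_norm_exceeds_warn_threshold`, the logger name, the `WARN_RATIO` constant, and the warning text are all hypothetical. Only the field names `max_grad_norm` and `learning_rate`, the defaults 1.0 and 1e-6, and the 10x warning ratio come from the description above.

```python
import logging
from dataclasses import dataclass

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("grpo_trainer_sketch")

# Ratio taken from the PR description: warn when grad_norm > 10x the clip threshold.
WARN_RATIO = 10.0


@dataclass
class TrainingConfig:
    """Reduced stand-in for the real config; only the two fields discussed here."""

    learning_rate: float = 1e-6  # new default (was 5e-6)
    max_grad_norm: float = 1.0   # previously hardcoded, now configurable


def grad_norm_exceeds_warn_threshold(grad_norm: float, config: TrainingConfig) -> bool:
    """Log a warning and return True when the raw gradient norm is far above the clip threshold."""
    threshold = WARN_RATIO * config.max_grad_norm
    if grad_norm > threshold:
        logger.warning(
            "grad_norm=%.1f is more than %.0fx max_grad_norm=%.2f; "
            "consider raising max_grad_norm or lowering learning_rate",
            grad_norm,
            WARN_RATIO,
            config.max_grad_norm,
        )
        return True
    return False
```

With the defaults, the client's reported grad_norm=101 would trigger the warning, while the same norm with a hypothetical `TrainingConfig(max_grad_norm=20.0)` would not, since 101 is below 10 x 20.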

🤖 Generated with Claude Code

…eprecation

Three changes based on client training results (grad_norm=101, 0.00 eval delta):

1. Add max_grad_norm to TrainingConfig (was hardcoded to 1.0). When grad_norm >> max_grad_norm, gradients are clipped to a near-random direction, so training makes no progress despite non-zero loss. Now warns when grad_norm > 10x the clip threshold.
2. Lower default learning_rate from 5e-6 to 1e-6. With grad_norm=101 and lr=5e-6, effective step size overshoots. lr=1e-6 with max_grad_norm=1.0 gives stable updates.
3. Remove "standalone trainer is deprecated" warning. It was premature: TRL's rollout_func doesn't support multimodal VLMs (issue #5120). The standalone trainer is the production training path until TRL PR #5323 merges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
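To make the numbers in the commit message concrete, here is a small pure-Python sketch of global-norm clipping. It assumes the common formulation in which every gradient is multiplied by min(1, max_norm / total_norm), and it measures the step as it would be under a plain SGD-style update; the trainer's actual clipping call and optimizer are not shown in this PR, so both are assumptions. The function name and the example gradient values are hypothetical, chosen so the raw norm is about 101.

```python
import math
from typing import List, Tuple


def clip_by_global_norm(grads: List[float], max_norm: float) -> Tuple[List[float], float]:
    """Rescale grads so their L2 norm is at most max_norm; return (clipped grads, raw norm)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Small epsilon avoids division by zero when all gradients are zero.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads], total_norm


# Hypothetical gradient with raw L2 norm of about 101 (0.6 * 101 and 0.8 * 101).
grads = [60.6, 80.8]
clipped, raw_norm = clip_by_global_norm(grads, max_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for g in clipped))

# Step magnitude under a plain SGD-style update: lr * ||clipped gradient||.
step_old_lr = 5e-6 * clipped_norm
step_new_lr = 1e-6 * clipped_norm
```

Under these assumptions the scale factor is roughly 1/101, the clipped norm is close to max_norm, and moving from lr=5e-6 to lr=1e-6 shrinks the step magnitude by a factor of five.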

abrichr merged commit 321dcea into main Mar 30, 2026
1 check passed

abrichr merged 1 commit into main from fix/gradient-clipping-config-and-undeprecate
