Conversation

@szrlee
Contributor

@szrlee szrlee commented Nov 12, 2025

  • Add citation to "When Speed Kills Stability: Demystifying RL Collapse from the Training-Inference Mismatch" paper by Liu, Li et al., which first demonstrated that training-inference mismatch can lead to RL collapse in both reasoning and agentic RL setups.

Signed-off-by: szrlee <szrlee@gmail.com>
@heheda12345 heheda12345 merged commit 8b24f2c into vllm-project:main Nov 12, 2025
4 checks passed