Description
I would like to suggest adding support (or at least an example integration) for Atropos.
Why this matters
Atropos is becoming a critical component for Reasoning-centric RL (CoT), particularly for:
- Verifiable Reasoning tasks (Math, Code, Logic)
- Deep Reasoning Expansion
- Scalable feedback loops for complex agentic workflows
Impact
While the current ROCK examples are great, having a similar integration for Atropos would:
- Broaden ROLL's usability for the "Reasoning Frontier" in LLM research.
- Enable decoupled scalability: Atropos' microservice architecture keeps high-concurrency rollouts independent of the main training loop, which should help keep GPU utilization high during complex reasoning tasks.
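To illustrate the kind of decoupling described above, here is a minimal sketch using only Python's standard `asyncio`. It does not call the Atropos or ROLL APIs: `rollout_worker`, `trainer`, and `run` are hypothetical names, the rewards are random placeholders, and the in-process queue only stands in for whatever transport a real integration would use.

```python
import asyncio
import random


async def rollout_worker(worker_id, queue, n_rollouts):
    """Hypothetical stand-in for an environment service that emits scored rollouts."""
    for i in range(n_rollouts):
        # Simulated generation latency; a real worker would call an inference server.
        await asyncio.sleep(random.uniform(0.001, 0.005))
        await queue.put({"worker": worker_id, "index": i, "reward": random.random()})


async def trainer(queue, batch_size, n_batches):
    """Hypothetical stand-in for the training loop: takes a batch as soon as it fills."""
    batches = []
    for _ in range(n_batches):
        batch = [await queue.get() for _ in range(batch_size)]
        batches.append(batch)  # a real trainer would run an optimizer step here
    return batches


async def run(n_workers=4, n_rollouts=8, batch_size=8):
    # The bounded queue is the only coupling between rollout workers and the trainer.
    queue = asyncio.Queue(maxsize=32)
    n_batches = n_workers * n_rollouts // batch_size
    workers = [
        asyncio.create_task(rollout_worker(w, queue, n_rollouts))
        for w in range(n_workers)
    ]
    batches = await trainer(queue, batch_size, n_batches)
    await asyncio.gather(*workers)
    return batches


if __name__ == "__main__":
    result = asyncio.run(run())
    print(len(result), "batches collected")
```

The point of the sketch is that workers never wait on a training step and the trainer never waits on one particular worker, only on the queue. That is the property I would expect an Atropos integration to provide across processes.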
I have already verified this integration locally with a 100-step "Golden Run" (expanding thinking tokens from 218 to 330) and have a PR ready for submission.