insensible104/agentic-rl-lab

Agentic RL Lab

An extensible agentic RL framework for training multi-turn agents across tool use, memory, and expert routing.

This repository is organized as a reusable training stack for:

  • multi-turn rollouts
  • task-level rewards
  • trajectory conversion
  • benchmark inspection
  • HPC-scale training workflows

The project is meant to be a clean base for building new agentic RL methods, not a collection of one-off training scripts.

What It Supports

  • loop_agent: planner / executor / verifier style tool-use RL
  • memory_agent: chunk-wise memory compression for long-context QA
  • expert_router: routing across retrieval and external expert models under cost-aware preferences
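As a rough illustration of the chunk-wise idea behind memory_agent, the sketch below reads a long text in fixed-size chunks and folds each chunk into a bounded memory. It is not taken from this repository: `split_into_chunks`, `fold_memory`, and `keep_tail` are hypothetical names, and the trivial `keep_tail` update stands in for whatever learned compression step the real method uses.

```python
from typing import Callable, Iterable, List


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def fold_memory(
    chunks: Iterable[str],
    update: Callable[[str, str], str],
    memory: str = "",
) -> str:
    """Process chunks in order, replacing the memory after each one."""
    for chunk in chunks:
        memory = update(memory, chunk)
    return memory


def keep_tail(limit: int) -> Callable[[str, str], str]:
    """Placeholder update (assumption): keep only the last `limit` characters.

    A real agent would call a model here to decide what to retain.
    """
    def update(memory: str, chunk: str) -> str:
        return (memory + chunk)[-limit:]
    return update
```

The memory never grows past the limit enforced by `update`, so the final answer can be produced from a short context regardless of the input length.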

Core Framework Pieces

  • agentic_rl.multi_turn: shared trajectory expansion and GRPO reward normalization
  • agentic_rl.core: shared runtime utilities such as LLM engine adapters
  • agentic_rl.methods.registry: typed method registry and method metadata
  • agentic_rl.cli: unified inspection entrypoint for methods and benchmarks
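For readers unfamiliar with GRPO reward normalization, here is a minimal sketch of the general technique: each rollout's reward is standardized against the other rollouts sampled for the same prompt, and the resulting advantage is shared by every turn of that rollout. This is not the code in agentic_rl.multi_turn; the names `group_normalize` and `expand_to_turns` and the `eps` parameter are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def group_normalize(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standardize rewards within one group of rollouts for the same prompt."""
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps keeps the division finite when all rewards in the group are equal
    return [(r - mu) / (sigma + eps) for r in rewards]


def expand_to_turns(turns: List[str], advantage: float) -> List[Dict]:
    """Attach the same trajectory-level advantage to every turn."""
    return [
        {"turn": i, "text": text, "advantage": advantage}
        for i, text in enumerate(turns)
    ]
```

With a task-level reward there is one scalar per trajectory, so broadcasting it to every turn is the simplest way to produce per-sample training signals from a multi-turn rollout.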

Example CLI commands:

agentic-rl list-methods
agentic-rl show-method loop_agent
agentic-rl benchmarks

HPC Training

The repository includes a minimal HPC training layer:

  • requirements-train.txt for environment setup
  • configs/hpc.env.example for cluster paths and runtime variables
  • configs/models/*.sh and configs/methods/*.sh for model/method launch configs
  • scripts/launch_train.sh as the shared Ray + training entrypoint
  • scripts/*.sbatch templates for debug and formal jobs
  • scripts/preflight_check.py for dataset/checkpoint/path validation

See docs/hpc_training.md before submitting jobs.

Notes

  • slime is a required dependency; install it separately or as part of your preferred environment setup.
  • expert_router expects external services for retrieval and expert models.
  • For func_call mode, set AGENTIC_RL_TAU2_ROOT to an external TAU2 checkout or asset directory.
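Because AGENTIC_RL_TAU2_ROOT must point at an external directory, it can help to fail early when the variable is missing or wrong. The check below is a minimal sketch; the helper name `resolve_tau2_root` is hypothetical, and the repository may validate this variable differently or in another place.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_tau2_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the TAU2 root directory, or raise if it is unset or missing."""
    env = os.environ if env is None else env
    raw = env.get("AGENTIC_RL_TAU2_ROOT")
    if not raw:
        raise RuntimeError("AGENTIC_RL_TAU2_ROOT is not set")
    root = Path(raw).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"AGENTIC_RL_TAU2_ROOT is not a directory: {root}")
    return root
```

Calling `resolve_tau2_root()` with no arguments reads the current process environment; passing a mapping makes the check easy to test.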

Attribution

This repository includes work developed with reference to upstream open-source projects. See THIRD_PARTY_NOTICES.md for redistribution and attribution details.

About

Framework-first agentic RL built on top of slime, with unified rollout, reward, trajectory conversion, and HPC training workflows.
