FoldAct: Stable Training for Long-Horizon RL with Context Folding

FoldAct is a framework for training long-horizon reinforcement learning agents with context folding. It addresses the non-stationary observation problem that arises when summary actions modify the agent's future observation space.

Installation

Setup

Create and activate a conda environment:

conda create -n verl-agent python=3.10
conda activate verl-agent

Install PyTorch (adjust CUDA version as needed):

pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124

Install flash-attention:

pip install flash-attn --no-build-isolation

Install the project and its dependencies in editable mode:

pip install -e .
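
After the steps above, a quick check can confirm that the key packages are visible in the active environment. The snippet below is a minimal sketch, not part of the repository: the helper name check_environment is hypothetical, and it assumes the import names torch and flash_attn for the packages installed above.

```python
import importlib.util

# Import names of the packages installed in the setup steps (assumed names).
REQUIRED = ("torch", "flash_attn")


def check_environment(packages=REQUIRED):
    """Return {package: bool} indicating whether each package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    status = check_environment()
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    # Only import torch if it was located, so the check never crashes.
    if status.get("torch"):
        import torch
        print("torch version:", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

If CUDA is reported as unavailable, the PyTorch wheel may not match the local CUDA version; adjusting the index URL in the install command is the usual first thing to try.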

Quick Start

Training with FoldAct

The main training script is train_grpo_asearcher_consistency.sh. Key configuration options:

# Enable FoldAct features
# +actor_rollout_ref.actor.use_separated_loss=true  # (Auto-enabled) Separated loss computation
+actor_rollout_ref.actor.use_full_context_supervision=true  # Consistency loss

# Loss weights
+actor_rollout_ref.actor.summary_loss_weight=1      # for summary loss
+actor_rollout_ref.actor.action_loss_weight=1       # for action loss
+actor_rollout_ref.actor.consistency_loss_weight=1  # for consistency loss

# Context window management
+use_summary=true                              # Enable compression (auto-enables separated loss)
+per_turn_dropout_prob=0.5                     # Selective segment training
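
To illustrate what the three loss weights control, here is a minimal sketch that assumes the separated losses are combined as a plain weighted sum. This README does not state the actual combination rule, so the function combine_losses and its form are assumptions for illustration only; the parameter names mirror the config keys above.

```python
def combine_losses(summary_loss, action_loss, consistency_loss,
                   summary_loss_weight=1.0,
                   action_loss_weight=1.0,
                   consistency_loss_weight=1.0):
    """Weighted sum of the three loss terms (assumed form, for illustration)."""
    return (summary_loss_weight * summary_loss
            + action_loss_weight * action_loss
            + consistency_loss_weight * consistency_loss)


# With all weights at 1 (the values shown above), the terms simply add up.
total = combine_losses(0.5, 0.25, 0.125)

# Halving the consistency weight reduces that term's contribution.
total_half = combine_losses(1.0, 2.0, 4.0, consistency_loss_weight=0.5)
```

Under this assumption, setting a weight to 0 would remove that term's contribution to the total entirely.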

Example Training Command

bash train_grpo_asearcher_consistency.sh

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
