SEAL: Synergistic Co-Evolution of Agents and Learning Environments

Tool-Use Agents · Self-Evolution · Reinforcement Learning

Yihao Hu¹,², Zhihao Wen*,¹, Xiujin Liu³, Pan Wang¹,⁴, Xin Zhang¹, Wei Wu¹

*Corresponding Author · ¹Ant Group · ²Westlake University · ³University of Michigan-Ann Arbor · ⁴University of Science and Technology of China

Overview

SEAL is a closed-loop co-evolution framework for interactive tool-use agents. It collects on-policy trajectories under executable verification, diagnoses failed rollouts into turn-level failure labels, and uses these diagnoses as a shared signal for both training-time interface evolution and model-side policy optimization.

In SEAL, the agent reveals its capability gaps, the learning interface adapts around these failures, and the policy internalizes the resulting feedback through Group Relative Policy Optimization (GRPO). Evaluation remains strict: tool semantics, task labels, and the verifier are unchanged.

The released SEAL-7B model is available on Hugging Face: mis2/SEAL-7B.

Highlights

Verifier-grounded diagnosis: executable traces are mapped to failure types such as invalid tool calls, argument mismatches, missing tool calls, recovery failures, and response mismatches.
Training-time interface evolution: BFCL (Berkeley Function-Calling Leaderboard) observations expose schema affordances and recovery-oriented feedback without changing the test-time environment.
Diagnosis-guided optimization: diagnostic profiles reweight GRPO advantages while preserving the original verifier reward.
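
To make the last highlight concrete, here is a minimal sketch of how a diagnosis label could rescale a group-relative advantage while the verifier reward itself stays untouched. The failure-type names follow the list above; the FAILURE_WEIGHTS values, the function name, and the simple multiplicative form are illustrative assumptions, not the repository's A-Patch implementation.

```python
from statistics import mean, pstdev

# Hypothetical per-label weights; the values are illustrative only.
FAILURE_WEIGHTS = {
    "invalid_tool_call": 1.5,
    "argument_mismatch": 1.3,
    "missing_tool_call": 1.3,
    "recovery_failure": 1.2,
    "response_mismatch": 1.1,
}


def reweighted_advantages(rewards, diagnoses, eps=1e-6):
    """Group-relative advantages scaled by a per-rollout diagnosis weight.

    rewards:   verifier rewards for one group of rollouts on the same task
    diagnoses: failure label per rollout, or None for a passing rollout
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    advantages = []
    for reward, label in zip(rewards, diagnoses):
        # GRPO-style normalization within the group; the reward is not modified.
        base = (reward - mu) / (sigma + eps)
        # Passing rollouts and unknown labels keep a neutral weight of 1.0.
        weight = FAILURE_WEIGHTS.get(label, 1.0)
        advantages.append(weight * base)
    return advantages


rewards = [1.0, 0.0, 0.0, 1.0]
diagnoses = [None, "invalid_tool_call", "response_mismatch", None]
print(reweighted_advantages(rewards, diagnoses))
```

In this toy group the two failed rollouts receive the same reward, but the one labeled invalid_tool_call ends up with a larger negative advantage than the one labeled response_mismatch, which is the kind of differentiation a diagnostic profile is meant to provide.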

Reproducible Server Setup

The steps below mirror the AgentEvolver environment build: a Python 3.11 training environment with CUDA/flash-attn, plus a separate Python 3.11.13 BFCL environment service.

0. Prerequisites

Linux server with Conda, Git, CUDA-capable GPUs, and network access.
Activate your user Miniconda first, for example: source ~/miniconda3/bin/activate.
Defaults: trainer.n_gpus_per_node=4, BFCL service at http://127.0.0.1:8082.

1. Clone the repository

git clone https://github.com/yihaohu0118/SEAL.git
cd SEAL

2. Build the SEAL training environment

Recommended:

source ~/miniconda3/bin/activate
bash install.sh
conda activate seal

Equivalent manual commands:

conda create -n seal python=3.11 -y
conda activate seal
conda install -y -c nvidia cuda-toolkit
python -m pip install --upgrade pip wheel "setuptools==68.2.2"
pip install -r requirements.txt
pip install --verbose flash-attn==2.7.4.post1 ring-flash-attn --no-build-isolation

If the server has slow Hugging Face access, set a mirror before installing or before the first model download:

export HF_ENDPOINT=https://hf-mirror.com

3. Build the BFCL environment service

This creates the bfcl Conda environment, clones the official Gorilla/BFCL repository, installs bfcl_eval, and generates the processed multi-turn BFCL data used by SEAL.

source ~/miniconda3/bin/activate
bash env_service/environments/bfcl/setup.sh

Verify the BFCL setup:

conda run -n bfcl python -c "import bfcl_eval; print('bfcl_eval ok')"
test -f env_service/environments/bfcl/bfcl_data/multi_turn_processed.jsonl
test -f data/bfcl_400_split.json
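
The two test -f commands exit silently, so a missing file is easy to overlook. As an optional alternative, the following sketch reports which artifacts are absent. The paths are the ones checked above; the function name and the extra empty-file check are assumptions added for illustration.

```python
from pathlib import Path

# Paths taken from the verification commands above, relative to the repo root.
REQUIRED_ARTIFACTS = (
    "env_service/environments/bfcl/bfcl_data/multi_turn_processed.jsonl",
    "data/bfcl_400_split.json",
)


def missing_artifacts(repo_root="."):
    """Return the required files that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel in REQUIRED_ARTIFACTS:
        path = root / rel
        # Unlike `test -f`, an empty file is also treated as missing here.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


# Run from the SEAL repository root: prints the missing paths, or a confirmation.
print(missing_artifacts() or "BFCL artifacts present")
```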

If bfcl_eval imports but multi_turn_processed.jsonl is missing, the data preprocessing step failed. Install the missing dependencies into the bfcl environment (not base), then rerun setup:

conda run -n bfcl pip install -r env_service/environments/bfcl/requirements.txt
bash env_service/environments/bfcl/setup.sh

4. Start the BFCL service

In terminal 1:

source ~/miniconda3/bin/activate
conda activate bfcl
export BFCL_HOST=127.0.0.1
export BFCL_PORT=8082
bash env_service/launch_script/bfcl.sh

Keep this process running. In another terminal, check that the service is up:

curl -fsS http://127.0.0.1:8082/healthz
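
The service may take a moment to come up, so a single curl can fail even when nothing is wrong. The sketch below polls the same /healthz endpoint until it answers or a deadline passes. It uses only the Python standard library; the function name, timeout, and polling interval are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url, timeout_s=60.0, interval_s=2.0):
    """Poll url until it answers with HTTP 2xx or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or non-2xx status: retry below.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call, using the default BFCL address from the prerequisites:
# ready = wait_for_service("http://127.0.0.1:8082/healthz")
```

Calling this before launching training in terminal 2 avoids starting a run against a service that is still initializing.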

5. Run SEAL training

In terminal 2:

source ~/miniconda3/bin/activate
cd SEAL
conda activate seal
python launcher.py --conf exp/SEAL.yaml

Useful server-specific overrides:

python launcher.py --conf exp/SEAL.yaml -- \
  trainer.n_gpus_per_node=8 \
  actor_rollout_ref.model.path=/path/to/Qwen2.5-7B-Instruct \
  env_service.env_url=http://127.0.0.1:8082
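
Each override is a dotted path into the nested configuration followed by =value. The sketch below illustrates that mapping on a plain dictionary. It is not the launcher's own parser, which may differ (for example in how it converts value types); the function name and the sample config are assumptions.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(part, {})
        # Values are kept as strings in this sketch; no type conversion.
        node[leaf] = value
    return result


base = {
    "trainer": {"n_gpus_per_node": 4},
    "env_service": {"env_url": "http://127.0.0.1:8082"},
}
updated = apply_overrides(
    base,
    ["trainer.n_gpus_per_node=8",
     "actor_rollout_ref.model.path=/path/to/Qwen2.5-7B-Instruct"],
)
print(updated["trainer"]["n_gpus_per_node"])
print(updated["actor_rollout_ref"]["model"]["path"])
```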

The launcher writes a config/code backup to launcher_record/SEAL/, and training outputs are written under experiments/SEAL/.

6. View BFCL validation results

Validation generations are written as JSONL files under experiments/tech_synthetic/<experiment_name>/validation_log. Summarize per-category BFCL pass rates with:

TEST_PARQUET=data/bfcl_eval_400.parquet
BFCL_JSONL=env_service/environments/bfcl/bfcl_data/multi_turn_processed.jsonl

python3 scripts/stats_validation_bfcl.py \
  --val-dir experiments/tech_synthetic/SEAL/validation_log \
  --parquet "${TEST_PARQUET}" \
  --bfcl-jsonl "${BFCL_JSONL}"
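
For reference, a per-category pass rate is simply the number of passing tasks divided by the number of tasks in that category. The sketch below shows the arithmetic only. It takes already-extracted (category, passed) pairs because the record layout of the validation JSONL files is not documented here, and it is not the logic of scripts/stats_validation_bfcl.py. The category names in the example are BFCL multi-turn categories used as sample input.

```python
from collections import defaultdict


def pass_rates(results):
    """Map each category to the fraction of its tasks that passed.

    results: iterable of (category, passed) pairs, one per evaluated task
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        passes[category] += int(bool(passed))
    return {category: passes[category] / totals[category] for category in totals}


# Made-up sample outcomes, for illustration only.
sample = [
    ("multi_turn_base", True),
    ("multi_turn_base", False),
    ("multi_turn_miss_func", True),
    ("multi_turn_miss_param", False),
]
for category, rate in sorted(pass_rates(sample).items()):
    print(f"{category}: {rate:.1%}")
```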

Optional: let the launcher start BFCL

If you want launcher.py to start BFCL automatically, configure .env first:

cp example.env .env

Edit BFCL_SCRIPT in .env so it uses your actual Conda base path:

BFCL_SCRIPT="source /path/to/miniconda/bin/activate bfcl; bash bfcl.sh"

Then run:

source ~/miniconda3/bin/activate
cd SEAL
conda activate seal
python launcher.py --conf exp/SEAL.yaml --with-bfcl

DASHSCOPE_API_KEY is not required for the released SEAL recipe because task_manager.n=0 and synthetic_data_ratio=0.0. Set it only if you enable synthetic task exploration.

Repository Layout

exp/SEAL.yaml                         Full closed-loop SEAL co-evolution recipe
env_service/environments/bfcl/        Executable verifier and train-time interface evolution layer
agentevolver/module/tocf/             Failure diagnosis, capability-state tracking, and A-Patch advantage reweighting
agentevolver/module/task_manager/     BFCL task adaptation, reward routing, and diagnostic dense grader
data/                                 Low-resource BFCL train/evaluation splits used by SEAL

Acknowledgements

SEAL builds on ideas and infrastructure from modelscope/AgentEvolver and verl-project/verl. We sincerely thank the authors and contributors of these projects.

Citation

@article{hu2026seal,
  title={SEAL: Synergistic Co-Evolution of Agents and Learning Environments},
  author={Hu, Yihao and Wen, Zhihao and Liu, Xiujin and Wang, Pan and Zhang, Xin and Wu, Wei},
  journal={arXiv preprint arXiv:2605.24426},
  year={2026},
  url={https://arxiv.org/abs/2605.24426}
}

License

This project is released under the Apache License 2.0.
