
feat: add AReaL AgentWorkflow wrapping WAADesktopEnv for RL training#140

Merged
abrichr merged 1 commit into main from feat/areal-workflow on Mar 19, 2026
Conversation

@abrichr (Member) commented Mar 19, 2026

Summary

  • Add WAADesktopWorkflow class implementing AReaL's agent workflow pattern for RL training of desktop automation agents
  • Add AReaL GRPO config template (configs/areal_waa_grpo.yaml) for single-GPU training with Qwen2.5-VL-3B-Instruct
  • Add 14 tests covering full episode execution, dense milestone rewards, edge cases, and message building

Details

AReaL (github.com/inclusionAI/AReaL) is an async RL training framework that supports agent workflows: your code talks to an OpenAI-compatible proxy, while AReaL transparently handles token tracking, logprobs, and gradient computation.

WAADesktopWorkflow.run() executes one episode:

  1. Reset environment with task_id
  2. Loop: screenshot -> LLM (via AReaL proxy) -> parse action JSON -> execute on VM
  3. Evaluate with dense milestone rewards via evaluate_dense()
  4. Return reward scalar (0.0 to 1.0)
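The episode loop above can be sketched roughly as follows. Names like `parse_action_json` and `evaluate_dense()` come from the PR; the stub environment and client here are hypothetical stand-ins for `WAADesktopEnv` and the AReaL proxy client, not the real implementations:

```python
import json

# Hypothetical stand-ins for WAADesktopEnv and the AReaL OpenAI-compatible
# proxy client; the real classes live in openadapt_evals and AReaL.
class StubEnv:
    def reset(self, task_id):
        self.steps = 0

    def screenshot(self):
        return b"fake-png-bytes"

    def execute(self, action):
        self.steps += 1

    def evaluate_dense(self):
        # Dense milestone reward in [0.0, 1.0].
        return 0.5


class StubClient:
    def complete(self, messages):
        # A real client would call the AReaL proxy here.
        return json.dumps({"action": "click", "x": 100, "y": 200})


def run_episode(env, client, task_id, max_steps=3):
    """One RL episode: reset, loop screenshot -> LLM -> action, return reward."""
    env.reset(task_id)
    messages = []
    for _ in range(max_steps):
        shot = env.screenshot()
        messages.append({"role": "user", "content": f"<screenshot {len(shot)} bytes>"})
        raw = client.complete(messages)
        action = json.loads(raw)  # the PR uses parse_action_json at this step
        env.execute(action)
        messages.append({"role": "assistant", "content": raw})
    return env.evaluate_dense()  # scalar reward in [0.0, 1.0]


reward = run_episode(StubEnv(), StubClient(), task_id="demo-task")
```

The real `run()` is async and also handles early termination and error cases; this sketch only shows the reset/loop/evaluate shape of one episode.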

Key design decisions:

  • Reuses existing infrastructure: parse_action_json, RLEnvironment, TaskConfig, evaluate_dense()
  • AReaL is an optional dependency -- module imports gracefully when AReaL is not installed
  • Tests use mock adapters and mock AsyncOpenAI clients (no real VM or AReaL needed)
  • Config follows AReaL's exact YAML format (verified against their examples/agent_workflow/config.yaml)
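The optional-dependency behavior described above is typically handled with a guarded import; a minimal sketch of that pattern (the `areal` module name is real, but the flag and helper names here are illustrative, not necessarily what the PR uses):

```python
# Guarded import: the module stays importable even when AReaL is absent.
try:
    import areal  # noqa: F401
    AREAL_AVAILABLE = True
except ImportError:
    AREAL_AVAILABLE = False


def require_areal():
    """Raise a clear error only when AReaL functionality is actually invoked."""
    if not AREAL_AVAILABLE:
        raise ImportError(
            "AReaL is not installed; install it to run WAADesktopWorkflow "
            "(see github.com/inclusionAI/AReaL)"
        )
```

Deferring the error to `require_areal()` lets tests import and exercise everything that does not touch AReaL without installing the framework.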

Test plan

  • All 14 new tests pass (tests/test_areal_workflow.py)
  • All 48 related existing tests pass (trl_rollout, dense_rewards, verl_env) -- no regressions
  • Manual: verify config loads correctly with AReaL on GPU VM (requires AReaL installation)
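The mock-client testing approach mentioned above might look roughly like this; the real tests in tests/test_areal_workflow.py are not reproduced here, and the `agent_step` helper is purely illustrative:

```python
import asyncio
import json
from unittest.mock import AsyncMock, Mock


async def agent_step(client, messages):
    """One LLM call: ask the (mocked) proxy for the next action."""
    resp = await client.chat.completions.create(model="stub", messages=messages)
    return json.loads(resp.choices[0].message.content)


# AsyncMock mimicking the AsyncOpenAI response shape:
# resp.choices[0].message.content holds the model's JSON action.
client = AsyncMock()
client.chat.completions.create.return_value.choices = [
    Mock(message=Mock(content=json.dumps({"action": "done"})))
]

action = asyncio.run(agent_step(client, messages=[]))
```

Because the workflow only needs the `chat.completions.create` surface of the client, a single `AsyncMock` can stand in for the AReaL proxy with no VM or GPU involved.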

🤖 Generated with Claude Code

Add proof-of-concept integration with AReaL (inclusionAI/AReaL), an
async RL training framework. The WAADesktopWorkflow class implements
AReaL's agent workflow pattern: an async run() method that receives
task data and an OpenAI-compatible proxy URL, runs a full desktop
automation episode against WAADesktopEnv, and returns a scalar reward.

New files:
- openadapt_evals/training/areal_workflow.py: WAADesktopWorkflow class
  with screenshot-to-base64 encoding, multi-turn message building,
  action parsing via parse_action_json, and dense milestone rewards.
- configs/areal_waa_grpo.yaml: AReaL config template for single-GPU
  GRPO training with Qwen2.5-VL-3B-Instruct.
- tests/test_areal_workflow.py: 14 tests covering episode execution,
  reward computation, edge cases, and message building.

AReaL is an optional dependency -- the workflow gracefully handles
the case where AReaL is not installed. Tests use mock adapters and
mock OpenAI clients (no real VM or AReaL needed).
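The screenshot-to-base64 encoding and multi-turn message building mentioned above can be sketched as follows. The PR does not show the exact message schema, so this assumes the common OpenAI vision-message format, and both function names are hypothetical:

```python
import base64


def screenshot_to_data_url(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 data URL for a vision LLM."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{b64}"


def build_turn(history, png_bytes, instruction):
    """Append one user turn (screenshot + text instruction) to the history."""
    history.append({
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": screenshot_to_data_url(png_bytes)}},
            {"type": "text", "text": instruction},
        ],
    })
    return history


messages = build_turn([], b"\x89PNG...", "Click the Start button")
```

Accumulating turns this way gives the model the full screenshot/action history each step, which is what makes the episode multi-turn rather than a series of independent single-image calls.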

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@abrichr abrichr merged commit a7983cc into main Mar 19, 2026
1 check failed
