
feat: add AReaL AgentWorkflow wrapping WAADesktopEnv for RL training#140

Merged
abrichr merged 1 commit into main from feat/areal-workflow on Mar 19, 2026
Conversation

@abrichr (Member) commented Mar 19, 2026

Summary

  • Add WAADesktopWorkflow class implementing AReaL's agent workflow pattern for RL training of desktop automation agents
  • Add AReaL GRPO config template (configs/areal_waa_grpo.yaml) for single-GPU training with Qwen2.5-VL-3B-Instruct
  • Add 14 tests covering full episode execution, dense milestone rewards, edge cases, and message building

Details

AReaL (github.com/inclusionAI/AReaL) is an async RL training framework that supports agent workflows: your code talks to an OpenAI-compatible proxy, while AReaL transparently handles token tracking, logprobs, and gradient computation.

WAADesktopWorkflow.run() executes one episode:

  1. Reset environment with task_id
  2. Loop: screenshot -> LLM (via AReaL proxy) -> parse action JSON -> execute on VM
  3. Evaluate with dense milestone rewards via evaluate_dense()
  4. Return reward scalar (0.0 to 1.0)
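The episode loop above can be sketched roughly as follows. Names like `parse_action_json` and `evaluate_dense()` come from the PR; the stub environment and client here are hypothetical stand-ins for `WAADesktopEnv` and the AReaL proxy client, not the real implementations:

```python
import json

# Hypothetical stand-ins for WAADesktopEnv and the AReaL OpenAI-compatible
# proxy client; the real classes live in openadapt_evals and AReaL.
class StubEnv:
    def reset(self, task_id):
        self.steps = 0

    def screenshot(self):
        return b"fake-png-bytes"

    def execute(self, action):
        self.steps += 1

    def evaluate_dense(self):
        # Dense milestone reward in [0.0, 1.0].
        return 0.5


class StubClient:
    def complete(self, messages):
        # A real client would call the AReaL proxy here.
        return json.dumps({"action": "click", "x": 100, "y": 200})


def run_episode(env, client, task_id, max_steps=3):
    """One RL episode: reset, loop screenshot -> LLM -> action, return reward."""
    env.reset(task_id)
    messages = []
    for _ in range(max_steps):
        shot = env.screenshot()
        messages.append({"role": "user", "content": f"<screenshot {len(shot)} bytes>"})
        raw = client.complete(messages)
        action = json.loads(raw)  # the PR uses parse_action_json at this step
        env.execute(action)
        messages.append({"role": "assistant", "content": raw})
    return env.evaluate_dense()  # scalar reward in [0.0, 1.0]


reward = run_episode(StubEnv(), StubClient(), task_id="demo-task")
```

The real `run()` is async and also handles early termination and error cases; this sketch only shows the reset/loop/evaluate shape of one episode.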

Key design decisions:

  • Reuses existing infrastructure: parse_action_json, RLEnvironment, TaskConfig, evaluate_dense()
  • AReaL is an optional dependency -- module imports gracefully when AReaL is not installed
  • Tests use mock adapters and mock AsyncOpenAI clients (no real VM or AReaL needed)
  • Config follows AReaL's exact YAML format (verified against their examples/agent_workflow/config.yaml)
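The optional-dependency behavior described above is typically handled with a guarded import; a minimal sketch of that pattern (the `areal` module name is real, but the flag and helper names here are illustrative, not necessarily what the PR uses):

```python
# Guarded import: the module stays importable even when AReaL is absent.
try:
    import areal  # noqa: F401
    AREAL_AVAILABLE = True
except ImportError:
    AREAL_AVAILABLE = False


def require_areal():
    """Raise a clear error only when AReaL functionality is actually invoked."""
    if not AREAL_AVAILABLE:
        raise ImportError(
            "AReaL is not installed; install it to run WAADesktopWorkflow "
            "(see github.com/inclusionAI/AReaL)"
        )
```

Deferring the error to `require_areal()` lets tests import and exercise everything that does not touch AReaL without installing the framework.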

Test plan

  • All 14 new tests pass (tests/test_areal_workflow.py)
  • All 48 related existing tests pass (trl_rollout, dense_rewards, verl_env) -- no regressions
  • Manual: verify config loads correctly with AReaL on GPU VM (requires AReaL installation)
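The mock-client testing approach mentioned above might look roughly like this; the real tests in tests/test_areal_workflow.py are not reproduced here, and the `agent_step` helper is purely illustrative:

```python
import asyncio
import json
from unittest.mock import AsyncMock, Mock


async def agent_step(client, messages):
    """One LLM call: ask the (mocked) proxy for the next action."""
    resp = await client.chat.completions.create(model="stub", messages=messages)
    return json.loads(resp.choices[0].message.content)


# AsyncMock mimicking the AsyncOpenAI response shape:
# resp.choices[0].message.content holds the model's JSON action.
client = AsyncMock()
client.chat.completions.create.return_value.choices = [
    Mock(message=Mock(content=json.dumps({"action": "done"})))
]

action = asyncio.run(agent_step(client, messages=[]))
```

Because the workflow only needs the `chat.completions.create` surface of the client, a single `AsyncMock` can stand in for the AReaL proxy with no VM or GPU involved.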

🤖 Generated with Claude Code

Add proof-of-concept integration with AReaL (inclusionAI/AReaL), an
async RL training framework. The WAADesktopWorkflow class implements
AReaL's agent workflow pattern: an async run() method that receives
task data and an OpenAI-compatible proxy URL, runs a full desktop
automation episode against WAADesktopEnv, and returns a scalar reward.

New files:
- openadapt_evals/training/areal_workflow.py: WAADesktopWorkflow class
  with screenshot-to-base64 encoding, multi-turn message building,
  action parsing via parse_action_json, and dense milestone rewards.
- configs/areal_waa_grpo.yaml: AReaL config template for single-GPU
  GRPO training with Qwen2.5-VL-3B-Instruct.
- tests/test_areal_workflow.py: 14 tests covering episode execution,
  reward computation, edge cases, and message building.

AReaL is an optional dependency -- the workflow gracefully handles
the case where AReaL is not installed. Tests use mock adapters and
mock OpenAI clients (no real VM or AReaL needed).
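The screenshot-to-base64 encoding and multi-turn message building mentioned above can be sketched as follows. The PR does not show the exact message schema, so this assumes the common OpenAI vision-message format, and both function names are hypothetical:

```python
import base64


def screenshot_to_data_url(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 data URL for a vision LLM."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{b64}"


def build_turn(history, png_bytes, instruction):
    """Append one user turn (screenshot + text instruction) to the history."""
    history.append({
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": screenshot_to_data_url(png_bytes)}},
            {"type": "text", "text": instruction},
        ],
    })
    return history


messages = build_turn([], b"\x89PNG...", "Click the Start button")
```

Accumulating turns this way gives the model the full screenshot/action history each step, which is what makes the episode multi-turn rather than a series of independent single-image calls.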

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@abrichr abrichr merged commit a7983cc into main Mar 19, 2026
1 check failed
