Fix sandbox errors, add project docs, and update stale documentation by hwuiwon · Pull Request #56 · AppliedLabsAI/cua

hwuiwon · 2026-04-07T22:06:36Z

Summary

Fix volume commit AuthError inside Modal sandboxes by skipping explicit commit (auto-synced on exit)
Fix extract timeout on truncated href selectors by retrying with href^= (starts-with) fallback
Fix CancelledError in Starlette lifespan by using graceful uvicorn shutdown (server.should_exit)
Fix AsyncUsageWarning by making RunRegistry.remove() async and using await dict.pop.aio()
Add AGENTS.md with architecture overview, conventions, and quick reference commands
Add /test-deployment slash command for validating the deployed API
Update all docs to reflect current defaults (model, stuck detection settings, auth flow, tool list)

Test plan

All unit tests pass (pytest tests/ -x -q — 100%)
Ruff lint passes on all changed files
Deployed to Modal and validated with dry-run, simple directive, and structured output extraction
No CancelledError, AuthError, or AsyncUsageWarning in sandbox logs

Summary by CodeRabbit

Release Notes

Documentation
- Added comprehensive system architecture and development guide
- Updated authentication to perform fresh login each run instead of session restoration
- Changed default LLM model from Gemini to OpenAI GPT-5.4
- Added new test deployment guide with validation test cases
New Features
- Added new browser actions: select() for dropdowns and evaluate() for JavaScript execution (10 total actions)
- Enhanced stuck detection with three new configuration parameters
Bug Fixes
- Fixed authentication form detection for truncated hrefs in DOM snapshots

coderabbitai · 2026-04-07T22:07:23Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: acace907-0070-47be-b353-43f1f26510e8

📥 Commits

Reviewing files that changed from the base of the PR and between c846c5d and 7d89aff.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (18)

.claude/commands/create-playbook.md
.claude/commands/test-deployment.md
AGENTS.md
CLAUDE.md
README.md
agent/main.py
agent/session/finalizer.py
api/runs/registry.py
api/runs/service.py
bridge/page_actions.py
docs/api.md
docs/authentication.md
docs/configuration.md
docs/guardrails.md
docs/playbooks.md
docs/tools.md
tests/test_run_registry.py
tests/test_streaming.py

📝 Walkthrough

Walkthrough

Comprehensive update including documentation refinements, async migration of RunRegistry.remove operation, improved status API shutdown to eliminate CancelledError, conditional volume commit for non-local Modal sandboxes, DOM extraction retry for truncated href attributes, and corresponding test updates.

Changes

Cohort / File(s)	Summary
Documentation & Configuration `.claude/commands/create-playbook.md`, `.claude/commands/test-deployment.md`, `AGENTS.md`, `CLAUDE.md`, `README.md`, `docs/api.md`, `docs/authentication.md`, `docs/configuration.md`, `docs/guardrails.md`, `docs/playbooks.md`, `docs/tools.md`	Updated authentication behavior from session persistence to fresh login pattern; added E2E deployment testing guide and system architecture documentation; changed default model to OpenAI Responses; refined guardrails stuck-detection config (window size 8→12, added revisit/failure cluster parameters); documented tool action changes (9→10 actions, replaced `wait_for` with `select` and `evaluate`).
RunRegistry Async Migration `api/runs/registry.py`, `api/runs/service.py`, `tests/test_run_registry.py`, `tests/test_streaming.py`	Converted `RunRegistry.remove` from synchronous to async method; updated `ModalDictRunRegistry` to use `await modal_dict.pop.aio()`; modified `RunService.remove_handle` to await registry removal in cleanup, status error, stop, and streaming error paths; updated all test methods to handle async operations with `await`.
Status API Shutdown `agent/main.py`	Enhanced in-process status API shutdown: added return of both `task` and `uvicorn.Server` from `_start_status_api`; replaced task cancellation (which caused `CancelledError` noise) with `server.should_exit = True` followed by task await with exception suppression.
Session Finalizer `agent/session/finalizer.py`	Made `_commit_recording_volume` conditionally skip explicit `modal.Volume.commit` when `modal_sandbox_id != "local"`; preserves existing try/except behavior for local/standard sandboxes.
DOM Extraction Retry `bridge/page_actions.py`	Added retry mechanism for `extract_content` inner\_text extraction: catches exceptions and relaxes exact href selectors (`[href="..."]`) to starts-with patterns (`[href^="..."]`) to handle truncated hrefs in DOM snapshots; includes `_relax_href_selector` helper and compiled regex pattern.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Add SSE replay, event persistence, and multi-container support #20 — Modifies in-process status API startup/shutdown logic in agent/main.py, directly related to the uvicorn server lifecycle changes.
Refactor run/session lifecycle into dedicated packages and tighten browser execution boundaries #47 — Modifies agent/session/finalizer.py and the _commit_recording_volume function, affecting Modal sandbox recording behavior.
optimize task speed and reliability #55 — Modifies bridge/page_actions.py extraction and settling logic, related to the DOM content retrieval changes.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch modal-bug

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Fix sandbox errors, add project docs, and update stale documentation

7d89aff

hwuiwon merged commit 6baacb0 into main Apr 7, 2026
3 checks passed

hwuiwon deleted the modal-bug branch April 7, 2026 22:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix sandbox errors, add project docs, and update stale documentation#56

Fix sandbox errors, add project docs, and update stale documentation#56
hwuiwon merged 1 commit into
mainfrom
modal-bug

hwuiwon commented Apr 7, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

Uh oh!

coderabbitai Bot commented Apr 7, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hwuiwon commented Apr 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

Uh oh!

coderabbitai Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hwuiwon commented Apr 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 7, 2026 •

edited

Loading