wip ci(smoke): enable agent isolation on smoke runs by akshaylive · Pull Request #525 · UiPath/skills

akshaylive · 2026-05-01T22:51:20Z

Summary

- Set `agent.isolation: true` in `tests/experiments/default.yaml` so smoke tasks exercise coder_eval's filesystem-perimeter sandbox.
- Pin the e2e smoke job to `ubuntu-24.04` and install `bubblewrap` + `socat` (with the AppArmor sysctl + `aa-disable` workarounds) so the SDK's bwrap sandbox actually engages instead of silently falling back to a no-op.

Mirrors UiPath/coder_eval#207.
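The "silently falling back" risk in the summary could be caught with a preflight check before the smoke tasks run. The following is a minimal sketch, not the step from this PR: the function names are hypothetical, and `bwrap --ro-bind / / true` is used only as a cheap probe that bubblewrap can start a sandboxed process at all.

```shell
# Hypothetical preflight helpers; not taken from the PR's workflow.

# Print each command from the argument list that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed only if bwrap is installed and can actually launch a process.
# On hosts that block unprivileged user namespaces this probe is expected
# to fail, which is the situation the AppArmor workarounds address.
bwrap_engages() {
  command -v bwrap >/dev/null 2>&1 && bwrap --ro-bind / / true >/dev/null 2>&1
}

# Fail loudly instead of letting the sandbox degrade to a no-op.
preflight() {
  missing=$(missing_tools bwrap socat)
  if [ -n "$missing" ]; then
    echo "missing sandbox dependencies:" $missing >&2
    return 1
  fi
  if ! bwrap_engages; then
    echo "bwrap is installed but cannot create a sandbox" >&2
    return 1
  fi
}
```

A workflow step could then run `preflight || exit 1` right after the install step, so a runner image change surfaces as a failed job rather than as unsandboxed test runs.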

Test plan

- Smoke Skill Tests workflow passes on this PR
- Spot-check one task's `sdk_options.json` (or `task.json`) to confirm `sandbox.enabled=true` was sent to the SDK
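The spot-check in the test plan could be scripted along these lines. This is a sketch under assumptions: the thread does not show the file's layout, so the shape `{"sandbox": {"enabled": true}}` is inferred from the `sandbox.enabled=true` wording, and `sandbox_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the JSON file has sandbox.enabled set to true.
# The key layout is an assumption inferred from "sandbox.enabled=true".
sandbox_enabled() {
  file="$1"
  if command -v jq >/dev/null 2>&1; then
    # jq -e exits non-zero when the expression evaluates to false or null.
    jq -e '.sandbox.enabled == true' "$file" >/dev/null 2>&1
  else
    # Crude fallback without jq: strip whitespace, then look for the flag
    # directly inside the "sandbox" object.
    tr -d '[:space:]' <"$file" | grep -Eq '"sandbox":\{[^{}]*"enabled":true'
  fi
}
```

Usage would be something like `sandbox_enabled path/to/sdk_options.json && echo "sandbox requested"`, where the path depends on where the run writes its artifacts.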

🤖 Generated with Claude Code

Set agent.isolation=true in tests/experiments/default.yaml and update the e2e smoke job to ubuntu-24.04 with bubblewrap+socat installed and AppArmor's unprivileged-userns restriction relaxed, so the SDK's bwrap sandbox actually engages instead of silently falling back to no-op. Mirrors UiPath/coder_eval#207.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-01T22:51:39Z

Claude finished @akshaylive's task in 1m 39s.

Review: PR #525 — ci(smoke): enable agent isolation on smoke runs

- Read review criteria and project conventions
- Read full diff and changed files
- Check PR conversation for existing feedback
- Perform review against criteria
- Post review

Summary

Enables filesystem-perimeter sandboxing (agent.isolation: true) for all smoke test tasks by updating the default experiment config, and pins the CI runner to ubuntu-24.04 with bubblewrap + socat installed (plus AppArmor workarounds) so the sandbox actually engages.

Change-by-Change Review

1. `.github/workflows/smoke-skills.yml`

Severity: OK

Clean, well-scoped changes:

- ubuntu-latest → ubuntu-24.04 (line 157): Reasonable. ubuntu-latest has resolved to 24.04 since GitHub finished that migration in early 2025, so the pin does not change the image today. What it does is fix the bubblewrap version and AppArmor behavior that the step below targets, and rule out older images such as 22.04, whose bubblewrap 0.6.1 lacks the mount overlay features the SDK requires. The explanatory comment is thorough.
- New step, Install OS sandbox dependencies (lines 183–197): Installs bubblewrap and socat with --no-install-recommends (minimal footprint). The AppArmor workarounds are properly guarded:
  - `echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns || true`: the trailing `|| true` swallows the failure if the sysctl doesn't exist.
  - The for loop over AppArmor profiles checks `[ -f "$prof" ]` before disabling, and `|| true` absorbs both the missing-file case and aa-disable failures. Correct bash short-circuit.
- The detect job stays on ubuntu-latest, which is appropriate since it only does git operations and doesn't need bwrap.

One observation: because ubuntu-latest and ubuntu-24.04 currently select the same image, the pin only starts to matter when ubuntu-latest is next repointed, at which point the AppArmor workarounds should be re-checked against the new release. No action needed now, just noting for future cleanup.
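The guard pattern described in this part of the review can be sketched as follows. The workflow's actual step is not quoted in this thread, so this is an illustration rather than the PR's code: `SUDO` and `AA_DISABLE` are hypothetical override hooks added so the logic can be dry-run without root, and the profile paths are whatever the caller passes in.

```shell
# Sketch only; SUDO and AA_DISABLE are hypothetical hooks that default to
# `sudo` and `aa-disable`.

# Write 0 to the userns restriction sysctl named in the review.
relax_userns_restriction() {
  sysctl_path="${1:-/proc/sys/kernel/apparmor_restrict_unprivileged_userns}"
  # If the file cannot be written (for example, the sysctl does not exist),
  # tee fails and `|| true` keeps the step from failing.
  echo 0 | ${SUDO-sudo} tee "$sysctl_path" >/dev/null 2>&1 || true
}

# Disable each AppArmor profile file that exists; skip the rest.
disable_profiles() {
  for prof in "$@"; do
    # `[ -f ... ] && cmd || true`: a missing file short-circuits past the
    # command, and a failing command is absorbed by `|| true`.
    [ -f "$prof" ] && ${SUDO-sudo} ${AA_DISABLE:-aa-disable} "$prof" || true
  done
}
```

Setting `SUDO=""` and `AA_DISABLE=echo` turns `disable_profiles` into a dry run that prints the profiles it would disable, which is a cheap way to check the guards on a machine without AppArmor.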

2. `tests/experiments/default.yaml`

Severity: OK

isolation: true added at line 21, correctly placed under agent: at the same indentation level as max_turns, allowed_tools, and plugins. The YAML structure is valid.
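The placement described above (an `isolation: true` key nested under a top-level `agent:` block) could be checked mechanically. This is a rough sketch, not a YAML parser: `agent_isolation_enabled` is a hypothetical helper, and it only tracks whether the current line sits inside the top-level `agent:` block, so a deeper-nested `isolation` key inside that block would also match.

```shell
# Hypothetical helper: succeed if `isolation: true` appears as an indented
# key inside the top-level `agent:` block of the given YAML file.
agent_isolation_enabled() {
  awk '
    # Entering the top-level agent block.
    /^agent:[ \t]*$/ { in_agent = 1; next }
    # Any other unindented, non-comment line starts a new top-level key.
    /^[^ \t#]/ { in_agent = 0 }
    # Indented isolation key set to true, optionally followed by a comment.
    in_agent && /^[ \t]+isolation:[ \t]*true[ \t]*(#.*)?$/ { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}
```

For example, `agent_isolation_enabled tests/experiments/default.yaml || echo "isolation not enabled"` would flag a config where the key was dropped or indented under the wrong parent.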

What's Missing

Nothing identified. This is a CI infrastructure change — no skill files are modified, so no CODEOWNERS, frontmatter, or skill-body checks apply.

Area Ratings

| Area | Status | Notes |
| --- | --- | --- |
| Frontmatter | N/A | No skill files changed |
| E2E Tests | N/A | Test config updated, not test tasks |
| Skill Body | N/A | No skill files changed |
| References & Assets | N/A | No skill files changed |
| Repo Hygiene | OK | Scoped to CI + test config, no secrets, no cross-skill refs |

Issues for Manual Review

- Confirm the upstream UiPath/coder_eval#207 change is merged/available so the SDK actually honors `isolation: true` when it receives it.
- Per the PR's test plan: spot-check one task's `sdk_options.json` (or `task.json`) in a CI run to confirm `sandbox.enabled=true` was sent to the SDK.

Conclusion

Clean, well-documented CI change. The runner pin and sandbox dependency installation are correctly implemented with appropriate fallbacks. No issues found — approve.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

akshaylive · 2026-05-05T00:52:45Z

After some back and forth: this change works, but it's very brittle. Closing this as a no-op.

akshaylive requested a review from uipreliga as a code owner May 1, 2026 22:51

akshaylive changed the title ~~ci(smoke): enable agent isolation on smoke runs~~ wip ci(smoke): enable agent isolation on smoke runs May 3, 2026

ci(smoke): pin coder_eval to akshaya/sandbox for isolation testing

cf23341

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

akshaylive force-pushed the akshaya/isolate branch from f0e938a to cf23341 Compare May 4, 2026 15:41

akshaylive closed this May 5, 2026

akshaylive wants to merge 2 commits into main from akshaya/isolate
