fix: filter split-fs-invisible mounts when sysroot-stage is active (arc-dind)#5734

Merged
lpcox merged 3 commits into main from fix/sysroot-split-fs-mount-filtering on Jun 30, 2026

@lpcox lpcox commented Jun 30, 2026

Collaborator

Problem

Follow-up to #5732. On ARC/DinD runners with a split filesystem, additional bind mounts fail after the /etc mounts fix:

  1. Chroot hosts file (/tmp/awf-*/chroot-*/hosts:/host/etc/hosts:ro) — written to runner's /tmp which the Docker daemon can't see
  2. All workDir-based mounts (initSignalDir, agentLogsPath, sessionStatePath, chroot-home) — sourced from runner's unshared /tmp/awf-*
  3. Home directory mounts (~/.cache, ~/.config, etc. targeting /host/home/...) — runner's home isn't visible to the daemon

Fix

Two commits:

1. Skip chroot hosts mount when sysroot active (volume-builder.ts)

  • The sysroot volume already provides /etc/hosts
  • DNS pre-resolution is traded off; domains resolve at runtime via container DNS config

2. Filter workDir-based and home-based mounts in compose generator (compose-generator.ts)

  • Drop bind mounts whose source starts with config.workDir (runner's /tmp/awf-*)
  • Drop home directory mounts targeting /host${effectiveHome}/... (sysroot provides writable home)
  • Keep: /tmp:/tmp (daemon's own), workspace (ARC-shared), kernel VFS, /dev/null overlays, custom --mount flags
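
The filtering rules above could be sketched roughly as follows. This is a minimal illustration, not the code in compose-generator.ts: the function name `filterSplitFsMounts`, the `SplitFsFilterOptions` shape, and the example paths are hypothetical, and the sketch assumes volume entries are plain `source:target[:mode]` strings and that `/dev/null` overlays are exempted explicitly.

```typescript
// Hypothetical options shape: only the two values the rules above refer to.
interface SplitFsFilterOptions {
  workDir: string;       // runner-side AWF work directory, e.g. /tmp/awf-*
  effectiveHome: string; // runner home directory, e.g. /home/runner
}

// Returns the volume entries that should survive when sysroot-stage is active.
function filterSplitFsMounts(volumes: string[], opts: SplitFsFilterOptions): string[] {
  const homeTargetPrefix = `/host${opts.effectiveHome}/`;
  return volumes.filter((entry) => {
    const parts = entry.split(':');
    // Entries without a "source:target" shape are passed through untouched.
    if (parts.length < 2) return true;
    const [source = '', target = ''] = parts;
    // Assumption: credential-hiding overlays are always kept, whatever their target.
    if (source === '/dev/null') return true;
    // Drop mounts sourced from the runner's unshared workDir.
    if (source.startsWith(opts.workDir)) return false;
    // Drop home directory mounts; the sysroot volume provides a writable home.
    if (target.startsWith(homeTargetPrefix)) return false;
    return true;
  });
}

// Illustrative input; the agent-logs and .netrc paths are made up for the example.
const keptMounts = filterSplitFsMounts(
  [
    '/tmp:/tmp:rw',
    '/tmp/awf-abc123/agent-logs:/var/log/agent:rw',
    '/home/runner/.cache:/host/home/runner/.cache:rw',
    '/dev/null:/host/home/runner/.netrc:ro',
    'sysroot:/host:rw',
  ],
  { workDir: '/tmp/awf-abc123', effectiveHome: '/home/runner' },
);
// keptMounts: ['/tmp:/tmp:rw', '/dev/null:/host/home/runner/.netrc:ro', 'sysroot:/host:rw']
```

Filtering on the source prefix for workDir mounts and on the target prefix for home mounts mirrors the two bullets above; the real implementation may classify mounts differently.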

What remains after filtering

  • /tmp:/tmp:rw — daemon has its own /tmp
  • ${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}:rw — shared between runner and daemon on ARC
  • /sys:/host/sys:ro, /dev:/host/dev:ro — kernel VFS
  • /dev/null:... credential-hiding overlays
  • Custom --mount flags (typically under shared /tmp/gh-aw/)
  • sysroot:/host:rw named volume (provides full glibc filesystem)

Testing

  • All 3387 tests pass
  • Added compose-generator test verifying split-fs filtering behavior
  • TypeScript compiles cleanly

Context

Discovered during ARC/DinD canary testing (bbq-beets-four-nines/agentic-workflows-canary):

lpcox and others added 2 commits June 30, 2026 15:06
The generated /etc/hosts file (chroot-*/hosts) is written to the
runner's /tmp which the Docker daemon cannot see on split-fs ARC/DinD.
Skip this mount when sysroot is active since the volume already provides
/etc/hosts. DNS pre-resolution is traded off; domains resolve at runtime.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
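
As a rough illustration of the commit above, the skip can be thought of as an early return before the hosts mount is emitted. The names `HostsMountConfig`, `sysrootStageActive`, and `hostsFileMounts` are hypothetical (the actual logic lives around generateHostsFileMount() in volume-builder.ts, whose signature is not shown in this PR page), and the sketch assumes sysroot-stage is active exactly when `runnerTopology` is `'arc-dind'`.

```typescript
// Hypothetical config shape for this sketch only.
interface HostsMountConfig {
  workDir: string;          // runner-side work directory, e.g. /tmp/awf-*
  runnerTopology?: string;  // assumed: 'arc-dind' means sysroot-stage is active
}

// Assumption: sysroot-stage is tied to the arc-dind topology.
function sysrootStageActive(config: HostsMountConfig): boolean {
  return config.runnerTopology === 'arc-dind';
}

// Returns the bind mount for the generated hosts file, or nothing when the
// sysroot volume already provides /etc/hosts.
function hostsFileMounts(config: HostsMountConfig, chrootDirName: string): string[] {
  if (sysrootStageActive(config)) {
    // The file would live on the runner's /tmp, which the daemon cannot see.
    return [];
  }
  return [`${config.workDir}/${chrootDirName}/hosts:/host/etc/hosts:ro`];
}

const arcDindHosts = hostsFileMounts({ workDir: '/tmp/awf-abc123', runnerTopology: 'arc-dind' }, 'chroot-1');
// arcDindHosts: []
const defaultHosts = hostsFileMounts({ workDir: '/tmp/awf-abc123' }, 'chroot-1');
// defaultHosts: ['/tmp/awf-abc123/chroot-1/hosts:/host/etc/hosts:ro']
```
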
On ARC/DinD with split filesystem, the Docker daemon cannot see:
- Paths under AWF's workDir (/tmp/awf-*) on the runner's /tmp
- Runner home directory paths mounted to /host/home/...

These bind mounts fail with OCI runtime errors. When sysroot-stage is
active, drop them at compose generation time:
- workDir-based mounts (initSignalDir, logs, session state, chroot-home)
- Home directory mounts targeting /host (sysroot volume provides these)

The agent still gets:
- /tmp:/tmp:rw (daemon's own /tmp)
- Workspace (shared between runner and daemon on ARC)
- Kernel VFS (/sys, /dev)
- Credential-hiding /dev/null overlays
- The sysroot named volume at /host with full glibc filesystem

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 30, 2026 22:06
@github-actions

Contributor

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 98.64% | 98.68% | 📈 +0.04% |
| Statements | 98.56% | 98.58% | 📈 +0.02% |
| Functions | 99.55% | 99.55% | ➡️ +0.00% |
| Branches | 94.53% | 94.51% | 📉 -0.02% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/compose-generator.ts | 99.0% → 99.0% (+0.07%) | 99.0% → 98.1% (-0.83%) |
| src/workdir-setup.ts | 92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot AI left a comment

Contributor

Pull request overview

This PR is a follow-up ARC/DinD (“split filesystem”) fix that adjusts how AWF composes bind mounts when runnerTopology: 'arc-dind' (sysroot-stage active), aiming to avoid Docker daemon mount failures caused by runner-only paths.

Changes:

  • Skip generating and bind-mounting a chroot /etc/hosts file when sysroot-stage is active.
  • Filter agent bind mounts in sysroot-stage mode to drop mounts likely sourced from runner-only paths (workDir and home).
  • Add a unit test asserting workDir/home bind mounts are filtered while core mounts remain.
Summary per file:

| File | Description |
| --- | --- |
| src/services/agent-volumes/volume-builder.ts | Skips generateHostsFileMount() when sysroot-stage is active. |
| src/compose-generator.ts | Adds sysroot-mode filtering logic for agent volumes entries. |
| src/compose-generator.test.ts | Adds a test covering the new sysroot-mode filtering behavior. |

Review details

  • Files reviewed: 3/3 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread src/compose-generator.ts
Comment thread src/compose-generator.ts
Add parts.length < 2 check to avoid undefined target when a volume
mount string lacks the expected ':' separator.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
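
The guard described in this commit can be illustrated with a small parsing helper. This is a sketch under assumptions, not the code that was merged: `parseMountSpec` and `BindMountSpec` are hypothetical names. Without the length check, a string with no ':' separator would yield an undefined target, and a later call such as `target.startsWith(...)` would throw a TypeError.

```typescript
// Hypothetical parsed form of a "source:target[:mode]" volume string.
interface BindMountSpec {
  source: string;
  target: string;
  mode: string | undefined;
}

// Returns undefined for strings that lack the expected ':' separator,
// so callers never see an undefined target.
function parseMountSpec(spec: string): BindMountSpec | undefined {
  const parts = spec.split(':');
  if (parts.length < 2) {
    return undefined;
  }
  const [source = '', target = '', mode] = parts;
  return { source, target, mode };
}

const wellFormedSpec = parseMountSpec('/sys:/host/sys:ro');
// wellFormedSpec: { source: '/sys', target: '/host/sys', mode: 'ro' }
const malformedSpec = parseMountSpec('/tmp');
// malformedSpec: undefined
```

A caller can then decide what to do with unparseable entries; passing them through unfiltered is one conservative option.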
@github-actions

Contributor

✅ Copilot review passed with no inline comments.

@lpcox Add the ready-for-aw label to this PR to trigger agentic CI smoke tests.

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Build Test Suite completed successfully!

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Smoke Copilot BYOK AOAI (Entra) completed. Copilot AOAI BYOK (Entra) mode operational. 🔓

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

📡 Smoke OTel Tracing completed. All tracing scenarios validated. ✅

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

🔌 Smoke Services — All services reachable! ✅

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Smoke Claude passed

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Contribution Check completed successfully!

Contribution guidelines review complete: PR #5734 follows the applicable CONTRIBUTING.md guidance. It includes a clear description, relevant context/reference, tests for the code change, and changes are organized under src/ with the test alongside existing coverage.

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

@github-actions

Contributor

🚀 Security Guard has started processing this pull request

@github-actions

Contributor

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 98.64% | 98.68% | 📈 +0.04% |
| Statements | 98.56% | 98.57% | ➡️ +0.01% |
| Functions | 99.55% | 99.55% | ➡️ +0.00% |
| Branches | 94.53% | 94.48% | 📉 -0.05% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/compose-generator.ts | 99.0% → 99.0% (+0.08%) | 99.0% → 97.2% (-1.72%) |
| src/workdir-setup.ts | 92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

github-actions Bot commented Jun 30, 2026

Contributor

Smoke Gemini completed. All facets verified. 💎

@github-actions

Contributor

Smoke Test: Claude Engine Validation

| Check | Result |
| --- | --- |
| API status | ✅ PASS |
| GitHub CLI check | ✅ PASS |
| File status | ✅ PASS |

Overall result: PASS

Generated by Smoke Claude for #5734 · 52.2 AIC · ⊞ 3.3K
Add label ready-for-aw to run again

@github-actions

Contributor

🔥 Smoke Test: PAT Auth — PASS

| Test | Result |
| --- | --- |
| GitHub MCP connectivity | |
| GitHub.com HTTP | ✅ 200 |
| File write/read | ⚠️ pre-step data unavailable |

Overall: PASS (connectivity tests verified independently; pre-step template outputs were unresolved)

CC @lpcox — Auth mode: PAT (COPILOT_GITHUB_TOKEN)

🔑 PAT report filed by Smoke Copilot PAT
Add label ready-for-aw to run again

@github-actions

Contributor

@lpcox Smoke Test Results:

  • GitHub MCP Testing: ✅
  • GitHub.com Connectivity: ✅
  • File Write/Read Test: ✅
  • BYOK Inference Test: ✅

Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra

Overall: PASS

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)
Add label ready-for-aw to run again

@github-actions

Contributor

Smoke Test: Copilot BYOK (Direct) Mode ✅

✅ GitHub MCP connectivity verified
✅ GitHub.com HTTP 200 reached
✅ File write/read test passed
✅ Direct BYOK inference working (COPILOT_PROVIDER_API_KEY → api-proxy → api.githubcopilot.com)

Status: PASS — Running in direct BYOK mode via api-proxy sidecar

🔑 BYOK report filed by Smoke Copilot BYOK
Add label ready-for-aw to run again

@github-actions

Contributor

🤖 Smoke Test Results

Test Result
GitHub MCP connectivity
GitHub.com HTTP (200)
File write/read

Overall: PASS

/cc @lpcox

📰 BREAKING: Report filed by Smoke Copilot
Add label ready-for-aw to run again

@github-actions

Contributor

Merged PRs:

Checks:

  • GitHub read: ✅
  • Playwright title: ✅
  • File write: ✅
  • Build: ✅

Overall: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex
Add label ready-for-aw to run again

@github-actions

Contributor

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | Python 3.12.13 | Python 3.12.3 | ❌ NO |
| Node.js | v24.17.0 | v22.23.0 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot
Add label ready-for-aw to run again

@github-actions

Contributor

🔬 Smoke Test: API Proxy OpenTelemetry Tracing

| Scenario | Result | Detail |
| --- | --- | --- |
| 1. Module Loading | | otel.js loads cleanly; exports startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled + test helpers |
| 2. Test Suite | | otel.test.js exists (652 lines, 12 describe blocks); tests cover serialization helpers, exporters, isEnabled, parent context propagation, token attributes, and span lifecycle |
| 3. Env Var Forwarding | | src/services/api-proxy-env-config.ts forwards GH_AW_OTLP_ENDPOINTS, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID, and OTEL_SERVICE_NAME (default: awf-api-proxy) to the api-proxy container |
| 4. Token Tracker Integration | | token-tracker-http.js declares onUsage callback in opts (line 283); invoked at line 324; confirmed hook point for setTokenAttributes / setBudgetAttributes |
| 5. OTEL Diagnostics | | No OTLP endpoint configured in this run → spans fall back to /var/log/api-proxy/otel.jsonl (expected; graceful degradation path confirmed in _init()) |

All scenarios pass. OTEL tracing integration is correctly implemented: module initializes, spans are created with GenAI semconv attributes, token usage flows via onUsage → setTokenAttributes, env vars are forwarded, and the system degrades gracefully when no exporter endpoint is set.

📡 OTel tracing validated by Smoke OTel Tracing
Add label ready-for-aw to run again

@github-actions

Contributor

@lpcox

fix: filter split-fs-invisible mounts when sysroot-stage is active (arc-dind)

✅ GitHub MCP connectivity
✅ GitHub.com connectivity
✅ File I/O sandbox test
✅ Direct BYOK inference

Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (o4-mini-aw)

PASS

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)
Add label ready-for-aw to run again

@github-actions

Contributor

Smoke Test: GitHub Actions Services Connectivity

| Check | Result |
| --- | --- |
| Redis PING | ❌ No response (timeout) |
| PostgreSQL pg_isready | ❌ No response |
| PostgreSQL SELECT 1 | ❌ No response (timeout) |

Overall: FAIL

host.docker.internal resolves to 172.17.0.1 but all service connections timed out or returned no response. The service containers do not appear to be running or reachable from this environment.

🔌 Service connectivity validated by Smoke Services
Add label ready-for-aw to run again

@github-actions

Contributor

Smoke Test Results: Gemini Engine

  • GitHub MCP Testing: ❌ (Tools not found, used curl fallback)
  • GitHub.com Connectivity: ✅ (HTTP 200)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Last 2 merged PRs:

  1. fix: skip /etc bind mounts when sysroot-stage is active (arc-dind) (#5732)
  2. fix: escape $d as $$d in sysroot-stage compose command (#5730)

Overall status: PASS (with fallback)

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini
Add label ready-for-aw to run again

@github-actions

Contributor

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx passed ✅ PASS
Node.js execa passed ✅ PASS
Node.js p-limit passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for #5734 · 57 AIC · ⊞ 7.8K
Add label ready-for-aw to run again

2 participants