fix(docker): apply docker-host-path-prefix to all compose service mounts #3218
Pull request overview
Fixes a DinD/split-filesystem bind-mount mismatch where --docker-host-path-prefix was applied to the agent’s init-signal mount but not to the iptables-init service, causing the agent to time out waiting for the /tmp/awf-init/ready handshake file.
Changes:
- Exported `normalizeDockerHostPathPrefix` / `translateBindMountHostPath` from `agent-volumes.ts` for reuse.
- Updated `buildIptablesInitService()` to apply the same host-path prefix translation to the init-signal bind mount as `buildAgentVolumes()` does.
- Wired `dockerHostPathPrefix` through compose generation and added regression tests ensuring agent + init container mount-source symmetry (including trailing-slash normalization).
Show a summary per file
| File | Description |
|---|---|
| src/services/agent-volumes.ts | Exports the existing docker-host-path-prefix normalization/translation helpers for reuse. |
| src/services/agent-service.ts | Applies docker-host-path-prefix translation to the iptables-init init-signal volume so it matches the agent’s mount source. |
| src/services/agent-service.test.ts | Adds coverage for default behavior, prefixed behavior, and trailing-slash normalization for the iptables-init init-signal volume. |
| src/compose-generator.ts | Passes config.dockerHostPathPrefix into buildIptablesInitService(). |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 0
@copilot address review feedback
The Copilot reviewer found 0 issues across all 4 changed files. All CI failures are pre-existing and unrelated to this change. No code changes are needed.
Same gap: `buildSquidService`, `buildApiProxyService`, and `buildCliProxyService` all create unprefixed bind mounts (logs, SSL cert/key, DIFC CA cert). Under ARC/DinD their outputs silently go to daemon-local dirs, and optional file mounts can fail when Docker auto-creates a directory where a file is expected. My PR fixes only the iptables-init handshake; the next layer of pain once it gets past the timeout will be missing logs and possibly broken SSL bump / DIFC. Wondering if I should do this in the PR.
i'm gonna expand scope |
306d670 to b72cd31
The agent and iptables-init containers coordinate via a shared bind-mounted init-signal directory at /tmp/awf-init. The iptables-init container writes ready/output.log there after running setup-iptables.sh, and the agent's entrypoint waits for those files before continuing.

buildAgentVolumes() applies dockerHostPathPrefix to its mount sources so the agent's /tmp/awf-init bind is daemon-resolvable on split runner/Docker daemon filesystems (e.g. ARC + DinD). buildIptablesInitService() did not, so once --docker-host-path-prefix was set the two containers bound to two different daemon-side directories. The init container could complete successfully and the agent would still time out after 30s with 'No init container output log found' because its bind target stayed empty.

The same gap existed in the squid, api-proxy, and cli-proxy service builders: their bind-mount sources (squid logs, SSL cert/key/db, api-proxy logs, cli-proxy logs, optional DIFC CA cert) were never run through the prefix translation, so on ARC/DinD their logs would land in daemon-local directories and optional file mounts could fail when Docker auto-creates a directory at the unstaged source path.

Extract normalize/translate/applyHostPathPrefixToVolumes into a shared host-path-prefix module and call applyHostPathPrefixToVolumes() at the end of every service builder's volume list construction. agent-volumes.ts delegates to the shared helper and re-exports the helpers for backwards compatibility. doh-proxy has no bind mounts and is unchanged.

Add a parameterized symmetric invariant test that walks every bind mount on every compose service and asserts the prefix is applied uniformly when set (and skipped otherwise), so any future service builder is protected against the same class of asymmetric translation bug.
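As a rough illustration of the normalize/translate/apply split described in the commit message, a minimal sketch might look like the following. The function names and bodies are hypothetical stand-ins, not the contents of the PR's host-path-prefix module; in particular, treating only absolute source paths as bind mounts is an assumption.

```typescript
// Hypothetical sketch: names and bodies are illustrative assumptions,
// not the actual helpers from the PR.

/** Trims whitespace and trailing slashes; returns undefined when nothing is left. */
function normalizePrefixSketch(prefix?: string): string | undefined {
  const trimmed = (prefix ?? "").trim().replace(/\/+$/, "");
  return trimmed.length > 0 ? trimmed : undefined;
}

/** Prefixes the host side of a "source:target[:options]" bind mount string. */
function translateMountSketch(mount: string, prefix?: string): string {
  const normalized = normalizePrefixSketch(prefix);
  if (!normalized) return mount;
  const parts = mount.split(":");
  const source = parts[0] ?? "";
  // Assumption: only absolute host paths are bind mounts; named volumes pass through.
  if (!source.startsWith("/")) return mount;
  return [`${normalized}${source}`, ...parts.slice(1)].join(":");
}

/** Applies the translation to a whole volume list, returning early when no prefix is set. */
function applyPrefixToVolumesSketch(volumeList: string[], prefix?: string): string[] {
  if (!normalizePrefixSketch(prefix)) return volumeList;
  return volumeList.map((mount) => translateMountSketch(mount, prefix));
}
```

With this sketch, `/host` and `/host/` produce the same translated source, and an unset, empty, or whitespace-only prefix leaves the volume list untouched.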
b72cd31 to 4144266
done, ready again for review
Smoke Test: Copilot BYOK (Offline) Mode
Running in BYOK offline mode ( Overall: PASS (GitHub MCP auth limitation is environment infra, not BYOK path) Author: @salmanmkc
Smoke Test Results
Overall: FAILED (2/3 passed) Details:
🔍 Smoke Test Results
Overall: FAIL The workflow did not expand /cc @salmanmkc
Smoke Test Results
Overall Status: FAIL
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.
refactor: [Export Audit] Remove test-only re-exports from barrel modules
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.
Chroot Version Comparison Results
Result: Not all tests passed — Python and Node.js versions differ between host and chroot.
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Smoke Test Results
Overall: FAIL — service containers not reachable from this runner environment.
@salmanmkc thank you for all the bug fixes, but it's preferable that you file issues rather than PRs. We try to be very responsive and schedule them as soon as we can. Thanks!
no worries, but even the code review agent didn't catch that this issue would happen later, so unsure if an issue is really useful in this case.
specifically, the need to expand the scope
Bug Fix
What was the bug?
On split runner/Docker daemon filesystems (notably ARC + DinD), enabling `--docker-host-path-prefix` produced two related failure classes:

1. iptables-init handshake timeout (the reported symptom):

The agent and `awf-iptables-init` containers coordinate via a shared bind-mounted init-signal directory:
- `awf-iptables-init` runs `setup-iptables.sh > /tmp/awf-init/output.log 2>&1 && touch /tmp/awf-init/ready`.
- `containers/agent/entrypoint.sh` polls for `/tmp/awf-init/ready` (timeout 30s) and falls back to printing `/tmp/awf-init/output.log` on failure.

`buildAgentVolumes()` runs every agent-side mount through `translateBindMountHostPath(mount, dockerHostPathPrefix)` so the init-signal source becomes daemon-resolvable (e.g. `/host/<workDir>/init-signal`). `buildIptablesInitService()` did not apply the same translation, so once `--docker-host-path-prefix` was set the two containers bound to two different daemon-side directories. The init container could complete `setup-iptables.sh` successfully and the agent would still time out.

2. The same gap existed in every other compose service builder:
- `buildSquidService`: squid logs, SSL cert, SSL key, SSL DB
- `buildApiProxyService`: api-proxy logs
- `buildCliProxyService`: cli-proxy logs and the optional DIFC CA cert mount

Their bind-mount sources were never run through the prefix translation, so on ARC + DinD their logs would silently land in daemon-local directories and the optional file mounts could fail when Docker auto-creates a directory at the unstaged source path.
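To make the first failure class concrete, the sketch below compares the mount sources the two containers would end up with. The work directory path is a made-up placeholder; only the `/host` prefix and the `/tmp/awf-init` target come from the description above.

```typescript
// Illustrative only: initSignalDir is a hypothetical path, not taken from the PR.
const prefix = "/host";
const initSignalDir = "/tmp/awf-work/init-signal";

// Agent side: the source is translated into the daemon's view of the filesystem.
const agentMount = `${prefix}${initSignalDir}:/tmp/awf-init`;

// iptables-init side before the fix: the source is used as-is.
const initMountBefore = `${initSignalDir}:/tmp/awf-init`;

// iptables-init side after the fix: the same translation is applied.
const initMountAfter = `${prefix}${initSignalDir}:/tmp/awf-init`;

// The host-side source is everything before the first colon.
const sourceOf = (mount: string): string => mount.split(":")[0] ?? "";

console.log(sourceOf(agentMount) === sourceOf(initMountBefore)); // false
console.log(sourceOf(agentMount) === sourceOf(initMountAfter)); // true
```

When the two sources differ, the `ready` file written by the init container lands in a directory the agent never sees, which is consistent with the timeout described above.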
How did you fix it?
Shared host-path-prefix module
- Extracted `normalizeDockerHostPathPrefix`, `translateBindMountHostPath`, and a new `applyHostPathPrefixToVolumes(volumes, prefix)` array helper into `src/services/host-path-prefix.ts`. The translation logic is unchanged.
- `agent-volumes.ts` re-exports the helpers for backwards compatibility and delegates its existing prefix block to the shared helper (no behavior change for agent volumes).

Symmetric translation across the compose stack
- `buildIptablesInitService`, `buildSquidService`, `buildApiProxyService`, and `buildCliProxyService` now call `applyHostPathPrefixToVolumes(volumes, config.dockerHostPathPrefix)` at the end of their volume list construction.
- `buildDohProxyService` has no bind mounts and is unchanged.
- […] `--docker-host-path-prefix` is set.

Behavioral coverage
- `should mount init-signal dir without translation when dockerHostPathPrefix is unset`: confirms the default unprefixed behavior is preserved.
- `should apply dockerHostPathPrefix to the iptables-init init-signal volume`: regression test for the original bug, including a cross-check that the agent and iptables-init mount sources stay symmetric.
- `should normalize trailing slash in dockerHostPathPrefix for iptables-init mount`: mirrors the existing trailing-slash test for agent volumes.
- A parameterized symmetric invariant test walks every bind mount on every compose service and asserts the prefix is applied uniformly when set (`/host`, `/host/`) and skipped otherwise (unset, empty, whitespace). This is what protects future service builders from the same class of asymmetric translation bug.

What is the impact?
- […] `--docker-host-path-prefix /host`: they no longer hit the 30s init handshake timeout, and squid/api-proxy/cli-proxy bind-mount sources now resolve in the daemon namespace.
- No behavior change when `dockerHostPathPrefix` is unset (preserved by the early return in `applyHostPathPrefixToVolumes`).

Testing
Local:
- `npx tsc --noEmit`: clean.
- `npx eslint` on every touched file: 0 errors (only pre-existing warnings that match nearby code style).
- `npx jest src/services/`: 295 passed.

CI (post-push): 62 of 65 checks green. The 3 failures all reproduce on `main` and are unrelated to this change:
- Audit Docs Site Package: npm audit CVEs in `docs-site/node_modules` (mermaid/postcss/uuid/vite/yaml).
- Test Coverage Report: `scripts/ci/workflow-engine-install-security.test.ts` "Claude Code CLI installs must include --ignore-scripts".
- Smoke OpenCode: "No safe outputs were invoked"; failing on `main` for 4+ consecutive scheduled runs.
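The symmetric invariant this PR relies on (every bind mount on every compose service carries the prefix when it is set) could be approximated by a small checker like the one below. This is a sketch under stated assumptions, not the PR's test: the `ComposeServiceSketch` shape, the function name, and all paths are hypothetical, and a bind mount is assumed to be any volume whose source is an absolute path.

```typescript
// Hypothetical shapes and names, for illustration only.
interface ComposeServiceSketch {
  volumes?: string[];
}
type ComposeServicesSketch = Record<string, ComposeServiceSketch>;

/** Lists every bind mount whose host-side source does not start with the expected prefix. */
function findUnprefixedBindMounts(services: ComposeServicesSketch, expectedPrefix: string): string[] {
  const offenders: string[] = [];
  for (const [serviceName, service] of Object.entries(services)) {
    for (const volume of service.volumes ?? []) {
      const source = volume.split(":")[0] ?? "";
      // Assumption: absolute sources are bind mounts; named volumes are skipped.
      const isBindMount = source.startsWith("/");
      if (isBindMount && !source.startsWith(`${expectedPrefix}/`)) {
        offenders.push(`${serviceName}: ${volume}`);
      }
    }
  }
  return offenders;
}

// Example: one service was translated, one was missed, one uses a named volume.
const exampleServices: ComposeServicesSketch = {
  agent: { volumes: ["/host/tmp/awf-work/init-signal:/tmp/awf-init"] },
  squid: { volumes: ["/tmp/awf-work/squid-logs:/var/log/squid"] },
  cache: { volumes: ["cache-data:/var/cache"] },
};

console.log(findUnprefixedBindMounts(exampleServices, "/host"));
// [ 'squid: /tmp/awf-work/squid-logs:/var/log/squid' ]
```

A check of this shape fails for any newly added service builder that forgets the translation, which is the class of regression the invariant test is meant to catch.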