Fix DinD compose ACP: use Daytona PTY WebSocket for live agent pipes#193
Merged
Fix DinD compose ACP: use Daytona PTY WebSocket for live agent pipes#193
Conversation
SSH pipes break through the DinD→compose exec chain, causing "Process closed stdout (rc=None)" on all compose tasks. New DaytonaPtyProcess uses Daytona SDK's WebSocket PTY API for the outer connection (keeps pipe alive), then docker compose exec -i -T inside (clean stdio for the agent). Includes marker-based startup to drain shell output before ACP handshake, and echo-resistant response matching in the ACP client (filter echoed requests by checking for 'method' field absence). Also adds skills_dir: "auto" support in Job for per-task skill resolution after PR #720 removed COPY skills from Dockerfiles.
abfb0c2 to
aaa6b80
Compare
4 tasks
4 tasks
xdotli
added a commit
that referenced
this pull request
Apr 25, 2026
* fix: env-file path mismatch in DinD compose mode Devin caught a real bug introduced by PR #193 (DinD compose ACP): src/benchflow/process.py:325 sets remote_env_path = "/tmp/benchflow_env_$$.env" expecting the remote shell to expand $$ to its PID. But shlex.join() at line 329 single-quotes the --env-file argument, so docker compose receives the literal string "/tmp/benchflow_env_$$.env" while the cat heredoc that writes the file (line 339, raw f-string) does expand $$. The file is written to /tmp/benchflow_env_<pid>.env and read from /tmp/benchflow_env_$$.env — silent mismatch, env vars (incl. API keys) silently dropped in DinD compose tasks. Fix: use uuid.uuid4().hex[:16] for the unique suffix instead of relying on shell-side $$ expansion. The path is then a literal that survives quoting. Apply the same fix to the direct (non-DinD) Daytona branch even though it was working — uniformity makes the path robust against future quoting changes. Also fix a pre-existing SIM103 lint error in _daytona_patches.py that ruff caught while validating the test changes. Tests: tests/test_process.py +2 regression tests pinning that no remote command contains a literal "$$" — would catch this exact regression. 8/8 process tests pass; ruff clean. * test: reference PR #193 / #198 in regression test docstring Devin caught: CLAUDE.md mandates regression tests name the commit/PR they guard. Updated TestDaytonaProcessEnvFilePath docstring to cite PR #198 (the fix) and PR #193 / commit cdccac7 (the regression).
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
DaytonaPtyProcessclass that uses Daytona SDK's WebSocket PTY API instead of SSH for DinD compose tasksProcess closed stdout rc=None). WebSocket PTY maintains the connection.client.py(filter echoed requests by checking formethodfield absence)skills_dir: "auto"support in Job for per-task skill resolution (needed after skillsbench PR #720 removed COPY skills from Dockerfiles)How it works
-T(compose exec): disables TTY inside container so agent gets clean stdio_acp_run.pydetects DinD → uses PTY transport automaticallyTest results
Test plan
Ref: #192