feat: Docker-in-Docker job execution via host socket #18
Pull request overview
Adds Docker-based job execution for the local PanDA/Harvester compose stack by introducing Harvester plugins that submit and monitor jobs as Docker containers via the host Docker socket.
Changes:
- Added `DockerSubmitter`/`DockerMonitor` Harvester plugins to run and poll worker containers through `/var/run/docker.sock`.
- Updated `docker-compose.yml` to run Harvester in foreground with a PID healthcheck, mount the Docker socket, and load the custom plugins via `PYTHONPATH`.
- Updated the CI workflow and `pandajob-submit` script to submit an inline Alpine `sh -c ...` job and poll until it finishes.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `scripts/pandajob-submit` | Adds inline `--transformation`/`--params` submission mode (and makes `--script` optional). |
| `docker-compose.yml` | Starts Harvester daemon in foreground, installs docker SDK at startup, mounts docker.sock, adds healthcheck, mounts plugin dir. |
| `config/harvester/plugins/docker_submitter.py` | New submitter plugin that launches detached containers for workers. |
| `config/harvester/plugins/docker_monitor.py` | New monitor plugin that maps container status/exit code to Harvester worker states. |
| `config/harvester/panda_queues.cfg` | Switches queue to DockerSubmitter/DockerMonitor and uses BaseMessenger. |
| `.github/workflows/ci.yml` | Expands smoke test to submit an Alpine job and poll for terminal status. |
```sh
mkdir -p /var/run/panda /var/log/panda /harvester/workers /harvester/wdir
# Allow the unprivileged atlpan process to reach the Docker socket
chmod 666 /var/run/docker.sock || true
```
`chmod 666 /var/run/docker.sock` makes the host Docker socket world-writable inside the container, which effectively grants root-equivalent access to the host. Prefer granting access via group permissions (e.g., run the container with the host docker group GID via `group_add`, or run as root and keep the socket at 660), and avoid widening permissions on the socket.
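A minimal docker-compose sketch of the `group_add` approach suggested above. The GID `999` is an assumption, not a fixed value; use the host's actual docker group GID (e.g. from `stat -c %g /var/run/docker.sock`):

```yaml
services:
  harvester:
    # ...existing image/command/environment config...
    group_add:
      # Host docker group GID -- assumed 999 here; determine the real value with:
      #   stat -c %g /var/run/docker.sock
      - "999"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

This keeps the socket at its default 660 permissions while still letting the unprivileged process connect.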
```python
container_name = f"harvester-worker-{workSpec.workerID}"
wLog.debug(f"running container image={self.containerImage} command={command}")

container = client.containers.run(
    self.containerImage,
    command=command,
    name=container_name,
    detach=True,
    remove=False,
)
```
container_name is deterministic per workerID and remove=False leaves containers behind. If Harvester retries submission or reuses a workerID after restart, containers.run(..., name=...) will fail with a name conflict, and old containers will accumulate. Consider enabling auto-removal (remove=True), generating a unique name (e.g., include a UUID/batchID), and/or adding labels and a sweeper to clean up containers for finished/failed workers.
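One way to address the name-conflict and cleanup concerns, sketched as standalone helpers (`make_worker_container_name` and `make_worker_labels` are illustrative names, not part of the PR):

```python
import uuid


def make_worker_container_name(worker_id):
    """Build a collision-free container name for a Harvester worker.

    The random UUID suffix avoids Docker name conflicts when Harvester
    retries submission or reuses a workerID after a restart.
    """
    return f"harvester-worker-{worker_id}-{uuid.uuid4().hex[:8]}"


def make_worker_labels(worker_id):
    """Labels let a sweeper find stale containers later, e.g.:
    client.containers.list(all=True, filters={"label": f"harvester.workerID={worker_id}"})
    """
    return {"harvester.workerID": str(worker_id)}
```

Passing `labels=make_worker_labels(workSpec.workerID)` to `containers.run(...)` then gives a sweeper a reliable handle even if names change format.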
```python
elif c_status == "exited":
    if exit_code == 0:
        new_status = WorkSpec.ST_finished
        workSpec.nativeExitCode = 0
        workSpec.nativeStatus = "exited(0)"
    else:
        new_status = WorkSpec.ST_failed
        workSpec.nativeExitCode = exit_code
        workSpec.nativeStatus = f"exited({exit_code})"
else:
    new_status = WorkSpec.ST_failed
    workSpec.nativeStatus = c_status

retList.append((new_status, ""))
```
For non-zero exit codes and other failure states you set `nativeStatus`/`nativeExitCode`, but you always append `(new_status, "")`, which drops useful diagnostics. Returning a non-empty message (e.g., include `c_status`/`exit_code`) will make failures debuggable upstream.
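A small helper along these lines would keep successful workers quiet while surfacing failure details (`diag_message` is an illustrative name, not in the PR):

```python
def diag_message(container_status, exit_code):
    """Return a non-empty diagnostic string for failed containers, "" otherwise.

    Intended for the message half of Harvester's (status, message) result
    tuples, so the cause of a failure survives upstream instead of being
    dropped as an empty string.
    """
    if container_status in ("created", "running"):
        return ""
    if container_status == "exited" and exit_code == 0:
        return ""
    return f"container status={container_status} exit_code={exit_code}"
```

The monitor's append then becomes `retList.append((new_status, diag_message(c_status, exit_code)))`.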
```python
if c_status in ("created", "running"):
    new_status = WorkSpec.ST_running
elif c_status == "exited":
    if exit_code == 0:
        new_status = WorkSpec.ST_finished
        workSpec.nativeExitCode = 0
        workSpec.nativeStatus = "exited(0)"
    else:
        new_status = WorkSpec.ST_failed
        workSpec.nativeExitCode = exit_code
        workSpec.nativeStatus = f"exited({exit_code})"
else:
    new_status = WorkSpec.ST_failed
    workSpec.nativeStatus = c_status
```
Docker can report container status as `restarting` (and some daemon versions report `removing`). Those currently fall into the `else` branch and mark the worker failed immediately. Consider treating transient statuses like `restarting` as `ST_running` (or at least not failing hard until the container reaches a terminal state).
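A possible status-mapping shape that treats transient states as still-running. The function and set names are hypothetical, and the returned strings stand in for the `WorkSpec.ST_*` constants so the logic can be shown standalone:

```python
# Statuses that should map to a running worker. "restarting" and "removing"
# are transient Docker states, not terminal failures.
RUNNING_LIKE = {"created", "running", "restarting", "removing"}


def map_container_status(c_status, exit_code):
    """Map a Docker container status to a Harvester-style worker state name."""
    if c_status in RUNNING_LIKE:
        return "running"
    if c_status == "exited":
        return "finished" if exit_code == 0 else "failed"
    # Anything else (e.g. "dead") is treated as a hard failure.
    return "failed"
```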
```python
if args.script is not None:
    script_path = os.path.abspath(args.script)
    if not os.path.isfile(script_path):
        print(f"ERROR: script not found: {script_path}", file=sys.stderr)
        sys.exit(1)
    transformation = "/bin/bash"
    job_parameters = script_path
elif args.transformation is not None or args.params is not None:
    transformation = args.transformation or "sh"
    job_parameters = args.params or ""
```
In --script mode you submit an absolute path from the machine running pandajob-submit as job.jobParameters. With the new DockerSubmitter executing inside a Docker container, that path almost certainly doesn't exist inside the job container, and transformation=/bin/bash also won't exist on the default alpine image. Either clarify in the CLI help that --script requires the script to be present in the worker/container filesystem (and use /bin/sh for alpine), or implement staging/mounting so the script is actually available to the worker.
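If the staging/mounting route is taken, one sketch is to bind-mount the script into the job container and run it with `/bin/sh` (which alpine actually ships). This assumes the submitting host's filesystem is visible to the Docker daemon, which holds for a local compose stack but not for a remote daemon; `build_script_run_kwargs` is a hypothetical helper, not part of the PR:

```python
import os


def build_script_run_kwargs(script_path, image="alpine:latest"):
    """Build kwargs for docker-py's containers.run() that stage a local
    script into the container via a read-only bind mount.
    """
    target = f"/stage/{os.path.basename(script_path)}"
    return {
        "image": image,
        # alpine ships /bin/sh, not /bin/bash
        "command": ["/bin/sh", target],
        "volumes": {os.path.abspath(script_path): {"bind": target, "mode": "ro"}},
    }
```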
```sh
# Install Python docker SDK into the harvester virtualenv
/opt/harvester/bin/pip install -q docker
set +e
```
The container startup script uses `set -e` and then runs `/opt/harvester/bin/pip install -q docker` before `set +e`. If the pip install fails (network/PyPI outage), the whole container will exit immediately despite the later `set +e`. Either keep `set -e` and remove the misleading `set +e`, or explicitly handle the install failure (e.g., retry / fail with a clearer message / make it non-fatal if appropriate). Also consider pinning the docker SDK version to avoid CI breakage from upstream releases.
Suggested change:

```sh
# Install Python docker SDK into the harvester virtualenv.
# Retry transient network/PyPI failures and pin the version for reproducible startup.
for attempt in 1 2 3; do
  if /opt/harvester/bin/pip install -q 'docker==7.1.0'; then
    break
  fi
  if [ "$attempt" -eq 3 ]; then
    echo "ERROR: Failed to install Python package 'docker==7.1.0' after ${attempt} attempts" >&2
    exit 1
  fi
  echo "WARNING: pip install of 'docker==7.1.0' failed on attempt ${attempt}; retrying..." >&2
  sleep 5
done
```
Replace SubprocessSubmitter/SubprocessMonitor with custom DockerSubmitter and DockerMonitor plugins that run PanDA jobs as Docker containers on the host daemon via a bind-mounted socket (/var/run/docker.sock).

Changes:
- config/harvester/plugins/docker_submitter.py: new plugin — starts a detached Alpine container per worker; reads transformation/jobPars from the job spec and builds the container command via shlex.split
- config/harvester/plugins/docker_monitor.py: new plugin — polls container state; maps running→ST_running, exited(0)→ST_finished, else→ST_failed
- docker-compose.yml (harvester): install docker SDK at startup, chmod socket, write PID file for healthcheck; add socket and plugin volume mounts
- config/harvester/panda_queues.cfg: switch to DockerSubmitter/DockerMonitor/BaseMessenger (job completion driven by worker status, no jobReport needed)
- .github/workflows/ci.yml: wait for Harvester healthy before submitting; run alpine sh -c 'echo hello from alpine; uname -a' as integration test; raise job/CI timeout to 30 min

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
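The commit message says the submitter builds the container command via `shlex.split`; a minimal illustration of that step (`build_container_command` is a hypothetical name, and `transformation`/`job_pars` mirror the PanDA job spec fields):

```python
import shlex


def build_container_command(transformation, job_pars):
    """Combine a job's transformation and parameters into a command list
    for the Docker SDK. shlex.split respects shell-style quoting, so a
    quoted argument in jobPars stays a single list element.
    """
    return shlex.split(transformation) + shlex.split(job_pars)
```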
Add --container IMAGE to pandajob-submit and wire it through to DockerSubmitter,
mirroring PanDA's production container_name job spec field.
In production PanDA, jobs carry a container_name attribute that submitter plugins
use to select the container image per-job (e.g. via prun --containerImage). Harvester
surfaces this as job.jobParams["container_name"]. This commit replicates that pattern
in the panda-compose dev stack.
Changes:
- scripts/pandajob-submit: add --container IMAGE optional arg; when provided it sets
job.container_name on the submitted JobSpec
- config/harvester/plugins/docker_submitter.py: extract _resolve_image() helper that
reads job.jobParams.get("container_name"), strips any "docker://" prefix for Docker
SDK compatibility, and falls back to the queue-level containerImage default
- .github/workflows/ci.yml: use --container python:3.12-alpine in the integration test
to demonstrate per-job container selection (different from the queue default alpine:latest)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Adds Docker-in-Docker (DinD) support so Harvester executes real container jobs on a local Alpine image rather than relying on a local submitter that doesn't exist in the image.
Changes
New files
- `config/harvester/plugins/docker_submitter.py` — Harvester submitter plugin that launches detached Alpine containers via the host Docker socket, using `transformation` + `jobPars` from the PanDA job spec as the container command
- `config/harvester/plugins/docker_monitor.py` — polls container status and maps to `ST_running`/`ST_finished`/`ST_failed`

Modified files
- `docker-compose.yml` — harvester service now starts the Harvester daemon (`master.py --foreground`); installs the Python `docker` SDK at startup; mounts `/var/run/docker.sock`; adds PID-based healthcheck
- `config/harvester/panda_queues.cfg` — switches from non-existent `LocalSubmitter`/`LocalMonitor` to `DockerSubmitter`/`DockerMonitor` (`alpine:latest`); switches messenger to `BaseMessenger` so job completion is driven by worker status alone, without requiring a `jobReport.json` file
- `scripts/pandajob-submit` — adds `--transformation`/`--params` flags for self-contained inline commands alongside the existing `--script` mode
- `.github/workflows/ci.yml` — submits an Alpine `sh -c 'echo hello; uname -a'` job to `PANDA_COMPOSE_LOCAL`, waits for Harvester to become healthy, polls up to 10 min for `jobStatus == finished`

Depends on
- PR #1 (`pip-install-user-panda-client`) — panda-client installation fix