Problem
When OSA is deployed using the published ghcr.io/opensciencearchive/osa image (which ends in USER appuser), spawning ingester/validator containers fails:
[i.o.ingester_runner] Pulling ingester image: osa-hooks-ingesters/cultivarium:latest
[infra.event.worker] Cannot connect to Docker Engine via unix:///run/docker.sock ssl:default [Permission denied]
The host's /var/run/docker.sock is root:docker 0660. appuser isn't in a group with access, so the call fails with EACCES.
This is invisible in up-dev and the pockets pilot because both build with target: builder, an earlier stage of the multi-stage build that stops before the USER appuser directive. The container therefore runs as root, and socket access works.
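The failure above follows ordinary Unix mode-bit rules, which can be sketched in a few lines. This is an illustrative approximation only (it ignores ACLs and Linux capabilities), and the UID/GID numbers in the usage note are hypothetical, not taken from the OSA image.

```python
import os
import stat


def can_read_write(mode, owner_uid, owner_gid, uid, gids):
    """Approximate the owner/group/other check for read+write access."""
    if uid == 0:
        return True  # root bypasses the mode bits entirely
    if uid == owner_uid:
        wanted = stat.S_IRUSR | stat.S_IWUSR
    elif owner_gid in gids:
        wanted = stat.S_IRGRP | stat.S_IWGRP
    else:
        wanted = stat.S_IROTH | stat.S_IWOTH
    return (mode & wanted) == wanted


def check_socket(path="/var/run/docker.sock"):
    """Apply the same check to the current process and a real socket path."""
    st = os.stat(path)
    gids = set(os.getgroups()) | {os.getegid()}
    return can_read_write(
        stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid, os.geteuid(), gids
    )
```

With a root:docker 0660 socket (say owner UID 0, group GID 998), a user with UID 1000 and groups {1000} is denied, while root, or a user whose groups include 998, is allowed. That matches the prod/dev split described above.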
Why not just fix the user/group
A GID-detection entrypoint script would work: detect the socket's group, create a group inside the container with that GID, add appuser to it, then drop privileges. But it papers over a real problem: OSA's container has direct access to the host's Docker daemon, which is functionally equivalent to host root. The "fix" would leave a non-root user inside the container holding a root-equivalent capability.
Proposed solution: socket proxy
Add a docker-socket-proxy service (e.g. tecnativa/docker-socket-proxy) to the default deploy/docker-compose.yml. OSA talks to the proxy over TCP (DOCKER_HOST=tcp://docker-socket-proxy:2375) instead of mounting the host socket.
Benefits:
- Fixes the permission bug — OSA never touches the socket file, so GID issues disappear.
- Closes the dev/prod divergence — dev uses the proxy too, so "only works because dev is root" stops being a class of bug.
- Documents the Docker API surface OSA actually needs (auditable, mirrors the K8s runner permission model).
- Self-host-friendly: still a single docker-compose up, just one extra small service. No K8s required.
Implementation sketch
- Add to deploy/docker-compose.yml:

      docker-socket-proxy:
        image: tecnativa/docker-socket-proxy:latest
        restart: unless-stopped
        environment:
          CONTAINERS: 1  # allow the /containers endpoints
          IMAGES: 1      # allow the /images endpoints (pulls)
          POST: 1        # allow write requests; the proxy is read-only otherwise
          VOLUMES: 1     # if ingesters mount volumes
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
        networks: [osa-internal]

- Set DOCKER_HOST=tcp://docker-socket-proxy:2375 in OSA's service env.
- Remove the host socket mount from the OSA service.
- Verify aiodocker in the Docker runner respects DOCKER_HOST (it does by default).
- Apply the same change to deploy/docker-compose.dev.yml to close the dev/prod gap.
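A quick way to confirm the wiring from inside the OSA container is to ping the proxy over TCP. The sketch below is a standard-library smoke test, not OSA code and not a substitute for checking aiodocker itself; the function names are hypothetical. It assumes DOCKER_HOST has the tcp://host:port form used above and that the proxy leaves the Docker Engine API's /_ping endpoint enabled.

```python
import http.client
import os
from urllib.parse import urlsplit


def docker_endpoint(env=None):
    """Return (host, port) parsed from a tcp:// DOCKER_HOST value."""
    env = os.environ if env is None else env
    value = env.get("DOCKER_HOST", "")
    parts = urlsplit(value)
    if parts.scheme != "tcp" or not parts.hostname:
        raise ValueError(f"expected tcp://host:port in DOCKER_HOST, got {value!r}")
    return parts.hostname, parts.port or 2375


def ping(host, port, timeout=5.0):
    """GET /_ping on the Docker API; return (HTTP status, response body)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/_ping")
        resp = conn.getresponse()
        return resp.status, resp.read().decode()
    finally:
        conn.close()
```

Calling ping(*docker_endpoint()) inside the OSA container should return a 200 status if the proxy is reachable. A ValueError from docker_endpoint means DOCKER_HOST is unset or still points at a unix socket.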
Out of scope
- The Kubernetes runner — already a separate path, no changes needed.
- Rootless Docker / Podman support — interesting future work, not blocking.
- The target: builder shortcut in dev: should be revisited separately; ideally dev also ends on USER appuser so prod-only bugs stop existing.
Acceptance
- [...] appuser and successfully spawns ingester containers via the proxy.
- docker-compose up works end-to-end with no extra host setup.
- [...] (docker-compose.dev.yml) uses the same proxy path.
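Part of these criteria could be checked statically against the compose files. The following is a rough sketch under stated assumptions: the compose file has already been parsed into a dict by whatever means, and the OSA service is named osa (a hypothetical name; the real service name may differ). It only confirms that the service no longer mounts the host socket and points DOCKER_HOST at a TCP endpoint. It does not prove that containers actually spawn.

```python
def proxy_path_violations(compose, service="osa"):
    """List reasons the given service is not using the socket-proxy path."""
    svc = compose.get("services", {}).get(service)
    if svc is None:
        return [f"service {service!r} not found"]

    problems = []
    for vol in svc.get("volumes", []):
        # Compose allows short ("src:dst[:mode]") and long (mapping) syntax.
        source = vol.get("source", "") if isinstance(vol, dict) else str(vol).split(":")[0]
        if source.endswith("docker.sock"):
            problems.append(f"host socket still mounted: {source}")

    env = svc.get("environment", {})
    if isinstance(env, list):  # list form: ["KEY=value", ...]
        env = dict(item.split("=", 1) for item in env if "=" in item)
    if not str(env.get("DOCKER_HOST", "")).startswith("tcp://"):
        problems.append("DOCKER_HOST is not a tcp:// endpoint")

    return problems
```

Running the same function over both the default and the dev compose file would give a cheap regression check for the dev/prod gap. An empty list means both static conditions hold.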