@mhucka mhucka commented Aug 10, 2025

Docker builds on GitHub are failing because the Docker build tries to pull the qsim image from docker.io:

#1 [internal] load local bake definitions
#1 reading from stdin 935B done
#1 DONE 0.0s

#2 [qsim internal] load build definition from Dockerfile
#2 transferring dockerfile: 1.34kB done
#2 DONE 0.0s

#3 [qsim-py-tests internal] load build definition from Dockerfile
#3 transferring dockerfile: 381B done
#3 DONE 0.0s

#4 [qsim-cxx-tests internal] load build definition from Dockerfile
#4 transferring dockerfile: 208B done
#4 DONE 0.0s

#5 [qsim-cxx-tests internal] load metadata for #docker.io/library/qsim:latest

Apparently, this happens because the buildx plugin, which is the default builder in many modern Docker setups, can build images in parallel and uses an isolated builder environment. An image built for one service (like qsim) might not be immediately loaded into the local Docker daemon's image store. When the build for a dependent service (like qsim-cxx-tests) starts, it can't find the qsim base image locally and falls back to searching Docker Hub.
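
To make the failure mode concrete, here is a minimal sketch of a compose layout with this kind of dependency. The service names follow the log above; the build paths, and the assumption that the test Dockerfile starts with `FROM qsim`, are illustrative guesses rather than the repository's actual files.

```yaml
# Hypothetical docker-compose.yml layout (paths and tags are assumptions).
services:
  qsim:
    image: qsim
    build:
      context: .
  qsim-cxx-tests:
    image: qsim-cxx-tests
    # Assumed: this service's Dockerfile begins with "FROM qsim", so its
    # build needs the image produced by the "qsim" service above.
    build:
      context: ./tests        # hypothetical path
    depends_on:
      - qsim                  # orders container startup, not image builds
```

In a layout like this, the link between the two builds lives only in the Dockerfile's `FROM` line, which would explain why a parallel builder can start `qsim-cxx-tests` before a local `qsim` image exists.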

This happens even though a plain docker compose build works as expected locally (at least in my Debian Linux environment).

After a lot of trial and error, the only solution I could find is to split the docker compose build … into two steps, the first of which builds the qsim base image separately:

- name: Build Docker images
  run: |
    docker compose build qsim-base-image
    docker compose build qsim-cxx-tests-image qsim-py-tests-image
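
A variant of the step above can also confirm that the base image actually reached the local image store before the dependent builds start. This is only a sketch: the `BASE_IMAGE` variable and its value `qsim` are assumptions, since the tag produced by `qsim-base-image` is not stated here.

```yaml
- name: Build Docker images
  env:
    BASE_IMAGE: qsim   # assumed tag; set to whatever the base service produces
  run: |
    docker compose build qsim-base-image
    # Exits non-zero (stopping the step) if the image is not in the local store.
    docker image inspect "$BASE_IMAGE" > /dev/null
    docker compose build qsim-cxx-tests-image qsim-py-tests-image
```

If the check fails, the problem is in loading the base image rather than in the dependent builds, which narrows down where to look.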

While debugging this, I also found it useful to make some housekeeping changes to the Dockerfile and docker-compose.yml files, in particular renaming jobs and images to distinguish the different things being run or built. (Having several things all named qsim was not helpful.)

For the record, here is what else I tried that didn't work:

  • What I thought would be the obvious solution (i.e., using the command-line argument --parallel 1 or setting the env var COMPOSE_PARALLEL_LIMIT to tell buildx not to build in parallel) did not work on GitHub. Since I can't reproduce the problem locally, I can't be sure if that would have solved it anyway.

  • Trying to use the buildx bake command first, like this:

    docker buildx bake --load --progress=plain -f docker-compose.yml
    docker compose build

  • Trying to clear the Docker cache.
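
For reference, a parallelism limit like the one in the first item is typically set as a step-level environment variable. The exact form used in the failed attempt is not recorded here, so the step below is an assumed reconstruction, and per the notes above it did not fix the problem on GitHub.

```yaml
- name: Build Docker images (serial attempt)
  env:
    COMPOSE_PARALLEL_LIMIT: "1"   # ask Compose to run one operation at a time
  run: docker compose build
```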

While `depends_on` controls the runtime start order of containers, the
build-time dependency inference can be tricky in this parallel execution
environment.

One way to fix this is to explicitly tell `buildx` which platform to
build for. This forces buildx to build a single-platform image and load
its layers into the local image store, making it available to subsequent
build steps.
@mhucka mhucka added the area/devops Involves build systems, Make files, Bazel files, continuous integration, and/or other DevOps topics label Aug 10, 2025
@github-actions github-actions bot added the Size: XS <10 lines changed label Aug 10, 2025
mhucka added 4 commits August 10, 2025 03:24
This should explicitly instruct buildx to export the finished build as a
standard image in the local Docker daemon.
Turns out it's not valid.
It gets hard to know what is happening when everything is called "qsim".
@github-actions github-actions bot added size: S 10< lines changed <50 and removed Size: XS <10 lines changed labels Aug 10, 2025
@mhucka mhucka closed this Aug 10, 2025