Docker compose won't attach to network when monitoring is enabled during install #16
- docker_health_wait() - accounts for the container exiting early
- docker_up() - docker_cleanup_legacy_network() no longer hides the error
returned by the docker-compose up command (the cleanup call was just moved
above the docker-compose up)
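
As a rough illustration of the first bullet, a health wait that stops polling as soon as the container has exited could look like the sketch below. This is not the PR's `docker_health_wait()`: the function name `wait_for_healthy`, its parameters, and its return codes are all hypothetical, and it assumes the container defines a healthcheck.

```shell
# Hypothetical sketch, not the PR's docker_health_wait().
# Returns 0 when healthy, 1 on timeout, 2 when the container exited or is gone.
wait_for_healthy() {
  local container="$1" timeout="${2:-120}" interval="${3:-2}"
  local waited=0 status
  while [ "$waited" -lt "$timeout" ]; do
    # Prints e.g. "running healthy", "running starting" or "exited unhealthy".
    status="$(docker inspect \
      --format '{{.State.Status}} {{if .State.Health}}{{.State.Health.Status}}{{end}}' \
      "$container" 2>/dev/null)" || return 2
    case "$status" in
      exited*|dead*) return 2 ;;        # bail out early instead of waiting for the timeout
      "running healthy") return 0 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}
```

A caller can then tell "still starting after the timeout" apart from "the container already died", for example `wait_for_healthy some-container 120 2 || echo "wait failed with status $?"`, where `some-container` is a placeholder name.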
…ually and docker-compose for node later won't attach (since it's missing compose labels)
…vive the update
Installs that hit the pre-fix network race created a 'logosnode-net' via a
bare `docker network create` (no compose labels), with the monitoring
containers already attached. Once both stacks declare the network with
`name: logosnode-net`, compose refuses to adopt that unlabeled network
("found but has incorrect label") and `start` breaks — and because the
network isn't orphaned, the legacy-network cleanup can't remove it.
Add docker_repair_unmanaged_network(): if a 'logosnode-net' exists without
compose's label, bring both stacks down to detach, drop it, and let the
normal bring-up recreate it labeled. Wired into both docker_up() and
monitoring_up() so it runs whichever stack starts first (start, install,
update, monitor start). No-op on healthy/compose-managed installs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
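
The repair described in this commit message can be sketched roughly as follows. This is not the actual `docker_repair_unmanaged_network()`: the helper names `network_state` and `repair_unmanaged_network` and the compose-file arguments are hypothetical, and the sketch assumes the `docker compose` plugin and that a compose-managed network carries the `com.docker.compose.network` label.

```shell
# Hypothetical sketch, not the PR's docker_repair_unmanaged_network().

# Prints "missing", "managed" or "unmanaged" for the given network name.
network_state() {
  local name="$1" label
  if ! label="$(docker network inspect \
        --format '{{ index .Labels "com.docker.compose.network" }}' \
        "$name" 2>/dev/null)"; then
    echo "missing"       # inspect failed: no such network
  elif [ -n "$label" ]; then
    echo "managed"       # compose's label is present
  else
    echo "unmanaged"     # created by a bare `docker network create`
  fi
}

# Usage: repair_unmanaged_network <network> <compose-file>...
# No-op unless the network exists without compose's label.
repair_unmanaged_network() {
  local name="$1" file
  shift
  [ "$(network_state "$name")" = "unmanaged" ] || return 0
  for file in "$@"; do
    # Bring each stack down so its containers detach from the network.
    docker compose -f "$file" down || return 1
  done
  # Drop the unlabeled network; the next compose up recreates it with labels.
  docker network rm "$name"
}
```

Calling something like `repair_unmanaged_network logosnode-net node.yml monitoring.yml` (file names are placeholders) before either stack starts would keep the check independent of which stack comes up first.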
Pushed one follow-up commit ( The forward path here is solid — once both stacks use
I left the crash-loop detection alone (a |
The problem is this code. When the user decides to enable monitoring during an installation, the monitoring stack is run before the node stack. This means the network doesn't exist yet, and this code creates the network manually.

But when docker-compose later tries to bring up the node, it won't attach to the network because the network is not managed by docker-compose (it doesn't have the compose labels). It gets refused and fails. The second commit basically just lets compose always decide whether to create or attach to the network, so the network will always be compose-managed. (Another way to go would be to use `external` for both networks, but that would mean creating the network in several different places in the code, which I like less.)

The first commit basically only makes sure that the errors that might happen are surfaced correctly. Originally, when I tried this on `main`, I would only get the 120s timeout for spinning up docker compose and no specific error: `docker_up()` returned the status of its last call (the legacy cleanup) and ignored the previous errors.

Tested this by installing both with the monitoring stack enabled during install and without it (enabling the monitoring stack later).
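
The masking effect described above comes from the fact that a shell function returns the exit status of its last command. A minimal sketch, using hypothetical stand-ins (`compose_up`, `cleanup_legacy_network`) rather than the real functions from this repo:

```shell
# Hypothetical stand-ins for the real steps.
compose_up() { return 1; }               # pretend `docker-compose up` failed
cleanup_legacy_network() { return 0; }   # pretend the cleanup succeeded

# Before: the cleanup runs last, so its status (0) hides the failed bring-up.
docker_up_masked() {
  compose_up
  cleanup_legacy_network
}

# After: the cleanup runs first, so the function returns the bring-up's status.
docker_up_surfaced() {
  cleanup_legacy_network
  compose_up
}
```

With these stand-ins, `docker_up_masked` reports success even though the bring-up failed, while `docker_up_surfaced` returns the failure to the caller.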