Swarm Manager Node no longer in Swarm after reboot #47741

Open
FiretronP75 opened this issue Apr 22, 2024 · 4 comments
Labels
area/swarm, kind/bug, status/0-triage, version/26.0

Comments

@FiretronP75

Description

After the machine reboots, the single node managing the swarm is no longer in the swarm, which leaves both the node and the swarm as inaccessible orphans. I reproduced this a few times on fresh local Ubuntu servers.

Reproduce

  1. run a container that uses swarm
  2. verify everything is working correctly
  3. reboot machine

Expected behavior

The node is still in the swarm and is still a manager.
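
A minimal sketch of how the expected state could be checked before and after the reboot. The function name `check_swarm_state` is hypothetical; it only classifies the two values that `docker info --format '{{.Swarm.LocalNodeState}} {{.Swarm.ControlAvailable}}'` prints, and it assumes a single-node swarm as in this report.

```shell
# Hypothetical helper: classify the values printed by
#   docker info --format '{{.Swarm.LocalNodeState}} {{.Swarm.ControlAvailable}}'
check_swarm_state() {
  state="$1"      # e.g. "active" or "inactive"
  manager="$2"    # "true" when this node is a swarm manager
  if [ "$state" = "active" ] && [ "$manager" = "true" ]; then
    echo "ok: node is an active swarm manager"
  else
    echo "problem: swarm state is '$state' (manager: $manager)"
    return 1
  fi
}
```

Running `check_swarm_state $(docker info --format '{{.Swarm.LocalNodeState}} {{.Swarm.ControlAvailable}}')` once before the reboot and once after should make the change visible. The `docker info` output further down reports `Swarm: inactive`, which this check would flag as a problem.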

docker version

Client: Docker Engine - Community
 Version:           26.0.2
 API version:       1.43 (downgraded from 1.45)
 Go version:        go1.21.9
 Git commit:        3c863ff
 Built:             Thu Apr 18 16:27:07 2024
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.8
  Git commit:       a61e2b4
  Built:            Sat Oct  7 00:14:30 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
 runc:
  Version:          1.1.7
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
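
Note that the output above shows client 26.0.2 talking to server 24.0.5. A version difference is not an error by itself, but on a host where only one Docker package was knowingly installed it may hint at a second installation. A small sketch for spotting this, assuming the hypothetical helper name `compare_versions` and the two values printed by `docker version --format '{{.Client.Version}} {{.Server.Version}}'`:

```shell
# Hypothetical helper: compare the two values printed by
#   docker version --format '{{.Client.Version}} {{.Server.Version}}'
compare_versions() {
  client="$1"
  server="$2"
  if [ "$client" = "$server" ]; then
    echo "match: client and server are both $client"
  else
    echo "mismatch: client $client, server $server"
  fi
}
```

For example, `compare_versions $(docker version --format '{{.Client.Version}} {{.Server.Version}}')` prints a single line that can be pasted into a report.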

docker info

Client: Docker Engine - Community
 Version:    26.0.2
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.26.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 5
  Running: 0
  Paused: 0
  Stopped: 5
 Images: 4
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: 
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-105-generic
 Operating System: Ubuntu Core 22
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 62.51GiB
 Name: XXXXXX
 ID: 99e4fe8c-e96f-47c6-a0cd-74b0c56dba4f
 Docker Root Dir: /var/snap/docker/common/var-lib-docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

I do not see this issue on my remote servers, only on local physical machines (not VMs). Each was a fresh install of Ubuntu Server, with Docker Engine installed according to the docs, not from the package manager.

FiretronP75 added the kind/bug and status/0-triage labels Apr 22, 2024
@thaJeztah
Member

This part stands out to me; it looks like a location used by Canonical's snap packages:

Docker Root Dir: /var/snap/docker/common/var-lib-docker

Is that something you configured, or did you have the snap packages installed at some point, with some of their config left behind?
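
As a rough sketch of the check described above: the helper name `root_dir_origin` is hypothetical, and treating any path under `/var/snap/` as snap-managed is an assumption based on the path quoted in this thread.

```shell
# Hypothetical helper: guess where a Docker root dir comes from.
# Assumption: paths under /var/snap/ belong to a snap-packaged daemon.
root_dir_origin() {
  case "$1" in
    /var/snap/*) echo "snap" ;;
    *)           echo "other" ;;
  esac
}
```

It could be fed the live value with `root_dir_origin "$(docker info --format '{{.DockerRootDir}}')"`. For the path reported in this issue it prints `snap`.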

@FiretronP75
Author

FiretronP75 commented May 1, 2024

When I installed Ubuntu Server, it had a part where you select and deselect popular packages, and that installed docker. I assumed it did it the wrong way so I followed the instructions for completely removing it. Maybe the complete removal instructions don't 100% do it.

To be specific, I ran this command from https://docs.docker.com/engine/install/ubuntu/:

    for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done

@thaJeztah
Member

Hm, right 🤔

I guess those steps don't take snaps into account; what does snap list show? Wondering if docker still shows up as installed there.
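
A small sketch of that check, assuming the hypothetical helper name `docker_snap_version` and the usual column layout of `snap list` (name first, version second):

```shell
# Hypothetical helper: read `snap list` output on stdin and print the
# version column of a snap named exactly "docker", if one is listed.
# Exits non-zero when no such snap is found.
docker_snap_version() {
  awk '$1 == "docker" { print $2; found = 1 } END { exit !found }'
}
```

Used as `snap list | docker_snap_version`, it prints a version only when a Docker snap is still installed. In that case `sudo snap remove docker` is the usual way to uninstall it; it is worth checking what data lives under the snap's Docker root dir first.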

@FiretronP75
Author

FiretronP75 commented May 1, 2024

Sigh...

    docker 24.0.5 2915 latest/stable canonical✓ -

So, I guess add the snap removal command to the install docs?

I suppose this has been discussed at length already, but I'm wondering why they want to keep pushing a version on people that doesn't work.
