nomad left pause-amd64 containers alive if drain_on_shutdown is used #17299
I am using the node drain mechanism:

```hcl
drain_on_shutdown {
  deadline           = "1h"
  force              = false
  ignore_system_jobs = false
}
```

`TimeoutStopSec` in the systemd unit is set to 1h as well. But somehow the init (pause) containers stay alive after every node reboot. I made an ugly workaround with a post-start hook in the Nomad agent systemd unit:

```bash
#!/bin/bash
# Kill any leftover pause containers, matched by "amd64" anywhere in the docker ps output.
CONTAINER_IDS=$(docker ps | grep "amd64")
if [ -n "$CONTAINER_IDS" ]; then
  docker kill $(docker ps | grep "amd64" | awk '{ print $1 }')
fi
```
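A tighter match would filter on the pause image itself rather than grepping the whole `docker ps` output, so unrelated containers that happen to contain "amd64" in their name or image are not killed. A minimal sketch, assuming the infra image is the `pause-amd64` image configured in the Docker plugin config further down:

```bash
#!/bin/bash
# Sketch: kill only containers created from the configured pause (infra) image.
# Assumes infra_image = registry.cloud.private/google_containers/pause-amd64:3.2.
docker ps -q --filter "ancestor=registry.cloud.private/google_containers/pause-amd64:3.2" \
  | xargs -r docker kill
```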
@suikast42 I haven't been able to reproduce what you're seeing. Can you paste more of the client config, in particular what you have for
And then also include your systemd unit file for the Nomad client agent. When you reboot the VM, does that send a signal to the Nomad agent?
So here are my config files.

Systemd unit:

```ini
[Unit]
# When using Nomad with Consul it is not necessary to start Consul first. These
# lines start Consul before Nomad as an optimization to avoid Nomad logging
# that Consul is unavailable at startup.
Description=Nomad
Documentation=https://www.nomadproject.io/docs/
Wants=network-online.target containerd.service docker.service consul.service
After=network-online.target containerd.service docker.service consul.service
[Service]
ExecStartPre=/bin/bash -c '(while ! nc -z -v -w1 consul.service.consul 8501 2>/dev/null; do echo "Waiting for consul.service.consul 8501 to open..."; sleep 1; done); sleep 1'
# Nomad server should be run as the nomad user. Nomad clients
# should be run as root
User=root
Group=root
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/nomad agent -config /etc/nomad.d
# See issue https://github.com/hashicorp/nomad/issues/17299
# See issue https://github.com/suikast42/nomadder/issues/138
ExecStartPre=/etc/nomad.d/nomad_kill_pause_containers.sh
# The Nomad client has drain_on_shutdown enabled, which drains
# the node and marks it as ineligible.
# Make the node eligible again.
ExecStartPost=systemctl restart nomad.eligtion.service
# Use node drain over client config drain_on_shutdown
# Enable this section if you disable the option drain_on_shutdown
#ExecStop=/etc/nomad.d/nomad_node_drain.sh
KillMode=process
KillSignal=SIGINT
LimitNOFILE=65536
LimitNPROC=infinity
Restart=on-failure
RestartSec=2
## Configure unit start rate limiting. Units which are started more than
## *burst* times within an *interval* time span are not permitted to start any
## more. Use `StartLimitIntervalSec` or `StartLimitInterval` (depending on
## systemd version) to configure the checking interval and `StartLimitBurst`
## to configure how many starts per interval are allowed. The values in the
## commented lines are defaults.
# StartLimitBurst = 5
## StartLimitIntervalSec is used for systemd versions >= 230
StartLimitIntervalSec = 10s
# drain_on_shutdown + 30s
TimeoutStopSec=2m30s
## StartLimitInterval is used for systemd versions < 230
# StartLimitInterval = 10s
TasksMax=infinity
# The default systemd configuration for Nomad should set OOMScoreAdjust=-1000
# to prevent the OOM killer from killing the Nomad process.
OOMScoreAdjust=-1000
[Install]
WantedBy=multi-user.target
```

Agent config:

```hcl
log_level = "DEBUG"
name = "worker-01"
datacenter = "nomadder1"
data_dir = "/opt/services/core/nomad/data"
bind_addr = "0.0.0.0" # the default
leave_on_interrupt= true
#https://github.com/hashicorp/nomad/issues/17093
#systemctl kill -s SIGTERM nomad will suppress node drain if
#leave_on_terminate set to false
leave_on_terminate = true
advertise {
# Defaults to the first private IP address.
http = "10.21.21.42"
rpc = "10.21.21.42"
serf = "10.21.21.42"
}
client {
enabled = true
network_interface = "eth1"
meta {
node_type= "worker"
connect.log_level = "debug"
connect.sidecar_image= "registry.cloud.private/envoyproxy/envoy:v1.26.2"
}
server_join {
retry_join = ["10.21.21.41"]
retry_max = 0
retry_interval = "15s"
}
# Either leave_on_interrupt or leave_on_terminate must be set
# for this to take effect.
drain_on_shutdown {
deadline = "2m"
force = false
ignore_system_jobs = false
}
host_volume "ca_cert" {
path = "/usr/local/share/ca-certificates/cloudlocal"
read_only = true
}
host_volume "cert_ingress" {
path = "/etc/opt/certs/ingress"
read_only = true
}
## Cert consul client
## Needed for consul_sd_configs
## Should be deleted after resolve https://github.com/suikast42/nomadder/issues/100
host_volume "cert_consul" {
path = "/etc/opt/certs/consul"
read_only = true
}
## Cert consul client
## Needed for jenkins
## Should be deleted after resolve https://github.com/suikast42/nomadder/issues/100
host_volume "cert_nomad" {
path = "/etc/opt/certs/nomad"
read_only = true
}
## Cert docker client
## Needed for jenkins
## Should be deleted after migrating to vault
host_volume "cert_docker" {
path = "/etc/opt/certs/docker"
read_only = true
}
host_network "public" {
interface = "eth0"
#cidr = "203.0.113.0/24"
#reserved_ports = "22,80"
}
host_network "default" {
interface = "eth1"
}
host_network "private" {
interface = "eth1"
}
host_network "local" {
interface = "lo"
}
reserved {
# cpu (int: 0) - Specifies the amount of CPU to reserve, in MHz.
# cores (int: 0) - Specifies the number of CPU cores to reserve.
# memory (int: 0) - Specifies the amount of memory to reserve, in MB.
# disk (int: 0) - Specifies the amount of disk to reserve, in MB.
# reserved_ports (string: "") - Specifies a comma-separated list of ports to reserve on all fingerprinted network devices. Ranges can be specified by using a hyphen separating the two inclusive ends. See also host_network for reserving ports on specific host networks.
cpu = 1000
memory = 2048
}
max_kill_timeout = "1m"
}
tls {
http = true
rpc = true
ca_file = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
cert_file = "/etc/opt/certs/nomad/nomad.pem"
key_file = "/etc/opt/certs/nomad/nomad-key.pem"
verify_server_hostname = true
verify_https_client = true
}
consul{
ssl= true
address = "127.0.0.1:8501"
grpc_address = "127.0.0.1:8503"
# this works only with ACL enabled
allow_unauthenticated= true
ca_file = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
grpc_ca_file = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
cert_file = "/etc/opt/certs/consul/consul.pem"
key_file = "/etc/opt/certs/consul/consul-key.pem"
}
telemetry {
collection_interval = "1s"
disable_hostname = true
prometheus_metrics = true
publish_allocation_metrics = true
publish_node_metrics = true
}
plugin "docker" {
config {
allow_privileged = false
disable_log_collection = false
# volumes {
# enabled = true
# selinuxlabel = "z"
# }
infra_image = "registry.cloud.private/google_containers/pause-amd64:3.2"
infra_image_pull_timeout ="30m"
extra_labels = ["job_name", "job_id", "task_group_name", "task_name", "namespace", "node_name", "node_id"]
logging {
type = "journald"
config {
labels-regex =".*"
}
}
gc{
container = true
dangling_containers{
enabled = true
# period = "3m"
# creation_grace = "5m"
}
}
}
}
```

How can I check which signal is sent from the OS to the systemd service?
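For reference, one way to check this, assuming the unit is named `nomad` and logs to the journal: inspect the shutdown entries from the previous boot, or trace signal delivery live while triggering a shutdown.

```bash
# Look at the Nomad agent's log from the previous boot for signal/shutdown entries:
journalctl -u nomad -b -1 | grep -iE 'signal|shutdown'
# Or trace signals delivered to the running agent (requires strace):
sudo strace -f -e trace=signal -p "$(systemctl show -p MainPID --value nomad)"
```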
Ah, I was finally able to reproduce this @suikast42, thanks for the extra info. Not sure what the underlying problem is yet, but at least I can investigate now.
This PR (#17455) fixes a bug where the Docker network pause container would not be stopped and removed when a node is restarted, the alloc is moved to another node, and the node comes back up. See the issue below for full repro conditions. Basically, the DestroyNetwork PostRun hook depended on the NetworkIsolationSpec field not being nil, which is only the case if the client stays alive all the way from network creation to network teardown. If the node is rebooted we lose that state and previously could not find the pause container to remove. Now we manually find the pause container by scanning the running containers and looking for the associated allocID. Fixes #17299
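For anyone cleaning up by hand on versions without the fix, the scan the PR describes can be approximated from the shell. A sketch only: it assumes the networking pause container carries the alloc ID in its container name, which should be verified on your Nomad version.

```bash
# Sketch: find and remove the pause container belonging to one allocation,
# matching the alloc ID against the container name.
ALLOC_ID="00000000-1111-2222-3333-444444444444"   # placeholder alloc ID
docker ps -aq --filter "name=${ALLOC_ID}" | xargs -r docker rm -f
```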
I am on Nomad 1.5.6.
Every time I reboot the Ubuntu 22.04 VM, there are pause containers left behind.
These containers do not consume any memory or CPU.
I have GC active in the client config, but it has no effect on this.
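If it helps with the repro, the leftovers can be listed after a reboot with something like the following (assuming the `pause-amd64` infra image from the config above):

```bash
# List leftover pause containers after a reboot:
docker ps --format '{{.ID}}\t{{.Image}}\t{{.Names}}' | grep pause-amd64
```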