nomad left pause-amd64 containers alive if drain_on_shutdown is used #17299

Closed
suikast42 opened this issue May 23, 2023 · 5 comments · Fixed by #17455
Assignees
Labels
stage/accepted Confirmed, and intend to work on. No timeline commitment though. stage/needs-investigation theme/driver/docker type/bug

Comments

@suikast42
Contributor

I am on Nomad 1.5.6.

Every time I reboot the Ubuntu 22.04 VM, there are pause containers left behind.

These containers do not consume any memory or CPU.

I have GC enabled in the client config, but it has no effect:

plugin "docker" {
  config {
    allow_privileged = false
    disable_log_collection  = false
#    volumes {
#      enabled = true
#      selinuxlabel = "z"
#    }
    infra_image = "{{nomad_infra_image}}"
    infra_image_pull_timeout ="30m"
    extra_labels = ["job_name", "job_id", "task_group_name", "task_name", "namespace", "node_name", "node_id"]
    logging {
      type = "journald"
       config {
          labels-regex =".*"
       }
    }
    gc{
      container = true
      dangling_containers{
        enabled = true
      # period = "3m"
      # creation_grace = "5m"
      }
    }

  }
}
CONTAINER ID   IMAGE                                                      COMMAND                  CREATED             STATUS         PORTS     NAMES
8bbbdd9bd7cb   registry.cloud.private/suikast42/logunifier:0.1.1          "/logunifier -config…"   5 minutes ago       Up 5 minutes             logunifier-880a05ce-2dfe-ac5f-a3eb-c0fdcde6bcb6
229e3975f7c3   prom/blackbox-exporter:v0.24.0                             "/bin/blackbox_expor…"   6 minutes ago       Up 6 minutes             blackbox-task-9a8cf574-b7f5-23fc-ae46-44de47ae6e1c
a1ced0db67fe   10.21.21.41:5000/traefik:v2.10.1                           "/entrypoint.sh trae…"   6 minutes ago       Up 6 minutes             traefik-54665384-005c-a298-9ebd-6c4f110cbdee
ac00bc7b5962   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 6 minutes ago       Up 6 minutes             nomad_init_880a05ce-2dfe-ac5f-a3eb-c0fdcde6bcb6
a41df1e4a6ea   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 6 minutes ago       Up 6 minutes             nomad_init_9a8cf574-b7f5-23fc-ae46-44de47ae6e1c
581cb58b77a4   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 About an hour ago   Up 7 minutes             nomad_init_c1f6660d-bee3-bcea-1a52-43f8307bad07
63cb098bf749   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 14 hours ago        Up 7 minutes             nomad_init_e84b912c-50e6-ecec-9b2a-85e44ce4825f
e562b1afdf58   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 24 hours ago        Up 7 minutes             nomad_init_6a754458-08cf-0326-6b68-7bfecd44b95a
da3819a02e31   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 24 hours ago        Up 7 minutes             nomad_init_501de7f0-bd13-78b7-8967-02b91e62764d
bc4f4df0d861   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 31 hours ago        Up 7 minutes             nomad_init_cb2ace0d-4eac-45fa-17a8-902193f581f7
058219c80f51   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 31 hours ago        Up 7 minutes             nomad_init_8fdf7135-cd79-2175-5512-0cdfc6698806
9bdb7cf5373f   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 32 hours ago        Up 7 minutes             nomad_init_2537b847-5ce5-f5cf-cde8-613344206a87
1dfc05610d67   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 32 hours ago        Up 7 minutes             nomad_init_82cf76dc-37fc-8226-6df8-7a98da3238ea
aadc899cdadc   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 3 days ago          Up 7 minutes             nomad_init_64d45e1d-2e71-a531-2823-073264ab8c91
7c270ed72430   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 5 days ago          Up 7 minutes             nomad_init_b5f25ba3-eece-6a12-ad3e-4a5a6ac1ddcf
cefd98ab6156   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 5 days ago          Up 7 minutes             nomad_init_7a274784-170d-a3a2-6a09-117f8b4aa51d
1a9d6c5b2291   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 6 days ago          Up 7 minutes             nomad_init_733ca4a4-a8b0-c21e-25d6-494f3811e94b
9dc5621ed2f6   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 7 days ago          Up 7 minutes             nomad_init_ddf2525a-9659-1160-6813-0f835a109f38
60cae637f900   registry.cloud.private/google_containers/pause-amd64:3.2   "/pause"                 10 days ago         Up 7 minutes             nomad_init_0c710e44-1916-6b69-f3c2-d4bb6ec92ba9
@suikast42
Contributor Author

That's strange. The pause container gets a termination signal after every reboot but stays alive.

[screenshot]

@suikast42 suikast42 changed the title from "nomad left pause-amd64 containers alive" to "nomad left pause-amd64 containers alive if drain_on_shutdown is used" May 28, 2023
@suikast42
Contributor Author

I am using the node drain mechanism:

  drain_on_shutdown {
    deadline           = "1h"
    force              = false
    ignore_system_jobs = false
  }

The TimeoutStopSec in the systemd unit is set to 1h as well.

But somehow the init containers stay alive after every node reboot.

I made an ugly workaround with a post-start step in the Nomad agent systemd unit:

#!/bin/bash
# Kill any pause containers that were left behind after a reboot.
CONTAINER_IDS=$(docker ps | grep "amd64" | awk '{ print $1 }')
if [ -n "$CONTAINER_IDS" ]; then
    docker kill $CONTAINER_IDS
fi
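
A somewhat more targeted variant (a sketch, not part of the original workaround) would filter on the nomad_init_ name prefix that the pause containers carry in the docker ps output above, instead of grepping for "amd64":

#!/bin/bash
# Hypothetical alternative: match pause containers by their
# nomad_init_<alloc ID> names rather than by the image tag.
IDS=$(docker ps -q --filter "name=nomad_init_")
if [ -n "$IDS" ]; then
    docker kill $IDS
fi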

@shoenig shoenig self-assigned this Jun 2, 2023
@shoenig shoenig added the theme/driver/docker and stage/accepted labels Jun 2, 2023
@shoenig shoenig added this to Needs Triage in Nomad - Community Issues Triage via automation Jun 2, 2023
@shoenig
Member

shoenig commented Jun 5, 2023

@suikast42 I haven't been able to reproduce what you're seeing. Can you paste more of the Client spec, in particular what you have for

leave_on_terminate = true
leave_on_interrupt = true

And then also include your systemd unit file for the Nomad Client agent.

When you reboot the VM, is that sending a signal to the Nomad agent?

@suikast42
Contributor Author

So here are my config files.

Systemd unit
[Unit]
# When using Nomad with Consul it is not necessary to start Consul first. These
# lines start Consul before Nomad as an optimization to avoid Nomad logging
# that Consul is unavailable at startup.
Description=Nomad
Documentation=https://www.nomadproject.io/docs/
Wants=network-online.target,containerd.service,docker.service,consul.service
After=network-online.target,containerd.service,docker.service,consul.service



[Service]
ExecStartPre=/bin/bash -c '(while ! nc -z -v -w1 consul.service.consul 8501 2>/dev/null; do echo "Waiting for consul.service.consul 8501 to open..."; sleep 1; done); sleep 1'

# Nomad server should be run as the nomad user. Nomad clients
# should be run as root
User=root

Group=root



ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/nomad agent -config /etc/nomad.d
# See issue https://github.com/hashicorp/nomad/issues/17299
# See issue https://github.com/suikast42/nomadder/issues/138
ExecStartPre=/etc/nomad.d/nomad_kill_pause_containers.sh
# The Nomad client has drain_on_shutdown enabled.
# This drains the node and marks it as ineligible.
# Make the node eligible again.
ExecStartPost=systemctl restart nomad.eligtion.service
# Use node drain over client config drain_on_shutdown
# Enable this section if you disable the option drain_on_shutdown
#ExecStop=/etc/nomad.d/nomad_node_drain.sh

KillMode=process
KillSignal=SIGINT
LimitNOFILE=65536
LimitNPROC=infinity
Restart=on-failure
RestartSec=2

## Configure unit start rate limiting. Units which are started more than
## *burst* times within an *interval* time span are not permitted to start any
## more. Use `StartLimitIntervalSec` or `StartLimitInterval` (depending on
## systemd version) to configure the checking interval and `StartLimitBurst`
## to configure how many starts per interval are allowed. The values in the
## commented lines are defaults.

# StartLimitBurst = 5

## StartLimitIntervalSec is used for systemd versions >= 230
StartLimitIntervalSec = 10s

# drain_on_shutdown +  30s
TimeoutStopSec=2m30s
## StartLimitInterval is used for systemd versions < 230
# StartLimitInterval = 10s

TasksMax=infinity
#The default systemd configuration for Nomad should set OOMScoreAdjust=-1000 to avoid OOMing the Nomad process.
OOMScoreAdjust=-1000

[Install]
WantedBy=multi-user.target
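
Since TimeoutStopSec is intended to be the drain_on_shutdown deadline plus a 30s buffer, the value systemd actually enforces can be double-checked (a sketch using standard systemctl properties, not part of the original post, assuming the unit is installed as nomad.service):

# Show the effective stop timeout systemd applies to the unit
systemctl show nomad.service --property=TimeoutStopUSec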
Nomad agent config
log_level = "DEBUG"
name = "worker-01"
datacenter = "nomadder1"
data_dir =  "/opt/services/core/nomad/data"
bind_addr = "0.0.0.0" # the default

leave_on_interrupt= true
#https://github.com/hashicorp/nomad/issues/17093
#systemctl kill -s SIGTERM nomad will suppress node drain if
#leave_on_terminate set to false
leave_on_terminate = true

advertise {
  # Defaults to the first private IP address.
  http = "10.21.21.42"
  rpc  = "10.21.21.42"
  serf = "10.21.21.42"
}
client {
  enabled = true
  network_interface = "eth1"
  meta {
    node_type= "worker"
    connect.log_level = "debug"
    connect.sidecar_image= "registry.cloud.private/envoyproxy/envoy:v1.26.2"
  }
  server_join {
    retry_join =  ["10.21.21.41"]
    retry_max = 0
    retry_interval = "15s"
  }
  # Either leave_on_interrupt or leave_on_terminate must be set
  # for this to take effect.
  drain_on_shutdown {
    deadline           = "2m"
    force              = false
    ignore_system_jobs = false
  }
  host_volume "ca_cert" {
    path      = "/usr/local/share/ca-certificates/cloudlocal"
    read_only = true
  }
  host_volume "cert_ingress" {
    path      = "/etc/opt/certs/ingress"
    read_only = true
  }
  ## Cert consul client
  ## Needed for consul_sd_configs
  ## Should be deleted after resolve https://github.com/suikast42/nomadder/issues/100
  host_volume "cert_consul" {
    path      = "/etc/opt/certs/consul"
    read_only = true
  }

  ## Cert consul client
  ## Needed for jenkins
  ## Should be deleted after resolve https://github.com/suikast42/nomadder/issues/100
  host_volume "cert_nomad" {
    path      = "/etc/opt/certs/nomad"
    read_only = true
  }

  ## Cert docker client
  ## Needed for jenkins
  ## Should be deleted after migrating to vault
  host_volume "cert_docker" {
    path      = "/etc/opt/certs/docker"
    read_only = true
  }

  host_network "public" {
    interface = "eth0"
    #cidr = "203.0.113.0/24"
    #reserved_ports = "22,80"
  }
  host_network "default" {
      interface = "eth1"
  }
  host_network "private" {
    interface = "eth1"
  }
  host_network "local" {
    interface = "lo"
  }

  reserved {
  # cpu (int: 0) - Specifies the amount of CPU to reserve, in MHz.
  # cores (int: 0) - Specifies the number of CPU cores to reserve.
  # memory (int: 0) - Specifies the amount of memory to reserve, in MB.
  # disk (int: 0) - Specifies the amount of disk to reserve, in MB.
  # reserved_ports (string: "") - Specifies a comma-separated list of ports to reserve on all fingerprinted network devices. Ranges can be specified by using a hyphen separating the two inclusive ends. See also host_network for reserving ports on specific host networks.
    cpu    = 1000
    memory = 2048
  }
  max_kill_timeout  = "1m"
}

tls {
  http = true
  rpc  = true

  ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  cert_file = "/etc/opt/certs/nomad/nomad.pem"
  key_file  = "/etc/opt/certs/nomad/nomad-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}

consul{
  ssl= true
  address = "127.0.0.1:8501"
  grpc_address = "127.0.0.1:8503"
  # this works only with ACL enabled
  allow_unauthenticated= true
  ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  grpc_ca_file   = "/usr/local/share/ca-certificates/cloudlocal/cluster-ca-bundle.pem"
  cert_file = "/etc/opt/certs/consul/consul.pem"
  key_file  = "/etc/opt/certs/consul/consul-key.pem"
}


telemetry {
  collection_interval = "1s"
  disable_hostname = true
  prometheus_metrics = true
  publish_allocation_metrics = true
  publish_node_metrics = true
}

plugin "docker" {
  config {
    allow_privileged = false
    disable_log_collection  = false
#    volumes {
#      enabled = true
#      selinuxlabel = "z"
#    }
    infra_image = "registry.cloud.private/google_containers/pause-amd64:3.2"
    infra_image_pull_timeout ="30m"
    extra_labels = ["job_name", "job_id", "task_group_name", "task_name", "namespace", "node_name", "node_id"]
    logging {
      type = "journald"
       config {
          labels-regex =".*"
       }
    }
    gc{
      container = true
      dangling_containers{
        enabled = true
      # period = "3m"
      # creation_grace = "5m"
      }
    }

  }
}

How can I check which signal is sent from the OS to the systemd service?
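
One way to check (a sketch using standard systemd tooling, not from the original thread, assuming the unit is installed as nomad.service):

# Show the stop signal and kill mode systemd is configured to use
systemctl show nomad.service --property=KillSignal,KillMode

# Inspect the journal from the previous boot to see how the unit was stopped
journalctl -u nomad.service -b -1 --no-pager | tail -n 50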

@shoenig
Member

shoenig commented Jun 6, 2023

Ah, I was finally able to reproduce it, @suikast42, thanks for the extra info. I'm not sure what the underlying problem is yet, but at least I can investigate now.

Nomad - Community Issues Triage automation moved this from Needs Triage to Done Jun 9, 2023
shoenig added a commit that referenced this issue Jun 9, 2023
…#17455)

This PR fixes a bug where the docker network pause container would not be
stopped and removed in the case where a node is restarted, the alloc is
moved to another node, the node comes back up. See the issue below for
full repro conditions.

Basically in the DestroyNetwork PostRun hook we would depend on the
NetworkIsolationSpec field not being nil - which is only the case
if the Client stays alive all the way from network creation to network
teardown. If the node is rebooted we lose that state and previously
would not be able to find the pause container to remove. Now, we manually
find the pause container by scanning them and looking for the associated
allocID.

Fixes #17299
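
Since the pause containers are named nomad_init_<alloc ID> (visible in the docker ps output above), the lookup described in the commit can be illustrated from the CLI (a rough sketch of the idea, not the actual Go implementation in the fix):

# List any pause container whose name embeds a given allocation ID
ALLOC_ID="880a05ce-2dfe-ac5f-a3eb-c0fdcde6bcb6"   # example alloc ID taken from the docker ps output above
docker ps -a --filter "name=nomad_init_${ALLOC_ID}" --format '{{.ID}} {{.Names}} {{.Status}}'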