failed to decode driver config: readContainerLen: Unrecognized descriptor #5680

Closed
JeffreyVdb opened this issue May 10, 2019 · 2 comments · Fixed by #5844

JeffreyVdb commented May 10, 2019

Nomad version

Nomad v0.9.1 (4b2bdbd)

Operating system and Environment details

CentOS Linux release 7.6.1810 (Core)
Linux Kernel 3.10.0-957.12.1.el7.x86_64

Issue

When running the job file below, decoding the driver config fails with this error message:

failed to decode driver config: [pos 171]: readContainerLen: Unrecognized descriptor byte: hex: d4, decimal: 212

However, nomad validate reports that the file is valid:

$ nomad validate my-postgres.hcl
Job validation successful

Reproduction steps

  • Install nomad
  • Run job file

Job file (if appropriate)

job "my-postgres" {
  region = "global"
  datacenters = ["dc1"]
  type = "service"

  update {
    stagger      = "30s"
    max_parallel = 1
  }

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  group "app" {
    count = 1

    task "my-postgres" {
      driver = "docker"

      config {
        dns_servers = ["${NOMAD_IP_http}"]
        load = "/mnt/data/containers/my-postgres.tar"
        image = "my-postgres:latest"
        force_pull = true
        volumes = [
          "/mnt/data/postgres:/var/lib/postgresql/data:z"
        ]
        port_map {
          pg = 5432
        }
        logging {
          type = "syslog"
          config {
            syslog-facility = "local1"
            tag = "POSTGRES"
          }
        }
      }

      service {
        name = "tm-postgres"
        tags = ["version=1.0.0"]
        port = "pg"

        check {
          name     = "postgres"
          type     = "script"
          command  = "/usr/local/bin/pg_isready"
          args     = ["-U", "postgres"]
          interval = "5s"
          timeout  = "10s"

          check_restart {
            limit = 3
            grace = "60s"
            ignore_warnings = false
          }
        }
      }

      resources {
        cpu    = 20
        memory = 2048

        network {
          mbits = 1
          port "pg" { static = "5432" }
        }
      }
    }
  }
}

Nomad Client logs (if appropriate)

$ nomad alloc-status my-postgres

2019-05-10T12:08:34Z  Killing          Sent interrupt
2019-05-10T12:08:33Z  Alloc Unhealthy  Unhealthy because of failed task
2019-05-10T12:08:33Z  Not Restarting   Error was unrecoverable
2019-05-10T12:08:33Z  Driver Failure   failed to decode driver config: [pos 171]: readContainerLen: Unrecognized descriptor byte: hex: d4, decimal: 212
2019-05-10T12:08:33Z  Task Setup       Building Task Directory
2019-05-10T12:08:33Z  Received         Task received by client

Nomad Server logs (if appropriate)

# journalctl -t nomad

19-05-10T12:12:08.010Z [WARN ] client.alloc_runner.task_runner.task_hook.logmon.nomad: timed out waiting for read-side of process output pipe to close: alloc_id=4c858980-77cf-c51f-fbeb-a808a4495941 task=my-postgres @module=logmon timestamp=2019-05-10T12:12:08.010Z
19-05-10T12:12:08.010Z [WARN ] client.alloc_runner.task_runner.task_hook.logmon.nomad: timed out waiting for read-side of process output pipe to close: alloc_id=4c858980-77cf-c51f-fbeb-a808a4495941 task=my-postgres @module=logmon timestamp=2019-05-10T12:12:08.010Z
19-05-10T12:16:04.085Z [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=65a238f6-2675-e9be-49d5-d2b11648bcbb task=my-postgres path=/mnt/data/nomad/alloc/65a238f6-2675-e9be-49d5-d2b11648bcbb/alloc/logs/.my-postgres.stdout.fifo @module=logmon timestamp=2019-05-10T12:16:04.085Z
19-05-10T12:16:04.085Z [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=65a238f6-2675-e9be-49d5-d2b11648bcbb task=my-postgres @module=logmon path=/mnt/data/nomad/alloc/65a238f6-2675-e9be-49d5-d2b11648bcbb/alloc/logs/.my-postgres.stderr.fifo timestamp=2019-05-10T12:16:04.085Z
19-05-10T12:16:04.094Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=65a238f6-2675-e9be-49d5-d2b11648bcbb task=my-postgres error="failed to decode driver config: [pos 171]: readContainerLen: Unrecognized descriptor byte: hex: d4, decimal: 212"
19-05-10T12:16:04.096Z [INFO ] client.alloc_runner.task_runner: not restarting task: alloc_id=65a238f6-2675-e9be-49d5-d2b11648bcbb task=my-postgres reason="Error was unrecoverable"

schmichael (Member) commented

Hi @JeffreyVdb,

Sorry you hit this. The job validates but fails at runtime because the migration to plugins in 0.9 introduced a regression: servers can no longer validate a task's driver config stanza. It seems we neglected to note that in the changelog or docs, so I fixed that in #5693.

That error message is also particularly useless and is something we intend to improve in future releases.

The error is that your dns_servers line uses ${NOMAD_IP_http} instead of ${NOMAD_IP_pg}. The job only declares a port labeled pg, so NOMAD_IP_http is undefined.
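
Since nomad validate does not catch this on 0.9, a rough pre-flight check can flag the mismatch before the job is submitted. The sketch below is a hypothetical helper, not part of Nomad. It uses regular expressions rather than a real HCL parser, and the function and pattern names are made up for illustration. It compares the labels declared with port "<label>" against the labels referenced through ${NOMAD_IP_<label>}, ${NOMAD_PORT_<label>} and ${NOMAD_ADDR_<label>}. It will likely report false positives for references to ports owned by other tasks.

```python
import re

# Matches port declarations such as: port "pg" { static = "5432" }
PORT_DECL = re.compile(r'\bport\s+"([^"]+)"')
# Matches interpolations such as: ${NOMAD_IP_http}
PORT_REF = re.compile(r'\$\{NOMAD_(?:IP|PORT|ADDR)_([A-Za-z0-9_]+)\}')


def undefined_port_labels(job_text):
    """Return referenced port labels that no port block declares, sorted."""
    declared = set(PORT_DECL.findall(job_text))
    referenced = set(PORT_REF.findall(job_text))
    return sorted(referenced - declared)


# Trimmed-down version of the job file from this issue.
job = '''
config {
  dns_servers = ["${NOMAD_IP_http}"]
}
resources {
  network {
    port "pg" { static = "5432" }
  }
}
'''

print(undefined_port_labels(job))
```

For the job in this report the helper flags the http label. Once dns_servers refers to ${NOMAD_IP_pg}, the list comes back empty.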

I filed a new issue for fixing the error message when using an undefined variable:

Using undefined variables in task config produces useless error #5694

I'm going to close this in favor of #5694, but please feel free to reopen if you feel like I missed something.

Thanks for the report and sorry for the regression.

notnoop pushed a commit that referenced this issue Jun 17, 2019
This upgrades the hcl2 library dependency to pick up
hashicorp/hcl2#113.

Prior to this change, errors raised while parsing and decoding array
attributes with invalid elements (e.g. references to unknown variables)
were silently dropped, with `cty.Unknown` being assigned to the bad
element.  Rather than showing a typed, meaningful error from hcl2, we got
a very cryptic error message from the msgpack layer trying to handle
`cty.Unknown`.

This ensures that we propagate diagnostics correctly and report
meaningful errors to users.

Fixes #5694
Fixes #5680
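
For readers puzzling over the original message: in the MessagePack specification, the byte 0xd4 (decimal 212) is the format marker for fixext 1, a one-byte extension value. Arrays and maps start with 0x90-0x9f, 0xdc, 0xdd, 0x80-0x8f, 0xde or 0xdf. That fits the commit message above: the decoder expected a container length for dns_servers and met an extension value instead, presumably the encoded unknown element. This is an assumption, since the thread does not show the encoded bytes. The function below is a hypothetical illustration that classifies a leading byte. It is not code from Nomad or any msgpack library.

```python
def msgpack_family(first_byte):
    """Classify a MessagePack leading byte into a coarse format family."""
    if 0x80 <= first_byte <= 0x8f or first_byte in (0xde, 0xdf):
        return "map"      # fixmap, map 16, map 32
    if 0x90 <= first_byte <= 0x9f or first_byte in (0xdc, 0xdd):
        return "array"    # fixarray, array 16, array 32
    if 0xd4 <= first_byte <= 0xd8:
        # fixext 1, 2, 4, 8, 16: the payload size doubles with each marker
        return "fixext %d" % (1 << (first_byte - 0xd4))
    if first_byte in (0xc7, 0xc8, 0xc9):
        return "ext"      # ext 8, ext 16, ext 32
    return "other"


# The byte reported in the error: hex d4, decimal 212.
print(msgpack_family(212))
```

A container reader accepts only the "map" or "array" families, which is why it rejects this byte as an unrecognized descriptor.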

github-actions commented

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Nov 21, 2022