
Terraform nomad_job throwing "job stanza not found" error during terraform plan when we have made no code change #92

Closed
wfeng-fsde opened this issue Jan 31, 2020 · 9 comments · Fixed by #105

Hi there,


Terraform Version


$ terraform -v
Terraform v0.12.20
+ provider.archive v1.3.0
+ provider.aws v2.47.0
+ provider.consul v2.6.1
+ provider.local v1.4.0
+ provider.nomad v1.4.2
+ provider.null v2.1.2
+ provider.random v2.2.1
+ provider.template v2.1.2
+ provider.tls v2.1.1
+ provider.vault v2.7.1

Nomad Version


$ nomad server members
Name                                   Address      Port  Status  Leader  Protocol  Build   Datacenter              Region
nomad-shared-ip-10-27-0-143.us-west-2  10.27.0.143  4648  alive   true    2         0.10.2  apiq-sre-us1-r1-shared  us-west-2
nomad-shared-ip-10-27-1-224.us-west-2  10.27.1.224  4648  alive   false   2         0.10.2  apiq-sre-us1-r1-shared  us-west-2
nomad-shared-ip-10-27-2-209.us-west-2  10.27.2.209  4648  alive   false   2         0.10.2  apiq-sre-us1-r1-shared  us-west-2

Provider Configuration


provider "nomad" {
  ...
  version   = "~> 1.4"
}

Environment Variables

Do you have any Nomad-specific environment variables set on the machine running Terraform?

env | grep "NOMAD_"

Nothing.

Affected Resource(s)


  • nomad_job


Terraform Configuration Files

resource "nomad_job" "zookeeper_server" {
  jobspec = templatefile("${path.module}/zookeeper_server_nomad.tpl", {
    instance_count   = var.instance_count
    nomad_region     = var.nomad_region
    nomad_datacenter = var.nomad_datacenter
    ...
  })
}

and the template is:

      type          = "service"
    
      region        = "${nomad_region}"
      datacenters   = ["${nomad_datacenter}"]
    
      constraint {
        operator  = "distinct_hosts"
        value     = "true"
      }
    
      meta {
        S3BUCKET = "${meta_s3_bucket}"
      }
    
      group "main" {
        count = ${instance_count}
    
        constraint {
          attribute = "$${meta.ResourceId}"
          operator  = "=="
          value     = "${resource_id}"
        }
    
        meta {
          restartflag = "1"
        }
    
        update {
          max_parallel        = 1
          health_check        = "checks"
          ...
          canary              = 0
        }
    
        restart {
          attempts  = 2
          ...
        }
    
        reschedule {
          unlimited      = false
          ...
          max_delay      = "30m"
        }
    
        ephemeral_disk {
          size = 2048 #MB
        }
    
        task "exhibitor-prestep" {
          driver = "raw_exec"
          ...
          }
    
        }
    
        task "exhibitor" {
          driver = "raw_exec"
    
          artifact {
            source = "..."
          }
        }
      }
    
      migrate {
        max_parallel        = 1
        ...
      }
    }

Debug Output


During terraform plan, I get the following error:


  on .terraform/.../modules/zookeeper_nomad_job/main.tf line 1, in resource "nomad_job" "zookeeper_server":
   1: resource "nomad_job" "zookeeper_server" {


Expected Behavior


We have not changed this code in a long time, and our infrastructure has been up to date with this resource, so I expect terraform plan to pass with no changes to this resource and no error output.

Actual Behavior

We came across this just recently: while making changes to some other, totally unrelated resources, terraform plan started giving us this error. So I suspect this is a provider bug.

Steps to Reproduce


  1. terraform apply

This happened during terraform plan, and we have not made any code changes to this resource.

Important Factoids

Is there anything atypical about your accounts that we should know about? For example: do you have ACLs enabled? A multi-region deployment?

No

cgbaker (Contributor) commented Jan 31, 2020

Hi @wfeng-fsde, thanks for the report. Was there an upgrade (of Terraform or of the Nomad provider) that could have caused this?

rmlsun commented Feb 4, 2020

@cgbaker Teammate of @wfeng-fsde here.

We first ran into this issue with Terraform 0.12.18, I believe. We then tried the latest Terraform, 0.12.20, but hit the same issue with both versions.

On the Nomad provider side, since the provider version constraint is "~> 1.4", I'm not sure whether we were using the same version earlier, but we are currently pulling v1.4.2 of the Nomad Terraform provider.

cgbaker (Contributor) commented Feb 4, 2020

I'm not getting the same error as you, but I'm getting something similar. Using the following versions:

$ terraform -v
Terraform v0.12.20
+ provider.nomad v1.4.2

and a shortened template based on the one included above, I get the following error:

Error: error parsing jobspec: 4 errors occurred:
	* invalid key: type
	* invalid key: region
	* invalid key: datacenters
	* invalid key: group

The reason is that the jobspec string for the nomad_job resource must be a valid Nomad jobspec; specifically, it must include a job stanza. The error is resolved by modifying my template file to have the following shape:

job "jobname" {
  ...
}

There are a few things I don't understand: why you're getting a different error message and why this worked before. Is that the entire template pasted above (perhaps the copy above is missing the first line)? If not, can you provide the full template?
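
For illustration, a minimal sketch of how your template might look once wrapped in a job stanza (the job name and the placeholder config here are just examples, not taken from your actual file):

job "zookeeper" {
  type        = "service"
  region      = "${nomad_region}"
  datacenters = ["${nomad_datacenter}"]

  group "main" {
    count = ${instance_count}

    task "exhibitor" {
      driver = "raw_exec"

      config {
        # placeholder command for the sketch, replace with the real invocation
        command = "/bin/true"
      }
    }
  }
}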

mahsoud commented Apr 15, 2020

I also ran into this bug with Terraform v0.12.24 + provider.nomad v1.4.5.

We are able to reproduce it 100% of the time. Here is my job template:

job "docs" {
  datacenters = ["test"]
  group "example" {
    meta {
      date_time = "${deploy_timestamp}"
    }
    task "server" {

      driver = "raw_exec"
      template {
        destination   = "local/sample.conf"
        data = "sample"
      }
    }
  }
}

and this is main.tf:

locals {
  template_vars = {
    deploy_timestamp = formatdate("DD-MM-YY hh-mm ZZZ", timestamp())
  }
}

resource "nomad_job" "test" {
  jobspec                 = templatefile("${path.module}/job.hcl.tpl", local.template_vars)
  deregister_on_destroy   = true
  deregister_on_id_change = true
}

It works fine on the first apply, but after the nomad_job is added to the state file, refreshing the state fails to parse the jobspec with the error "'job' stanza not found".

mahsoud commented Apr 15, 2020

Tried replacing timestamp with random_id (via random provider) and got the same result.

Moved deploy_timestamp from meta, into template and also got the same result.

It seems that any dynamic value that changes between apply commands causes the bug. At the same time, if deploy_timestamp is set via a Terraform variable that we change between applies, it works fine.
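
For reference, a minimal sketch of that variable-based variant (the variable name deploy_timestamp comes from the template above; the -var value shown afterwards is just an example):

variable "deploy_timestamp" {
  type = string
}

resource "nomad_job" "test" {
  jobspec                 = templatefile("${path.module}/job.hcl.tpl", {
    deploy_timestamp = var.deploy_timestamp
  })
  deregister_on_destroy   = true
  deregister_on_id_change = true
}

Running it with something like terraform apply -var='deploy_timestamp=15-04-20 10-00 UTC' works for us, presumably because the value is already known at plan time.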

cgbaker (Contributor) commented Apr 15, 2020 via email

cyrilgdn commented Apr 29, 2020

It seems that any dynamic value that changes between apply commands causes the bug

Even if the generated value doesn't change 🤔
I used:

  [...]
  jobspec = templatefile("${path.module}/test.nomad", {
    date = formatdate("YYYY-MM-DD", timestamp())
  })
}

So the generated value today was 2020-04-29, both applies were made today, and I got this error on the second one. Very strange 😮

cyrilgdn commented Apr 29, 2020

@cgbaker So I quickly checked and here:

https://github.com/terraform-providers/terraform-provider-nomad/blob/f8b584cd73bc9dce1e64c8758fa62806aba77486/nomad/resource_job.go#L449-L459

As soon as there is a timestamp() call, the field is marked as computed (since Terraform can't know the value before the actual apply), d.GetChange returns an empty value for newSpecRaw, and the parsing fails.

I tried to add

	// skip the jobspec diff logic when the new value isn't known yet
	if !d.NewValueKnown("jobspec") {
		return nil
	}

just before; NewValueKnown returns false in this case, so the function returns early and I get this diff:

[...]
      ~ jobspec                 = <<~EOT
            job "docs" {
              datacenters = ["dc1"]
              group "example" {
                meta {
                  date = "2020-04-29 48"
                }
                task "server" {
                  driver = "docker"
                  config {
                    image = "nginx"
                  }
                }
              }
            }
        EOT -> (known after apply)
[...]

Which indeed says known after apply.

The problem is that in this case, there will be a diff on every apply. But I think there's no choice actually...

mahsoud commented May 12, 2020

The problem is that in this case, there will be a diff on every apply. But I think there's no choice actually...

I believe this is fair, and in my case this was the expected and desired behaviour.
