Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix nomad_plugin not waiting for expectedControllers reported by nomad #234

Merged
merged 2 commits into from
Jun 4, 2021

Conversation

kevinschoonover
Copy link
Contributor

@kevinschoonover kevinschoonover commented Jun 3, 2021

I was having an issue using nomad_external_volume with the following snippet:

resource "nomad_job" "plugin_digitalocean_monolith" {
  detach  = false
  jobspec = file("${path.module}/../../jobs/plugin-digitalocean-monolith.nomad")
}

data "nomad_plugin" "digitalocean" {
  depends_on   = [nomad_job.plugin_digitalocean_monolith]

  plugin_id        = "digitalocean"
  wait_for_registration = true
  wait_for_healthy = true
}

resource "nomad_external_volume" "cortex" {
  depends_on = [data.nomad_plugin.digitalocean]
  count        = 1
  type         = "csi"
  plugin_id    = data.nomad_plugin.digitalocean.plugin_id
  volume_id    = "cortex[${count.index}]"
  name         = "cortex-${count.index}"
  capacity_min = "1GiB"
  capacity_max = "10GiB"

  capability {
    access_mode     = "single-node-writer"
    attachment_mode = "file-system"
  }

  mount_options {
    fs_type = "ext4"
  }
}

with three nomad clients because nomad_plugin was immediately returning instead of waiting for the 3 controllers to be healthy. This caused the volume creation to fail with 500: plugin has no controller. After doing some investigation, we noticed that nomad_plugin was reporting

output "controller_required" {
  value = data.nomad_plugin.digitalocean.controller_required # == false
}

output "controllers_expected " {
  value = data.nomad_plugin.digitalocean.controllers_expected # == 0
}

output "controllers_healthy" {
  value = data.nomad_plugin.digitalocean.controllers_healthy # == 0
}

This led us to the line-edited below where it appears to be comparing the number of Controllers as reported by nomad vs the number of healthy controllers which will always be the same? Let me know if this assumption is wrong.

I tested this change locally and it is no longer giving a 500: plugin has no controller error and seems to be waiting appropriately.

Please let me know any comments/changes/concerns and I am happy to address them!

@hashicorp-cla
Copy link

hashicorp-cla commented Jun 3, 2021

CLA assistant check
All committers have signed the CLA.

@jrasell jrasell requested review from jrasell and lgfa29 June 3, 2021 07:05
Copy link
Member

@jrasell jrasell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kevinschoonover and thanks for this. The line change looks good to me. I would also be interested in your thoughts around changing the set on the controllers_expected field also to keep consistency.

@kevinschoonover
Copy link
Contributor Author

@jrasell Done! Thanks for the heads up, totally missed that. Let me know if there is anything else. I won't get a chance to test the latest commit until tomorrow and can follow up if that's a blocker.

Copy link
Contributor

@lgfa29 lgfa29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to reproduce the issue, and verified that the fix works.

I also noticed that plugin.NodesHealthy is not being used in the wait condition and that nodes_expected is not reading from plugin.NodesExpected. I will submit a fix for these once the PR is merged.

The failing CI is due to an issue loading the Nomad License in Travis. I run the test suite locally and it's all good 👍

Thanks for the contribution @kevinschoonover!

@lgfa29 lgfa29 merged commit 6871e9e into hashicorp:main Jun 4, 2021
lgfa29 added a commit that referenced this pull request Jun 4, 2021
lgfa29 added a commit that referenced this pull request Jun 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants