Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow jobs not requiring network resources with network fingerprinter disabled #14300

Merged
merged 1 commit into from Oct 6, 2022

Conversation

wjordan
Copy link
Contributor

@wjordan wjordan commented Aug 24, 2022

Jobs not requiring any network resources should be allowed even when the network fingerprinter is disabled.

My use-case: Nomad Client running on a host with thousands of virtual network interfaces where the network fingerprinter takes a long time (several minutes), and the job definitions aren't allocating any network resources. I would like to add network to the fingerprint.denylist option, but this makes the scheduler fail to place any job because the current NetworkChecker fails if the node doesn't have any network. Allowing jobs not requiring any network resources would let me disable the network fingerprinter while still placing jobs properly.

Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @wjordan! This is a great idea -- I always love finding ways to do less work in the scheduler. I've left a suggestion about a way we could make this a bit more intentional in behavior.

Also, you can add a changelog file at .changelog/14300.txt similar to this one? Something like "scheduler: Allow jobs not...", etc. Thanks!

Comment on lines 384 to 387
// Allow jobs not requiring any network resources
if c.ports == nil {
return true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a network.mode = "none" that I would expect to be set in the scenario you've described. Rather than using nil ports as an implied "no network", it'd be better if we checked explicitly on that mode. That way we're not confusing potential misconfiguration for an intentional user instruction.

Jobs not requiring any network resources should be allowed
even when the network fingerprinter is disabled.
@wjordan
Copy link
Contributor Author

wjordan commented Sep 28, 2022

@tgross I've updated the PR based on your suggestion to allow network.mode = "none" to not require network resources, please take another look.

@tgross tgross added this to the 1.4.x milestone Sep 29, 2022
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @wjordan! We're in the release candidate window for Nomad 1.4.0 and this didn't get in, but I'll merge this once we've shipped GA (next week-ish).

Tested with the following job:

jobspec
job "example" {
  type        = "batch"
  datacenters = ["dc1"]

  parameterized {
    payload = "required"
  }

  group "group" {

    network {
      mode = "none"
    }

    task "task" {

      driver = "docker"

      config {
        image   = "busybox:1"
        command = "/bin/sh"
        args    = ["-c", "cat local/payload.txt; sleep 1"]
      }

      dispatch_payload {
        file = "payload.txt"
      }

      resources {
        cpu    = 64
        memory = 500
      }

    }
  }
}

And with the following in my client config:

client {
  options {
    fingerprint.denylist = "network"
  }
}

(Which I'm now realizing isn't available without the deprecated options block, so I really should not have removed that from the docs in #12416 and I should revert that.)

@tgross
Copy link
Member

tgross commented Oct 6, 2022

Hi @wjordan we had to push out a quick Nomad 1.4.1 but this has now been merged and will ship in the next regular patch release of Nomad. Thanks again!

@github-actions
Copy link

github-actions bot commented Feb 4, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
backport/1.4.x backport to 1.4.x release line theme/scheduling
Development

Successfully merging this pull request may close these issues.

None yet

2 participants