Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add file parameter to job's vault stanza #13343

Merged
merged 11 commits into from Jun 23, 2023
Merged

Conversation

grembo
Copy link
Contributor

@grembo grembo commented Jun 12, 2022

This complements the env parameter, so that the operator can author
tasks that don't share their Vault token with the payload. As a result,
more powerful tokens can be used in a job definition, allowing it to
use template stanzas to issue all kinds of secrets (database secrets,
Vault tokens with very specific policies, etc.), without sharing that
issuing power with the task itself as long as a driver with image
isolation is used.

This is accomplished by creating a directory called private within
the task's working directory, which shares many properties of
the secrets directory (tmpfs where possible, not accessible by
nomad alloc fs or Nomad's web UI), but isn't mounted into/bound to the
container.

If the file parameter is set to true (its default), the Vault token
is also written to the NOMAD_SECRETS_DIR, so the default behavior is
backwards compatible. Even if the operator never changes the default,
they will still benefit from the improved behavior of Nomad never reading
the token back in from that - potentially altered - location.

See #11900

@grembo
Copy link
Contributor Author

grembo commented Jul 6, 2022

authrequired

Can someone please authorize this (if it is a blocker to proceed with the review)? Thanks!

@jrasell jrasell added this to Needs Triage in Nomad - Community Issues Triage via automation Jul 6, 2022
@tgross tgross requested a review from lgfa29 July 6, 2022 13:43
@tgross
Copy link
Member

tgross commented Jul 6, 2022

Hi @grembo we weren't waiting on CI for a review here, I think there was just some confusion as to which of the several PRs we have open for the original issue was the one we want. I'm going to assign the PR review to @lgfa29 as it looks like it he was the one who spent a bunch of time going back and forth with you on the design.

api/tasks.go Outdated Show resolved Hide resolved
@grembo
Copy link
Contributor Author

grembo commented Jul 6, 2022

Hi @grembo we weren't waiting on CI for a review here, I think there was just some confusion as to which of the several PRs we have open for the original issue was the one we want. I'm going to assign the PR review to @lgfa29 as it looks like it he was the one who spent a bunch of time going back and forth with you on the design.

Apologies, I though my comment on #11900 (and turning the previous review into a Draft) would make it clear. Let me abandon the previous review to reduce the noise.

@grembo
Copy link
Contributor Author

grembo commented Jan 13, 2023

@tgross Hi, any news on this? Thanks!

@lgfa29
Copy link
Contributor

lgfa29 commented Jan 14, 2023

Hi @grembo 👋

Apologies for delay on reviewing this, it unfortunately fell out of my radar...I will give it another pass in the coming days!

@marknl
Copy link

marknl commented Feb 24, 2023

Hi all! I'm looking forward to see this PR merged into Nomad.
@lgfa29 Any update?

grembo and others added 8 commits June 22, 2023 23:41
This complements the `env` parameter, so that the operator can author
tasks that don't share their Vault token with the payload. As a result,
more powerful tokens can be used in a job definition, allowing it to
use template stanzas to issue all kinds of secrets (database secrets,
Vault tokens with very specific policies, etc.), without sharing that
issuing power with the task itself as long as a driver with `image`
isolation is used.

This is accomplished by creating a directory called `private` within
the task's working directory, which shares many properties of
the `secrets` directory (tmpfs where possible, not accessible by
`nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the
container.

If the `file` parameter is set to `true` (its default), the Vault token
is also written to the NOMAD_SECRETS_DIR, so the default behavior is
backwards compatible. Even if the operator never changes the default,
they will still benefit from the improved behavior of Nomad never reading
the token back in from that - potentially altered - location.

See hashicorp#11900
Show multiple templates using different change modes and
explain what they mean.
In order to maintan backwards compatibility we are not able to create a
new field where the expected default is different from its zero value
because during an upgrade the Raft log entries will not have this field
and therefore existing tasks will be set to non-backwards compatible
zero value.
Tasks that were created before a Nomad upgrade will not have the private
directory in their filesystem, so the task restore process would fail.

This commit adds upgrade path logic to handle cases where the private
directory doesn't exist.
If the task's Vault `disable_file` config changes the task group must be
recreated to make sure the allocation file system is properly setup. For
example, in an upgrade path the private directory will not exist for
previous allocations.
@lgfa29
Copy link
Contributor

lgfa29 commented Jun 23, 2023

Hi everyone 👋

Apologies for the delay on moving this forward, this is an area of the code we've been iterating over and it's also a sneaky complex part as well, so it took me a while to get the time to dedicate to a thorough review.

I have been able to give it another pass and found some critical issues on the upgrade path.The biggest one is that this line will cause an existing alloc to fail when the Nomad agent is upgraded because the /private directory doesn't exist in that allocation.

The other big problem is the lesson we learned the hard way in #17087, which is that we can't default to a non-zero value for a new attribute. What this means is that I had to change the attribute name from file to disable_file so that existing jobs are loaded in a backwards compatible way after an upgrade. Without this, if an old alloc fails it would be recreated with file: false even though we need it to be true.

Since this PR has been stale for a while because of us, I don't think it would fair to ask @grembo to do all of these changes, so I implemented the fixes in https://github.com/hashicorp/nomad/compare/vault-token-file-rebased (ignore that tmp commit if you see it, I just want to run our test suite without opening a PR 😅).

Due to the merge conflicts I also had to rebase the branch, which means I would need to force push to your branch @grembo to get the changes to this PR. Would that be OK?

@grembo
Copy link
Contributor Author

grembo commented Jun 23, 2023

Hi @lgfa29,

Thanks for picking this up again.

I have been able to give it another pass and found some critical issues on the upgrade path.The biggest one is that this line will cause an existing alloc to fail when the Nomad agent is upgraded because the /private directory doesn't exist in that allocation.

That's a good find, how are you planning to work around it (create the directory in case it is missing)?

The other big problem is the lesson we learned the hard way in #17087, which is that we can't default to a non-zero value for a new attribute. What this means is that I had to change the attribute name from file to disable_file so that existing jobs are loaded in a backwards compatible way after an upgrade. Without this, if an old alloc fails it would be recreated with file: false even though we need it to be true.

My intention was to have file set to "false" by default in the long run (following the philosophy of safe defaults and opt-in into less secure/more open ones). I understand the backwards compatibility argument though - I have some ideas how such a change could be managed (e.g., have a "use_defaults_" knob, that allows to opt-into a new set of secure defaults), but all of these add extra complications that would delay making progress. So if that's the way to move forward, so be it ;)

Since this PR has been stale for a while because of us, I don't think it would fair to ask @grembo to do all of these changes, so I implemented the fixes in https://github.com/hashicorp/nomad/compare/vault-token-file-rebased (ignore that tmp commit if you see it, I just want to run our test suite without opening a PR 😅).

Due to the merge conflicts I also had to rebase the branch, which means I would need to force push to your branch @grembo to get the changes to this PR. Would that be OK?

Yes, please go ahead

@lgfa29
Copy link
Contributor

lgfa29 commented Jun 23, 2023

For reference, this is how I tested the upgrade path.

  1. Start dev Vault agent and login with root token.
    $ vault server -dev
    $ vault login
    
  2. Create Nomad server role, jobs policy, and secret.
    {
      "disallowed_policies": "nomad-server",
      "token_explicit_max_ttl": 0,
      "name": "nomad-cluster",
      "orphan": true,
      "token_period": 259200,
      "renewable": true
    }
    path "secret/data/*" {
      capabilities = ["read"]
    }
    $ vault policy write nomad-job vault-job-policy.hcl
    $ vault write /auth/token/roles/nomad-cluster @nomad-cluster-role.json
    $ vault kv put -mount=secret test test=1
    
  3. Start Nomad agent version 1.5.6 with this config.
    data_dir = "/tmp/nomad-vault/data"
    
    server {
      enabled          = 1
      bootstrap_expect = 1
    }
    
    client {
      enabled = true
    }
    
    vault {
      enabled          = true
      address          = "http://localhost:8200"
      task_token_ttl   = "5m"
      create_from_role = "nomad-cluster"
      token            = "<VAULT ROOT TOKEN>"
    }
    $ /opt/hashicorp/nomad/1.5.6/nomad agent -config ./config.hcl
    
  4. Run job.
    job "example" {
      group "cache" {
        task "redis" {
          driver = "docker"
    
          config {
            image = "redis:7"
          }
    
          vault {
            policies = ["nomad-job"]
          }
    
          template {
            data        = <<EOF
    {{with secret "secret/test"}}{{.Data.data.test}}{{end}}
    EOF
            destination = "config.txt"
          }
        }
      }
    }
    $ nomad run example.hcl
    
  5. Verify Vault token is written to disk.
    $ tree /tmp/nomad-vault/data/alloc
    /tmp/nomad-vault/data/alloc
    └── 3e1318f6-0ee0-8725-ac72-ed1a928d6401
        ├── alloc
        │   ├── data
        │   ├── logs
        │   │   ├── redis.stderr.0
        │   │   └── redis.stdout.0
        │   └── tmp
        └── redis
            ├── config.txt
            ├── local
            ├── secrets
            │   ├── api.sock
            │   └── vault_token
            └── tmp
    
    9 directories, 5 files
  6. Stop Nomad agent and start the custom binary.
    $ ~/go/bin/nomad agent -config ./config.hcl
    
  7. Verify allocation is restored without changes (no new task event) and file system stays the same.
  8. Update job to set vault.disable_file = true.
  9. Run updated job and verify a new allocation is created with the right folder structure.
    $ tree /tmp/nomad-vault/data/alloc
    /tmp/nomad-vault/data/alloc
    ├── 3e1318f6-0ee0-8725-ac72-ed1a928d6401
    │   ├── alloc
    │   │   ├── data
    │   │   ├── logs
    │   │   │   ├── redis.stderr.0
    │   │   │   └── redis.stdout.0
    │   │   └── tmp
    │   └── redis
    │       ├── config.txt
    │       ├── local
    │       ├── secrets
    │       │   └── vault_token
    │       └── tmp
    └── 7ea55a0d-c12c-5590-4263-2fe3e222cb18
        ├── alloc
        │   ├── data
        │   ├── logs
        │   │   ├── redis.stderr.0
        │   │   └── redis.stdout.0
        │   └── tmp
        └── redis
            ├── config.txt
            ├── local
            ├── private
            │   └── vault_token
            ├── secrets
            │   └── api.sock
            └── tmp
    
    19 directories, 9 files

@lgfa29
Copy link
Contributor

lgfa29 commented Jun 23, 2023

That's a good find, how are you planning to work around it (create the directory in case it is missing)?

The fix was two-fold:

  1. On Prestart I'm also looking for the token in /secrets path.
  2. On writeToken I'm checking if the /private dir is missing, which indicates this is an old task, and then keep writing to /secrets.

You can check the diff here: 2f881b7#diff-fdbf72fb40a559f0592c927e632f7f8a7178915920dd439de74e8798bc4acb88

I renamed the variables because I was getting confused which was which 😅

Another thing that was missing that I forgot to mention before is that we need a destructive update when this flag changes because the alloc runner may need to recreate the task dir to create /private if it doesn't exist. This was a quick fix: 83eeb46

My intention was to have file set to "false" by default in the long run

Yeah, we can start discussions about changing the default behaviour, but that is a bit more complicated. We can open an issue once this is released to discuss it further.

Yes, please go ahead

Awesome, will do, thanks!

Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM code-wise, once comments are addressed.

website/content/docs/job-specification/vault.mdx Outdated Show resolved Hide resolved
website/content/docs/job-specification/vault.mdx Outdated Show resolved Hide resolved
Comment on lines 73 to 76
- **«taskname»/private/**: This directory is used by Nomad to store private files
related to the allocation, such as Vault tokens, that are not always shared with tasks
when using `image` isolation. The contents of files in this directory cannot be read
by the `nomad alloc fs` command.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should specifically document that this directory is not mounted to task drivers using image isolation but is visible to task drivers using chroot isolation, and that should also be documented in the jobspec flag, because it's very surprising. (So much so I'm not sure that's what we want at all.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's kind of a tricky situation. I think that, ideally, when disable_fileis true the token would written to a path outside of the alloc dir, but within Nomad's data_dir, but that would require plumbing the token all the way to the template runner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it extra scary 😄
image
image

client/allocrunner/taskrunner/task_runner_test.go Outdated Show resolved Hide resolved
Comment on lines 1747 to 1752
// Remove private dir and write the Vault token to the secrets dir to
// simulate an old task.
err := conf.TaskDir.Build(false, nil)
must.NoError(t, err)
err = os.Remove(conf.TaskDir.PrivateDir)
must.NoError(t, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in Slack, this is failing but isn't going to work because it's not doing the unlink we need (ref fs_linux.go#L81-L92). We could maybe just lift that function into this test but the test is going to be pointless immediately after 1.6.0 ships so I'd say let's just remove this test. We've seen it work E2E.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We came up with a better solution in 371105a: restrict the test to Linux environments and unmount the path before deleting it.

This test requires a task path to be deleted, but on Linux systems it is
mounted in a tmpfs, resulting in an error if the path is deleted before
unmounting.

Since our CI runs on Linux machines, restrict the test to Linux
environments and unmount path before deleting.
@lgfa29
Copy link
Contributor

lgfa29 commented Jun 23, 2023

UI test failure is a known flake. I thought I had fixed it in #17676 but it seems like it's not 100% yet.

@lgfa29 lgfa29 merged commit 6f04b91 into hashicorp:main Jun 23, 2023
23 of 25 checks passed
Nomad - Community Issues Triage automation moved this from In Progress to Done Jun 23, 2023
@lgfa29 lgfa29 added this to the 1.6.0 milestone Jun 23, 2023
@lgfa29
Copy link
Contributor

lgfa29 commented Jun 23, 2023

This should go out in Nomad 1.6.0. Thank you very much @grembo for the contribution!

philrenaud pushed a commit that referenced this pull request Jun 26, 2023
This complements the `env` parameter, so that the operator can author
tasks that don't share their Vault token with the workload when using 
`image` filesystem isolation. As a result, more powerful tokens can be used 
in a job definition, allowing it to use template stanzas to issue all kinds of 
secrets (database secrets, Vault tokens with very specific policies, etc.), 
without sharing that issuing power with the task itself.

This is accomplished by creating a directory called `private` within
the task's working directory, which shares many properties of
the `secrets` directory (tmpfs where possible, not accessible by
`nomad alloc fs` or Nomad's web UI), but isn't mounted into/bound to the
container.

If the `disable_file` parameter is set to `false` (its default), the Vault token
is also written to the NOMAD_SECRETS_DIR, so the default behavior is
backwards compatible. Even if the operator never changes the default,
they will still benefit from the improved behavior of Nomad never reading
the token back in from that - potentially altered - location.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants