Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vault agent doesn’t respect the number of retries #15725

Closed
njegosrailic opened this issue Jun 1, 2022 · 7 comments
Closed

Vault agent doesn’t respect the number of retries #15725

njegosrailic opened this issue Jun 1, 2022 · 7 comments
Labels
agent bug Used to indicate a potential bug

Comments

@njegosrailic
Copy link

Describe the bug

Looks like the Vault agent doesn’t respect the number of retries.

I also tried setting the VAULT_MAX_RETRIES ENV variable.

To Reproduce
Steps to reproduce the behavior:

Please see the configuration in the environment section below.

Expected behavior

According to the docs, I expect that the Vault agent won’t exit on an error and that the number of retries will be three. This is what I’m getting:

2022-05-16T14:32:48.342Z [DEBUG] cache.apiproxy.client: performing request: method=GET url=https://vault.service.confiad1.consul.csnzoo.com:8200/v1/dev/kv/secrets/app1/secret-new
2022-05-16T14:32:48.346Z [WARN] (view) vault.read(dbtech/dev/db/spdb/static-creds/svc-nrailic-secrets-test2-app): no secret exists at dbtech/dev/db/spdb/static-creds/svc-nrailic-secrets-test2-app (retry attempt 7 after "16s")
2022-05-16T14:32:48.347Z [WARN] (view) vault.read(dev/kv/secrets/app1/secret-new): no secret exists at dev/kv/secrets/app1/secret-new (retry attempt 7 after "16s")

I tried to set the VAULT_MAX_RETRIES ENV variable but no luck.

There is an option to pass the Consul template retry config but the problem is that I don’t understand how to pass it from the Vault config.

This is the Consul template config for the Vault config shown below:

[DEBUG] (runner) final config: {"Consul":{"Address":"","Namespace":"","Auth":{"Enabled":false,"Username":"","Password":""},"Retry":{"Attempts":12,"Backoff":250000000,"MaxBackoff":60000000000,"Enabled":true},"SSL":{"CaCert":"","CaPath":"","Cert":"","Enabled":false,"Key":"","ServerName":"","Verify":true},"Token":"","TokenFile":"","Transport":{"CustomDialer":null,"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":33,"TLSHandshakeTimeout":10000000000}},"Dedup":{"Enabled":false,"MaxStale":2000000000,"Prefix":"consul-template/dedup/","TTL":15000000000,"BlockQueryWaitTime":60000000000},"DefaultDelims":{"Left":null,"Right":null},"Exec":{"Command":[],"Enabled":false,"Env":{"Denylist":[],"Custom":[],"Pristine":false,"Allowlist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":0},"KillSignal":2,"LogLevel":"DEBUG","FileLog":{"LogFilePath":"","LogRotateBytes":0,"LogRotateDuration":86400000000000,"LogRotateMaxFiles":0},"MaxStale":2000000000,"PidFile":"","ReloadSignal":1,"Syslog":{"Enabled":false,"Facility":"LOCAL0","Name":"consul-template"},"Templates":[{"Backup":false,"Command":[],"CommandTimeout":30000000000,"Contents":"{{ with secret \"/dev/kv/secrets/app1/secret-new\" }}{{ .Data.data.value }}{{ end }}","CreateDestDirs":true,"Destination":"/secrets/secret-new","ErrMissingKey":false,"ErrFatal":true,"Exec":{"Command":[],"Enabled":false,"Env":{"Denylist":[],"Custom":[],"Pristine":false,"Allowlist":[]},"KillSignal":2,"KillTimeout":30000000000,"ReloadSignal":null,"Splay":0,"Timeout":30000000000},"Perms":0,"User":null,"Uid":null,"Group":null,"Gid":null,"Source":"","Wait":{"Enabled":false,"Min":0,"Max":0},"LeftDelim":"{{","RightDelim":"}}","FunctionDenylist":[],"SandboxPath":""}],"TemplateErrFatal":null,"Vault":{"Address":"http://127.0.0.1:8200","Enabled":true,"Namespace":"","RenewToken":false,"Retry":{"Attempts":12,"Backoff":250000000,"MaxBackoff":60000000000,"Enabled":true},"SSL":{"CaCert":"","CaPath":"","Cert":"","Enabled":false,"Key":"","ServerName":"","Verify":false},"Transport":{"CustomDialer":{},"DialKeepAlive":30000000000,"DialTimeout":30000000000,"DisableKeepAlives":false,"IdleConnTimeout":90000000000,"MaxIdleConns":100,"MaxIdleConnsPerHost":33,"TLSHandshakeTimeout":10000000000},"UnwrapToken":false,"DefaultLeaseDuration":300000000000,"LeaseRenewalThreshold":0.9},"Wait":{"Enabled":false,"Min":0,"Max":0},"Once":false,"ParseOnly":false,"BlockQueryWaitTime":60000000000}

I can tell that the config is being parsed properly, because if I pass two values in the retry stanza I got an error that only one is allowed.

Environment:

  • Vault Server Version (retrieve with vault status): 1.10.1
  • Vault CLI Version (retrieve with vault version): 1.10.3
  • Server Operating System/Architecture:

Vault agent configuration file(s):

{
  "auto_auth": {
    "method": {
      "type": "kubernetes",
      "mount_path": "auth/kube_cluster_c5",
      "config": {
        "role": "secrets-test-dev"
      }
    },
    "sink": [
      {
        "type": "file",
        "config": {
          "path": "/vault/vault-token"
        }
      }
    ]
  },
  "cache": {
    "use_auto_auth_token": "true"
  },
  "exit_after_auth": false,
  "pid_file": "/vault/vault.pid",
  "vault": {
    "address": "https://vault.service:8200",
    "retry": [
      {
        "num_retries": "3"
      }
    ]
  },
  "template_config": [
    {
      "error_on_missing_key": false,
      "exit_on_retry_failure": false
    }
  ],
  "template": [
    {
      "destination": "/secrets/secret-new",
      "contents": "{{ with secret \"/dev/kv/secrets/app1/secret-new\" }}{{ .Data.data.value }}{{ end }}",
      "left_delimiter": "{{",
      "right_delimiter": "}}"
    }
  ],
  "listener": [
    {
      "type": "tcp",
      "address": "127.0.0.1:8200",
      "tls_disable": true
    }
  ]
}

Additional context
Add any other context about the problem here.

@hsimon-hashicorp hsimon-hashicorp added agent bug Used to indicate a potential bug ecosystem labels Jun 1, 2022
@jasonodonnell
Copy link
Contributor

jasonodonnell commented Jul 6, 2022

This may have been fixed by #15204, available in Vault 1.11.0.

@njegosrailic
Copy link
Author

Thank you. I'll test it and let you know.

@njegosrailic
Copy link
Author

Hi @jasonodonnell,

Unless I'm missing something the issue still persist:

/ $ vault version
Vault v1.11.1 (0f634755745f4adf62ec0723a0b93d6dce5bc33e), built 2022-07-19T20:16:47Z

/ $ cat /vault/config.json | grep -A 6 vault\":
  "vault": {
    "address": "https://vault.service.intra:8200",
    "retry": [
      {
        "num_retries": "3"
      }
    ]
/ $ cat /vault/config.json | grep -A 5 template_config
  "template_config": [
    {
      "error_on_missing_key": false,
      "exit_on_retry_failure": false
    }
  ],
/ $

Logs:

2022-08-02T08:11:43.370Z [WARN] (view) vault.read(secrets/dev/test-secret2): no secret exists at secrets/dev/test-secret2 (retry attempt 5 after "4s")
2022-08-02T08:11:47.371Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:11:47.371Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:11:47.371Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:11:47.371Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:11:47.376Z [WARN] (view) vault.read(secrets/dev/test-secret2): no secret exists at secrets/dev/test-secret2 (retry attempt 6 after "8s")
2022-08-02T08:11:47.376Z [WARN] (view) vault.read(secrets/dev/test-secret): no secret exists at secrets/dev/test-secret (retry attempt 6 after "8s")
2022-08-02T08:11:55.377Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:11:55.377Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:11:55.377Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:11:55.377Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:11:55.383Z [WARN] (view) vault.read(secrets/dev/test-secret2): no secret exists at secrets/dev/test-secret2 (retry attempt 7 after "16s")
2022-08-02T08:11:55.383Z [WARN] (view) vault.read(secrets/dev/test-secret): no secret exists at secrets/dev/test-secret (retry attempt 7 after "16s")
2022-08-02T08:12:11.384Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:12:11.384Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:12:11.384Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:12:11.384Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:12:11.390Z [WARN] (view) vault.read(secrets/dev/test-secret): no secret exists at secrets/dev/test-secret (retry attempt 8 after "32s")
2022-08-02T08:12:11.390Z [WARN] (view) vault.read(secrets/dev/test-secret2): no secret exists at secrets/dev/test-secret2 (retry attempt 8 after "32s")
2022-08-02T08:12:43.391Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:12:43.391Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:12:43.391Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:12:43.391Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:12:43.396Z [WARN] (view) vault.read(secrets/dev/test-secret2): no secret exists at secrets/dev/test-secret2 (retry attempt 9 after "1m0s")
2022-08-02T08:12:43.396Z [WARN] (view) vault.read(secrets/dev/test-secret): no secret exists at secrets/dev/test-secret (retry attempt 9 after "1m0s")
2022-08-02T08:13:43.397Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:13:43.397Z [INFO]  cache: received request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:13:43.397Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret2
2022-08-02T08:13:43.397Z [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/secrets/dev/test-secret
2022-08-02T08:13:43.405Z [WARN] (view) vault.read(secrets/dev/test-secret2): no secret exists at secrets/dev/test-secret2 (retry attempt 10 after "1m0s")
2022-08-02T08:13:43.405Z [WARN] (view) vault.read(secrets/dev/test-secret): no secret exists at secrets/dev/test-secret (retry attempt 10 after "1m0s")

@jasonodonnell
Copy link
Contributor

I will take a closer look, but unfortunately the retry logic is complicated due to the interaction of sub systems in agent as per the documentation: https://www.vaultproject.io/docs/agent#retry-stanza

@jasonodonnell
Copy link
Contributor

I did find this note in the code for template:

// The cache does its own retry management based on sc.AgentConfig.Retry,
.

@VioletHynes
Copy link
Contributor

Hi there! I'm pretty sure my PR #16970 will fix this issue.

To describe the current behaviour: the issue was that when caching was enabled, configured retries were not respected. If template_config was present (like here), it would ignore the set max_retries value and set retries to the default, otherwise. If template_config is not present, it would ignore the set max_retries value and set retries to 0 (essentially creating an infinite loop of instant retries).

This issue should be resolved soon and fixed in 1.12.0. Thanks for the report!

@VioletHynes
Copy link
Contributor

Hi there! I'm going to close this issue as I just merged #16970 which should fix the issue identified here. It should now respect the number of retries in all cases, regardless of if caching is enabled. This should release in 1.12.

Thanks for the bug report!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent bug Used to indicate a potential bug
Projects
None yet
Development

No branches or pull requests

4 participants