Consul template unlimitely spawn TCP connections which DDoS Consul/Vault #1840

Closed
weichuliu opened this issue Nov 17, 2023 · 1 comment · Fixed by #1858

Comments

@weichuliu

We are using Vault Agent to render templates.

It looks like consul-template issues a separate query to Vault for each secret referenced in a template, and starts all of those queries simultaneously. As a result, rendering even a simple template can amount to a DoS against Vault.

For example, when consul-template renders a file with 10K secrets, the single process opens about 10K concurrent TCP connections to Vault at once.

In our environment, 20 pods can easily drive Vault to 700K goroutines, and this may have crashed the Vault quorum.

Consul Template version

I am using Vault Agent with Vault version 1.15.0.

Configuration

config.hcl (passed to the agent as /tmp/test.hcl in the script below):

vault {
  address = "http://localhost:8200"
}
auto_auth {
  method {
    type = "token_file"
    config = {
      token_file_path = "/tmp/token" # root
    }
  }
}
log_level = "info"
exit_after_auth = true
template_config {
  exit_on_retry_failure = true
}
template {
  source      = "/tmp/test.gotmpl"
  destination = "/tmp/properties"
}

/tmp/test.gotmpl: this lists the secret/test-kv path and renders every secret under that path to the output file:

{{ range secrets "secret/metadata/test-kv" }}
{{ $i := . }}
{{     spew_printf "secret/test-kv/%s\n" $i }}
{{     scratch.MapSet "properties" . ( printf "secret/test-kv/%s" $i ) }}
{{ end }}
{{ range $key, $fullPath := scratch.Get "properties" -}}
{{   with secret $fullPath -}}
{{     if sprig_hasKey .Data.data "data" -}}
{{       $value := .Data.data.data -}}
{{         $key }}={{ $value | trimSpace | replaceAll "\n" "\\n" }}
{{     end -}}
{{   end -}}
{{ end -}}

/tmp/token: content is root.

Command

  1. Set up a dev Vault server:
vault server -dev -dev-root-token-id=root
  2. Write 10000 secrets to test-kv:
export VAULT_ADDR=http://127.0.0.1:8200 VAULT_TOKEN=root; for i in {1..10000}; do vault kv put -mount secret test-kv/$i data=$(date +%s); done
  3. Track the TCP connections during rendering with a Python script (run-template.py):
import subprocess
import time


def count_tcp(pid):
    """Count occurrences of 'TCP' in the lsof output for the given process."""
    out = subprocess.run(['lsof', '-p', str(pid)], stdout=subprocess.PIPE).stdout.decode()
    return out.count('TCP')


# PID of the dev Vault server; take the first match in case pgrep returns several.
pgrep = subprocess.run(['pgrep', '-f', 'vault.server'], stdout=subprocess.PIPE)
vault_server_pid = int(pgrep.stdout.decode().split()[0])

for i in [0]:
    start = time.time()
    # Run the agent once, sending its output to per-run files.
    with open(f"/tmp/render_{i}.out", 'w') as out, open(f"/tmp/render_{i}.err", 'w') as err:
        p = subprocess.Popen(['vault', 'agent', '-config=/tmp/test.hcl'], stdout=out, stderr=err)

        # Sample both processes roughly every 0.2 s until the agent exits.
        while p.poll() is None:
            time.sleep(0.2)
            t = time.time() - start
            tcp_connections_server = count_tcp(vault_server_pid)
            tcp_connections_client = count_tcp(p.pid)
            print(f"{t:.4} server_tcp:{tcp_connections_server} clien_tcp:{tcp_connections_client}")

Debug output

During rendering, the Vault Agent process holds roughly 15K to 17K TCP connections open (peaking at 17816 in the run below), which is even more than the number of secrets (10000).

$ python run-template.py
0.2108 server_tcp:2 clien_tcp:1
0.474 server_tcp:2 clien_tcp:1
0.7608 server_tcp:2 clien_tcp:1
1.053 server_tcp:2 clien_tcp:1
1.335 server_tcp:2 clien_tcp:1
1.618 server_tcp:2 clien_tcp:1
1.903 server_tcp:2 clien_tcp:1
2.185 server_tcp:2 clien_tcp:1
2.464 server_tcp:103 clien_tcp:101
2.77 server_tcp:1639 clien_tcp:4807
3.166 server_tcp:3070 clien_tcp:14458
3.649 server_tcp:3663 clien_tcp:15299
4.027 server_tcp:3663 clien_tcp:15296
4.422 server_tcp:3663 clien_tcp:15308
4.785 server_tcp:3663 clien_tcp:15589
5.161 server_tcp:3663 clien_tcp:15877
5.542 server_tcp:3663 clien_tcp:15792
5.924 server_tcp:3663 clien_tcp:15300
6.302 server_tcp:3663 clien_tcp:15298
6.684 server_tcp:3663 clien_tcp:15466
7.056 server_tcp:3663 clien_tcp:15929
7.426 server_tcp:3663 clien_tcp:16429
7.801 server_tcp:3663 clien_tcp:16533
8.189 server_tcp:3663 clien_tcp:16903
8.563 server_tcp:3663 clien_tcp:16799
8.943 server_tcp:3663 clien_tcp:16664
9.327 server_tcp:3663 clien_tcp:16736
9.723 server_tcp:3663 clien_tcp:16461
10.11 server_tcp:3663 clien_tcp:16244
10.5 server_tcp:3663 clien_tcp:16125
10.9 server_tcp:3663 clien_tcp:16261
11.28 server_tcp:3663 clien_tcp:15831
11.68 server_tcp:3663 clien_tcp:16199
12.06 server_tcp:3663 clien_tcp:16734
12.45 server_tcp:3663 clien_tcp:17137
12.85 server_tcp:3663 clien_tcp:17506
13.26 server_tcp:3663 clien_tcp:17228
13.67 server_tcp:3663 clien_tcp:17088
14.06 server_tcp:3663 clien_tcp:16948
14.44 server_tcp:3663 clien_tcp:16433
14.81 server_tcp:3663 clien_tcp:16290
15.19 server_tcp:3663 clien_tcp:16653
15.57 server_tcp:3663 clien_tcp:16877
15.95 server_tcp:3663 clien_tcp:17185
16.33 server_tcp:3663 clien_tcp:16927
16.69 server_tcp:3663 clien_tcp:16139
17.07 server_tcp:3663 clien_tcp:16262
17.45 server_tcp:3663 clien_tcp:16481
17.84 server_tcp:3663 clien_tcp:16626
18.22 server_tcp:3663 clien_tcp:16937
18.6 server_tcp:3663 clien_tcp:16929
18.98 server_tcp:3663 clien_tcp:16894
19.36 server_tcp:3663 clien_tcp:16984
19.75 server_tcp:3663 clien_tcp:16862
20.12 server_tcp:3663 clien_tcp:16407
20.5 server_tcp:3663 clien_tcp:16146
20.87 server_tcp:3663 clien_tcp:15806
21.25 server_tcp:3663 clien_tcp:16151
21.63 server_tcp:3663 clien_tcp:16021
22.01 server_tcp:3663 clien_tcp:16061
22.38 server_tcp:3663 clien_tcp:16393
22.77 server_tcp:3663 clien_tcp:17089
23.16 server_tcp:3663 clien_tcp:17655
23.53 server_tcp:3663 clien_tcp:17791
23.91 server_tcp:3663 clien_tcp:17816
24.29 server_tcp:3663 clien_tcp:17760
24.67 server_tcp:3663 clien_tcp:17505
25.05 server_tcp:3663 clien_tcp:16939
25.43 server_tcp:3663 clien_tcp:16932
25.81 server_tcp:3663 clien_tcp:17256
26.2 server_tcp:3663 clien_tcp:17508
26.57 server_tcp:3663 clien_tcp:16983
26.95 server_tcp:3663 clien_tcp:16942
27.33 server_tcp:3663 clien_tcp:16618
27.71 server_tcp:3663 clien_tcp:16841
28.08 server_tcp:3663 clien_tcp:17237
28.46 server_tcp:3663 clien_tcp:16840
28.84 server_tcp:3663 clien_tcp:17075
29.21 server_tcp:3663 clien_tcp:16581
29.59 server_tcp:3663 clien_tcp:16751
29.97 server_tcp:3663 clien_tcp:17120
30.35 server_tcp:3663 clien_tcp:16949
30.73 server_tcp:3663 clien_tcp:16575
31.11 server_tcp:3663 clien_tcp:16627
31.49 server_tcp:3663 clien_tcp:16478
31.87 server_tcp:3663 clien_tcp:16718
32.27 server_tcp:3663 clien_tcp:16735
32.66 server_tcp:3663 clien_tcp:16074
33.03 server_tcp:3663 clien_tcp:9753
33.42 server_tcp:3663 clien_tcp:9762
33.78 server_tcp:3663 clien_tcp:9762
34.14 server_tcp:3663 clien_tcp:6932
34.49 server_tcp:3663 clien_tcp:7048
34.86 server_tcp:3663 clien_tcp:7128
35.19 server_tcp:3663 clien_tcp:7603
35.52 server_tcp:3663 clien_tcp:7960
35.85 server_tcp:3663 clien_tcp:7976
36.22 server_tcp:3663 clien_tcp:7976
36.55 server_tcp:3663 clien_tcp:7989
36.89 server_tcp:3268 clien_tcp:2973
37.27 server_tcp:2 clien_tcp:0
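
To make the run above easier to summarize, here is a small sketch that parses output in this format and reports the peaks. The names `LINE_RE` and `summarize` are hypothetical helpers, not part of the original script; the only assumption is the line format shown above, including the `clien_tcp` label as printed.

```python
import re

# Matches lines such as "2.77 server_tcp:1639 clien_tcp:4807".
# "clien_tcp" is the label the monitoring script actually prints.
LINE_RE = re.compile(
    r"^(?P<t>\d+(?:\.\d+)?) server_tcp:(?P<server>\d+) clien_tcp:(?P<client>\d+)$"
)


def summarize(log_text):
    """Return sample count, peak server/client TCP counts, and the time of the client peak."""
    samples = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # silently skip non-sample lines such as the shell prompt
            samples.append((float(m["t"]), int(m["server"]), int(m["client"])))
    if not samples:
        raise ValueError("no samples found")
    peak_t, _, peak_client = max(samples, key=lambda s: s[2])
    return {
        "samples": len(samples),
        "peak_server": max(s[1] for s in samples),
        "peak_client": peak_client,
        "peak_client_at": peak_t,
    }
```

Applied to the output above, this reports a client-side peak of 17816 at 23.91 s and a server-side peak of 3663.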

Expected behavior

consul-template should use an HTTP connection pool that limits the number of concurrent requests it sends to Vault.
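
As an illustration of the kind of limit being asked for, here is a minimal sketch in Python (not consul-template's Go code, and not the actual fix). It simulates the fan-out with `asyncio` and caps in-flight requests with a semaphore. `render`, `fetch_secret`, `peak_requests`, and the limit of 10 are all hypothetical, and the request to Vault is replaced by a short sleep.

```python
import asyncio


async def render(num_secrets, max_in_flight=None):
    """Simulate fetching num_secrets secrets; return the peak number of concurrent requests.

    max_in_flight=None mimics the unbounded fan-out described in this issue.
    """
    # Hypothetical limiter: a semaphore stands in for a bounded connection pool.
    limiter = asyncio.Semaphore(max_in_flight) if max_in_flight else None
    in_flight = 0
    peak = 0

    async def fetch_secret(path):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        try:
            await asyncio.sleep(0.001)  # stand-in for one HTTP request to Vault
            return path
        finally:
            in_flight -= 1

    async def limited_fetch(path):
        if limiter is None:
            return await fetch_secret(path)
        async with limiter:  # wait here until a slot is free
            return await fetch_secret(path)

    paths = [f"secret/test-kv/{i}" for i in range(num_secrets)]
    await asyncio.gather(*(limited_fetch(p) for p in paths))
    return peak


def peak_requests(num_secrets, max_in_flight=None):
    """Synchronous wrapper around render()."""
    return asyncio.run(render(num_secrets, max_in_flight))
```

Calling `peak_requests(200)` lets every simulated request start at once, while `peak_requests(200, 10)` never has more than 10 in flight. A real client would more likely enforce the cap in its HTTP transport than in the template engine, but the effect on the server is the same.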

Actual behavior

consul-template starts as many concurrent requests as there are secrets to render, which DoSes the server.

@ccapurso
Contributor

ccapurso commented Jan 3, 2024

The new configuration parameter has been introduced to Vault Agent and can be found in hashicorp/vault#24548.
