Dynamic secrets: maximum revoke attempts hardcoded #8602

Open
klebediev opened this issue Mar 21, 2020 · 3 comments

@klebediev

klebediev commented Mar 21, 2020

Is your feature request related to a problem? Please describe.
While stress-testing the Vault database secrets engine, I found that if the database is unavailable for about 20 minutes (which can happen during planned maintenance, connectivity issues, etc.), Vault gives up and stops trying to revoke the credentials after 6 failed attempts:

2020-03-19T16:13:25.939+0200 [ERROR] expiration: failed to revoke lease: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp error="failed to revoke entry: resp: (*logical.Response)(nil) err: dial tcp 127.0.0.1:3306: connect: operation timed out"
2020-03-19T16:16:07.379+0200 [ERROR] expiration: failed to revoke lease: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp error="failed to revoke entry: resp: (*logical.Response)(nil) err: dial tcp 127.0.0.1:3306: connect: operation timed out"
2020-03-19T16:18:58.314+0200 [ERROR] expiration: failed to revoke lease: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp error="failed to revoke entry: resp: (*logical.Response)(nil) err: dial tcp 127.0.0.1:3306: connect: operation timed out"
2020-03-19T16:22:09.068+0200 [ERROR] expiration: failed to revoke lease: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp error="failed to revoke entry: resp: (*logical.Response)(nil) err: dial tcp 127.0.0.1:3306: connect: operation timed out"
2020-03-19T16:25:59.641+0200 [ERROR] expiration: failed to revoke lease: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp error="failed to revoke entry: resp: (*logical.Response)(nil) err: dial tcp 127.0.0.1:3306: connect: operation timed out"
2020-03-19T16:31:11.504+0200 [ERROR] expiration: failed to revoke lease: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp error="failed to revoke entry: resp: (*logical.Response)(nil) err: dial tcp 127.0.0.1:3306: connect: operation timed out"
2020-03-19T16:36:31.507+0200 [ERROR] expiration: maximum revoke attempts reached: lease_id=database/creds/admin_devops_ldap_mysql_tst/U8ih3yEJSdjeRgBqtkfAnlAp

If I understand correctly, this means some transient database users will live forever (correct me if I'm wrong).
Reason: the corresponding retry parameters are hardcoded.

Describe the solution you'd like

  • Make maximum revocation attempts configurable (globally? per backend?)

@KohanS

KohanS commented Sep 21, 2020

Maybe the solution is to try to revoke creds after DB availability is restored?

@bitfactory-henno-schooljan

There really should be an option to let Vault keep trying, perhaps with an increasing backoff time to prevent overload, so that these credentials are eventually deleted after a database outage. Right now we are forced to garbage-collect them with an external process every time this maximum is reached.
Making the value configurable should do it; 0 or -1 would then mean infinite retries.

@bitfactory-henno-schooljan

Actually, when I restart Vault after such an incident it attempts to revoke these old expired leases again. So at least that is a way for me to clean them up.
