Vault can't handle properly AWS RDS postgresql multi-az failover #6792

zenathar · 2019-05-28T11:07:51Z

Environment:

Vault Version: 1.1.2
Operating System/Architecture: centos-release-7-2.1511.el7.centos.2.10.x86_64

Vault Config File:

backend "consul" {
  address = "127.0.0.1:8500"
  path = "vault"
}

listener "tcp" {
  address = "x.x.x.x:8200"
  tls_disable = 1
}

ha_backend "consul" {
  api_addr = "http:/[[vault_dns_name]]:8200"
  cluster_addr = "http://[[vault_dns_name]]:8201"
}

cluster_name = "xxx"
ui = false

Startup Log Output:

May 28 11:26:35 [censored] systemd[1]: Started Vault.
May 28 11:26:35 [censored] systemd[1]: Starting Vault...
May 28 11:26:35 [censored] vault[2140]: ==> Vault server configuration:
May 28 11:26:35 [censored] vault[2140]: HA Storage: consul
May 28 11:26:35 [censored] vault[2140]: Api Address: http://[censored]:8200
May 28 11:26:35 [censored] vault[2140]: Cgo: disabled
May 28 11:26:35 [censored] vault[2140]: Cluster Address: https://[censored]:8201
May 28 11:26:35 [censored] vault[2140]: Listener 1: tcp (addr: "[censored]:8200", cluster address: "[censored]:8201", max_request_duration: "1m30s", max
_request_size: "33554432", tls: "disabled")
May 28 11:26:35 [censored] vault[2140]: Log Level: info
May 28 11:26:35 [censored] vault[2140]: Mlock: supported: true, enabled: true
May 28 11:26:35 [censored] vault[2140]: Storage: consul
May 28 11:26:35 [censored] vault[2140]: Version: Vault v1.1.2
May 28 11:26:35 [censored] vault[2140]: Version Sha: 0082501623c0b704b87b1fbc84c2d725994bac54
May 28 11:26:35 [censored] vault[2140]: ==> Vault server started! Log data will stream in below:
May 28 11:26:35 [censored] vault[2140]: 2019-05-28T11:26:35.921Z [WARN]  storage.consul: appending trailing forward slash to path
May 28 11:26:35 [censored] vault[2140]: 2019-05-28T11:26:35.923Z [WARN]  no `api_addr` value specified in config or in VAULT_API_ADDR; falling back to d
etection if possible, but this value should be manually set
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.819Z [INFO]  core: vault is unsealed
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.821Z [INFO]  core.cluster-listener: starting listener: listener_address=[censored]:8201
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.821Z [INFO]  core.cluster-listener: serving cluster requests: cluster_listen_address=[censored]:8201
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.821Z [INFO]  core: entering standby mode
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.844Z [INFO]  core: acquired lock, enabling active operation
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.890Z [INFO]  core: post-unseal setup starting
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.892Z [INFO]  core: loaded wrapping token key
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.892Z [INFO]  core: successfully setup plugin catalog: plugin-directory=
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.896Z [INFO]  core: successfully mounted backend: type=kv path=secret/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.896Z [INFO]  core: successfully mounted backend: type=system path=sys/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.896Z [INFO]  core: successfully mounted backend: type=identity path=identity/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.896Z [INFO]  core: successfully mounted backend: type=database path=database/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.896Z [INFO]  core: successfully mounted backend: type=cubbyhole path=cubbyhole/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.911Z [INFO]  core: successfully enabled credential backend: type=token path=token/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.911Z [INFO]  core: successfully enabled credential backend: type=approle path=approle/
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.911Z [INFO]  core: restoring leases
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.911Z [INFO]  rollback: starting rollback manager
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.918Z [INFO]  identity: entities restored
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.920Z [INFO]  identity: groups restored
May 28 11:27:05 [censored] vault[2140]: 2019-05-28T11:27:05.922Z [INFO]  core: post-unseal setup complete
May 28 11:27:06 [censored] vault[2140]: 2019-05-28T11:27:06.067Z [INFO]  expiration: lease restore complete

Expected Behavior:
After multi-az failover on postgresql on AWS RDS vault should properly generate new credentials when requested.

Actual Behavior:
After multi-az failover vault hangs maximum amount of time (90s) on generating new credentials, then times out. Credentials are properly generated in about 5-20 minutes after failover.
Additionaly, there is no traffic seen in tcpdump for both ip adresses of AWS rds postgresql - old one (before) and new one (after failover).
Issue won't occur on AWS rds mysql

Steps to Reproduce:

Create aws postgresql with multi-az enabled
Execute following statement on newly created database:

CREATE ROLE vault_root ROLE root; GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO vault_root; GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO vault_root; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO vault_root; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO vault_root;'"

Create postgresql database config as described:

/vault read database/config/db
Key                                   Value
---                                   -----
allowed_roles                         [db_rw]
connection_details                    map[max_open_connections:4 connection_url:postgresql://root:*****@[db_url]:5432/db max_connection_lifetime:5s max_idle_connections:-1]
plugin_name                           postgresql-database-plugin
root_credentials_rotate_statements    []

Create postgresql database role as described:

[root@aint2vault01b vault]# ./vault read database/roles/db_rw
Key                      Value
---                      -----
creation_statements      [CREATE ROLE "{{name}}" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}' IN ROLE vault_root; GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "{{name}}"; GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO "{{name}}";]
db_name                 db
default_ttl              1h
max_ttl                  800000h
renew_statements         []
revocation_statements    []
rollback_statements      []

Reboot AWS postgresql database - - tick "Reboot With Failover?" checkbox
Wait about minute-two and try to read database/creds/db_rw

Important Factoids:

References:

The text was updated successfully, but these errors were encountered:

melkorm · 2019-10-02T10:25:05Z

More details on on this issue.

Steps to reproduce

Using https://gist.github.com/melkorm/25ce9f0d3840d29caa3491a47129e00f we can reproduce this by running setup-server.sh then setup-db.sh with correct hostname and password for RDS, after running this commands we can watch for credentials given by vault using watch.sh.
After this we can reboot RDS with failover in AWS console and watch vault logs.

Observations:

It looks like none of configuration parameters can't control time when connection hits driver: bad connection error,
vault behaves correctly if it doesn't run into driver: bad connection state

driver: bad connection logs: https://gist.github.com/melkorm/d6a46b37ba2618222a89c5476d34ab06

If you look trough logs you can find this part:

2019-10-01T15:50:03.346Z [TRACE] secrets.database.database_0f754b69.postgresql-database-plugin: create user: transport=builtin status=finished err="driver: bad connection" took=5m13.5570328s
2019-10-01T15:50:03.347Z [TRACE] secrets.database.database_0f754b69.postgresql-database-plugin: create user: transport=builtin status=finished err="context canceled" took=4m11.3820182s
2019-10-01T15:50:03.349Z [TRACE] secrets.database.database_0f754b69.postgresql-database-plugin: create user: transport=builtin status=finished err="context canceled" took=3m9.3568063s
2019-10-01T15:50:03.349Z [TRACE] secrets.database.database_0f754b69.postgresql-database-plugin: create user: transport=builtin status=finished err="context canceled" took=1m5.1224078s
2019-10-01T15:50:03.351Z [TRACE] secrets.database.database_0f754b69.postgresql-database-plugin: create user: transport=builtin status=finished err="context canceled" took=2m7.2336053s
2019-10-01T15:50:04.857Z [TRACE] secrets.database.database_0f754b69.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=4.5787007s

which shows that vault blocks all incoming new password requests, even if DNS points to new IP and waits for driver: bad connection to happen for X amount of minutes (already hit 10 minutes and 5 minutes).

Below are the logs when vault correctly timeouts and uses new RDS instance:

2019-10-01T15:34:58.571Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.4509824s
2019-10-01T15:35:04.782Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:06.118Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.3357046s
2019-10-01T15:35:08.540Z [DEBUG] rollback: attempting rollback: path=auth/token/
2019-10-01T15:35:08.540Z [DEBUG] rollback: attempting rollback: path=secret/
2019-10-01T15:35:08.540Z [DEBUG] rollback: attempting rollback: path=sys/
2019-10-01T15:35:08.540Z [DEBUG] rollback: attempting rollback: path=cubbyhole/
2019-10-01T15:35:08.540Z [DEBUG] rollback: attempting rollback: path=identity/
2019-10-01T15:35:08.540Z [DEBUG] rollback: attempting rollback: path=database/
2019-10-01T15:35:12.304Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:13.715Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.4107558s
2019-10-01T15:35:19.990Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:21.375Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err="read tcp 172.17.0.2:34286->10.5.8.165:5432: i/o timeout" took=1.3850439s
2019-10-01T15:35:27.494Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:29.499Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err="dial tcp 10.5.8.165:5432: i/o timeout" took=2.0057219s
2019-10-01T15:35:35.631Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:37.637Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err="dial tcp 10.5.8.165:5432: i/o timeout" took=2.0058206s
2019-10-01T15:35:43.754Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:45.762Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err="dial tcp 10.5.8.165:5432: i/o timeout" took=2.0082028s
2019-10-01T15:35:52.002Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:35:54.006Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err="dial tcp 10.5.8.165:5432: i/o timeout" took=2.0043805s
2019-10-01T15:36:00.248Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:36:01.794Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.5458048s
2019-10-01T15:36:07.932Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:36:08.467Z [DEBUG] rollback: attempting rollback: path=secret/
2019-10-01T15:36:08.468Z [DEBUG] rollback: attempting rollback: path=identity/
2019-10-01T15:36:08.468Z [DEBUG] rollback: attempting rollback: path=cubbyhole/
2019-10-01T15:36:08.468Z [DEBUG] rollback: attempting rollback: path=sys/
2019-10-01T15:36:08.468Z [DEBUG] rollback: attempting rollback: path=database/
2019-10-01T15:36:08.469Z [DEBUG] rollback: attempting rollback: path=auth/token/
2019-10-01T15:36:09.247Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.3147729s
2019-10-01T15:36:15.433Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:36:16.892Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.4587123s
2019-10-01T15:36:23.059Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started
2019-10-01T15:36:24.367Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=finished err=<nil> took=1.308189s
2019-10-01T15:36:30.502Z [TRACE] secrets.database.database_3c1d9906.postgresql-database-plugin: create user: transport=builtin status=started

I've tested it with v1.0.1 and v1.2.3 vault versions.

I will dig more into this issue and try to rebuild vault with more debug points to find exactly where timeout is not respected but it would be also nice to get more information from the team so perhaps we can resolve this faster :) Perhaps issue is in https://github.com/lib/pq or in https://github.com/golang/go itself.

PS. Also found that GO https://golang.org/src/database/sql/sql.go#L777 handles driver.ErrBadConn differently that any other error.

tj13 · 2019-10-12T01:11:03Z

Any update about this issue

bandesz · 2019-10-21T10:41:09Z

lib/pq does not handle query timeouts correctly and it can hang for a long time during an AZ failover. Here is the related issue: lib/pq#450.

melkorm · 2019-10-21T11:23:26Z

@bandesz thanks for the link to related issue ! I've discovered this myself and wanted to create similar issue but it's great that it exists already, although after some research it looks like Vault doesn't provide credentials concurrently which causes that application is not able to recover from this error and vault is in broken state - can't serve credentials for this db.

Vault could handle it and accept new requests for credentials while old connection is broken by just allowing for new requests as connection pool can handle it, currently it's not possible, please see https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/vault-tool/GXK3EMW7GGM/ii5DMAFODwAJ

lib/pq#450 this issue is already 3 years old and I guess there are two issues, one related to vault and other to lib/pq.

michelvocks · 2019-11-07T14:00:20Z

Hi @zenathar & @melkorm!

#5731 has been merged recently which looks like a solution for this issue.
I will close this issue for now but please let me know if the issue is still present.

Cheers,
Michel

bandesz · 2019-11-07T14:19:49Z

@michelvocks that seems to be an unrelated issue, can you please link the correct one?

michelvocks · 2019-11-08T08:28:18Z

@bandesz Sorry. It was #5731.

melkorm · 2019-11-17T22:08:32Z

@bandesz @michelvocks Hey, can we actually reopen this as it was a mistake ?

michelvocks · 2019-11-18T08:47:27Z

Hi @melkorm!

Yes. Sorry for the inconvenience!

Cheers,
Michel

tyrannosaurus-becks · 2020-04-01T20:03:06Z

Thanks for providing such clear steps to reproduce this. I'm able to reproduce it locally. I've been testing and I found that if I comment out these lines, the problem disappears. Going to see if I can find an easy workaround for it.

melkorm · 2020-04-01T21:08:16Z

@tyrannosaurus-becks thank you for looking into it 🥇

I think this works as we always open a new connection when asking for credentials so we don't run into broken connections 🤔
but it's weird that we don't run into https://github.com/hashicorp/vault/blob/master/plugins/database/postgresql/postgresql.go#L125 as this lock should be still held when previous query / connection hangs - or am I misunderstanding it ?

I think proper way to fix this issue is to replace / change underlaying Postgres library as it can't handle timeouts correctly so even if we fix that on vault side and it won't block by creating new connections we can run into issue of using all available connections.

PS. I am afraid that there is no simple workaround for it :(

tyrannosaurus-becks · 2020-04-01T21:12:04Z

I have a working branch going here for anyone following along. I've found that if we simply don't cache connections, we eliminate the problem, so I'm working on making a setting for that.

frittentheke · 2020-04-01T21:40:01Z

@tyrannosaurus-becks Even if you do not reuse old connections, the issue would still be on individual connections not timing out (and not receiving any other lower layer error such as a TCP reset). This is what half-open TCP connections are like: silent ;-)

While there is a long outstanding PR to add this lib/pq#792 why not add a timeout wrapper around the function doing the database querying and have that timeout (and potentially retry).

Otherwise you have individual API request that Vault received from its client time out and run into an error - while this then heals things on the next try the client does with a new connections every time as with your PR - it seems kind of unclean to not at least retry once to talk to the SQL backend.

tyrannosaurus-becks · 2020-04-01T21:45:49Z

@melkorm and @frittentheke , what do you think of #8660? I did testing and replicated the issue, and with the linked code, I was able to have no delay with returning creds after failover. Vault already was closing connections if the ping failed, this simply does that every time instead of sometimes.

tyrannosaurus-becks · 2020-04-01T21:46:51Z

Of course with the code, I needed to use a slightly different config:

vault write database/config/mypostgresqldatabase \
    plugin_name=postgresql-database-plugin \
    allowed_roles="my-role" \
    connection_url="postgresql://{{username}}:{{password}}@test-issue-6792.redacted.us-east-1.rds.amazonaws.com:5432/" \
    username="postgres" \
    password="redacted" \
    disable_connection_caching=true

frittentheke · 2020-04-01T22:21:09Z

@tyrannosaurus-becks yes, it's a 99% improvement as it will always use a fresh connection and only an individual query that came in right when the database is being switched might fail.

But would the underlying driver (pq in this case) have the said timeout it would fail and a retry on this individual connection or query could happen as well. But the more I think about it your solution might just be that 99% and in any case be a good addition in options to configure the SQL backend.

melkorm · 2020-04-01T22:31:06Z

@tyrannosaurus-becks I will test it tomorrow 👍

My only concern is that previously when I was testing it the code was hanging on the lock not the actual query, so even with fresh connection we still wait for the lock to be released 🤔 as from my understanding Vault can't provide credentials asynchronously and each API call waits for each other, if you could clarify it it would be great as I can misunderstand something - here you can find more details on this issue https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/vault-tool/GXK3EMW7GGM

tyrannosaurus-becks · 2020-04-01T22:59:00Z

Ah! Thanks! In my testing I didn't encounter the locking issues you mention. I'll be very curious if you encounter them again in your testing.

As for the question in that thread, why does the lock exist, I think it's simply because of the connection being cached. If the lock weren't there, there'd be a race with the connection caching. If you do still encounter locking issues, let me know and I'll scratch my head and see if I can come up with any other options. I could maybe run the client's queries in a goroutine where, if the client doesn't return after a certain amount of time, it releases the lock and moves on....

melkorm · 2020-04-02T16:54:36Z

Hey, so I've gave it more thought and tested it locally and hopefully have some more information.

@tyrannosaurus-becks The patch makes things better, but after running it for a moment and jumping with RDS failovers I ran into this: status=finished err="pq: tuple concurrently updated" took=866.536553ms more logs can be found here: https://gist.github.com/melkorm/86576974ac8d2560981e09a0f2c3440b

also I think that this fix addresses situations when we run into failover before we get the lock e.g.:

We hit vault for credentials A -> it is under failover, we get timeout
we do next API call and we get new connection and freshly baked credentials
any next call also works as we get fresh conns

but if connection failover happens after we get connection and acquire the lock we land back into socket timeout game https://gist.github.com/melkorm/37660c59bc260543aa54e347c9fdb6cb :(

I am thinking why vault even tries to cache connections and locks stuff if Go db plugin handles connection pool and database transactions can handle consistency 🤔

If we could get rid of timeout when we are in the middle of transaction ... but on other hand it fells more like quick-fixing rather than proper solution - perhaps vault could use patched lib/pq library version with timeouts properly implemented ? This is what we thought to do if nothing better happens :/

Cheers, and thank you for working on this 🙏

tyrannosaurus-becks · 2020-04-02T18:53:40Z

That makes sense. I'm catching up to your level of context on the issue. I think I'm going to close the associated PR here because I don't think it gets us to where we need to go.

I was wondering, do you think that if Vault used the DialTimeout method located here, that might help with the hanging issues?

As for the locks, I hear you, if the underlying pq library handles it, there's no reason for us to do the same because it forces synchronous connections unnecessarily.

melkorm · 2020-04-03T08:11:19Z

I was wondering, do you think that if Vault used the DialTimeout method located here, that might help with the hanging issues?

This would resolve issues around connecting to database and currently can be achieved by specifying it on database url level e.g.: postgresql://{{username}}:{{password}}@test-issue-6792.redacted.us-east-1.rds.amazonaws.com:5432/db_name?connect_timeout=1
see here https://github.com/lib/pq/blob/356e267cd3f45ee1303b7d8765e3583cb949950e/conn.go#L341

~~I will think more about locks/pq and update this answer later.~~

So after thinking about for a while I think there are few solutions we could think about.

use patched https://github.com/lib/pq with Added read_timeout and write_timeout lib/pq#792 - this would resolve this issue and perhaps few more and make vault postgres database plugin more stable.
use alternative library for postgres, looking trough https://github.com/lib/pq issues it seems not maintained well enough comparing to alternatives like https://github.com/jackc/pgx and https://github.com/go-pg/pg, those are just examples and seems more high level (ORMs & stuff) but we could pick something more low level which would support timeouts and be more dependable and maintained.
Rethink vault's connection caching and locking http://go-database-sql.org/connection-pool.html reading trough this make sense to leave connection pool to database/sql rather than reimplementing it 🤔 I bet there was a case for this in earlier days of Go but it seems a bit off now.

Let me know what you think @tyrannosaurus-becks

srikiraju · 2020-07-09T05:57:29Z

Can this issue occur with other RDS backends like MySQL? We had a similar issue with RDS MySQL and it might be related to this

aphorise · 2020-09-10T08:05:52Z

@tyrannosaurus-becks - hey what happened to the WIP?

CleverDBA · 2021-02-24T13:08:06Z

Modifying these Vault storage parameters reduces the "5 minute hang" to under a minute, but then you basically lose connection pooling:

connection_url="[whatever your url is]?connect_timeout=5"
max_idle_connections=-1
max_connection_lifetime="1s"

It would be great if Vault handled lost database connections more gracefully. An RDS Postgres failover can happen in a matter of seconds, but vault's default behavior is to hang for 5 or 10 minutes when an RDS Postgres failover occurs.

aphorise · 2022-09-05T15:04:17Z

There's been a lot of changes since 1.1.2 - including for example in the most recent 1.11.x:

database & storage: Change underlying driver library from lib/pq to pgx. This change affects Redshift & Postgres database secrets engines, and CockroachDB & Postgres storage engines [GH-15343]

Hey @zenathar I was interested to know if you've retested this flow or if it's still applicable?

melkorm · 2022-09-25T20:59:54Z

@aphorise Can't really reproduce it atm but looking at pgx code it looks like their are at least trying to handle such cases https://github.com/jackc/pgx/blob/d7c7ddc594209e641b6066b625973e8d7d711142/internal/nbconn/nbconn.go#L62 so in my opinion this issue could be closed and reopened if someone will hit this issue again.

Thank you for replacing pq with pgx as I can imagine it was a lot of work 💪🏼 🎉

hsimon-hashicorp · 2022-10-27T17:45:52Z

As per the last comment, I'm going to go ahead and close this issue now. Please feel free to open a new one as needed. Thanks!

michelvocks closed this as completed Nov 7, 2019

michelvocks reopened this Nov 18, 2019

michelvocks added bug Used to indicate a potential bug secret/database labels Nov 18, 2019

catsby added the version/1.1.x label Nov 18, 2019

tyrannosaurus-becks self-assigned this Mar 31, 2020

tyrannosaurus-becks mentioned this issue Apr 1, 2020

Postgres: Provide option to disable connection caching #8660

Closed

pbernal unassigned tyrannosaurus-becks Apr 8, 2020

pbernal added this to the triaged milestone May 11, 2020

ffung mentioned this issue Dec 24, 2020

operator step-down causes vault to hang when connectivity to postgres is lost #10619

Open

hsimon-hashicorp removed the version/1.1.x label Jul 21, 2021

hsimon-hashicorp added the community-sentiment Tracking high-profile issues from the community label Jan 18, 2022

hsimon-hashicorp removed this from the triaged milestone Jan 18, 2022

hsimon-hashicorp closed this as completed Oct 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vault can't handle properly AWS RDS postgresql multi-az failover #6792

Vault can't handle properly AWS RDS postgresql multi-az failover #6792

zenathar commented May 28, 2019 •

edited

melkorm commented Oct 2, 2019

tj13 commented Oct 12, 2019

bandesz commented Oct 21, 2019

melkorm commented Oct 21, 2019

michelvocks commented Nov 7, 2019 •

edited

bandesz commented Nov 7, 2019

michelvocks commented Nov 8, 2019

melkorm commented Nov 17, 2019

michelvocks commented Nov 18, 2019

tyrannosaurus-becks commented Apr 1, 2020

melkorm commented Apr 1, 2020 •

edited

tyrannosaurus-becks commented Apr 1, 2020

frittentheke commented Apr 1, 2020 •

edited

tyrannosaurus-becks commented Apr 1, 2020

tyrannosaurus-becks commented Apr 1, 2020

frittentheke commented Apr 1, 2020

melkorm commented Apr 1, 2020 •

edited

tyrannosaurus-becks commented Apr 1, 2020 •

edited

melkorm commented Apr 2, 2020

tyrannosaurus-becks commented Apr 2, 2020

melkorm commented Apr 3, 2020 •

edited

srikiraju commented Jul 9, 2020

aphorise commented Sep 10, 2020

CleverDBA commented Feb 24, 2021

aphorise commented Sep 5, 2022 •

edited

melkorm commented Sep 25, 2022

hsimon-hashicorp commented Oct 27, 2022

Vault can't handle properly AWS RDS postgresql multi-az failover #6792

Vault can't handle properly AWS RDS postgresql multi-az failover #6792

Comments

zenathar commented May 28, 2019 • edited

melkorm commented Oct 2, 2019

More details on on this issue.

Steps to reproduce

tj13 commented Oct 12, 2019

bandesz commented Oct 21, 2019

melkorm commented Oct 21, 2019

michelvocks commented Nov 7, 2019 • edited

bandesz commented Nov 7, 2019

michelvocks commented Nov 8, 2019

melkorm commented Nov 17, 2019

michelvocks commented Nov 18, 2019

tyrannosaurus-becks commented Apr 1, 2020

melkorm commented Apr 1, 2020 • edited

tyrannosaurus-becks commented Apr 1, 2020

frittentheke commented Apr 1, 2020 • edited

tyrannosaurus-becks commented Apr 1, 2020

tyrannosaurus-becks commented Apr 1, 2020

frittentheke commented Apr 1, 2020

melkorm commented Apr 1, 2020 • edited

tyrannosaurus-becks commented Apr 1, 2020 • edited

melkorm commented Apr 2, 2020

tyrannosaurus-becks commented Apr 2, 2020

melkorm commented Apr 3, 2020 • edited

srikiraju commented Jul 9, 2020

aphorise commented Sep 10, 2020

CleverDBA commented Feb 24, 2021

aphorise commented Sep 5, 2022 • edited

melkorm commented Sep 25, 2022

hsimon-hashicorp commented Oct 27, 2022

zenathar commented May 28, 2019 •

edited

michelvocks commented Nov 7, 2019 •

edited

melkorm commented Apr 1, 2020 •

edited

frittentheke commented Apr 1, 2020 •

edited

melkorm commented Apr 1, 2020 •

edited

tyrannosaurus-becks commented Apr 1, 2020 •

edited

melkorm commented Apr 3, 2020 •

edited

aphorise commented Sep 5, 2022 •

edited