(PDB-1674) PuppetDB hangs on shutdown when the database is down #1504
This just needs docs for the new connection-timeout config.
Test FAILed.
@pljenkinsro retest this please |
this alignment looks off
Maybe the comment is what throws it off? I reformatted it and nothing changed.
Test FAILed.
This commit adds connection timeouts to the command processing and query connection pools. This fixes the problem where the ActiveMQ session can't be shut down properly because the command connection pool can't get a connection to the database (i.e. the database is down). Previously this would hang, even with a Control-C, because the connection pool software got caught in a loop trying to get a connection to the database.

The connection timeout was also applied to the query side of the application, as user-provided queries are expected to respond quickly and hanging would be bad there as well.

This blocking/retrying behavior is still what we want when PuppetDB first starts up, as we need to initialize the schema etc. This commit therefore also creates a special connection pool (just for startup) that doesn't have these timeouts. This pool is closed after the startup code that needs a database connection has finished.
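To make the difference between the two kinds of pool concrete, here is a minimal Python sketch. It is not PuppetDB's code: `SketchPool`, `PoolTimeout`, `shutdown`, and `database_comes_up` are hypothetical names, and a `queue.Queue` stands in for the real connection pool. It only illustrates the idea that a pool with a timeout hands control back to the caller when the database is down, while a pool without one keeps waiting, which is the behavior wanted at startup.

```python
import queue
import threading


class PoolTimeout(Exception):
    """Raised when no connection becomes available within the timeout."""


class SketchPool:
    """Toy pool (hypothetical). timeout=None means wait until a connection appears."""

    def __init__(self, timeout=None):
        self._idle = queue.Queue()
        self._timeout = timeout

    def release(self, conn):
        self._idle.put(conn)

    def acquire(self):
        try:
            # With a timeout the caller regains control; without one it blocks.
            return self._idle.get(timeout=self._timeout)
        except queue.Empty:
            raise PoolTimeout(f"no connection within {self._timeout}s") from None


def shutdown(pool):
    """Stand-in for shutdown work that needs a database connection."""
    try:
        conn = pool.acquire()
    except PoolTimeout:
        # Database is down: give up on the connection instead of hanging.
        return "gave up, continuing shutdown"
    pool.release(conn)
    return "flushed"


def database_comes_up(pool, delay):
    """Simulate the database becoming reachable after `delay` seconds."""
    threading.Timer(delay, pool.release, args=("conn",)).start()
```

With an empty `SketchPool(timeout=0.05)`, `shutdown` returns after the timeout rather than blocking. A `SketchPool(timeout=None)` combined with `database_comes_up` models the startup pool: `acquire` simply waits until the connection is there.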
da04468 to 1c37e8a
Added some docs and switched the query-side connection timeout from 2ms to 500ms. Travis is failing with connection timeouts, but only on PG. I'm wondering if the issue is lazy init: the first poll for a database connection needs to init the pool, which maybe takes more than 2ms?
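The lazy-init guess above can be sketched as follows. This is only an illustration of the hypothesis, not a diagnosis of the Travis failure: `LazyPool` is a hypothetical class, and the 50ms setup cost is an invented number standing in for whatever the real pool does on first use.

```python
import time


class LazyPool:
    """Toy pool (hypothetical) that pays a one-time setup cost on first acquire."""

    def __init__(self, timeout, init_cost):
        self._timeout = timeout      # seconds the caller is willing to wait
        self._init_cost = init_cost  # simulated one-time setup time, in seconds
        self._conn = None

    def acquire(self):
        deadline = time.monotonic() + self._timeout
        if self._conn is None:
            # The first call initializes the pool inside the timed window.
            time.sleep(self._init_cost)
            self._conn = "conn"
        if time.monotonic() > deadline:
            raise TimeoutError("connection timeout")
        return self._conn
```

Under these assumptions, a pool with a very short timeout fails on its first `acquire` and succeeds on the next one, because the setup has already happened by then. A 500ms timeout leaves room for the setup and succeeds the first time.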
Test FAILed.
1 transient failing cell ^^^
Not sure which test is failing, but I also wouldn't be surprised if some tests might need to be adjusted to account for these changes, i.e. a query EAGAIN is now more "normal".
Perhaps wrap this (and below)?
Meh, I do like the wrap there, but the stuff around it is not wrapped. The whole file goes in and out of it, but specifically those sections are not. Probably something we should just do on all files at once if we want the change.
This all looks good to me, works fine, and I liked the commit breakdown too.
…hung-when-db-down (PDB-1674) PuppetDB hangs on shutdown when the database is down