Skip to content

(PDB-1674) PuppetDB hangs on shutdown when the database is down #1504

Merged
rbrw merged 2 commits intopuppetlabs:stablefrom
senior:ticket/stable/pdb-1674-shutdown-hung-when-db-down
Jul 2, 2015
Merged

(PDB-1674) PuppetDB hangs on shutdown when the database is down #1504
rbrw merged 2 commits intopuppetlabs:stablefrom
senior:ticket/stable/pdb-1674-shutdown-hung-when-db-down

Conversation

@senior
Copy link
Contributor

@senior senior commented Jun 30, 2015

This adds connection timeouts to the command processing and query
connection pools. This fixes the problem where the ActiveMQ session
can't be properly shutdown because the command connection pool can't get
a connection to the database (i.e. the database is down). Previously
this would hang, even with a Control-C. This hang was due to the
connection pool software getting caught in a loop trying to get a
connection to the database.

This connection timeout was also applied to the query side of the
application as user provided queries are expected to respond quickly and
a hanging behavior would be bad there as well.

This blocking/retrying behavior is what we want when PuppetDB first
starts up as we need to initialize the schema etc. This commit also
creates a special connection pool (just for startup) that doesn't have
these timeouts. This pool is closed after the startup code that needs a
database connection is finished.

@senior
Copy link
Contributor Author

senior commented Jun 30, 2015

This just needs docs for the new connection-timeout config

@senior senior changed the title PuppetDB hangs on shutdown when the database is down (PDB-1674) PuppetDB hangs on shutdown when the database is down Jun 30, 2015
@pljenkinsro
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.puppetlabs.com/job/platform_puppetdb_intn-sys_pr/1573/

@senior
Copy link
Contributor Author

senior commented Jun 30, 2015

@pljenkinsro retest this please

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this alignment looks off

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the comment is what throws it off? I reformatted it and nothing changed.

@ajroetker ajroetker added this to the 3.0.0 (stable) milestone Jun 30, 2015
@pljenkinsro
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.puppetlabs.com/job/platform_puppetdb_intn-sys_pr/1575/

Ryan Senior added 2 commits July 1, 2015 07:18
This commit adds connection timeouts to the command processing and query
connection pools. This fixes the problem where the ActiveMQ session
can't be properly shutdown because the command connection pool can't get
a connection to the database (i.e. the database is down). Previously
this would hang, even with a Control-C. This hang was due to the
connection pool software getting caught in a loop trying to get a
connection to the database.

This connection timeout was also applied to the query side of the
application as user provided queries are expected to respond quickly and
a hanging behavior would be bad there as well.

This blocking/retrying behavior is what we want when PuppetDB first
starts up as we need to initialize the schema etc. This commit also
creates a special connection pool (just for startup) that doesn't have
these timeouts. This pool is closed after the startup code that needs a
database connection is finished.
@senior senior force-pushed the ticket/stable/pdb-1674-shutdown-hung-when-db-down branch from da04468 to 1c37e8a Compare July 1, 2015 12:18
@senior
Copy link
Contributor Author

senior commented Jul 1, 2015

Added some docs, switched from 2ms connection timeout on the query side to 500ms. Travis is failing with connection timeouts, only on PG. I'm wondering if the issue it's having is lazy init. The first poll for a database connection needs to init the pool, which maybe takes more than 2ms?

@pljenkinsro
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.puppetlabs.com/job/platform_puppetdb_intn-sys_pr/1592/

@senior
Copy link
Contributor Author

senior commented Jul 1, 2015

1 transient failing cell ^^^

@rbrw
Copy link
Contributor

rbrw commented Jul 1, 2015

Not sure which test is failing, but I also wouldn't be surprised if some tests might need to be adjusted to account for these changes, i.e. a query EAGAIN is now more "normal".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps wrap this (and below)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, I do like the wrap there, but the stuff around it is not wrapped. The whole file goes in and out of it, but specifically those sections are not. Probably something we should just do on all files at once if we want the change.

@rbrw
Copy link
Contributor

rbrw commented Jul 1, 2015

This all looks good to me, works fine, and I liked the commit breakdown too.

rbrw added a commit that referenced this pull request Jul 2, 2015
…hung-when-db-down

(PDB-1674) PuppetDB hangs on shutdown when the database is down
@rbrw rbrw merged commit 0bcab03 into puppetlabs:stable Jul 2, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants