Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-44050: Configure Postgres connection pool parameters #1001

Merged
merged 2 commits into from Apr 25, 2024

Conversation

dhirving
Copy link
Contributor

@dhirving dhirving commented Apr 24, 2024

Adjusted SQLAlchemy's connection parameters for postgres:

  • Enabled pool_pre_ping to prevent exceptions from being thrown due to stale DB connections.
  • Increased the pool size from 1 to 10, to prevent connections from being re-created excessively in multi-threaded services.

The pool_size increase is less risky than it at first appears... only a handful of services are constructing Butlers in a way that causes more than 1 connection to ever be created in the pool:

  • At SLAC, it's just the Butler server which has only 1 instance and is rarely used currently.
  • At the summit, it is only exposurelog. This service is pretty low concurrency. They are going to do a Butler update soon to get pool_pre_ping, so I will let them know to watch out for other side effects in their testing.
  • If changing this breaks IDF we want to know that now, because it means we likely have serious scalability problems.

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes
  • (if changing dimensions.yaml) make a copy of dimensions.yaml in configs/old_dimensions

@dhirving dhirving force-pushed the tickets/DM-44050 branch 2 times, most recently from e586dff to 4dc22d0 Compare April 24, 2024 21:21
Copy link

codecov bot commented Apr 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.04%. Comparing base (29714cd) to head (613a1ed).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1001   +/-   ##
=======================================
  Coverage   89.04%   89.04%           
=======================================
  Files         347      347           
  Lines       44473    44473           
  Branches     9146     9146           
=======================================
  Hits        39601    39601           
  Misses       3550     3550           
  Partials     1322     1322           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@andy-slac andy-slac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I'm a bit concerned with a hard-coded pool_size value, it could happen that for some services we may want to play with this parameter. It would be nice to have an option for updating that on client side, but our current model does not support it.

Turn on the SQLAlchemy pool_pre_ping option so that it checks that a database connection is live before using it.

This mitigates an issue where database exceptions would be thrown from the Butler when re-using old connections from the pool that had gone stale.
Previously, the pool size for postgres DB connections was set to 1.  This was causing SQLAlchemy to create a new connection from scratch any time there was more than one thread simultaneously using Butlers sharing a connection pool (as is typical in service use cases.)
@dhirving dhirving merged commit ecffc00 into main Apr 25, 2024
18 checks passed
@dhirving dhirving deleted the tickets/DM-44050 branch April 25, 2024 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants