-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-37249-v25: backport transaction-level-pooling compatibility to v25 #770
Conversation
Apparently "#region ... #endregion" is used to trigger some special linter behavior, and I don't want to fight the battle of learning to disable it.
This was probably a bug that we've never happened to trigger.
This addresses a long-standing TODO comment in Database.query (dating back to the introduction Session and our first real attempt to reduce connection contention). In some cases this might seem to be transforming lazy iteration over database results into aggressive fetching, via calls to fetchall(), but for at least PostgreSQL it seems we were already doing _some_ aggressive fetching before without realizing it, see e.g. https://docs.sqlalchemy.org/en/14/core/connections.html#using-server-side-cursors-a-k-a-stream-results So what we're really doing in the new fetchalls() here is invoking SQLAlchemy's client-side row processing more aggressively. I don't think that's a big change, and in exchange we can now guarantee that we never return iterators to user code that are responsible for closing a connection when the user is "done" with them - a problematic definition for any garbage-collected entity, but a particular problem for iterators that might not ever be exhausted. This also include some typing adjustments for SQLAlchemy objects. Those aren't actually checked yet, since typing support in SQLAlchemy won't be out until it's 2.0 release.
Temporary tables are now always created inside transactions, and via the temporary_table context-manager-returning method, which has been moved to Database from Session. Session has been removed. This should bring us into compatibility with pgbouncer transaction-level pooling - we won't care if a connection we hold actually gets multiplexed onto a different database connection, as long as it doesn't happen during transactions.
Each of the gazillion small queries we run a butler startup was previously grabbing a new connection and then returning it. That was probably wasn't too bad, since it should have just been getting them from the pool (for PostgreSQL) or opening a local file (for SQLite), but this should still be better in both cases. It should also remove a lot of rollbacks, since SQLAlchemy emits those whenever connections are returned to the pool. SQLAlchemy's docs say those shouldn't matter in terms of performance, but they're still noisy.
Now that we're careful to consistently use the same connection, we shouldn't ever need more.
Making SQLite temp tables inside transactions leads to very long exclusive locks during QG generation, causing timeouts in (at least) ci_cpp.
We now have public final session() and transaction() methods that return nothing and protected _session() and _transaction() methods that return connections, reducing the need to assert on self._session_connection being not None and avoiding returning things to users that they don't need.
Codecov ReportBase: 85.33% // Head: 85.35% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## v25.0.x #770 +/- ##
===========================================
+ Coverage 85.33% 85.35% +0.02%
===========================================
Files 260 260
Lines 34535 34579 +44
Branches 5813 5854 +41
===========================================
+ Hits 29469 29516 +47
+ Misses 3823 3822 -1
+ Partials 1243 1241 -2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Checklist