Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Microsoft.Data.Sqlite: Pool connections #25018

Merged
merged 1 commit into from
Aug 12, 2021
Merged

Conversation

bricelam
Copy link
Contributor

@bricelam bricelam commented Jun 1, 2021

Resolves #13837, fixes #25101, fixes #25078

TechEmpower Fortunes benchmark results on aspnet-citrine-win:

  5.0 6.0
Requests/sec 3,473 25,510
Mean latency (ms) 73.56 10.04
CPU Usage (%) 93 99

/cc @sebastienros

@bricelam bricelam requested a review from a team June 1, 2021 22:56
@davidfowl
Copy link
Member

cc @sebastienros


_state = ConnectionState.Closed;
OnStateChange(new StateChangeEventArgs(ConnectionState.Open, ConnectionState.Closed));
}

internal void Deactivate()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider resetting other connection state too. The EF Core tests were leaking PRAGMA foreign_keys = 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that seems important - state leaks via the pool are dangerous.

Perf-wise, in Npgsql, when a connection is returned to the pool, the SQL to reset its state is added to its outgoing buffer, but not executed (that would be an extra network roundtrip). Instead, when the connection is next rented out, the reset SQL will be prepended to the user's SQL in a batch.

I don't know what the overhead of a "roundtrip" is in Sqlite (certainly less important than a remote database server 🤣), but this could still well be worth doing. It may require a bit of bookkeeping but is otherwise pretty easy.

@bricelam

This comment has been minimized.

var connectionStringBuilder = new SqliteConnectionStringBuilder(GetValidatedConnectionString())
{
Mode = SqliteOpenMode.ReadOnly,
Pooling = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be non-pooling? I get that it's used by Exists which probably shouldn't pool, but this is also a public method for creating read-only connections...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, it's internal. Probably should have been inlined in the first place...


_state = ConnectionState.Closed;
OnStateChange(new StateChangeEventArgs(ConnectionState.Open, ConnectionState.Closed));
}

internal void Deactivate()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that seems important - state leaks via the pool are dangerous.

Perf-wise, in Npgsql, when a connection is returned to the pool, the SQL to reset its state is added to its outgoing buffer, but not executed (that would be an extra network roundtrip). Instead, when the connection is next rented out, the reset SQL will be prepended to the user's SQL in a batch.

I don't know what the overhead of a "roundtrip" is in Sqlite (certainly less important than a remote database server 🤣), but this could still well be worth doing. It may require a bit of bookkeeping but is otherwise pretty easy.

src/Microsoft.Data.Sqlite.Core/SqliteConnection.cs Outdated Show resolved Hide resolved

int rc;

if (_collations != null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blocks removes the collations/functions/aggregates right?

If so, it may be worth considering allowing these to persist across pooled open/close... I don't know what the overhead is of continuously tearing these down and creating them again, but in a short-lived connection scenario (e.g. web) it may be a lot.

For comparison, in Npgsql prepared statements persist by default (not destroyed when the connection is returned to the pool), which is really necessary for good perf.

Could be handled later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly worried about memory leaks here--the delegates would hang around a lot longer than the connection you added them to if we held onto them in the pool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's similar to how prepared statements work (but with server memory resources). At least for prepared statements, it seems pretty clear that an application would typically execute the same SQL across pooled connections, and forcing it to re-prepare every time after renting a connection would negate a lot of the perf gains.

I guess the main question here - what's the overhead of tearing these down and recreating them for each pooled connection? If it's a very light operation it indeed doesn't matter, but otherwise it could be important. Could also be an opt-in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume (possibly a bad idea) that it's pretty cheap--it's just adding and removing function pointers to a dictionary.

{
lock (this)
{
if (_pool == null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got the pruning logic right, the first time prune happens, if the pool is empty (e.g. because its connections were pruned), its state is set to Idle. If it stays that way for the next prune, it becomes disabled, and at this point we'd return null from GetPool.

So basically if I do a single operation on a pool, and then go do something else for a long time, the pool will eventually become Disabled, right? At that point if I open a connection for the connection string, it won't be pooled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you open a connection for a pool group that's been disabled, it will re-activate. Disabled pool groups will eventually be freed and opening a new connection for it will re-create the group.


namespace Microsoft.Data.Sqlite
{
internal class SqliteConnectionPoolGroup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not completely clear on why this type is needed in addition to NpgsqlConnectionPool itself. Wouldn't it be possible to fold this functionality into that?

Copy link
Contributor Author

@bricelam bricelam Jun 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a try before submitting this PR but found that this factoring actually encapsulates the handling of pooled vs non-pooled connections pretty well.

// Copyright (c) .NET Foundation. All rights reserved.
// Licensed under the Apache License, Version 2.0. See License.txt in the project root for license information.

#if !NET
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowadays I think it's cooler to do NET5_OR_GREATER

@barronpm
Copy link

Resolves #13837

TechEmpower Fortunes benchmark results on aspnet-citrine-win:
5.0 6.0
Requests/sec 3,473 25,510
Mean latency (ms) 73.56 10.04
CPU Usage (%) 93 99

/cc @sebastienros

Have you made any performance comparisons between this branch and main? IMO that's a better target to see the benefit of this PR.

@bricelam
Copy link
Contributor Author

@barronpm main was about 3,700 requests/second

@bricelam
Copy link
Contributor Author

bricelam commented Aug 10, 2021

Unfortunately, I'm seeing ericsink/SQLitePCL.raw#430 locally a lot more frequently after this change...

@roji
Copy link
Member

roji commented Aug 11, 2021

Just an idea - we can always make pooling opt-in via a switch (and call it experimental)... Would be sad to not merge this because of that bug (which can also occur without pooling, it seems)

@bricelam
Copy link
Contributor Author

I really like the idea of defaulting the Pooling keyword to False for this release. That'll enable users to try it out and give us feedback before turning it on by default for everyone.

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to squirrel this

@roji
Copy link
Member

roji commented Aug 11, 2021

I also like that this feature may help track down the actual bug because it makes it happen more often...

@sebastienros
Copy link
Member

OrchardCore, Citrine Linux:
Before: 5.5K RPS
After: 14.5K RPS

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants