
DB health checks: design #2113

Closed
adamsitnik opened this issue Dec 4, 2023 · 4 comments

Comments

@adamsitnik
Collaborator

adamsitnik commented Dec 4, 2023

.NET 7 introduced the concept of DbDataSource. dotnet/runtime#64812 describes its benefits:

  • DbDataSource encapsulates all information and configuration needed to provide an open database connection (auth information, logging setup, type mapping config, etc.), ready for executing commands.
  • This makes it suitable for passing around and registering in DI as a factory for connections, without needing to pass around any additional information.
  • It's intended for DbDataSources to correspond to connection pools managed internally in the database driver.
    • With the current API, drivers need to look up the internal pool each time a connection is opened, by the connection string. Since DbDataSource allows user code to directly reference an (abstracted) internal pool, it eliminates that lookup and helps perf.
    • DbDataSource isn't an abstraction for a connection pool (see #24856); that would require introducing some pooling-specific APIs, is complicated, and might not be a good idea. However, it could evolve into one in the future, with external pooling implementations which can be used across ADO.NET providers.
  • It's also possible to get and execute a DbCommand directly on DbDataSource, without needing to deal with DbConnection at all. In scenarios where no connection state is required (e.g. transactions), it's not necessary to burden users with managing the connection, so this can be done under the hood. In addition, this opens up command execution models which aren't tied to a specific connection (e.g. multiplexing on Npgsql).
  • The proposal includes a shim to make the new abstraction immediately available on all ADO.NET drivers, without them needing to implement anything (unless they wish to customize behavior).

#1993 has proved that our current design can lead to failures: when users register a health check with just a ConnectionString instead of an existing DbDataSource, a new DbDataSource instance is created each time the health check is invoked.
This is bad for performance, because we end up with multiple connections open. The fix that I provided in #2045 was just a workaround; we need a proper solution that can be applied to all DB-based health checks (see #2096 as an example).
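
To make the difference concrete, here is a minimal sketch of the two behaviours. `StubDataSource` and `DataSourceProvider` are hypothetical stand-ins invented for this illustration (they are not types from this library or from any ADO.NET provider), and the cache keyed by connection string is only one possible shape for a workaround, not necessarily what #2045 does.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a provider-specific data source (e.g. NpgsqlDataSource).
public sealed class StubDataSource
{
    // Counts constructed instances so the two strategies can be compared.
    public static int InstancesCreated;

    public string ConnectionString { get; }

    public StubDataSource(string connectionString)
    {
        ConnectionString = connectionString;
        Interlocked.Increment(ref InstancesCreated);
    }
}

public static class DataSourceProvider
{
    private static readonly ConcurrentDictionary<string, StubDataSource> Cache = new();

    // Problematic pattern: every health check invocation builds a new data source.
    public static StubDataSource CreatePerInvocation(string connectionString)
        => new StubDataSource(connectionString);

    // Workaround pattern: reuse one data source per distinct connection string.
    // Under contention GetOrAdd may run the factory more than once,
    // but only one instance is stored and returned to all callers.
    public static StubDataSource GetOrCreate(string connectionString)
        => Cache.GetOrAdd(connectionString, static cs => new StubDataSource(cs));
}
```

Calling `GetOrCreate` repeatedly with the same connection string hands back the same instance, while `CreatePerInvocation` allocates a new one on every call.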

Similarly to #2040, we should steer users towards best practices and provide an API that makes it very hard to fall into such perf traps.
We are about to release 8.0, and it's a great opportunity to introduce the breaking change. Having README.md should help us ease the pain.

Possible solutions:

Provide a health check that by default resolves an instance of given DbDataSource from DI container.

Pros:

  • very easy to use (just call one method without the need to specify any arguments)
  • the perf problem is gone for all the users that register DbDataSource in the DI

Cons:

  • when the users don't register DbDataSource in the DI, they get no compiler error and the health check fails at runtime
  • they need to learn a new concept (DbDataSource)
  • some of the users can be still using DbConnection and not willing to learn and use DbDataSource
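
A rough sketch of what "resolve from DI by default" could mean, including the runtime failure listed in the cons. Everything here is an assumption made for illustration: `AbcDataSource`, `SingleServiceProvider`, and `AbcHealthCheckSketch.ResolveDataSource` are hypothetical names, and only `IServiceProvider.GetService(Type)` is a real BCL API.

```csharp
using System;

// Hypothetical stand-in for a provider-specific data source.
public sealed class AbcDataSource
{
    public string ConnectionString { get; }
    public AbcDataSource(string connectionString) => ConnectionString = connectionString;
}

// Minimal IServiceProvider holding at most one service, used only for illustration.
public sealed class SingleServiceProvider : IServiceProvider
{
    private readonly object? _service;
    public SingleServiceProvider(object? service) => _service = service;

    // Returns the held service when it matches the requested type, otherwise null.
    public object? GetService(Type serviceType)
        => serviceType.IsInstanceOfType(_service) ? _service : null;
}

public static class AbcHealthCheckSketch
{
    // Uses the factory when one is supplied, otherwise falls back to the container.
    public static AbcDataSource ResolveDataSource(
        IServiceProvider sp,
        Func<IServiceProvider, AbcDataSource>? dbDataSourceFactory = null)
    {
        if (dbDataSourceFactory is not null)
        {
            return dbDataSourceFactory(sp);
        }

        if (sp.GetService(typeof(AbcDataSource)) is AbcDataSource fromDI)
        {
            return fromDI;
        }

        // Nothing registered: the compiler cannot catch this, so it surfaces at runtime.
        throw new InvalidOperationException(
            "AbcDataSource is not registered in the DI container and no factory was provided.");
    }
}
```

With a registered instance the call returns it unchanged. With an empty container and no factory it throws, which is the "no compiler error, fails at runtime" trade-off.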

Provide a health check that does not accept DbDataSource

Pros:

  • no new concepts to learn (just provide ConnectionString)
  • works with every .NET version

Cons:

  • no benefits of DbDataSource
  • could it cause perf issues for those who started using DbDataSource? cc @roji

In theory, we could implement the check in a way that it would try to resolve DbDataSource first and re-use it when possible:

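NpgsqlConnection? connection = null;
NpgsqlDataSource? fromDI = sp.GetService<NpgsqlDataSource>();
if (fromDI is not null && fromDI.ConnectionString.Equals(options.ConnectionString))
{
    // Reuse the registered data source; the returned connection is already open.
    connection = await fromDI.OpenConnectionAsync();
}
else
{
    // Fall back to a standalone connection; it still has to be opened before use.
    connection = new(options.ConnectionString);
}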

This could be a performance hit, but we could cache the result similarly to what I did in #2045.
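
One way the caching could look is a `Lazy<T>` that performs the DI lookup and the connection string comparison only once. This is a sketch under assumptions, not the approach taken in #2045: `AbcDataSource` and `CachedDataSourceSelector` are hypothetical, and the `resolveFromDI` delegate stands in for something like `sp.GetService<NpgsqlDataSource>()`.

```csharp
using System;

// Hypothetical stand-in for a provider-specific data source.
public sealed class AbcDataSource
{
    public string ConnectionString { get; }
    public AbcDataSource(string connectionString) => ConnectionString = connectionString;
}

public sealed class CachedDataSourceSelector
{
    private readonly Lazy<AbcDataSource> _selected;

    // Number of times the lookup actually ran (exposed only for illustration).
    public int LookupCount { get; private set; }

    public CachedDataSourceSelector(Func<AbcDataSource?> resolveFromDI, string connectionString)
    {
        // Lazy<T> runs the factory once by default, even with concurrent callers.
        _selected = new Lazy<AbcDataSource>(() =>
        {
            LookupCount++;
            AbcDataSource? fromDI = resolveFromDI();

            // Reuse the registered data source only when it targets the same database.
            return fromDI is not null && fromDI.ConnectionString == connectionString
                ? fromDI
                : new AbcDataSource(connectionString);
        });
    }

    // Every health check invocation reads the cached decision.
    public AbcDataSource DataSource => _selected.Value;
}
```

The health check would hold one selector per registration, so repeated invocations neither hit the container again nor create additional data sources.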

My current best idea is to provide both:

  • a method that registers a health check that always requires the connection string to be provided (the default), but tries to resolve the data source when possible
  • a new method that registers a health check that tries to resolve the data source from DI, but allows the users to specify a factory method via an optional argument

@sungam3r @bgrainger what is your opinion on that?

@bgrainger
Contributor

My current best idea is to provide both:

For clarity, could you provide the method signatures so we're sure we're talking about the same thing? For example, in #2096 I took the existing method:

  • IHealthChecksBuilder AddMySql(this IHealthChecksBuilder builder, string connectionString, [many optional parameters])

and added:

  • IHealthChecksBuilder AddMySql(this IHealthChecksBuilder builder, Func<IServiceProvider, MySqlDataSource>? dataSourceFactory = null, [same list of optional parameters])

Does that implement the new API you're suggesting? If so, I have no objections because that's already what I came up with (assuming that the new API should be additive and we shouldn't make a needless breaking change by removing existing methods).

@bgrainger
Contributor

some of the users can be still using DbConnection and not willing to learn and use DbDataSource

This seems like a very strong point to me. Even when users update to .NET 7.0+, they still have to be educated about DbDataSource, then update their DI configuration to use it, then (probably) have to update Controllers and other types to take DbConnection instead of Func<DbConnection> (or other ways of injecting connections that may have been used), etc. We should make it very easy to adopt the new DbDataSource-pattern in this library, but it's probably too early to assume it should be the default.

@adamsitnik
Collaborator Author

adamsitnik commented Dec 5, 2023

For clarity, could you provide the method signatures

Sure, please excuse me for not doing that. Here are the APIs for PostgreSQL from #2116:

namespace HealthChecks.NpgSql
{
    public class NpgSqlHealthCheck : Microsoft.Extensions.Diagnostics.HealthChecks.IHealthCheck
    {
        public NpgSqlHealthCheck(NpgSqlHealthCheckOptions options) { }
        public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default) { }
    }
    public class NpgSqlHealthCheckOptions
    {
        public NpgSqlHealthCheckOptions(string connectionString) { }
        public string CommandText { get; set; }
        public Action<NpgsqlConnection>? Configure { get; set; }
        public string? ConnectionString { get; set; }
        public Func<object?, HealthCheckResult>? HealthCheckResultBuilder { get; set; }
    }
}
namespace Microsoft.Extensions.DependencyInjection
{
    public static class NpgSqlHealthCheckBuilderExtensions
    {
        public static IHealthChecksBuilder AddNpgSql(this IHealthChecksBuilder builder, NpgSqlHealthCheckOptions options, string? name = null, HealthStatus? failureStatus = default, IEnumerable<string>? tags = null, TimeSpan? timeout = default) { }
        public static IHealthChecksBuilder AddNpgSql(this IHealthChecksBuilder builder, Func<IServiceProvider, string> connectionStringFactory, string healthQuery = "SELECT 1;", Action<NpgsqlConnection>? configure = null, string? name = null, HealthStatus? failureStatus = default, IEnumerable<string>? tags = null, TimeSpan? timeout = default) { }
        public static IHealthChecksBuilder AddNpgSql(this IHealthChecksBuilder builder, Func<IServiceProvider, NpgsqlDataSource>? dbDataSourceFactory = null, string healthQuery = "SELECT 1;", Action<NpgsqlConnection>? configure = null, string? name = null, HealthStatus? failureStatus = default, IEnumerable<string>? tags = null, TimeSpan? timeout = default) { }
        public static IHealthChecksBuilder AddNpgSql(this IHealthChecksBuilder builder, string connectionString, string healthQuery = "SELECT 1;", Action<NpgsqlConnection>? configure = null, string? name = null, HealthStatus? failureStatus = default, IEnumerable<string>? tags = null, TimeSpan? timeout = default) { }
    }
}

My reasoning:

  1. We should encourage our users to follow the best practices, so the first sample from README.md should show both how to register DbDataSource in the DI and how to register a health check that uses it.

In the case of Npgsql, we should point to the Npgsql.DependencyInjection package because it configures the registered components; for example, it configures the logger factory:

https://github.com/npgsql/npgsql/blob/c2fc02a858176f2b5eab7a2c2336ff5ab4748ad0/src/Npgsql.DependencyInjection/NpgsqlServiceCollectionExtensions.cs#L124

and registers not only the db data source, but also db connection:

https://github.com/npgsql/npgsql/blob/c2fc02a858176f2b5eab7a2c2336ff5ab4748ad0/src/Npgsql.DependencyInjection/NpgsqlServiceCollectionExtensions.cs#L165-L186

This is the first overload:

public static IHealthChecksBuilder AddAbc(this IHealthChecksBuilder builder, Func<IServiceProvider, AbcDataSource>? dbDataSourceFactory = null, /* optional args */);

  2. Right after that we should mention that using DbDataSource is not mandatory and the users can stick with DbConnection.

This can be done by providing just the connection string. Users may need to access IServiceProvider so it gives us two overloads:

public static IHealthChecksBuilder AddAbc(this IHealthChecksBuilder builder, string connectionString, /* optional args */);
public static IHealthChecksBuilder AddAbc(this IHealthChecksBuilder builder, Func<IServiceProvider, string> connectionStringFactory, /* optional args */);

  3. Options

We need connection string and all optional arguments specific to DBs:

  • health query (string)
  • configuration lambda that accepts an instance of DbConnection (Action<AbcConnection>)

I am not sure whether we should allow the users to specify an instance of DbDataSource via options.
I am afraid that some users might create a new one just for the health check, instead of resolving it from the DI container. Example:

services.AddAbc(sp => new AbcOptions()
{
    // Creates a second data source just for the health check;
    // it should be sp.GetRequiredService<AbcDataSource>()
    DataSource = new AbcDataSource(connectionString)
});

I am not totally against it. From my experience working on the .NET Team it's easy to add new things when needed, but it's very hard to remove existing ones.
So we can just not add it now, provide the mentioned APIs and wait for user feedback to see if there are scenarios that need to have it.

@bgrainger
Contributor

That basically aligns with what I did on #2096. (There is also a MySqlConnector.DependencyInjection package that does similar work of registering MySqlDataSource, configuring logging, etc.)

My one piece of feedback there (#2096 (comment)) was that I was less sure about not exposing AbcHealthCheckOptions.DataSource because then there was no way (for a user who uses the AbcHealthCheckOptions overload) to do the right thing: they would be forced to specify a connection string, which is less optimal.

However, I agree that it's easier to add things than take them away, and ultimately I don't have a strong view on the matter. (If changing public MySqlDataSource? DataSource { get; set; } to internal can help unblock that PR and get it merged, I'm happy to do that.)
