DbDataSource, dependency injection and multi-tenancy #4523

roji · 2022-06-20T11:52:45Z

#4503 will add a new Npgsql package that integrates it nicely into DI, thanks to DbDataSource. The scope there is only to register an NpgsqlDataSource (typically as singleton) and an NpgsqlConnection (typically as scoped), which users can then get injected with and use (note that the generic DbDataSource and DbConnection will also be registered, for database-agnostic patterns).

We could go further and provide some tools to help with multitenancy:

The user could provide a "tenant key generator" lambda (which gets us the tenant ID from somewhere in the service provider), plus a "data source creator" lambda (which can create the data source given a tenant ID).
We register some singleton "multitenant data source registry".
We also register a scoped data source service, retrieved via code which gets the tenant ID, and gets the corresponding data source from the singleton registry (or creating it if it doesn't yet exist).
For database-per-tenant scenarios, we have truly distinct data sources, each pointing to a different database.
For schema-per-tenant (and possibly some other multitenancy schemes):
- The user typically wants to set some connection state. For example, in PG there's the search_path state parameter, similar to PATH: it determines which schemas to traverse when resolving an unqualified table name.
- We'd therefore facilitate having scoped data source wrappers, which actually all use the same singleton data source behind the scenes (because it's the same database), but execute some SQL each time a connection is returned from the pool.
- For extra bonus performance points, we could prepend/defer that SQL, so that instead of adding a roundtrip for that, the SQL gets delivered when the user executes a real command (see #4522). This would make this free in terms of perf.
We'd need to give special thought to multiplexing, where there's no connection; the connection-less NpgsqlDataSource.CreateCommand would also need to prepend each time.

Multi-tenancy with discriminator column

Note: the above can even support the tenant ID as a discriminator column. PostgreSQL allows defining arbitrary connection variables, and then using them in queries:

DROP TABLE IF EXISTS data;
CREATE TABLE data (
    id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
    number int
);

INSERT INTO data (number) VALUES (8), (9);

SET foo.bar = 8;
SELECT * FROM data WHERE number=CURRENT_SETTING('foo.bar')::int;

/cc @ajcvickers @JeremyLikness

The text was updated successfully, but these errors were encountered:

luanmm · 2023-10-25T05:48:56Z

Hello, @roji.

Regarding this subject, how can I setup a multi-tenant strategy in Npgsql 7? I'm strugling here to do it, trying mixing Microsoft samples like this one and DbDataSource.

My multi-tenancy is with database separation, so I need a connection string for every tenant (my NpgsqlDataSource can't have a singleton lifetime).

To achieve it, like in the referenced sample, I'm trying to use an IDbContextFactory that resolves the connection string for the "current tenant" and it should create the DbContext accordingly.

The problem is that I would have something like that:

    public TenantDbContext CreateDbContext()
    {
        var context = _pooledFactory.CreateDbContext(); // Here the "OnConfiguring" method from DbContext was already been called, throwing an exception (because the "DataSource" is null, as it is being set below)

        var tenant = _tenantContextAccessor.TenantContext?.Tenant;

        var connectionString = tenant?.ConnectionString;

        if (string.IsNullOrEmpty(connectionString))
        {
            if (tenant == null)
                throw new ArgumentNullException(nameof(tenant), "Database connection is not configured (tenant could not be identified).");

            throw new ArgumentNullException(nameof(tenant), "Database connection is not configured.");
        }

        var dataSourceBuilder = new NpgsqlDataSourceBuilder(connectionString);
        dataSourceBuilder.MapEnum<...>(); // I need some custom enums to be registered (that are being configured in "OnModelCreating", with "HasPostgresEnum" extension method)
        dataSourceBuilder.MapEnum<...>();
        dataSourceBuilder.MapEnum<...>();
        dataSourceBuilder.UseNetTopologySuite();

        context.DataSource = dataSourceBuilder.Build(); // This (definately) is not a good way to pass the DataSource to DbContext, but I'm stuck here (here, the "OnConfiguring" has already been called, as stated above)

        return context;
    }

The "OnConfiguring" is something like this:

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseLazyLoadingProxies()
            .UseNpgsql(DataSource!, options => options // DataSource is always null here
                .EnableRetryOnFailure()
                .MigrationsAssembly(typeof(TenantDbContext).Assembly.FullName)
                .UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery))
            .UseSnakeCaseNamingConvention();

I think this scenario is not so trivial, but multi-tenancy commonly use multiple databases, so I think it will be positive to support (if it isn't yet).

Thanks in advance for the attention.

And congratulations for all the work here.

roji · 2023-10-26T10:00:18Z

@luanmm each NpgsqlDataSource works with exactly one connection string; so if each tenant of yours has a different connection string, that means you need a data source per tenant. It's important that you don't create a data source instead for each EF DbContext; since each data source represents its own connection pool, doing that would effectively disable connection pooling, with pretty bad performance implications.

All this means that you want to have a "registry" of data sources, keyed on the tenant ID, and probably registered as a singleton service in DI; when a request comes in for some tenant, you either get its data source from the registry (if it was already seen), or create a new data source and store it there.

Then, you initialize the DbContext with that data source. This is conceptually the same as initializing it with the tenant-specific connection string (once you have the data source instance from the singleton registry).

Hopefully the above is enough to get you on the right path - if not let me know. This definitely should be made a bit simpler (or we should at least have some docs on it online).

luanmm · 2023-10-26T15:54:31Z

Thank you very much for the tips, @roji!

I was able to address the issue for a common DbContext creation scenario:

public class DataSourceRegistry
{
    private readonly ILogger<DataSourceRegistry> _logger;

    private readonly IDictionary<string, DbDataSource> _entries = new Dictionary<string, DbDataSource>();

    public DataSourceRegistry(ILogger<DataSourceRegistry> logger)
    {
        _logger = logger;
    }

    public DbDataSource GetOrCreate(string key, Func<DbDataSource> factory)
    {
        if (!_entries.TryGetValue(key, out var dataSource))
        {
            _logger.LogInformation($"Creating (and caching) a new data source for identifier \"{key}\".");

            dataSource = factory();

            _entries.Add(key, dataSource);
        }

        return dataSource;
    }
}

In service registration:

        services.TryAddSingleton<DataSourceRegistry>();

        services.AddDbContext<TenantDbContext>((sp, options) =>
        {
            var tenant = sp.GetService<ITenantContextAccessor>()?.TenantContext?.Tenant;
            if (tenant == null)
                throw new ArgumentNullException(nameof(tenant), "Database connection is not configured (tenant could not be identified).");

            var dataSourceRegistry = sp.GetRequiredService<DataSourceRegistry>();

            var dataSource = dataSourceRegistry.GetOrCreate(tenant.Identifier, () =>
            {
                var dataSourceBuilder = new NpgsqlDataSourceBuilder(tenant.ConnectionString);
                dataSourceBuilder.MapEnum<...>();
                dataSourceBuilder.MapEnum<...>();
                dataSourceBuilder.MapEnum<...>();
                dataSourceBuilder.UseNetTopologySuite();

                return dataSourceBuilder.Build();
            });

            options
                .UseLazyLoadingProxies()
                .UseNpgsql(dataSource, options => options
                    .EnableRetryOnFailure()
                    .MigrationsAssembly(typeof(TenantDbContext).Assembly.FullName)
                    .UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery))
                .UseSnakeCaseNamingConvention();
        });

The only thing is that I could not make it work using AddDbContextPool, to use pooled contexts, but is a more advanced use case and not (currently) a requirement. But, if I achieve it at some point, I update here with my solution.

Best regards.

NinoFloris · 2023-10-26T16:15:50Z

Using keyed services is another option if you know all your tenants at startup.

#5134

Docs for keyed services:
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/dependency-injection?view=aspnetcore-8.0#keyed-services

Just to note, your registry is registered as a singleton but uses a normal dictionary without a lock.

Singletons can be accessed from multiple threads so better to use a ConcurrentDictionary or to wrap all your code accessing the dictionary in a lock!

luanmm · 2023-10-26T17:13:04Z

@NinoFloris thank you very much for the tips!

I wasn't thinking about concurrency, indeed. It is really a good point.

I will change to a ConcurrentDictionary for now, but will try with keyed services after .NET 8 release next month.

Thank you all for the help!

onurkanbakirci · 2024-05-24T09:29:23Z

Following is an implementation with ConcurrentDictionary instead of Dictionary

public class DataSourceRegistry
{
    private readonly ConcurrentDictionary<string, DbDataSource> _entries = new ConcurrentDictionary<string, DbDataSource>();

    public DbDataSource GetOrCreate(string key, Func<DbDataSource> factory)
    {
        return _entries.GetOrAdd(key, _ => factory());
    }
}

roji added the enhancement label Jun 20, 2022

roji added this to the Backlog milestone Sep 25, 2022

roji self-assigned this Sep 25, 2022

roji mentioned this issue Jan 13, 2023

Register multiple DataSources/Connections with DependencyInjection extensions #4873

Closed

roji mentioned this issue Oct 29, 2023

Multitenancy with Dependenct injection (How to handle connections for isolated databases depending on tenant IDs) #5358

Closed

Hawxy mentioned this issue Nov 18, 2023

Wolverine & Aspire JasperFx/wolverine#635

Open

This comment has been minimized.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DbDataSource, dependency injection and multi-tenancy #4523

DbDataSource, dependency injection and multi-tenancy #4523

roji commented Jun 20, 2022

luanmm commented Oct 25, 2023

roji commented Oct 26, 2023

luanmm commented Oct 26, 2023

NinoFloris commented Oct 26, 2023 •

edited

Loading

luanmm commented Oct 26, 2023

onurkanbakirci commented May 24, 2024 •

edited

Loading

This comment has been minimized.

This comment has been minimized.

DbDataSource, dependency injection and multi-tenancy #4523

DbDataSource, dependency injection and multi-tenancy #4523

Comments

roji commented Jun 20, 2022

luanmm commented Oct 25, 2023

roji commented Oct 26, 2023

luanmm commented Oct 26, 2023

NinoFloris commented Oct 26, 2023 • edited Loading

luanmm commented Oct 26, 2023

onurkanbakirci commented May 24, 2024 • edited Loading

This comment has been minimized.

This comment has been minimized.

NinoFloris commented Oct 26, 2023 •

edited

Loading

onurkanbakirci commented May 24, 2024 •

edited

Loading