-
Notifications
You must be signed in to change notification settings - Fork 4k
.Net MEVD: DI for relational DBs #12043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
.Net MEVD: DI for relational DBs #12043
Conversation
{ | ||
} | ||
|
||
internal PostgresCollectionOptions(PostgresCollectionOptions? source, IEmbeddingGenerator embeddingGenerator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have the copy ctor just to make it more likely to not miss extending the copy logic when a new property gets introduced in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's internal so it doesn't matter much, but I'd just have this a simple copy constructor - accepting only another PostgresCollectionOptions instance - and then set the embedding generator as usual at the call site:
var newOptions = new PostgresCollectionOptions(source)
{
EmbeddingGenerator = embeddingGenerator;
}
This would make this usable for other needs and just be more standard (same with the other option constructors).
// Since we are not constructing the data source, add the IVectorStore as transient, since we | ||
// cannot make assumptions about how data source is being managed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added ServiceLifetime lifetime
argument to let the users decide (we should not be using transient by default, as it's bad for perf).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is literally a copy of the SqlServer file with a SqlServer
-> Sqlite
replace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DependencyInjectionTests
provide more detailed test coverage, I decided to remove this test file to avoid a duplication of efforts.
/// </summary> | ||
public sealed class SqliteServiceCollectionExtensionsTests | ||
{ | ||
private const string? SkipReason = "SQLite vector search extension is required"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sqlite was testing the same thing in two different places, but at the same time not testing at all because it was always skipped. So I decided to remove it, as the new tests provide really deep test coverage.
builder.UseVector(); | ||
return builder.Build(); | ||
}) | ||
.AddPostgresVectorStore(lifetime: ServiceLifetime.Transient), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an example of how our users can use ServiceLifetime.Transient
with the existing overload
|
||
public NpgsqlDataSource DataSource => this._dataSource ?? throw new InvalidOperationException("Not initialized"); | ||
|
||
public string ConnectionString => this._connectionString ?? throw new InvalidOperationException("Not initialized"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the tests Dispose the container and I did not want them to dispose the shared NpgsqlDataSource
, so I had to expose the connection string (with password, as DataSource.ConnectionString
hides the password) to be able to create an independent data source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @adamsitnik! See comments below.
Also... I think I mentioned this at some point, but we never fully discussed: when a user does AddPostgresVectorStore and then also AddPostgresCollection, don't they reasonably expect us to create the collection from the store rather than instantiate it directly? This could be important if AddPostgresVectorStore is passed some options, which the user wants to have applied to the collection (should be trivial to implement).
/// <returns>The service collection.</returns> | ||
public static IServiceCollection AddPostgresVectorStore(this IServiceCollection services, PostgresVectorStoreOptions? options = default, string? serviceId = default) | ||
public static IServiceCollection AddPostgresVectorStore( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No Keyed variant of this (which doesn't accept a connection string)? I don't necessarily mind, just try to make sure we have a clear logic is all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct, the methods that do resolve NpgsqlDataSource
from DI don't have keyed overloads.
I just don't want to add too many overloads without having a clear need from the customers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to have them for SQLite but not here thouggh?
dotnet/src/Connectors/Connectors.Memory.PgVector/PostgresServiceCollectionExtensions.cs
Show resolved
Hide resolved
dotnet/src/Connectors/Connectors.Memory.PgVector/PostgresServiceCollectionExtensions.cs
Outdated
Show resolved
Hide resolved
{ | ||
} | ||
|
||
internal PostgresCollectionOptions(PostgresCollectionOptions? source, IEmbeddingGenerator embeddingGenerator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's internal so it doesn't matter much, but I'd just have this a simple copy constructor - accepting only another PostgresCollectionOptions instance - and then set the embedding generator as usual at the call site:
var newOptions = new PostgresCollectionOptions(source)
{
EmbeddingGenerator = embeddingGenerator;
}
This would make this usable for other needs and just be more standard (same with the other option constructors).
dotnet/src/Connectors/Connectors.Memory.SqliteVec/SqliteServiceCollectionExtensions.cs
Show resolved
Hide resolved
// And the IVectorSearchable abstraction. | ||
Verify<IVectorSearchable<TRecord>>(serviceProvider, lifetime, serviceKey); | ||
using ServiceProvider serviceProvider = services.BuildServiceProvider(); | ||
// Let's ensure that concrete types are registered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW the code seems pretty self-explanatory to me, not sure the comments are needed.
using ServiceProvider serviceProvider = services.BuildServiceProvider(); | ||
// let's ensure that concrete types are registered | ||
Verify<TCollection>(serviceProvider, lifetime, serviceKey); | ||
Assert.False(wasResolved); // it's lazy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be testing general DI flow that isn't related to MEVD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My purpose was to show that we don't create the object when the extension method gets called, but rather register a factory method that is lazy executed
dotnet/src/VectorDataIntegrationTests/VectorDataIntegrationTests/DependencyInjectionTests.cs
Show resolved
Hide resolved
What if they provided a different lifetime for each of them? Or simply different connection strings? I am not sure if we have the ability to verify that. And TBH I don't see the point of registering both. If you want AddPostgresCollection, just go and register it without AddPostgresVectorStore |
…o diUnification # Conflicts: # dotnet/src/Connectors/Connectors.Memory.PgVector/PostgresServiceCollectionExtensions.cs # dotnet/src/Connectors/Connectors.Memory.SqliteVec/SqliteServiceCollectionExtensions.cs
- make the copy ctors more general purpose - add more overloads
- make serviceKey nullable (and reduce the need of heplers) - rename collectionName to name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @adamsitnik, looks good - see remaining last nits.
/// <returns>The service collection.</returns> | ||
public static IServiceCollection AddPostgresVectorStore(this IServiceCollection services, PostgresVectorStoreOptions? options = default, string? serviceId = default) | ||
public static IServiceCollection AddPostgresVectorStore( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to have them for SQLite but not here thouggh?
dotnet/src/Connectors/Connectors.Memory.PgVector/PostgresServiceCollectionExtensions.cs
Outdated
Show resolved
Hide resolved
|
||
services.Add(new ServiceDescriptor(typeof(SqliteVectorStore), serviceKey, (sp, _) => | ||
{ | ||
var connectionString = connectionStringProvider(sp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I still slightly prefer just inlining these variables and making everything a single lambda expression without a block - but not important.
Co-authored-by: Shay Rojansky <roji@roji.org>
f3d5b50
into
microsoft:feature-vector-data-preb3
contributes to #10549