.Net: Update Milvus and MongoDB memory connectors (#4218)
### Motivation and Context

Related item: #3997

### Description
1. Validate the following memory connectors: Chroma, MongoDB, Milvus,
and Weaviate, by running the integration tests.
2. Update the README of the MongoDB memory connector to make it easier
to follow.
3. Add an optional parameter named `indexName` to the constructors of
the Milvus memory connector. Without an index name, the connector fails to find an index.
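
A minimal sketch of the fallback behind item 3, assuming a hypothetical stand-in class named `SketchStore` (it is not part of the connector and talks to no Milvus server). It models only how a null `indexName` resolves to the `"default"` constant that the diff introduces as `DefaultIndexName`:

```csharp
#nullable enable

// Hypothetical stand-in for MilvusMemoryStore; it models only the
// index-name fallback shown in the diff, nothing else.
public sealed class SketchStore
{
    // Same value as DefaultIndexName in the diff.
    private const string DefaultIndexName = "default";

    // The index name that later index calls would receive.
    public string IndexName { get; }

    // indexName is optional; null falls back to the default name.
    public SketchStore(string? indexName = null)
    {
        this.IndexName = indexName ?? DefaultIndexName;
    }
}
```

Here `new SketchStore().IndexName` is `"default"` and `new SketchStore("my_index").IndexName` is `"my_index"`. With the connector itself, the corresponding call per the diff would be `new MilvusMemoryStore(client, indexName: "my_index")`.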


### Contribution Checklist


- [ ] The code builds clean without any errors or warnings
- [ ] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [ ] All unit tests pass, and I have added new tests where possible
- [ ] I didn't break anyone 😄
TaoChenOSU committed Dec 13, 2023
1 parent 46f3dbc commit 2648531
Showing 3 changed files with 39 additions and 24 deletions.
@@ -22,7 +22,9 @@ public class MilvusMemoryStore : IMemoryStore, IDisposable
private readonly int _vectorSize;
private readonly SimilarityMetricType _metricType;
private readonly bool _ownsMilvusClient;
private readonly string _indexName;

private const string DefaultIndexName = "default";
private const string IsReferenceFieldName = "is_reference";
private const string ExternalSourceNameFieldName = "external_source_name";
private const string IdFieldName = "id";
@@ -62,12 +64,13 @@ public class MilvusMemoryStore : IMemoryStore, IDisposable
/// <summary>
/// Creates a new <see cref="MilvusMemoryStore" />, connecting to the given hostname on the default Milvus port of 19530.
/// For more advanced configuration opens, construct a <see cref="MilvusClient" /> instance and pass it to
/// <see cref="MilvusMemoryStore(MilvusClient, int, SimilarityMetricType)" />.
/// <see cref="MilvusMemoryStore(MilvusClient, string, int, SimilarityMetricType)" />.
/// </summary>
/// <param name="host">The hostname or IP address to connect to.</param>
/// <param name="port">The port to connect to. Defaults to 19530.</param>
/// <param name="ssl">Whether to use TLS/SSL. Defaults to <c>false</c>.</param>
/// <param name="database">The database to connect to. Defaults to the default Milvus database.</param>
/// <param name="indexName">The name of the index to use. Defaults to <see cref="DefaultIndexName" />.</param>
/// <param name="vectorSize">The size of the vectors used in Milvus. Defaults to 1536.</param>
/// <param name="metricType">The metric used to measure similarity between vectors. Defaults to <see cref="SimilarityMetricType.Ip" />.</param>
/// <param name="loggerFactory">An optional logger factory through which the Milvus client will log.</param>
@@ -76,25 +79,27 @@ public class MilvusMemoryStore : IMemoryStore, IDisposable
int port = DefaultMilvusPort,
bool ssl = false,
string? database = null,
string? indexName = null,
int vectorSize = 1536,
SimilarityMetricType metricType = SimilarityMetricType.Ip,
ILoggerFactory? loggerFactory = null)
: this(new MilvusClient(host, port, ssl, database, callOptions: default, loggerFactory), vectorSize, metricType)
: this(new MilvusClient(host, port, ssl, database, callOptions: default, loggerFactory), indexName, vectorSize, metricType)
{
this._ownsMilvusClient = true;
}

/// <summary>
/// Creates a new <see cref="MilvusMemoryStore" />, connecting to the given hostname on the default Milvus port of 19530.
/// For more advanced configuration opens, construct a <see cref="MilvusClient" /> instance and pass it to
/// <see cref="MilvusMemoryStore(MilvusClient, int, SimilarityMetricType)" />.
/// <see cref="MilvusMemoryStore(MilvusClient, string, int, SimilarityMetricType)" />.
/// </summary>
/// <param name="host">The hostname or IP address to connect to.</param>
/// <param name="username">The username to use for authentication.</param>
/// <param name="password">The password to use for authentication.</param>
/// <param name="port">The port to connect to. Defaults to 19530.</param>
/// <param name="ssl">Whether to use TLS/SSL. Defaults to <c>false</c>.</param>
/// <param name="database">The database to connect to. Defaults to the default Milvus database.</param>
/// <param name="indexName">The name of the index to use. Defaults to <see cref="DefaultIndexName" />.</param>
/// <param name="vectorSize">The size of the vectors used in Milvus. Defaults to 1536.</param>
/// <param name="metricType">The metric used to measure similarity between vectors. Defaults to <see cref="SimilarityMetricType.Ip" />.</param>
/// <param name="loggerFactory">An optional logger factory through which the Milvus client will log.</param>
@@ -105,24 +110,26 @@ public class MilvusMemoryStore : IMemoryStore, IDisposable
int port = DefaultMilvusPort,
bool ssl = false,
string? database = null,
string? indexName = null,
int vectorSize = 1536,
SimilarityMetricType metricType = SimilarityMetricType.Ip,
ILoggerFactory? loggerFactory = null)
: this(new MilvusClient(host, username, password, port, ssl, database, callOptions: default, loggerFactory), vectorSize, metricType)
: this(new MilvusClient(host, username, password, port, ssl, database, callOptions: default, loggerFactory), indexName, vectorSize, metricType)
{
this._ownsMilvusClient = true;
}

/// <summary>
/// Creates a new <see cref="MilvusMemoryStore" />, connecting to the given hostname on the default Milvus port of 19530.
/// For more advanced configuration opens, construct a <see cref="MilvusClient" /> instance and pass it to
/// <see cref="MilvusMemoryStore(MilvusClient, int, SimilarityMetricType)" />.
/// <see cref="MilvusMemoryStore(MilvusClient, string, int, SimilarityMetricType)" />.
/// </summary>
/// <param name="host">The hostname or IP address to connect to.</param>
/// <param name="apiKey">An API key to be used for authentication, instead of a username and password.</param>
/// <param name="port">The port to connect to. Defaults to 19530.</param>
/// <param name="ssl">Whether to use TLS/SSL. Defaults to <c>false</c>.</param>
/// <param name="database">The database to connect to. Defaults to the default Milvus database.</param>
/// <param name="indexName">The name of the index to use. Defaults to <see cref="DefaultIndexName" />.</param>
/// <param name="vectorSize">The size of the vectors used in Milvus. Defaults to 1536.</param>
/// <param name="metricType">The metric used to measure similarity between vectors. Defaults to <see cref="SimilarityMetricType.Ip" />.</param>
/// <param name="loggerFactory">An optional logger factory through which the Milvus client will log.</param>
@@ -132,10 +139,11 @@ public class MilvusMemoryStore : IMemoryStore, IDisposable
int port = DefaultMilvusPort,
bool ssl = false,
string? database = null,
string? indexName = null,
int vectorSize = 1536,
SimilarityMetricType metricType = SimilarityMetricType.Ip,
ILoggerFactory? loggerFactory = null)
: this(new MilvusClient(host, apiKey, port, ssl, database, callOptions: default, loggerFactory), vectorSize, metricType)
: this(new MilvusClient(host, apiKey, port, ssl, database, callOptions: default, loggerFactory), indexName, vectorSize, metricType)
{
this._ownsMilvusClient = true;
}
@@ -144,23 +152,27 @@ public class MilvusMemoryStore : IMemoryStore, IDisposable
/// Initializes a new instance of <see cref="MilvusMemoryStore" /> over the given <see cref="MilvusClient" />.
/// </summary>
/// <param name="client">A <see cref="MilvusClient" /> configured with the necessary endpoint and authentication information.</param>
/// <param name="indexName">The name of the index to use. Defaults to <see cref="DefaultIndexName" />.</param>
/// <param name="vectorSize">The size of the vectors used in Milvus. Defaults to 1536.</param>
/// <param name="metricType">The metric used to measure similarity between vectors. Defaults to <see cref="SimilarityMetricType.Ip" />.</param>
public MilvusMemoryStore(
MilvusClient client,
string? indexName = null,
int vectorSize = 1536,
SimilarityMetricType metricType = SimilarityMetricType.Ip)
: this(client, ownsMilvusClient: false, vectorSize, metricType)
: this(client, ownsMilvusClient: false, indexName, vectorSize, metricType)
{
}

private MilvusMemoryStore(
MilvusClient client,
bool ownsMilvusClient,
string? indexName = null,
int vectorSize = 1536,
SimilarityMetricType metricType = SimilarityMetricType.Ip)
{
this.Client = client;
this._indexName = indexName ?? DefaultIndexName;
this._vectorSize = vectorSize;
this._metricType = metricType;
this._ownsMilvusClient = ownsMilvusClient;
@@ -186,8 +198,8 @@ public async Task CreateCollectionAsync(string collectionName, CancellationToken

MilvusCollection collection = await this.Client.CreateCollectionAsync(collectionName, schema, DefaultConsistencyLevel, cancellationToken: cancellationToken).ConfigureAwait(false);

await collection.CreateIndexAsync(EmbeddingFieldName, metricType: this._metricType, cancellationToken: cancellationToken).ConfigureAwait(false);
await collection.WaitForIndexBuildAsync("float_vector", cancellationToken: cancellationToken).ConfigureAwait(false);
await collection.CreateIndexAsync(EmbeddingFieldName, metricType: this._metricType, indexName: this._indexName, cancellationToken: cancellationToken).ConfigureAwait(false);
await collection.WaitForIndexBuildAsync("float_vector", this._indexName, cancellationToken: cancellationToken).ConfigureAwait(false);

await collection.LoadAsync(cancellationToken: cancellationToken).ConfigureAwait(false);
await collection.WaitForCollectionLoadAsync(waitingInterval: TimeSpan.FromMilliseconds(100), timeout: TimeSpan.FromMinutes(1), cancellationToken: cancellationToken).ConfigureAwait(false);
@@ -20,7 +20,7 @@ public class MongoDBMemoryStore : IMemoryStore, IDisposable
/// </summary>
/// <param name="connectionString">MongoDB connection string.</param>
/// <param name="databaseName">Database name.</param>
/// /// <param name="indexName">Name of the search index. If no value is provided default index will be used.</param>
/// <param name="indexName">Name of the search index. If no value is provided default index will be used.</param>
public MongoDBMemoryStore(string connectionString, string databaseName, string? indexName = default) :
this(new MongoClient(connectionString), databaseName, indexName)
{
31 changes: 17 additions & 14 deletions dotnet/src/Connectors/Connectors.Memory.MongoDB/README.md
@@ -6,26 +6,26 @@ This connector uses [MongoDB Atlas Vector Search](https://www.mongodb.com/produc

1. Create [Atlas cluster](https://www.mongodb.com/docs/atlas/getting-started/)

2. Create a collection
2. Create a [collection](https://www.mongodb.com/docs/atlas/atlas-ui/collections/)

3. Create [Vector Search Index](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) for the collection. The index has to be defined on a field called `embedding`. For example:

3. Create [Vector Search Index](https://www.mongodb.com/docs/atlas/atlas-search/field-types/knn-vector/) for the collection.
The index has to be defined on a field called ```embedding```. For example:
```
{
"mappings": {
"dynamic": true,
"fields": {
"embedding": {
"dimension": 1024,
"similarity": "cosine",
"type": "knnVector"
}
"type": "vectorSearch",
"fields": [
{
"numDimensions": <number-of-dimensions>,
"path": "embedding",
"similarity": "euclidean | cosine | dotProduct",
"type": "vector"
}
}
]
}
```

4. Create the MongoDB memory store

```csharp
var connectionString = "MONGODB ATLAS CONNECTION STRING";
MongoDBMemoryStore memoryStore = new(connectionString, "MyDatabase");
@@ -38,9 +38,12 @@ Kernel kernel = Kernel.Builder
.Build();
```

> Guide to find the connection string: https://www.mongodb.com/docs/manual/reference/connection-string/
## Important Notes

### Vector search indexes
In this version, vector search index management is outside of ```MongoDBMemoryStore``` scope.

In this version, vector search index management is outside of `MongoDBMemoryStore` scope.
Creation and maintenance of the indexes have to be done by the user. Please note that deleting a collection
(```memoryStore.DeleteCollectionAsync```) will delete the index as well.
(`memoryStore.DeleteCollectionAsync`) will delete the index as well.
