From 8155ba3a0af57b9755ce27d5ebe768935893b75f Mon Sep 17 00:00:00 2001 From: Dhiraj Kumar Azad Date: Mon, 3 Nov 2025 21:33:24 +0530 Subject: [PATCH 1/2] Updated docs for Couchbase --- .../couchbase-connector.md | 213 ++++++++++++++---- 1 file changed, 163 insertions(+), 50 deletions(-) diff --git a/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md b/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md index e27ad81a..af4243f8 100644 --- a/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md +++ b/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md @@ -5,7 +5,7 @@ zone_pivot_groups: programming-languages author: azaddhirajkumar ms.topic: conceptual ms.author: westey -ms.date: 01/14/2025 +ms.date: 11/03/2025 ms.service: semantic-kernel --- @@ -14,27 +14,26 @@ ms.service: semantic-kernel > [!WARNING] > The Semantic Kernel Vector Store functionality is in preview, and improvements that require breaking changes may still occur in limited circumstances before release. -::: zone pivot="programming-language-csharp" - ## Overview The Couchbase Vector Store connector can be used to access and manage data in Couchbase. The connector has the following characteristics. -| Feature Area | Support | -|---------------------------------------|-------------------------------------------------------------------------------------------------------------------| -| Collection maps to | Couchbase collection | -| Supported key property types | string | -| Supported data property types | All types that are supported by System.Text.Json (either built-in or by using a custom converter) | -| Supported vector property types | | -| Supported index types | N/A | -| Supported distance functions | | -| Supported filter clauses | | -| Supports multiple vectors in a record | Yes | -| IsFilterable supported? | No | -| IsFullTextSearchable supported? | No | -| StoragePropertyName supported? | No, use `JsonSerializerOptions` and `JsonPropertyNameAttribute` instead. [See here for more info.](#data-mapping) | -| HybridSearch supported? | No | +| Feature Area | Support | +| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| Collection maps to | Couchbase collection + index | +| Supported key property types | | +| Supported data property types | All types that are supported by `System.Text.Json` (either built-in or by using a custom converter) | +| Supported vector property types | | +| Supported distance functions | | +| Supported filter clauses | | +| Supports multiple vectors in a record | Yes | +| IsIndexed supported? | Yes | +| IsFullTextIndexed supported? | Yes | +| StoragePropertyName supported? | No, use `JsonSerializerOptions` and `JsonPropertyNameAttribute` instead. [See here for more info.](#data-mapping) | +| HybridSearch supported? | Yes | + +::: zone pivot="programming-language-csharp" ## Getting Started @@ -44,14 +43,15 @@ Add the Couchbase Vector Store connector NuGet package to your project. dotnet add package CouchbaseConnector.SemanticKernel --prerelease ``` -You can add the vector store to the dependency injection container available on the `KernelBuilder` or to -the `IServiceCollection` dependency injection container using extension methods provided by Semantic Kernel. +You can add the vector store to the dependency injection container available on the `KernelBuilder` or to the `IServiceCollection` dependency injection container using extension methods provided by Semantic Kernel. ```csharp using Microsoft.SemanticKernel; +using Couchbase.SemanticKernel; // Using Kernel Builder. -var kernelBuilder = Kernel.CreateBuilder() +var kernelBuilder = Kernel + .CreateBuilder() .AddCouchbaseVectorStore( connectionString: "couchbases://your-cluster-address", username: "username", @@ -61,7 +61,8 @@ var kernelBuilder = Kernel.CreateBuilder() ``` ```csharp -using Microsoft.Extensions.DependencyInjection; +using Microsoft.SemanticKernel; +using Couchbase.SemanticKernel; // Using IServiceCollection with ASP.NET Core. var builder = WebApplication.CreateBuilder(args); @@ -73,6 +74,53 @@ builder.Services.AddCouchbaseVectorStore( scopeName: "scope-name"); ``` +**Configuring Index Type** + +The vector store defaults to using Hyperscale indexes. You can specify a different index type by passing `CouchbaseVectorStoreOptions`: + +```csharp +using Microsoft.SemanticKernel; +using Couchbase.SemanticKernel; + +var builder = WebApplication.CreateBuilder(args); + +// Option 1: Use Hyperscale index +builder.Services.AddCouchbaseVectorStore( + connectionString: "couchbases://your-cluster-address", + username: "username", + password: "password", + bucketName: "bucket-name", + scopeName: "scope-name", + options: new CouchbaseVectorStoreOptions + { + IndexType = CouchbaseIndexType.Hyperscale + }); + +// Option 2: Use Search/FTS index +builder.Services.AddCouchbaseVectorStore( + connectionString: "couchbases://your-cluster-address", + username: "username", + password: "password", + bucketName: "bucket-name", + scopeName: "scope-name", + options: new CouchbaseVectorStoreOptions + { + IndexType = CouchbaseIndexType.Search + }); + +// Option 3: Use Composite index +builder.Services.AddCouchbaseVectorStore( + connectionString: "couchbases://your-cluster-address", + username: "username", + password: "password", + bucketName: "bucket-name", + scopeName: "scope-name", + options: new CouchbaseVectorStoreOptions + { + IndexType = CouchbaseIndexType.Composite + }); +``` + Extension methods that take no parameters are also provided. These require an instance of the `IScope` class to be separately registered with the dependency injection container. @@ -154,7 +202,6 @@ var clusterOptions = new ClusterOptions }; var cluster = await Cluster.ConnectAsync(clusterOptions); - var bucket = await cluster.BucketAsync("bucket-name"); var scope = bucket.Scope("scope-name"); @@ -163,31 +210,93 @@ var vectorStore = new CouchbaseVectorStore(scope); It is possible to construct a direct reference to a named collection. +### Using Search Collection (FTS Index) + +For Full-Text Search indexes and hybrid search scenarios: + ```csharp +using Couchbase.SemanticKernel; using Couchbase; using Couchbase.KeyValue; -using Couchbase.SemanticKernel; -var clusterOptions = new ClusterOptions -{ - ConnectionString = "couchbases://your-cluster-address", - UserName = "username", - Password = "password" -}; +var cluster = await Cluster.ConnectAsync(clusterOptions); +var bucket = await cluster.BucketAsync("bucket-name"); +var scope = bucket.Scope("scope-name"); + +var collection = new CouchbaseSearchCollection( + scope, + "skhotels"); +``` + +### Using Query Collection (Hyperscale or Composite Index) + +For high-performance vector search with Hyperscale indexes: + +```csharp +using Couchbase.SemanticKernel; +using Couchbase; +using Couchbase.KeyValue; var cluster = await Cluster.ConnectAsync(clusterOptions); var bucket = await cluster.BucketAsync("bucket-name"); var scope = bucket.Scope("scope-name"); -var collection = new CouchbaseFtsVectorStoreRecordCollection( +// Using Hyperscale index (default) +var collection = new CouchbaseQueryCollection( + scope, + "skhotels", + indexType: CouchbaseIndexType.Hyperscale); + +// Or using Composite index +var collectionComposite = new CouchbaseQueryCollection( scope, - "hotelCollection"); + "skhotels", + indexType: CouchbaseIndexType.Composite); ``` + +### Index Type Comparison + +Couchbase offers three types of indexes for vector search: + +**Search (FTS) - Full-Text Search Index** + +- Best for hybrid searches combining full-text search with vector similarity +- Allows semantic search alongside traditional keyword matching +- Supports geospatial searches in addition to vector and text +- **Use when:** You need to combine traditional keyword search with vector similarity search in the same query +- **Ideal for:** E-commerce product search, travel recommendations, content discovery with multiple search criteria +- **Requires:** Couchbase Server 7.6+ or Capella + +**Hyperscale Vector Index** + +- Best for pure vector searches - content discovery, recommendations, semantic search +- High performance with low memory footprint - designed to scale to billions of vectors +- Optimized for concurrent operations - supports simultaneous searches and inserts +- **Use when:** You primarily perform vector-only queries without complex scalar filtering +- **Ideal for:** Large-scale semantic search, recommendation systems, content discovery +- **Requires:** Couchbase Server 8.0+ or Capella + +**Composite Vector Index** + +- Best for filtered vector searches - combines vector search with scalar value filtering +- Efficient pre-filtering - scalar attributes reduce the vector comparison scope +- **Use when:** Your queries combine vector similarity with scalar filters that eliminate large portions of data +- **Ideal for:** Compliance-based filtering, user-specific searches, time-bounded queries +- **Requires:** Couchbase Server 8.0+ or Capella + +**Choosing the Right Index Type:** + +- Start with **Hyperscale** for pure vector searches and large datasets (scales to billions) +- Use **Search (FTS)** for hybrid search combining text and vectors +- Choose **Composite** when scalar filters significantly reduce your search space (works well for tens of millions to billions of vectors) + +[Detailed comparison of vector index types](https://docs.couchbase.com/server/current/vector-index/use-vector-indexes.html) ## Data mapping -The Couchbase connector uses `System.Text.Json.JsonSerializer` for data mapping. Properties in the data model are serialized into a JSON object and mapped to Couchbase storage. +The Couchbase connector will use `System.Text.Json.JsonSerializer` to do mapping. Properties in the data model are serialized into a JSON object and stored as the document value in Couchbase. + +Usage of the `JsonPropertyNameAttribute` is supported if a different storage name to the data model property name is required. It is also possible to use a custom `JsonSerializerOptions` instance with a customized property naming policy. -Use the `JsonPropertyName` attribute to map a property to a different name in Couchbase storage. Alternatively, you can configure `JsonSerializerOptions` for advanced customization. ```csharp using Couchbase.SemanticKernel; using Couchbase.KeyValue; @@ -195,17 +304,18 @@ using System.Text.Json; var jsonSerializerOptions = new JsonSerializerOptions { - PropertyNamingPolicy = JsonNamingPolicy.CamelCase + PropertyNamingPolicy = JsonNamingPolicy.SnakeCaseUpper }; -var options = new CouchbaseFtsVectorStoreRecordCollectionOptions +var options = new CouchbaseQueryCollectionOptions { JsonSerializerOptions = jsonSerializerOptions }; -var collection = new CouchbaseFtsVectorStoreRecordCollection(scope, "hotels", options); +var collection = new CouchbaseQueryCollection(scope, "skhotelsjson", options); ``` -Using the above custom `JsonSerializerOptions` which is using `CamelCase`, the following data model will be mapped to the below json. + +Since a naming policy of snake case upper was chosen, here is an example of how this data type will be stored in Couchbase. Also note the use of `JsonPropertyNameAttribute` on the `Description` property to further customize the storage naming. ```csharp using System.Text.Json.Serialization; @@ -213,30 +323,33 @@ using Microsoft.Extensions.VectorData; public class Hotel { - [JsonPropertyName("hotelId")] - [VectorStoreRecordKey] + [VectorStoreKey] public string HotelId { get; set; } - [JsonPropertyName("hotelName")] - [VectorStoreRecordData] + [VectorStoreData(IsIndexed = true)] public string HotelName { get; set; } - [JsonPropertyName("description")] - [VectorStoreRecordData] + [JsonPropertyName("HOTEL_DESCRIPTION")] + [VectorStoreData(IsFullTextIndexed = true)] public string Description { get; set; } - [JsonPropertyName("descriptionEmbedding")] - [VectorStoreRecordVector(Dimensions: 4, DistanceFunction.DotProductSimilarity)] - public ReadOnlyMemory DescriptionEmbedding { get; set; } + [VectorStoreVector(Dimensions: 4, DistanceFunction.CosineSimilarity)] + public ReadOnlyMemory? DescriptionEmbedding { get; set; } } ``` ```json { - "hotelId": "h1", - "hotelName": "Hotel Happy", - "description": "A place where everyone can be happy", - "descriptionEmbedding": [0.9, 0.1, 0.1, 0.1] + "_id" : "h1", + "HOTEL_ID" : "h1", + "HOTEL_NAME" : "Hotel Happy", + "HOTEL_DESCRIPTION" : "A place where everyone can be happy.", + "DESCRIPTION_EMBEDDING" : [ + 0.9, + 0.1, + 0.1, + 0.1 + ] } ``` From f894cb21147d192a3afb4266014666aa7461cae4 Mon Sep 17 00:00:00 2001 From: Dhiraj Kumar Azad Date: Wed, 12 Nov 2025 17:13:35 +0530 Subject: [PATCH 2/2] addressed comments and some minor changes --- .../couchbase-connector.md | 76 +++++++++---------- 1 file changed, 37 insertions(+), 39 deletions(-) diff --git a/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md b/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md index af4243f8..25f9cad3 100644 --- a/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md +++ b/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/couchbase-connector.md @@ -14,6 +14,8 @@ ms.service: semantic-kernel > [!WARNING] > The Semantic Kernel Vector Store functionality is in preview, and improvements that require breaking changes may still occur in limited circumstances before release. +::: zone pivot="programming-language-csharp" + ## Overview The Couchbase Vector Store connector can be used to access and manage data in Couchbase. The connector has the @@ -33,8 +35,6 @@ following characteristics. | StoragePropertyName supported? | No, use `JsonSerializerOptions` and `JsonPropertyNameAttribute` instead. [See here for more info.](#data-mapping) | | HybridSearch supported? | Yes | -::: zone pivot="programming-language-csharp" - ## Getting Started Add the Couchbase Vector Store connector NuGet package to your project. @@ -61,7 +61,6 @@ var kernelBuilder = Kernel ``` ```csharp -using Microsoft.SemanticKernel; using Couchbase.SemanticKernel; // Using IServiceCollection with ASP.NET Core. @@ -79,7 +78,6 @@ builder.Services.AddCouchbaseVectorStore( The vector store defaults to using Hyperscale indexes. You can specify a different index type by passing `CouchbaseVectorStoreOptions`: ```csharp -using Microsoft.SemanticKernel; using Couchbase.SemanticKernel; var builder = WebApplication.CreateBuilder(args); @@ -96,7 +94,7 @@ builder.Services.AddCouchbaseVectorStore( IndexType = CouchbaseIndexType.Hyperscale }); -// Option 2: Use Search/FTS index +// Option 2: Use Composite index builder.Services.AddCouchbaseVectorStore( connectionString: "couchbases://your-cluster-address", username: "username", @@ -105,10 +103,10 @@ builder.Services.AddCouchbaseVectorStore( scopeName: "scope-name", options: new CouchbaseVectorStoreOptions { - IndexType = CouchbaseIndexType.Search + IndexType = CouchbaseIndexType.Composite }); -// Option 3: Use Composite index +// Option 3: Use Search vector index builder.Services.AddCouchbaseVectorStore( connectionString: "couchbases://your-cluster-address", username: "username", @@ -117,7 +115,7 @@ builder.Services.AddCouchbaseVectorStore( scopeName: "scope-name", options: new CouchbaseVectorStoreOptions { - IndexType = CouchbaseIndexType.Composite + IndexType = CouchbaseIndexType.Search }); ``` @@ -157,7 +155,6 @@ kernelBuilder.Services.AddCouchbaseVectorStore(); ```csharp using Microsoft.Extensions.DependencyInjection; -using Microsoft.SemanticKernel; using Couchbase.KeyValue; using Couchbase; @@ -210,9 +207,9 @@ var vectorStore = new CouchbaseVectorStore(scope); It is possible to construct a direct reference to a named collection. -### Using Search Collection (FTS Index) +### Using Query Collection (Hyperscale or Composite Index) -For Full-Text Search indexes and hybrid search scenarios: +For high-performance vector search with Hyperscale indexes: ```csharp using Couchbase.SemanticKernel; @@ -223,14 +220,22 @@ var cluster = await Cluster.ConnectAsync(clusterOptions); var bucket = await cluster.BucketAsync("bucket-name"); var scope = bucket.Scope("scope-name"); -var collection = new CouchbaseSearchCollection( +// Using Hyperscale index (default) +var collection = new CouchbaseQueryCollection( scope, - "skhotels"); + "skhotels", + indexType: CouchbaseIndexType.Hyperscale); + +// Or using Composite index +var collectionComposite = new CouchbaseQueryCollection( + scope, + "skhotels", + indexType: CouchbaseIndexType.Composite); ``` -### Using Query Collection (Hyperscale or Composite Index) +### Using Search Collection (Seach Vector Index) -For high-performance vector search with Hyperscale indexes: +For hybrid search scenarios combining full-text search: ```csharp using Couchbase.SemanticKernel; @@ -241,33 +246,16 @@ var cluster = await Cluster.ConnectAsync(clusterOptions); var bucket = await cluster.BucketAsync("bucket-name"); var scope = bucket.Scope("scope-name"); -// Using Hyperscale index (default) -var collection = new CouchbaseQueryCollection( - scope, - "skhotels", - indexType: CouchbaseIndexType.Hyperscale); - -// Or using Composite index -var collectionComposite = new CouchbaseQueryCollection( +var collection = new CouchbaseSearchCollection( scope, - "skhotels", - indexType: CouchbaseIndexType.Composite); + "skhotels"); ``` ### Index Type Comparison Couchbase offers three types of indexes for vector search: -**Search (FTS) - Full-Text Search Index** - -- Best for hybrid searches combining full-text search with vector similarity -- Allows semantic search alongside traditional keyword matching -- Supports geospatial searches in addition to vector and text -- **Use when:** You need to combine traditional keyword search with vector similarity search in the same query -- **Ideal for:** E-commerce product search, travel recommendations, content discovery with multiple search criteria -- **Requires:** Couchbase Server 7.6+ or Capella - -**Hyperscale Vector Index** +**Hyperscale Vector Indexes** - Best for pure vector searches - content discovery, recommendations, semantic search - High performance with low memory footprint - designed to scale to billions of vectors @@ -276,7 +264,7 @@ Couchbase offers three types of indexes for vector search: - **Ideal for:** Large-scale semantic search, recommendation systems, content discovery - **Requires:** Couchbase Server 8.0+ or Capella -**Composite Vector Index** +**Composite Vector Indexes** - Best for filtered vector searches - combines vector search with scalar value filtering - Efficient pre-filtering - scalar attributes reduce the vector comparison scope @@ -284,11 +272,21 @@ Couchbase offers three types of indexes for vector search: - **Ideal for:** Compliance-based filtering, user-specific searches, time-bounded queries - **Requires:** Couchbase Server 8.0+ or Capella +**Search Vector Indexes** + +- Best for hybrid searches combining full-text search with vector similarity +- Allows semantic search alongside traditional keyword matching +- Supports geospatial searches in addition to vector and text +- **Use when:** You need to combine traditional keyword search with vector similarity search in the same query +- **Ideal for:** E-commerce product search, travel recommendations, content discovery with multiple search criteria +- **Requires:** Couchbase Server 7.6+ or Capella + **Choosing the Right Index Type:** -- Start with **Hyperscale** for pure vector searches and large datasets (scales to billions) -- Use **Search (FTS)** for hybrid search combining text and vectors -- Choose **Composite** when scalar filters significantly reduce your search space (works well for tens of millions to billions of vectors) +- Start with **Hyperscale Index** for pure vector searches and large datasets (scales to billions) +- Choose **Composite Index** when scalar filters significantly reduce your search space (works well for tens of millions to billions of vectors) +- Use **Search Vector Index** for hybrid search combining text and vectors + [Detailed comparison of vector index types](https://docs.couchbase.com/server/current/vector-index/use-vector-indexes.html) ## Data mapping