# Introduction

In this guide, we will walk you through building a semantic search engine using Couchbase as the backend database and [Mistral AI](https://mistral.ai/) as the embedding model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch. Alternatively, if you want to perform semantic search using Full-Text Search (FTS), see [this tutorial](https://developer.couchbase.com/tutorial-mistralai-couchbase-vector-search-with-fts).

Couchbase is a NoSQL distributed document database (JSON) with many of the best features of a relational DBMS: SQL, distributed ACID transactions, and much more. [Couchbase Capella™](https://cloud.couchbase.com/sign-up) is the easiest way to get started, but you can also download and run [Couchbase Server](http://couchbase.com/downloads) on-premises.

Mistral AI is a research lab building the best open source models in the world. La Plateforme enables developers and enterprises to build new products and applications, powered by Mistral's open source and commercial LLMs. 

The [Mistral AI APIs](https://console.mistral.ai/) empower LLM applications via:

- [Text generation](https://docs.mistral.ai/capabilities/completion/), enables streaming and provides the ability to display partial model results in real-time
- [Code generation](https://docs.mistral.ai/capabilities/code_generation/), empowers code generation tasks, including fill-in-the-middle and code completion
- [Embeddings](https://docs.mistral.ai/capabilities/embeddings/), useful for RAG where it represents the meaning of text as a list of numbers
- [Function calling](https://docs.mistral.ai/capabilities/function_calling/), enables Mistral models to connect to external tools
- [Fine-tuning](https://docs.mistral.ai/capabilities/finetuning/), enables developers to create customized and specialized models
- [JSON mode](https://docs.mistral.ai/capabilities/json_mode/), enables developers to set the response format to json_object
- [Guardrailing](https://docs.mistral.ai/capabilities/guardrailing/), enables developers to enforce policies at the system level of Mistral models

This tutorial demonstrates how to use Mistral AI's embedding capabilities with Couchbase's **Global Secondary Index (GSI)** for optimized vector search operations. GSI provides superior performance for vector operations compared to traditional search methods, especially for large-scale applications.


# How to run this tutorial

This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/mistralai/gsi/mistralai.ipynb).

You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.


# Before you start

## Get Credentials for Mistral AI

Please follow the [instructions](https://console.mistral.ai/api-keys/) to generate the Mistral AI credentials.

## Create and Deploy Your Free Tier Operational cluster on Capella

To get started with Couchbase Capella, create an account and use it to deploy a forever-free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.

To learn more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).

**Note: To run this tutorial, you will need Capella with Couchbase Server version 8.0 or above as GSI vector search is supported only from version 8.0.**

### Couchbase Capella Configuration

When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.

* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) with Read and Write access to the bucket used in the application.
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.


# Install necessary libraries


In [None]:
%pip install couchbase==4.4.0 mistralai==1.9.10 langchain-couchbase==0.5.0 langchain-core==0.3.76 python-dotenv==1.1.1


# Imports


In [27]:
from datetime import timedelta
from mistralai import Mistral
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase.vectorstores import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy, IndexType
from langchain_core.embeddings import Embeddings
from typing import List
from dotenv import load_dotenv
import os
import time

# Prerequisites


In [28]:
import getpass

# Load environment variables from .env file if it exists
load_dotenv()

# Load from environment variables or prompt for input
CB_HOST = os.getenv('CB_HOST') or input("Cluster URL:")
CB_USERNAME = os.getenv('CB_USERNAME') or input("Couchbase username:")
CB_PASSWORD = os.getenv('CB_PASSWORD') or getpass.getpass("Couchbase password:")
CB_BUCKET_NAME = os.getenv('CB_BUCKET_NAME') or input("Couchbase bucket:")
SCOPE_NAME = os.getenv('SCOPE_NAME') or input("Couchbase scope:")
COLLECTION_NAME = os.getenv('COLLECTION_NAME') or input("Couchbase collection:")
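Before connecting, it can help to see which settings were actually found in the environment and which will fall back to an interactive prompt. The sketch below is a hypothetical helper (`missing_settings` and `REQUIRED_SETTINGS` are not part of the notebook) that reports which of the variable names read above are unset or empty in a given mapping.

```python
import os
from typing import List, Mapping

# Variable names read by the prerequisites cell
REQUIRED_SETTINGS = [
    "CB_HOST", "CB_USERNAME", "CB_PASSWORD",
    "CB_BUCKET_NAME", "SCOPE_NAME", "COLLECTION_NAME",
]

def missing_settings(env: Mapping[str, str],
                     required: List[str] = REQUIRED_SETTINGS) -> List[str]:
    """Return the required names that are absent or empty in env (hypothetical helper)."""
    return [name for name in required if not env.get(name)]

# Any name listed here would trigger an input() or getpass() prompt
print("Will be prompted for:", missing_settings(os.environ))
```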


# Couchbase Connection


In [29]:
auth = PasswordAuthenticator(
    CB_USERNAME,
    CB_PASSWORD
)


In [30]:
cluster = Cluster(CB_HOST, ClusterOptions(auth))
cluster.wait_until_ready(timedelta(seconds=5))

bucket = cluster.bucket(CB_BUCKET_NAME)
scope = bucket.scope(SCOPE_NAME)
collection = scope.collection(COLLECTION_NAME)


## Setting Up Collections in Couchbase

The setup_collection() function handles creating and configuring the hierarchical data organization in Couchbase:

1. Bucket Creation:
   - Checks if specified bucket exists, creates it if not
   - Sets bucket properties like RAM quota (1024MB) and replication (disabled)
   - Note: You will not be able to create a bucket on Capella

2. Scope Management:  
   - Verifies if requested scope exists within bucket
   - Creates new scope if needed (unless it's the default "_default" scope)

3. Collection Setup:
   - Checks for collection existence within scope
   - Creates collection if it doesn't exist
   - Waits 2 seconds for collection to be ready

Additional Tasks:
- Clears any existing documents for clean state
- Wraps failures in a `RuntimeError` with a descriptive message


In [31]:
from couchbase.management.buckets import CreateBucketSettings

def setup_collection(cluster, bucket_name, scope_name, collection_name):
    try:
        # Check if bucket exists, create if it doesn't
        try:
            bucket = cluster.bucket(bucket_name)
        except Exception as e:
            bucket_settings = CreateBucketSettings(
                name=bucket_name,
                bucket_type='couchbase',
                ram_quota_mb=1024,
                flush_enabled=True,
                num_replicas=0
            )
            cluster.buckets().create_bucket(bucket_settings)
            time.sleep(2)  # Wait for bucket creation to complete and become available
            bucket = cluster.bucket(bucket_name)

        bucket_manager = bucket.collections()

        # Check if scope exists, create if it doesn't
        scopes = bucket_manager.get_all_scopes()
        scope_exists = any(scope.name == scope_name for scope in scopes)
        
        if not scope_exists and scope_name != "_default":
            bucket_manager.create_scope(scope_name)

        # Check if collection exists, create if it doesn't
        collections = bucket_manager.get_all_scopes()
        collection_exists = any(
            scope.name == scope_name and collection_name in [col.name for col in scope.collections]
            for scope in collections
        )

        if not collection_exists:
            bucket_manager.create_collection(scope_name, collection_name)

        # Wait for collection to be ready
        collection = bucket.scope(scope_name).collection(collection_name)
        time.sleep(2)  # Give the collection time to be ready for queries

        # Clear all documents in the collection
        try:
            query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
            cluster.query(query).execute()
        except Exception as e:
            print(f"Error while clearing documents: {str(e)}. The collection might be empty.")

        return collection
    except Exception as e:
        raise RuntimeError(f"Error setting up collection: {str(e)}")
    
setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)


<couchbase.collection.Collection at 0x10dbaa250>

# Creating Mistral AI Embeddings Wrapper

This tutorial calls the Mistral AI client directly instead of relying on a separate LangChain integration package, so we create a custom wrapper class that implements the LangChain `Embeddings` interface. This allows us to use Mistral AI's embedding model with Couchbase's GSI vector store.


In [32]:
class MistralAIEmbeddings(Embeddings):
    """Custom Mistral AI Embeddings wrapper for LangChain compatibility."""
    
    def __init__(self, api_key: str, model: str = "mistral-embed"):
        self.client = Mistral(api_key=api_key)
        self.model = model
    
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed search docs."""
        try:
            response = self.client.embeddings.create(
                model=self.model,
                inputs=texts,
            )
            return [embedding.embedding for embedding in response.data]
        except Exception as e:
            raise ValueError(f"Error generating embeddings: {str(e)}")
    
    def embed_query(self, text: str) -> List[float]:
        """Embed query text."""
        try:
            response = self.client.embeddings.create(
                model=self.model,
                inputs=[text],
            )
            return response.data[0].embedding
        except Exception as e:
            raise ValueError(f"Error generating query embedding: {str(e)}")
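
The `embed_documents` method above sends every text in a single request. For larger corpora it may be safer to split the input into batches. The sketch below shows one way to do that, assuming a batch size of 32 (an illustrative value, not a documented Mistral limit). `batched` and `embed_in_batches` are hypothetical helpers, and `embed_fn` stands in for any function with the `embed_documents` signature.

```python
from typing import Callable, Iterator, List

def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_in_batches(
    embed_fn: Callable[[List[str]], List[List[float]]],
    texts: List[str],
    batch_size: int = 32,  # assumed value for illustration
) -> List[List[float]]:
    """Call embed_fn once per batch and concatenate the results in input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors

# Stand-in embedding function so the sketch runs without an API key:
# it maps each text to a one-dimensional "vector" holding its length.
def fake_embed(batch: List[str]) -> List[List[float]]:
    return [[float(len(text))] for text in batch]

print(embed_in_batches(fake_embed, ["a", "bb", "ccc"], batch_size=2))
```

With the wrapper defined above, the same helper could be called as `embed_in_batches(embeddings.embed_documents, texts)` once `embeddings` has been created.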


# Mistral Connection


In [33]:
MISTRAL_API_KEY = os.getenv('MISTRAL_API_KEY') or getpass.getpass("Mistral API Key:")
embeddings = MistralAIEmbeddings(api_key=MISTRAL_API_KEY, model="mistral-embed")
mistral_client = Mistral(api_key=MISTRAL_API_KEY)


# Understanding GSI Vector Search

### Optimizing Vector Search with Global Secondary Index (GSI)

With Couchbase 8.0+, you can leverage the power of GSI-based vector search, which offers significant performance improvements over traditional Full-Text Search (FTS) approaches for vector-first workloads. GSI vector search provides high-performance vector similarity search with advanced filtering capabilities and is designed to scale to billions of vectors.

#### GSI vs FTS: Choosing the Right Approach

| Feature               | GSI Vector Search                                               | FTS Vector Search                         |
| --------------------- | --------------------------------------------------------------- | ----------------------------------------- |
| **Best For**          | Vector-first workloads, complex filtering, high QPS performance| Hybrid search and high recall rates      |
| **Couchbase Version** | 8.0.0+                                                         | 7.6+                                      |
| **Filtering**         | Pre-filtering with `WHERE` clauses (Composite) or post-filtering (BHIVE) | Pre-filtering with flexible ordering |
| **Scalability**       | Up to billions of vectors (BHIVE)                              | Up to 10 million vectors                  |
| **Performance**       | Optimized for concurrent operations with low memory footprint  | Good for mixed text and vector queries   |


#### GSI Vector Index Types

Couchbase offers two distinct GSI vector index types, each optimized for different use cases:

##### Hyperscale Vector Indexes (BHIVE)

- **Best for**: Pure vector searches like content discovery, recommendations, and semantic search
- **Use when**: You primarily perform vector-only queries without complex scalar filtering
- **Features**: 
  - High performance with low memory footprint
  - Optimized for concurrent operations
  - Designed to scale to billions of vectors
  - Supports post-scan filtering for basic metadata filtering

##### Composite Vector Indexes

- **Best for**: Filtered vector searches that combine vector similarity with scalar value filtering
- **Use when**: Your queries combine vector similarity with scalar filters that eliminate large portions of data
- **Features**: 
  - Efficient pre-filtering where scalar attributes reduce the vector comparison scope
  - Best for well-defined workloads requiring complex filtering using GSI features
  - Supports range lookups combined with vector search

#### Index Type Selection for This Tutorial

In this tutorial, we'll demonstrate creating a **BHIVE index** and running vector similarity queries using GSI. BHIVE is ideal for semantic search scenarios where you want:

1. **High-performance vector search** across large datasets
2. **Low latency** for real-time applications
3. **Scalability** to handle growing vector collections
4. **Concurrent operations** for multi-user environments

The BHIVE index is well suited to our Mistral AI embedding-based semantic search implementation.

#### Alternative: Composite Vector Index

If your use case requires complex filtering with scalar attributes, you may want to consider using a **Composite Vector Index** instead:

```python
# Alternative: Create a Composite index for filtered searches
vector_store.create_index(
    index_type=IndexType.COMPOSITE,
    index_description="IVF,SQ8",
    distance_metric=DistanceStrategy.COSINE,
    index_name="mistral_composite_index",
)
```

**Use Composite indexes when:**
- You need to filter by document metadata or attributes before vector similarity
- Your queries combine vector search with WHERE clauses
- You have well-defined filtering requirements that can reduce the search space

**Note**: Composite indexes enable pre-filtering with scalar attributes, making them ideal for applications where you need to search within specific categories, date ranges, or user-specific data segments.

#### Understanding GSI Index Configuration (Couchbase 8.0 Feature)

Before creating our BHIVE index, it's important to understand the configuration parameters that optimize vector storage and search performance. The `index_description` parameter controls how Couchbase optimizes vector storage through centroids and quantization.

##### Index Description Format: `'IVF[<centroids>],{PQ|SQ}<settings>'`

##### Centroids (IVF - Inverted File)

- Controls how the dataset is subdivided for faster searches
- **More centroids** = faster search, slower training time
- **Fewer centroids** = slower search, faster training time
- If omitted (like `IVF,SQ8`), Couchbase auto-selects based on dataset size

##### Quantization Options

**Scalar Quantization (SQ):**
- `SQ4`, `SQ6`, `SQ8` (4, 6, or 8 bits per dimension)
- Lower memory usage, faster search, slightly reduced accuracy

**Product Quantization (PQ):**
- Format: `PQ<subquantizers>x<bits>` (e.g., `PQ32x8`)
- Better compression for very large datasets
- More complex but can maintain accuracy with smaller index size
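
To make the trade-off concrete, the following back-of-the-envelope sketch estimates the bytes needed to store the codes for one vector under each setting. It assumes 1024-dimensional vectors (the output size of `mistral-embed`) and 32-bit floats for uncompressed storage, and it ignores centroid tables and other index overhead, so treat the numbers as rough lower bounds rather than actual Couchbase index sizes. The function names are hypothetical.

```python
def raw_bytes(dims: int) -> int:
    """Uncompressed size, assuming one 32-bit float per dimension."""
    return dims * 4

def sq_bytes(dims: int, bits: int) -> float:
    """Scalar quantization: `bits` bits stored per dimension."""
    return dims * bits / 8

def pq_bytes(subquantizers: int, bits: int) -> float:
    """Product quantization: one `bits`-bit code per subquantizer."""
    return subquantizers * bits / 8

DIMS = 1024  # assumed embedding dimension

print("float32:", raw_bytes(DIMS), "bytes")
print("SQ8:    ", sq_bytes(DIMS, 8), "bytes")
print("SQ4:    ", sq_bytes(DIMS, 4), "bytes")
print("PQ32x8: ", pq_bytes(32, 8), "bytes")
```

Under these assumptions, SQ8 is a 4x reduction over float32, while PQ32x8 compresses each vector to 32 bytes at the cost of a coarser approximation.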

##### Common Configuration Examples

- **`IVF,SQ8`** - Auto centroids, 8-bit scalar quantization (good default)
- **`IVF1000,SQ6`** - 1000 centroids, 6-bit scalar quantization
- **`IVF,PQ32x8`** - Auto centroids, 32 subquantizers with 8 bits
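
A typo in the description string only surfaces when the index is created, so it can be handy to check the string locally first. The sketch below is a hypothetical parser (`parse_index_description` is not part of Couchbase or LangChain) that accepts only the forms listed in this tutorial. Couchbase may accept other variants, so a rejection here does not necessarily mean the server would reject the string.

```python
import re
from typing import NamedTuple, Optional

class IndexDescription(NamedTuple):
    centroids: Optional[int]  # None means the centroid count is auto-selected
    quantizer: str            # "SQ" or "PQ"
    settings: str             # e.g. "8" for SQ8, "32x8" for PQ32x8

# IVF, optional centroid count, comma, quantizer name, quantizer settings
_PATTERN = re.compile(r"^IVF(\d*),(SQ|PQ)(\d+(?:x\d+)?)$")

def parse_index_description(text: str) -> IndexDescription:
    """Parse strings such as 'IVF,SQ8', 'IVF1000,SQ6' or 'IVF,PQ32x8'."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized index description: {text!r}")
    centroids, quantizer, settings = match.groups()
    if quantizer == "SQ" and settings not in {"4", "6", "8"}:
        raise ValueError(f"SQ expects 4, 6 or 8 bits, got: {settings!r}")
    if quantizer == "PQ" and "x" not in settings:
        raise ValueError(f"PQ expects <subquantizers>x<bits>, got: {settings!r}")
    return IndexDescription(int(centroids) if centroids else None, quantizer, settings)

for description in ("IVF,SQ8", "IVF1000,SQ6", "IVF,PQ32x8"):
    print(description, "->", parse_index_description(description))
```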

For detailed configuration options, see the [Quantization & Centroid Settings](https://docs.couchbase.com/cloud/vector-index/hyperscale-vector-index.html#algo_settings).

For more information on GSI vector indexes, see [Couchbase GSI Vector Documentation](https://docs.couchbase.com/cloud/vector-index/use-vector-indexes.html).

##### Our Configuration Choice

In this tutorial, we use `IVF,SQ8` which provides:
- **Auto-selected centroids** optimized for our dataset size
- **8-bit scalar quantization** for good balance of speed, memory usage, and accuracy
- **COSINE distance metric** ideal for semantic similarity search
- **Optimal performance** for most semantic search use cases

# Setting Up Couchbase GSI Vector Store

Instead of using FTS (Full-Text Search), we'll use Couchbase's GSI (Global Secondary Index) for vector operations. GSI provides better performance for vector search operations and supports advanced index types like BHIVE and COMPOSITE indexes.


In [34]:
vector_store = CouchbaseQueryVectorStore(
    cluster=cluster,
    bucket_name=CB_BUCKET_NAME,
    scope_name=SCOPE_NAME,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    distance_metric=DistanceStrategy.COSINE
)

print("GSI Vector Store created successfully!")


GSI Vector Store created successfully!


# Embedding Documents

The Mistral client can be used to generate vector embeddings for given text fragments. These embeddings represent the semantic meaning of the corresponding fragments and can be stored in Couchbase for later retrieval. You can also add your own text to the `texts` array when prompted by the following code block:


In [35]:
texts = [
    "Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational databases such as SQL and ACID transactions with JSON's versatility, with a foundation that is extremely fast and scalable.",
    "It's used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vector search, high-speed caching, and much more.",
    input("custom embedding text")
]

# Store documents in the GSI vector store
vector_store.add_texts(texts)

print("Documents added to GSI vector store successfully!")


2025-11-07 15:50:09,439 - INFO - HTTP Request: POST https://api.mistral.ai/v1/embeddings "HTTP/1.1 200 OK"


Documents added to GSI vector store successfully!


# Understanding Semantic Search in Couchbase

Semantic search goes beyond traditional keyword matching by understanding the meaning and context behind queries. Here's how it works in Couchbase:

## How Semantic Search Works

1. **Vector Embeddings**: Documents and queries are converted into high-dimensional vectors using an embeddings model (in our case, Mistral AI's mistral-embed)

2. **Similarity Calculation**: When a query is made, Couchbase compares the query vector against stored document vectors using the COSINE distance metric

3. **Result Ranking**: Documents are ranked by their vector distance (lower distance = more similar meaning)

4. **Flexible Configuration**: Different distance metrics (cosine, euclidean, dot product) and embedding models can be used based on your needs

The `similarity_search_with_score` method performs this entire process, returning documents along with their similarity scores. This enables you to find semantically related content even when exact keywords don't match.
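
The ranking step can be illustrated without a database. The sketch below assumes cosine distance is defined as 1 minus cosine similarity and uses made-up three-dimensional vectors in place of real embeddings. `cosine_distance` and `rank_by_distance` are hypothetical helpers for illustration, not the code Couchbase runs.

```python
import math
from typing import Dict, List, Sequence, Tuple

def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)

def rank_by_distance(query: Sequence[float],
                     docs: Dict[str, Sequence[float]],
                     k: int) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k closest."""
    scored = [(doc_id, cosine_distance(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]

# Toy vectors invented for this example
docs = {
    "db": [0.9, 0.1, 0.0],
    "cache": [0.6, 0.6, 0.2],
    "weather": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

for doc_id, distance in rank_by_distance(query, docs, k=2):
    print(f"{doc_id}: {distance:.4f}")
```

A brute-force scan like this compares the query with every stored vector, which is the cost a vector index is meant to avoid on large collections.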

Now let's see semantic search in action and measure its performance with different optimization strategies.


# Vector Search Performance Optimization

Now let's measure and compare the performance benefits of different optimization strategies. We'll conduct a comprehensive performance analysis across two phases:

## Performance Testing Phases

1. **Phase 1 - Baseline Performance**: Test vector search without GSI indexes to establish baseline metrics

2. **Phase 2 - GSI-Optimized Search**: Create BHIVE index and measure performance improvements

**Important Context:**

- GSI performance benefits scale with dataset size and concurrent load
- With our dataset (~3 documents), improvements may be modest
- Production environments with millions of vectors show significant GSI advantages
- The combination of GSI + embeddings provides optimal semantic search performance


# Phase 1: Baseline Performance (Without GSI Index)

First, let's test the search performance without any GSI indexes. This will help us establish a baseline for comparison.


In [None]:
import logging

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Phase 1: Baseline Performance (Without GSI Index)
print("="*80)
print("PHASE 1: BASELINE PERFORMANCE (NO GSI INDEX)")
print("="*80)

query = "name a multipurpose database with distributed capability"

try:
    # Perform the semantic search
    start_time = time.time()
    search_results = vector_store.similarity_search_with_score(query, k=3)
    baseline_time = time.time() - start_time

    logging.info(f"Baseline search completed in {baseline_time:.2f} seconds")

    # Display search results
    print(f"\nBaseline Search Results (completed in {baseline_time:.4f} seconds):")
    print("-" * 80)
    for i, (doc, distance) in enumerate(search_results, 1):
        print(f"[Result {i}] Vector Distance: {distance:.4f}")
        # Truncate for readability
        content_preview = doc.page_content[:150] + "..." if len(doc.page_content) > 150 else doc.page_content
        print(f"Text: {content_preview}")
        print("-" * 80)

except Exception as e:
    raise RuntimeError(f"Error performing semantic search: {str(e)}")
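
A single wall-clock measurement is noisy, and the time measured above likely also includes the request to the Mistral API that embeds the query. One way to get a steadier number is to repeat the call and report the median. The sketch below uses a hypothetical `time_call` helper and a stand-in workload so that it runs without a cluster.

```python
import statistics
import time
from typing import Any, Callable, Tuple

def time_call(fn: Callable[[], Any], repeats: int = 5) -> Tuple[Any, float]:
    """Run fn `repeats` times; return the last result and the median elapsed seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution timer
        result = fn()
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations)

# Stand-in workload for illustration
result, median_seconds = time_call(lambda: sum(range(100_000)))
print(f"median over 5 runs: {median_seconds:.6f} s")
```

In the notebook, the same helper could wrap the search call, for example `time_call(lambda: vector_store.similarity_search_with_score(query, k=3))`.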


# Optimizing Vector Search with Global Secondary Index (GSI)

While the above semantic search using similarity_search_with_score works effectively, we can significantly improve query performance by leveraging Global Secondary Index (GSI) in Couchbase.

Couchbase offers three types of vector indexes, but for GSI-based vector search we focus on two main types:

## Hyperscale Vector Indexes (BHIVE)
- Best for pure vector searches - content discovery, recommendations, semantic search
- High performance with low memory footprint - designed to scale to billions of vectors
- Optimized for concurrent operations - supports simultaneous searches and inserts
- Use when: You primarily perform vector-only queries without complex scalar filtering
- Ideal for: Large-scale semantic search, recommendation systems, content discovery

## Composite Vector Indexes 
- Best for filtered vector searches - combines vector search with scalar value filtering
- Efficient pre-filtering - scalar attributes reduce the vector comparison scope
- Use when: Your queries combine vector similarity with scalar filters that eliminate large portions of data
- Ideal for: Compliance-based filtering, user-specific searches, time-bounded queries

## Choosing the Right Index Type
- Start with Hyperscale Vector Index for pure vector searches and large datasets
- Use Composite Vector Index when scalar filters significantly reduce your search space
- Consider your dataset size: Hyperscale scales to billions, Composite works well for tens of millions to billions

For more details, see the [Couchbase Vector Index documentation](https://docs.couchbase.com/cloud/vector-index/use-vector-indexes.html).


## Understanding Index Configuration (Couchbase 8.0 Feature)

The `index_description` parameter controls how Couchbase optimizes vector storage and search performance through centroids and quantization:

Format: `'IVF[<centroids>],{PQ|SQ}<settings>'`

### Centroids (IVF - Inverted File):
- Controls how the dataset is subdivided for faster searches
- More centroids = faster search, slower training  
- Fewer centroids = slower search, faster training
- If omitted (like `IVF,SQ8`), Couchbase auto-selects based on dataset size
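
The idea behind centroids can be shown with a toy inverted-file structure: each vector is stored in the list of its nearest centroid, and a query scans only the lists of the closest centroids. The sketch below is an illustration under simplifying assumptions, not Couchbase's implementation. The centroids are fixed by hand (a real index learns them during training), squared Euclidean distance is used for simplicity, and `nprobe` is a hypothetical parameter name borrowed from common IVF terminology.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]

def squared_l2(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroids(v: Vector, centroids: List[Vector], n: int) -> List[int]:
    """Indexes of the n centroids closest to v."""
    order = sorted(range(len(centroids)), key=lambda i: squared_l2(v, centroids[i]))
    return order[:n]

def build_ivf(vectors: Dict[str, Vector], centroids: List[Vector]) -> Dict[int, List[str]]:
    """Assign each vector key to the list of its nearest centroid."""
    lists: Dict[int, List[str]] = {i: [] for i in range(len(centroids))}
    for key, vec in vectors.items():
        lists[nearest_centroids(vec, centroids, 1)[0]].append(key)
    return lists

def ivf_search(query: Vector, vectors: Dict[str, Vector], centroids: List[Vector],
               lists: Dict[int, List[str]], k: int, nprobe: int = 1) -> List[Tuple[str, float]]:
    """Scan only the lists of the nprobe closest centroids, then keep the k best."""
    candidates = [key for c in nearest_centroids(query, centroids, nprobe) for key in lists[c]]
    scored = [(key, squared_l2(query, vectors[key])) for key in candidates]
    return sorted(scored, key=lambda pair: pair[1])[:k]

# Toy two-dimensional data invented for this example
centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = {"a": (0.5, 0.2), "b": (1.0, 1.5), "c": (9.0, 9.5), "d": (11.0, 10.0)}

lists = build_ivf(vectors, centroids)
print(lists)
print(ivf_search((0.8, 1.0), vectors, centroids, lists, k=2))
```

With one probe, the query is compared against two vectors instead of four. More centroids make each list shorter, which is why search gets faster while training takes longer.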

### Quantization Options:
- SQ (Scalar Quantization): `SQ4`, `SQ6`, `SQ8` (4, 6, or 8 bits per dimension)
- PQ (Product Quantization): `PQ<subquantizers>x<bits>` (e.g., `PQ32x8`)
- Higher values = better accuracy, larger index size
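
The effect of the bit width can be seen with a minimal scalar quantization round trip. The sketch below assumes uniform quantization over a fixed range of [-1, 1], which is a simplification chosen for illustration and not necessarily how Couchbase picks its ranges. `sq_encode`, `sq_decode`, and `max_error` are hypothetical helpers.

```python
from typing import List, Sequence

def sq_encode(vec: Sequence[float], bits: int,
              lo: float = -1.0, hi: float = 1.0) -> List[int]:
    """Map each value, clipped to [lo, hi], to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    codes = []
    for x in vec:
        clipped = min(max(x, lo), hi)
        codes.append(round((clipped - lo) / (hi - lo) * levels))
    return codes

def sq_decode(codes: Sequence[int], bits: int,
              lo: float = -1.0, hi: float = 1.0) -> List[float]:
    """Map integer codes back to approximate float values."""
    levels = (1 << bits) - 1
    return [lo + code / levels * (hi - lo) for code in codes]

def max_error(vec: Sequence[float], bits: int) -> float:
    """Largest absolute difference between vec and its quantized round trip."""
    restored = sq_decode(sq_encode(vec, bits), bits)
    return max(abs(x - y) for x, y in zip(vec, restored))

# Toy vector invented for this example
vec = [0.12, -0.57, 0.93, -0.04]
for bits in (4, 6, 8):
    print(f"SQ{bits}: max reconstruction error = {max_error(vec, bits):.4f}")
```

Each extra bit roughly halves the quantization step, so the reconstruction error shrinks while the stored code grows.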

### Common Examples:
- `IVF,SQ8` - Auto centroids, 8-bit scalar quantization (good default)
- `IVF1000,SQ6` - 1000 centroids, 6-bit scalar quantization
- `IVF,PQ32x8` - Auto centroids, 32 subquantizers with 8 bits

For detailed configuration options, see the [Quantization & Centroid Settings](https://docs.couchbase.com/cloud/vector-index/hyperscale-vector-index.html#algo_settings).

In the code below, we demonstrate creating a BHIVE index. The `create_index` method takes an index type (BHIVE or COMPOSITE) and an index description for the optimization settings. Alternatively, GSI indexes can be created manually from the Couchbase UI.


# Phase 2: GSI-Optimized Performance (With BHIVE Index)

Now let's create a BHIVE index and measure the performance improvements when searching with GSI optimization.


In [None]:
# Create a BHIVE index for optimal vector search performance
print("\nCreating BHIVE index for GSI optimization...")
vector_store.create_index(
    index_type=IndexType.BHIVE, 
    index_name="mistral_bhive_index_optimized",
    index_description="IVF,SQ8"
)
print("BHIVE index created successfully!")


Note: To create a COMPOSITE index instead, the code below can be used.
Choose based on your specific use case and query patterns. For this tutorial's semantic search scenario, either index type would work, but BHIVE is likely to be more efficient for pure vector search without scalar filters.

`vector_store.create_index(index_type=IndexType.COMPOSITE, index_name="mistral_composite_index", index_description="IVF,SQ8")`

In [None]:
# Phase 2: GSI-Optimized Performance (With BHIVE Index)
print("\n" + "="*80)
print("PHASE 2: GSI-OPTIMIZED PERFORMANCE (WITH BHIVE INDEX)")
print("="*80)

query = "name a multipurpose database with distributed capability"

try:
    # Perform the semantic search with GSI
    start_time = time.time()
    search_results = vector_store.similarity_search_with_score(query, k=3)
    gsi_time = time.time() - start_time

    logging.info(f"GSI-optimized search completed in {gsi_time:.2f} seconds")

    # Display search results
    print(f"\nGSI-Optimized Search Results (completed in {gsi_time:.4f} seconds):")
    print("-" * 80)
    for i, (doc, distance) in enumerate(search_results, 1):
        print(f"[Result {i}] Vector Distance: {distance:.4f}")
        # Truncate for readability
        content_preview = doc.page_content[:150] + "..." if len(doc.page_content) > 150 else doc.page_content
        print(f"Text: {content_preview}")
        print("-" * 80)

except Exception as e:
    raise RuntimeError(f"Error performing semantic search: {str(e)}")


# Performance Summary

Let's analyze the performance improvements achieved through GSI optimization.


In [None]:
print("\n" + "="*80)
print("VECTOR SEARCH PERFORMANCE OPTIMIZATION SUMMARY")
print("="*80)

print(f"\n📊 Performance Comparison:")
print(f"{'Optimization Level':<35} {'Time (seconds)':<20} {'Status'}")
print("-" * 80)
print(f"{'Phase 1 - Baseline (No Index)':<35} {baseline_time:.4f}{'':16} ⚪ Baseline")
print(f"{'Phase 2 - GSI-Optimized (BHIVE)':<35} {gsi_time:.4f}{'':16} ✅ Optimized")

# Calculate improvement
if baseline_time > gsi_time:
    speedup = baseline_time / gsi_time
    improvement = ((baseline_time - gsi_time) / baseline_time) * 100
    print(f"\n✨ GSI Performance Gain: {speedup:.2f}x faster ({improvement:.1f}% improvement)")
elif gsi_time > baseline_time:
    slowdown_pct = ((gsi_time - baseline_time) / baseline_time) * 100
    print(f"\n⚠️  Note: GSI was {slowdown_pct:.1f}% slower than baseline in this run")
    print(f"   This can happen with small datasets. GSI benefits emerge with scale.")
else:
    print(f"\n⚖️  Performance: Comparable to baseline")

print("\n" + "-"*80)
print("KEY INSIGHTS:")
print("-"*80)
print("1. 🚀 GSI Optimization:")
print("   • BHIVE indexes excel with large-scale datasets (millions+ vectors)")
print("   • Performance gains increase with dataset size and concurrent queries")
print("   • Optimal for production workloads with sustained traffic patterns")

print("\n2. 📦 Dataset Size Impact:")
print(f"   • Current dataset: ~3 sample documents")
print("   • At this scale, performance differences may be minimal or variable")
print("   • Significant gains typically seen with 10M+ vectors")

print("\n3. 🎯 When to Use GSI:")
print("   • Large-scale vector search applications")
print("   • High query-per-second (QPS) requirements")
print("   • Multi-user concurrent access scenarios")
print("   • Production environments requiring scalability")

print("\n" + "="*80)


# Conclusion

This tutorial demonstrated how to use Mistral AI's embedding capabilities with Couchbase's GSI vector search, including comprehensive performance analysis. Key takeaways include:

## What We Covered

1. **Semantic Search Fundamentals**: Understanding how vector embeddings enable meaning-based search
2. **Mistral AI Integration**: Creating a custom LangChain wrapper for Mistral AI's powerful mistral-embed model
3. **Performance Testing**: Conducting baseline vs GSI-optimized performance comparisons
4. **GSI Index Types**: Understanding BHIVE (pure vector search) and COMPOSITE (filtered searches) indexes
5. **Index Configuration**: Learning about centroids, quantization, and optimization settings

## Key Benefits of This Approach

1. **High-Performance Vector Search**: GSI provides optimized vector operations with low latency
2. **Scalability**: BHIVE indexes are designed to handle billions of vectors efficiently
3. **Production-Ready**: Optimal for applications requiring high QPS and concurrent access
4. **Flexible Configuration**: Customizable index settings for different use cases
5. **Advanced Filtering**: COMPOSITE indexes enable complex scalar + vector queries

## Performance Insights

- GSI benefits scale with dataset size and query load
- Small datasets may show modest improvements
- Production environments with millions of vectors see significant performance gains
- Consider your specific use case when choosing between BHIVE and COMPOSITE indexes

## Next Steps

- Scale your dataset to explore GSI performance at higher volumes
- Experiment with different index configurations (IVF centroids, quantization settings)
- Try COMPOSITE indexes for filtered search scenarios
- Integrate this solution into your production RAG or semantic search applications

The combination of Mistral AI's embeddings and Couchbase's GSI vector search provides a powerful, scalable foundation for building intelligent search applications.
