# Introduction

In this guide, we will walk you through building a semantic search engine using Couchbase as the backend database and [Mistral AI](https://mistral.ai/) as the provider of the embedding model. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions for creating a working semantic search system from scratch. Alternatively, if you want to perform semantic search using FTS (Full-Text Search), please take a look at [this tutorial](https://developer.couchbase.com//tutorial-mistralai-couchbase-vector-search-with-fts).

Couchbase is a NoSQL distributed document database (JSON) with many of the best features of a relational DBMS: SQL, distributed ACID transactions, and much more. [Couchbase Capella™](https://cloud.couchbase.com/sign-up) is the easiest way to get started, but you can also download and run [Couchbase Server](http://couchbase.com/downloads) on-premises.

Mistral AI is a research lab building the best open source models in the world. La Plateforme enables developers and enterprises to build new products and applications, powered by Mistral's open source and commercial LLMs. 

The [Mistral AI APIs](https://console.mistral.ai/) empower LLM applications via:

- [Text generation](https://docs.mistral.ai/capabilities/completion/), enables streaming and provides the ability to display partial model results in real-time
- [Code generation](https://docs.mistral.ai/capabilities/code_generation/), empowers code generation tasks, including fill-in-the-middle and code completion
- [Embeddings](https://docs.mistral.ai/capabilities/embeddings/), useful for RAG where it represents the meaning of text as a list of numbers
- [Function calling](https://docs.mistral.ai/capabilities/function_calling/), enables Mistral models to connect to external tools
- [Fine-tuning](https://docs.mistral.ai/capabilities/finetuning/), enables developers to create customized and specialized models
- [JSON mode](https://docs.mistral.ai/capabilities/json_mode/), enables developers to set the response format to json_object
- [Guardrailing](https://docs.mistral.ai/capabilities/guardrailing/), enables developers to enforce policies at the system level of Mistral models
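To make the Embeddings item in the list above more concrete, here is a minimal sketch of how two "lists of numbers" can be compared. The three-dimensional vectors are made up for illustration (real embedding vectors are much longer and come from the model), and `cosine_similarity` is a hypothetical helper, not part of the Mistral AI or Couchbase APIs.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings"; real ones would come from an embedding model.
database_doc = [0.9, 0.1, 0.0]
database_query = [0.8, 0.2, 0.1]
cooking_doc = [0.0, 0.2, 0.9]

# The query points in nearly the same direction as the database document,
# so its similarity to that document is much higher than to the cooking one.
print(cosine_similarity(database_query, database_doc))
print(cosine_similarity(database_query, cooking_doc))
```

Vectors that point in similar directions score close to 1, while unrelated ones score close to 0. This is the intuition behind ranking documents by how close their embeddings are to the query embedding.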

This tutorial demonstrates how to use Mistral AI's embedding capabilities with Couchbase's **Global Secondary Index (GSI)** for optimized vector search operations. GSI provides superior performance for vector operations compared to traditional search methods, especially for large-scale applications.


# How to run this tutorial

This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/mistralai/gsi/mistralai.ipynb).

You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.


# Before you start

## Get Credentials for Mistral AI

Please follow the [instructions](https://console.mistral.ai/api-keys/) to generate the Mistral AI credentials.

## Create and Deploy Your Free Tier Operational cluster on Capella

To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.

To learn more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).

**Note: To run this tutorial, you will need Capella with Couchbase Server version 8.0 or above as GSI vector search is supported only from version 8.0.**

### Couchbase Capella Configuration

When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.

* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the bucket (Read and Write) used in the application.
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.


# Install necessary libraries


In [None]:
%pip install couchbase==4.4.0 mistralai==1.9.10 langchain-couchbase==0.5.0rc1 langchain-core==0.3.76 python-dotenv==1.1.1


Collecting langchain-core==0.3.76
  Using cached langchain_core-0.3.76-py3-none-any.whl.metadata (3.7 kB)
Collecting langsmith>=0.3.45 (from langchain-core==0.3.76)
  Using cached langsmith-0.4.30-py3-none-any.whl.metadata (14 kB)
Using cached langchain_core-0.3.76-py3-none-any.whl (447 kB)
Using cached langsmith-0.4.30-py3-none-any.whl (386 kB)
Installing collected packages: langsmith, langchain-core
  Attempting uninstall: langsmith
    Found existing installation: langsmith 0.2.11
    Uninstalling langsmith-0.2.11:
      Successfully uninstalled langsmith-0.2.11
  Attempting uninstall: langchain-core
    Found existing installation: langchain-core 0.3.28
    Uninstalling langchain-core-0.3.28:
      Successfully uninstalled langchain-core-0.3.28

# Imports


In [None]:
from datetime import timedelta
from mistralai import Mistral
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from langchain_couchbase.vectorstores import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy, IndexType
from langchain_core.embeddings import Embeddings
from typing import List
from dotenv import load_dotenv
import os


# Prerequisites


In [None]:
import getpass

# Load environment variables from .env file if it exists
load_dotenv()

# Load from environment variables or prompt for input
couchbase_cluster_url = os.getenv('COUCHBASE_CLUSTER_URL') or input("Cluster URL:")
couchbase_username = os.getenv('COUCHBASE_USERNAME') or input("Couchbase username:")
couchbase_password = os.getenv('COUCHBASE_PASSWORD') or getpass.getpass("Couchbase password:")
couchbase_bucket = os.getenv('COUCHBASE_BUCKET') or input("Couchbase bucket:")
couchbase_scope = os.getenv('COUCHBASE_SCOPE') or input("Couchbase scope:")
couchbase_collection = os.getenv('COUCHBASE_COLLECTION') or input("Couchbase collection:")
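Before running the cell above, it can help to see which values will come from the environment and which will trigger a prompt. The following is a minimal sketch; `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names introduced here, and the variable names are the ones read by this tutorial (including `MISTRAL_API_KEY`, which is read in a later cell).

```python
import os
from typing import List, Mapping

# Environment variable names read by this tutorial's cells.
REQUIRED_SETTINGS = (
    "COUCHBASE_CLUSTER_URL",
    "COUCHBASE_USERNAME",
    "COUCHBASE_PASSWORD",
    "COUCHBASE_BUCKET",
    "COUCHBASE_SCOPE",
    "COUCHBASE_COLLECTION",
    "MISTRAL_API_KEY",
)


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the required names that are unset or blank in `env`.

    Assumption: whitespace-only values are treated as missing, which is
    slightly stricter than the `os.getenv(...) or input(...)` pattern.
    """
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


missing = missing_settings(os.environ)
if missing:
    print("Not set in the environment:", ", ".join(missing))
else:
    print("All settings were found in the environment.")
```

Run it after `load_dotenv()` so that values from a `.env` file are taken into account.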


# Couchbase Connection


In [7]:
auth = PasswordAuthenticator(
    couchbase_username,
    couchbase_password
)


In [8]:
cluster = Cluster(couchbase_cluster_url, ClusterOptions(auth))
cluster.wait_until_ready(timedelta(seconds=5))

bucket = cluster.bucket(couchbase_bucket)
scope = bucket.scope(couchbase_scope)
collection = scope.collection(couchbase_collection)
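If the connection above times out, a malformed connection string is one possible cause worth ruling out. The sketch below is a rough sanity check, not part of the Couchbase SDK. `connection_string_warnings` is a hypothetical helper, and the assumption that Capella hostnames end in `.cloud.couchbase.com` and need the TLS scheme `couchbases://` should be checked against the connection string shown in your Capella console.

```python
from typing import List
from urllib.parse import urlparse


def connection_string_warnings(url: str) -> List[str]:
    """Return human-readable warnings about a Couchbase connection string."""
    warnings: List[str] = []
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme not in ("couchbase", "couchbases"):
        warnings.append("expected a couchbase:// or couchbases:// scheme")
    elif parsed.scheme == "couchbase" and host.endswith(".cloud.couchbase.com"):
        # Assumption: Capella endpoints require TLS.
        warnings.append("Capella endpoints normally use the TLS scheme couchbases://")
    return warnings


# Hypothetical hostnames used purely for illustration.
for candidate in (
    "couchbases://cb.example.cloud.couchbase.com",
    "couchbase://cb.example.cloud.couchbase.com",
    "cb.example.cloud.couchbase.com",
):
    print(candidate, "->", connection_string_warnings(candidate) or "looks fine")
```

An empty list only means that the string has a plausible shape. It does not confirm that the cluster is reachable or that the IP address is on the allowed list.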


# Creating Mistral AI Embeddings Wrapper

Rather than depending on a separate LangChain integration package for Mistral AI, this tutorial uses a small custom wrapper class that implements the LangChain `Embeddings` interface on top of the `mistralai` client. This allows us to use Mistral AI's embedding model with Couchbase's GSI vector store.


In [9]:
class MistralAIEmbeddings(Embeddings):
    """Custom Mistral AI Embeddings wrapper for LangChain compatibility."""
    
    def __init__(self, api_key: str, model: str = "mistral-embed"):
        self.client = Mistral(api_key=api_key)
        self.model = model
    
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed search docs."""
        try:
            response = self.client.embeddings.create(
                model=self.model,
                inputs=texts,
            )
            return [embedding.embedding for embedding in response.data]
        except Exception as e:
            # Chain the original exception so its traceback is preserved
            raise ValueError(f"Error generating embeddings: {str(e)}") from e
    
    def embed_query(self, text: str) -> List[float]:
        """Embed query text."""
        try:
            response = self.client.embeddings.create(
                model=self.model,
                inputs=[text],
            )
            return response.data[0].embedding
        except Exception as e:
            # Chain the original exception so its traceback is preserved
            raise ValueError(f"Error generating query embedding: {str(e)}") from e
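The wrapper's `embed_documents` method sends all texts in a single request, which is fine for the handful of sentences used here. For larger document sets, one option is to split the texts into batches first. The sketch below assumes a batch size of 32, which is an arbitrary illustrative value and not a documented Mistral AI limit. `batched`, `embed_in_batches`, and `fake_embed` are hypothetical helpers, and the stand-in embedder exists only so that the example runs offline.

```python
from typing import Callable, Iterator, List, Sequence


def batched(texts: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


def embed_in_batches(
    embed_fn: Callable[[List[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 32,  # arbitrary illustrative value
) -> List[List[float]]:
    """Call `embed_fn` once per batch and concatenate the results in input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


def fake_embed(batch: List[str]) -> List[List[float]]:
    """Stand-in embedder: one-dimensional 'embedding' equal to the text length."""
    return [[float(len(text))] for text in batch]


print(embed_in_batches(fake_embed, ["a", "bb", "ccc"], batch_size=2))
# [[1.0], [2.0], [3.0]]
```

With the wrapper from this section, the call would look like `embed_in_batches(embeddings.embed_documents, texts)` once `embeddings` has been created.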


# Mistral Connection


In [None]:
MISTRAL_API_KEY = os.getenv('MISTRAL_API_KEY') or getpass.getpass("Mistral API Key:")
embeddings = MistralAIEmbeddings(api_key=MISTRAL_API_KEY, model="mistral-embed")
mistral_client = Mistral(api_key=MISTRAL_API_KEY)


# Setting Up Couchbase GSI Vector Store

Instead of using FTS (Full-Text Search), we'll use Couchbase's GSI (Global Secondary Index) for vector operations. GSI provides better performance for vector search operations and supports advanced index types like BHIVE and COMPOSITE indexes.


In [11]:
vector_store = CouchbaseQueryVectorStore(
    cluster=cluster,
    bucket_name=couchbase_bucket,
    scope_name=couchbase_scope,
    collection_name=couchbase_collection,
    embedding=embeddings,
    distance_metric=DistanceStrategy.COSINE
)

print("GSI Vector Store created successfully!")


GSI Vector Store created successfully!


# Embedding Documents

The Mistral client can be used to generate vector embeddings for given text fragments. These embeddings represent the semantic meaning of the corresponding fragments and can be stored in Couchbase for later retrieval. You can also add a custom text to the `texts` array when prompted by this code block:


In [12]:
texts = [
    "Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational databases such as SQL and ACID transactions with JSON's versatility, with a foundation that is extremely fast and scalable.",
    "It's used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vector search, high-speed caching, and much more.",
    input("Custom embedding text: ")
]

# Store documents in the GSI vector store
vector_store.add_texts(texts)

print("Documents added to GSI vector store successfully!")


Documents added to GSI vector store successfully!
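If the `input(...)` prompt is left empty, the empty string is embedded and stored like any other document, which appears to be what produced the empty third hit in the sample search output later in this tutorial. One possible safeguard is to clean the list before calling `add_texts`. The helper below is a hypothetical sketch, not part of `langchain-couchbase`.

```python
from typing import Iterable, List


def clean_texts(texts: Iterable[str]) -> List[str]:
    """Strip whitespace, drop empty strings, and remove exact duplicates.

    The original order of the remaining texts is preserved.
    """
    seen = set()
    cleaned: List[str] = []
    for text in texts:
        stripped = text.strip()
        if stripped and stripped not in seen:
            seen.add(stripped)
            cleaned.append(stripped)
    return cleaned


print(clean_texts(["Couchbase Server", "  ", "", "Couchbase Server", "vector search "]))
# ['Couchbase Server', 'vector search']
```

In the cell above, this would be used as `vector_store.add_texts(clean_texts(texts))`.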


# Creating GSI Vector Index for Optimal Performance

GSI supports different types of vector indexes for optimal performance:

- **BHIVE (Hyperscale Vector Index)**: Best for pure vector searches with high performance and low memory footprint
- **COMPOSITE**: Best for filtered vector searches that combine vector similarity with scalar filtering

Let's create a BHIVE index for our use case:


In [13]:
# Create a BHIVE index for optimal vector search performance
vector_store.create_index(
    index_type=IndexType.BHIVE, 
    index_name="mistral_bhive_index",
    index_description="IVF,SQ8"
)

print("BHIVE index created successfully!")


BHIVE index created successfully!
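In the index description `"IVF,SQ8"`, `IVF` refers to an inverted-file layout built around centroids, and `SQ8` refers to scalar quantization with 8 bits per dimension. The following is a simplified, self-contained illustration of the general idea behind 8-bit scalar quantization. It is not Couchbase's implementation, and `quantize` and `dequantize` are hypothetical helpers that use each vector's own minimum and maximum as the value range.

```python
from typing import List, Tuple


def quantize(vector: List[float]) -> Tuple[List[int], float, float]:
    """Map each float to an integer code in 0..255 over the vector's min/max range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255
    if scale == 0.0:
        scale = 1.0  # constant vector: every value maps to code 0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def dequantize(codes: List[int], lo: float, scale: float) -> List[float]:
    """Reconstruct approximate floats from the 8-bit codes."""
    return [lo + code * scale for code in codes]


vector = [0.12, -0.53, 0.98, 0.0]
codes, lo, scale = quantize(vector)
restored = dequantize(codes, lo, scale)
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print("codes:", codes)
print(f"max reconstruction error: {max_error:.4f} (step size {scale:.4f})")
```

Each dimension is stored in one byte instead of a full-size float, at the cost of a reconstruction error of at most about half a quantization step. That trade-off between memory and precision is what the quantization part of the index description controls.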


# Searching For Embeddings with GSI

Now we can search using GSI vector operations. The GSI vector store handles the embedding of the query text and the similarity search internally. In the output below, the most relevant document has the lowest score, which is consistent with the score being a vector distance (lower means closer):


In [14]:
import time

# Test query
query = "name a multipurpose database with distributed capability"

# Perform GSI-optimized similarity search
start_time = time.time()
search_results = vector_store.similarity_search_with_score(query, k=3)
search_time = time.time() - start_time

print(f"GSI Vector Search completed in {search_time:.4f} seconds")
print("-" * 60)

for i, (doc, score) in enumerate(search_results):
    print(f"Result {i+1}:")
    print(f"Score: {score:.6f}")
    print(f"Text: {doc.page_content}")
    print("-" * 60)


GSI Vector Search completed in 1.3109 seconds
------------------------------------------------------------
Result 1:
Score: 0.286969
Text: Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational databases such as SQL and ACID transactions with JSON's versatility, with a foundation that is extremely fast and scalable.
------------------------------------------------------------
Result 2:
Score: 0.348376
Text: It's used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vector search, high-speed caching, and much more.
------------------------------------------------------------
Result 3:
Score: 0.436688
Text: 
------------------------------------------------------------
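In the output above, the scores increase from the best match to the weakest one, and the last hit is an empty text. One possible way to drop weak hits is a distance cutoff. In the sketch below, `filter_by_distance` is a hypothetical helper, plain strings stand in for the returned document objects, and the `0.4` threshold is an arbitrary value chosen for this sample rather than a recommended setting.

```python
from typing import Any, List, Tuple


def filter_by_distance(
    results: List[Tuple[Any, float]], max_distance: float
) -> List[Tuple[Any, float]]:
    """Keep (doc, score) pairs whose score is at most `max_distance`, closest first."""
    kept = [(doc, score) for doc, score in results if score <= max_distance]
    return sorted(kept, key=lambda pair: pair[1])


# Scores copied from the sample output above; texts shortened for readability.
sample_results = [
    ("Couchbase Server is a multipurpose, distributed database ...", 0.286969),
    ("It's used across industries for things like user profiles ...", 0.348376),
    ("", 0.436688),
]

for doc, score in filter_by_distance(sample_results, max_distance=0.4):
    print(f"{score:.6f}  {doc}")
```

A suitable cutoff depends on the embedding model, the distance metric, and the data, so it is usually chosen by inspecting scores for known good and bad matches.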


# GSI Performance Benefits

The GSI approach provides several advantages over traditional FTS methods:

1. **Better Performance**: GSI vector operations are optimized for similarity search
2. **Scalability**: BHIVE indexes can handle billions of vectors efficiently
3. **Memory Optimization**: Lower memory footprint compared to FTS
4. **Concurrent Operations**: Supports simultaneous searches and inserts
5. **Advanced Configuration**: Configurable centroids and quantization options

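Let's run several queries and time each one. Note that each measured time covers the whole call, including the request to Mistral AI that embeds the query text, so it is not a measure of index lookup time alone: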


In [15]:
# Test multiple queries to demonstrate GSI performance
test_queries = [
    "fast and scalable database solution",
    "JSON document database with SQL support",
    "high-speed caching for applications"
]

print("GSI Vector Search Performance Tests:")
print("=" * 60)

for i, query in enumerate(test_queries, 1):
    start_time = time.time()
    results = vector_store.similarity_search_with_score(query, k=1)
    search_time = time.time() - start_time
    
    print(f"Query {i}: {query}")
    print(f"Search Time: {search_time:.4f} seconds")
    if results:
        doc, score = results[0]
        print(f"Best Match (Score: {score:.6f}): {doc.page_content[:100]}...")
    print("-" * 60)


GSI Vector Search Performance Tests:
============================================================
Query 1: fast and scalable database solution
Search Time: 1.2303 seconds
Best Match (Score: 0.278119): Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational data...
------------------------------------------------------------
Query 2: JSON document database with SQL support
Search Time: 0.4636 seconds
Best Match (Score: 0.182554): Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational data...
------------------------------------------------------------
Query 3: high-speed caching for applications
Search Time: 0.4016 seconds
Best Match (Score: 0.261582): It's used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vec...
------------------------------------------------------------
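Single measurements like these can vary considerably from run to run, and the first query in this sample took noticeably longer than the next two. A slightly more robust, though still informal, approach is to repeat each call after a warm-up run and report the median. `time_call` is a hypothetical helper, and the stand-in workload is used only so that the sketch runs without a cluster.

```python
import statistics
import time
from typing import Any, Callable, List, Tuple


def time_call(
    fn: Callable[[], Any], repeats: int = 5, warmup: int = 1
) -> Tuple[Any, float, List[float]]:
    """Run `fn` repeatedly; return its last result, the median duration, and all durations."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    for _ in range(warmup):
        fn()  # untimed runs to absorb one-off setup costs
    durations: List[float] = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations), durations


# Stand-in workload; replace the lambda with a real search call.
result, median_s, durations = time_call(lambda: sum(range(10_000)), repeats=5)
print(f"median: {median_s:.6f} s over {len(durations)} runs")
```

With the vector store from this tutorial, a call could look like `time_call(lambda: vector_store.similarity_search_with_score(query, k=1))`. Keep in mind that every repeat also sends an embedding request to Mistral AI.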


# Additional GSI Index Configuration (Optional)

For more complex use cases, you can also create a COMPOSITE index which combines vector search with scalar filtering:


In [16]:
# Optional: Create a COMPOSITE index for filtered vector searches
#   vector_store.create_index(
#       index_type=IndexType.COMPOSITE, 
#       index_name="mistral_composite_index",
#       index_description="(type, vector_embedding)"
#   )

# Conclusion

This tutorial demonstrated how to use Mistral AI's embedding capabilities with Couchbase's GSI vector search for optimal performance. Key benefits of this approach include:

1. **GSI Performance**: Faster vector operations compared to traditional search methods
2. **Mistral AI Integration**: Powerful embedding model with custom LangChain wrapper
3. **Scalability**: BHIVE indexes handle large-scale vector operations efficiently
4. **Flexibility**: Support for both BHIVE and COMPOSITE index types

The GSI approach provides superior performance for vector search operations, making it ideal for production applications requiring fast semantic search capabilities.
