# Cosmos DB in Fabric

## Advanced Vector Search

### With deployed Azure OpenAI custom model and Azure Key Vault

This sample notebook shows how to run a vector search against Cosmos DB in Fabric, calling Azure OpenAI directly with a custom embedding model deployment.

*This is an advanced sample that requires Azure Subscription Owner rights.*

### Features of this Notebook
This notebook demonstrates the following concepts:

- How to write a query in Cosmos DB in Fabric to perform a vector similarity search
- How to use the similarity score from a Cosmos DB vector search to filter and sort for the best results
- How to generate embeddings using an OpenAI embeddings model with a custom dimension size
- How to deploy Azure OpenAI with models and Azure Key Vault to store secrets
- How to authenticate to Azure OpenAI using keys from Azure Key Vault
- How to create and configure a new Cosmos DB container with vector indexing
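
To make the "filter and sort by similarity score" idea concrete, here is a minimal, self-contained sketch in plain Python. It does not call Cosmos DB. The helper names (`cosine_similarity`, `top_matches`) and the three-dimensional toy vectors are hypothetical and exist only for illustration. The notebook itself relies on the `VectorDistance` function in the query to do this work server-side.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_matches(
    query: Sequence[float],
    items: Dict[str, Sequence[float]],
    min_score: float,
    limit: int,
) -> List[Tuple[str, float]]:
    """Score every item, drop those below min_score, return the best `limit`."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # Highest score first
    return kept[:limit]


# Hypothetical toy vectors, made up for illustration only
toy_items = {
    "gaming desktop": [0.9, 0.1, 0.0],
    "gaming laptop": [0.8, 0.3, 0.1],
    "garden hose": [0.0, 0.2, 0.9],
}

matches = top_matches([1.0, 0.0, 0.0], toy_items, min_score=0.6, limit=2)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

The threshold removes weak matches before the limit is applied, which mirrors the roles of the `similarity` and `limit` arguments used later in this notebook.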


This sample uses the [Fabric, KeyVault & OpenAI Sample on GitHub](https://github.com/AzureCosmosDB/fabric-keyvault-openai-secrets) to automatically provision an Azure Key Vault and an Azure OpenAI account with models. It also applies Azure RBAC policies using your Entra ID as well as the workspace identity for your Fabric workspace. Follow the instructions in the repo and copy the output values into the imports and config cell below.

This sample uses a custom product catalog dataset [fabricSampleDataVectors-3-large-512.json](https://github.com/AzureCosmosDB/cosmos-fabric-samples/blob/main/datasets/fabricSampleDataVectors-3-large-512.json) with embeddings generated using the text-embedding-3-large model and 512 dimensions.

This sample demonstrates using fewer dimensions in your data. Fewer dimensions reduce the size of the stored vectors, which can help to optimize performance, but this comes at the cost of reduced accuracy in your vector search results.
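
As a rough illustration of the size side of this trade-off, the sketch below compares the raw payload of one vector at 512 dimensions against the 3072 dimensions that text-embedding-3-large produces by default. It assumes 4 bytes per value, matching the `float32` data type used in the container's vector policy. The helper name `raw_vector_bytes` is hypothetical. The numbers cover only the raw float values, not the JSON document size or the index size in Cosmos DB.

```python
BYTES_PER_FLOAT32 = 4  # Assumption: each dimension is stored as one float32


def raw_vector_bytes(dimensions: int) -> int:
    """Raw size in bytes of one embedding vector with the given dimensions."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return dimensions * BYTES_PER_FLOAT32


FULL_DIMENSIONS = 3072    # Default output size of text-embedding-3-large
REDUCED_DIMENSIONS = 512  # Size used by the dataset in this notebook

full = raw_vector_bytes(FULL_DIMENSIONS)
reduced = raw_vector_bytes(REDUCED_DIMENSIONS)

print(f"{FULL_DIMENSIONS} dimensions: {full} bytes per vector")
print(f"{REDUCED_DIMENSIONS} dimensions: {reduced} bytes per vector")
print(f"Reduction factor: {full // reduced}x")
```

This estimate says nothing about the accuracy cost, which depends on your data and is best measured by comparing search results at both sizes.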

Requirements:
- Azure Subscription Owner permissions
- Workspace identity configured for your Fabric workspace

In [None]:
#Install packages
%pip install azure-cosmos
%pip install openai

In [None]:
#Imports and config values
import logging
from rich.pretty import pprint
import json
import requests
import notebookutils
from typing import Optional, List, Dict, Any
from azure.cosmos.aio import CosmosClient
from azure.cosmos import PartitionKey, exceptions, ThroughputProperties
from openai import AsyncAzureOpenAI

# Values copied from the output when deploying the Fabric Keyvault OpenAI sample
KEYVAULT_URI="https://your-keyvault-account.vault.azure.net/"
KEYVAULT_OPENAI_ENDPOINT="openai-endpoint"
KEYVAULT_OPENAI_API_KEY="openai-api-key"
OPENAI_GPT_MODEL="gpt-4.1-mini"
OPENAI_EMBEDDING_MODEL="text-embedding-3-large"

OPENAI_ENDPOINT = notebookutils.credentials.getSecret(KEYVAULT_URI, KEYVAULT_OPENAI_ENDPOINT)
OPENAI_KEY = notebookutils.credentials.getSecret(KEYVAULT_URI, KEYVAULT_OPENAI_API_KEY)
OPENAI_API_VERSION = "2024-12-01-preview"
OPENAI_EMBEDDING_DIMENSIONS = 512

COSMOS_ENDPOINT = 'https://my-cosmos-endpoint.cosmos.fabric.microsoft.com:443/'
COSMOS_DATABASE_NAME = '{your-cosmos-artifact-name}'
COSMOS_CONTAINER_NAME = 'SampleVectorData-text3'

In [None]:
# Custom TokenCredential implementation for Fabric authentication
%pip install azure-core
from azure.core.credentials import TokenCredential, AccessToken
import base64
import notebookutils
from datetime import datetime, timezone

class FabricTokenCredential(TokenCredential):
    """Exposes the Fabric notebook token to the Cosmos DB SDK as a TokenCredential."""

    def get_token(self, *scopes: str, claims: Optional[str] = None, tenant_id: Optional[str] = None,
                  enable_cae: bool = False, **kwargs: Any) -> AccessToken:
        access_token = notebookutils.credentials.getToken("https://cosmos.azure.com/")
        parts = access_token.split(".")
        if len(parts) < 2:
            raise ValueError("Invalid JWT format")
        payload_b64 = parts[1]
        # Fix padding
        padding = (-len(payload_b64)) % 4
        if padding:
            payload_b64 += "=" * padding
        payload_json = base64.urlsafe_b64decode(payload_b64.encode("utf-8")).decode("utf-8")
        payload = json.loads(payload_json)
        exp = payload.get("exp")
        if exp is None:
            raise ValueError("exp claim missing in token")
        return AccessToken(token=access_token, expires_on=exp) 

In [None]:
# Initialize Azure OpenAI client using keys from KeyVault
OPENAI_CLIENT = AsyncAzureOpenAI(
    api_version=OPENAI_API_VERSION,
    azure_endpoint=OPENAI_ENDPOINT,
    api_key=OPENAI_KEY
)

In [None]:
# Initialize Cosmos DB client and database
COSMOS_CLIENT = CosmosClient(COSMOS_ENDPOINT, FabricTokenCredential())
DATABASE = COSMOS_CLIENT.get_database_client(COSMOS_DATABASE_NAME)

In [None]:
# Create and configure a container for vector indexing and load sample data
# This container is specifically configured for 512 dimensions
# Vectors generated using the text-embedding-3-large model with 512 dimensions
async def create_container_and_load_data():
    
    # Define the vector policy for the container
    vector_embedding_policy = {
        "vectorEmbeddings": [
            {
                "path":"/vectors",
                "dataType":"float32",
                "distanceFunction":"cosine",
                "dimensions":512
            }
        ]
    }

    # Define the indexing policy for the container
    indexing_policy = {
        "includedPaths": [
            {
                "path": "/*"
            }
        ],
        "excludedPaths": [
            {
                "path": "/vectors/*"
            },
            {
                "path": "/\"_etag\"/?"
            }
        ],
        "vectorIndexes": [
            {
                "path": "/vectors",
                "type": "quantizedFlat"
            }
        ]
    }

    # Create the vectorized sample product container
    CONTAINER = await DATABASE.create_container_if_not_exists(
        id=COSMOS_CONTAINER_NAME,
        partition_key=PartitionKey(path='/categoryName', kind='Hash'),
        indexing_policy=indexing_policy,
        vector_embedding_policy=vector_embedding_policy,
        offer_throughput=ThroughputProperties(auto_scale_max_throughput=5000))

    print("Container created. Loading products")

    # Load the vectorized product data. Vectors generated using the text-embedding-3-large model with 512 dimensions
    url = "https://raw.githubusercontent.com/AzureCosmosDB/cosmos-fabric-samples/refs/heads/main/datasets/fabricSampleDataVectors-3-large-512.json"
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # Fail early if the download did not succeed
    data = response.json()
    
    # Insert the data into the container (upsert keeps this cell safe to re-run)
    for item in data:
        await CONTAINER.upsert_item(item)

    print("Products loaded")

    return CONTAINER

# Run the function and get the container reference
CONTAINER = await create_container_and_load_data()

In [None]:
# Define function to generate embeddings for vector search
async def generate_embeddings(text):

    response = await OPENAI_CLIENT.embeddings.create(
        input=text,
        model=OPENAI_EMBEDDING_MODEL,
        dimensions=OPENAI_EMBEDDING_DIMENSIONS
    )
    
    embeddings = response.model_dump()
    return embeddings['data'][0]['embedding']

# Test the Azure OpenAI embedding model
#search_text = "Hello from Fabric Notebooks"
#embeddings = await generate_embeddings(search_text.strip())
#print(embeddings)

In [None]:
#Define function to perform vector search
async def search_products(search_text: str, similarity: float = 0.8, limit: int = 5) -> List[Dict[str, Any]]:

    try:
        # Generate embeddings for the search text
        embeddings = await generate_embeddings(search_text.strip())

        # Cosmos query with VectorDistance to perform similarity search
        # With the cosine distance function, a higher score means a closer match
        query = """
            SELECT TOP @limit 
                VectorDistance(c.vectors, @embeddings) AS SimilarityScore,
                c.name, 
                c.description,
                c.categoryName,
                c.currentPrice,
                c.inventory,
                c.priceHistory
            FROM c 
            WHERE 
                c.docType = @docType AND
                VectorDistance(c.vectors, @embeddings) >= @similarity
            ORDER BY 
                VectorDistance(c.vectors, @embeddings)
        """

        parameters = [
            {"name": "@limit", "value": limit},
            {"name": "@embeddings", "value": embeddings},
            {"name": "@docType", "value": "product"},
            {"name": "@similarity", "value": similarity}
        ]

        # Async query: gather results into a list
        products = [p async for p in CONTAINER.query_items(
            query=query,
            #enable_cross_partition_query=True,
            parameters=parameters
        )]

        # Remove the vectors property if present; it is large and not needed in the results
        for p in products:
            p.pop('vectors', None)

        return products
        
    except exceptions.CosmosHttpResponseError as e:
        logging.error(f"Cosmos DB query failed: {e}")
        raise
    except Exception as e:
        logging.error(f"Unexpected error in search_products: {e}")
        raise

In [None]:
# Vector search for products
# With this similarity threshold, 8 products match; the limit restricts the results to 5
# Feel free to adjust these values to see how the results change
# You can also modify the search text for different results
products = await search_products(search_text="gaming pc", similarity=0.628, limit=5)

# Print the number of products found
print(f"Found {len(products)} products matching the search criteria.")

display(products)  # Tabular output
pprint(products)  # JSON-friendly output