# Azure AI Search with Cohere Embed V4 via Microsoft Foundry

This notebook demonstrates how to use Cohere's `embed-v-4-0` embedding model deployed on **Microsoft Foundry** to generate embeddings and perform vector search in **Azure AI Search**.

Key features:
- Uses the **azure-ai-inference SDK** for Microsoft Foundry integration
- Uses separate **document** and **query** input types, so indexed text and search queries are each embedded with the type intended for them
- **1536-dimensional vectors** from Cohere's embed-v-4-0 model

References:
- [Using Cohere Embeddings in Azure AI Search](https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/using-cohere-binary-embeddings-in-azure-ai-search-and-command-rr-model-via-azure/4158111)
- [Microsoft Foundry Cohere Embed Models](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/deploy-models-cohere-embed)

## Prerequisites

- Python 3.10+
- Azure AI Search service
- Cohere `embed-v-4-0` model deployed on Microsoft Foundry

Create a virtual environment and install dependencies:
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

In [None]:
# Install dependencies
! pip install --pre azure-search-documents azure-ai-inference azure-identity python-dotenv

In [11]:
import os
from dotenv import load_dotenv
from azure.ai.inference import EmbeddingsClient
from azure.ai.inference.models import EmbeddingInputType
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchField,
    SearchableField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchAlgorithmKind,
    VectorSearchProfile,
)
from azure.search.documents.models import VectorizedQuery

## Configure Credentials

Set your Azure AI Search and Microsoft Foundry credentials in a `.env` file:

```
AZURE_SEARCH_ENDPOINT=https://your-search.search.windows.net
AZURE_SEARCH_API_KEY=your-search-admin-key
AZURE_INFERENCE_ENDPOINT=https://your-resource.services.ai.azure.com/models
AZURE_INFERENCE_CREDENTIAL=your-microsoft-foundry-api-key
```

In [None]:
load_dotenv()

# Microsoft Foundry - Cohere Embeddings
azure_inference_endpoint = os.getenv("AZURE_INFERENCE_ENDPOINT")
azure_inference_credential = os.getenv("AZURE_INFERENCE_CREDENTIAL")

# Azure AI Search
search_service_endpoint = os.getenv("AZURE_SEARCH_ENDPOINT")
search_service_api_key = os.getenv("AZURE_SEARCH_API_KEY")

index_name = "cohere-embed-v4-index"

# Initialize the embeddings client
embedding_client = EmbeddingsClient(
    endpoint=azure_inference_endpoint,
    credential=AzureKeyCredential(azure_inference_credential),
    model="embed-v-4-0"
)

print(f"Inference Endpoint: {azure_inference_endpoint}")
print(f"Search Endpoint: {search_service_endpoint}")
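
If a variable is missing from `.env`, `os.getenv` returns `None`, and the problem tends to surface later as a less direct error when the clients are built. A minimal sketch of an early check, assuming the four variable names listed above; the helper name `find_missing_settings` is hypothetical and not part of any SDK:

```python
import os

# Variable names taken from the .env example in this notebook
REQUIRED_SETTINGS = (
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_API_KEY",
    "AZURE_INFERENCE_ENDPOINT",
    "AZURE_INFERENCE_CREDENTIAL",
)


def find_missing_settings(env=None, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]
```

One way to use it is to call `find_missing_settings()` right after `load_dotenv()` and raise an error that lists the returned names when the list is not empty.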

## Generate Embeddings

The `generate_embeddings` function calls the Cohere `embed-v-4-0` model via Microsoft Foundry to create embeddings.

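**Important**: Cohere models distinguish between input types. The function takes a string and maps it to the SDK enum:
- `"search_document"` maps to `EmbeddingInputType.DOCUMENT`: use when indexing documents
- `"search_query"` maps to `EmbeddingInputType.QUERY`: use when searching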

**Note**: The function below sends one input per request rather than batching several texts into a single call.

In [13]:
def generate_embeddings(texts, input_type="search_document"):
    """
    Generate embeddings using Cohere embed-v-4-0 via Microsoft Foundry.
    
    Args:
        texts: A string or list of strings to embed
        input_type: Either "search_document" (for indexing) or "search_query" (for queries)
    
    Returns:
        List of embedding vectors (1536 dimensions each)
    """
    if isinstance(texts, str):
        texts = [texts]
    
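    # Map the string input type to the EmbeddingInputType enum, rejecting unknown values
    input_types = {
        "search_document": EmbeddingInputType.DOCUMENT,
        "search_query": EmbeddingInputType.QUERY,
    }
    if input_type not in input_types:
        raise ValueError(f"input_type must be one of {sorted(input_types)}, got {input_type!r}")
    embed_input_type = input_types[input_type]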
    
    embeddings = []
    for text in texts:
        response = embedding_client.embed(
            input=[text],
            input_type=embed_input_type
        )
        embeddings.append(response.data[0].embedding)
    
    return embeddings
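
The index field declared later in this notebook is fixed at 1536 dimensions, so a vector of any other length is expected to be rejected when it is uploaded. A small sketch of a pre-upload check follows; `check_embedding_dimensions` is a hypothetical helper, and its default of 1536 is taken from this notebook rather than queried from the service:

```python
def check_embedding_dimensions(embeddings, expected_dim=1536):
    """Raise ValueError if any vector's length differs from expected_dim."""
    for position, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"Embedding {position} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )
    # Return the input unchanged so the call can be chained
    return embeddings
```

Wrapping the output of `generate_embeddings` in this check would surface a mismatch (for example, after changing the model deployment) before any request reaches the search service.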

## Create Search Index

Create an Azure AI Search index with a vector field for Cohere embeddings.

The index schema includes:
- `id`: Document identifier (key field)
- `text`: Searchable text content
- `embedding`: Vector field (1536 dimensions, float32)

In [14]:
def create_or_update_index(index_client, index_name):
    """
    Create or update an Azure AI Search index with vector support.
    """
    fields = [
        SimpleField(
            name="id",
            type=SearchFieldDataType.String,
            key=True,
            filterable=True
        ),
        SearchableField(
            name="text",
            type=SearchFieldDataType.String,
            searchable=True
        ),
        SearchField(
            name="embedding",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=1536,
            vector_search_profile_name="my-vector-config"
        ),
    ]

    vector_search = VectorSearch(
        algorithms=[
            HnswAlgorithmConfiguration(
                name="my-algorithms-config",
                kind=VectorSearchAlgorithmKind.HNSW
            ),
        ],
        profiles=[
            VectorSearchProfile(
                name="my-vector-config",
                algorithm_configuration_name="my-algorithms-config"
            ),
        ],
    )

    index = SearchIndex(
        name=index_name,
        fields=fields,
        vector_search=vector_search
    )

    result = index_client.create_or_update_index(index)
    print(f"Index '{result.name}' created or updated")
    return result

## Index Documents

Upload documents with their embeddings to the search index.

In [15]:
def index_documents(search_client, documents, embeddings):
    """
    Upload documents with embeddings to the search index.
    """
    documents_to_index = [
        {"id": str(idx), "text": doc, "embedding": emb}
        for idx, (doc, emb) in enumerate(zip(documents, embeddings))
    ]
    result = search_client.upload_documents(documents=documents_to_index)
    print(f"Uploaded {len(documents_to_index)} documents")
    return result
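
`upload_documents` returns one result object per document, and an individual document can fail even when the call itself does not raise. The sketch below summarizes those results, assuming each one exposes `key`, `succeeded`, and `error_message` attributes as `IndexingResult` does in `azure-search-documents`. The helper name `summarize_indexing_results` is hypothetical, and the demo uses stand-in objects rather than real service responses:

```python
from types import SimpleNamespace


def summarize_indexing_results(results):
    """Split indexing results into succeeded keys and {key: error} failures."""
    succeeded, failed = [], {}
    for item in results:
        if item.succeeded:
            succeeded.append(item.key)
        else:
            failed[item.key] = item.error_message
    return succeeded, failed


# Stand-in objects that mimic the attributes used above (not real responses)
demo_results = [
    SimpleNamespace(key="0", succeeded=True, error_message=None),
    SimpleNamespace(key="1", succeeded=False, error_message="example failure"),
]
ok_keys, failures = summarize_indexing_results(demo_results)
print(ok_keys, failures)
```

Passing the return value of `index_documents` to this helper gives a readable list of failed keys instead of a list of opaque result objects.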

## Run the Workflow

1. Create the search index
2. Generate embeddings for sample documents
3. Upload documents to the index

In [16]:
# Sample documents
documents = [
    "Alan Turing was a British mathematician and computer scientist who is widely considered to be the father of theoretical computer science and artificial intelligence.",
    "Albert Einstein was a German-born theoretical physicist who developed the theory of relativity, one of the two pillars of modern physics.",
    "Isaac Newton was an English polymath active as a mathematician, physicist, astronomer, alchemist, theologian, and author who is widely recognized as one of the greatest mathematicians and physicists.",
    "Marie Curie was a Polish and naturalized-French physicist and chemist who conducted pioneering research on radioactivity."
]

# Initialize clients
index_client = SearchIndexClient(
    endpoint=search_service_endpoint,
    credential=AzureKeyCredential(search_service_api_key)
)

search_client = SearchClient(
    endpoint=search_service_endpoint,
    index_name=index_name,
    credential=AzureKeyCredential(search_service_api_key)
)

# Create index
create_or_update_index(index_client, index_name)

# Generate embeddings for documents
print("Generating embeddings...")
embeddings = generate_embeddings(documents, input_type="search_document")
print(f"Generated {len(embeddings)} embeddings of dimension {len(embeddings[0])}")

# Index documents
index_documents(search_client, documents, embeddings)

Index 'cohere-embed-v4-index' created or updated
Generating embeddings...
Generated 4 embeddings of dimension 1536
Uploaded 4 documents


[<azure.search.documents._generated.models._models_py3.IndexingResult at 0x110b86060>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x110b0af90>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x1109e8490>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x1109e9e10>]

## Vector Search

Perform a vector similarity search using a query embedding.

In [17]:
# Search query
query = "foundational figures in computer science"

# Generate query embedding (use search_query input type)
query_embedding = generate_embeddings(query, input_type="search_query")[0]

# Perform vector search
vector_query = VectorizedQuery(
    vector=query_embedding,
    k=3,
    fields="embedding"
)

results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["id", "text"]
)

print(f"Query: '{query}'\n")
print("Results:")
print("-" * 60)
for result in results:
    print(f"Score: {result['@search.score']:.4f}")
    print(f"Text: {result['text']}")
    print("-" * 60)

Query: 'foundational figures in computer science'

Results:
------------------------------------------------------------
Score: 0.6661
Text: Alan Turing was a British mathematician and computer scientist who is widely considered to be the father of theoretical computer science and artificial intelligence.
------------------------------------------------------------
Score: 0.5813
Text: Albert Einstein was a German-born theoretical physicist who developed the theory of relativity, one of the two pillars of modern physics.
------------------------------------------------------------
Score: 0.5778
Text: Isaac Newton was an English polymath active as a mathematician, physicist, astronomer, alchemist, theologian, and author who is widely recognized as one of the greatest mathematicians and physicists.
------------------------------------------------------------
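
The scores above are not raw cosine similarities. Assuming the index uses the cosine metric (which Azure AI Search documents as the default for HNSW, and which this notebook does not override), the service reports `@search.score = 1 / (1 + d)` with `d = 1 - cosine_similarity`. A sketch that inverts this mapping follows; `score_to_cosine` is a hypothetical helper name:

```python
def score_to_cosine(score):
    """Invert score = 1 / (2 - cos) to recover the cosine similarity."""
    if not 0 < score <= 1:
        raise ValueError("score must be in (0, 1]")
    return 2.0 - 1.0 / score


# Scores copied from the search output above
for score in (0.6661, 0.5813, 0.5778):
    print(f"{score:.4f} -> cosine similarity {score_to_cosine(score):.4f}")
```

Under that assumption, the top score of 0.6661 corresponds to a cosine similarity of about 0.50, which makes the gap between the first result and the other two easier to interpret.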


## Cleanup (Optional)

Delete the index when done.

In [18]:
# Uncomment to delete the index
# index_client.delete_index(index_name)
# print(f"Index '{index_name}' deleted")