# Azure AI Search with Cohere Embed V3 Int8 Support
This notebook demonstrates how to use the Cohere API to generate embeddings with [Cohere Embed V3](https://txt.cohere.com/introducing-embed-v3/) and how to store them in Azure AI Search as an Int8 vector data type using the Python SDK. Int8 vectors use 1 byte per dimension instead of the 4 bytes used by Float32, which reduces vector storage by 4x with little to no loss in search quality.
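
The 4x figure follows directly from element width. Here is a rough back-of-the-envelope sketch, assuming 1024-dimensional vectors (the size used for the `embedding` field later in this notebook) and a hypothetical corpus of one million documents. It counts raw vector bytes only and ignores HNSW graph and other index overhead.

```python
# Raw vector storage estimate; ignores index overhead such as the HNSW graph.
DIMENSIONS = 1024          # matches vector_search_dimensions used later on
NUM_DOCUMENTS = 1_000_000  # hypothetical corpus size, for illustration only

BYTES_PER_ELEMENT = {"float32": 4, "int8": 1}


def raw_vector_bytes(num_documents, dimensions, dtype):
    """Return the bytes needed to store the raw vectors for a given element type."""
    return num_documents * dimensions * BYTES_PER_ELEMENT[dtype]


float32_bytes = raw_vector_bytes(NUM_DOCUMENTS, DIMENSIONS, "float32")
int8_bytes = raw_vector_bytes(NUM_DOCUMENTS, DIMENSIONS, "int8")

print(f"float32: {float32_bytes / 1024**3:.2f} GiB")
print(f"int8:    {int8_bytes / 1024**3:.2f} GiB")
print(f"reduction: {float32_bytes / int8_bytes:.0f}x")
```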


## Set up a Python virtual environment in Visual Studio Code

1. Open the Command Palette (Ctrl+Shift+P).
1. Search for **Python: Create Environment**.
1. Select **Venv**.
1. Select a Python interpreter. Choose 3.10 or later.

It can take a minute to set up. If you run into problems, see [Python environments in VS Code](https://code.visualstudio.com/docs/python/environments).

## Install required libraries

In [None]:
! pip install -r azure-search-cohere-embed-v3-sample-requirements.txt

In [9]:
import cohere
import numpy as np
import os
from dotenv import load_dotenv
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents import SearchClient
from azure.search.documents.models import (
    VectorizedQuery,
)
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchField,
    SearchableField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchAlgorithmKind,
    VectorSearchProfile,
)
from azure.core.credentials import AzureKeyCredential


## Set Up Cohere and Azure Credentials
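Before generating embeddings or interacting with Azure AI Search, we need to set up our credentials for both Cohere and Azure AI Search. The cell below reads them from environment variables, which `load_dotenv()` populates from a local `.env` file if one exists.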

In [10]:
load_dotenv()
cohere_api_key = os.getenv("COHERE_API_KEY")
co = cohere.Client(cohere_api_key)

search_service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
search_service_api_key = os.getenv("AZURE_SEARCH_ADMIN_KEY")
index_name = "cohere-embed-v3-index"
credential = AzureKeyCredential(search_service_api_key)
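
A missing or mistyped variable in `.env` tends to surface only later, as an authentication failure from one of the services. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper, not part of `python-dotenv` or either SDK.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; check your .env file"
        )
    return value


# Usage (variable names match the cell above):
# cohere_api_key = require_env("COHERE_API_KEY")
# search_service_endpoint = require_env("AZURE_SEARCH_SERVICE_ENDPOINT")
# search_service_api_key = require_env("AZURE_SEARCH_ADMIN_KEY")
```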


## Generate Embeddings Function
This function uses the Cohere API to generate int8 embeddings for a list of texts. The default `input_type="search_document"` is intended for documents that will be indexed; pass `input_type="search_query"` when embedding a query.

In [11]:
def generate_embeddings(texts, input_type="search_document"):
    model = "embed-english-v3.0"
    # Ensure texts is a list
    if isinstance(texts, str):
        texts = [texts]

    response = co.embed(
        texts=texts,
        model=model,
        input_type=input_type,
        embedding_types=["int8"],
    )
    return list(response.embeddings.int8)
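
For larger corpora, the texts usually need to be sent in batches, because the Cohere embed endpoint caps the number of texts per request. The sketch below assumes a limit of 96 texts per call, which should be verified against the current Cohere API reference. `embed_in_batches` and its `embed_fn` parameter are hypothetical names introduced for illustration.

```python
def embed_in_batches(texts, embed_fn, batch_size=96):
    """Call embed_fn on consecutive slices of texts and concatenate the results.

    embed_fn is any callable that maps a list of strings to a list of
    embeddings, such as generate_embeddings above. batch_size=96 is an
    assumed per-request limit; adjust it to match the current API docs.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        embeddings.extend(embed_fn(batch))
    return embeddings


# Usage with the function defined above:
# embeddings = embed_in_batches(documents, generate_embeddings)
```

Input order is preserved, so the returned list can still be zipped with the original documents.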

## Create or Update Azure AI Search Index
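This function creates or updates an Azure AI Search index that includes a vector field for storing the document embeddings. The field is typed `Collection(Edm.SByte)` with 1024 dimensions, matching the output size of `embed-english-v3.0`.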

In [13]:
def create_or_update_index(client, index_name):
    fields = [
        SimpleField(name="id", type=SearchFieldDataType.String, key=True),
        SearchField(
            name="text",
            type=SearchFieldDataType.String,
            searchable=True,
        ),
        SearchField(
            name="embedding",
            type="Collection(Edm.SByte)",  # OData syntax for 8-bit signed integer
            vector_search_dimensions=1024,
            vector_search_profile_name="my-vector-config",
            # hidden=False,  # Uncomment to return the embeddings in the search results
        ),
    ]

    vector_search = VectorSearch(
        profiles=[
            VectorSearchProfile(
                name="my-vector-config",
                algorithm_configuration_name="my-hnsw",
            )
        ],
        algorithms=[
            HnswAlgorithmConfiguration(
                name="my-hnsw",
                kind=VectorSearchAlgorithmKind.HNSW,
            )
        ],
    )

    index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
    client.create_or_update_index(index=index)

## Index Documents and Their Embeddings
This function indexes the documents along with their int8 embeddings into Azure AI Search. For demonstration, document IDs are generated sequentially. In practical applications, use stable, meaningful identifiers such as database row IDs, unique filenames, or other unique metadata associated with each document.


In [14]:
def index_documents(search_client, documents, embeddings):
    documents_to_index = [
        {"id": str(idx), "text": doc, "embedding": emb}
        for idx, (doc, emb) in enumerate(zip(documents, embeddings))
    ]
    search_client.upload_documents(documents=documents_to_index)
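
Because the `embedding` field is declared as `Collection(Edm.SByte)` with 1024 dimensions, each vector should contain exactly 1024 integers in the signed 8-bit range. A small pre-upload check can catch mismatches before anything is sent to the service. This is only a sketch: `validate_int8_embedding` is a hypothetical helper, and its constants mirror the index definition above.

```python
SBYTE_MIN, SBYTE_MAX = -128, 127
EXPECTED_DIMENSIONS = 1024  # must match vector_search_dimensions in the index


def validate_int8_embedding(embedding, dimensions=EXPECTED_DIMENSIONS):
    """Raise ValueError if the embedding does not fit a Collection(Edm.SByte) field."""
    if len(embedding) != dimensions:
        raise ValueError(f"expected {dimensions} dimensions, got {len(embedding)}")
    for position, value in enumerate(embedding):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"element {position} is not an integer: {value!r}")
        if not SBYTE_MIN <= value <= SBYTE_MAX:
            raise ValueError(f"element {position} is out of int8 range: {value}")


# Usage before uploading:
# for emb in embeddings:
#     validate_int8_embedding(emb)
```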

## Run the workflow

In [15]:
documents = [
    "Alan Turing  was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist.",
    "Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time.",
    "Isaac Newton was an English polymath active as a mathematician, physicist, astronomer, alchemist, theologian, and author who was described in his time as a natural philosopher.",
    "Marie Curie was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity"
]

# Generate embeddings
embeddings = generate_embeddings(documents)

# Initialize Azure Search Index Client
search_index_client = SearchIndexClient(
    endpoint=search_service_endpoint,
    credential=credential,
)

# Create or update the search index to include the embedding field
create_or_update_index(search_index_client, index_name)

# Initialize the SearchClient
search_client = SearchClient(
    endpoint=search_service_endpoint, 
    index_name=index_name, 
    credential=credential
)

# Index the documents and their embeddings
index_documents(search_client, documents, embeddings)



## Perform a Vector Search

In [19]:
from azure.search.documents import SearchClient

# Query for vector search
query = "foundational figures in computer science"

# Generate query embeddings
# Use input_type="search_query" for query embeddings
query_embeddings = generate_embeddings(query, input_type="search_query")

search_client = SearchClient(search_service_endpoint, index_name, credential)

vector_query = VectorizedQuery(
    vector=query_embeddings[0], k_nearest_neighbors=3, fields="embedding"
)

results = search_client.search(
    search_text=None,  # No search text for pure vector search
    vector_queries=[vector_query],
)

for result in results:
    print(f"Text: {result['text']}")
    print(f"Score: {result['@search.score']}\n")

Text: Alan Turing  was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist.
Score: 0.6248218

Text: Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time.
Score: 0.5913683

Text: Marie Curie was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity
Score: 0.57643753
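
The `@search.score` values are not raw cosine similarities. The index above specifies no metric, so this sketch assumes it uses cosine, which we understand to be the HNSW default. For that metric, Azure AI Search documents the score as `1 / (1 + cosine_distance)`, where cosine distance is `1 - cosine_similarity`. The inverse conversion can be sketched as follows; `score_to_cosine_similarity` is a hypothetical helper name.

```python
def score_to_cosine_similarity(score):
    """Convert an Azure AI Search vector score (cosine metric) to cosine similarity.

    Assumes score = 1 / (1 + cosine_distance) with
    cosine_distance = 1 - cosine_similarity.
    """
    if not 0 < score <= 1:
        raise ValueError("score must be in the interval (0, 1]")
    cosine_distance = (1 - score) / score
    return 1 - cosine_distance


# Scores taken from the search output above
for score in (0.6248218, 0.5913683, 0.57643753):
    similarity = score_to_cosine_similarity(score)
    print(f"score={score} -> cosine similarity ~ {similarity:.4f}")
```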

