# Azure Cognitive Search Vector Search Code Sample with Azure OpenAI
This notebook demonstrates how to use Azure Cognitive Search vector search with Azure OpenAI embeddings and the Azure SDK for Python.
## Prerequisites
To run the code, install the following packages. Use the latest pre-release version of the search SDK: `pip install azure-search-documents --pre`.

In [None]:
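!pip install azure-search-documents --pre
# The code below calls openai.Embedding.create, which exists only in openai releases before 1.0
!pip install "openai<1"
!pip install python-dotenv
!pip install langchain pypdf tenacity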

## Import required libraries and environment variables

In [3]:
# Import required libraries
import os
import json
import openai
from dotenv import load_dotenv
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.document_loaders import PyPDFLoader
from tenacity import retry, wait_random_exponential, stop_after_attempt
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient, SearchIndexingBufferedSender  
from azure.search.documents.indexes import SearchIndexClient  

from azure.search.documents.models import (
    QueryAnswerType,
    QueryCaptionType,
    QueryCaptionResult,
    QueryAnswerResult,
    SemanticErrorMode,
    SemanticErrorReason,
    SemanticSearchResultsType,
    QueryType,
    VectorizedQuery,
    VectorQuery,
    VectorFilterMode,    
)
from azure.search.documents.indexes.models import (
    ExhaustiveKnnAlgorithmConfiguration,
    ExhaustiveKnnParameters,
    HnswAlgorithmConfiguration,
    HnswParameters,
    SearchableField,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SemanticConfiguration,
    SemanticField,
    SemanticPrioritizedFields,
    SemanticSearch,
    SimpleField,
    VectorSearch,
    VectorSearchAlgorithmConfiguration,
    VectorSearchAlgorithmKind,
    VectorSearchAlgorithmMetric,
    VectorSearchProfile,
)
# Configure environment variables
load_dotenv()

True

In [4]:

service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")
key = os.getenv("AZURE_SEARCH_ADMIN_KEY")

# env variables that are used by LangChain
os.environ['OPENAI_API_KEY'] = os.getenv("OPENAI_API_KEY")
os.environ['OPENAI_API_TYPE'] = "azure"
os.environ['OPENAI_API_VERSION'] = os.getenv("OPENAI_DEPLOYMENT_VERSION")
os.environ['OPENAI_API_BASE'] = os.getenv("OPENAI_DEPLOYMENT_ENDPOINT")

OPENAI_DEPLOYMENT_ENDPOINT = os.getenv("OPENAI_DEPLOYMENT_ENDPOINT")
OPENAI_DEPLOYMENT_NAME = os.getenv("OPENAI_DEPLOYMENT_NAME")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME")

OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME = os.getenv("OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME")
OPENAI_ADA_EMBEDDING_MODEL_NAME = os.getenv("OPENAI_ADA_EMBEDDING_MODEL_NAME")

# Configure OpenAI API
openai.api_type = "azure"
openai.api_version = os.getenv("OPENAI_DEPLOYMENT_VERSION")
openai.api_base = os.getenv("OPENAI_DEPLOYMENT_ENDPOINT")
openai.api_key = os.getenv("OPENAI_API_KEY")
# ---
credential = AzureKeyCredential(key)
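
The configuration cell assumes every variable is present in the `.env` file. If one is missing, `os.getenv` returns `None`, and assigning `None` to `os.environ[...]` raises a `TypeError` that does not say which variable was at fault. A minimal sketch of an up-front check follows; the helper name `find_missing_vars` and the exact list of required names are assumptions for illustration, not part of the original notebook.

```python
import os

# Names taken from the configuration cell above; adjust to match your own .env file.
REQUIRED_VARS = [
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "AZURE_SEARCH_ADMIN_KEY",
    "OPENAI_API_KEY",
    "OPENAI_DEPLOYMENT_VERSION",
    "OPENAI_DEPLOYMENT_ENDPOINT",
]


def find_missing_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = find_missing_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running a check like this right after `load_dotenv()` would surface configuration problems before any client is created.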

## Create embeddings
Read your data, generate OpenAI embeddings, and export them in a format that can be inserted into your Azure Cognitive Search index:

In [5]:
# Test embedding. Create vector from text
embeddingmodel = OpenAIEmbeddings(
    model=OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME, chunk_size=1)
vec = embeddingmodel.embed_query("transform to vec")
vec

[-0.00942082516849041,
 -0.00464690150693059,
 -0.0015674912137910724,
 -0.006866264622658491,
 0.0005358756170608103,
 0.015242683701217175,
 -0.020154215395450592,
 -0.020111873745918274,
 -0.031106365844607353,
 -0.051401715725660324,
 -0.002101161517202854,
 0.005289070308208466,
 -0.01695042848587036,
 3.806811946560629e-05,
 0.009371427819132805,
 0.004029431845992804,
 0.016357658430933952,
 0.00030961702577769756,
 0.009187950752675533,
 -0.013986573554575443,
 -0.011629602871835232,
 -0.005095008295029402,
 0.005881841294467449,
 -0.012631668709218502,
 -0.02088812179863453,
 0.004558691754937172,
 0.006936832331120968,
 -0.022976934909820557,
 -0.01630120351910591,
 -0.003579560900107026,
 0.010302925482392311,
 0.0035178137477487326,
 -0.004537520930171013,
 -0.028904644772410393,
 -0.009667813777923584,
 -0.01939208060503006,
 0.005659551825374365,
 -0.011827193200588226,
 0.005673665553331375,
 0.0015524955233559012,
 0.011686057783663273,
 -0.009992426261305809,
 -0.01498

In [6]:
# Generate document embeddings using OpenAI Ada 002

# Generate embeddings for the title and content fields; also used for query embeddings.
# Retries with random exponential backoff (1 to 20 s) and gives up after 6 attempts.
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def generate_embeddings(page):
    # With api_type "azure", `engine` is the name of your embedding deployment
    response = openai.Embedding.create(
        input=page, engine=OPENAI_ADA_EMBEDDING_DEPLOYMENT_NAME)

    embeddings = response['data'][0]['embedding']
    return embeddings

## Prepare data for loading into Azure Cognitive Search

In [7]:
doc_title = "Semantic Kernel"
# load pdf document and split into small chunks
fileName = "../data/semantic-kernel.pdf"
loader = PyPDFLoader(fileName)
pages = loader.load_and_split()
print("Number of chunks: ", len(pages))

doc_with_vector_list = []
doc_id = 0
# Generate embeddings for the title and content fields and store them in a JSON file for upload to Azure Cognitive Search.
# We embed the title and content fields separately so that either can be used in vector search.
for page in pages:
    page_with_vector = {}
    page_with_vector['id'] = str(doc_id)
    page_with_vector['title'] = doc_title
    page_with_vector['titleVector'] = generate_embeddings(doc_title)
    page_with_vector['content'] = page.page_content
    page_with_vector['contentVector'] = generate_embeddings(page.page_content)
    doc_with_vector_list.append(page_with_vector)
    doc_id += 1

# Write the documents with their embeddings to the sk_Vectors.json file
with open("./sk_Vectors.json", "w") as f:
    json.dump(doc_with_vector_list, f)

Number of chunks:  187
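
The loop above calls `generate_embeddings(doc_title)` once per chunk, although the title is the same for all of them. A minimal sketch of one way to avoid the repeated calls is shown below, using `functools.lru_cache`. The function `fake_embed` is a hypothetical stand-in for `generate_embeddings` so the example runs without network access, and `build_documents` is an illustrative name, not something the notebook defines.

```python
from functools import lru_cache

call_count = 0


def fake_embed(text):
    """Hypothetical stand-in for generate_embeddings; returns a tiny deterministic vector."""
    global call_count
    call_count += 1
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


@lru_cache(maxsize=None)
def cached_embed(text):
    # Store a tuple so the cached value cannot be mutated by callers.
    return tuple(fake_embed(text))


def build_documents(title, chunks):
    """Build one document per chunk, embedding the shared title only once."""
    docs = []
    for doc_id, chunk in enumerate(chunks):
        docs.append({
            "id": str(doc_id),
            "title": title,
            "titleVector": list(cached_embed(title)),
            "content": chunk,
            "contentVector": list(cached_embed(chunk)),
        })
    return docs


docs = build_documents("Semantic Kernel", ["chunk one", "chunk two", "chunk three"])
print(len(docs), "documents,", call_count, "embedding calls")
```

With three distinct chunks the stand-in is called four times (once for the title, once per chunk) instead of six.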


## Create search index
Create the search index schema and vector search configuration:

In [26]:
# Create a search index
# Note: create the Cognitive Search resource and obtain its endpoint and admin key in advance
index_client = SearchIndexClient(
    endpoint=service_endpoint, credential=credential)

fields = [
    # doc id - mandatory field
    SimpleField(name="id", type=SearchFieldDataType.String, key=True,
                sortable=True, filterable=True, facetable=True),

    # title  
    SearchableField(
        name="title", type=SearchFieldDataType.String, filterable=True, searchable=True),
    
    # titleVector
    SearchField(name="titleVector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True, vector_search_dimensions=1536, vector_search_profile_name="HnswProfile"),

    # content 
    SearchableField(name="content", type=SearchFieldDataType.String, searchable=True),
    
    # contentVector
    SearchField(name="contentVector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True, vector_search_dimensions=1536, vector_search_profile_name="HnswProfile"),

]

# The Hierarchical Navigable Small World (HNSW) graph algorithm is a popular method for
# approximate nearest neighbor search in high-dimensional spaces.
vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(
            name="Hnsw",
            kind=VectorSearchAlgorithmKind.HNSW,
            parameters=HnswParameters(
                m=4,
                ef_construction=400,
                ef_search=500,
                metric=VectorSearchAlgorithmMetric.COSINE
            )
        ),
        ExhaustiveKnnAlgorithmConfiguration(
            name="ExhaustiveKnn",
            kind=VectorSearchAlgorithmKind.EXHAUSTIVE_KNN,
            parameters=ExhaustiveKnnParameters(
                metric=VectorSearchAlgorithmMetric.COSINE
            )
        )
    ],
    profiles=[
        VectorSearchProfile(
            name="HnswProfile",
            algorithm_configuration_name="Hnsw",
        ),
        VectorSearchProfile(
            name="ExhaustiveKnnProfile",
            algorithm_configuration_name="ExhaustiveKnn",
        )
    ]
)

# Define the semantic configuration for semantic search, which Azure Cognitive Search also supports
# (semantic ranking applies to the text fields, not the vector fields)
semantic_config = SemanticConfiguration(
    name="sk-semantic-config",
    prioritized_fields=SemanticPrioritizedFields(
        title_field=SemanticField(field_name="title"),
        content_fields=[SemanticField(field_name="content")]
    )
)

# Create the semantic settings with the configuration
semantic_search = SemanticSearch(configurations=[semantic_config])
 
# Azure Cognitive Search index name (overrides the value read from the environment earlier)
index_name = "sk-cogsrch-vector-index-11-hnsw"

# Create the search index with the semantic settings
index = SearchIndex(name=index_name, fields=fields,
                    vector_search=vector_search, semantic_search=semantic_search)
result = index_client.create_or_update_index(index)
print(f' {result.name} created')

 sk-cogsrch-vector-index-11-hnsw created
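
The index above configures both algorithms with the cosine metric. To make the difference concrete, here is a minimal pure-Python sketch of cosine similarity and of what an exhaustive k-nearest-neighbor scan does conceptually: score every stored vector against the query and keep the best `k`. HNSW approximates this result by walking a graph instead of scanning everything. The function names and toy vectors are illustrative assumptions, and this is not how the service is implemented internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, vectors, k):
    """Score every vector against the query and return the k most similar (id, score) pairs."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy two-dimensional vectors; real Ada 002 embeddings have 1536 dimensions.
toy_vectors = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
print(exhaustive_knn([1.0, 0.1], toy_vectors, k=2))
```

The scan costs one similarity computation per stored vector, which is why an approximate index such as HNSW is usually preferred for large collections.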


###### Open the Azure portal to see the newly created index in your Azure Cognitive Search service

## Insert text and embeddings into vector store
Add texts and metadata from the JSON data to the vector store:

In [27]:
# Upload the document chunks to the index
with open('./sk_Vectors.json', 'r') as file:
    documents = json.load(file)
search_client = SearchClient(
    endpoint=service_endpoint, index_name=index_name, credential=credential)
result = search_client.upload_documents(documents)
print(f"Uploaded {len(documents)} documents")

Uploaded 187 documents
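
The cell above sends all documents in a single request. Each document carries two 1536-dimensional vectors, so larger collections can run into the service's limits on the size of one indexing request (check the current service limits documentation for the exact values). A minimal sketch of splitting the upload into batches follows; the helper name `batched` and the batch size of 50 are arbitrary assumptions for illustration.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Stand-in documents; in the notebook this would be the list loaded from sk_Vectors.json.
sample = [{"id": str(i)} for i in range(187)]
sizes = [len(batch) for batch in batched(sample, 50)]
print(sizes)

# In the notebook, each batch could then be uploaded separately, for example:
# for batch in batched(documents, 50):
#     search_client.upload_documents(documents=batch)
```

The `SearchIndexingBufferedSender` class imported at the top of the notebook is another option for larger uploads.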


## Perform a vector similarity search

In [28]:
# Pure Vector Search
query = "semantic kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    # no search text is passed, only the vector query
    search_text=None,
    vector_queries=[vector_query],
    select=["title", "content"],
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")

Title: Semantic Kernel
Score: 0.87516695
Content: Responsible AI and Semantic Kernel
Article •05/23/2023
An AI system includes not only the technology, but also the people who will use it, the
people who will be affected by it, and the environment in which it is deployed. Creating
a system that is fit for its intended purpose requires an understanding of how the
technology works, what its capabilities and limitations are, and how to achieve the best
performance. Microsoft’s T ransparency Notes are intended to help you understand how
our AI technology works, the choices system owners can make that influence system
performance and behavior, and the importance of thinking about the whole system,
including the technology, the people, and the environment. Y ou can use T ransparency
Notes when developing or deploying your own system, or share them with the people
who will use or be affected by your system.
Microsoft’s T ransparency Notes are part of a broader effort at Microsoft to put our A

In [17]:

query = "semantic kernel planner and kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=None,
    vector_queries = [vector_query],
    select=["title", "content"],
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")

Title: Semantic Kernel
Score: 0.87037426
Content: To simplify the creation of AI apps, open source projects like LangChain  have
emerged. Semantic K ernel is Microsoft's contribution to this space and is designed to
support enterprise app developers who want to integrate AI into their existing apps.
By using multiple AI models, plugins, and memory all together within Semantic K ernel,
you can create sophisticated pipelines that allow AI to automate complex tasks for users.
For example, with Semantic K ernel, you could create a pipeline that helps a user send an
email to their marketing team. With memory , you could retrieve information about the
project and then use planner  to autogenerate the remaining steps using available
plugins (e.g., ground the user's ask with Microsoft Graph data, generate a response with
GPT-4, and send the email). Finally, you can display a success message back to your user
in your app using a custom plugin.
Step Component Descr iption
1 Ask It starts with a 

In [30]:
query = "semantic kernel planner and kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector", exhaustive=True)
 

results = search_client.search(
    search_text=None,
    vector_queries = [vector_query],
    select=["title", "content"],
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")

Title: Semantic Kernel
Score: 0.8703748
Content: To simplify the creation of AI apps, open source projects like LangChain  have
emerged. Semantic K ernel is Microsoft's contribution to this space and is designed to
support enterprise app developers who want to integrate AI into their existing apps.
By using multiple AI models, plugins, and memory all together within Semantic K ernel,
you can create sophisticated pipelines that allow AI to automate complex tasks for users.
For example, with Semantic K ernel, you could create a pipeline that helps a user send an
email to their marketing team. With memory , you could retrieve information about the
project and then use planner  to autogenerate the remaining steps using available
plugins (e.g., ground the user's ask with Microsoft Graph data, generate a response with
GPT-4, and send the email). Finally, you can display a success message back to your user
in your app using a custom plugin.
Step Component Descr iption
1 Ask It starts with a g

In [18]:
# Vector Search is multi-lingual
query = "Planificador semántico del kernel y kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=None,
    vector_queries = [vector_query],
    select=["title", "content"],
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")

Title: Semantic Kernel
Score: 0.83616716
Content: Responsible AI and Semantic Kernel
Article •05/23/2023
An AI system includes not only the technology, but also the people who will use it, the
people who will be affected by it, and the environment in which it is deployed. Creating
a system that is fit for its intended purpose requires an understanding of how the
technology works, what its capabilities and limitations are, and how to achieve the best
performance. Microsoft’s T ransparency Notes are intended to help you understand how
our AI technology works, the choices system owners can make that influence system
performance and behavior, and the importance of thinking about the whole system,
including the technology, the people, and the environment. Y ou can use T ransparency
Notes when developing or deploying your own system, or share them with the people
who will use or be affected by your system.
Microsoft’s T ransparency Notes are part of a broader effort at Microsoft to put our A

## Vector Search with a filter

In [23]:
# Pure Vector Search with Filter
query = "programming languages supported by semantic kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=None,
    vector_queries = [vector_query],
    vector_filter_mode=VectorFilterMode.PRE_FILTER,
    filter="title eq 'Semantic Kernel'",
    select=["title", "content"] 
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}")

Title: Semantic Kernel
Score: 0.891449
Content: Supported Semantic Kernel languages
Article •07/18/2023
Semantic K ernel plans on providing support to the following languages:
While the overall architecture of the kernel is consistent across all languages, we made
sure the SDK for each language follows common paradigms and styles in each language
to make it feel native and easy to use.
Today, not all features are available in all languages. The following tables show which
features are available in each language. The 🔄  symbol indicates that the feature is
partially implemented, please see the associated note column for more details. The ❌
symbol indicates that the feature is not yet available in that language; if you would like
to see a feature implemented in a language, please consider contributing to the project
or opening an issue .
Services C# Python JavaNotes
TextGeneration ✅✅✅ Example: T ext-Davinci-003
TextEmbeddings ✅✅✅ Example: T ext-Embeddings-Ada-002
ChatCompletion ✅✅✅ Examp
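
The `filter` argument in the cell above is an OData expression written by hand. When the compared value comes from user input or data, embedded single quotes need to be escaped by doubling them, which is the OData rule for string literals. A minimal sketch follows; the helper names `odata_string_literal` and `eq_filter` are hypothetical and not part of the SDK.

```python
def odata_string_literal(value):
    """Quote a Python string as an OData string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def eq_filter(field, value):
    """Build a simple equality filter expression for a string field."""
    return f"{field} eq {odata_string_literal(value)}"


print(eq_filter("title", "Semantic Kernel"))
print(eq_filter("title", "Developer's Guide"))
```

The resulting string can be passed as the `filter` argument of `search_client.search`, as in the cell above.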

## Perform a Hybrid Search

In [29]:
# Hybrid Search
query = "semantic kernel planner and kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=query,
    vector_queries = [vector_query],
    filter="title eq 'Semantic Kernel'",
    select=["title", "content",],
    top=3
)

for result in results:
    print(f"Title: {result['title']}")
    print(f"Score: {result['@search.score']}")
    print(f"Content: {result['content']}\n")

Title: Semantic Kernel
Score: 0.03306011110544205
Content: To simplify the creation of AI apps, open source projects like LangChain  have
emerged. Semantic K ernel is Microsoft's contribution to this space and is designed to
support enterprise app developers who want to integrate AI into their existing apps.
By using multiple AI models, plugins, and memory all together within Semantic K ernel,
you can create sophisticated pipelines that allow AI to automate complex tasks for users.
For example, with Semantic K ernel, you could create a pipeline that helps a user send an
email to their marketing team. With memory , you could retrieve information about the
project and then use planner  to autogenerate the remaining steps using available
plugins (e.g., ground the user's ask with Microsoft Graph data, generate a response with
GPT-4, and send the email). Finally, you can display a success message back to your user
in your app using a custom plugin.
Step Component Descr iption
1 Ask It start
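
The hybrid score above (about 0.033) is on a different scale from the pure vector scores (about 0.87). Azure Cognitive Search documents that hybrid queries merge the keyword and vector rankings with Reciprocal Rank Fusion (RRF), which sums reciprocal ranks rather than similarity values. A minimal sketch of the general technique follows; the constant `k=60`, the 1-based ranks, and the toy document ids are assumptions for illustration and may not match the service's exact implementation.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids; each appearance contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy rankings: one from keyword search, one from vector search.
keyword_ranking = ["doc1", "doc3", "doc7"]
vector_ranking = ["doc1", "doc9", "doc3"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Because every term is at most `1 / (k + 1)`, a document ranked first in two lists scores roughly 0.03, which is the order of magnitude seen in the hybrid output.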

## Perform a Semantic Hybrid Search

In [25]:
# Semantic Hybrid Search
query = "semantic kernel planner and kernel"

search_client = SearchClient(
    service_endpoint, index_name=index_name, credential=credential)

vector_query = VectorizedQuery(vector=generate_embeddings(query), k_nearest_neighbors=3, fields="contentVector")

results = search_client.search(
    search_text=query,
    vector_queries = [vector_query],
    select=["title", "content"],
    # add the semantic search configuration
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name='sk-semantic-config',
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=3 
)

semantic_answers = results.get_answers()
for answer in semantic_answers:
    if answer.highlights:
        print(f"Semantic Answer: {answer.highlights}")
    else:
        print(f"Semantic Answer: {answer.text}")
    print(f"Semantic Answer Score: {answer.score}\n")

for result in results:
    print(f"Title: {result['title']}")
    print(f"Content: {result['content']}")

    captions = result["@search.captions"]
    if captions:
        caption = captions[0]
        if caption.highlights:
            print(f"Caption: {caption.highlights}\n")
        else:
            print(f"Caption: {caption.text}\n")

Semantic Answer: Semantic Kernel.<em> any other operation that you can do in code that is ill-suited for</em> LLMs<em> (e.g., performing calculations).</em> Instead of providing a separate configuration file with semantic descriptions,<em> planner is able to use annotations in the code to understand how the function behaves.</em>
Semantic Answer Score: 0.71630859375

Title: Semantic Kernel
Content: To instantiate planner, all you need to do is pass it a kernel object. Planner will then
automatically discover all of the plugins registered in the kernel and use them to create
plans. The following code initializes both a kernel and a SequentialPlanner. At the end
of this article we'll review the other types of Planners that are available in Semantic
Kernel.
C#      return (
          Convert.ToDouble(context[ "input"], 
CultureInfo.InvariantCulture) -
          Convert.ToDouble(context[ "number2" ], 
CultureInfo.InvariantCulture)
      ).ToString(CultureInfo.InvariantCulture);
  }
  [SKFu