# Azure AI Search LangChain vector code sample
This notebook demonstrates how to use Azure AI Search as a LangChain vector store, with Azure OpenAI generating the embeddings.

### Set up a Python virtual environment in Visual Studio Code

1. Open the Command Palette (Ctrl+Shift+P).
1. Search for **Python: Create Environment**.
1. Select **Venv**.
1. Select a Python interpreter. Choose 3.10 or later.

It can take a minute to set up. If you run into problems, see [Python environments in VS Code](https://code.visualstudio.com/docs/python/environments).
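
Once the environment exists, a quick check from a notebook cell can confirm that the selected interpreter meets the version requirement and is running inside a virtual environment. The following is a minimal sketch using only the standard library; `version_ok` and `in_virtual_env` are illustrative helper names, not part of the sample.

```python
import sys

MIN_VERSION = (3, 10)  # minimum version recommended in the steps above


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def in_virtual_env():
    """Return True when running inside a venv (prefix differs from base prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
print(f"Meets minimum version: {version_ok()}")
print(f"Inside a virtual environment: {in_virtual_env()}")
```

If the last line prints `False`, the notebook kernel is probably still pointing at the global interpreter rather than the new environment.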

### Install packages

In [23]:
! pip install -r requirements.txt --quiet




### Load .env file (Copy .env-sample to .env and update accordingly)

In [24]:
from dotenv import load_dotenv
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
import os

load_dotenv(override=True)  # Take environment variables from .env

# Variables not used here do not need to be updated in your .env file
endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
# Keys are optional: an unset or empty value becomes None
key_credential = os.getenv("AZURE_SEARCH_ADMIN_KEY") or None
index_name = os.environ["AZURE_SEARCH_INDEX"]
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
azure_openai_key = os.getenv("AZURE_OPENAI_KEY") or None
azure_openai_embedding_deployment = os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT"]
azure_openai_api_version = os.environ["AZURE_OPENAI_API_VERSION"]

# The Search SDK clients expect a credential object, not a raw key string.
# Use the admin key if provided, otherwise fall back to RBAC authentication.
credential = AzureKeyCredential(key_credential) if key_credential else DefaultAzureCredential()
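
A missing setting otherwise surfaces as a bare `KeyError`, so it can help to check the `.env` values up front. The sketch below is one way to do that; `missing_settings` is a hypothetical helper, and the two key variables are left out of the list because an empty key falls back to `DefaultAzureCredential`.

```python
import os

# Settings the cell above reads unconditionally
REQUIRED = [
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "AZURE_SEARCH_INDEX",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
    "AZURE_OPENAI_API_VERSION",
]


def missing_settings(env, names=REQUIRED):
    """Return the names that are absent or blank in the mapping `env`."""
    return [name for name in names if not (env.get(name) or "").strip()]


missing = missing_settings(os.environ)
if missing:
    print("Missing or empty settings:", ", ".join(missing))
else:
    print("All required settings are present")
```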

### Create LangChain Azure OpenAI Embeddings

In [25]:
from langchain_openai import AzureOpenAIEmbeddings
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

openai_credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(openai_credential, "https://cognitiveservices.azure.com/.default")

# Use API key if provided, otherwise use RBAC authentication
embeddings = AzureOpenAIEmbeddings(
    azure_deployment=azure_openai_embedding_deployment,
    openai_api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_key,
    azure_ad_token_provider=token_provider if not azure_openai_key else None
)   
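
The index defined in this notebook declares `vector_search_dimensions=1536`, so the embedding deployment has to return vectors of that length. The sketch below shows one way to verify this before indexing. `check_dimensions` is a hypothetical helper, and a stand-in embedding function is used so the example runs without calling Azure OpenAI.

```python
EXPECTED_DIMENSIONS = 1536  # matches vector_search_dimensions in the index definition


def check_dimensions(embed_query, expected=EXPECTED_DIMENSIONS, probe="dimension check"):
    """Embed a short probe string and compare the vector length to `expected`."""
    vector = embed_query(probe)
    if len(vector) != expected:
        raise ValueError(f"Embedding has {len(vector)} dimensions, expected {expected}")
    return len(vector)


# Stand-in embedding function (assumption: replace with the real one in the notebook)
def fake_embed_query(text):
    return [0.0] * EXPECTED_DIMENSIONS


print(check_dimensions(fake_embed_query))
```

In the notebook, passing `embeddings.embed_query` instead of `fake_embed_query` performs the same check against the real deployment, at the cost of one embedding request.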

### Create LangChain Vector Store

In [26]:
from langchain_community.vectorstores.azuresearch import AzureSearch
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
    SemanticSearch,
    SemanticConfiguration,
    SemanticPrioritizedFields,
    SemanticField
)
from langchain_preview_patch.azuresearch import fix_vectorstore

# LangChain is not yet compatible with the latest preview version of the Search SDK
# A workaround is to create the index prior to using the LangChain vector store
search_index = SearchIndex(
    name=index_name,
    fields=[
        SearchField(name="id", key=True, type="Edm.String", searchable=True, filterable=True, facetable=False, sortable=True, hidden=False),
        SearchField(name="content", type="Edm.String", searchable=True, filterable=False, facetable=False, sortable=False, hidden=False),
        SearchField(name="content_vector", type="Collection(Edm.Single)", searchable=True, vector_search_dimensions=1536, vector_search_profile_name="myHnswProfile", hidden=False),
        SearchField(name="metadata", type="Edm.String", searchable=True, filterable=False, facetable=False, sortable=False, hidden=False)
    ],
    vector_search=VectorSearch(
        algorithms=[
            HnswAlgorithmConfiguration(name="hnsw")
        ],
        profiles=[
            VectorSearchProfile(name="myHnswProfile", algorithm_configuration_name="hnsw")
        ]
    ),
    semantic_search=SemanticSearch(
        default_configuration_name="semantic",
        configurations=[
            SemanticConfiguration(
                name="semantic",
                prioritized_fields=SemanticPrioritizedFields(
                    content_fields=[
                        SemanticField(field_name="content")
                    ]
                )
            )
        ]
    )
)
search_index_result = SearchIndexClient(endpoint=endpoint, credential=credential).create_or_update_index(search_index)

# This code will generate a warning that can safely be ignored.
vector_store = AzureSearch(
    azure_search_endpoint=endpoint,
    azure_search_key=key_credential,
    fields=search_index_result.fields,
    vector_search=search_index_result.vector_search,
    semantic_configuration_name=search_index_result.semantic_search.default_configuration_name,
    index_name=index_name,
    embedding_function=embeddings.embed_query
)
# The LangChain vector search methods do not work with the newer preview SDK.
# Patch them with the replacements from the local langchain_preview_patch module.
fix_vectorstore(vector_store)

vector_search_configuration is not a known attribute of class <class 'azure.search.documents.indexes.models._index.SearchField'> and will be ignored


### Load, chunk, and index the documents

In [27]:
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
import os

directory = os.path.join("..", "data", "documents")
files = ["Benefit_Options.pdf", "Northwind_Health_Plus_Benefits_Details.pdf", "Northwind_Standard_Benefits_Details.pdf"]
total_chunks = 0
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

for file in files:
    loader = PyPDFLoader(os.path.join(directory, file))
    file_chunks = loader.load_and_split(splitter)
    results = vector_store.add_documents(documents=file_chunks)
    total_chunks += len(results)
    print(f"Indexed {file}")
print(f"Indexed {total_chunks} chunks")

Indexed Benefit_Options.pdf
Indexed Northwind_Health_Plus_Benefits_Details.pdf
Indexed Northwind_Standard_Benefits_Details.pdf
Indexed 636 chunks
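
The `chunk_size` and `chunk_overlap` arguments control how much text goes into each indexed chunk. The sketch below illustrates those two parameters with a plain sliding character window. It is a simplification and not the algorithm `RecursiveCharacterTextSplitter` uses, which prefers to break on separators such as paragraphs and lines; `sliding_window_chunks` is a hypothetical helper.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=0):
    """Split `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


sample = "abcdefghij" * 250  # 2,500 characters of placeholder text
print([len(c) for c in sliding_window_chunks(sample)])
print([len(c) for c in sliding_window_chunks(sample, chunk_overlap=200)])
```

With no overlap, as in the cell above, the chunks partition the text. A nonzero overlap repeats some text across neighbouring chunks, which can help when an answer straddles a chunk boundary but increases the number of chunks to embed and store.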


## Perform a vector similarity search

In [28]:
# Perform a similarity search
docs = vector_store.similarity_search(
    "What is included in my Northwind Health Plus plan that is not in standard?",
    k=3,
    search_type="similarity",
)
for doc in docs:
    print("-" * 80)  
    print(f"ID: {doc.metadata['id']}")
    print(f"Chunk Content: {doc.page_content}")

--------------------------------------------------------------------------------
ID: MjNlYWE2OWMtNmIyYy00ZWUzLWE4ZjEtNTg1M2U1OTFiMGUy
Chunk Content: It is important to remember that the Northwind Health Plus plan covers only medically 
necessary services. Non -essential services, such as elective or cosmetic procedures, are not 
covered.
--------------------------------------------------------------------------------
ID: OGVmNGI3MzItODU0Zi00ZDYwLTg5NGYtNTk0NjgyMDgwNzVh
Chunk Content: included in the plan documents or  summary, then it does not apply to the plan.  
You should also be aware that the Northwind Health Plus plan may contain certain 
exceptions, exclusions, and limitations. It is important to familiarize yourself with the plan 
documents to make sure that you u nderstand what services are covered and which are not 
covered. If you have any questions, Northwind Health has customer service representatives 
who are available to answer your questions.
---------------------------
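
Conceptually, a vector similarity search embeds the query and ranks the stored chunks by how close their vectors are to the query vector. The sketch below does this by brute force with cosine similarity on toy three-dimensional vectors. It assumes a cosine metric and an exhaustive scan, whereas the HNSW configuration in the index performs an approximate search; the vectors and IDs are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=3):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vector, vec)) for doc_id, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical stored embeddings keyed by chunk ID
stored_vectors = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.9, 0.1, 0.0],
    "chunk-c": [0.0, 1.0, 0.0],
    "chunk-d": [0.0, 0.0, 1.0],
}
for doc_id, score in top_k([1.0, 0.05, 0.0], stored_vectors, k=2):
    print(doc_id, round(score, 3))
```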

## Perform a hybrid search

In [29]:
# Perform a hybrid search
docs = vector_store.similarity_search(
    query="What is included in my Northwind Health Plus plan that is not in standard?",
    k=3, 
    search_type="hybrid"
)
for doc in docs:
    print("-" * 80)  
    print(f"ID: {doc.metadata['id']}")
    print(f"Chunk Content: {doc.page_content}")

--------------------------------------------------------------------------------
ID: OGVmNGI3MzItODU0Zi00ZDYwLTg5NGYtNTk0NjgyMDgwNzVh
Chunk Content: included in the plan documents or  summary, then it does not apply to the plan.  
You should also be aware that the Northwind Health Plus plan may contain certain 
exceptions, exclusions, and limitations. It is important to familiarize yourself with the plan 
documents to make sure that you u nderstand what services are covered and which are not 
covered. If you have any questions, Northwind Health has customer service representatives 
who are available to answer your questions.
--------------------------------------------------------------------------------
ID: NmUzYzYwNTMtNzAxZi00MjAxLWIzNTYtYzkwYWMxZTYyOGE0
Chunk Content: care servi ces. The plans also cover preventive care services such as mammograms, colonoscopies, and 
other cancer screenings. 
Northwind Health Plus offers more comprehensive coverage than Northwind Standard. This pla
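
A hybrid query runs a keyword search and a vector search side by side and merges the two ranked lists. Azure AI Search documents Reciprocal Rank Fusion (RRF) as the merging method. The sketch below shows the idea on two made-up rankings. The constant `k=60` is the value commonly used for RRF and is an assumption here, and the function is illustrative rather than the service's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for a deterministic order
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


keyword_ranking = ["doc-2", "doc-1", "doc-4"]  # hypothetical full-text results
vector_ranking = ["doc-1", "doc-3", "doc-2"]   # hypothetical vector results

for doc_id, score in reciprocal_rank_fusion([keyword_ranking, vector_ranking]):
    print(doc_id, round(score, 5))
```

Documents that appear near the top of both lists accumulate the highest fused score, which is why a hybrid query can surface a chunk that neither search alone ranks first.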

## Perform a hybrid search with semantic reranking (powered by Bing)

In [30]:
# Perform a hybrid search with semantic reranking
docs_and_scores = vector_store.semantic_hybrid_search_with_score(
    query="What is included in my Northwind Health Plus plan that is not in standard?",
    k=3,
)

# Print the results
for doc, score in docs_and_scores:
    print("-" * 80)
    # Semantic answers and captions are present only when the service returns them
    answers = doc.metadata.get("answers")
    if answers:
        # Prefer the highlighted text, fall back to the plain text
        print(f"Semantic Answer: {answers.get('highlights') or answers.get('text')}")
        print(f"Semantic Answer Score: {score}")
    print("Content:", doc.page_content)
    print(f"Score: {score}")
    captions = doc.metadata.get("captions")
    if captions:
        print(f"Caption: {captions.get('highlights') or captions.get('text')}")
    else:
        print("Caption not available")


--------------------------------------------------------------------------------
Content: Northwind Standard, you can choose from a variety of in -network providers, including primary care 
physicians, specialists, hospitals, and pharmacies. This plan  does not offer coverage for emergency 
services, mental health and substance abuse coverage, or out -of-network services.
Comparison of Plans 
Both plans offer coverage for routine physicals, well -child visits, immunizations, and other preventive 
care servi ces. The plans also cover preventive care services such as mammograms, colonoscopies, and 
other cancer screenings. 
Northwind Health Plus offers more comprehensive coverage than Northwind Standard. This plan offers 
coverage for emergency services, both in -network and out -of-network, as well as mental health and 
substance abuse coverage. Northwind Standard does not offer coverage for emergency services, mental 
health and substance abuse coverage, or out -of-network services.
Sc