# Azure Cognitive Search LangChain Vector Code Sample
This code demonstrates how to use Azure Cognitive Search with OpenAI and the Azure Cognitive Search LangChain vector store.
To run the code, install the following packages. This sample uses a pre-release version of the search SDK: `pip install azure-search-documents==11.4.0b8`.

In [28]:
# ! pip install azure-search-documents==11.4.0b8 
# ! pip install openai==0.28.1
# ! pip install python-dotenv
# ! pip install langchain==0.0.336
# ! pip install tiktoken
# ! pip install azure-identity==1.12
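Because the sample depends on specific package versions, it can help to confirm what is installed before running the later cells. The following is a minimal sketch that uses only the standard library; the `PINNED` table simply mirrors the install cell above, and `check_pins` is a hypothetical helper, not part of any of these packages.

```python
from importlib import metadata

# Versions pinned in the install cell; packages without a pin map to None.
PINNED = {
    "azure-search-documents": "11.4.0b8",
    "openai": "0.28.1",
    "python-dotenv": None,
    "langchain": "0.0.336",
    "tiktoken": None,
    "azure-identity": "1.12",
}


def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None:
            ok = False
        elif wanted is None:
            ok = True
        else:
            # "1.12" should also accept an installed "1.12.0".
            ok = installed == wanted or installed.startswith(wanted + ".")
        report[name] = (installed, ok)
    return report


for name, (installed, ok) in check_pins(PINNED).items():
    print(f"{name:25} installed={installed} ok={ok}")
```

A package that reports `ok=False` is either missing or on a different version than the one this sample was written against.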

## Import required libraries and environment variables

In [3]:
# Import required libraries  
import openai
import os  
from dotenv import load_dotenv
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.azuresearch import AzureSearch
from azure.search.documents.indexes.models import (
    SemanticSettings,
    SemanticConfiguration,
    PrioritizedFields,
    SemanticField
)
load_dotenv()  # Load variables from a local .env file; returns False if none were loaded

False

## Configure OpenAI Settings

In [18]:
# Configure environment variables  
load_dotenv()  
openai.api_type: str = "azure"  
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")  
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")  
openai.api_version = os.getenv("AZURE_OPENAI_API_VERSION")  
# Troubleshooting: if the "Create embeddings and vector store instances" cell fails with
#   InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.
# set `model` to the name of your own embedding deployment in Azure OpenAI.
# model: str = "text-embedding-ada-002"
model: str = "startping_embedding"

## Configure Vector Store Settings

In [19]:
vector_store_address: str = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")  
vector_store_password: str = os.getenv("AZURE_SEARCH_ADMIN_KEY") 
index_name: str = "langchain-vector-demo"
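The configuration cells read five environment variables, and `os.getenv` silently returns `None` for any that are unset, which tends to surface later as an authentication or endpoint error. A small sketch of an up-front check follows; the variable names are the ones used in this notebook, while `missing_vars` is a hypothetical helper introduced here for illustration.

```python
import os

# Environment variables read by the configuration cells in this notebook.
REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "AZURE_SEARCH_ADMIN_KEY",
]


def missing_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this right after `load_dotenv()` makes it clear whether the `.env` file was actually picked up.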

## Create embeddings and vector store instances
Read your data, generate OpenAI embeddings and export to a format to insert your Azure Cognitive Search index:

In [20]:
# Troubleshooting: pass openai_api_key explicitly to OpenAIEmbeddings to avoid this error:
#   AuthenticationError: Access denied due to invalid subscription key or wrong API endpoint.
#   Make sure to provide a valid key for an active subscription and use a correct regional
#   API endpoint for your resource.
embeddings: OpenAIEmbeddings = OpenAIEmbeddings(
    deployment=model,
    chunk_size=1,
    openai_api_base=os.getenv("AZURE_OPENAI_ENDPOINT"),
    openai_api_type="azure",
    openai_api_key=os.getenv("AZURE_OPENAI_API_KEY"),
)
# Earlier variant without openai_api_key, kept for reference:
# embeddings: OpenAIEmbeddings = OpenAIEmbeddings(deployment=model, model=model, chunk_size=1, openai_api_base=os.getenv("AZURE_OPENAI_ENDPOINT"), openai_api_type="azure")

# Create the vector store with a semantic configuration named 'config'
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=vector_store_address,
    azure_search_key=vector_store_password,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
    semantic_configuration_name='config',
    semantic_settings=SemanticSettings(
        default_configuration='config',
        configurations=[
            SemanticConfiguration(
                name='config',
                prioritized_fields=PrioritizedFields(
                    title_field=SemanticField(field_name='content'),
                    prioritized_content_fields=[SemanticField(field_name='content')],
                    prioritized_keywords_fields=[SemanticField(field_name='metadata')],
                ),
            )
        ],
    ),
)

## Insert text and embeddings into vector store

In [7]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

loader = TextLoader("./data/state_of_the_union.txt", encoding="utf-8")

documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

vector_store.add_documents(documents=docs)

['OGQ5ZjVkNWMtYjk3MC00Y2I3LTgxOGMtYzk5YTliNWY4NGYx',
 'MjI2ZjgwNmMtMDUxMy00MGI0LTk5Y2UtNzM1ODIzYzEzZGFm',
 'MzQ5MzE2NGYtNjVhMC00NmFhLWFiNmYtYTc5NDQwYzFlODdh',
 'NzZjOTNjMWEtNzg1YS00Y2QwLTk2NDMtZWFkYTMyZjMyM2Zj',
 'YzU2ZGJhZWEtYjVhZi00M2M5LWE5ZGMtYTMzNWY2YjkwZjY5',
 'YTgyODMzZTUtZDhiMy00YzA3LTk0M2ItOTIwN2QxYjMzNTky',
 'YTJhOWRhNWUtZDBjNS00NDRhLTk2NDEtYWU3Mjk1OTYwZDAw',
 'YjExNWRjNjQtNzE2Yi00OThlLThhYTUtODM0YjQwNDNlNWU4',
 'MmU1YThjOGUtNTNkOS00ZGFlLTlkZTMtNGJhM2U2ZThjMGJm',
 'OGE3MjQwZGEtNzA5NS00NjEwLThhMTItMzAyOGFiNzFmYWU2',
 'OTA4ZjZiYTAtYmJlMS00NzgwLTk3MTktZTM1Mjk0NTAxZDVl',
 'Yjk5NjhkYTktOTU0My00YjdlLTg0N2YtYmI3NWY5MjFhZWY5',
 'ZDI1Y2I1M2EtMjZlYS00M2FkLTgxMTUtYzA1NjA4MTg0ZjI2',
 'NDE0ZjA1YjQtMjBhOC00M2YzLTk0MDctMjI1NjZmZTdkMjli',
 'MjBhODc5NTktY2E4NC00NTRmLWEyOTQtYjQyODQ5YjBmMmI2',
 'MDhlZmFjZTctMDE0Ny00ZTE0LTk4MjMtZWU5YzRmYTU2MGM0',
 'NWU0ZmNlYjMtYjliNy00NjQyLTgxNzktYzFjNDY4ZGY1YWNl',
 'ZjBlY2UyNzktMDcyNi00NmY3LTg0ZTgtOTNkOWZmZTFmNGU0',
 'MWYwY2U0ZGEtY2EwYi00NWZlLWJmYmQtZGZmNGZmMDJh
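The document keys returned by `add_documents` look like URL-safe Base64 strings. As a quick sketch, decoding the first key from the output above shows that it wraps a UUID-formatted string; `decode_key` is a hypothetical helper, and whether every key follows this pattern is an assumption based on this one output.

```python
import base64
import uuid


def decode_key(key: str) -> str:
    """Decode a URL-safe Base64 document key back to its text form."""
    return base64.urlsafe_b64decode(key).decode("utf-8")


# First key from the add_documents output above.
first_key = "OGQ5ZjVkNWMtYjk3MC00Y2I3LTgxOGMtYzk5YTliNWY4NGYx"
decoded = decode_key(first_key)

print(decoded)                     # 8d9f5d5c-b970-4cb7-818c-c99a9b5f84f1
print(uuid.UUID(decoded).version)  # 4
```

Decoding keys this way can be handy when cross-referencing documents in the Azure portal or in logs.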

## Perform a vector similarity search

In [8]:
# Perform a similarity search
docs = vector_store.similarity_search(
    query="What did the president say about Ketanji Brown Jackson",
    k=3,
    search_type="similarity",
)
print(docs[0].page_content)

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
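Conceptually, a vector similarity search embeds the query and returns the stored chunks whose embeddings are closest to it. The sketch below illustrates that idea with brute-force cosine similarity over toy 3-dimensional vectors. It is not how the search service is implemented: the distance metric and any approximate nearest-neighbor indexing depend on the index configuration, which this sample does not show, and the vectors and names here are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Hypothetical toy embeddings; real embeddings have many more dimensions.
toy_vectors = {
    "breyer": [0.9, 0.1, 0.0],
    "voting": [0.1, 0.9, 0.0],
    "economy": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

for doc_id, score in top_k(query, toy_vectors, k=2):
    print(f"{doc_id}: {score:.3f}")
```

The `k=3` argument in the cell above plays the same role as `k` here: it caps how many of the closest chunks come back.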


## Perform a hybrid search

In [9]:
# Perform a hybrid search
docs = vector_store.similarity_search(
    query="What did the president say about Ketanji Brown Jackson",
    k=3, 
    search_type="hybrid"
)
print(docs[0].page_content)

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
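A hybrid query runs a keyword search and a vector search and then merges the two ranked lists. Azure's documentation describes this merge as Reciprocal Rank Fusion (RRF). The sketch below shows the general RRF formula on made-up rankings; the constant `k=60`, the 1-based ranks, and the document IDs are assumptions for illustration and may not match the service's exact parameters.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list of (doc_id, score).

    Each document receives 1 / (k + rank) from every list it appears in,
    with rank starting at 1; the contributions are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the keyword and vector halves of a query.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_c", "doc_a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

A document that ranks well in both lists, such as `doc_b` here, ends up ahead of one that tops only a single list.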


## Perform a hybrid search with semantic re-ranking (powered by Bing)

In [21]:
# Troubleshooting: semantic re-ranking requires an Azure AI Search service on at least the
# Basic pricing tier, with Semantic Ranker enabled in the service settings (left-hand menu).
# Otherwise the query fails with:
#   Message: Semantic search is not enabled for this service.
#   Parameter name: queryType
#   Exception Details: (SemanticQueriesNotAvailable) Semantic search is not enabled for this service.
#     Code: SemanticQueriesNotAvailable
#     Message: Semantic search is not enabled for this service.
# Perform a hybrid search with semantic reranking
docs_and_scores = vector_store.semantic_hybrid_search_with_score(  
    query="What did the president say about Ketanji Brown Jackson",  
    k=3,  
)  
  
# Print the results  
for doc, score in docs_and_scores:  
    print("-" * 80)  
    answers = doc.metadata['answers']  
    if answers:  
        if answers.get('highlights'):  
            print(f"Semantic Answer: {answers['highlights']}")  
        else:  
            print(f"Semantic Answer: {answers['text']}")  
        print(f"Semantic Answer Score: {score}")  
    print("Content:", doc.page_content)  
    captions = doc.metadata['captions']
    print(f"Score: {score}") 
    if captions:  
        if captions.get('highlights'):  
            print(f"Caption: {captions['highlights']}")  
        else:  
            print(f"Caption: {captions['text']}")  
    else:  
        print("Caption not available")  


--------------------------------------------------------------------------------
Content: Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
Score: 0.03181818127632141
Caption: One of the most serious constitutional responsibilities a President has is nominating 
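The printing loop above repeats the same "use `highlights` if present, otherwise `text`" logic for both answers and captions. One way to factor that out is sketched below. `best_text` is a hypothetical helper, and the sample metadata dictionary is invented to mirror the keys the loop reads (`answers`, `captions`, `highlights`, `text`); the real metadata may carry additional fields.

```python
def best_text(section):
    """Return highlighted text if available, else plain text, else None."""
    if not section:
        return None
    return section.get("highlights") or section.get("text")


# Hypothetical metadata shaped like what the loop above expects.
sample_metadata = {
    "captions": {
        "text": "plain caption",
        "highlights": "<em>highlighted</em> caption",
    },
    "answers": None,
}

caption = best_text(sample_metadata.get("captions"))
answer = best_text(sample_metadata.get("answers"))

print("Caption:", caption if caption else "Caption not available")
print("Semantic Answer:", answer if answer else "none returned")
```

Using `.get` on the metadata also avoids a `KeyError` if a result comes back without an `answers` or `captions` entry.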