# Azure AI Search LangChain vector code sample

This sample was tested with Python 3.10.0.

### Note:
You must use the `azure-search-documents==11.4.0b8` release (`pip install azure-search-documents==11.4.0b8`). Restart VS Code after installing it.
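Because the sample depends on one exact pre-release, it can help to confirm the installed version before running the remaining cells. The following is a minimal sketch using only the standard library; the helper name `is_pinned` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def is_pinned(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # The package is not installed in this environment at all.
        return False


# Warn early instead of failing later on an import error.
if not is_pinned("azure-search-documents", "11.4.0b8"):
    print("azure-search-documents 11.4.0b8 is not installed; run the pip command above.")
```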

In [10]:
# pip install -r requirements.txt

## Import required libraries and environment variables

In [1]:
# Import required libraries  
import openai
import os  
from dotenv import load_dotenv
from langchain.embeddings import OpenAIEmbeddings, AzureOpenAIEmbeddings
from langchain.vectorstores.azuresearch import AzureSearch
from azure.search.documents.indexes.models import (
    SemanticSettings,
    SemanticConfiguration,
    PrioritizedFields,
    SemanticField
)


## Configure Azure OpenAI settings

In [2]:
# TODO: change to .env-{myname} and set environment variables.
load_dotenv(override=True, dotenv_path='../.env-leo')

openai.api_type: str = "azure"  
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")  
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai.api_version = os.getenv("AZURE_OPENAI_API_VERSION")  
model: str = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL")

print(openai.api_base)
print(openai.api_version)
print(model)

https://prompton52g-aoai-12.openai.azure.com/
2023-10-01-preview
text-embedding-ada-002
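If any of these settings prints `None`, the `.env` file was probably not found or a variable is missing. A small check like the one below reports every missing name at once. This is a sketch only: the helper `missing_env_vars` is hypothetical, and the variable names are the ones read in the cell above.

```python
import os
from typing import Iterable, Mapping


def missing_env_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names that are unset or empty in `env`, in the given order."""
    return [name for name in names if not env.get(name)]


# Names taken from the os.getenv calls in this notebook.
REQUIRED = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_EMBEDDING_DEPLOYED_MODEL",
]

missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing settings:", ", ".join(missing))
```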


## Configure vector store settings

In [6]:
vector_store_address: str = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")  
vector_store_password: str = os.getenv("AZURE_SEARCH_ADMIN_KEY") 
index_name: str = os.getenv("AZURE_SEARCH_INDEX_NAME") # Set your own index name in the .env environment variables.

print(vector_store_address)
print(index_name)

https://prompton52g-aisearch-12.search.windows.net
langchain-vector-demo-leo


## Create embeddings and vector store instances
Create the embedding client and the vector store. The vector store uses the embedding function to turn text into vectors before writing them to your search index:

In [7]:
# Create an embedding object
embeddings: OpenAIEmbeddings = AzureOpenAIEmbeddings(
    azure_deployment=model, model=model, chunk_size=1, 
    azure_endpoint=openai.api_base,
    api_key=openai.api_key,
    openai_api_type=openai.api_type,
    api_version=openai.api_version,
)

# Create an index in Azure Search
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=vector_store_address,
    azure_search_key=vector_store_password,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
    semantic_configuration_name='config',
    semantic_settings=SemanticSettings(
        default_configuration='config',
        configurations=[
            SemanticConfiguration(
                name='config',
                prioritized_fields=PrioritizedFields(
                    title_field=SemanticField(field_name='content'),
                    prioritized_content_fields=[SemanticField(field_name='content')],
                    prioritized_keywords_fields=[SemanticField(field_name='metadata')],
                ),
            )
        ],
    ),
)

## Insert text and embeddings into vector store

From here on, usage is the same as with any other LangChain vector store.

In [15]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

loader = TextLoader("../data/sample-data/state_of_the_union.txt", encoding="utf-8")

documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
docs[0].metadata
# Uncomment the next line to upload the chunks and their embeddings to the index.
# vector_store.add_documents(documents=docs)

{'source': '../data/sample-data/state_of_the_union.txt'}
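To make the chunking step more concrete, here is a simplified sketch of separator-based splitting with no overlap, matching `chunk_overlap=0` in the cell above. It is not LangChain's implementation of `CharacterTextSplitter`; the function `split_text` and its greedy packing rule are assumptions chosen for illustration.

```python
def split_text(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole rather than cut mid-paragraph.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            # The next piece does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
print(split_text(sample, chunk_size=10))
```

With a chunk size of 10, the first two paragraphs fit together and the third starts a new chunk.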

## Perform a vector similarity search

In [12]:
# Perform a similarity search
docs = vector_store.similarity_search(
    query="What did the president say about Ketanji Brown Jackson",
    k=3,
    search_type="similarity",
)
print(docs[0].page_content)

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
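Conceptually, a vector similarity search embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with brute-force cosine similarity over toy two-dimensional vectors. It is an illustration under assumptions, not how the search service works internally: the actual metric and search algorithm depend on the index configuration, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Score every document against the query and keep the k most similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors standing in for real embeddings.
toy_index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(top_k([1.0, 0.0], toy_index, k=2))
```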


## Perform a hybrid search

In [14]:
# Perform a hybrid search
docs = vector_store.similarity_search(
    query="What did the president say about Ketanji Brown Jackson",
    k=3, 
    search_type="hybrid"
)
# print(docs[0].page_content)
print(docs[0].metadata)

{'id': 'MDE5MWNkYWMtNzY4NC00NWJhLTkyNmQtOGVjN2U4OWUyN2Y3', 'source': '../data/sample-data/state_of_the_union.txt'}
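A hybrid search runs a keyword query and a vector query and merges the two ranked lists. Azure AI Search documents Reciprocal Rank Fusion (RRF) as the merging method. The sketch below shows the general RRF idea on toy rankings; the constant `k=60`, the 1-based ranks, and the function name are assumptions for illustration and may differ from what the service does.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document receives 1 / (k + rank) from every list it appears in.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


keyword_ranking = ["a", "b", "c"]  # hypothetical keyword (full-text) results
vector_ranking = ["c", "a", "d"]   # hypothetical vector results
for doc_id, score in reciprocal_rank_fusion([keyword_ranking, vector_ranking]):
    print(doc_id, round(score, 5))
```

Documents that appear near the top of both lists, such as `a` and `c` here, end up ahead of documents that appear in only one list.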


## Perform a hybrid search with semantic reranking (powered by Bing)

In [8]:
# Perform a hybrid search with semantic reranking  
docs_and_scores = vector_store.semantic_hybrid_search_with_score(  
    query="What did the president say about Ketanji Brown Jackson",  
    k=3,  
)  
  
# Print the results  
for doc, score in docs_and_scores:  
    print("-" * 80)  
    answers = doc.metadata['answers']  
    if answers:  
        if answers.get('highlights'):  
            print(f"Semantic Answer: {answers['highlights']}")  
        else:  
            print(f"Semantic Answer: {answers['text']}")  
        print(f"Semantic Answer Score: {score}")  
    print("Content:", doc.page_content)  
    captions = doc.metadata['captions']
    print(f"Score: {score}") 
    if captions:  
        if captions.get('highlights'):  
            print(f"Caption: {captions['highlights']}")  
        else:  
            print(f"Caption: {captions['text']}")  
    else:  
        print("Caption not available")  


--------------------------------------------------------------------------------
Content: Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 

Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. 

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.
Score: 0.03333333507180214
Caption: One of the most serious constitutional responsibilities a President has is nominating 
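The printing loop above repeats the same "prefer highlights, fall back to text" logic for answers and captions. A small helper can factor that out and also tolerate metadata that lacks those keys. This is a sketch: the key names `captions`, `highlights`, and `text` come from the loop above, while `best_text` and `describe` are hypothetical helpers.

```python
from typing import Any, Mapping, Optional


def best_text(entry: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Prefer the highlighted text of a caption or answer, falling back to its plain text."""
    if not entry:
        return None
    return entry.get("highlights") or entry.get("text")


def describe(metadata: Mapping[str, Any]) -> str:
    """Build the caption line for one result without raising on missing keys."""
    caption = best_text(metadata.get("captions"))
    return f"Caption: {caption}" if caption else "Caption not available"


# Plain dictionaries standing in for doc.metadata.
print(describe({"captions": {"text": "plain caption", "highlights": ""}}))
print(describe({}))
```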