# Azure AI Search

>[Azure AI Search](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search) (formerly known as `Azure Cognitive Search` or `Azure Search`) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.

>Search is foundational to any app that surfaces text to users, where common scenarios include catalog or document search, online retail apps, or data exploration over proprietary content. When you create a search service, you'll work with the following capabilities:
>- A search engine for full text search over a search index containing user-owned content
>- Rich indexing, with lexical analysis and optional AI enrichment for content extraction and transformation
>- Rich query syntax for text search, fuzzy search, autocomplete, geo-search and more
>- Programmability through REST APIs and client libraries in Azure SDKs
>- Azure integration at the data layer, machine learning layer, and AI (AI Services)

This notebook shows how to use Azure AI Search (AAS) within LangChain.

## Set up Azure AI Search

To set up AAS, please follow the instructions [here](https://learn.microsoft.com/en-us/azure/search/search-create-service-portal).

Please note you will need
1. the name of your AAS service, 
2. the name of your AAS index,
3. your API key.

Your API key can be either an admin key or a query key. Because the retriever only reads data, a query key is recommended.
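
To make the role of the three values concrete, here is a minimal sketch of how they could fit into a query against the service's REST endpoint. It only builds the URL and headers and sends nothing. The helper name `build_search_request` and the `API_VERSION` value are assumptions for illustration, and this is not presented as how `AzureAISearchRetriever` is implemented.

```python
from urllib.parse import quote, urlencode

# Assumed API version for illustration; use one your service supports.
API_VERSION = "2023-11-01"


def build_search_request(service_name, index_name, api_key, query, top=None):
    """Return (url, headers) for a simple document search request (hypothetical helper)."""
    # The service name becomes the host part of the endpoint.
    base = f"https://{service_name}.search.windows.net"
    params = {"api-version": API_VERSION, "search": query}
    if top is not None:
        params["$top"] = top  # optional cap on the number of hits
    # The index name selects which index is queried.
    url = f"{base}/indexes/{quote(index_name)}/docs?{urlencode(params, safe='$')}"
    # The admin or query key is sent in the api-key header.
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return url, headers


url, headers = build_search_request(
    "my-service", "my-index", "<query-key>", "what is langchain?", top=3
)
print(url)
```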

## Using the Azure AI Search Retriever

In [1]:
import os

from langchain_community.retrievers import (
    AzureAISearchRetriever,
)

Set Service Name, Index Name and API key as environment variables (alternatively, you can pass them as arguments to `AzureAISearchRetriever`). The search index name you use determines which documents are queried, so be sure to select the right one. 

Note: `AzureCognitiveSearchRetriever` can also be used, but it will soon be deprecated. Please switch to `AzureAISearchRetriever` where possible.

In [None]:
os.environ["AZURE_AI_SEARCH_SERVICE_NAME"] = "<YOUR_AAS_SERVICE_NAME>"
os.environ["AZURE_AI_SEARCH_INDEX_NAME"] = "<YOUR_AAS_INDEX_NAME>"
os.environ["AZURE_AI_SEARCH_API_KEY"] = "<YOUR_API_KEY>"
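
Because the settings come from environment variables, a missing or placeholder value may only surface later as a failed request. The following is a small sketch of a pre-flight check. The helper `missing_settings` is hypothetical and not part of LangChain, and treating `<...>` values as unset is an assumption based on the placeholders used above.

```python
import os

REQUIRED_VARS = (
    "AZURE_AI_SEARCH_SERVICE_NAME",
    "AZURE_AI_SEARCH_INDEX_NAME",
    "AZURE_AI_SEARCH_API_KEY",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or still placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        # Treat empty values and "<...>" template placeholders as not configured.
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing


problems = missing_settings()
if problems:
    print("Set these before creating the retriever:", ", ".join(problems))
```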

Create the Retriever

- `content_key` is the key in the retrieved result whose value becomes the Document's `page_content`.
- `top_k` is the number of results you'd like to retrieve. Setting it to None (the default) returns all results.

In [None]:
retriever = AzureAISearchRetriever(content_key="content", top_k=10)
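
The effect of `content_key` and `top_k` can be illustrated with plain dictionaries. This is a sketch under assumptions: `SimpleDocument` is a stand-in for LangChain's `Document`, `hits_to_documents` is a hypothetical helper, the sample hits are made up, and the limit is applied here by slicing, which may differ from how the retriever itself applies it.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Stand-in for LangChain's Document, used only in this sketch."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def hits_to_documents(hits, content_key="content", top_k=None):
    """Convert raw search hits (dicts) into documents (hypothetical helper)."""
    # top_k=None keeps every hit; otherwise keep only the first top_k.
    selected = hits if top_k is None else hits[:top_k]
    documents = []
    for hit in selected:
        fields = dict(hit)  # copy so the caller's hit is not mutated
        text = fields.pop(content_key)  # this field becomes page_content
        documents.append(SimpleDocument(page_content=text, metadata=fields))
    return documents


# Made-up hits shaped like search results.
hits = [
    {"id": "1", "content": "LangChain is a framework.", "@search.score": 1.2},
    {"id": "2", "content": "Azure AI Search is a search service.", "@search.score": 0.7},
]
docs = hits_to_documents(hits, content_key="content", top_k=1)
print(docs[0].page_content)
print(docs[0].metadata)
```

In this sketch, every field other than the one named by `content_key` ends up in `metadata`.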

Now you can retrieve documents from Azure AI Search by calling `get_relevant_documents`. It returns the documents relevant to the query, limited to `top_k` results when `top_k` is set.

In [None]:
retriever.get_relevant_documents("what is langchain?")

## Example 

First, let's create an Azure AI Search vector store and upload some data to it.

In [15]:
import os

from langchain.text_splitter import TokenTextSplitter
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.vectorstores import AzureSearch
from langchain_openai import OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = "youropenaiapikey"
os.environ["AZURE_AI_SEARCH_SERVICE_NAME"] = "yourazureaisearchservicename"
os.environ["AZURE_AI_SEARCH_API_KEY"] = "yourazureaisearchapikey"

We'll use an embedding model from OpenAI to turn our documents into embeddings stored in the Azure AI Search vector store. We'll also set the index name to `langchain-vector-demo`, which creates a new vector store associated with that index name.

In [16]:
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

vector_store = AzureSearch(
    embedding_function=embeddings.embed_query,
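    # The endpoint must be a full URL, so build it from the service name.
    azure_search_endpoint=f"https://{os.getenv('AZURE_AI_SEARCH_SERVICE_NAME')}.search.windows.net",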
    azure_search_key=os.getenv("AZURE_AI_SEARCH_API_KEY"),
    index_name="langchain-vector-demo",
)

Next we'll load data into our newly created vector store.
For this example we load all the text files from a folder named `qna`. We'll split the text into 1000-token chunks with no overlap. Finally, the documents are added to our vector store as embeddings.

In [18]:
loader = DirectoryLoader(
    "qna/",
    glob="*.txt",
    loader_cls=TextLoader,
    loader_kwargs={"autodetect_encoding": True},
)
documents = loader.load()
text_splitter = TokenTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

vector_store.add_documents(documents=docs)

['YWY0NzY1MWYtMTU1Ni00YmEzLTlhNTQtZDQxNWFkMTlkNjMx',
 'MTUzM2EyOGYtYWE0My00OTIyLWJkNWUtMjVjNTgwMzZlMjcx',
 'ZGMyMjQ3N2EtNTQ5NC00ZjZhLWIyMzctYjRlZDYzMWUxNGQ4',
 'OWM5MWQ3YzUtZjFkZS00MGI2LTg1OGMtMmRlYzUwMDc2MzZi',
 'ZmFiYWVkOGQtNTcwYi00YTVmLWE3ZDEtMWQ3MTAxYjI2NTJj',
 'NTUwM2ExMjItNTk4Zi00OTg0LTg1ZDItZTZlMGYyMjJiNTIy']
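
The chunking step above (fixed-size chunks, no overlap) can be sketched in plain Python. This is an illustration only: `split_into_chunks` is a hypothetical helper, and a list of integers stands in for tokens, whereas `TokenTextSplitter` counts tokens with a real tokenizer, so its chunk boundaries will differ.

```python
def split_into_chunks(tokens, chunk_size=1000, chunk_overlap=0):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start : start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Integers stand in for tokens here (an assumption of this sketch).
tokens = list(range(2500))
chunks = split_into_chunks(tokens, chunk_size=1000, chunk_overlap=0)
print([len(chunk) for chunk in chunks])
```

With no overlap, concatenating the chunks reproduces the original token sequence, so nothing is lost or duplicated at chunk boundaries.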

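Next we'll create a retriever similar to the one we created above, but this time it should query the index associated with our new vector store, `langchain-vector-demo`. The retriever takes its index from `AZURE_AI_SEARCH_INDEX_NAME` or from an explicit argument, so make sure it targets that index.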

In [23]:
retriever = AzureAISearchRetriever(
    content_key="content", top_k=1, index_name="langchain-vector-demo"
)

Now we can retrieve the data that is relevant to our query from the documents we uploaded. 

In [24]:
retriever.get_relevant_documents("What is Azure OpenAI?")

[Document(page_content='\n# What is Azure OpenAI?\n\nThe Azure OpenAI service provides REST API access to OpenAI\'s powerful language models including the GPT-3, Codex and Embeddings model series. These models can be easily adapted to your specific task including but not limited to content generation, summarization, semantic search, and natural language to code translation. Users can access the service through REST APIs, Python SDK, or our web-based interface in the Azure OpenAI Studio.\n\n### Features overview\n\n| Feature | Azure OpenAI |\n| --- | --- |\n| Models available | GPT-3 base series <br> Codex series <br> Embeddings series <br> Learn more in our [Models](./concepts/models.md) page.|\n| Fine-tuning | Ada <br> Babbage <br> Curie <br> Cushman* <br> Davinci* <br> \\* available by request. Please open a support request|\n| Price | [Available here](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) |\n| Virtual network support | Yes | \n| Managed Iden