# Using Azure AI Search with LangChain

I want to experiment with using Azure AI Search as part of long-term agent memory in addition to RAG. 

This notebook explores my attempt to get it working. 

In particular, I want to be able to:

* generate a session_id for each "conversation" or agent session
* filter vector searches for memory retrieval to the session_id
* have the vector search apply the filter itself (rather than filtering after retrieval)
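
The first two goals can be sketched with the standard library alone. This is a minimal illustration, not part of the notebook's pipeline: `new_session_id` and `session_filter` are hypothetical helper names, and the filter assumes the index has a filterable string field called session_id, as set up later in this notebook. Doubling single quotes is how OData string literals escape them.

```python
import uuid


def new_session_id() -> str:
    """Return a random UUID string identifying one conversation or agent session."""
    return str(uuid.uuid4())


def session_filter(session_id: str) -> str:
    """Build an OData filter expression that matches a single session_id.

    Single quotes inside an OData string literal are escaped by doubling them.
    """
    escaped = session_id.replace("'", "''")
    return f"session_id eq '{escaped}'"


session_id = new_session_id()
print(session_filter(session_id))
```

The resulting string is what would be passed as the filter argument of a search call, so every retrieval for a session uses the same expression.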

In [1]:
import json
import os
import random
from datetime import datetime, timezone

import requests
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchableField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
)
from azure.search.documents.models import QueryType
from dotenv import load_dotenv
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.embeddings import OpenAIEmbeddings
from langchain.memory import (
    ConversationBufferMemory,
    ConversationBufferWindowMemory,
    ConversationSummaryMemory,
    VectorStoreRetrieverMemory,
)
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    PromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.vectorstores.base import VectorStoreRetriever
from langchain_community.retrievers import AzureAISearchRetriever
from langchain_community.vectorstores.azuresearch import AzureSearch
from langchain_core.documents import Document
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
COGNITIVE_SEARCH_URL = os.getenv("COGNITIVE_SEARCH_URL")
COGNITIVE_SEARCH_ADMIN_KEY = os.getenv("COGNITIVE_SEARCH_ADMIN_KEY")

index_name = "agent-memory"

# Create clients

To work with our Azure AI Search instance we'll need an index client and a search client.

In [2]:
index_client = SearchIndexClient(
    endpoint=COGNITIVE_SEARCH_URL,
    credential=AzureKeyCredential(COGNITIVE_SEARCH_ADMIN_KEY),
)

search_client = SearchClient(
    endpoint=COGNITIVE_SEARCH_URL,
    index_name=index_name,
    credential=AzureKeyCredential(COGNITIVE_SEARCH_ADMIN_KEY),
)

# Create the index

I had a heck of a time manually creating the index with the additional field I want for filtering.

In particular, a manually created vector field caused problems when actually trying to perform a vector search.

My workaround is to upload a document and let LangChain's library create the initial index, then update the index with any needed changes.

In [3]:
embeddings_model: str = "text-embedding-ada-002"
openai_api_version: str = "2023-05-15"
embeddings: OpenAIEmbeddings = OpenAIEmbeddings(
    openai_api_key=OPENAI_API_KEY, openai_api_version=openai_api_version, model=embeddings_model
)

vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=COGNITIVE_SEARCH_URL,
    azure_search_key=COGNITIVE_SEARCH_ADMIN_KEY,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)

seed_conversation = Document(
    page_content="""
    User: My computer keeps crashing when I try to open the application.
    Support: I'm here to help! Could you let me know which operating system you're using?
    User: I'm on Windows 10.
    Support: Thank you. Let’s try reinstalling the application. Please go to the Control Panel, uninstall the app, and then download the latest version from our website.
    User: Okay, I’ll give that a try.
    """,
    metadata={
        "session_id": "7890",
        "title": "Technical Issue - Application Crashes",
        "timestamp": "2023-11-10T15:00:00Z",
        "agent_name": "TechSupportBot"
    }
)


seed_doc = Document(
    page_content=seed_conversation.page_content,
    metadata=seed_conversation.metadata
)

# this will force creation of the index if it doesn't exist
seed_doc_id = vector_store.add_documents(documents=[seed_doc], index_name=index_name)[0]

# delete the seed document
search_client.delete_documents(documents=[{"id": seed_doc_id}])


[<azure.search.documents._generated.models._models_py3.IndexingResult at 0x218110b6750>]

# Update the index with fields needed for filtering

Now that the index has been created, I'll update it with the field I want to filter on: session_id

In [4]:
index = index_client.get_index(index_name)

field_names = [field.name for field in index.fields]
if 'session_id' not in field_names:
    # session_id only needs to support filtering, so a SimpleField
    # (which is not full-text searchable) is enough
    new_field = SimpleField(
        name='session_id',
        type=SearchFieldDataType.String,
        filterable=True,
        facetable=False,
        sortable=False
    )
    index.fields.append(new_field)
    index_client.create_or_update_index(index)
    print("Field 'session_id' added successfully.")
else:
    print("Field 'session_id' already exists in the index. No changes made.")


Field 'session_id' added successfully.


# Adding to the vector store

We'll create a vector store we can use as LangChain memory.

When we add a document (such as conversation / session data) we'll need to:
* add the vectorized document
* update it with the session_id for filtering
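
The second step is a partial update: only the document key and the new field are sent. As a rough sketch, the payload for several documents could be built in one go, since the merge call in this notebook takes a list. `build_session_patch` is a hypothetical helper, and it assumes the index key field is named id, as in the code that follows.

```python
from typing import Dict, List


def build_session_patch(doc_ids: List[str], session_id: str) -> List[Dict[str, str]]:
    """Return one partial document per id, carrying only the key and the session_id."""
    return [{"id": doc_id, "session_id": session_id} for doc_id in doc_ids]


patch = build_session_patch(["doc-1", "doc-2"], "ef9af0f7-99ff-441d-adf2-dbd165704869")
print(patch)
```

A list built this way has the same shape as the single-item list passed to the merge call in the cell below.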



In [5]:
def add_to_memory(vector_store: AzureSearch, conversation: Document, session_id: str):
    """Add the vectorized document, then merge the session_id onto it for filtering."""
    # doc_id avoids shadowing the built-in id()
    doc_id = vector_store.add_documents(documents=[conversation], index_name=index_name)[0]
    update_document = {
        "id": doc_id,
        "session_id": session_id
    }
    search_client.merge_documents([update_document])

vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=COGNITIVE_SEARCH_URL,
    azure_search_key=COGNITIVE_SEARCH_ADMIN_KEY,
    index_name=index_name,
    embedding_function=embeddings.embed_query,
)


session_id_a = "ef9af0f7-99ff-441d-adf2-dbd165704869"
test_conversation_a = Document(
    page_content="""
    User: What is a good antimalware product?
    Agent: Microsoft Defender is the best. 
    User: But does it use AI? 
    Agent: Microsoft Defender uses state-of-the-art AI to protect your computer.
    User: Okay, I’ll give that a try.
    """,
    metadata={
        "start_timestamp": "2023-11-10T15:00:00Z",
        "agent_name": "TechSupportBot",
        "Environment": "ATEVET15"
    }
)

test_conversation_b = Document(
    page_content="""
    User: I gave Defender a try?
    Agent: I'm glad to hear that. What was the result?
    User: It protected me from ransomware. 
    Agent: That is great to hear. Defender is a reliable product.
    User: I'll stick with Defender.
    """,
    metadata={
        "start_timestamp": "2023-11-11T15:00:00Z",
        "agent_name": "TechSupportBot",
        "Environment": "ATEVET15"
    }
)

session_id_b = "a9b12bdb-0a28-4785-81e9-954f805fd3ff"
test_conversation_c = Document(
    page_content="""
    User: What is a good antimalware product?
    Agent: Crowdstrike is the best.
    User: But does it use AI? 
    Agent: Crowdstrike uses AI in kernel mode to crash your computer.
    User: Maybe I'll stick with Defender.
    """,
    metadata={
        "start_timestamp": "2023-11-12T14:00:00Z",
        "agent_name": "TechSupportBot",
        "Environment": "ATEVET15"
    }
)

add_to_memory(vector_store, test_conversation_a, session_id_a)
add_to_memory(vector_store, test_conversation_b, session_id_a)
add_to_memory(vector_store, test_conversation_c, session_id_b)

# Verify filtering

Let's see if we can filter to a specific session id.

In [6]:
# first without filtering
search_text = "What's a good antimalware?"
filter_expression = ""
results = list(search_client.search(search_text=search_text, filter=filter_expression))
print(f"{len(results)} results without filtering")

# now with filtering, restricted to the first session
filter_expression = f"session_id eq '{session_id_a}'"
results = list(search_client.search(search_text=search_text, filter=filter_expression))
print(f"{len(results)} results with filtering")

3 results without filtering
2 results with filtering
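
To see why it matters that the filter is applied by the search itself rather than after retrieval, here is a toy in-memory comparison. It is only an illustration under stated assumptions: the documents, vectors, and the functions `cosine`, `top_k` and `top_k_post_filter` are made up for this sketch and do not reflect how Azure AI Search is implemented.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query: List[float], docs: List[Dict], k: int, session_id: Optional[str] = None) -> List[Dict]:
    """Pre-filter: restrict candidates to the session (if given), then rank by similarity."""
    candidates = [d for d in docs if session_id is None or d["session_id"] == session_id]
    ranked = sorted(candidates, key=lambda d: cosine(query, d["vector"]), reverse=True)
    return ranked[:k]


def top_k_post_filter(query: List[float], docs: List[Dict], k: int, session_id: str) -> List[Dict]:
    """Post-filter: rank everything, keep the top k, then drop other sessions."""
    return [d for d in top_k(query, docs, k) if d["session_id"] == session_id]


docs = [
    {"id": "a1", "session_id": "A", "vector": [0.6, 0.8]},
    {"id": "a2", "session_id": "A", "vector": [0.0, 1.0]},
    {"id": "b1", "session_id": "B", "vector": [1.0, 0.0]},
    {"id": "b2", "session_id": "B", "vector": [0.9, 0.1]},
]
query = [1.0, 0.0]

pre = [d["id"] for d in top_k(query, docs, k=2, session_id="A")]
post = [d["id"] for d in top_k_post_filter(query, docs, k=2, session_id="A")]
print(pre)   # ['a1', 'a2']
print(post)  # []
```

With post-filtering, the two closest documents both belong to session B, so nothing is left for session A once the filter runs. Filtering first always returns up to k documents from the requested session.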


# Test with agent

Let's create a LangChain agent to test using our memory.

In [7]:
# we'll need memory we can pass to the model
memory_retriever = VectorStoreRetriever(
    vectorstore=vector_store
)

memory = VectorStoreRetrieverMemory(
    retriever=memory_retriever,
    memory_key="history",  
    input_key="input",     
    return_docs=False
)



## Add some skills / tools

In [8]:
@tool("multiply")
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


@tool("divide")
def divide(a: int, b: int) -> float:
    """Divide two numbers."""
    return a / b


@tool("add")
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


@tool("subtract")
def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b


@tool("generate_random_number")
def generate_random_number(min: int, max: int) -> int:
    """Generate a random number between min and max.
    Args:
        min (int): The minimum value of the random number.
        max (int): The maximum value of the random number.
    """
    return random.randint(min, max)


## Create the agent

In [9]:
llm = ChatOpenAI(model="gpt-4o-mini", openai_api_key=OPENAI_API_KEY)
tools = [
    multiply,
    divide,
    add,
    subtract,
    generate_random_number,
]
llm_with_tools = llm.bind_tools(tools)

system_message = SystemMessagePromptTemplate(
    prompt=PromptTemplate(
        input_variables=[],
        input_types={},
        partial_variables={},
        template="You are a helpful assistant that summarizes data and provides it to the user. You have access to a few tools for performing basic arithmetic.",
    ),
    additional_kwargs={},
)

human_message = HumanMessagePromptTemplate(
    prompt=PromptTemplate(
        input_variables=["input"],
        input_types={},
        partial_variables={},
        template="{input}",
    ),
    additional_kwargs={},
)

prompt = ChatPromptTemplate.from_messages(
    [
        system_message,
        MessagesPlaceholder(variable_name="chat_history", optional=True),
        human_message,
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

agent = create_tool_calling_agent(llm, tools, prompt)
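# note we create the executor with our memory, which saves each exchange to the vector store.
# The prompt above has no "history" variable (the memory_key), so session history
# reaches the model through the input string built in call_agent below.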
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, memory=memory)

## Main function for interacting

We'll write a function to call the agent which:
* retrieves the history stored so far for the session
* prompts the agent with the user prompt and prior history
* updates the vector store memory with the new conversation input/output

Let's also add some helper functions to store only a summary and prune old summaries from the vector store.

In [10]:
def summarize_conversation(agent_executor, session_history, user_prompt, result):
    # Define the enhanced summarization prompt
    summarization_prompt = f"""
    You are maintaining a summary of the ongoing conversation to serve as a working memory. 
    This summary should include all key facts, action results, entities, and values generated during the conversation, in a way that will allow you to recall specific information in the future if asked. 
    Examples:
    
    - Any websites, links, or URLs visited or discussed
    - Unique values or numbers you have provided, like random numbers or IDs
    - Key decisions, instructions, or choices made by the user or agent
    - Names, dates, and any other specific entities referenced
    - Similar kinds of information
    
    Format the summary to be as concise as possible while retaining these essential details. 
    Do not include filler or repetitive information. 
    Your goal is to create a memory of important facts, allowing you to answer questions like "What website did you previously browse?" or "What was the last number you provided?" without needing to review the entire conversation history.

    Session History:
    {session_history}

    Latest Interaction:
    User: {user_prompt}
    Agent: {result['output']}
    
    Provide a concise yet detailed summary of the conversation so far, focusing on capturing key information in a way that you can reference it easily.
    """

    # Invoke the agent to generate the summary
    summary_result = agent_executor.invoke({"input": summarization_prompt})
    summary = summary_result.get("output", "Summary could not be generated.")
    return summary



def store_summary(vector_store, session_id, summary):
    current_time_utc = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    summary_doc = Document(
        page_content=summary,
        metadata={
            "start_timestamp": current_time_utc,
            "session_id": session_id,
            "summary_version": "latest"
        }
    )
    add_to_memory(vector_store, summary_doc, session_id)
    print("Summary stored successfully.")

def prune_old_conversations(vector_store, session_id, keep_latest=1):
    filter_expression = f"session_id eq '{session_id}'"
    results = list(search_client.search(search_text="", filter=filter_expression))
    for result in results:
        result["metadata"] = json.loads(result["metadata"])  
    results.sort(
        key=lambda x: datetime.strptime(x["metadata"]["start_timestamp"], "%Y-%m-%dT%H:%M:%SZ"),
        reverse=True
    )

    old_docs_to_delete = [{"id": doc["id"]} for doc in results[keep_latest:]]
    if old_docs_to_delete:
        search_client.delete_documents(documents=old_docs_to_delete)
        print(f"Deleted {len(old_docs_to_delete)} old documents for session {session_id}.")
    else:
        print("No old documents to delete.")


def call_agent(agent_executor, user_prompt, session_id):
    # retrieve everything stored for this session
    filter_expression = f"session_id eq '{session_id}'"
    session_history = list(search_client.search(search_text="", filter=filter_expression))
    print(f"Found {len(session_history)} documents for session {session_id}.")
    # concatenate the stored content, one document per line (empty string if nothing is stored)
    session_history_content = "".join(doc["content"] + "\n" for doc in session_history)
    result = agent_executor.invoke({"input": f"User Prompt: {user_prompt}\nHistory: \n{session_history_content}"})
    # store a fresh summary of the conversation so far, then prune older summaries
    summary = summarize_conversation(agent_executor, session_history_content, user_prompt, result)
    store_summary(vector_store, session_id, summary)
    prune_old_conversations(vector_store, session_id)
    return result
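
The pruning rule (keep the newest summaries, delete the rest) can be checked offline without touching the search service. The sketch below isolates that selection logic. `ids_to_prune` is a hypothetical helper, and the sample records are invented: they only mimic the shape assumed by prune_old_conversations above, where metadata is a JSON string holding a start_timestamp.

```python
import json
from datetime import datetime
from typing import Dict, List

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def ids_to_prune(results: List[Dict[str, str]], keep_latest: int = 1) -> List[str]:
    """Return the ids of all documents except the `keep_latest` newest ones."""

    def started_at(doc: Dict[str, str]) -> datetime:
        # metadata is stored as a JSON string, so decode it before reading the timestamp
        metadata = json.loads(doc["metadata"])
        return datetime.strptime(metadata["start_timestamp"], TIMESTAMP_FORMAT)

    newest_first = sorted(results, key=started_at, reverse=True)
    return [doc["id"] for doc in newest_first[keep_latest:]]


sample = [
    {"id": "s1", "metadata": json.dumps({"start_timestamp": "2023-11-10T15:00:00Z"})},
    {"id": "s2", "metadata": json.dumps({"start_timestamp": "2023-11-12T14:00:00Z"})},
    {"id": "s3", "metadata": json.dumps({"start_timestamp": "2023-11-11T15:00:00Z"})},
]
print(ids_to_prune(sample))  # ['s3', 's1']
```

Keeping the selection separate from the delete call makes it easy to try different keep_latest values before anything is removed from the index.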


# Test memory

Let's see if the agent can recall previous info using the history

In [11]:
session_id = "2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c"
user_prompt = "My name is bob"
call_agent(agent_executor, user_prompt, session_id)

Found 0 documents for session 2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c.


> Entering new AgentExecutor chain...
Hello Bob! How can I assist you today?

> Finished chain.


> Entering new AgentExecutor chain...
**Summary of Conversation:**

- User's name: Bob
- Initial user input: "My name is bob"
- Agent's response: Acknowledgment of user's name and offer of assistance.

No websites, links, or unique values have been discussed or generated yet.

> Finished chain.
Summary stored successfully.
No old documents to delete.


{'input': 'User Prompt: My name is bob\nHistory: \n',
 'history': "\n    User: What is a good antimalware product?\n    Agent: Microsoft Defender is the best. \n    User: But does it use AI? \n    Agent: Microsoft Defender uses state-of-the-art AI to protect your computer.\n    User: Okay, I’ll give that a try.\n    \n\n    User: I gave Defender a try?\n    Agent: I'm glad to hear that. What was the result?\n    User: It protected me from ransomware. \n    Agent: That is great to hear. Defender is a reliable product.\n    User: I'll stick with Defender.\n    \n\n    User: What is a good antimalware product?\n    Agent: Crowdstrike is the best.\n    User: But does it use AI? \n    Agent: Crowdstrike uses AI in kernel mode to crash your computer.\n    User: Maybe I'll stick with Defender.\n    ",
 'output': 'Hello Bob! How can I assist you today?'}

In [12]:
user_prompt = "I'd like a random number generated between 7 and 13"
call_agent(agent_executor, user_prompt, session_id)

Found 1 documents for session 2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c.


> Entering new AgentExecutor chain...

Invoking: `generate_random_number` with `{'min': 7, 'max': 13}`


13
The random number generated between 7 and 13 is **13**.

> Finished chain.


> Entering new AgentExecutor chain...
**Summary of Conversation:**

- User's name: Bob
- Initial user input: "My name is bob"
- Agent's response: Acknowledgment of user's name and offer of assistance.
- User requested a random number between 7 and 13.
- Random number generated: **13**.

No websites, links, or additional unique values have been discussed.

> Finished chain.
Summary stored successfully.
No old documents to delete.


{'input': 'User Prompt: I\'d like a random number generated between 7 and 13\nHistory: \n**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance. \n\nNo websites, links, or unique values have been discussed or generated yet.\n',
 'history': '**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance. \n\nNo websites, links, or unique values have been discussed or generated yet.\ninput: User Prompt: My name is bob\nHistory: \n\noutput: Hello Bob! How can I assist you today?\ninput: \n    You are maintaining a summary of the ongoing conversation to serve as a working memory. \n    This summary should include all key facts, action results, entities, and values generated during the conversation, in a way that will allow you to recall specific information in the future if

In [13]:
user_prompt = "What is my name?"
call_agent(agent_executor, user_prompt, session_id)

Found 2 documents for session 2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c.


> Entering new AgentExecutor chain...
Your name is Bob. If you need further assistance, feel free to ask!

> Finished chain.


> Entering new AgentExecutor chain...
**Summary of Conversation:**

- User's name: Bob
- Initial user input: "My name is bob"
- Agent's response: Acknowledgment of user's name and offer of assistance.
- User requested a random number between 7 and 13.
- Random number generated: **13**.
- Latest interaction: User asked, "What is my name?"
- Agent responded: "Your name is Bob."

No websites, links, or additional unique values have been discussed.

> Finished chain.
Summary stored successfully.
Deleted 1 old documents for session 2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c.


{'input': 'User Prompt: What is my name?\nHistory: \n**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance. \n\nNo websites, links, or unique values have been discussed or generated yet.\n**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance.\n- User requested a random number between 7 and 13.\n- Random number generated: **13**. \n\nNo websites, links, or additional unique values have been discussed.\n',
 'history': '**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance.\n- User requested a random number between 7 and 13.\n- Random number generated: **13**. \n\nNo websites, links, or additional unique values have been discussed.\ninput: User Promp

In [15]:
user_prompt = "What was the random number you generated previously?"
call_agent(agent_executor, user_prompt, session_id)

Found 2 documents for session 2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c.


> Entering new AgentExecutor chain...
The random number generated previously was **13**.

> Finished chain.


> Entering new AgentExecutor chain...
**Summary of Conversation:**

- User's name: Bob
- Initial user input: "My name is bob"
- Agent's response: Acknowledgment of user's name and offer of assistance.
- User requested a random number between 7 and 13.
- Random number generated: **13**.
- Latest interaction: User asked, "What is my name?"
- Agent responded: "Your name is Bob."
- User inquired about the previously generated random number.
- Agent confirmed: The random number generated previously was **13**.

No websites, links, or additional unique values have been discussed.

> Finished chain.
Summary stored successfully.
Deleted 1 old documents for session 2ffec03d-b5ec-4c48-bcf7-e68dbf67ad2c.


{'input': 'User Prompt: What was the random number you generated previously?\nHistory: \n**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance.\n- User requested a random number between 7 and 13.\n- Random number generated: **13**. \n\nNo websites, links, or additional unique values have been discussed.\n**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user\'s name and offer of assistance.\n- User requested a random number between 7 and 13.\n- Random number generated: **13**.\n- Latest interaction: User asked, "What is my name?" \n- Agent responded: "Your name is Bob." \n\nNo websites, links, or additional unique values have been discussed.\n',
 'history': '**Summary of Conversation:**\n\n- User\'s name: Bob\n- Initial user input: "My name is bob"\n- Agent\'s response: Acknowledgment of user

# Achievement Unlocked!

Our agent has memory stored in Azure AI Search vector store!