# Query Enhancement - Query Expansion Techniques

In a RAG pipeline, the quality of the query sent to the retriever determines the quality of the retrieved context and, in turn, the accuracy of the final LLM answer.

In [13]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_ollama import ChatOllama
from langchain_core.prompts import PromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableMap

In [14]:
# Step 1: Load and split
loader = TextLoader("langchain_crewai_dataset.txt")
text = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(text)
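
To make the effect of `chunk_size=300` and `chunk_overlap=50` concrete, here is a minimal sketch of fixed-window chunking. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this sketch ignores. The helper name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 700-character sample, used only to show the window arithmetic.
sample = "".join(str(i % 10) for i in range(700))
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])
print(pieces[0][-50:] == pieces[1][:50])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.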

In [15]:
# Step 2: Embedding
embedding = HuggingFaceEmbeddings(model="all-MiniLM-L6-v2")

# Step 3: Vector store
vectorstore = FAISS.from_documents(
    embedding=embedding,
    documents=chunks
)

# Step 4: Retriever
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={'k': 5, 'lambda_mult': 0.7}
)
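
The retriever uses maximal marginal relevance (MMR), where `lambda_mult` trades relevance against diversity: values near 1 favour similarity to the query, values near 0 favour results that differ from those already selected. The sketch below shows the greedy selection rule on toy 2-D vectors. It is an illustration under simplifying assumptions (pure-Python cosine similarity, no candidate pre-fetch), not the code FAISS or LangChain runs, and `mmr_select` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, k=5, lambda_mult=0.7):
    """Greedily pick up to k document indices by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalise similarity to anything already chosen.
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy vectors: documents 0 and 1 are near-duplicates, document 2 is different.
query_vec = [1.0, 0.0]
doc_vecs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.7))
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.3))
```

With these toy vectors, the higher `lambda_mult` keeps the near-duplicate as the second pick, while the lower value swaps it for the more distinct document.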

In [16]:
# Step 5: LLM
llm = ChatOllama(
    model="gemma2:9b-instruct-q4_K_M",
    num_ctx=32768,
    reasoning=False
)

# Step 6: Query Expansion
query_expansion_prompt = PromptTemplate.from_template("""
You are a helpful assistant. Expand the following query to improve document retrieval by adding relevant synonyms, technical terms and useful context.


Original query: "{query}"

Expanded query:
"""
)

query_expension_chain = query_expansion_prompt | llm | StrOutputParser()

In [17]:
from IPython.display import Markdown

display(Markdown(query_expension_chain.invoke({"query": "Langchain memory"})))

## Expanded Queries for "Langchain Memory": 

Here are a few expanded queries, each focusing on a different aspect of LangChain memory:

**1. General Understanding:**

* **"LangChain memory management techniques"** (Includes technical terms and focuses on the methods used)
* **"How does LangChain store and access memory?"** (Clarifies the action and purpose of memory)
* **"LangChain persistent memory vs. in-context memory"** (Differentiates between types of memory)

**2. Specific Use Cases:**

* **"LangChain memory for chatbot applications"** (Focuses on a specific application)
* **"Using LangChain memory for dialogue history tracking"** (Provides a concrete use case)
* **"Improving LangChain agent performance with memory"** (Highlights the benefits of memory)

**3. Implementation Details:**

* **"LangChain memory integration with vector databases"** (Mentions a specific technology)
* **"LangChain memory examples and code snippets"** (Seeks practical implementation guidance)
* **"Best practices for designing LangChain memory systems"** (Focuses on optimization and design)


When choosing the best expanded query, consider your specific needs and what information you're looking for. 
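
The expansion above comes back as formatted Markdown with headings, labels, and commentary, not as a single query string. One way to recover only the candidate queries is to pull out the bold, quoted phrases. The sketch below assumes the two layouts seen in this notebook's outputs (`**"..."**` and `"**...**"`); the model's formatting can change between runs, so treat the pattern and the helper name `extract_quoted_queries` as assumptions, not a general parser.

```python
import re

# Matches **"query"** or "**query**" on a single line.
QUOTED_BOLD = re.compile(r'\*\*"(.+?)"\*\*|"\*\*(.+?)\*\*"')


def extract_quoted_queries(expansion_markdown: str) -> list[str]:
    """Return the bold, quoted phrases from the model's Markdown reply."""
    queries = []
    for match in QUOTED_BOLD.finditer(expansion_markdown):
        query = (match.group(1) or match.group(2)).strip()
        if query:
            queries.append(query)
    return queries


sample_output = '''## Expanded Queries for "Langchain Memory":

**1. General Understanding:**

* **"LangChain memory management techniques"** (Includes technical terms and focuses on the methods used)
* **"How does LangChain store and access memory?"** (Clarifies the action and purpose of memory)

"**LangChain persistent memory vs. in-context memory**"
'''

queries = extract_quoted_queries(sample_output)
for q in queries:
    print(q)
```

Headings such as `**1. General Understanding:**` are skipped because they carry no quotation marks.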




In [18]:
# RAG answering prompt
answer_prompt = PromptTemplate.from_template("""
Answer the question based on the context below.

Context:
{context}

Question: {input}
"""
)

In [19]:
# Chains
document_chain = create_stuff_documents_chain(
    llm,
    answer_prompt
)
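
Conceptually, a "stuff" documents chain concatenates the retrieved passages into the `{context}` slot of the prompt and sends the result to the LLM in one call. The sketch below reproduces only that string assembly with plain Python, using the same template text as `answer_prompt`. The helper `stuff_prompt` and the blank-line separator are assumptions for illustration, and no model is called.

```python
ANSWER_TEMPLATE = """Answer the question based on the context below.

Context:
{context}

Question: {input}
"""


def stuff_prompt(passages: list[str], question: str, separator: str = "\n\n") -> str:
    """Join all passages into one context block and fill the answer template."""
    context = separator.join(passages)
    return ANSWER_TEMPLATE.format(context=context, input=question)


# Placeholder passages, not taken from the dataset.
prompt_text = stuff_prompt(["Passage one.", "Passage two."], "What do the passages say?")
print(prompt_text)
```

Because every passage goes into a single prompt, the number and size of retrieved chunks must fit within the model's context window.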

In [20]:
# Run query
def full_query_decomposition_rag_pipeline(user_query):
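    # Expand the query (the function name says "decomposition", but the chain
    # used here is the query-expansion chain defined above).
    sub_qs_text = query_expension_chain.invoke({"query": user_query})
    # Treat every non-empty line of the reply as a sub-query, after stripping
    # list markers and numbering. Headings and labels are kept as well.
    sub_questions = [q.strip("-*1234567890. ").strip() for q in sub_qs_text.split("\n") if q.strip()]

    results = []

    for subq in sub_questions:
        # Retrieve context for this sub-query and answer it on its own
        docs = retriever.invoke(subq)
        result = document_chain.invoke({'input': subq, 'context': docs})
        results.append(f"Q: {subq}\nA: {result}")
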
    return "\n\n".join(results)     
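
The function above answers each sub-query separately and concatenates the answers. An alternative design is to pool the passages retrieved for all sub-queries, drop duplicates, and answer the original question once over the merged context. The sketch below shows only the pooling step, with a dictionary lookup standing in for `retriever.invoke`. The helper `retrieve_unique` and the sample passages are hypothetical.

```python
from typing import Callable, Iterable


def retrieve_unique(sub_queries: Iterable[str], retrieve: Callable[[str], list[str]]) -> list[str]:
    """Call retrieve for each sub-query and keep each passage once, in first-seen order."""
    seen = set()
    merged = []
    for sub_query in sub_queries:
        for passage in retrieve(sub_query):
            if passage not in seen:
                seen.add(passage)
                merged.append(passage)
    return merged


# Made-up index: sub-query -> passages a retriever might return for it.
fake_index = {
    "langchain memory": ["Passage about memory modules.", "Passage about agent tools."],
    "crewai agents": ["Passage about role-based agents.", "Passage about agent tools."],
}

merged = retrieve_unique(fake_index, lambda q: fake_index[q])
print(merged)
```

With real `Document` objects, the same idea applies by de-duplicating on `page_content` instead of on the string itself.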

In [22]:
# Step 7: Run the pipeline
query = "How does LangChain use memory and agents compared to CrewAI?"
final_answer = full_query_decomposition_rag_pipeline(query)
print("Final Answer:\n")
display(Markdown(final_answer))

Final Answer:



Q: ## Expanded Query Options:
A: The provided text focuses on LangChain's integration with vector databases for semantic search and its role in Retrieval-Augmented Generation (RAG). 

It doesn't directly mention "Expanded Query Options."  To understand what that refers to, we need more context. 

**Here are some possibilities based on the given information:**

* **Query Expansion Techniques:** LangChain could leverage vector databases to expand user queries by suggesting related terms or concepts, improving the accuracy of search results within large document corpora.
* **Multi-Modal Queries:**  The text mentions LangChain's integration with various tools like code execution environments and APIs. "Expanded Query Options" might refer to the ability to construct queries that combine text with other data types (e.g., code snippets, images) for richer search capabilities.
* **Advanced Search Operators:** LangChain could offer advanced search operators beyond basic keyword matching, allowing users to refine their queries based on semantic relationships or other criteria.


Let me know if you have more context about "Expanded Query Options," and I can give you a more precise answer. 


Q: Here are a few expanded query options, each focusing on different aspects:
A: The provided context focuses on the capabilities of LangChain, particularly its ability to integrate with various tools like web search, calculators, code execution environments, and custom APIs. 

It also highlights LangChain's support for hybrid retrieval, combining keyword-based and embedding-based methods for more comprehensive content retrieval.


However,  the question you posed is incomplete. It starts by listing "expanded query options" but doesn't provide the actual options themselves. 

Please provide the complete list of expanded query options so I can answer your question accurately. 


Q: Option 1 (Focus on Technical Details):
A: The provided context repeatedly lists "like web search, calculators, code execution environments, and custom APIs" across various versions (v1, v2, v6, v9).  

Based on this repetition, it seems likely that these are examples of the types of functionalities or tools that CrewAI offers or integrates with. 


Let me know if you'd like me to elaborate on any of these examples or explore potential uses for CrewAI based on this information. 


Q: "**Comparison of LangChain and CrewAI in terms of memory management and agent architectures. What specific memory mechanisms (e.g., vector stores, RNNs) are used in each framework? How do their agent design patterns (e.g., prompt engineering, tool integration) differ?**"
A: The provided context highlights the compatibility between LangChain and CrewAI, emphasizing that they work best as complementary tools in a hybrid system. 

Here's a breakdown based on the information given and common knowledge about each framework:

**LangChain:**

* **Memory Management:** LangChain leverages various memory mechanisms depending on the use case.
    *  **Vector Stores:** Commonly used for storing and retrieving contextual information as embeddings. Popular choices include FAISS, Pinecone, Chroma, etc.
    *  **RNNs (Recurrent Neural Networks):** Can be integrated for maintaining sequential memory within an agent's execution flow. However, RNNs can be resource-intensive and may not scale as well as vector stores for large amounts of context.
* **Agent Architecture:** LangChain agents utilize a planner-executor model:
    * **Planner:** Determines the sequence of tool invocations needed to achieve the desired goal. This often involves decision-making based on the current context and potential outcomes.
    * **Executor:** Executes the planned steps, interacting with tools and potentially incorporating feedback from previous steps.

**CrewAI:**

*  **Memory Management:** The context doesn't specify the exact memory mechanisms used by CrewAI. However, given its focus on role-based collaboration, it likely employs some form of persistent memory to track individual roles, their tasks, and ongoing conversations within a collaborative workflow.
* **Agent Architecture:** While not explicitly detailed, CrewAI likely uses a more structured agent design pattern tailored for its role-based collaboration approach. This could involve:
    *  **Role Definitions:** Clearly defining the responsibilities and capabilities of each role within the system.
    *  **Task Assignment:** Mechanisms for assigning tasks to specific roles based on their expertise or current workload.
    *  **Communication Channels:** Facilitating communication and information sharing between different roles.

**Key Differences:**

* **Focus:** LangChain emphasizes general-purpose agent development with a flexible tool integration framework, while CrewAI specializes in structured collaboration among multiple agents.
* **Memory Emphasis:** LangChain offers more explicit control over memory mechanisms through vector stores and RNNs, whereas CrewAI's memory management is likely geared towards maintaining role-specific context within collaborative workflows.


Let me know if you have any other questions!

Q: Option 2 (Focus on Use Cases):
A: Based on the context provided, CrewAI is particularly beneficial for complex tasks involving multiple steps that can be broken down into specialized components.  

Here are some examples of use cases highlighted in the text:

* **Market research:** Gathering and analyzing data from various sources, potentially involving different agents specializing in data collection, analysis, and interpretation.
* **Legal document analysis:** Breaking down the process into stages like document extraction, keyword identification, legal term recognition, and summarization, each handled by specialized agents.
* **Product development:**  Managing different stages of the product lifecycle, from ideation and design to testing and launch, with agents collaborating on specific aspects.
* **Coding assistants:** Assisting developers with code generation, debugging, documentation, and other tasks through the combined expertise of multiple agents. 


Essentially, CrewAI shines when tasks require collaboration, specialization, and the ability to manage complex workflows involving multiple interconnected steps. 


Q: "**How do LangChain and CrewAI handle long-term context and conversational memory in applications like chatbots or interactive storytelling? Which framework is more suitable for building agents that can learn from past interactions and perform complex multi-step tasks?**"
A: Here's a breakdown of how LangChain and CrewAI address context and conversational memory, along with which might be better suited for your needs:

**LangChain:**

* **Context Handling:**  LangChain leverages its memory modules (ConversationBufferMemory and ConversationSummaryMemory) to manage context. 
    * `ConversationBufferMemory` stores a history of previous turns, ideal for remembering recent details in a conversation.
    * `ConversationSummaryMemory` summarizes past interactions, useful for maintaining a broader understanding of the conversation flow, especially when dealing with lengthy exchanges.

* **Agent Capabilities:** LangChain agents excel at complex multi-step tasks thanks to their planner-executor model. They can:
    * Plan a sequence of actions (tool invocations) to achieve a goal.
    * Make dynamic decisions based on the current context and past interactions.
    * Employ branching logic to handle different scenarios.
    * Utilize memory effectively across multiple steps.

**CrewAI:**

* **Context Handling:** While CrewAI's documentation doesn't explicitly detail specific memory mechanisms, its focus on role-based collaboration implies an implicit form of contextual awareness within each role's interactions. 
* **Agent Capabilities:** CrewAI shines in scenarios requiring coordinated teamwork and diverse perspectives. Its strength lies in managing multiple agents (roles) that collaborate to solve problems.

**Which is More Suitable?**

* **For chatbots or interactive storytelling with a focus on remembering specific details and navigating complex conversations:** LangChain's memory modules and agent capabilities are highly suitable.
* **For applications requiring teamwork, coordination, and diverse perspectives:** CrewAI's role-based collaboration framework might be a better fit. 


Let me know if you have any other questions!

Q: Option 3 (Focus on Development Workflow):
A: Please provide the options for "Option 3 (Focus on Development Workflow)".  I need the choices to answer the question accurately. 


For example, your options might look something like this:

* **Option 3:** LangChain is best suited for building static websites.
* **Option 3:** LangChain's modularity makes it easy to build and maintain complex LLM applications.
* **Option 3:**  LangChain is primarily used for creating chatbot interfaces. 


Let me know the options, and I'll be happy to help! 


Q: "**What are the key differences in developing and deploying agents using LangChain and CrewAI?  How do their respective SDKs and tooling support memory management, agent training, and integration with external APIs?**"
A: While the provided context highlights the compatibility between LangChain and CrewAI, it doesn't delve into specific details about their SDKs, tooling, or how they handle memory management, agent training, and API integration. 

Here's a breakdown based on general knowledge about each platform:

**LangChain:**

* **Focus:**  LangChain is primarily focused on providing the infrastructure for building applications powered by language models. It excels at agent development, offering tools for defining agents, managing their memory, selecting tools, and orchestrating their actions.
* **SDK and Tooling:** LangChain provides a Python SDK with a modular design. This allows developers to easily build agents by chaining together different components like LLMs, prompts, tools, and memory management systems. 
* **Memory Management:** LangChain supports various memory mechanisms like conversation memory, vector stores, and external databases, allowing agents to retain context across interactions.
* **Agent Training:** LangChain doesn't inherently provide a training framework for agents. However, its flexibility allows integration with other training libraries or techniques.
* **API Integration:** LangChain makes it straightforward to integrate with external APIs through its tool wrapper system. Developers can define custom tools that interact with APIs and use them within their agents.

**CrewAI:**

* **Focus:** CrewAI specializes in building collaborative AI systems, particularly focusing on role-based interactions.
* **SDK and Tooling:** CrewAI likely provides a SDK tailored for creating agents with defined roles and responsibilities. The tooling may focus on defining these roles, managing their interactions, and potentially visualizing the collaboration flow.
* **Memory Management:** CrewAI's memory management is probably designed to support role-based context. Each role might have its own memory space or mechanisms for sharing information between roles.
* **Agent Training:**  CrewAI could offer specialized training methods for role-based agents, perhaps incorporating techniques like reinforcement learning to optimize collaboration.
* **API Integration:** CrewAI's API integration capabilities would likely be geared towards supporting the needs of collaborative agents, potentially enabling seamless communication with external tools or APIs relevant to each role.


**In Summary:**

To get a precise answer about the differences in developing and deploying agents using LangChain and CrewAI, you'd need to consult their respective documentation, tutorials, and community resources. 


Q: Tips for further expansion:
A: Based on the context provided, here are some tips for further expanding LangChain's capabilities:

* **Enhance Planner Capabilities:** 

    * Explore more sophisticated planning algorithms to enable agents to handle complex, multi-step tasks with greater accuracy and efficiency.
    * Incorporate machine learning techniques into the planner to learn from past interactions and improve future plan generation.

* **Expand Tool Integration:**  
    * Develop a standardized interface for integrating new tools and APIs, making it easier for developers to connect LangChain agents with a wider range of external resources.
    * Build specialized tool libraries tailored for specific domains (e.g., finance, healthcare) to simplify agent development in those areas.

* **Improve Context Management:**

    * Research advanced memory mechanisms that can effectively handle long-term context and relationships between steps in an agent's execution.
    * Explore techniques for summarizing and highlighting key information from past interactions to aid decision-making in subsequent steps.

* **Promote Reusability and Collaboration:** 

    * Create a platform or registry where developers can share pre-built agent templates, tool integrations, and planning strategies.
    * Develop standardized formats for representing agent knowledge and workflows, enabling easier sharing and modification.


By focusing on these areas, LangChain can become even more powerful and versatile, empowering developers to build sophisticated AI agents that solve real-world problems across diverse domains. 


Q: Specify your level of expertise:** Are you a beginner or an experienced developer? This can help tailor the query to your needs
A: Based on the context provided, it seems like I am designed to be helpful for developers of all levels. 

The information mentions features like:

* **LangChain's ability to connect LLMs with tools:** This suggests a focus on helping developers build applications using large language models.
* **Support for various tools and APIs:**  This indicates a broad range of potential use cases, catering to both beginners exploring basic integrations and experienced developers building complex systems.


Therefore, I can't definitively say whether I am geared towards beginners or experienced developers. My capabilities seem versatile enough to assist both. 


Q: Mention specific use cases:** If you have a particular application in mind, include it in the query (e.g., "customer service chatbot," "research assistant")
A: According to the text, CrewAI is especially useful in these multi-step workflows:

* **Market research:** Agents could collaborate to gather data from various sources, analyze trends, and generate reports. 
* **Legal document analysis:** Agents could specialize in different aspects of legal documents, such as identifying key clauses, extracting relevant information, and summarizing findings.
* **Product development:** Agents could assist with tasks like brainstorming ideas, researching competitors, prototyping designs, and managing project timelines.
* **Coding assistants:** Agents could help with code generation, debugging, documentation, and testing. 


Let me know if you'd like to explore any of these use cases in more detail! 


Q: Identify key features:** Are there any specific memory mechanisms or agent capabilities that are particularly important to you?
A: Based on the context provided, these stand out as particularly important features:

* **ConversationBufferMemory and ConversationSummaryMemory:** The ability for an LLM to remember previous conversation turns is crucial for creating natural and coherent conversations.  `ConversationBufferMemory` stores the entire history, while `ConversationSummaryMemory` provides a condensed version, helpful for managing token limits in longer interactions.
* **Agent capabilities with planner-executor model:** LangChain's agents are powerful because they can:
    * **Reason and plan:** LLMs within agents can determine the best tools and steps to achieve a goal.
    * **Dynamic decision-making:** Agents can adapt their approach based on the evolving situation and tool outputs.
    * **Branching logic:**  Agents can handle complex workflows with multiple possible paths.
    * **Context-aware memory:**  Agents leverage memory effectively, using past information to make informed decisions in each step. 


These features combined allow for building truly interactive and intelligent applications that go beyond simple question-answering.

Q: Remember, the more specific and detailed your query is, the better the chances of retrieving relevant and useful information
A: This statement emphasizes the importance of crafting precise and specific queries when searching for information. 

Here's why:

* **Specificity filters noise:**  Broad queries often return a large volume of results, many of which may be irrelevant. A specific query narrows the search focus, leading to more targeted and useful results.
* **Understanding intent:**  A well-defined query clearly conveys your information needs. It helps search engines or systems understand exactly what you're looking for.

**Example:**

Instead of searching for "dogs," a more specific query like "golden retriever puppies for sale in San Francisco" will yield much more relevant results. 


Let me know if you have any other questions!