
Issue: trying to call generate_queries() on a MultiQueryRetriever but where do I get a run_manager from? #9231

Closed
maspotts opened this issue Aug 14, 2023 · 5 comments
Labels
Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature

Comments

@maspotts
Contributor

maspotts commented Aug 14, 2023

Issue you'd like to raise.

I'm trying to use MultiQueryRetriever to generate variations on a question, and it seems to work. However, I can't use it inside a chain created with load_qa_with_sources_chain(), because that chain expects a list of input_documents rather than a retriever. I also don't want to switch to RetrievalQAWithSourcesChain, because I want to keep implementing my own similarity search (e.g. I support switches to select between index.similarity_search() and index.max_marginal_relevance_search() on my index, and to specify the number of matches, k).

So I figured I could just call generate_queries() on my MultiQueryRetriever instance and then manually run my load_qa_with_sources_chain() chain for each variation. However, that method requires a run_manager, and I can't figure out how to create one. I already have the load_qa_with_sources_chain() chain: can I get a run_manager from that? Or, more generally, what's the best way to use MultiQueryRetriever while maintaining one's own code for fetching matching text snippets?

Suggestion:

No response

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature labels Aug 14, 2023
@dosubot

dosubot bot commented Aug 14, 2023

🤖

Hello,

To create a run_manager for the generate_queries() method in a MultiQueryRetriever instance, you need to create an instance of CallbackManagerForRetrieverRun. This class is used to manage callbacks during the execution of the retriever run. Here is an example of how you can do this:

from langchain.callbacks.manager import CallbackManagerForRetrieverRun
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain.llms.base import BaseLLM
from langchain.retrievers.base import BaseRetriever

# Placeholders: BaseLLM and BaseRetriever are abstract base classes and
# cannot be instantiated directly, so use concrete implementations here
llm = ...  # e.g. a ChatOpenAI instance
retriever = ...  # e.g. vectordb.as_retriever()

# Create an instance of MultiQueryRetriever
multi_query_retriever = MultiQueryRetriever.from_llm(retriever, llm)

# Create an instance of CallbackManagerForRetrieverRun
run_manager = CallbackManagerForRetrieverRun()

# Use the run_manager in generate_queries method
queries = multi_query_retriever.generate_queries("your question", run_manager)

For fetching matching text snippets and using load_qa_with_sources_chain(), you would need to implement your own code. The provided context does not include any information about load_qa_with_sources_chain(), so I can't provide a specific example of how to use it. However, you would typically use it in a similar way to generate_queries(), passing in the necessary arguments and using the returned results in your application.
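The manual loop you describe could be sketched roughly as follows. This is a hypothetical helper, not LangChain API: `index` and `chain` are stand-ins for your own vector index and the chain returned by load_qa_with_sources_chain(), and the chain-call signature follows the usual {"input_documents": ..., "question": ...} convention:

```python
# Hypothetical helper: run the QA chain once per generated query variation,
# using your own similarity search to fetch documents for each one.
def run_chain_per_query(queries, index, chain, k=4):
    results = []
    for q in queries:
        # swap in index.max_marginal_relevance_search(q, k=k) if preferred
        docs = index.similarity_search(q, k=k)
        results.append(chain({"input_documents": docs, "question": q}))
    return results
```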

If you want to use a custom prompt and output parser, you can create an instance of LLMChain with your custom prompt and output parser, and then pass this LLMChain instance to the MultiQueryRetriever. Here is an example:

from typing import List
from langchain import LLMChain
from pydantic import BaseModel, Field
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser
from langchain.retrievers.multi_query import MultiQueryRetriever

# Output parser will split the LLM result into a list of queries
class LineList(BaseModel):
    # "lines" is the key (attribute name) of the parsed output
    lines: List[str] = Field(description="Lines of text")

class LineListOutputParser(PydanticOutputParser):
    def __init__(self) -> None:
        super().__init__(pydantic_object=LineList)

    def parse(self, text: str) -> LineList:
        lines = text.strip().split("\n")
        return LineList(lines=lines)

output_parser = LineListOutputParser()

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate five 
    different versions of the given user question to retrieve relevant documents from a vector 
    database. By generating multiple perspectives on the user question, your goal is to help
    the user overcome some of the limitations of the distance-based similarity search. 
    Provide these alternative questions separated by newlines.
    Original question: {question}""",
)
llm = ChatOpenAI(temperature=0)

# Chain
llm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT, output_parser=output_parser)

# Other inputs
question = "What are the approaches to Task Decomposition?"

# Run ("lines" is the key (attribute name) of the parsed output;
# vectordb is assumed to be an existing vector store instance)
retriever = MultiQueryRetriever(
    retriever=vectordb.as_retriever(), llm_chain=llm_chain, parser_key="lines"
)
# Results
unique_docs = retriever.get_relevant_documents(
    query="What does the course say about regression?"
)
len(unique_docs)

I hope this helps! If you have any other questions, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@maspotts
Contributor Author

I see that CallbackManagerForRetrieverRun() actually requires 3 arguments: run_id, handlers and inheritable_handlers, so I am still not able to run generate_queries() on my MultiQueryRetriever instance. Does anyone know if I'm barking up the wrong tree here?

@maspotts
Contributor Author

maspotts commented Aug 15, 2023

I'm wondering whether I need to define a subclass of MultiQueryRetriever, in which I override _get_relevant_documents() to incorporate my own similarity search code?


dosubot bot commented Nov 14, 2023

Hi, @maspotts! I'm Dosu, and I'm here to help the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you are trying to use the MultiQueryRetriever in LangChain but are unsure how to obtain a run_manager for the generate_queries() method. I provided an example of how to create a run_manager using CallbackManagerForRetrieverRun. You then asked for clarification on the required arguments for CallbackManagerForRetrieverRun. Additionally, you mentioned considering overriding _get_relevant_documents() in a subclass of MultiQueryRetriever to incorporate your own similarity search code.

Could you please let us know if this issue is still relevant to the latest version of the LangChain repository? If it is, please comment on this issue to let us know. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding, and please don't hesitate to reach out if you have any further questions or concerns!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 14, 2023
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 21, 2023
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 21, 2023
@forviaVision

Hello, did this issue ever get resolved? I'm still running into the same problem with the three required arguments (run_id, handlers and inheritable_handlers). Thank you :-)
