
Asking about uploaded documents only works when I first ask the AI about a specific one #9451

Closed
4 of 14 tasks
Zylon989 opened this issue Aug 18, 2023 · 2 comments
Labels
Ɑ: agent Related to agents module Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature

Comments

@Zylon989

System Info

Hello!
I wrote code that is very similar to this example:
https://python.langchain.com/docs/integrations/toolkits/document_comparison_toolkit

Who can help?

No response

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

from pydantic import BaseModel, Field

from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS


class DocumentInput(BaseModel):
    question: str = Field()


llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

tools = []
files = [
    # https://abc.xyz/investor/static/pdf/2023Q1_alphabet_earnings_release.pdf
    {
        "name": "alphabet-earnings",
        "path": "/Users/harrisonchase/Downloads/2023Q1_alphabet_earnings_release.pdf",
    },
    # https://digitalassets.tesla.com/tesla-contents/image/upload/IR/TSLA-Q1-2023-Update
    {
        "name": "tesla-earnings",
        "path": "/Users/harrisonchase/Downloads/TSLA-Q1-2023-Update.pdf",
    },
]

for file in files:
    loader = PyPDFLoader(file["path"])
    pages = loader.load_and_split()
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(pages)
    embeddings = OpenAIEmbeddings()
    retriever = FAISS.from_documents(docs, embeddings).as_retriever()
    # Wrap each retriever in a Tool (inside the loop, so every file gets one)
    tools.append(
        Tool(
            args_schema=DocumentInput,
            name=file["name"],
            description=f"useful when you want to answer questions about {file['name']}",
            func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever),
        )
    )

agent = initialize_agent(
    agent=AgentType.OPENAI_FUNCTIONS,
    tools=tools,
    llm=llm,
    verbose=True,
)

agent({"input": "did alphabet or tesla have more revenue?"})

Expected behavior

It works so far, but when I directly ask the AI to compare the documents, mentioning them by name, it does not know what I am talking about!
I first have to ask questions specifically about each document, mentioning their names separately; only then can the LLM compare them and answer my questions.
Any idea how to solve this issue?

I would appreciate any help! :)

cheers

@dosubot dosubot bot added Ɑ: agent Related to agents module Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature labels Aug 18, 2023

dosubot bot commented Aug 18, 2023

🤖

Hello,

From your code, it seems like you're trying to use the LangChain framework to compare two documents and answer questions about them. However, you're facing an issue where the AI doesn't recognize the documents unless you ask about them individually first.

This behavior might be due to the way the LangChain framework handles memory and document retrieval. As per the LangChain memory management, the AI model doesn't inherently know about the documents unless they are brought into the conversation context. When you ask a question about a specific document by name, the corresponding tool is invoked, and the document is loaded into the memory, allowing the AI to answer questions about it.

However, when you ask a question that involves comparing two documents, the AI might not be able to handle it correctly if both documents are not already in the memory. This is because each document is loaded into memory by a separate tool, and these tools are invoked individually when their specific document is mentioned.

To solve this issue, you might need to modify your code to ensure that both documents are loaded into memory at the start of the conversation. You could do this by invoking the tools for both documents at the start, or by modifying the tools to load the documents into memory when they are initialized.
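
An alternative worth sketching (my own suggestion, not from this thread: the helper name `tag_with_source` and the placeholder chunk strings are purely illustrative, not LangChain API) is to avoid one-tool-per-file entirely. If every chunk carries its file name in its metadata and all chunks go into one shared index, a comparison question retrieves from both PDFs in a single call. The pure-Python part of the idea:

```python
# Sketch only: `tag_with_source` and the sample chunks are illustrative.
# The idea: keep the file name in each chunk's metadata and build ONE index
# over everything, instead of one tool per file.

def tag_with_source(chunks_by_file):
    """Flatten {file_name: [chunk_text, ...]} into one list of
    {"text": ..., "metadata": {"source_name": ...}} records."""
    tagged = []
    for name, chunks in chunks_by_file.items():
        for chunk in chunks:
            tagged.append({"text": chunk, "metadata": {"source_name": name}})
    return tagged


docs = tag_with_source({
    "alphabet-earnings": ["<chunk from the Alphabet PDF>"],
    "tesla-earnings": ["<chunk from the Tesla PDF>"],
})
# With real Documents you would then build a single retriever over `docs`
# (e.g. FAISS.from_documents) and wrap it in ONE Tool, so the agent never
# has to pick the right per-file tool before it can compare the reports.
```

A single shared tool also keeps the agent's function-calling schema small, which the 0613 models may handle more reliably.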

Here's a simple example of how you might modify your code to load both documents into memory at the start:

# Initialize the tools and load the documents into memory
for file in files:
    loader = PyPDFLoader(file["path"])
    pages = loader.load_and_split()
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(pages)
    embeddings = OpenAIEmbeddings()
    retriever = FAISS.from_documents(docs, embeddings).as_retriever()
    # Wrap retrievers in a Tool
    tools.append(
        Tool(
            args_schema=DocumentInput,
            name=file["name"],
            description=f"useful when you want to answer questions about {file['name']}",
            func=RetrievalQA.from_chain_type(llm=llm, retriever=retriever),
        )
    )

# Prime each chain once; note that RetrievalQA expects a query, so calling
# tool.func() with no arguments would raise an error
for tool in tools:
    tool.func("Give a one-sentence summary of this document.")

# Now you can ask questions that involve comparing the two documents
agent({"input": "did alphabet or tesla have more revenue?"})

Please note that this is a simplified example and the actual implementation might need to be more complex, depending on your specific requirements and the details of how the LangChain framework handles memory and document retrieval.
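
A lighter-weight tweak worth trying first (my suggestion, not part of the bot's answer; the helper and wording below are illustrative) is to make each tool's description advertise comparisons. The OPENAI_FUNCTIONS agent picks tools purely from these description strings, so a description that only mentions one document gives the model no reason to call both tools for a cross-document question:

```python
# Illustrative only: richer descriptions give the agent a reason to call
# BOTH per-file tools when the question mentions more than one document.
def comparison_description(name, all_names):
    others = ", ".join(n for n in all_names if n != name)
    return (
        f"useful when you want to answer questions about {name}; "
        f"also call this tool when comparing {name} against {others}"
    )

names = ["alphabet-earnings", "tesla-earnings"]
desc = comparison_description("tesla-earnings", names)
print(desc)
```

Passing such a string as the `description=` argument of each `Tool` requires no other changes to the setup.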

I hope this helps! Let me know if you have any other questions.

Best,
Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.


dosubot bot commented Nov 18, 2023

Hi, @Zylon989! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you were experiencing an issue where the AI is unable to compare documents unless they are specifically mentioned by name. Dosubot suggested modifying the code to ensure that both documents are loaded into memory at the start of the conversation. This behavior is due to the way the LangChain framework handles memory and document retrieval. It seems that both you and Dosubot have reacted positively to this suggestion.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!
