How can Document metadata be passed into prompts? #1136

Closed

batmanscode opened this issue Feb 18, 2023 · 17 comments

Comments

@batmanscode

Here is an example:

  • I have created vector stores from several podcasts
  • metadata = {"guest": guest_name}
  • question = "which guests have talked about <topic>?"

Using VectorDBQA, this would be possible if {context} contained the text plus its metadata.

@batmanscode
Author

Another format for retrieving text with metadata could be:

TEXT: <what the guest said>
GUEST: <guest_name>

Or maybe even:

TEXT: <what the guest said>
METADATA: {"guest": guest_name}

This way, when asking questions, I could ask things like "what did <guest_name> say about <topic>?"
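
A minimal sketch of how such documents might be built, assuming the langchain Document class and a FAISS store (the transcript snippets and guest names below are made up):

from langchain.docstore.document import Document
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Hypothetical transcript snippets, each tagged with the guest who said them.
docs = [
    Document(
        page_content="I think open source models will catch up quickly.",
        metadata={"guest": "Alice"},
    ),
    Document(
        page_content="Vector databases are becoming commodity infrastructure.",
        metadata={"guest": "Bob"},
    ),
]

vector_store = FAISS.from_documents(docs, OpenAIEmbeddings())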

@sbc-max

sbc-max commented Mar 6, 2023

I have a number of different use cases where this would also be helpful. I considered just adding the metadata directly to the text before embedding, but that's not ideal.

@flash1293
Contributor

Not 100% sure whether applicable to your case, but if you are using the stuff chain, you can do this by adjusting the document_prompt:

document_prompt = PromptTemplate(input_variables=["page_content", "id"], template="{page_content}, id: {id}")
qa = RetrievalQA.from_chain_type(llm=OpenAI(temperature=0), chain_type="stuff", retriever=vector_store.as_retriever(), chain_type_kwargs={"document_prompt": document_prompt})

If there is an id field in your document's metadata, it will be injected correctly.
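
Adapted to the podcast example above, a sketch of what that might look like (assuming every stored document has a "guest" metadata key; otherwise the template raises a missing-variable error):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Render each retrieved chunk as TEXT/GUEST so the LLM sees the metadata
# alongside the page content.
document_prompt = PromptTemplate(
    input_variables=["page_content", "guest"],
    template="TEXT: {page_content}\nGUEST: {guest}",
)

qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=vector_store.as_retriever(),
    chain_type_kwargs={"document_prompt": document_prompt},
)

print(qa.run("Which guests have talked about open source models?"))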

@batmanscode
Author

Not 100% sure whether applicable to your case, but if you are using the stuff chain, you can do this by adjusting the document_prompt:

document_prompt = PromptTemplate(input_variables=["page_content", "id"], template="{page_content}, id: {id}")
qa = RetrievalQA.from_chain_type(llm=OpenAI(temperature=0), chain_type="stuff", retriever=vector_store.as_retriever(), chain_type_kwargs={"document_prompt": document_prompt})

If there is an id field in your document's metadata, it will be injected correctly.

Wow that's cool, didn't know about that kwarg! Thanks, will try this 😃

@connorjoleary

This won't change the docs grabbed by the retriever, right? For example, if I have a guest (Greg) stored in the metadata and I ask "what did Greg say?", the retriever won't take the guest metadata into account when fetching sources; it will only match on something like similarity.

@flash1293
Contributor

No, that only affects how the retrieved context documents are presented to the LLM; it doesn't change which documents the retriever returns.
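
If retrieval itself needs to take metadata into account, one option is to filter on metadata at search time; a sketch, assuming a vector store whose similarity search supports a metadata filter (e.g. Chroma, or newer FAISS versions):

# Only consider chunks whose "guest" metadata is "Greg" when ranking by similarity.
retriever = vector_store.as_retriever(
    search_kwargs={"k": 5, "filter": {"guest": "Greg"}},
)
docs = retriever.get_relevant_documents("What did Greg say about vector databases?")

Another route is langchain's SelfQueryRetriever, which has the LLM translate the question into a metadata filter automatically.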

@joe-barhouch

joe-barhouch commented Aug 10, 2023

Is there a way I could do the same with a ConversationalRetrievalChain?
I keep running into the error: ValueError: Missing some input keys
This is my function:
def get_conversation_chain(vectorstore: FAISS):
    llm = ChatOpenAI(model="gpt-4-0613", temperature=0.5, streaming=False)

    templates = [
        SystemMessagePromptTemplate.from_template(
            prompts.system_prompt_v1,
            input_variables=["context", "source", "page_number"],
        ),
        HumanMessagePromptTemplate.from_template(
            prompts.user_prompt,
            input_variables=["context", "source", "page_number"],
        ),
    ]
    qa_template = ChatPromptTemplate.from_messages(templates)

    memory = ConversationSummaryBufferMemory(
        llm=llm, max_token_limit=5000, memory_key="chat_history", return_messages=True
    )
    memory.input_key = "question"
    memory.output_key = "answer"

    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(
            k=5, search_type="mmr", fetch_k=20, lambda_mult=0.5
        ),
        memory=memory,
        return_source_documents=True,
        chain_type="stuff",
        combine_docs_chain_kwargs={"prompt": qa_template},
    )

    return conversation_chain

@Robs-Git-Hub

Is there a way I could do the same with a ConversationalRetrievalChain? I keep running into the error: ValueError: Missing some input keys [...]

@joe-barhouch Did you solve this? I want to use metadata as an input_variable but it only seems to allow 'context', which is page_content.

@joe-barhouch

joe-barhouch commented Sep 11, 2023

@Robs-Git-Hub I had to step back from Conversational Agents. The layer of abstraction helps with prototypes but hurts full-fledged apps.

I ended up implementing my own version with an LLMChain plus memory. All of the document retrieval is handled by calling similarity_search or similar methods directly on the vector store.
Then I can get the metadata I have created and pass it into the prompt.

At the end of the day, the RAG application just copy-pastes the retrieved results into the prompt, so I handled it on my own without the abstraction layer of Conversational Agents.
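
A rough sketch of that approach (the prompt wording, vector_store, and the "guest" metadata key are placeholders, not the original poster's code):

from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["chat_history", "context", "question"],
    template=(
        "Answer using the excerpts below.\n\n{context}\n\n"
        "Chat history:\n{chat_history}\n\nQuestion: {question}\nAnswer:"
    ),
)

# input_key tells the memory which variable holds the user's message.
memory = ConversationBufferMemory(memory_key="chat_history", input_key="question")
chain = LLMChain(llm=ChatOpenAI(temperature=0), prompt=prompt, memory=memory)

question = "What did Alice say about open source models?"

# Retrieve directly from the vector store and render metadata next to the text.
docs = vector_store.similarity_search(question, k=5)
context = "\n\n".join(
    f"TEXT: {d.page_content}\nGUEST: {d.metadata.get('guest', 'unknown')}"
    for d in docs
)

answer = chain.run(context=context, question=question)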

@Robs-Git-Hub

@Robs-Git-Hub I had to step back from Conversational Agents. The layer of abstraction helps with prototypes but hurts full-fledged apps.

Thanks for the quick reply. Very helpful, and I was reaching a similar conclusion.

@theekshanamadumal

For ConversationalRetrievalChain:

document_combine_prompt = PromptTemplate(
    input_variables=["source", "year", "page", "page_content"],
    template="""source: {source}
year: {year}
page: {page}
page content: {page_content}""",
)
qa = ConversationalRetrievalChain.from_llm(
    ...,
    combine_docs_chain_kwargs={
        "prompt": retrieval_qa_chain_prompt,
        "document_prompt": document_combine_prompt,
    },
)

@joe-barhouch

@theekshanamadumal
Unless the retrieved documents actually contain those metadata fields, this will give an error about missing input variables for the prompt template.

@AI-General

What is the difference between "prompt" and "document_prompt"?

@theekshanamadumal

theekshanamadumal commented Oct 12, 2023

@theekshanamadumal Unless the retrieved documents actually contain those metadata fields, this will give an error about missing input variables for the prompt template.

Yes. You should know what metadata fields the documents have before creating the document prompt.

@theekshanamadumal

What is the difference between "prompt" and "document_prompt"?

document_prompt is the prompt template used to format each retrieved document, including its metadata.
The formatted documents are concatenated and end up in the main prompt as the 'context'; prompt is the main question-answering prompt that receives that context.
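
As an illustration, a sketch with made-up template wording:

from langchain.prompts import PromptTemplate

# document_prompt: how each retrieved Document is rendered, including metadata.
document_prompt = PromptTemplate(
    input_variables=["page_content", "source"],
    template="[{source}] {page_content}",
)

# prompt: the final question-answering prompt; the rendered documents are
# concatenated and substituted into {context}.
prompt = PromptTemplate(
    input_variables=["context", "question"],
    template="Use the context below to answer.\n\n{context}\n\nQuestion: {question}",
)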


dosubot bot commented Feb 8, 2024

Hi, @batmanscode! I'm helping the LangChain team manage their backlog and am marking this issue as stale.

It looks like you opened this issue to discuss passing Document metadata into prompts when using VectorDBQA. There have been contributions from other users sharing similar use cases and suggesting potential solutions. However, the issue remains unresolved.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to LangChain!

@dosubot dosubot bot added the stale label Feb 8, 2024
@dosubot dosubot bot closed this as not planned Feb 15, 2024
@dosubot dosubot bot removed the stale label Feb 15, 2024
@sgautam666

@Robs-Git-Hub I had to step back from Conversational Agents. The layer of abstraction helps with prototypes but hurts full-fledged apps.

I ended up implementing my own version with an LLMChain plus memory. All of the document retrieval is handled by calling similarity_search or similar methods directly on the vector store. Then I can get the metadata I have created and pass it into the prompt.

At the end of the day, the RAG application just copy-pastes the retrieved results into the prompt, so I handled it on my own without the abstraction layer of Conversational Agents.

Hello, I am looking at a similar use case. I am extracting some metadata using similarity_search, and now I want to use it in another QA chain. Can you show me the code snippet you used?
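
One possible sketch, not the original poster's code: retrieve with similarity_search, then hand the documents (with their metadata) to a stuff chain via load_qa_chain (the "guest" metadata key is an assumption):

from langchain.chains.question_answering import load_qa_chain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

question = "Which guests talked about vector databases?"

# Retrieve directly from the vector store; each Document keeps its metadata.
docs = vector_store.similarity_search(question, k=5)

# Format each document together with its metadata before stuffing it into the prompt.
document_prompt = PromptTemplate(
    input_variables=["page_content", "guest"],
    template="TEXT: {page_content}\nGUEST: {guest}",
)

chain = load_qa_chain(
    ChatOpenAI(temperature=0),
    chain_type="stuff",
    document_prompt=document_prompt,
)
answer = chain({"input_documents": docs, "question": question})["output_text"]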
