on_retriever_end() not called with ConversationalRetrievalChain #7290
Comments
I've reproduced similar behavior with a simpler example:

```python
from typing import Any, Optional
from uuid import UUID

import langchain
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chains.llm import LLMChain
from langchain.llms import LlamaCpp

langchain.debug = True


class LLMTokenHandler(BaseCallbackHandler):
    def on_llm_new_token(
        self,
        token: str,
        *,
        run_id: UUID,
        parent_run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> Any:
        print(f"on_llm_new_token() CALLED with {token}")


llm = LlamaCpp(
    model_path="models/GPT4All-13B-snoozy.ggml.q5_1.bin",
    n_ctx=4096,
    n_batch=8192,
    callbacks=[],
    verbose=False,
    use_mlock=True,
    n_gpu_layers=60,
    n_threads=8,
)

prompt_template = "What is the definition of {word}?"
llm_chain = LLMChain(
    llm=llm,
    prompt=langchain.PromptTemplate.from_template(prompt_template),
    callbacks=[LLMTokenHandler()],
)

llm_chain("befuddle")
```

I've done some investigating, and I think what's happening is that callbacks passed into chains are not inheritable, so they are being dropped. @agola11 @hwchase17 The LLM is being passed only
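To make that diagnosis concrete, here is a toy model of inheritable vs. local callback handlers. All class and attribute names here are hypothetical (this is not LangChain's actual `CallbackManager`); it only illustrates how a handler registered as "local" on a chain could fail to reach the LLM's sub-run:

```python
# Toy model of inheritable vs. local (non-inheritable) callback handlers.
# ToyCallbackManager is a made-up name, not a LangChain class.
class ToyCallbackManager:
    def __init__(self, inheritable_handlers=None, local_handlers=None):
        self.inheritable_handlers = inheritable_handlers or []
        self.local_handlers = local_handlers or []

    @property
    def handlers(self):
        # Handlers active for the current run.
        return self.inheritable_handlers + self.local_handlers

    def get_child(self):
        # Only inheritable handlers survive into sub-runs (chain -> LLM);
        # local handlers are dropped, matching the observed behavior.
        return ToyCallbackManager(inheritable_handlers=self.inheritable_handlers)


chain_manager = ToyCallbackManager(local_handlers=["LLMTokenHandler"])
llm_manager = chain_manager.get_child()
print(llm_manager.handlers)  # [] -- the handler never reaches the LLM run
```

If the chain registers the handler as local rather than inheritable, the LLM's child run sees an empty handler list and `on_llm_new_token()` is never invoked.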
@hwchase17 I'm happy to submit a fix, but I first need to understand a little more about the design choices and intent, and what an appropriate solution would be.

I am facing the same issue. Has this been resolved, by any chance?
@kcho02 To the best of my knowledge, no. You can pass the callback directly to the LLM, but if you use the same LLM object in two or more different chains, this may be undesirable.
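A minimal sketch of why constructor-bound callbacks on a shared LLM can be undesirable. `FakeLLM` is a made-up stand-in, not LangChain's API; the point is that a handler attached at construction fires for every chain that uses the object, while one passed at call time applies to that run only:

```python
# Hypothetical stand-in for an LLM accepting callbacks both at
# construction and per call; illustrative only.
class FakeLLM:
    def __init__(self, callbacks=None):
        self.callbacks = list(callbacks or [])

    def generate(self, prompt, callbacks=None):
        # Constructor-bound callbacks fire first, then per-call ones.
        for cb in self.callbacks + list(callbacks or []):
            cb(prompt)
        return f"answer to {prompt!r}"


events = []
shared = FakeLLM(callbacks=[lambda p: events.append(("constructor", p))])

# Two different "chains" share the same LLM object:
shared.generate("chain A question")
shared.generate("chain B question", callbacks=[lambda p: events.append(("per-call", p))])

# The constructor-bound handler fired for BOTH chains' runs;
# the per-call handler fired only for chain B's run.
```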
I see. Thanks for the response. Until it's fixed,
I noticed from the docs (https://python.langchain.com/docs/modules/callbacks/#where-to-pass-in-callbacks) that callbacks can be provided either to the constructor or when running a request. I was also facing this issue when passing callbacks to the constructor, but it works for me (`on_retriever_end()` gets called) when passing them at run time. So maybe, in the first example above, try something like `answer = qa(question, callbacks=[DocumentCallbackHandler()])["answer"]`.
@erpic It's a workaround, but it requires me to couple the component that calls the chain to the callbacks. @hwchase17 I'm happy to submit a fix, but I first need to understand a little more about the design choices and intent, and what an appropriate solution would be.
Hi @mssalvatore @erpic, I am facing a similar issue. How did you fix it? I tried passing the callbacks at run time, but it does not work when chatting for multiple rounds.
@pai4451 I worked around it by writing a wrapper around the LLM. It's not really an approach I can recommend.
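For reference, the wrapper idea can be sketched roughly like this. `CallbackInjectingLLM` and `fake_llm` are hypothetical names for illustration, not LangChain classes; the wrapper is just a thin proxy that merges a fixed set of callbacks into every call to the underlying LLM:

```python
# Hypothetical proxy that injects a fixed callback list into every call.
class CallbackInjectingLLM:
    def __init__(self, inner, callbacks):
        self._inner = inner
        self._callbacks = list(callbacks)

    def __call__(self, prompt, callbacks=None, **kwargs):
        # Merge the fixed callbacks with any per-call ones, then delegate.
        merged = self._callbacks + list(callbacks or [])
        return self._inner(prompt, callbacks=merged, **kwargs)


# Minimal fake LLM to demonstrate the proxy:
def fake_llm(prompt, callbacks=None):
    for cb in callbacks or []:
        cb(prompt)
    return prompt.upper()


seen = []
llm = CallbackInjectingLLM(fake_llm, callbacks=[seen.append])
result = llm("hello")  # the injected callback sees "hello"
```

The downside, as noted above, is that the wrapper has to mirror whatever call surface the real LLM exposes, which is why it's hard to recommend.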
@mssalvatore I found an implementation that exactly matches my needs. |
I work around this issue by inserting a `hack` step into the chain:

```python
def get_chain(callbacks):
    retriever = db.as_retriever(callbacks=callbacks)
    model = ChatOpenAI(callbacks=callbacks)

    def format_docs(docs):
        text = "\n\n".join([d.page_content for d in docs])
        return text

    def hack(docs):
        # https://github.com/langchain-ai/langchain/issues/7290
        for callback in callbacks:
            callback.on_retriever_end(docs, run_id=uuid4())
        return docs

    return (
        {"context": retriever | hack | format_docs, "question": RunnablePassthrough()}
        | prompt
        | model
    )
```

I use it for documenting how the

Full Code:

```python
from uuid import uuid4

import requests
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
import panel as pn

TEXT = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"

TEMPLATE = """Answer the question based only on the following context:

{context}

Question: {question}
"""

pn.extension(design="material")

prompt = ChatPromptTemplate.from_template(TEMPLATE)


@pn.cache
def get_vector_store():
    full_text = requests.get(TEXT).text
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    texts = text_splitter.split_text(full_text)
    embeddings = OpenAIEmbeddings()
    db = Chroma.from_texts(texts, embeddings)
    return db


db = get_vector_store()


def get_chain(callbacks):
    retriever = db.as_retriever(callbacks=callbacks)
    model = ChatOpenAI(callbacks=callbacks)

    def format_docs(docs):
        text = "\n\n".join([d.page_content for d in docs])
        return text

    def hack(docs):
        # https://github.com/langchain-ai/langchain/issues/7290
        for callback in callbacks:
            callback.on_retriever_end(docs, run_id=uuid4())
        return docs

    return (
        {"context": retriever | hack | format_docs, "question": RunnablePassthrough()}
        | prompt
        | model
    )


async def callback(contents, user, instance):
    callback_handler = pn.chat.langchain.PanelCallbackHandler(instance)
    chain = get_chain(callbacks=[callback_handler])
    await chain.ainvoke(contents)


pn.chat.ChatInterface(callback=callback).servable()
```
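The essence of the hack above is a pass-through "tap" step spliced into the pipeline between the retriever and the formatter, firing the callback manually as documents flow by. A generic sketch in plain Python (no LangChain required; `make_tap` and `RecordingHandler` are illustrative names):

```python
from uuid import uuid4


def make_tap(callbacks):
    """Build a pass-through step that fires on_retriever_end manually."""
    def tap(docs):
        for cb in callbacks:
            cb.on_retriever_end(docs, run_id=uuid4())
        return docs  # downstream steps receive docs unchanged
    return tap


# Demonstration with a minimal recording handler:
class RecordingHandler:
    def __init__(self):
        self.retrieved = []

    def on_retriever_end(self, docs, *, run_id, **kwargs):
        self.retrieved.extend(docs)


handler = RecordingHandler()
tap = make_tap([handler])
out = tap(["doc one", "doc two"])  # handler sees both docs; output unchanged
```

Because the tap returns its input unchanged, splicing it into a pipeline has no effect on the data flowing through, only the side effect of notifying the handlers.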
Thanks, this hack worked for me.
System Info
LangChain: v0.0.225
OS: Ubuntu 22.04
Who can help?
@agola11
@hwchase17
Expected behavior
I expect the `on_retriever_end()` callback to be called immediately after documents are retrieved. I'm not sure what I'm doing wrong.