Issue: how to stream results with long context #5532
Comments
Same problem here. I would like to stream the chain output, i.e. yield the tokens of the final answer as they are generated. I can stream and print the intermediate JSON steps, but I couldn't get a generator working despite following the various callback instructions in the documentation.
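A minimal sketch of one way to get such a generator, assuming a LangChain version that ships `AsyncIteratorCallbackHandler` (import paths vary across releases) and a chain built so that this streaming LLM is the one producing the final answer; `qa_chain`, `question`, and `chat_history` are placeholders:

```python
import asyncio

from langchain.callbacks.streaming_aiter import AsyncIteratorCallbackHandler
from langchain.chat_models import ChatOpenAI

# Handler that exposes on_llm_new_token through an async iterator.
handler = AsyncIteratorCallbackHandler()

# The LLM that produces the final answer must have streaming enabled and
# carry the handler so its tokens reach the iterator.
streaming_llm = ChatOpenAI(streaming=True, callbacks=[handler], temperature=0)


async def stream_answer(qa_chain, question, chat_history):
    """Yield tokens of the final answer while the chain runs in the background."""
    task = asyncio.create_task(
        qa_chain.acall({"question": question, "chat_history": chat_history})
    )
    async for token in handler.aiter():
        yield token
    await task
```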
…async (#6181) This adds the ability to attach an AsyncCallbackManager (handler) to the reducer chain, so that tokens can be streamed via the `async def on_llm_new_token` callback method. Fixes #5532. @hwchase17 @agola11

The following code snippet shows how this change can be used to enable a streaming `reduce_llm` in a `map_reduce` chain. Only the reduce step streams; the map step sends multiple prompts in a single call, which is what raises the "Cannot stream results with multiple prompts" error. I have tested this change and it works for the streaming use case of reducer responses. I am happy to share more information if that would help.

```python
from typing import Any

# Import paths match the LangChain version current at the time of this PR
# and may differ in later releases.
from langchain.callbacks.base import AsyncCallbackHandler
from langchain.callbacks.manager import AsyncCallbackManager
from langchain.chains import ConversationalRetrievalChain
from langchain.chains.question_answering import load_qa_chain
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI

# `ChatResponse`, `websocket`, `manager`, `vectorstore`, `question_generator`,
# `question`, and `chat_history` are defined elsewhere in the application.


# Async handler ---------------------------------------------------------------
class StreamingLLMCallbackHandler(AsyncCallbackHandler):
    """Callback handler for streaming LLM responses to a websocket."""

    def __init__(self, websocket):
        self.websocket = websocket

    # This callback method is executed asynchronously for each new token.
    async def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        resp = ChatResponse(sender="bot", message=token, type="stream")
        await self.websocket.send_json(resp.dict())


# Chain -----------------------------------------------------------------------
stream_handler = StreamingLLMCallbackHandler(websocket)
stream_manager = AsyncCallbackManager([stream_handler])

# Only the reduce LLM streams; the map LLM runs without streaming.
streaming_llm = ChatOpenAI(
    streaming=True,
    callback_manager=stream_manager,
    verbose=False,
    temperature=0,
)
main_llm = OpenAI(
    temperature=0,
    verbose=False,
)

doc_chain = load_qa_chain(
    llm=main_llm,
    reduce_llm=streaming_llm,
    chain_type="map_reduce",
    callback_manager=manager,
)

qa_chain = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    combine_docs_chain=doc_chain,
    question_generator=question_generator,
    callback_manager=manager,
)

# `acall` triggers `acombine_docs` on the map_reduce chain, which calls
# `_aprocess_result`, which in turn calls `self.combine_document_chain.arun`,
# so the async callback is awaited and tokens stream through the handler.
result = await qa_chain.acall(
    {"question": question, "chat_history": chat_history}
)
```
Seems related to #1349.
Was this fixed in 2b3b4e0? If so, can someone please create an example of how to stream a load_qa_chain response?
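Not an official answer, but a minimal sketch of how streaming a `load_qa_chain` response to stdout could look, assuming a LangChain version that supports the `callbacks` argument; `docs` is a placeholder for documents loaded elsewhere:

```python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains.question_answering import load_qa_chain
from langchain.chat_models import ChatOpenAI

# LLM with streaming enabled; tokens are printed to stdout as they arrive.
llm = ChatOpenAI(
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
    temperature=0,
)

# `docs` is assumed to be a list of Document objects loaded elsewhere.
chain = load_qa_chain(llm, chain_type="stuff")
answer = chain.run(input_documents=docs, question="What is this document about?")
```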
Hi, @qlql489, I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue pertains to encountering errors when attempting to stream results with a long context in a chatbot built using the "Chat Over Documents with Chat History" chapter. Another user, "fitolobo," has also experienced the same problem and is seeking a solution to stream the chain output. User "clemlesne" has pointed out a related issue, while "straeter" has inquired about a potential fix and requested an example of how to stream a load_qa_chain response. Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!
Issue you'd like to raise.
I followed the chapter “Chat Over Documents with Chat History” to build a bot that chats with a PDF, and I want it to return a streaming response.
When I use the stuff chain, it returns: "This model's maximum context length is 4097 tokens, however you requested 5741 tokens (5485 in your prompt; 256 for the completion). Please reduce your prompt; or completion length"
When I use the map_reduce chain, it returns: "Cannot stream results with multiple prompts."
How can I resolve this when the context is too long? (One possible workaround is sketched after this issue body.)
Suggestion:
No response
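Editor's note: the following is not from the thread, but a minimal sketch of one common workaround, assuming `ConversationalRetrievalChain.from_llm` accepts `max_tokens_limit` (passed through to the chain in the LangChain versions of that era) so that retrieved documents are trimmed to fit the model's context window while the answer still streams; `vectorstore` is a placeholder for an existing vector store over the PDF:

```python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI

# Streaming chat model; tokens are printed to stdout as they arrive.
llm = ChatOpenAI(
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
    temperature=0,
)

# `vectorstore` is assumed to be an existing vector store built over the PDF.
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(),
    chain_type="stuff",
    # Trim the retrieved documents so the stuffed prompt stays below the
    # model's context window (4097 tokens for the model in the error above).
    max_tokens_limit=3000,
)

result = qa_chain({"question": "What is the document about?", "chat_history": []})
```

Note that with a single LLM the condensed follow-up question will also stream; a separate non-streaming LLM can be supplied for that step in versions that support `condense_question_llm`.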