Callback for on_llm_new_token should be invoked before the token is yielded by the model #16913
Comments
🤖 It's great to see your interest in improving the streaming functionality in LangChain for chat models and language models. Your proposed goal and acceptance criteria for the PR are clear and well defined. If you have a solution in mind and are planning to work on a fix, we encourage you to go ahead and create a pull request; your contribution will be valuable in enhancing the LangChain platform. Thank you for your willingness to improve the codebase! If you want to continue the conversation, start your reply with @dosu-bot.
…angchain-ai#16986) - **Description:** Invoke callback prior to yielding token in stream and astream methods for ChatMistralAI. - **Issue:** langchain-ai#16913
### This pull request makes the following changes: * Fixed issue #16913: updated the Google GenAI chat_models.py code to ensure the callback is called before the token is yielded. --------- Co-authored-by: Erick Friis <erick@langchain.dev>
…ai#17346) **Description:** Invoke callback prior to yielding token in stream method for watsonx. **Issue:** [Callback for on_llm_new_token should be invoked before the token is yielded by the model langchain-ai#16913](langchain-ai#16913) Co-authored-by: Robby <h0rv@users.noreply.github.com>
…ai#17348) **Description:** Invoke callback prior to yielding token in stream method for Ollama. **Issue:** [Callback for on_llm_new_token should be invoked before the token is yielded by the model langchain-ai#16913](langchain-ai#16913) Co-authored-by: Robby <h0rv@users.noreply.github.com>
langchain-ai#17625) **Description**: Invoke callback prior to yielding token in stream method for watsonx. **Issue**: langchain-ai#16913
…gingFaceEndpoint (#20366) - [x] **PR title**: community[patch]: Invoke callback prior to yielding token fix for HuggingFaceEndpoint - [x] **PR message**: - **Description:** Invoke callback prior to yielding token in stream method in community HuggingFaceEndpoint - **Issue:** #16913 - **Dependencies:** None - **Twitter handle:** @bolun_zhang --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>
…mafile (#20365) - [x] **PR title**: community[patch]: Invoke callback prior to yielding token fix for Llamafile - [x] **PR message**: - **Description:** Invoke callback prior to yielding token in stream method in community llamafile.py - **Issue:** #16913 - **Dependencies:** None - **Twitter handle:** @bolun_zhang
…fra] (#20427) - [x] **PR title**: community[patch]: Invoke callback prior to yielding token fix for [DeepInfra] - [x] **PR message**: - **Description:** Invoke callback prior to yielding token in stream method in [DeepInfra] - **Issue:** #16913 - **Dependencies:** None - **Twitter handle:** @bolun_zhang
…gFaceTextGenInference] (#20426) - [x] **PR title**: community[patch]: Invoke callback prior to yielding token fix for [HuggingFaceTextGenInference] - [x] **PR message**: - **Description:** Invoke callback prior to yielding token in stream method in [HuggingFaceTextGenInference] - **Issue:** #16913 - **Dependencies:** None - **Twitter handle:** @bolun_zhang --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>
Privileged issue
Issue Content
Goal
Improve streaming in LangChain for chat models / language models.
Background
Many chat and language models implement a streaming mode in which they stream tokens one at a time.
LangChain has a callback system that is useful for logging and for important APIs like `stream`, `stream_log` and `stream_events`.
Currently, many models incorrectly yield the token (chat generation) before invoking the callback, so callback handlers see each token only after it has already reached the consumer.
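The ordering issue can be illustrated with a minimal, self-contained sketch (invented names; this is not LangChain code):

```python
from typing import Callable, Iterator, List

def stream_wrong(tokens: List[str], on_new_token: Callable[[str], None]) -> Iterator[str]:
    """Incorrect: the consumer receives each token before the callback fires."""
    for tok in tokens:
        yield tok           # consumer sees the token first...
        on_new_token(tok)   # ...and the callback is notified late

def stream_right(tokens: List[str], on_new_token: Callable[[str], None]) -> Iterator[str]:
    """Correct: the callback observes each token before it is yielded."""
    for tok in tokens:
        on_new_token(tok)   # callback fires first
        yield tok           # then the consumer receives the token

def collect(stream_fn) -> List[str]:
    """Record the interleaving of callback invocations and yields."""
    events: List[str] = []
    for tok in stream_fn(["a", "b"], lambda t: events.append(f"cb:{t}")):
        events.append(f"yield:{tok}")
    return events

print(collect(stream_wrong))  # ['yield:a', 'cb:a', 'yield:b', 'cb:b']
print(collect(stream_right))  # ['cb:a', 'yield:a', 'cb:b', 'yield:b']
```

With the incorrect ordering, any handler wired into the callback system (logging, event streaming, etc.) observes a token only after the consumer has already received it; firing the callback first keeps both views in sync.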
Acceptance criteria
For a PR to be accepted and merged, the PR should invoke the `on_llm_new_token` callback before the token is yielded, in both the model's `stream` and (where implemented) `astream` methods.
Example PR
Here is an example PR that shows the fix for the OpenAI chat model:
#16909
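The same ordering applies on the async path. Here is a hypothetical sketch (names such as `astream_fixed` and `on_new_token` are invented for illustration; this is not the actual PR diff) in which the callback is awaited before each token is yielded:

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List

async def astream_fixed(
    tokens: List[str],
    on_new_token: Callable[[str], Awaitable[None]],
) -> AsyncIterator[str]:
    for tok in tokens:
        await on_new_token(tok)  # callback completes before the consumer sees the token
        yield tok

async def main() -> List[str]:
    events: List[str] = []

    async def cb(tok: str) -> None:
        events.append(f"cb:{tok}")

    async for tok in astream_fixed(["x", "y"], cb):
        events.append(f"yield:{tok}")
    return events

print(asyncio.run(main()))  # ['cb:x', 'yield:x', 'cb:y', 'yield:y']
```

Because the callback coroutine is awaited before the `yield`, slow handlers also apply natural backpressure instead of silently lagging behind the stream.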
Find models that need to be fixed
The easiest way to find places in the code that may need to be fixed is with `git grep`:

```shell
git grep -C 5 "\.on_llm_new"
```
Examine the output to determine whether the callback is called before the token is yielded (correct) or after (needs to be fixed).