Callback for on_llm_new_token should be invoked before the token is yielded by the model #16913

Closed
1 of 5 tasks
eyurtsev opened this issue Feb 2, 2024 · 1 comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature good first issue Good for newcomers help wanted Good issue for contributors Ɑ: models Related to LLMs or chat model modules 🔌: openai Primarily related to OpenAI integrations

Comments

@eyurtsev
Collaborator

eyurtsev commented Feb 2, 2024

Privileged issue

  • I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.

Issue Content

Goal

Improve streaming in LangChain for chat models / language models.

Background

Many chat and language models implement a streaming mode in which they stream tokens one at a time.

LangChain has a callback system that is useful for logging and that powers important APIs such as "stream", "stream_log" and "stream_events".

Currently, many models incorrectly yield the token (chat generation) before invoking the on_llm_new_token callback.
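
To illustrate why the order matters, here is a minimal sketch that does not use LangChain itself. `RecordingRunManager`, `stream_buggy`, `stream_fixed` and `first_token_log` are hypothetical names made up for this example; only the callback name `on_llm_new_token` comes from the issue. The sketch assumes a consumer that stops reading after the first token, which is one situation where yielding first can cause the callback to be skipped.

```python
from typing import Callable, Iterator, List, Optional


class RecordingRunManager:
    """Hypothetical stand-in for a callback run manager; records each call."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def on_llm_new_token(self, token: str) -> None:
        self.log.append(f"callback:{token}")


def stream_buggy(
    tokens: List[str], run_manager: Optional[RecordingRunManager] = None
) -> Iterator[str]:
    for token in tokens:
        # The consumer receives the token before the callback has run.
        yield token
        if run_manager:
            run_manager.on_llm_new_token(token)


def stream_fixed(
    tokens: List[str], run_manager: Optional[RecordingRunManager] = None
) -> Iterator[str]:
    for token in tokens:
        # The callback runs first, so it has always fired for a yielded token.
        if run_manager:
            run_manager.on_llm_new_token(token)
        yield token


def first_token_log(stream_fn: Callable[..., Iterator[str]]) -> List[str]:
    """Consume only the first token and return the recorded event order."""
    log: List[str] = []
    for token in stream_fn(["a", "b"], RecordingRunManager(log)):
        log.append(f"consumer:{token}")
        break  # the generator stays suspended at its `yield`
    return log


print(first_token_log(stream_buggy))  # callback for "a" never fires
print(first_token_log(stream_fixed))  # callback for "a" fires before the consumer sees it
```

In the buggy variant the generator is suspended at `yield` when the consumer stops, so the code after the `yield` never executes for that token.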

Acceptance criteria

For a PR to be accepted and merged, the PR should:

  • Fix the code to make sure that the callback is called before the token is yielded
  • Link to this issue
  • Change ONE and only ONE model
  • Fix both the sync and async implementations if both are defined
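
The shape of a fix that covers both the sync and async paths might look like the sketch below. It is a simplified illustration, not the code from any LangChain model: `SyncManager`, `AsyncManager`, `stream_tokens` and `astream_tokens` are hypothetical names, and real `_stream` / `_astream` methods take more parameters and yield chunk objects rather than plain strings.

```python
import asyncio
from typing import AsyncIterator, Iterator, List, Optional


class SyncManager:
    """Hypothetical sync run manager that records callback invocations."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    def on_llm_new_token(self, token: str) -> None:
        self.log.append(f"callback:{token}")


class AsyncManager:
    """Hypothetical async run manager that records callback invocations."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    async def on_llm_new_token(self, token: str) -> None:
        self.log.append(f"callback:{token}")


def stream_tokens(
    tokens: List[str], run_manager: Optional[SyncManager] = None
) -> Iterator[str]:
    for token in tokens:
        if run_manager:
            run_manager.on_llm_new_token(token)  # callback first
        yield token  # then hand the token to the consumer


async def astream_tokens(
    tokens: List[str], run_manager: Optional[AsyncManager] = None
) -> AsyncIterator[str]:
    for token in tokens:
        if run_manager:
            await run_manager.on_llm_new_token(token)  # callback first
        yield token  # then hand the token to the consumer


def consume_sync(tokens: List[str]) -> List[str]:
    log: List[str] = []
    for token in stream_tokens(tokens, SyncManager(log)):
        log.append(f"consumer:{token}")
    return log


async def consume_async(tokens: List[str]) -> List[str]:
    log: List[str] = []
    async for token in astream_tokens(tokens, AsyncManager(log)):
        log.append(f"consumer:{token}")
    return log


print(consume_sync(["a", "b"]))
print(asyncio.run(consume_async(["a", "b"])))
```

Both variants should record each `callback:` entry immediately before the matching `consumer:` entry.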

Example PR

Here is an example PR that shows the fix for the OpenAI chat model:

#16909

Find models that need to be fixed

The easiest way to find places in the code that may need to be fixed is to use git grep:

git grep -C 5  "\.on_llm_new"

Examine the output to determine whether the callback is called before the token is yielded (correct) or after (needs to be fixed).
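
As a rough aid for that manual review, the check can be approximated with a small line-based heuristic. This is only a sketch under stated assumptions: `suspicious_callback_lines` is a hypothetical helper, it looks at a fixed window of lines (mirroring the `-C 5` context above), and it can produce false positives or miss cases, so each flagged line still needs to be read by a person.

```python
import re
from typing import List

YIELD_RE = re.compile(r"^\s*yield\b")
CALLBACK_RE = re.compile(r"\.on_llm_new_token\(")


def suspicious_callback_lines(source: str, context: int = 5) -> List[int]:
    """Return 1-based line numbers where the callback call has a `yield`
    shortly before it and none shortly after it (heuristic only)."""
    lines = source.splitlines()
    flagged: List[int] = []
    for i, line in enumerate(lines):
        if not CALLBACK_RE.search(line):
            continue
        before = lines[max(0, i - context):i]
        after = lines[i + 1:i + 1 + context]
        yield_before = any(YIELD_RE.match(item) for item in before)
        yield_after = any(YIELD_RE.match(item) for item in after)
        if yield_before and not yield_after:
            flagged.append(i + 1)
    return flagged


# Two made-up snippets used only to exercise the heuristic.
BUGGY = '''
for chunk in response:
    yield chunk
    if run_manager:
        run_manager.on_llm_new_token(chunk.text)
'''

FIXED = '''
for chunk in response:
    if run_manager:
        run_manager.on_llm_new_token(chunk.text)
    yield chunk
'''

print(suspicious_callback_lines(BUGGY))
print(suspicious_callback_lines(FIXED))
```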

@eyurtsev eyurtsev added help wanted Good issue for contributors good first issue Good for newcomers labels Feb 2, 2024

dosubot bot commented Feb 2, 2024

🤖

It's great to see your interest in improving the streaming functionality in LangChain for chat models and language models. Your proposed goal and acceptance criteria for the PR are clear and well-defined. If you have a solution in mind and are planning to work on a fix, we encourage you to go ahead and create a pull request. Your contribution will be valuable in enhancing the LangChain platform. Thank you for your willingness to improve the codebase!


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature 🔌: openai Primarily related to OpenAI integrations labels Feb 2, 2024
efriis pushed a commit that referenced this issue Feb 4, 2024
…16986)

- **Description:** Invoke callback prior to yielding token in stream and
astream methods for ChatMistralAI.
- **Issue:** #16913
hoanq1811 pushed a commit to hoanq1811/langchain that referenced this issue Feb 6, 2024
…angchain-ai#16986)

- **Description:** Invoke callback prior to yielding token in stream and
astream methods for ChatMistralAI.
- **Issue:** langchain-ai#16913
efriis added a commit that referenced this issue Feb 10, 2024
### This pull request makes the following changes:
* Fixed issue #16913

Fixed the google gen ai chat_models.py code to make sure that the
callback is called before the token is yielded


---------

Co-authored-by: Erick Friis <erick@langchain.dev>
baskaryan pushed a commit that referenced this issue Feb 13, 2024
**Description:** Invoke callback prior to yielding token in stream
method for watsonx.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](#16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
baskaryan pushed a commit that referenced this issue Feb 13, 2024
**Description:** Invoke callback prior to yielding token in stream
method for Ollama.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](#16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
snsten pushed a commit to snsten/langchain that referenced this issue Feb 15, 2024
### This pull request makes the following changes:
* Fixed issue langchain-ai#16913

Fixed the google gen ai chat_models.py code to make sure that the
callback is called before the token is yielded


---------

Co-authored-by: Erick Friis <erick@langchain.dev>
snsten pushed a commit to snsten/langchain that referenced this issue Feb 15, 2024
…ai#17346)

**Description:** Invoke callback prior to yielding token in stream
method for watsonx.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
langchain-ai#16913](langchain-ai#16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
snsten pushed a commit to snsten/langchain that referenced this issue Feb 15, 2024
…ai#17348)

**Description:** Invoke callback prior to yielding token in stream
method for Ollama.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
langchain-ai#16913](langchain-ai#16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
eyurtsev pushed a commit that referenced this issue Feb 16, 2024
#17625)

**Description**: Invoke callback prior to yielding token in stream
method for watsonx.
 **Issue**: #16913
haydeniw pushed a commit to haydeniw/langchain that referenced this issue Feb 27, 2024
### This pull request makes the following changes:
* Fixed issue langchain-ai#16913

Fixed the google gen ai chat_models.py code to make sure that the
callback is called before the token is yielded


---------

Co-authored-by: Erick Friis <erick@langchain.dev>
haydeniw pushed a commit to haydeniw/langchain that referenced this issue Feb 27, 2024
…ai#17346)

**Description:** Invoke callback prior to yielding token in stream
method for watsonx.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
langchain-ai#16913](langchain-ai#16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
haydeniw pushed a commit to haydeniw/langchain that referenced this issue Feb 27, 2024
…ai#17348)

**Description:** Invoke callback prior to yielding token in stream
method for Ollama.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
langchain-ai#16913](langchain-ai#16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
haydeniw pushed a commit to haydeniw/langchain that referenced this issue Feb 27, 2024
langchain-ai#17625)

**Description**: Invoke callback prior to yielding token in stream
method for watsonx.
 **Issue**: langchain-ai#16913
ccurme added a commit that referenced this issue Apr 12, 2024
…gingFaceEndpoint (#20366)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for HuggingFaceEndpoint


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community HuggingFaceEndpoint
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
ccurme pushed a commit that referenced this issue Apr 12, 2024
…mafile (#20365)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community llamafile.py
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
ccurme pushed a commit that referenced this issue Apr 14, 2024
…fra] (#20427)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [DeepInfra]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [DeepInfra]
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
junkeon pushed a commit to UpstageAI/langchain that referenced this issue Apr 16, 2024
…gingFaceEndpoint (langchain-ai#20366)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for HuggingFaceEndpoint


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community HuggingFaceEndpoint
    - **Issue:** langchain-ai#16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
junkeon pushed a commit to UpstageAI/langchain that referenced this issue Apr 16, 2024
…mafile (langchain-ai#20365)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community llamafile.py
    - **Issue:** langchain-ai#16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
junkeon pushed a commit to UpstageAI/langchain that referenced this issue Apr 16, 2024
…fra] (langchain-ai#20427)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [DeepInfra]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [DeepInfra]
    - **Issue:** langchain-ai#16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
ccurme added a commit that referenced this issue Apr 18, 2024
…gFaceTextGenInference] (#20426)

…gFaceTextGenInference)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [HuggingFaceTextGenInference]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [HuggingFaceTextGenInference]
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
naveentatikonda pushed a commit to naveentatikonda/langchain that referenced this issue Apr 19, 2024
…gingFaceEndpoint (langchain-ai#20366)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for HuggingFaceEndpoint


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community HuggingFaceEndpoint
    - **Issue:** langchain-ai#16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
naveentatikonda pushed a commit to naveentatikonda/langchain that referenced this issue Apr 19, 2024
…mafile (langchain-ai#20365)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community llamafile.py
    - **Issue:** langchain-ai#16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
naveentatikonda pushed a commit to naveentatikonda/langchain that referenced this issue Apr 19, 2024
…fra] (langchain-ai#20427)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [DeepInfra]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [DeepInfra]
    - **Issue:** langchain-ai#16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…18629)

## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_ &
_astream_ methods in llms/ollama.
- Issue: #16913 
- Dependencies: None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…18628)

## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/openai.
- Issue: #16913 
- Dependencies: None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…dpoint) (#18627)

## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/pai_eas_endpoint.
- Issue: #16913 
- Dependencies: None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…#18626)

## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/replicate.
- Issue: #16913 
- Dependencies: None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…18625)

## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/sparkllm.
- Issue: #16913 
- Dependencies: None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…off_pro) (#18624)

## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/titan_takeoff_pro.
- Issue: #16913 
- Dependencies: None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…#19392)

**Description:** Invoke callback prior to yielding token for llama.cpp
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](#16913)
**Dependencies:** None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…#19388)

**Description:** Invoke callback prior to yielding token for Fireworks
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](#16913)
**Dependencies:** None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…19389)

**Description:** Invoke callback prior to yielding token for BaseOpenAI
& OpenAIChat
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](#16913)
**Dependencies:** None
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…gingFaceEndpoint (#20366)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for HuggingFaceEndpoint


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community HuggingFaceEndpoint
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…mafile (#20365)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community llamafile.py
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…fra] (#20427)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [DeepInfra]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [DeepInfra]
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
hinthornw pushed a commit that referenced this issue Apr 26, 2024
…gFaceTextGenInference] (#20426)

…gFaceTextGenInference)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [HuggingFaceTextGenInference]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [HuggingFaceTextGenInference]
    - **Issue:** #16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label May 3, 2024
@dosubot dosubot bot closed this as not planned May 10, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label May 10, 2024