
# core[minor]: Add v2 implementation of astream events #21638

Merged: 34 commits merged into master from eugene/async_astream_events on May 15, 2024

## Conversation

eyurtsev (Collaborator) commented May 13, 2024

This PR introduces a v2 implementation of astream events that removes intermediate abstractions and fixes some issues with the v1 implementation.

The v2 implementation significantly reduces the amount of code associated with the astream events implementation, along with its overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this rewrite, a number of issues were discovered in the existing implementation.
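For reference, the new implementation is opted into via the `version` parameter of `astream_events`; a minimal sketch (the toy chain here is purely illustrative):

```python
import asyncio

from langchain_core.runnables import RunnableLambda

async def main() -> None:
    # Any Runnable works; a trivial lambda keeps the sketch self-contained.
    chain = RunnableLambda(lambda x: x + 1)

    # version="v2" selects the implementation added by this PR.
    async for event in chain.astream_events(1, version="v2"):
        print(event["event"], event["name"], event["data"])

asyncio.run(main())
```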

## Changes in V2 vs. V1

### on_chat_model_end `output`

The outputs associated with `on_chat_model_end` changed depending on whether the model was within a chain or not.

As a root-level runnable, the output was:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain, the output was:

```
"data": {
    "output": {
        "generations": [
            [
                {
                    "generation_info": None,
                    "message": AIMessageChunk(
                        content="hello world!", id=AnyStr()
                    ),
                    "text": "hello world!",
                    "type": "ChatGenerationChunk",
                }
            ]
        ],
        "llm_output": None,
    }
},
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE:** Non-chat models (i.e., regular LLMs) still use the more verbose format.
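For illustration, a minimal sketch of consuming the new shape, using `GenericFakeChatModel` (as in the tests; the input string and fixed message are arbitrary):

```python
import asyncio
from itertools import cycle

from langchain_core.language_models import GenericFakeChatModel
from langchain_core.messages import AIMessage

async def main() -> None:
    model = GenericFakeChatModel(messages=cycle([AIMessage(content="hello world!")]))

    async for event in model.astream_events("hi", version="v2"):
        if event["event"] == "on_chat_model_end":
            # v2: always an AIMessageChunk, never the nested generations dict.
            print(event["data"]["output"])

asyncio.run(main())
```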

### Remove some `_stream` events

The `on_retriever_stream` and `on_tool_stream` events were removed -- they were not real events, but an artifact of implementing on top of `astream_log`.

The same information is already available in the corresponding `on_x_end` events.
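Code that previously listened for `on_tool_stream` can read the final output from the end event instead; a sketch with a hypothetical tool:

```python
import asyncio

from langchain_core.tools import tool

@tool
def reverse(text: str) -> str:
    """Reverse the input text."""
    return text[::-1]

async def main() -> None:
    async for event in reverse.astream_events({"text": "olleh"}, version="v2"):
        # No on_tool_stream events fire in v2; the result arrives here.
        if event["event"] == "on_tool_end":
            print(event["data"]["output"])

asyncio.run(main())
```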

### Propagating Names

Names of runnables have been updated to be more consistent.

```python
model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
    messages=ConfigurableField(
        id="messages",
        name="Messages",
        description="Messages returned by the LLM",
    )
)
```

Before:

```python
"name": "RunnableConfigurableFields",
```

After:

```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

`on_retriever_end` will always return an `output` that is a list of documents (rather than a dict containing a key called `"documents"`).
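A sketch with a toy retriever (a hypothetical `BaseRetriever` subclass) showing the new shape:

```python
import asyncio
from typing import List

from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class ToyRetriever(BaseRetriever):
    """Hypothetical retriever returning a fixed document."""

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        return [Document(page_content=f"about {query}")]

async def main() -> None:
    async for event in ToyRetriever().astream_events("cats", version="v2"):
        if event["event"] == "on_retriever_end":
            # v2: a plain list of Documents, not {"documents": [...]}.
            print(event["data"]["output"])

asyncio.run(main())
```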

### Retry events

Removed the `on_retry` callback handler. It was incorrectly showing that the failed function being retried had invoked `on_chain_end`:

https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394


eyurtsev changed the title on May 14, 2024: core[minor]: Replace implementation of astream events → core[major]: Replace implementation of astream events → core[minor]: Replace implementation of astream events → core[minor]: Add v2 implementation of astream events
eyurtsev marked this pull request as ready for review May 14, 2024 20:11
eyurtsev requested a review from nfcampos May 14, 2024 20:11
eyurtsev assigned efriis and nfcampos and unassigned efriis May 14, 2024
dosubot added the size:XXL (This PR changes 1000+ lines, ignoring generated files) and 🤖:improvement (Medium size change to existing code to handle new use-cases) labels May 14, 2024
A reviewer (Collaborator) commented on this hunk:

```python
) -> None:
    """End a trace for an LLM run."""
    # "chat_model" is only used for the experimental new streaming_events format.
    # This change should not affect any existing tracers.
```

this comment shouldn't be here?

A reviewer (Contributor) commented on this hunk:

```
@@ -1936,7 +1800,8 @@ async def _atransform_stream_with_config(
    """Helper method to transform an Async Iterator of Input values into an Async
    Iterator of Output values, with callbacks.
    Use this to implement `astream()` or `atransform()` in Runnable subclasses."""
    from langchain_core.tracers.log_stream import LogStreamCallbackHandler
    # Mixing that is used by both astream log and astream events implementation
```

Typo ("Mixing" should presumably be "Mixin")

eyurtsev (Collaborator, Author)

@jacoblee93 / @nfcampos

`on_llm_end` still has a very verbose output.

Options are:

  1. Keep as is
  2. Output only the text (but that makes it impossible to pluck out metadata)
  3. Output only the text, and start feeding metadata into a separate key in `data`
  4. Do a hack and start outputting AIMessageChunk

jacoblee93 (Contributor), quoting the options above:
I'd opt for making as few transformations as possible, which I think would be 1 or 4?

eyurtsev (Collaborator, Author)

We can keep (1) as-is for now until we decide what to do. (4) might be a bad idea since we don't use AIMessage in regular LLMs currently.
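To make the tradeoff concrete, a sketch of what a plain LLM emits today on `on_llm_end` (using `FakeListLLM`; the output keeps the verbose generations format shown earlier, i.e., option 1):

```python
import asyncio

from langchain_core.language_models import FakeListLLM

async def main() -> None:
    llm = FakeListLLM(responses=["hello world!"])

    async for event in llm.astream_events("hi", version="v2"):
        if event["event"] == "on_llm_end":
            # Still the verbose format for plain LLMs; callers pluck text
            # and metadata out of the nested structure themselves.
            print(event["data"]["output"])

asyncio.run(main())
```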

eyurtsev merged commit 5c2cfab into master on May 15, 2024; all 96 checks passed.
eyurtsev deleted the eugene/async_astream_events branch May 15, 2024 15:48
hinthornw pushed a commit referencing this pull request on Jun 20, 2024 (its commit message repeats the PR description above).