# How to stream chat model responses

All [chat models](https://v02.api.js.langchain.com/classes/langchain_core_language_models_chat_models.BaseChatModel.html) implement the [Runnable interface](https://v02.api.js.langchain.com/classes/langchain_core_runnables.Runnable.html), which comes with **default** implementations of the standard runnable methods (i.e. `invoke`, `batch`, `stream`, `streamEvents`).

The **default** streaming implementation provides an `AsyncGenerator` that yields a single value: the final output from the underlying chat model provider.

:::{.callout-tip}

The **default** implementation does **not** support token-by-token streaming, but it ensures that the model can be swapped in for any other model, because it supports the same standard interface.

:::
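
To make the fallback behavior concrete, here is a minimal sketch of what a default `stream` could look like when a provider only implements `invoke`. It is not LangChain's actual implementation: the `MinimalChatModel` interface, `defaultStream`, `echoModel`, and `collectChunks` are hypothetical names used only for illustration, and plain strings stand in for message objects.

```typescript
// Hypothetical, simplified model interface: only `invoke` is required.
interface MinimalChatModel {
  invoke(input: string): Promise<string>;
}

// Sketch of a fallback `stream`: wait for the complete response,
// then yield it as a single chunk.
async function* defaultStream(
  model: MinimalChatModel,
  input: string
): AsyncGenerator<string> {
  yield await model.invoke(input);
}

// Stand-in model (an assumption for this example, not a real provider).
const echoModel: MinimalChatModel = {
  invoke: async (input: string) => `echo: ${input}`,
};

// Helper that drains any async stream of strings into an array.
async function collectChunks(stream: AsyncIterable<string>): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks;
}
```

With this sketch, `collectChunks(defaultStream(echoModel, "hi"))` resolves to an array containing exactly one element, which mirrors the "single value" behavior described above. Code that consumes the stream with `for await` keeps working unchanged when the model is later replaced by one with true token-by-token streaming.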

The ability to stream the output token-by-token depends on whether the provider has implemented proper streaming support.

See which [integrations support token-by-token streaming here](/docs/integrations/chat/).

## Streaming

Below, we print `---` after each chunk to help visualize the boundaries between tokens.

```{=mdx}
import ChatModelTabs from "@theme/ChatModelTabs";

<ChatModelTabs />
```

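// `stream` resolves to an async iterable of message chunks.
const stream = await model.stream("Write me a 1 verse song about goldfish on the moon");

// Print each chunk's content, followed by a `---` delimiter on its own line.
for await (const chunk of stream) {
    console.log(`${chunk.content}
---`);
}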


---
Sw
---
imming
---
 in
---
 a
---
 world
---
 of
---
 silver
---
 beams
---
,

---
Gold
---
fish
---
 on
---
 the
---
 moon
---
,
---
 living
---
 their
---
 dreams
---
.
---

---

---
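
The chunks printed above are fragments of one response, so a common follow-up is to reassemble them into the full text. The sketch below shows one way to do that. It assumes each chunk exposes a string `content` field, as in the output above; `TextChunk`, `fakeTokenStream`, and `collectText` are hypothetical names, and the fake stream stands in for a real model so the example runs on its own.

```typescript
// Simplified chunk shape (an assumption): real chunks carry more fields,
// and their content is not guaranteed to be a plain string.
interface TextChunk {
  content: string;
}

// Stand-in for `model.stream(...)`: yields one chunk per token.
async function* fakeTokenStream(tokens: string[]): AsyncGenerator<TextChunk> {
  for (const token of tokens) {
    yield { content: token };
  }
}

// Concatenate the content of every chunk into the complete response text.
async function collectText(stream: AsyncIterable<TextChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.content;
  }
  return text;
}
```

For example, `collectText(fakeTokenStream(["", "Sw", "imming", " in"]))` resolves to `"Swimming in"`. Empty chunks, like the first one in the output above, contribute nothing to the result.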


## Stream events

Chat models also support the standard [streamEvents()](https://v02.api.js.langchain.com/classes/langchain_core_runnables.Runnable.html#streamEvents) method.

This method is useful if you're streaming output from a larger LLM application that contains multiple steps (e.g., a chain composed of a prompt, chat model and parser).

// Counter used to cut the output short after the first few events.
let idx = 0;

const stream = model.streamEvents(
    "Write me a 1 verse song about goldfish on the moon",
    {
        version: "v2"
    }
);

for await (const event of stream) {
    idx += 1;
    // Stop once four events have been logged.
    if (idx === 5) {
        console.log("...Truncated");
        break;
    }
    console.log(event);
}

{
  event: 'on_chat_model_start',
  data: { input: 'Write me a 1 verse song about goldfish on the moon' },
  name: 'ChatOpenAI',
  tags: [],
  run_id: 'c9966059-70eb-4f24-9de3-2cf04320c8f6',
  metadata: {
    ls_provider: 'openai',
    ls_model_name: 'gpt-3.5-turbo',
    ls_model_type: 'chat',
    ls_temperature: 1,
    ls_max_tokens: undefined,
    ls_stop: undefined
  }
}
{
  event: 'on_chat_model_stream',
  data: {
    chunk: AIMessageChunk {
      lc_serializable: true,
      lc_kwargs: [Object],
      lc_namespace: [Array],
      content: '',
      name: undefined,
      additional_kwargs: {},
      response_metadata: [Object],
      id: 'chatcmpl-9lOQhe44ip2q0DHfr0eYU9TF4mHtu',
      tool_calls: [],
      invalid_tool_calls: [],
      tool_call_chunks: [],
      usage_metadata: undefined
    }
  },
  run_id: 'c9966059-70eb-4f24-9de3-2cf04320c8f6',
  name: 'ChatOpenAI',
  tags: [],
  metadata: {
    ls_provider: 'openai',
    ls_model_name: 'gpt-3.5-turbo',
    ls_model_type: 'cha
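
In a multi-step application, the event stream usually contains more than you want to display, so a typical pattern is to filter on the `event` field. The sketch below keeps only the token text from `on_chat_model_stream` events. The event shape is a simplified assumption based on the output above; `SimpleStreamEvent`, `fakeEvents`, and `chatModelTokens` are hypothetical names, and the fake event stream replaces a real `streamEvents` call so the example is self-contained.

```typescript
// Simplified event shape (an assumption), modeled on the logged events above.
interface SimpleStreamEvent {
  event: string;
  name: string;
  data: { input?: string; chunk?: { content: string } };
}

// Stand-in for `model.streamEvents(...)`: one start event, then two chunks.
async function* fakeEvents(): AsyncGenerator<SimpleStreamEvent> {
  yield { event: "on_chat_model_start", name: "ChatOpenAI", data: { input: "hi" } };
  yield { event: "on_chat_model_stream", name: "ChatOpenAI", data: { chunk: { content: "Gold" } } };
  yield { event: "on_chat_model_stream", name: "ChatOpenAI", data: { chunk: { content: "fish" } } };
}

// Yield only the token text carried by chat model stream events.
async function* chatModelTokens(
  events: AsyncIterable<SimpleStreamEvent>
): AsyncGenerator<string> {
  for await (const e of events) {
    if (e.event === "on_chat_model_stream" && e.data.chunk) {
      yield e.data.chunk.content;
    }
  }
}
```

Iterating over `chatModelTokens(fakeEvents())` with `for await` yields `"Gold"` and then `"fish"`, skipping the start event. The same filter could also check `name` or `tags` to single out one model when several steps emit events.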

## Next steps

You've now seen a few ways you can stream chat model responses.

Next, check out this guide for more on [streaming with other LangChain modules](/docs/how_to/streaming).