A tiny, zero-dependency TypeScript library that parses streaming responses from OpenAI, Anthropic, Google Gemini, and Ollama into a single, clean, normalized async iterator.
Write your streaming UI / agent loop once — swap providers without touching the consumer.
## Installation

```bash
npm install llmstream
```

Requires a runtime with `fetch`, `ReadableStream`, and `TextDecoder` (Node 18+, Bun, Deno, or any modern browser).
## Quick start

```ts
import { streamLLM } from 'llmstream'

const response = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello' }],
    stream: true,
  }),
})

for await (const event of streamLLM(response, { provider: 'openai' })) {
  if (event.type === 'delta') process.stdout.write(event.text)
  if (event.type === 'tool_call') console.log(event.name, event.args)
  if (event.type === 'finish') console.log('done:', event.reason)
  if (event.type === 'usage') console.log('tokens:', event.totalTokens)
}
```

The exact same loop works with `provider: 'anthropic'`, `'google'`, or `'ollama'`.
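Because every provider is normalized to the same event shape, downstream helpers only need to be written once. As a minimal, self-contained sketch (the `LLMEvent` subset and the `collectText` helper below are illustrative, not exports of the library):

```ts
// Minimal subset of the normalized event union, reproduced for the sketch.
type LLMEvent =
  | { type: 'delta'; text: string }
  | { type: 'finish'; reason: string }

// Accumulate all text deltas from any normalized stream into one string.
async function collectText(events: AsyncIterable<LLMEvent>): Promise<string> {
  let text = ''
  for await (const event of events) {
    if (event.type === 'delta') text += event.text
  }
  return text
}

// A stand-in stream; in real code this would be streamLLM(response, { ... }),
// regardless of which provider produced the response.
async function* fakeStream(): AsyncGenerator<LLMEvent> {
  yield { type: 'delta', text: 'Hello, ' }
  yield { type: 'delta', text: 'world' }
  yield { type: 'finish', reason: 'stop' }
}

collectText(fakeStream()).then((t) => console.log(t)) // "Hello, world"
```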
## Event types

Every provider is normalized to this small, discriminated-union event type:

```ts
type LLMEvent =
  | { type: 'delta'; text: string }
  | { type: 'tool_call'; id: string; name: string; args: Record<string, unknown> }
  | { type: 'tool_call_delta'; id: string; argChunk: string }
  | { type: 'finish'; reason: 'stop' | 'tool_calls' | 'length' | 'content_filter' | 'error' }
  | { type: 'usage'; inputTokens: number; outputTokens: number; totalTokens: number }
  | { type: 'error'; message: string; raw?: unknown }
```

## Provider notes

- **OpenAI** — pass `stream_options: { include_usage: true }` to get a `usage` event. Tool-call argument fragments are emitted as `tool_call_delta` while they stream, and a single consolidated `tool_call` is emitted when the function call is complete.
- **Anthropic** — `stop_reason` is normalized (`end_turn`/`stop_sequence` → `stop`, `tool_use` → `tool_calls`, `max_tokens` → `length`, `refusal` → `content_filter`).
- **Google Gemini** — use the SSE-style endpoint (`...:streamGenerateContent?alt=sse`). `functionCall` parts are emitted as a single `tool_call`. Gemini doesn't provide tool-call ids, so a synthetic `gemini-tool-N` id is used.
- **Ollama** — uses newline-delimited JSON, not SSE; the library detects this automatically. Both `/api/generate` and `/api/chat` are supported.
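The `tool_call_delta` / `tool_call` split means a consumer that wants to inspect arguments as they stream has to buffer fragments per call id until the JSON is complete. A hedged sketch of one way to do that on the consumer side (`ToolCallBuffer` is illustrative, not part of the library):

```ts
// Buffers streamed argument fragments per tool-call id and parses the
// accumulated JSON on demand. Illustrative sketch only.
class ToolCallBuffer {
  private chunks = new Map<string, string>()

  // Append one argChunk from a tool_call_delta event.
  add(id: string, argChunk: string): void {
    this.chunks.set(id, (this.chunks.get(id) ?? '') + argChunk)
  }

  // Try to parse what has accumulated so far; returns undefined while
  // the fragment stream is still mid-object.
  finish(id: string): Record<string, unknown> | undefined {
    const raw = this.chunks.get(id)
    if (raw === undefined) return undefined
    try {
      return JSON.parse(raw) as Record<string, unknown>
    } catch {
      return undefined // JSON still incomplete
    }
  }
}

const buf = new ToolCallBuffer()
buf.add('call_1', '{"city":')
buf.add('call_1', '"Paris"}')
console.log(buf.finish('call_1')) // { city: 'Paris' }
```

In practice you can often skip this entirely and wait for the consolidated `tool_call` event, which already carries parsed `args`.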
## Error handling

Parsing errors (malformed JSON, mid-stream disconnects, non-2xx responses) are surfaced as `{ type: 'error' }` events rather than thrown — a single bad chunk won't tear down the whole stream. You can also pass an `onError` callback for logging:

```ts
for await (const event of streamLLM(response, {
  provider: 'openai',
  onError: (err) => console.error('llmstream:', err.message),
})) {
  /* ... */
}
```

## Why llmstream?

Every LLM provider invents its own streaming format, framing, and finish-reason vocabulary. llmstream collapses all of that into one iterator so the rest of your code can stay provider-agnostic.
- Zero dependencies — just the platform.
- Tiny — under 1 KLOC of source, fully tree-shakable.
- Strict TypeScript — discriminated unions throughout.
- Crash-resistant — errors are events, not exceptions.
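To make the framing problem concrete: OpenAI, Anthropic, and Gemini stream SSE (`data: {...}` lines with a terminator sentinel), while Ollama emits one JSON object per line. A simplified sketch of pulling JSON payloads out of either framing (this is illustrative, not the library's actual internals — real SSE events can span multiple lines):

```ts
// Extract JSON payload strings from a chunk of streamed text, handling
// both SSE framing ("data: {...}" lines, "[DONE]" sentinel) and
// newline-delimited JSON (one object per line). Simplified sketch.
function extractPayloads(chunk: string): string[] {
  const payloads: string[] = []
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim()
    if (trimmed === '') continue
    if (trimmed.startsWith('data:')) {
      const data = trimmed.slice('data:'.length).trim()
      if (data !== '[DONE]') payloads.push(data) // SSE frame
    } else if (trimmed.startsWith('{')) {
      payloads.push(trimmed) // NDJSON frame
    }
  }
  return payloads
}

console.log(extractPayloads('data: {"a":1}\ndata: [DONE]\n')) // [ '{"a":1}' ]
console.log(extractPayloads('{"b":2}\n{"c":3}\n')) // [ '{"b":2}', '{"c":3}' ]
```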
## Examples

The examples/ folder ships three runnable demos. Build first so dist/ exists:

```bash
npm install
npm run build
```

Provider-specific scripts:

```bash
OPENAI_API_KEY=sk-... node examples/openai-example.js
ANTHROPIC_API_KEY=sk-... node examples/anthropic-example.js
```

Unified CLI demo — works with all four providers:

```bash
OPENAI_API_KEY=sk-... node examples/cli-demo.js openai "Write a haiku about streams"
ANTHROPIC_API_KEY=sk-... node examples/cli-demo.js anthropic "Explain SSE in one sentence"
GOOGLE_API_KEY=... node examples/cli-demo.js google "What is async iteration?"
node examples/cli-demo.js ollama "Tell me a joke"
```

The CLI demo prints text deltas inline, surfaces tool calls, usage, and finish reason, and reports time-to-first-byte / total streaming time — useful for sanity-checking that your normalized iterator is doing what you expect against a live endpoint.
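The timing the CLI demo reports can be approximated for any stream by wrapping the iterator. A hypothetical sketch (the `timed` wrapper below is not part of the library):

```ts
// Wrap an async iterable and report time-to-first-item and total time
// via a caller-supplied callback, re-yielding items unchanged.
async function* timed<T>(
  source: AsyncIterable<T>,
  report: (label: string, ms: number) => void,
): AsyncGenerator<T> {
  const start = Date.now()
  let first = true
  for await (const item of source) {
    if (first) {
      report('time-to-first-byte', Date.now() - start)
      first = false
    }
    yield item
  }
  report('total streaming time', Date.now() - start)
}

// Usage sketch:
// for await (const event of timed(streamLLM(response, { provider: 'openai' }), console.log)) { ... }
```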
## License

MIT