Headless React (and React Native) primitives for streaming LLM responses. Provider-agnostic, UI-agnostic, with first-class support for partial JSON streaming so tool-call arguments and structured outputs render as they arrive.
Most React AI libraries either ship opinionated UI (chat bubbles, prebuilt panels) or couple tightly to one provider. react-partial-stream is just hooks and types: feed it a stream of chunks, get back React state you can render however you want.
The wedge: partial JSON parsing. While a tool call's arguments are still streaming in, you can already read the partially parsed object — type-safe, with `isPartial` flags so you know what's settled.
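The mechanics are easiest to see outside React. Below is a minimal sketch of the idea (illustrative only, not the library's actual `parsePartialJSON` implementation): repair a truncated JSON prefix by closing any open strings and brackets, then back off one character at a time until the repaired text parses.

```typescript
// Close whatever is still open at the end of a truncated JSON prefix.
function repair(prefix: string): string {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (const ch of prefix) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  return prefix + (inString ? '"' : "") + closers.reverse().join("");
}

function parsePartialJSONSketch(input: string): unknown {
  // O(n^2) worst case; fine for a sketch. A real parser works incrementally.
  for (let end = input.length; end > 0; end--) {
    try {
      return JSON.parse(repair(input.slice(0, end)));
    } catch {
      // The tail ended mid-token (e.g. a dangling comma or half a key);
      // drop one character and retry.
    }
  }
  return undefined;
}

console.log(parsePartialJSONSketch('{"items": ["a", "b'));
// parses to { items: ["a", "b"] }
```

The backoff loop is what lets a dangling `,` or a half-streamed object key degrade gracefully to the last complete value instead of throwing.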
```sh
npm install react-partial-stream
```

The `useStreamingMessage` hook takes any `AsyncIterable<StreamChunk>` and gives you a React-state view of the assistant's response:

```tsx
import { useStreamingMessage } from "react-partial-stream";
import type { StreamSource } from "react-partial-stream";

function Assistant({ stream }: { stream: StreamSource }) {
  const { message, isStreaming } = useStreamingMessage(stream);
  return (
    <div>
      {message.content.map((block, i) => {
        if (block.type === "text") return <p key={i}>{block.text}</p>;
        if (block.type === "thinking") return <pre key={i}>{block.text}</pre>;
        if (block.type === "tool-call") {
          return (
            <pre key={i}>
              {block.name}({JSON.stringify(block.args)})
              {block.isPartial && " (streaming…)"}
            </pre>
          );
        }
        return null;
      })}
      {isStreaming && <span aria-label="streaming">▍</span>}
    </div>
  );
}
```

You bring the stream. The reference adapters convert a provider SDK's stream into the `StreamChunk` shape the hooks consume. Below is the OpenAI flavor running in a React Server Component (so the API key stays on the server):
```tsx
import OpenAI from "openai";
import { fromOpenAIStream } from "./adapters/openai"; // copy from examples/adapters/

const client = new OpenAI();

async function* getStream() {
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini",
    stream: true,
    messages: [{ role: "user", content: "Hello!" }],
  });
  yield* fromOpenAIStream(completion);
}

export default function Page() {
  return <Assistant stream={getStream()} />;
}
```

For browser-side rendering, proxy the request through your backend and parse the chunks back into the same `AsyncIterable` shape — the hook doesn't care where the stream came from.
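A sketch of that browser-side step, assuming a backend that re-emits each chunk as one NDJSON line. The `/api/chat` endpoint and the `StreamChunk` stand-in type here are hypothetical; use the types the library exports.

```typescript
type StreamChunk = { type: string; [key: string]: unknown };

// Turn a fetch Response body into the AsyncIterable the hooks consume,
// assuming one JSON-encoded StreamChunk per newline-delimited line.
async function* ndjsonChunks(
  body: ReadableStream<Uint8Array>
): AsyncIterable<StreamChunk> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      let newline: number;
      while ((newline = buffer.indexOf("\n")) !== -1) {
        const line = buffer.slice(0, newline).trim();
        buffer = buffer.slice(newline + 1);
        if (line) yield JSON.parse(line) as StreamChunk;
      }
    }
    buffer += decoder.decode(); // flush any buffered multi-byte sequence
    const tail = buffer.trim();
    if (tail) yield JSON.parse(tail) as StreamChunk;
  } finally {
    reader.releaseLock();
  }
}

// Usage (hypothetical endpoint):
// const res = await fetch("/api/chat", { method: "POST", body: JSON.stringify({ messages }) });
// <Assistant stream={ndjsonChunks(res.body!)} />
```

Note the buffering: a JSON line can be split across network reads, so chunks are only parsed once a full line has arrived.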
For typed JSON outputs (function-call args, structured generation), `useStructuredOutput` gives you a typed value that fills in as the model emits tokens:

```ts
const { value, isPartial } = useStructuredOutput<{ items: string[] }>(stream);
// value?.items renders progressively: ["a"], ["a","b"], … as JSON arrives.
// isPartial flips to false once the stream emits its terminal finish chunk.
```

- `useStreamingMessage(stream, signal?)` — accumulate chunks into an `AssistantMessage` with text, thinking, and tool-call blocks
- `useToolCall(message, id)` — selector for a single tool call's state
- `useStructuredOutput<T>(stream, signal?)` — typed partial JSON value that fills in as it streams
- `parsePartialJSON(input)` — the underlying parser, exported for direct use
Both stream hooks return `finishReason: "stop" | "length" | "tool_use" | undefined`. It stays `undefined` while streaming and is set when the stream emits its terminal finish chunk, so you can distinguish a clean completion from a token-limit truncation or a tool-call handoff. If the stream errored, `error` is set and `finishReason` stays `undefined` — the two are mutually exclusive.

```tsx
const { message, isStreaming, finishReason, error } = useStreamingMessage(stream);

if (error) return <Error message={error.message} />;
if (!isStreaming && finishReason === "length") return <Truncated message={message} />;
if (!isStreaming && finishReason === "tool_use") return <RunTool message={message} />;
```

Pass an `AbortSignal` to stop a stream from outside the component. Unmounting also cancels — both paths call `iter.return()` on the iterator so producers get a chance to release resources.

```tsx
const controller = useMemo(() => new AbortController(), []);
const { message } = useStreamingMessage(stream, controller.signal);
// later: controller.abort();
```

react-partial-stream doesn't talk to any LLM directly. You bring the stream — official adapters for Anthropic, OpenAI, etc. are planned as separate packages so the core stays tiny and dependency-free.
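To see why the `iter.return()` call on cancellation matters, here is a toy producer (not part of the library) where an async generator's `finally` block releases a fake resource; abort and unmount both end up triggering it:

```typescript
// An async generator's finally block runs when the consumer stops early,
// giving the producer a place to clean up (close sockets, abort fetches).
async function* producerWithCleanup(log: string[]) {
  log.push("open"); // pretend this acquired a connection
  try {
    let i = 0;
    while (true) {
      yield { type: "text-delta", text: `tok${i++}` };
    }
  } finally {
    // Runs on normal completion, on error, and on iter.return().
    log.push("closed");
  }
}

const log: string[] = [];
const iter = producerWithCleanup(log);
await iter.next();            // pulls the first chunk; log: ["open"]
await iter.return(undefined); // what the hook does on unmount/abort
// log is now ["open", "closed"]
```

If a producer never uses `try`/`finally` (or `using`-style disposal), early cancellation simply abandons it, so wrap resource acquisition this way in any custom adapter.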
Reference adapters for OpenAI and Anthropic live in `examples/adapters/` — copy them into your project or use them as templates for other providers. They map each provider's native chunk type to the `StreamChunk` shape the hooks consume.
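For a sense of what such an adapter does, here is a trimmed-down sketch against the OpenAI chat-completions chunk format. The `StreamChunk` variants are illustrative stand-ins, not the library's actual types, and `fromOpenAIStreamSketch` is not the reference adapter itself:

```typescript
type StreamChunk =
  | { type: "text-delta"; text: string }
  | { type: "finish"; reason: "stop" | "length" | "tool_use" };

// Minimal structural type for an OpenAI streaming chunk (subset of fields).
type OpenAIChunk = {
  choices: {
    delta: { content?: string | null };
    finish_reason?: "stop" | "length" | "tool_calls" | null;
  }[];
};

// Map each provider-native chunk to zero or more StreamChunks.
async function* fromOpenAIStreamSketch(
  stream: AsyncIterable<OpenAIChunk>
): AsyncIterable<StreamChunk> {
  for await (const chunk of stream) {
    const choice = chunk.choices[0];
    if (!choice) continue;
    if (choice.delta.content) {
      yield { type: "text-delta", text: choice.delta.content };
    }
    if (choice.finish_reason) {
      yield {
        type: "finish",
        reason:
          choice.finish_reason === "tool_calls" ? "tool_use" : choice.finish_reason,
      };
    }
  }
}
```

The pattern generalizes: an adapter is just an async generator that normalizes vocabulary (e.g. OpenAI's `tool_calls` finish reason to `tool_use`) and drops fields the hooks don't need.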
The hooks are platform-agnostic — they only use React core, `AbortController`, and async iterators, all supported by Hermes. Render to `<View>`/`<Text>` instead of the web tags shown in the examples.
The catch is getting a stream into RN: `fetch` on RN doesn't support streaming response bodies out of the box, so you'll typically need a polyfill like react-native-fetch-api or proxy through your backend. Once you produce an `AsyncIterable<StreamChunk>`, the hooks work identically.
MIT