jsonriver - Python Streaming JSON Parser

Parse JSON incrementally as it streams in, e.g. from a network request or a language model. Gives you a sequence of increasingly complete values.

This is a Python port of the TypeScript jsonriver library.

Features

Incremental parsing: Get progressively complete JSON values as data arrives
Zero dependencies: Uses only Python standard library
Fully typed: Complete type hints with mypy strict mode compliance
Memory efficient: Reuses objects and arrays when possible
Correct: Final result matches json.loads(), except that numbers are returned as float (see Type Definitions)
Fast: Optimized for performance with minimal overhead

Installation

From PyPI (recommended)

Using uv:

uv add jsonriver

Using pip:

pip install jsonriver

From source

Using uv:

git clone https://github.com/chrisschnabl/streamjson.git
cd streamjson
uv pip install -e .

Using pip:

git clone https://github.com/chrisschnabl/streamjson.git
cd streamjson
pip install -e .

Usage

import asyncio
import json
from jsonriver import parse


async def make_stream(text: str, chunk_size: int):
    """Simulate a streaming source"""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]


async def main():
    json_str = '{"name": "Alice", "age": 30}'

    stream = make_stream(json_str, chunk_size=3)
    async for value in parse(stream):
        print(json.dumps(value))
    # Output shows incremental results:
    # {}
    # {"name": "Al"}
    # {"name": "Alice"}
    # {"name": "Alice", "age": 30.0}


asyncio.run(main())

How it Works

jsonriver yields a sequence of increasingly complete JSON values. Consider this JSON:

{"name": "Alex", "keys": [1, 20, 300]}

If you parse this one character at a time, it yields:

{}
{"name": ""}
{"name": "A"}
{"name": "Al"}
{"name": "Ale"}
{"name": "Alex"}
{"name": "Alex", "keys": []}
{"name": "Alex", "keys": [1]}
{"name": "Alex", "keys": [1, 20]}
{"name": "Alex", "keys": [1, 20, 300]}
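Because a string value is only ever replaced by a longer version of itself, a consumer such as a chat UI can print just the newly arrived characters. The following is a minimal sketch of that idea, not part of jsonriver: `new_suffixes` and the hard-coded `SNAPSHOTS` list (copied from the output above) are illustrative names.

```python
from typing import Any, Iterable, Iterator


def new_suffixes(snapshots: Iterable[dict[str, Any]], key: str) -> Iterator[str]:
    """Yield only the newly arrived characters of the string stored under `key`."""
    seen = 0  # number of characters already emitted
    for snapshot in snapshots:
        text = snapshot.get(key)
        if not isinstance(text, str):
            continue  # the key has not appeared yet
        if len(text) > seen:
            yield text[seen:]
            seen = len(text)


# Snapshots taken from the listing above (hard-coded for illustration).
SNAPSHOTS = [
    {},
    {"name": ""},
    {"name": "A"},
    {"name": "Al"},
    {"name": "Ale"},
    {"name": "Alex"},
    {"name": "Alex", "keys": []},
]

print(list(new_suffixes(SNAPSHOTS, "name")))  # ['A', 'l', 'e', 'x']
```

The helper tracks how many characters it has emitted rather than holding on to the previous snapshot, so it should also work if the parser hands back the same dictionary object each time.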

Invariants

The library maintains these guarantees:

Type stability: Later versions of a value always have the same type (a string never becomes an array)
Atomic values: null, true, false, and numbers are only yielded when complete
String growth: Strings may be replaced with longer versions
Array append-only: Arrays are only modified by appending or by mutating the last element
Object append-only: Objects are only modified by adding properties or by mutating the last one
Complete keys: Object properties are only added once the key is complete and the type of the value is known
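The guarantees listed above can be written down as a check between two consecutive snapshots. This is a sketch of one possible reading of the invariants, not code from the library; `is_valid_successor` is a hypothetical helper. If the parser reuses objects between yields, `prev` would need to be a copy (for example via `copy.deepcopy`) taken before the next value arrives.

```python
from typing import Any


def is_valid_successor(prev: Any, curr: Any) -> bool:
    """Return True if `curr` could follow `prev` under the invariants above."""
    # Atomic values are only yielded when complete, so they never change.
    if prev is None or isinstance(prev, (bool, int, float)):
        return type(prev) is type(curr) and prev == curr
    # Strings may only grow.
    if isinstance(prev, str):
        return isinstance(curr, str) and curr.startswith(prev)
    # Arrays: append, or mutate the last element.
    if isinstance(prev, list):
        if not isinstance(curr, list) or len(curr) < len(prev):
            return False
        if not prev:
            return True
        head = len(prev) - 1
        if prev[:head] != curr[:head]:
            return False
        return is_valid_successor(prev[head], curr[head])
    # Objects: add properties, or mutate the most recently added one.
    if isinstance(prev, dict):
        if not isinstance(curr, dict) or len(curr) < len(prev):
            return False
        prev_keys = list(prev)
        if list(curr)[: len(prev_keys)] != prev_keys:
            return False
        if not prev_keys:
            return True
        *finished, last = prev_keys
        if any(prev[k] != curr[k] for k in finished):
            return False
        return is_valid_successor(prev[last], curr[last])
    return False


steps = [
    {},
    {"name": ""},
    {"name": "Alex"},
    {"name": "Alex", "keys": []},
    {"name": "Alex", "keys": [1]},
    {"name": "Alex", "keys": [1, 20]},
]
print(all(is_valid_successor(a, b) for a, b in zip(steps, steps[1:])))  # True
print(is_valid_successor({"name": "Al"}, {"name": []}))  # False: type changed
```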

Error Handling

The parser raises ValueError for invalid JSON, matching json.loads() behavior:

async def example_error():
    try:
        stream = make_stream('{"invalid": }', 1)
        async for value in parse(stream):
            print(value)
    except ValueError as e:
        print(f"Parse error: {e}")
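When the input turns out to be malformed partway through, a caller may still want the last snapshot that was yielded before the error. The sketch below shows one way to keep it. `consume_with_fallback` is a hypothetical helper, and `failing_snapshots` is a stand-in for `parse(stream)` with made-up values and a made-up error message, so that the example runs without the library installed. Whether a partial value is safe to use after an error is an application-level decision.

```python
import asyncio
from typing import Any, AsyncIterator, Optional, Tuple


async def consume_with_fallback(
    values: AsyncIterator[Any],
) -> Tuple[Any, Optional[ValueError]]:
    """Return (last value seen, error or None) from a stream of snapshots."""
    last: Any = None
    try:
        async for value in values:
            last = value
    except ValueError as exc:
        return last, exc
    return last, None


async def failing_snapshots() -> AsyncIterator[Any]:
    """Stand-in for parse(stream) on malformed input (hypothetical values)."""
    yield {}
    yield {"items": [1.0]}
    raise ValueError("unexpected character")  # made-up message


value, error = asyncio.run(consume_with_fallback(failing_snapshots()))
print(value, error)
```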

Development

Setup

# Create virtual environment and install dependencies
uv venv
uv pip install -e ".[dev]"

Testing

# Run all tests
python -m pytest tests/ -v

# Run specific test file
python -m pytest tests/test_parse.py -v

# Run with coverage
python -m pytest tests/ --cov=src/jsonriver

Type Checking

# Check types with mypy
mypy src/jsonriver --strict

Running Examples

python example_jsonriver.py

Project Structure

src/jsonriver/
  __init__.py       # Public API exports
  parse.py          # JSON parser implementation
  tokenize.py       # JSON tokenizer implementation

tests/
  test_parse.py     # Parser tests
  test_tokenize.py  # Tokenizer tests
  test_cross_validate.py  # Cross-validation with TypeScript
  utils.py          # Test utilities

bench/
  python-bench.py   # Full file parsing benchmarks
  streaming-bench.py # Streaming parsing benchmarks
  README.md         # Benchmark results and analysis

API Reference

`parse(stream: AsyncIterator[str]) -> AsyncIterator[JsonValue]`

Incrementally parse a single JSON value from the given async iterator of string chunks.

Parameters:

stream: An async iterator that yields string chunks containing JSON data

Yields:

Increasingly complete JSON values as more input is parsed

Raises:

ValueError: If the input is not valid JSON
RuntimeError: For internal parsing errors

Example:

async def parse_json():
    json_str = '{"a": 1, "b": 2}'

    async def stream():
        for char in json_str:
            yield char

    async for value in parse(stream()):
        print(value)
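Since `parse` expects string chunks, a source that delivers bytes (a socket or an HTTP response body, for instance) has to be decoded first. Decoding each chunk on its own can fail when a multi-byte UTF-8 character is split across two chunks, so the sketch below uses the standard library's incremental decoder. `decode_chunks` is a hypothetical adapter, not part of jsonriver.

```python
import asyncio
import codecs
from typing import AsyncIterator


async def decode_chunks(byte_chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Turn UTF-8 byte chunks into str chunks without splitting characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    async for chunk in byte_chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush; raises UnicodeDecodeError if the input ends mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


async def demo() -> list[str]:
    data = '{"city": "Zürich"}'.encode("utf-8")

    async def byte_stream() -> AsyncIterator[bytes]:
        # A chunk size of 4 splits the two-byte "ü" across two chunks.
        for i in range(0, len(data), 4):
            yield data[i:i + 4]

    return [text async for text in decode_chunks(byte_stream())]


chunks = asyncio.run(demo())
print("".join(chunks))
```

With jsonriver installed, the adapter's output could be passed on directly, as in `parse(decode_chunks(byte_stream()))`.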

Type Definitions

JsonValue = Union[
    None,
    bool,
    float,
    str,
    list['JsonValue'],
    dict[str, 'JsonValue']
]

JsonObject = dict[str, JsonValue]
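Code that receives a `JsonValue` usually has to narrow it with `isinstance` before use. The sketch below repeats the alias from above so that it runs on its own; `json_type_name` is a hypothetical helper that maps a value to its JSON type name. It checks `bool` before numbers because `bool` is a subclass of `int` in Python, and it also accepts `int` in case a value came from another source such as `json.loads`.

```python
from typing import Union

# Same alias as above, repeated so the sketch is self-contained.
JsonValue = Union[
    None,
    bool,
    float,
    str,
    list["JsonValue"],
    dict[str, "JsonValue"],
]


def json_type_name(value: JsonValue) -> str:
    """Return the JSON type name for a parsed value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must come before the number check
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


print(json_type_name({"age": 30.0}))  # object
print(json_type_name(30.0))  # number
```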

Performance

jsonriver is optimized for streaming scenarios, not batch parsing:

Time-to-first-value: 25x faster than json.loads when data arrives in chunks
Progressive updates: Provides 300+ incremental updates for large files
User responsiveness: Shows partial results immediately vs waiting for complete data
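Time-to-first-value can be measured for any stream of snapshots by recording when the first value arrives and when the stream ends. The sketch below is not the project's benchmark and does not reproduce the figures above; `first_and_last_latency` is a hypothetical helper, and `slow_snapshots` is a stand-in for `parse(stream)` over a slow network source.

```python
import asyncio
import time
from typing import Any, AsyncIterator, Optional, Tuple


async def first_and_last_latency(values: AsyncIterator[Any]) -> Tuple[float, float]:
    """Return (seconds until first value, seconds until the stream ended)."""
    start = time.perf_counter()
    first: Optional[float] = None
    async for _ in values:
        if first is None:
            first = time.perf_counter() - start
    total = time.perf_counter() - start
    if first is None:
        raise ValueError("stream produced no values")
    return first, total


async def slow_snapshots() -> AsyncIterator[Any]:
    """Stand-in for parse(stream): one snapshot per simulated network delay."""
    for snapshot in ({}, {"a": 1.0}, {"a": 1.0, "b": 2.0}):
        await asyncio.sleep(0.01)
        yield snapshot


first, total = asyncio.run(first_and_last_latency(slow_snapshots()))
print(f"first value after {first:.3f}s, complete after {total:.3f}s")
```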

Benchmarks

# Full file parsing comparison
python bench/python-bench.py

# Streaming scenario comparison
python bench/streaming-bench.py

Full file parsing: json.loads is ~35x faster (expected, as it's C-based)
Streaming parsing: jsonriver is ~25x faster to first value (the key advantage)

See bench/README.md for detailed benchmark results and analysis.

Use Cases

Streaming APIs: Parse JSON from network requests as data arrives
Large payloads: Start processing data before the complete response has arrived
Real-time UIs: Update UI as JSON parses
LLM responses: Parse structured output from language models
Progress indicators: Show parsing progress to users
Server-sent events: Handle JSON in SSE streams

Comparison with Alternatives

| Feature | jsonriver | json.loads | ijson |
| --- | --- | --- | --- |
| Incremental parsing | ✅ | ❌ | ✅ |
| Complete values | ✅ | ✅ | ❌ |
| No dependencies | ✅ | ✅ | ❌ |
| Type hints | ✅ | ✅ | ❌ |
| Memory efficient | ✅ | ❌ | ✅ |

License

BSD-3-Clause License

See LICENSE file for full license text.

Credits

This is a Python port of the excellent jsonriver TypeScript library by Peter Burns (@rictic).

Contributing

Contributions are welcome! Please ensure:

All tests pass: pytest tests/ -v
Type checking passes: mypy src/jsonriver --strict
Code follows existing style
New features include tests

Changelog

0.0.1 (2024)

Initial Python port from TypeScript
Full type hints with mypy strict mode
Comprehensive test suite (37 tests)
Complete documentation
Zero dependencies
