A fast, Rust-powered Apache Thrift toolkit for Python.
thriftrs2 brings native Rust performance to Python Thrift workflows through PyO3. It provides an end-to-end toolkit: parse .thrift IDL files, serialize and deserialize structs, and run RPC clients and servers — all with a Python-first API that feels idiomatic.
Status: alpha. The core serialization, IDL parsing, and RPC paths are stable enough for evaluation; the API may still shift before 1.0.
Most existing Python Thrift libraries run serialization in pure Python, which becomes the bottleneck under load. thriftrs2 replaces the hot path (IDL parsing, ser/de, RPC framing) with compiled Rust, while keeping the user-facing API in Python where flexibility matters.
| Category | What's included |
|---|---|
| IDL parser | struct, service, enum, union, exception, const, typedef, include, namespace, throws, annotations, field defaults, extends inheritance |
| Protocols | Binary, Compact, JSON (TJSON field-id format) |
| Serialization | serialize / deserialize for structs; dumps / loads for JSON text |
| Transports | TBufferedTransport, TFramedTransport |
| RPC client | Context-manager based, sync call() with automatic request/response framing |
| RPC server | Multi-threaded (configurable workers), sync handler dispatch, oneway support |
| Compatibility | Reads thriftpy2 JSON envelopes; runs structured benchmarks against thriftpy2 |
```shell
pip install thriftrs2
```

Requires Python ≥ 3.9.
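The quickstart below loads an `example.thrift`. A minimal IDL along these lines would exercise it; the field names and types here are assumptions chosen to match the snippets, not necessarily the shipped sample:

```thrift
namespace py example

struct User {
  1: required i32 id,
  2: required string name,
  3: optional string email,
  4: optional i32 age,
}

exception UserNotFound {
  1: string message,
}

service UserService {
  User get_user(1: i32 user_id) throws (1: UserNotFound not_found),
}
```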
```python
from thriftrs2 import load

mod = load("example.thrift")
# mod.User → struct type
# mod.UserService → service type
```

```python
from thriftrs2 import serialize, deserialize

user = {"id": 1, "name": "Alice", "email": "alice@example.com", "age": 30}
blob = serialize(mod.User, user)
restored = deserialize(mod.User, blob)
assert restored == user
```

Protocol selection:

```python
from thriftrs2 import ProtocolType

blob = serialize(mod.User, user, proto=ProtocolType.Compact)
restored = deserialize(mod.User, blob, proto=ProtocolType.Compact)
```

JSON helpers:

```python
from thriftrs2 import dumps, loads

text = dumps(mod.User, user)
restored = loads(mod.User, text)
```

```python
from thriftrs2 import make_client, TBufferedTransport, ProtocolType

with make_client(
    mod.UserService,
    "127.0.0.1", 9090,
    TBufferedTransport.transport_type,
    protocol=ProtocolType.Binary,
) as client:
    user = client.call("get_user", user_id=1)
```

```python
from thriftrs2 import make_server, TBufferedTransport, ProtocolType

class Handler:
    def get_user(self, user_id):
        return mod.User(id=user_id, name="Alice", email="alice@example.com", age=30)

server = make_server(
    mod.UserService, Handler(),
    transport=TBufferedTransport.transport_type,
    protocol=ProtocolType.Binary,
    workers=4,
)
server.serve_forever("127.0.0.1", 9090)
```

```text
┌──────────────────────────────────────────┐
│                Python API                │
│  load()  serialize()  make_client() ...  │
└──────────────┬───────────────────────────┘
               │ PyO3
┌──────────────┴───────────────────────────┐
│                Rust Core                 │
│ ┌──────────┐ ┌──────────┐ ┌───────────┐  │
│ │  Parser  │ │ Protocol │ │  Python   │  │
│ │  (nom)   │ │ (bin/cmp │ │ bindings  │  │
│ │          │ │  /json)  │ │           │  │
│ └──────────┘ └──────────┘ └───────────┘  │
│ ┌──────────────────────────────────────┐ │
│ │       Client / Server (tokio)        │ │
│ └──────────────────────────────────────┘ │
└──────────────────────────────────────────┘
```
- Parser — Nom-based `.thrift` IDL parser producing an AST
- Protocol — Binary, Compact, and JSON read/write with correct framing
- Client/Server — Tokio-powered async I/O behind a sync Python API
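The framing mentioned above is worth seeing concretely. TFramedTransport prefixes every message with a 4-byte big-endian length; the sketch below illustrates the standard Thrift framing scheme in pure Python, not thriftrs2's internal Rust code:

```python
import struct

def frame(payload: bytes) -> bytes:
    # TFramedTransport framing: a 4-byte big-endian length prefix, then the payload
    return struct.pack(">I", len(payload)) + payload

def unframe(data: bytes) -> bytes:
    # Read the prefix, then slice out exactly that many payload bytes
    (length,) = struct.unpack(">I", data[:4])
    return data[4:4 + length]

assert unframe(frame(b"hello")) == b"hello"
```

Framing lets the server know message boundaries up front, which is why the framed rows in the benchmarks below can read each request in a single sized recv.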
```text
thriftrs2/
├── src/
│   ├── lib.rs            # PyO3 module entry point
│   ├── parser/           # IDL parser (lexer, AST, grammar)
│   ├── protocol/         # Binary, Compact, JSON ser/de
│   └── python/           # PyO3 bindings (client, server, types, parser wrappers)
├── python/
│   └── thriftrs2/        # Python package layer
│       ├── __init__.py   # Public API re-exports
│       ├── loader.py     # load(), make_client(), make_server()
│       └── protocol.py   # Python-side protocol helpers
├── examples/             # Runnable examples & benchmarks
│   ├── example.thrift    # Sample IDL
│   ├── test.py           # Struct round-trip
│   ├── test_protocols.py # Protocol comparison
│   ├── client_example.py # RPC client
│   ├── server_example.py # RPC server
│   ├── ocr_client.py     # Larger service client
│   ├── ocr_server.py     # Larger service server
│   ├── benchmark.py      # Serialization micro-benchmark
│   └── benchmark_all.py  # Full matrix: ser/de + RPC vs thriftpy2
├── python/tests/         # pytest + cargo test suites
├── docs/USER_GUIDE.md    # Detailed user guide
├── Cargo.toml            # Rust crate manifest (version source of truth)
├── pyproject.toml        # Python build config (maturin)
└── CHANGELOG.md          # Keep a Changelog
```
Results from `benchmark_all.py` (500 ser/de iterations, 1,000 RPC iterations, 50 warmup, 3 runs) on an AMD Ryzen 9950X3D. All comparisons are against thriftpy2.
Deserialize + to_dict(), ops/s (higher = better):
| Shape | Wire bytes (Bin/Cmp/JSON) | Binary | Compact | JSON | JSON vs tp2 |
|---|---|---|---|---|---|
| simple | 21 / 11 / 36 B | 1,610,845 | 1,501,299 | 1,155,703 | 3.3× |
| complex | 641 / 460 / 986 B | 233,622 | 227,657 | 113,934 | 2.4× |
| large | 8.0 / 6.0 / 11.5 KB | 18,576 | 18,619 | 8,199 | 3.0× |
| xlarge | 65.8 / 47.6 / 100.5 KB | 2,072 | 2,111 | 796 | 2.2× |
Binary and Compact are neck-and-neck; Compact payloads are ~30% smaller. JSON deserialization uses a direct serde_json::Value → Python conversion path that skips the intermediate ThriftValue tree.
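Compact's size win comes largely from its zigzag + varint integer encoding, which spends bytes in proportion to magnitude rather than declared width. The sketch below illustrates the standard scheme in pure Python; it is not thriftrs2's Rust implementation:

```python
def zigzag32(n: int) -> int:
    # Interleave signs so small magnitudes map to small unsigned values:
    # 0 → 0, -1 → 1, 1 → 2, -2 → 3, ...
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF

def varint(n: int) -> bytes:
    # 7 payload bits per byte; the high bit marks "more bytes follow"
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)

# Binary protocol always spends 4 bytes on an i32; Compact spends 1 here
assert len(varint(zigzag32(30))) == 1
assert varint(zigzag32(-1)) == b"\x01"
```

Small field IDs, lengths, and enum values dominate typical payloads, which is where the ~30% shrinkage accumulates.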
JSON protocol end-to-end, ops/s (higher = better):

| Shape | Payload | Serialize | vs tp2 | Deserialize | vs tp2 |
|---|---|---|---|---|---|
| simple | 36 B | 3,108,918 ops/s | 10.1× | 1,143,568 ops/s | 3.3× |
| complex | ~1 KB | 131,788 ops/s | 5.6× | 76,620 ops/s | 2.4× |
| large | ~11 KB | 7,155 ops/s | 2.9× | 8,566 ops/s | 3.0× |
| xlarge | ~100 KB | 776 ops/s | 2.9× | 646 ops/s | 2.2× |
Serialization runs 2.9–10.1× faster. Deserialization leads 2.2–3.3× across all payload sizes, reversing the pre-optimization gap at large payloads.
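For quick local sanity checks, an ops/s harness in the spirit of these numbers can be as small as the sketch below. The `serialize(mod.User, user)` call in the comment refers to the quickstart above; any zero-argument callable works:

```python
import time

def ops_per_sec(fn, iterations=500, warmup=50):
    # Discard warmup iterations so allocator and cache effects settle first
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return iterations / (time.perf_counter() - start)

# e.g. ops_per_sec(lambda: serialize(mod.User, user))
```

For production-grade numbers, prefer the real `benchmark_all.py` with multiple runs, as noted in the limitations below.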
| Protocol | Transport | Conc=1 | Conc=4 | Conc=16 | Conc=64 |
|---|---|---|---|---|---|
| Binary | Buffered | 13,233 (547×) | 15,009 (155×) | 11,739 (31×) | 10,393 (7.1×) |
| Binary | Framed | 15,480 (1.36×) | 13,548 (1.59×) | 12,221 (1.69×) | 10,750 (1.40×) |
| JSON | Buffered | 1,335 (1.15×) | 1,457 (1.33×) | 1,147 (1.22×) | 1,161 (1.33×) |
| JSON | Framed | 1,454 (1.25×) | 1,442 (1.35×) | 1,209 (1.26×) | 1,158 (1.36×) |
Values are throughput in requests per second, with the speedup vs thriftpy2 in parentheses; both Buffered and Framed thriftpy2 baselines are compared. Binary Framed achieves the highest single-connection throughput for large payloads, and under concurrency both Buffered and Framed Binary sustain ~10K+ req/s. See the full matrix, including get_simple, get_complex, save_complex, and save_batch, by running:
```shell
# CI smoke (fast)
python examples/benchmark_all.py --ci-smoke

# Full matrix
python examples/benchmark_all.py \
    --ser-iterations 500 \
    --rpc-iterations 1000 \
    --warmup 50 \
    --rpc-concurrency 1 4 16 64 \
    --runs 3
```

These are tracked gaps, not permanent design decisions:
- JSON output envelope — Reads thriftpy2 JSON envelopes but does not yet emit them
- Exception types — Declared Thrift exceptions are decoded and raised as `RuntimeError` rather than dedicated Python exception classes
- Multi-file namespaces — `include` resolves types across files but does not yet enforce a full scoped namespace model for same-name types
- Benchmarks — Smoke mode is suitable for CI; production-grade numbers should use `--runs 3` with adequate warmup
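Until dedicated exception classes land, a caller-side workaround is to match on the `RuntimeError` message. This is a sketch of the pattern only; `UserNotFound` is a hypothetical declared exception name, not part of thriftrs2:

```python
def lookup_user(client, user_id):
    # Declared Thrift exceptions currently surface as RuntimeError
    # (see the limitations above), so match on the message for now.
    try:
        return client.call("get_user", user_id=user_id)
    except RuntimeError as exc:
        if "UserNotFound" in str(exc):  # hypothetical declared exception
            return None
        raise
```

Expect this pattern to break when typed exception classes ship, so keep the matching in one place.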
See open issues for the current backlog.
```shell
# Setup
pip install maturin pytest
maturin develop --release

# Run tests
python -m pytest -q
cargo test
cargo check

# Rebuild after Rust changes
maturin develop --release
```

Contributions are welcome. The project is early-stage, so please open an issue to discuss scope before investing in large changes.
- Fork the repository
- Create a feature branch
- Make your changes and add tests
- Run `python -m pytest -q && cargo test`
- Open a pull request
MIT — see LICENSE.