Highlights
This is a substantial reliability and feature release focused on cross-transport stability, full Kafka feature parity, a new task queue pattern, and large-scale code quality improvements.
Kafka transport - full RPC and Action support
- New `RPCService`, `RPCClient`, `RPCServer` for Kafka with permanent per-instance reply topics and correlation-ID routing for sub-second call latency. `ActionService`/`ActionClient` now pre-create all required topics in `__init__` so subscriptions never wait on auto-creation.
- New `subscribe()`/`publish()` primitives on `KafkaTransport`.
- Unique `group_id` per instance prevents consumer-group rebalance cascades on broker start; `offset_reset=latest` removes 10s session-timeout waits.
- Topics are pre-created via `AdminClient` to avoid 15s `wait_for_assignment` timeouts.
- Graceful `UNKNOWN_TOPIC_OR_PART` handling.
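
The correlation-ID routing mentioned above can be illustrated with a small in-memory sketch. It is not the library's implementation: `ReplyRouter`, its methods, and the `rpc.reply.<uuid>` topic naming are hypothetical, and no Kafka client is involved. It only shows the idea of one permanent reply topic per instance, with each reply matched to its pending call by ID.

```python
import threading
import uuid
from concurrent.futures import Future


class ReplyRouter:
    """Match replies arriving on one shared reply topic to pending calls."""

    def __init__(self) -> None:
        # Hypothetical naming scheme: one permanent reply topic per instance.
        self.reply_topic = f"rpc.reply.{uuid.uuid4().hex}"
        self._pending = {}  # correlation ID -> Future awaiting the reply
        self._lock = threading.Lock()

    def register(self):
        """Create a correlation ID and a future for one outgoing request."""
        correlation_id = uuid.uuid4().hex
        future = Future()
        with self._lock:
            self._pending[correlation_id] = future
        return correlation_id, future

    def dispatch(self, correlation_id, payload):
        """Resolve the matching call; return False for unknown or late replies."""
        with self._lock:
            future = self._pending.pop(correlation_id, None)
        if future is None:
            return False
        future.set_result(payload)
        return True


router = ReplyRouter()
cid, fut = router.register()
# In a real client, a consumer thread reading router.reply_topic would call
# dispatch() for every message it receives.
router.dispatch(cid, {"sum": 3})
print(fut.result(timeout=1.0))
```

Because the reply topic lives as long as the instance, no topic has to be created or subscribed to per call, which is presumably where the latency gain comes from.
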
New: Task queue endpoint pattern
- Adds a task queue pattern alongside Pub/Sub, RPC, and Action.
- Integration tests, benchmarks, examples, and API docs included.
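
For readers unfamiliar with the pattern, here is a standard-library sketch of task queue semantics, assuming the usual competing-consumers meaning: unlike Pub/Sub, where every subscriber sees every message, each task is handled by exactly one worker. The function name `run_task_queue` is hypothetical and this is not the library's API.

```python
import queue
import threading


def run_task_queue(tasks, worker_count=3):
    """Distribute tasks across workers; each task is handled exactly once."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    handled = []  # (worker_id, task) pairs, in completion order
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            with lock:
                handled.append((worker_id, task))

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


results = run_task_queue(range(10))
print(len(results))  # one entry per task, regardless of worker count
```
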
Transport reliability fixes
- AMQP: Resolved nested ioloop re-entry corrupting pika's `_tx_buffers` (CI benchmark failures); `_pika_call()` helper serialises ioloop access on shared connections; `Subscriber`/`RPCService` now default to dedicated connections to prevent concurrent ioloop access; `connection.close()` bounded by 5s daemon-thread timeout to prevent `node.stop()` hangs on FRAME_ERROR.
- Redis: `_set_connected(True/False)` correctly toggled in `connect()`/`stop()`/`_attempt_reconnect()`; `wait_connected()` no longer always times out (saved up to 50s for ActionClient with 5 sub-endpoints). `stop()` is now best-effort with each cleanup step guarded; pubsub reconnect loop checks `_stopped` at every await point.
- Kafka: `Subscriber.stop()` race with poll thread fixed via `_stop_event` pattern; `KafkaTransport.stop()` bounds `producer.flush()` and runs `Consumer.close()` in a daemon thread with 5s timeout (librdkafka has no native close timeout).
- Action: `_handle_get_result()` no longer asserts on `RUNNING` goals (was crashing RPC handler threads); `GoalHandler` uses `issubclass` for msg_type check so `self.data`/`self.result` are proper Pydantic models; `BaseActionClient.get_result` implements proper poll-until-terminal wait logic.
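
Both the AMQP and Kafka fixes bound a potentially blocking close call with a daemon-thread timeout. A minimal sketch of that general technique follows; `call_with_timeout` is a hypothetical helper name, not a function from this library, and the 5s default simply mirrors the figure quoted above.

```python
import threading


def call_with_timeout(fn, timeout=5.0):
    """Run fn() in a daemon thread; return True if it finished within timeout."""
    done = threading.Event()

    def runner():
        try:
            fn()
        except Exception:
            pass  # best-effort cleanup: errors during close are ignored here
        finally:
            done.set()

    # daemon=True means a stuck call cannot keep the interpreter alive at exit.
    threading.Thread(target=runner, daemon=True).start()
    return done.wait(timeout)


# A call that returns promptly is reported as finished.
print(call_with_timeout(lambda: None, timeout=1.0))
```

The caller gives up waiting after `timeout` seconds, but the blocked call itself is not cancelled; it is abandoned in its daemon thread, which is the trade-off of this approach.
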
Code quality
- Resolved 400+ pylint violations across core library, examples, and tests (PRs #69, #70).
- Full mypy `--check-untyped-defs` clean across the core library.
- Module and function docstrings added across 50 files.
- Fixed mutable default arguments, dangerous defaults, unused imports/variables, f-string logging, and protected-access violations.
- `max-line-length` standardised at 100.
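
As a generic illustration of two of the violation classes listed above (pylint's `dangerous-default-value` and `logging-fstring-interpolation`), the sketch below shows the usual fixes. The function `add_tag` and the logger name are made up for the example and do not come from this codebase.

```python
import logging

logger = logging.getLogger("example")


def add_tag(tag, tags=None):
    """Append tag to tags, creating a fresh list per call when none is given."""
    if tags is None:  # a `tags=[]` default would be shared across calls
        tags = []
    tags.append(tag)
    # %-style arguments are only formatted if the record is actually emitted,
    # unlike an f-string, which is always evaluated first.
    logger.debug("added tag %s (total %d)", tag, len(tags))
    return tags


first = add_tag("a")
second = add_tag("b")
print(first, second)  # two independent lists
```
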
Tooling and CI
- New `make ci`, `make ci-strict`, `make ci-full` targets; `test-all` target combining ci + integration.
- Added `mypy` typecheck targets to Makefile.
- Removed brittle benchmarks workflow; integration tests now cover the same surface.
- `scripts/run_all_broker_tests.py` rewritten with TCP probe + AdminClient readiness checks for Kafka, per-script log persistence under `logs/integration/`, partial stdout/stderr on timeout, and unbuffered (`python -u`) execution for live output.
- Restructured `docs/` directory with development, performance, and session-summary subfolders.
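
A TCP readiness probe of the kind mentioned for the broker test script can be sketched with the standard library alone. The function name `wait_for_port` and its defaults are assumptions for illustration, not the script's actual code.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connect to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)  # back off briefly before the next attempt
```

A successful connect only shows that something is listening on the port. For Kafka the listener can accept connections before the broker is able to serve requests, which is presumably why the script pairs the probe with an AdminClient check.
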
Documentation
- Updated API guide and README with task queue pattern and full transport feature matrix.
- README updated with CI commands and troubleshooting guidance.
Compatibility
- Python 3.9+ supported.
- Python 3.14 with AMQP (pika) remains incompatible; the relevant tests are auto-skipped.
- All Pydantic v2 conventions (`model_dump()`) preserved.
Full Changelog: v0.13.1...v0.13.2