v2.3.0
WS reliability + auth polish batch on top of v2.2.0's spec-required tightening.
The big-ticket items: per-sid bounded stash that closes a silent message-loss
window during reconnect bursts (#176), cooperative shutdown via
run_forever(stop_event=...) (#177), async RSA-PSS sign offload onto a
dedicated 2-worker executor so signs don't queue behind getaddrinfo during
reconnect storms (#178), and a run_forever() foot-gun fix that now raises
KalshiSubscriptionError instead of silently returning when no subscription
has landed (#175). Plus 226 spec-required fields tightened to non-optional
with hard-fail drift gates (#172 via #180), REST contract-map coverage
expanded from 49 to 91 entries (#171 via #181), the first two
server_omits_despite_required
exclusions for fields the live demo omits (#183), MessageQueue maxlen
defense-in-depth (#173), _to_decimal_* consolidation (#174), and a
pre-release docs audit sweep across the full mkdocs site (#179).
Soft-breaking at the response-parse boundary only (per #172): server
omission of a previously-optional spec-required field now raises
pydantic.ValidationError instead of silently producing field=None.
Wire format unchanged.
Pre-release docs audit (#179)
Release-prep sweep across all doc surfaces — README, mkdocs site, ROADMAP,
per-resource pages, and public-API docstrings. Findings compiled from a
six-way parallel audit of disjoint file partitions, then triaged:
- ROADMAP.md: "Open trackers" section dropped; #45 and #53 are closed,
  and #106 was a PR (not an issue) whose remaining sub-items all shipped
  in this batch. "Next milestone" carry-overs that landed (MessageQueue
  maxlen, _coerce_decimal, WS UX foot-guns, CONTRACT_MAP completeness via
  #181) removed. Closes #179.
- docs/resources/multivariate.md: fixed wrong endpoint path on
  lookup_history (was /lookup_history, actual is /lookup with a
  lookback_seconds query param) and a broken example that called
  hist.lookups on a list return.
- docs/index.md: REST coverage updated from "85 endpoints" to "98
  operations" against current spec; sync/async parity claim explicitly
  notes WebSocket is async-only.
- docs/websockets.md: new "Resubscribe-window frame stashing" subsection
  documenting the #176 mechanism, stash_maxlen bound, and overflow
  logging.
- docs/authentication.md: new "Async RSA-PSS sign offload" subsection
  documenting KalshiAuth.sign_request_async() (#178) plus the dedicated
  ThreadPoolExecutor lifecycle.
- docs/configuration.md: new "Lifecycle" section documenting
  client.close() semantics and cross-linking the sign-executor teardown.
- docs/resources/events.md: note documenting Event.product_metadata and
  EventMetadata.market_details server-omission handling from #183.
- README.md: WS quickstart uses the package-level import
  (from kalshi.ws import KalshiWebSocket) instead of the deeper
  kalshi.ws.client; channel list clarifies that 11 of the 13 channels
  have dedicated subscribe_* methods and the remaining two ride the
  generic subscribe() escape hatch.
WS resubscribe-window frame stashing (#176)
Fixes silent message loss during reconnect bursts on high-volume channels.
Previously, between SubscriptionManager._sid_to_client.clear() and the
new sid mapping landing in _wait_for_response, any data frame the
server sent on the freshly-assigned sid was non-matching from the wait's
perspective and discarded with a debug log. Under market-burst
reconnects on ticker / trade / fill, the SDK could drop tens
of messages per reconnect.
SubscriptionManager now stashes those non-matching data frames in a
per-sid bounded deque (stash_maxlen=1000 per sid by default) for the
duration of resubscribe_all. After resubscribe completes,
KalshiWebSocket._handle_reconnect drains the stash through
_process_frame so the frames flow through the normal dispatch path
— seq tracker advances, orderbook manager applies, iterator consumers
receive them in arrival order.
The drain coordinates with #139's seq-gap tracking: replayed frames
go through seq_tracker.track exactly once, so the first live frame
after resubscribe sees the right watermark and doesn't trip a spurious
gap on what would otherwise look like a seq 0 → N jump.
Stash bound: per-sid deque uses collections.deque(maxlen=stash_maxlen).
On overflow, oldest evicts (deque semantics) and a single WARNING per
fill event is logged so callers notice congestion. Memory is bounded at
stash_maxlen * len(active_subs) * avg_frame_size worst-case.
Frames whose sid did not get re-mapped during resubscribe_all (a
per-sub failure that #77's F-P-01 isolates) are dropped on drain with a
debug log — there's no consumer to deliver them to.
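The stash-and-drain behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: FrameStash, its method names, and the "warn once per sid per resubscribe window" policy are assumptions made for the example, and the drain here returns frames grouped per sid rather than in global arrival order.

```python
import logging
from collections import defaultdict, deque

log = logging.getLogger("stash_sketch")


class FrameStash:
    """Hypothetical per-sid bounded stash; not the SDK's actual class."""

    def __init__(self, stash_maxlen: int = 1000) -> None:
        self._maxlen = stash_maxlen
        # One bounded deque per sid, created on first use.
        self._by_sid: dict[int, deque] = defaultdict(
            lambda: deque(maxlen=self._maxlen)
        )
        self._warned: set[int] = set()  # sids that already logged an overflow

    def stash(self, sid: int, frame: dict) -> None:
        buf = self._by_sid[sid]
        if len(buf) == self._maxlen and sid not in self._warned:
            # The append below will evict the oldest frame (deque semantics).
            log.warning("stash for sid %s is full; evicting oldest frames", sid)
            self._warned.add(sid)
        buf.append(frame)

    def drain(self, mapped_sids: set[int]) -> list[dict]:
        """Return frames for re-mapped sids (oldest first per sid); drop the rest."""
        out: list[dict] = []
        for sid, buf in self._by_sid.items():
            if sid in mapped_sids:
                out.extend(buf)
            else:
                log.debug("dropping %d frames for unmapped sid %s", len(buf), sid)
        self._by_sid.clear()
        self._warned.clear()
        return out
```

With stash_maxlen=2, stashing four frames on one sid leaves only the last two for the drain, which is the memory bound the deque provides on its own.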
Drive-by: SubscriptionManager._wait_for_response swapped two deprecated
asyncio.get_event_loop().time() calls for asyncio.get_running_loop().time()
(the correct API inside an async def).
WS run_forever(stop_event=...) cooperative shutdown (#177)
KalshiWebSocket.run_forever() now accepts an optional
stop_event: asyncio.Event | None = None parameter. When set — typically
from a SIGINT handler via add_signal_handler(SIGINT, stop.set) —
run_forever() clears _running, closes the connection, and drains the
recv loop via its existing not self._running branch. The recv task is
NOT cancelled, so no CancelledError leaks out.
    import asyncio, signal

    # Run inside a coroutine: get_running_loop() needs a running event loop.
    stop = asyncio.Event()
    asyncio.get_running_loop().add_signal_handler(signal.SIGINT, stop.set)
    async with ws.connect() as session:
        await session.subscribe_ticker(tickers=["EXAMPLE-25-T"])
        await session.run_forever(stop_event=stop)

No behavior change when stop_event is omitted: external cancellation
still propagates as before, and the #175 "missing subscription" guard
remains in place.
WS run_forever() raises on missing subscription (#175)
KalshiWebSocket.run_forever() previously returned immediately when no
subscribe_* call had landed — _recv_task was None and the silent
no-op masked a real user mistake. Documented as a known foot-gun in
#106 F-P-16; the callback-style example in docs/websockets.md
propagated the trap.
Now raises KalshiSubscriptionError at the call site with an
actionable message:
run_forever() requires at least one active subscription. Call subscribe_ticker(...) / subscribe_trade(...) / etc. (or the generic subscribe(channel, ...)) before run_forever() so the recv loop has something to drain. Registering an @ws.on(channel) callback does not subscribe — the server only sends frames for channels you explicitly subscribe to.
Docs updated: the callback example now shows the correct
subscribe_ticker(...) → run_forever() pairing with a comment
explaining that the iterator return value is unused (callbacks fan out
alongside it).
Soft-breaking: code that relied on run_forever() returning silently
as a sleep-until-disconnect for a connection it never intended to use
for streaming now raises. There's no production usage of that shape;
the foot-gun was the bug.
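For readers who want to see the shape of the guard, here is a small self-contained sketch. SessionSketch and its internals are hypothetical stand-ins; only the idea (raise when no recv task exists, instead of returning) comes from the notes above.

```python
import asyncio
from typing import Optional


class KalshiSubscriptionError(RuntimeError):
    """Stand-in for the SDK exception of the same name."""


class SessionSketch:
    """Hypothetical session: subscribing is what starts the recv loop."""

    def __init__(self) -> None:
        self._recv_task: Optional[asyncio.Task] = None

    async def subscribe(self, channel: str) -> None:
        if self._recv_task is None:
            self._recv_task = asyncio.create_task(self._recv_loop())

    async def _recv_loop(self) -> None:
        await asyncio.sleep(0)  # placeholder for reading frames

    async def run_forever(self) -> None:
        if self._recv_task is None:
            # Previously this case returned silently; now it is an error.
            raise KalshiSubscriptionError(
                "run_forever() requires at least one active subscription."
            )
        await self._recv_task
```

Calling run_forever() on a fresh SessionSketch raises immediately, while calling it after subscribe(...) awaits the recv loop as before.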
Nightly integration server-omission fixes (#183)
First two server_omits_despite_required cases caught by the post-#172
nightly integration job (run #26141405845 against demo commit 788789c):
- Event.product_metadata: spec marks required: true but the live demo
  server omits the key entirely on most events (Mars trip, Liverpool vs
  Manchester United, "Bitcoin price on Jan 12" and others). Reverted to
  dict[str, Any] | None = None and registered the deviation in
  EXCLUSIONS with kind="server_omits_despite_required". This is the
  first usage of the new exclusion kind shipped in #172.
- EventMetadata.market_details: spec marks required: true (list) but the
  live demo server sends JSON null for the value. Swapped
  list[MarketMetadata] → NullableList[MarketMetadata]. The spec contract
  (key present) is still enforced; callers always see a list.
Together these unblock 20 cascading integration-test failures across
tests/integration/test_events.py, test_markets.py, and test_series.py
(every test that calls events.get()).
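The market_details handling (key required, JSON null tolerated, callers always see a list) can be illustrated without Pydantic. The helper below is a hypothetical stdlib sketch of that rule only; it is not how NullableList is implemented in the SDK.

```python
from typing import Any

_MISSING = object()  # sentinel so "key absent" and "value is null" stay distinct


def parse_market_details(payload: dict[str, Any]) -> list[Any]:
    """Hypothetical helper: the key must be present, JSON null becomes []."""
    value = payload.get("market_details", _MISSING)
    if value is _MISSING:
        raise ValueError("market_details: required key is missing")
    if value is None:
        return []  # server sent null; normalise to an empty list
    if not isinstance(value, list):
        raise ValueError("market_details: expected a list or null")
    return value
```

The sentinel is what keeps the spec contract enforceable: an omitted key still fails, and only an explicit null is softened.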
test_exclusion_map_is_current learned about server_omits_despite_required
as the inverse of the other model exclusion kinds: the SDK field still has
to be present (so we can parse responses when the server does send it) but
must be optional. Stale-exclusion detection now flags either side flipping.
WS / auth polish batch (#173 + #174 + #178)
- #173: MessageQueue defense-in-depth. The WS MessageQueue underlying
  collections.deque now carries maxlen=maxsize+1 as a hard memory ceiling
  enforced by deque itself, independent of the manual _size counter. If
  the counter ever drifts (a put path that forgets to increment, an
  exception between append and increment) the buffer cannot grow without
  bound. New regression test in tests/ws/test_backpressure.py injects
  counter drift and asserts the cap holds. No observable behavior change
  in the passing path.
- #174: types consolidation. _to_decimal_dollars and _to_decimal_fp were
  byte-identical apart from their docstrings. Collapsed into a single
  _coerce_decimal helper shared by both DollarDecimal and
  FixedPointCount. Public aliases unchanged; only the internal helper is
  shared.
- #178: async RSA-PSS sign offload. Added KalshiAuth.sign_request_async()
  that routes the ~1-10 ms RSA-PSS sign through a dedicated
  ThreadPoolExecutor(max_workers=2) lazy-initialised on first use.

  Async REST (AsyncTransport.request) and async WS connect
  (ConnectionManager._build_auth_headers) now use the async sign path;
  the sync sign_request API is unchanged for sync-transport callers.

  The executor is dedicated (not asyncio's shared default pool) so signs
  don't queue behind loop.getaddrinfo / file I/O / other to_thread() work
  on a busy event loop, which matters during WS reconnect storms where
  cold DNS resolution (5-50 ms) dominates the sign cost. Per the
  community feedback on #178: a falsifiable microbench under
  scripts/bench_sign_offload.py uses real loop.time() deltas (NOT the
  asyncio.sleep(0) ticker which is special-cased and doesn't measure
  wall-clock blocking). Measured: inline p99=2.95 ms vs. offloaded
  p99=0.68 ms on a 2048-bit key.

  KalshiClient.close() / AsyncKalshiClient.close() now shut down the sign
  executor too; the executor is daemon-style and idempotent to close.
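A rough sketch of the offload pattern from #178, under stated assumptions: SignerSketch is a hypothetical class, and an HMAC stands in for the RSA-PSS sign so the example stays stdlib-only. What it does mirror is the lazily created two-worker pool, the run_in_executor hand-off, and an idempotent close.

```python
import asyncio
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


class SignerSketch:
    """Hypothetical signer; HMAC stands in for the RSA-PSS sign."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._executor: Optional[ThreadPoolExecutor] = None

    def sign(self, message: bytes) -> str:
        # CPU-bound work that would otherwise block the event loop.
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    async def sign_async(self, message: bytes) -> str:
        if self._executor is None:
            # Created on first use; reserved for signing so it never queues
            # behind work submitted to the loop's default executor.
            self._executor = ThreadPoolExecutor(
                max_workers=2, thread_name_prefix="sign"
            )
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(self._executor, self.sign, message)

    def close(self) -> None:
        # Safe to call more than once.
        if self._executor is not None:
            self._executor.shutdown(wait=False)
            self._executor = None
```

Passing an explicit executor to run_in_executor is the design point: passing None would use the loop's shared default pool, which is the contention the dedicated pool avoids.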
Contract-map completeness (#171)
Maps the remaining 42 REST sub-models, V2 orders family, and internal
containers into CONTRACT_MAP (49 entries → 91). Promotes
test_contract_map_completeness from warnings.warn to pytest.fail so
the next unmapped model fails CI loudly.
_get_schema_fields / _get_required_fields gain a dotted-path syntax
(Parent.field.items) so inline-object schemas the spec doesn't name at the
top level (Batch*OrdersV2* per-entry shapes) can still flow through the
drift pipeline.
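To make the dotted-path idea concrete, here is one way such a lookup could work over an OpenAPI-style schema dict. The function name, its signature, and the example schema are illustrative assumptions, not the project's _get_schema_fields.

```python
from typing import Any


def get_schema_fields(schemas: dict[str, Any], path: str) -> set[str]:
    """Resolve 'Parent.field.items' to the property names of an inline object."""
    head, *rest = path.split(".")
    node = schemas[head]
    for part in rest:
        if part == "items":
            node = node["items"]             # step into an array's element schema
        else:
            node = node["properties"][part]  # step into a named property
    # Schemas without properties (e.g. plain arrays) yield an empty set.
    return set(node.get("properties", {}))
```

For a hypothetical schema where BatchResp.orders is an array of inline objects, get_schema_fields(schemas, "BatchResp.orders.items") returns the per-entry field names. A property literally named "items" would collide with the keyword in this sketch; a real implementation would need to disambiguate.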
Newly-surfaced drift caught by mapping these models:
- BidAskDistribution (OHLC): all four price fields tightened to required.
- PriceDistribution: gains 4 v3.18.0 spec fields (mean_dollars,
  previous_dollars, min_dollars, max_dollars), all optional per spec.
- Candlestick: 6 fields tightened to required.
- MarketMetadata: image_url + color_code tightened.
- Schedule / WeeklySchedule: tightened.
- PositionsResponse, EventCandlesticks, ForecastPercentilesPoint:
  tightened.
- AssociatedEvent (multivariate): is_yes_only + active_quoters tightened.
- LookupPoint (multivariate): selected_markets + last_queried_ts
  tightened.
OrderbookLevel is mapped to spec's PriceLevelDollarsCountFp, a positional
2-tuple ["<dollars_string>", "<fp_count_string>"]. The SDK wraps it as a
named {price, quantity} object — no field-by-field comparison possible.
_get_schema_fields returns {} for the array-typed spec schema, so drift
checks skip it cleanly.
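The positional-to-named wrapping can be pictured like this. LevelSketch and parse_level are hypothetical names for illustration; the SDK's OrderbookLevel is a Pydantic model and may validate differently.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LevelSketch:
    """Hypothetical named view over the positional [dollars, count] pair."""

    price: Decimal
    quantity: Decimal


def parse_level(raw: list[str]) -> LevelSketch:
    dollars, count = raw  # raises ValueError unless exactly two elements
    return LevelSketch(price=Decimal(dollars), quantity=Decimal(count))
```

Because the wire shape has positions rather than keys, there is nothing for a field-by-field drift check to compare, which is why the mapping is registered but skipped.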
Fixture builders for Candlestick, BidAskDistribution, PriceDistribution
added to tests/_model_fixtures.py (3 new). Test fixtures parsing
Candlestick / EventCandlesticks / MarketCandlesticks now use those
builders.
Required-but-optional drift closure (#172)
Drops None defaults on 226
spec-required Pydantic model fields across 34 response models (21 REST, 13
WS). The SDK now matches the OpenAPI v3.18.0 / AsyncAPI v0.14 required set
on the wire. Promotes test_required_drift and test_ws_required_drift
from warning to hard CI failure, closing the regression class that allowed
required-but-typed-Optional fields to drift unnoticed.
Breaking (response-parse side)
- 226 fields are no longer Optional[T] | None in response models; see the
  full list per model in #172. Wire format is unchanged; the SDK now
  refuses to parse responses that omit a spec-required field, where
  previously the field defaulted to None. If the live server omits a
  spec-required field, pydantic.ValidationError is raised on parse.
- CreateOrderRequest.action no longer defaults to "buy": callers
  constructing the request model directly must pass action explicitly.
  The OrdersResource.create(action=None, ...) kwarg path still defaults
  to "buy" for back-compat; only the model-construction surface changed.
- Test fixtures constructing these models with partial dicts will raise
  ValidationError. A new helper module tests/_model_fixtures provides
  complete spec-shaped builders (market_dict, order_dict, fill_dict,
  etc.) that accept **overrides for fields tests care about.
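A builder in the style described might look like the sketch below. The field names and default values are placeholders chosen for the example; the real market_dict in tests/_model_fixtures carries the full spec-required set.

```python
from typing import Any


def market_dict(**overrides: Any) -> dict[str, Any]:
    """Hypothetical builder: complete defaults, with per-test overrides on top."""
    base: dict[str, Any] = {
        # Placeholder fields; the real builder covers every required field.
        "ticker": "EXAMPLE-25-T",
        "status": "active",
        "yes_bid_dollars": "0.5000",
    }
    return {**base, **overrides}
```

A test that only cares about status can then write market_dict(status="closed") and still hand the model a complete payload.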
Changed
- test_required_drift (REST) and test_ws_required_drift (WS) promoted
  from warnings.warn to pytest.fail. Future drift on these gates is
  CI-blocking.
- New ExclusionKind value "server_omits_despite_required" registered in
  tests/_contract_support.py for fields the spec marks required but the
  live server omits. Entries MUST cite a demo+prod observation.
Migration
- Code that builds these models from server responses: no change. The
  server-side wire shape is what it always was; the SDK type just stopped
  lying about which fields are guaranteed.
- Code that builds these models in tests / mocks / fixtures: pass all
  spec-required fields, or use the tests/_model_fixtures builders. The
  builders are test-only (live under tests/, never shipped in the wheel);
  production code does not import them.
- Callers who relied on Optional narrowing
  (if order.outcome_side is not None: ...) can drop the guard.
  mypy --strict will now flag the redundant check.
Affected models
21 REST (136 fields): Market, Order, Fill, MultivariateEventCollection,
Settlement, Trade, Event, Series, MarketPosition, EventPosition,
EventMetadata, Milestone, SportFilterDetails, IncentiveProgram,
ApiKey, SeriesFeeChange, MarketCandlesticks, ScopeList,
GetOrderGroupResponse, CreateOrderGroupResponse, CreateOrderRequest.
13 WS payloads (90 fields): UserOrdersPayload, FillPayload,
TickerPayload, TradePayload, MarketPositionsPayload,
QuoteExecutedPayload, QuoteCreatedPayload, QuoteAcceptedPayload,
MultivariatePayload, RfqCreatedPayload, RfqDeletedPayload,
MarketLifecyclePayload, OrderGroupPayload.