feat(http): add QWP egress for streaming SQL query results over WebSocket#6991

Merged: bluestreak01 merged 87 commits into master from vi_egress, Apr 26, 2026

bluestreak01 (Member) commented Apr 18, 2026

Tandem with questdb/questdb-enterprise#991 — must merge together; the Enterprise PR adds integration tests that exercise QWP egress auth and TLS through QwpQueryClient once this branch lands.


Summary

  • Add a new HTTP endpoint /read/v1 (also /api/v1/read) that streams SQL query results to clients in QWP's binary columnar wire format. SELECT in, columnar batches out.
  • Extends the existing QWP protocol (binary ingestion over WebSocket, shipped in feat(ilp): add QuestDB Wire Protocol (QWP) ingestion over WebSocket and UDP #6800) with an egress direction. Reuses the column type codes, varint encoding, null bitmap, and per-column wire shapes; adds egress-specific message kinds (QUERY_REQUEST, RESULT_BATCH, RESULT_END, QUERY_ERROR, CANCEL, CREDIT, EXEC_DONE, CACHE_RESET).
  • Authoritative wire-format spec: docs/QWP_EGRESS_EXTENSION.md.
  • Java client (QwpQueryClient) lives in the java-questdb-client submodule; this PR bumps the submodule pointer and the Maven dependency to 1.1.1-SNAPSHOT. See companion submodule PR.
  • On-wire zstd compression of RESULT_BATCH bodies, negotiated at upgrade time via X-QWP-Accept-Encoding. Server compresses in Rust (zstd reused from parquet2); client decompresses in C against libzstd v1.5.7 vendored as a git submodule.
  • Per-connection cache bounds with a CACHE_RESET control frame (see Connection-scope caps section) so long-lived connections cannot monotonically accumulate dict / schema state.
  • Prometheus-scrapable counters and gauges exposed under the qwp_egress_* namespace (see Metrics section).

Wire format

QWP egress is one new HTTP endpoint, one new set of WebSocket message kinds, and the same column-data encoding the ingestion side uses. A typical exchange:

client  →  GET /read/v1  (X-QWP-Max-Version: 1, X-QWP-Accept-Encoding: zstd;level=3,raw)
client  ←  101 Switching Protocols  (X-QWP-Version: 1, X-QWP-Content-Encoding: zstd;level=3)

client  →  QUERY_REQUEST(request_id, sql, initial_credit, bind_params...)
client  ←  RESULT_BATCH(request_id, batch_seq=0, full schema, rows 0..N-1)   [FLAG_ZSTD set]
client  ←  RESULT_BATCH(request_id, batch_seq=1, schema reference, rows N..2N-1)
            ...
client  ←  RESULT_END(request_id, total_rows)

Schema-reference mode kicks in from batch 1 onwards so wide schemas don't repeat per batch. All 24 QuestDB column types are wire-format supported (BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING, SYMBOL, TIMESTAMP, TIMESTAMP_NANOS, DATE, UUID, LONG256, GEOHASH (all four storage widths), VARCHAR, IPv4, DECIMAL64, DECIMAL128, DECIMAL256, DOUBLE_ARRAY, LONG_ARRAY, CHAR, BINARY).

NULL semantics inherit QuestDB's existing sentinel conventions (documented in spec §11.5). Notably: NaN is the FLOAT/DOUBLE NULL sentinel, 0 (i.e. 0.0.0.0) is the IPv4 NULL sentinel, and -1 is the universal GEOHASH NULL sentinel across all storage widths.
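
A client that consumes decoded columns has to test for these sentinels explicitly. The following is a minimal sketch of such checks, assuming decoded values arrive as Java primitives; the class and method names are illustrative and are not part of QwpQueryClient.

```java
// Hypothetical helper: NULL checks for the sentinels named above.
final class QwpNullSentinels {
    static final int IPV4_NULL = 0;        // 0.0.0.0
    static final long GEOHASH_NULL = -1L;  // same value at every storage width

    // NaN never compares equal to itself, so == cannot be used here.
    static boolean isNullDouble(double v) { return Double.isNaN(v); }

    static boolean isNullFloat(float v) { return Float.isNaN(v); }

    static boolean isNullIPv4(int v) { return v == IPV4_NULL; }

    // A byte, short or int holding -1 sign-extends to -1L, so one check
    // covers all four geohash storage widths.
    static boolean isNullGeoHash(long v) { return v == GEOHASH_NULL; }
}
```

One consequence worth noting: under this convention the address 0.0.0.0 cannot be distinguished from an IPv4 NULL.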

Flow control and control-plane frames

  • CANCEL — client asks the server to abort an in-flight query. Server flags the streaming state; streamResults observes the flag between batches and aborts with QUERY_ERROR carrying STATUS_CANCELLED. QwpQueryClient.cancel() is thread-safe and callable from any thread.
  • CREDIT — byte-budgeted flow control. Client sets initial_credit in QUERY_REQUEST; server streams at most that many result-payload bytes before parking. Client auto-replenishes by the size of each batch after the user's handler releases it. Zero credit means unbounded (the default — no CREDIT bookkeeping on either side). When credit is exhausted the server yields cooperatively (no thread park) and resumes on the next CREDIT frame.
  • EXEC_DONE — ack for non-SELECT statements. DDL, INSERT, UPDATE, ALTER, DROP, CREATE TABLE, CREATE MAT VIEW, TRUNCATE, and every parse-time-executed statement all run through egress and return {op_type, rows_affected} instead of opening a result stream.
  • CACHE_RESET — server-to-client control frame, emitted at a query boundary when the connection's SYMBOL dict or schema-fingerprint cache crosses its soft cap. Body is a single reset_mask byte (bit 0 = dict, bit 1 = schemas). Recipient flushes the indicated caches; the next RESULT_BATCH delta section starts at deltaStart=0 and, if the schemas bit was set, the next schema is shipped in full mode with a fresh id. See Connection-scope caps.
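
The CREDIT accounting described above can be sketched as a small byte counter. This is an illustration only, not the server's code: the class name is hypothetical, and it assumes the server may send a batch whenever any credit remains, so the final batch before parking can overshoot the budget. The actual server may check the budget differently.

```java
// Hypothetical byte-budget gate for one in-flight query.
final class CreditGate {
    private final boolean unbounded;
    private long available;

    CreditGate(long initialCredit) {
        // Zero credit means unbounded: no bookkeeping at all.
        this.unbounded = initialCredit == 0;
        this.available = initialCredit;
    }

    // True if another RESULT_BATCH may be sent now; false means the
    // stream should yield until a CREDIT frame arrives.
    boolean canSend() {
        return unbounded || available > 0;
    }

    // Called after a batch is committed to the wire.
    void onBatchSent(long payloadBytes) {
        if (!unbounded) {
            available -= payloadBytes;
        }
    }

    // Called when a CREDIT frame is decoded.
    void onCreditFrame(long bytes) {
        if (!unbounded) {
            available += bytes;
        }
    }
}
```

With `new CreditGate(4096)`, sending one 4096-byte batch makes `canSend()` return false until `onCreditFrame(...)` restores the budget, which mirrors the client replenishing by the size of each released batch.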

On-wire compression

  • FLAG_GORILLA — TIMESTAMP / TIMESTAMP_NANOS / DATE columns carry a 1-byte encoding discriminator (0x00 raw / 0x01 Gorilla). Server auto-picks Gorilla when the column has ≥ 3 non-null values and the delta-of-delta bitstream beats raw nonNull × 8. Unordered or jumpy columns fall back to raw. Saves ~80× on periodic timestamps (e.g. 10 ms tick data).
  • FLAG_DELTA_SYMBOL_DICT — SYMBOL values are dedup'd into a connection-scoped dictionary. Each batch ships only the new entries in a per-message delta section; per-row payload is a varint connId. Per-column IntIntHashMap (native-key → connId) on each QwpColumnScratch keeps the per-row hot path to a single int probe; a bytes-keyed hedge inside the connection dict catches cross-column / cross-query duplicates (first-sight probe only, not per-row).
  • FLAG_ZSTD — whole-batch zstd compression of the region after the prelude (msg_kind + request_id + batch_seq stay raw so the client I/O thread can dispatch on them without paying the decompress cost). Negotiated once at upgrade time via X-QWP-Accept-Encoding: zstd;level=N, raw; the server echoes its choice in X-QWP-Content-Encoding. Level is a client hint; server clamps to [1, 9] because zstd levels 10+ drop to <20 MB/s compress speed and would let a slow/malicious client pin a worker thread. A specific batch ships raw when its compressed form isn't actually smaller — avoids zstd's header overhead dominating tiny batches. Connection-string keys: compression={zstd|raw|auto} (default auto = advertise zstd,raw), compression_level=N (default 3).
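
The upgrade-time negotiation can be illustrated with a small header parser. This is a sketch under stated assumptions, not the QwpEgressCompressionNegotiator source: it assumes the server picks zstd whenever the client offers it, falls back to raw otherwise (including when the header is absent), and treats a missing or malformed level as the default of 3. Only the header grammar and the [1, 9] clamp come from the description above.

```java
// Hypothetical negotiator: maps an X-QWP-Accept-Encoding value to the
// value echoed in X-QWP-Content-Encoding.
final class EncodingNegotiation {
    static final int MIN_LEVEL = 1;
    static final int MAX_LEVEL = 9;
    static final int DEFAULT_LEVEL = 3;

    static String negotiate(String acceptEncoding) {
        if (acceptEncoding == null) {
            return "raw";
        }
        for (String token : acceptEncoding.split(",")) {
            String[] parts = token.trim().split(";");
            if (!parts[0].trim().equalsIgnoreCase("zstd")) {
                continue;
            }
            int level = DEFAULT_LEVEL;
            for (int i = 1; i < parts.length; i++) {
                String p = parts[i].trim();
                if (p.startsWith("level=")) {
                    try {
                        level = Integer.parseInt(p.substring("level=".length()).trim());
                    } catch (NumberFormatException e) {
                        level = DEFAULT_LEVEL; // assumption: ignore a bad hint
                    }
                }
            }
            // The level is only a hint; clamp it so a client cannot
            // request an expensive compression level.
            level = Math.max(MIN_LEVEL, Math.min(MAX_LEVEL, level));
            return "zstd;level=" + level;
        }
        return "raw";
    }
}
```

For example, `negotiate("zstd;level=22, raw")` yields `zstd;level=9`, matching the clamped-level case that QwpEgressCompressionTest is described as covering.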

Server side

  • New package core/.../cutlass/qwp/codec/ — outbound codec: frame writer, schema writer, message-kind constants, type mapper, per-column native scratch, batch buffer, connection-scoped SYMBOL dict, Gorilla encoder.
  • New package core/.../cutlass/qwp/server/egress/ — HTTP handler, upgrade processor, per-connection processor state, request decoder, compression negotiator, metrics.
  • Route registration in HttpServer.addDefaultEndpoints next to the existing /write/v4 ingest route. Same HTTP worker pool, same TLS / auth surface, same LocalValue<State> pattern.
  • SELECT execution runs on the HTTP worker thread (the synchronous pattern that JsonQueryProcessor and ExportQueryProcessor use). Page-frame parallel execution under factory.getCursor() continues to dispatch to QuestDB's shared work pool transparently — the egress encoder is the consumer of that already-parallel pipeline.
  • Non-SELECT execution: executeNonSelect synchronously awaits the operation's future, mapping the compiled-query type into the EXEC_DONE response.
  • zstd JNI module lives inside the existing qdbr Rust crate as qwp_zstd.rs. Reuses the zstd crate already transitively linked via parquet2, so no new native dependency on the server side. ZSTD_CCtx is allocated lazily per connection on first compressed batch and reused across every batch; freed in QwpEgressProcessorState.close().
  • Yield/resume hardening:
    • resumeSend unconditionally flushes deferred bytes before checking streaming state — covers QUERY_ERROR frames parked after endStreaming.
    • streamingBatchSeqCommitted flag enforces "seq incremented before send commits bytes" as a runtime invariant (not just an assertion).
    • Park-on-write and park-for-credit both use a single cooperative-yield path; the HTTP worker is never blocked while a stream is suspended.
    • Handshake response split across onHeadersReady (write bytes to sink) + onRequestComplete (commit via rawSocket.send, may park) + resumeSend (finalise protocol switch after flush). Fixes a pre-existing partial-write bug under small DEBUG_HTTP_FORCE_SEND_FRAGMENTATION_CHUNK_SIZE.
  • Error classification: mapErrorStatus distinguishes CairoException.isCancellation → STATUS_CANCELLED, isInterruption / isOutOfMemory → STATUS_LIMIT_EXCEEDED, isAuthorizationError → STATUS_SECURITY_ERROR, SqlException → STATUS_PARSE_ERROR, else STATUS_INTERNAL_ERROR.
  • Bind parameters: server-side decoder accepts every scalar wire type today (the client does not yet expose a bind encoder — see Limitations). Server is lenient on SYMBOL bind type code and routes it through STRING.
  • Mid-batch failure rollback: if a SELECT throws between batchBuffer.beginBatch and the final send (cursor exception, scratch-grow OOM, transient encode failure), the server rolls the connection SYMBOL dict back to its pre-batch size via QwpResultBatchBuffer.rollbackCurrentBatch() before the catch(Throwable) ships QUERY_ERROR. Without the rollback, partially-committed dict entries would never reach the wire, and a subsequent query that dedup'd against the same bytes would emit row payload referencing an id the client was never taught.
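
To make the rollback invariant concrete, here is a minimal sketch of a connection-scoped dictionary with a per-batch delta and rollback. It is a simplification: the class is hypothetical and uses heap collections keyed on String, whereas the PR describes a native UTF-8 heap keyed on bytes. Only the behaviour (ids are assigned on first sight, a batch ships just the new entries, and a failed batch leaves no unshipped ids behind) follows the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the connection-scoped SYMBOL dictionary.
final class ConnSymbolDictSketch {
    private final Map<String, Integer> idsByValue = new HashMap<>();
    private final List<String> valuesById = new ArrayList<>();
    private int batchStart; // dictionary size when the current batch began

    void beginBatch() {
        batchStart = valuesById.size();
    }

    // Returns the connection-scoped id, adding the value on first sight.
    int idOf(String value) {
        Integer id = idsByValue.get(value);
        if (id != null) {
            return id;
        }
        int newId = valuesById.size();
        valuesById.add(value);
        idsByValue.put(value, newId);
        return newId;
    }

    // Entries added since beginBatch(): the content of the delta section.
    List<String> delta() {
        return new ArrayList<>(valuesById.subList(batchStart, valuesById.size()));
    }

    // Drops every entry added since beginBatch(), so a later query can
    // never reference an id the client was not sent.
    void rollbackCurrentBatch() {
        for (int i = valuesById.size() - 1; i >= batchStart; i--) {
            idsByValue.remove(valuesById.remove(i));
        }
    }

    int size() {
        return valuesById.size();
    }
}
```

After a rollback, the next sighting of the same value is treated as new again and is shipped in that batch's delta, which keeps the server and client dictionaries in step.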

Connection-scope caps and CACHE_RESET

Long-lived connections accumulate two forms of connection-scoped state: the SYMBOL delta-dict (heap bytes + entry count) and the schema-fingerprint cache. Under adversarial or just high-cardinality workloads, both can grow large enough to matter for fleet memory budgets. The server enforces soft caps and signals the client when it flushes.

  • DEFAULT_MAX_EGRESS_DICT_ENTRIES = 100_000 — entry count soft cap.
  • DEFAULT_MAX_EGRESS_DICT_HEAP_BYTES = 8 MiB — UTF-8 heap soft cap.
  • DEFAULT_MAX_EGRESS_SCHEMAS_PER_CONNECTION = 4_096 — schema-fingerprint cap (tighter than the inherited ingress cap at 65_535 because egress state is fully server-owned and reusable).

At every query boundary (just after decoding the next QUERY_REQUEST), the server computes a reset mask from the current cache sizes; if non-zero, it emits a CACHE_RESET frame and calls state.applyCacheReset(mask). The client's QwpEgressIoThread receives the frame and calls QwpResultBatchDecoder.applyCacheReset(mask), clearing connDictSize / connDictHeapPos and / or the schema registry so the following RESULT_BATCH delta section starts at deltaStart=0. Because the reset happens between queries, no in-flight RESULT_BATCH references an id that survives the flush.
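
The mask computation can be sketched from the cap values and bit layout given above. This is illustrative rather than a copy of computeCacheResetMask: the class and constant names are hypothetical, and the sketch assumes a cap trips when the size reaches it (>=); the actual boundary condition is defined by the server code and its exact-boundary tests.

```java
// Hypothetical reset-mask computation at a query boundary.
final class CacheResetMask {
    static final int RESET_DICT = 1;         // bit 0
    static final int RESET_SCHEMAS = 1 << 1; // bit 1

    static final int MAX_DICT_ENTRIES = 100_000;
    static final long MAX_DICT_HEAP_BYTES = 8L * 1024 * 1024; // 8 MiB
    static final int MAX_SCHEMAS = 4_096;

    static int compute(int dictEntries, long dictHeapBytes, int schemaCount) {
        int mask = 0;
        // Either dict cap is enough to flush the dictionary.
        if (dictEntries >= MAX_DICT_ENTRIES || dictHeapBytes >= MAX_DICT_HEAP_BYTES) {
            mask |= RESET_DICT;
        }
        if (schemaCount >= MAX_SCHEMAS) {
            mask |= RESET_SCHEMAS;
        }
        return mask; // zero means no CACHE_RESET frame is needed
    }
}
```

A receiver would test the same bits, for example `(mask & CacheResetMask.RESET_DICT) != 0`, and clear only the caches that are flagged.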

Per-query scratch buffers on QwpColumnScratch also shrink on query boundary: the scratch tracks the peak observed footprint across the query's batches, and at resetForNewQuery trims any buffer whose capacity has outgrown the peak by more than 4x. A one-off wide batch no longer permanently retains its peak-sized native allocation on a long-lived connection.
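
The trim rule can be written as a single function. This is a sketch: the class name is hypothetical, the value of INITIAL_BYTES is an assumption (the PR does not state it), and the shrink target of max(INITIAL_BYTES, 2x peak) is taken from the Allocation footprint section of this description.

```java
// Hypothetical shrink policy applied when a query finishes.
final class ScratchShrinkPolicy {
    static final long INITIAL_BYTES = 4096; // assumed value, for illustration

    // Returns the capacity the buffer should have for the next query.
    static long capacityAfterQuery(long capacity, long peakBytes) {
        if (capacity > 4 * peakBytes) {
            long target = Math.max(INITIAL_BYTES, 2 * peakBytes);
            // Never grow as a side effect of shrinking.
            return Math.min(capacity, target);
        }
        return capacity;
    }
}
```

Shrinking to twice the peak rather than to the peak itself leaves headroom, so a query with a similar footprint does not immediately regrow the buffer.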

QwpEgressProcessorState exposes test-only static overrides for all three caps so tests can trip resets at tiny entry counts without stuffing the connection with millions of rows. Production code never touches them.

Metrics

Prometheus-scrapable counters and gauges are exposed under the qwp_egress_* namespace (same pattern as json_queries_*, pgwire_*). All counters are no-ops when metrics.enabled=false (the default); set QDB_METRICS_ENABLED=true to turn them on.

| Metric | Type | Description |
| --- | --- | --- |
| qwp_egress_connections | gauge | Active post-handshake connections. Increments in finalizeHandshake, decrements in onConnectionClosed. |
| qwp_egress_queries_started | counter | Every QUERY_REQUEST successfully decoded. |
| qwp_egress_queries_errored | counter | Queries that ended in QUERY_ERROR for any reason other than explicit cancellation. |
| qwp_egress_queries_cancelled | counter | Queries that ended in QUERY_ERROR with STATUS_CANCELLED (either client CANCEL or server-side cancel). |
| qwp_egress_batches_sent | counter | RESULT_BATCH frames committed to the wire. |
| qwp_egress_bytes_sent | counter | Total payload bytes across all RESULT_BATCH frames (post-compression when FLAG_ZSTD is set, pre-WebSocket-framing in all cases). |
| qwp_egress_bytes_zstd_saved | counter | Bytes saved by zstd compression relative to the uncompressed payload, summed across batches where FLAG_ZSTD actually shipped. |
| qwp_egress_rows_streamed | counter | Total rows committed via RESULT_BATCH. |
| qwp_egress_cache_reset_dict | counter | CACHE_RESET frames emitted with the dict bit set. |
| qwp_egress_cache_reset_schemas | counter | CACHE_RESET frames emitted with the schemas bit set. |

All metrics piggyback on the existing GET /metrics Prometheus endpoint and are exported via Metrics.scrapeIntoPrometheus. Dashboard entry points:

  • Query throughput: rate(qwp_egress_rows_streamed[1m]), rate(qwp_egress_bytes_sent[1m]).
  • Error rate: rate(qwp_egress_queries_errored[5m]) / rate(qwp_egress_queries_started[5m]).
  • Connection health: qwp_egress_connections (current), rate(qwp_egress_cache_reset_dict[1h]) (per-hour cap trips as a leading indicator of long-lived connections accumulating dict state).
  • Cancellation visibility: rate(qwp_egress_queries_cancelled[5m]).

Client side

  • libzstd v1.5.7 vendored as a git submodule under java-questdb-client/core/src/main/c/share/zstd. CMake walks the upstream lib/common/ + lib/decompress/ subset only (compress isn't linked — the client decodes). zstd_jni.c sits alongside (not inside) the submodule so upstream resets don't touch the JNI glue. ZSTD_DISABLE_ASM is set on non-x86 platforms; x86_64 links the hand-tuned Huffman decoder.
  • ZSTD_DCtx allocated lazily per QwpResultBatchDecoder instance (one per I/O thread) and reused across every batch. Decompression scratch grows on demand up to a 64 MiB cap — the cap prevents a hostile/corrupted frame advertising a huge content size from forcing unbounded allocation.
  • Early zstd probe at QwpQueryClient.connect() — if the user didn't set compression=raw, the client allocates + frees a ZSTD_DCtx before starting the I/O thread. Catches mismatched client jars (new server, old client built without the zstd submodule) with a clear error on the user thread rather than surfacing the UnsatisfiedLinkError mid-stream through the batch handler's error callback. The decoder carries a second guard that translates the same mismatch to a QwpDecodeException if a FLAG_ZSTD frame ever reaches it outside the probe's reach.
  • CACHE_RESET handler on the I/O thread flushes the connection-scoped dict and / or schema registry when the server signals a reset. No user-visible event — the reset frame is always followed by a fresh RESULT_BATCH whose deltaStart aligns with the post-reset connDictSize.
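
The bounded growth of the decompression scratch can be sketched as follows. This is an assumption-laden illustration, not the client's decoder: the class name is hypothetical, IllegalStateException stands in for whatever the client actually raises, and the sketch only tracks the capacity rather than allocating native memory. The 64 MiB cap is the figure given above.

```java
// Hypothetical capacity tracker for the decompression scratch buffer.
final class DecompressScratch {
    static final long MAX_BYTES = 64L * 1024 * 1024; // 64 MiB cap

    private long capacity;

    // Validates the content size a frame declares and grows the scratch
    // if needed. Returns the capacity after the call.
    long ensureCapacity(long declaredContentSize) {
        if (declaredContentSize < 0 || declaredContentSize > MAX_BYTES) {
            // Reject before allocating, so a hostile or corrupted frame
            // cannot force an unbounded allocation.
            throw new IllegalStateException(
                    "declared content size out of range: " + declaredContentSize);
        }
        if (declaredContentSize > capacity) {
            capacity = declaredContentSize; // grow on demand, never shrink here
        }
        return capacity;
    }
}
```

The important design point is the order of operations: the declared size is validated first, and memory is committed only after it passes the check.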

Allocation footprint

The codec is structured so per-batch allocations amortise to zero after warmup:

  • QwpResultBatchBuffer accumulates rows column-by-column into per-column QwpColumnScratch objects backed by native memory (one per column, grown to the max observed batch size and reused).
  • Fixed-width values write straight to native bytes — no boxed Long / Double per cell.
  • Decimal128 / Decimal256 sinks are reused per column, not allocated per row.
  • SYMBOL fast path: per-column IntIntHashMap keyed on the native symbol-table int; on first sight per-connection per column, one String allocation and one Utf8String key into the shared connection dict. Amortised: hundreds of bytes per unique symbol value per connection, zero per row.
  • Column names are UTF-8 encoded once on QwpEgressColumnDef.of(...) and reused for both schema-size estimation and emit.
  • QwpEgressRequestDecoder reuses a DirectUtf8String view + StringSink scratch for bind parameters.
  • Client-side decoder: SYMBOL dict held as a native UTF-8 heap + packed (offset, length) entries — a single 64-bit load + two int extractions per lookup, no DirectUtf8String per entry and no ObjList.getQuick on the hot path.
  • zstd compress / decompress scratch buffers live on connection-scoped state (server) and per-IoThread decoder (client), growing on demand and reused across every batch.
  • Error path: sendQueryError UTF-8 encodes the message directly into the wire buffer from a CharSequence, bypassing the intermediate String + byte[].
  • Scratch shrink on query boundary: QwpColumnScratch.resetForNewQuery trims any buffer whose capacity is >4x the per-query peak, back to max(INITIAL_BYTES, 2x peak). A spike batch no longer permanently retains megabytes of native memory.

Performance characteristics

  • HTTP worker scheduling is unchanged — SELECT queries occupy a worker thread for the duration of streaming, same as JSON /exec. Page-frame parallelism continues to fan out underneath. Credit-suspended streams cooperatively yield the worker back to the pool; no thread is blocked while a stream is parked waiting for CREDIT.
  • Wire-level: schema reference mode after the first batch saves the per-column metadata bytes on every subsequent batch; Gorilla compresses ordered timestamp columns ~80× on typical tick data; delta-dict SYMBOL drops per-row cost from bytes to a single varint after the first sighting.
  • zstd at the default level 3 typically halves DOUBLE / LONG traffic and shrinks rotating-SYMBOL batches further (on top of the delta dict). It costs server CPU — best on WAN / cellular clients, worth measuring before enabling on colocated consumers where the NIC isn't the bottleneck.
  • Benchmarks: benchmarks/QwpEgressReadBenchmark.java (narrow: ts/id/price/sym/note — 5 cols) and QwpEgressReadBenchmarkWide.java (wide: 15 cols including five 100k-cardinality symbols and five extra doubles) each ingest N rows then read them back via QWP egress, PostgreSQL wire, and HTTP /exec (JSON), printing rows/sec and MiB/sec for each. Results included in the PR discussion.
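
The reason periodic timestamps compress so well under Gorilla is visible from the delta-of-delta values alone. The sketch below computes them for a timestamp column; it does not reproduce the QWP bitstream or its bucket layout, which are defined in the wire-format spec, and the class name is hypothetical.

```java
// Hypothetical helper: delta-of-delta over a column of timestamps.
final class DeltaOfDelta {
    // Returns one value per timestamp from the third onwards; the first
    // two timestamps have no delta-of-delta.
    static long[] compute(long[] ts) {
        if (ts.length < 3) {
            return new long[0];
        }
        long[] dod = new long[ts.length - 2];
        for (int i = 2; i < ts.length; i++) {
            long delta = ts[i] - ts[i - 1];
            long prevDelta = ts[i - 1] - ts[i - 2];
            dod[i - 2] = delta - prevDelta;
        }
        return dod;
    }
}
```

For a perfectly periodic column such as 0, 10, 20, 30 every delta-of-delta is zero, which a Gorilla-style encoder can store in far fewer bits than the 8 raw bytes per value. Unordered or jumpy columns produce large values, which is the case where the server is described as falling back to raw.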

Limitations / Phase 2 backlog

Documented in docs/QWP_EGRESS_PHASE2_BACKLOG.md. Remaining items:

  • Mid-stream CANCEL vs active WRITE-parked streaming — the plumbing (server flag, streamResults check, status mapping, client API) is complete, but the IO dispatcher registers each fd for a single operation (READ xor WRITE). While the server is parked on WRITE during a streaming query, kernel-buffered CANCEL frames are dispatched only after the write unblocks — by then the query has often completed. Fixing this requires registering for both read and write during streaming, a dispatcher-level change.
  • Single in-flight query per connection. Multi-query multiplexing requires a fair scheduler — a Phase 2 item.
  • A QwpResultCursor row-iterator wrapper is a stretch goal; the column-batch handler API covers all functional cases today.
  • Codec is duplicated between parent and submodule (encoder server-side, decoder client-side) — a shared module can collapse the duplication once the wire format stabilises.

Test plan

Twenty test classes cover the egress surface. Breakdown:

  • QwpEgressBootstrapTest — end-to-end handshake, SELECT over every wire type, multi-batch streaming, malformed connect strings, syntax errors, table-not-found, etc. Includes a regression test for the empty-result + reused-schema case (review finding #2).

  • QwpEgressTypesExhaustiveTest — min/max/zero/null boundary coverage for every wire type.

  • QwpEgressFuzzTest — property-based fuzz: random schema, random per-row values generated in Java, four random query shapes (full scan / projection reorder / id-range filter / reverse limit), row-by-row verification. Each client connect picks a random compression mode (auto / raw / zstd at a random level in [1, 9]) so the handshake negotiation paths get hit across the run.

  • QwpEgressFragmentationFuzzTest — stresses the state machines with both send- and recv-side DEBUG_HTTP_FORCE_{SEND,RECV}_FRAGMENTATION_CHUNK_SIZE capped at small random values. Includes a dedicated regression test for the handshake partial-write fix at chunk=5 bytes.

  • QwpEgressRequestDecoderTest — unit-level decoder coverage: no-binds, all scalar bind types, null, mixed.

  • QwpEgressPageFrameTest — PageFrameCursor streaming, multi-frame, column-top handling.

  • QwpEgressTimestampGorillaTest — ordered / unordered / DATE / TIMESTAMP_NANOS Gorilla coverage, per-column encoding-byte round-trips, batch-size boundaries (0/1/2/3 rows).

  • QwpEgressSymbolEdgeCaseTest — Unicode, emoji, long values, empty strings, NULL, dict growth.

  • QwpEgressDeltaSymbolDictTest — cross-batch and cross-query delta dict correctness, wire-byte savings checks.

  • QwpEgressDdlExecTest — every non-SELECT path (CREATE, INSERT, UPDATE, ALTER, DROP, TRUNCATE, parse-time-executed).

  • QwpEgressCancelTest — CANCEL frame plumbing + mapErrorStatus unit coverage for cancellation / interruption / OOM / auth / SQL / generic error paths.

  • QwpEgressCreditFlowTest — credit-flow correctness (unbounded back-compat, tiny credit forcing many suspend/resume cycles, large result under 4 KiB budget, mixed credit across queries on one connection).

  • QwpEgressErrorCoverageTest — runtime errors, type mismatches, peer disconnects mid-stream, handshake rejections.

  • QwpEgressCompressionTest — zstd round-trip at level 1, default level, and clamped level 22; raw fallback; auto default over multi-batch streaming; compression + chunk=23 fragmentation combined.

  • ZstdTest — native JNI round-trip: highly-compressible runs, random bytes, CCtx / DCtx reuse across 100 buffers, level clamp.

  • QwpQueryClientCloseTimeoutTest — clean-close path for wasLastCloseTimedOut() plus a reflection-injected timeout test that hijacks the I/O thread handle and shortens shutdownJoinMs to hit the leak-rather-than-SIGSEGV branch in ~150 ms.

  • QwpEgressConnSymbolDictTest — unit coverage for the connection-scoped symbol dict: addEntry dedup, rollback semantics for mid-batch failure (review finding #1), rollback boundary cases.

  • QwpEgressProcessorStateCacheResetTest — unit coverage for computeCacheResetMask and applyCacheReset: each cap in isolation, exact-boundary triggers, no-op when under caps, unknown mask bits ignored, independence of dict vs schema caches.

  • QwpEgressCacheResetWireTest — end-to-end coverage for the CACHE_RESET frame: dict-entry cap, dict-heap cap, schema cap; no-reset-under-defaults baseline; client dict stays consistent after a reset; in-flight streams are not disrupted by caps crossed mid-batch.

  • QwpEgressMetricsScrapeTest — end-to-end Prometheus metric coverage: connection gauge open/close, query-started / errored / exec-done counters, batch/row/byte counters, scrape output format, disabled-metrics no-op path.

  • Existing 358 QWP ingress unit tests remain green (no regressions on the ingestion side).

  • Reference benchmarks build clean against the local submodule (mvn -pl benchmarks -P local-client compile); require a running QuestDB to execute end-to-end.

🤖 Generated with Claude Code

coderabbitai Bot commented Apr 18, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Walkthrough

Comprehensive implementation of QWP egress protocol for WebSocket-based query result streaming. Adds frame serialization, schema management, compression negotiation, result buffering, request decoding, HTTP/WebSocket processing infrastructure, and native zstd compression bindings via JNI. Removes TYPE_STRING constant, consolidating string wire representation to TYPE_VARCHAR.

Changes

Cohort / File(s) Summary
Version Updates
benchmarks/pom.xml, core/pom.xml
Updated questdb.client.version from 1.1.0 to 1.1.1-SNAPSHOT; added comment guidance for released client override.
Rust zstd Bindings
core/rust/qdbr/Cargo.toml, core/rust/qdbr/src/lib.rs, core/rust/qdbr/src/qwp_zstd.rs, core/rust/qdb-sqllogictest/src/lib.rs
Added zstd 0.13 dependency; implemented JNI-native zstd compression/decompression context creation/freeing and compress/decompress operations with panic guards; exposed new qwp_zstd module.
QWP Protocol Constants & Types
core/src/main/java/io/questdb/cutlass/qwp/protocol/QwpConstants.java, core/src/main/java/io/questdb/cutlass/qwp/codec/QwpEgressMsgKind.java
Added compression codec constants (COMPRESSION_NONE, COMPRESSION_ZSTD, level bounds), FLAG_ZSTD header flag, TYPE_BINARY and TYPE_IPV4 column types; removed TYPE_STRING; added QUERY_REQUEST/RESULT_BATCH/RESULT_END/QUERY_ERROR/CANCEL/CREDIT/EXEC_DONE message kind constants and egress status codes.
QWP Codec & Column Handling
core/src/main/java/io/questdb/cutlass/qwp/codec/QwpColumnTypeMapper.java, core/src/main/java/io/questdb/cutlass/qwp/codec/QwpEgressColumnDef.java, core/src/main/java/io/questdb/cutlass/qwp/codec/QwpColumnScratch.java, core/src/main/java/io/questdb/cutlass/qwp/codec/QwpEgressConnSymbolDict.java, core/src/main/java/io/questdb/cutlass/qwp/codec/QwpResultBatchBuffer.java
Introduced column definition and schema management classes; QuestDB-to-QWP type mapping; per-column native scratch storage with dynamic buffer growth; connection-scoped symbol dictionary with UTF-8 deduplication and heap management; column-major result batch accumulator supporting PageFrame and row-wise append with NULL handling, timestamp Gorilla compression, and symbol dictionary deduping.
QWP Frame Serialization & Encoding
core/src/main/java/io/questdb/cutlass/qwp/codec/QwpEgressFrameWriter.java, core/src/main/java/io/questdb/cutlass/qwp/codec/QwpEgressSchemaWriter.java, core/src/main/java/io/questdb/cutlass/qwp/protocol/QwpBitWriter.java, core/src/main/java/io/questdb/cutlass/qwp/protocol/QwpGorillaEncoder.java
Added low-level frame header/payload serialization to unsafe memory; schema reference/full-mode encoding into native buffers; LSB-first bit-packing writer for frame data; server-side Gorilla delta-of-delta timestamp compression with bucket-based encoding.
QWP Request Decoding
core/src/main/java/io/questdb/cutlass/qwp/server/egress/QwpEgressRequestDecoder.java
Stateful incremental decoder for inbound QWP egress frames (QUERY_REQUEST, CANCEL, CREDIT) with comprehensive type-specific bind variable decoding, range validation, and error reporting via QwpParseException.
QWP HTTP/WebSocket Processing
core/src/main/java/io/questdb/cutlass/http/HttpFullFatServerConfiguration.java, core/src/main/java/io/questdb/cutlass/http/HttpServer.java, core/src/main/java/io/questdb/cutlass/qwp/server/QwpWebSocketHttpProcessor.java, core/src/main/java/io/questdb/cutlass/qwp/server/egress/QwpEgressHttpProcessor.java, core/src/main/java/io/questdb/cutlass/qwp/websocket/WebSocketFrameWriter.java
Added context-path constant and accessor for /read/v1 QWP endpoint; registered QwpEgressHttpProcessor factory; optimized WebSocket handshake response generation with thread-local SHA-1/base64 scratch buffers; added X-QWP-Accept-Encoding and X-QWP-Max-Batch-Rows header constants and optional content-encoding header writing; updated close frame to accept CharSequence for zero-alloc reason encoding.
QWP Egress State & Compression Negotiation
core/src/main/java/io/questdb/cutlass/qwp/server/egress/QwpEgressProcessorState.java, core/src/main/java/io/questdb/cutlass/qwp/server/egress/QwpEgressCompressionNegotiator.java
Per-connection state container managing batch accumulator, bind variables, pooled column definitions, symbol dictionary, credit-based flow control, streaming cursors (record and page-frame variants), schema-id caching with fingerprint deduplication, compression context/scratch buffers, and protocol lifecycle flags; compression negotiator parsing X-QWP-Accept-Encoding header with zstd level parsing and response header formatting.
QWP Egress Main Processor
core/src/main/java/io/questdb/cutlass/qwp/server/egress/QwpEgressUpgradeProcessor.java
Main HTTP request processor implementing WebSocket upgrade, QWP protocol negotiation, frame dispatch (QUERY_REQUEST/CANCEL/CREDIT/PING/PONG/CLOSE), SQL compilation with cursor caching, query streaming with batching/credit management, compression negotiation, error mapping, backpressure handling, and cancellation support.
Java/Rust Native Bindings
core/src/main/java/io/questdb/std/Zstd.java
JNI wrapper for zstd library providing static native method declarations for compression/decompression context creation/freeing and compress/decompress operations.
WAL & Protocol Updates
core/src/main/java/io/questdb/cutlass/line/tcp/QwpWalAppender.java, core/src/main/java/io/questdb/cutlass/qwp/protocol/QwpStringColumnCursor.java, core/src/main/java/io/questdb/cutlass/qwp/protocol/QwpTableBlockCursor.java
Removed QWP TYPE_STRING to VARCHAR mapping (no longer distinct type); updated string column cursor to handle only TYPE_VARCHAR; updated table block cursor initialization to dispatch TYPE_VARCHAR exclusively for strings.
Module Exports
core/src/main/java/module-info.java
Added exports for io.questdb.cutlass.qwp.codec and io.questdb.cutlass.qwp.server.egress packages.
Benchmark Applications
benchmarks/src/main/java/org/questdb/QwpEgressLatencyBenchmark.java, benchmarks/src/main/java/org/questdb/QwpEgressReadBenchmark.java, benchmarks/src/main/java/org/questdb/QwpEgressReadBenchmarkWide.java
Added three JMH/performance benchmarks measuring QWP egress latency, read throughput on standard and wide tables with comparison to PostgreSQL wire and HTTP /exec JSON paths.
Protocol Tests (TYPE_STRING Removal)
core/src/test/java/io/questdb/test/cutlass/line/tcp/QwpWalAppenderTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpColumnDefTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpConstantsTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpCursorBoundsCheckFuzzTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpCursorBoundsCheckTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpSchemaRegistryTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpSchemaTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpStringDecoderTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/e2e/QwpSenderE2ETest.java
Updated test assertions and type constants to use TYPE_VARCHAR exclusively; removed TYPE_STRING test cases and expected error message references.
Egress Codec & Encoding Tests
core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressRequestDecoderTest.java, core/src/test/java/io/questdb/test/std/ZstdTest.java
Added comprehensive unit tests for request decoder covering all bind types, edge cases, and truncation validation; added zstd round-trip and error-case tests for native compression bindings.
Egress Integration & E2E Tests
core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressBootstrapTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressBindRoundTripTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressBindFuzzTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressDdlExecTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressCancelTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressCreditFlowTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressCompressionTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressDeltaSymbolDictTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressErrorCoverageTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressFragmentationFuzzTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressFuzzTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressPageFrameTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressProcessorStateTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressSchemaIdExhaustionTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressSymbolEdgeCaseTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressTimestampGorillaTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpEgressTypesExhaustiveTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpQueryClientCloseTimeoutTest.java, core/src/test/java/io/questdb/test/cutlass/qwp/QwpQueryClientTerminalFailureTest.java
Comprehensive e2e and integration test suite covering protocol bootstrap, bind parameter round-tripping across all types, DDL/DML execution, cancellation/flow control, compression negotiation, symbol dictionary behavior, fragmentation/fuzz testing, timestamp Gorilla encoding, exhaustive type handling, page-frame streaming, schema caching, and client lifecycle management.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A protocol blooms, frames dance through the night,
Schema and symbols streaming with delight,
Zstd compresses, gorillas encode,
Batches flow onward down egress's road,
WebSocket whispers: "Your query's in flight!" ✨


@bluestreak01 bluestreak01 added the New feature (Feature requests), Performance (Performance improvements), REST API (Issues or changes relating to the HTTP endpoints), and Documentation (Missing or suggested improvements for documentation) labels Apr 18, 2026
@bluestreak01 bluestreak01 changed the title from "feat(qwp): add QWP egress for streaming SQL query results over WebSocket" to "feat(http): add QWP egress for streaming SQL query results over WebSocket" Apr 19, 2026
bluestreak01 and others added 7 commits April 23, 2026 12:39
…e; move SERVER_INFO to 0x18

# Conflicts:
#	core/src/main/java/io/questdb/cutlass/qwp/codec/QwpEgressMsgKind.java
#	core/src/main/java/io/questdb/cutlass/qwp/server/egress/QwpEgressUpgradeProcessor.java
#	docs/QWP_EGRESS_EXTENSION.md
#	java-questdb-client
@bluestreak01 bluestreak01 added the Core (Related to storage, data type, etc.) label Apr 23, 2026
bluestreak01 and others added 15 commits April 23, 2026 21:04
The Javadoc block on testNoResetMidStream was at column 0 instead of
being aligned to the 4-space method indent. IntelliJ's formatter
rewrites it, so the JDK17 CI lint step fails on the resulting diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	THIRD_PARTY_LICENSES.txt
#	core/src/main/resources/io/questdb/bin/darwin-aarch64/libquestdbr.dylib
#	core/src/main/resources/io/questdb/bin/darwin-x86-64/libquestdbr.dylib
#	core/src/main/resources/io/questdb/bin/linux-aarch64/libquestdbr.so
#	core/src/main/resources/io/questdb/bin/linux-x86-64/libquestdbr.so
#	core/src/main/resources/io/questdb/bin/windows-x86-64/questdbr.dll
Bumps the java-questdb-client submodule to pick up the QWP
client-review Tier 1 fixes: decoder bounds and cap fixes, bind
encoder NULL framing and per-width scale checks, geohash value
masking, QwpBatchBuffer capacity-growth fix, idempotent client
close, releaseBuffer/closePool race guard, and per-generation
terminalFailure latches. The NULL framing change aligns the wire
format with what the server's bind parser already expects, so
DECIMAL64/128/256 and GEOHASH NULLs now decode without misframing
the rest of the batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	core/src/main/java/io/questdb/cutlass/qwp/websocket/WebSocketFrameWriter.java
# Conflicts:
#	benchmarks/pom.xml
#	core/pom.xml
#	core/src/main/java/io/questdb/cutlass/qwp/protocol/QwpConstants.java
#	core/src/main/resources/io/questdb/bin/windows-x86-64/questdbr.dll
#	core/src/test/java/io/questdb/test/cutlass/qwp/QwpStringDecoderTest.java
#	java-questdb-client
@mtopolnik
Contributor

[PR Coverage check]

😍 pass : 2471 / 2867 (86.19%)

file detail

| path | covered line | new line | coverage |
|---|---|---|---|
| 🔵 io/questdb/cutlass/qwp/protocol/QwpConstants.java | 0 | 2 | 00.00% |
| 🔵 io/questdb/cutlass/qwp/protocol/QwpGorillaEncoder.java | 67 | 90 | 74.44% |
| 🔵 io/questdb/cutlass/qwp/server/egress/QwpEgressCompressionNegotiator.java | 64 | 82 | 78.05% |
| 🔵 io/questdb/cutlass/qwp/server/egress/QwpEgressRequestDecoder.java | 224 | 281 | 79.72% |
| 🔵 io/questdb/cutlass/qwp/server/egress/QwpEgressUpgradeProcessor.java | 606 | 746 | 81.23% |
| 🔵 qdbr/src/qwp_zstd.rs | 107 | 132 | 81.06% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpColumnTypeMapper.java | 30 | 35 | 85.71% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpEgressFrameWriter.java | 71 | 83 | 85.54% |
| 🔵 io/questdb/cutlass/qwp/protocol/QwpBitWriter.java | 39 | 45 | 86.67% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpResultBatchBuffer.java | 355 | 395 | 89.87% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpColumnScratch.java | 389 | 436 | 89.22% |
| 🔵 io/questdb/cutlass/qwp/server/QwpWebSocketHttpProcessor.java | 29 | 32 | 90.62% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpEgressColumnDef.java | 19 | 20 | 95.00% |
| 🔵 io/questdb/cutlass/qwp/server/egress/QwpEgressProcessorState.java | 287 | 301 | 95.35% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpEgressConnSymbolDict.java | 82 | 85 | 96.47% |
| 🔵 io/questdb/cutlass/qwp/websocket/WebSocketFrameWriter.java | 8 | 8 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/codec/DefaultQwpServerInfoProvider.java | 6 | 6 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/server/egress/QwpEgressHttpProcessor.java | 6 | 6 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/protocol/QwpStringColumnCursor.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/cairo/CairoConfigurationWrapper.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/cutlass/line/tcp/QwpWalAppender.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/cutlass/http/HttpFullFatServerConfiguration.java | 4 | 4 | 100.00% |
| 🔵 io/questdb/Metrics.java | 3 | 3 | 100.00% |
| 🔵 io/questdb/cutlass/http/HttpServer.java | 3 | 3 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/protocol/QwpMessageHeader.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/server/egress/QwpEgressMetrics.java | 49 | 49 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/server/QwpWebSocketUpgradeProcessor.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/cairo/CairoConfiguration.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/cutlass/qwp/codec/QwpEgressSchemaWriter.java | 17 | 17 | 100.00% |
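
For anyone sanity-checking the report, each coverage figure appears to be covered lines divided by new lines, expressed as a percentage. The sketch below is only an illustration of that arithmetic: `CoverageCheck` and `percent` are hypothetical names, not part of the coverage tooling, and the rounding mode of the real bot is an assumption, so exact ties (for example 29 / 32 = 90.625) may print differently from the report.

```java
import java.util.Locale;

public class CoverageCheck {
    // Hypothetical helper: share of new lines that are covered,
    // formatted with two decimals. The bot's own rounding is not
    // documented in this thread, so half-up rounding is an assumption.
    static String percent(int covered, int newLines) {
        if (newLines == 0) {
            return "n/a"; // nothing new to measure
        }
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * covered / newLines);
    }

    public static void main(String[] args) {
        System.out.println(percent(2471, 2867)); // overall PR figure
        System.out.println(percent(67, 90));     // QwpGorillaEncoder.java
    }
}
```

With the numbers from the report, the two calls correspond to the overall 86.19% and the 74.44% reported for QwpGorillaEncoder.java.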

@bluestreak01 bluestreak01 merged commit 554aced into master Apr 26, 2026
55 checks passed
@bluestreak01 bluestreak01 deleted the vi_egress branch April 26, 2026 04:36
bluestreak01 added a commit that referenced this pull request Apr 26, 2026
Resolved conflicts:
- core/src/test/java/io/questdb/test/cairo/CreateTableTest.java:
  Took the union of imports added on both sides (PropertyKey, Rnd
  from vi_idx; CairoException, LPSZ from master). All four are used
  post-merge.
- core/src/main/resources/io/questdb/bin/windows-x86-64/libquestdb.dll:
  Kept the vi_idx version. The master-side change (PR #6991, QWP
  egress) only rebuilt the DLL without C++ source changes, while
  vi_idx's DLL contains real native code changes for the posting
  index work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nwoolmer added a commit that referenced this pull request Apr 28, 2026
Brings in 14 upstream commits (master..errand merge-base 5037278
through b6b3b15). Notable upstream changes:

* JDK 25 migration (#6980) — drops Java 8/11 reflection helpers
  (isJava8Or11, is32BitJVM, getOrdinaryObjectPointersCompressionStatus,
  AccessibleObject_override_fieldOffset, OVERRIDE constant) from
  Unsafe.java. Our errand branch had inherited them; they are dead
  code in our tree too (the only call site was the dropped helper
  itself). Followed upstream and removed them.
* feat(http): QWP egress / WebSocket SQL streaming (#6991, #7004)
* perf(sql): faster GROUP BY / hash join finalizer (#6997)
* fix(sql): nested window inside aggregate (#6943, #6955)
* feat(sql): configurable parquet export encodings (#6949)

Conflict resolutions:

* core/src/main/java/io/questdb/std/Unsafe.java
  Both sides added different methods between setRssMemLimit() and
  checkAllocLimit(). Kept all three: storeFence (upstream),
  chargeExternalRss (ours), and dropped the dead
  AccessibleObject_override_fieldOffset() helper per upstream's
  JDK 25 cleanup.

* core/src/main/java/io/questdb/std/str/GcUtf8String.java
  Our errand branch redesigned this class for lazy native allocation
  (data byte[] field, allocateNative() on first ptr() call) instead
  of upstream's eager DirectByteBuffer + reflection-fetched address.
  Upstream's only post-divergence change was a tiny refactor
  replacing Unsafe.getUnsafe().X with Unsafe.X — irrelevant to our
  redesign. Took ours.

* core/src/main/resources/io/questdb/bin/linux-x86-64/libquestdbr.so
  Built artifact — both sides changed Rust source. Took ours as
  placeholder; will refresh via `mvn package -pl core -am
  -DskipTests` immediately after this commit.

Labels

Core (Related to storage, data type, etc.); Documentation (Missing or suggested improvements for documentation); New feature (Feature requests); Performance (Performance improvements); REST API (Issues or changes relating to the HTTP endpoints); tandem


2 participants