feat: optional gzip compression for ingest#18

Merged
acoshift merged 1 commit into main from perf/ingest-gzip on May 25, 2026

Conversation

@acoshift
Member

Summary

Adds opt-in gzip compression for the ingest path (review item #5 from the earlier performance pass). NDJSON is highly compressible, so for log-shaped workloads this trades a little CPU for substantially fewer bytes on the wire, which can also lower request latency.

  • New SetGzip(true) enables it. Off by default — existing behavior is unchanged.
  • When enabled, each batch is gzip-compressed and sent with Content-Encoding: gzip. The Quickwit endpoint must accept gzip-encoded ingest requests.
  • The gzip writer is per-worker and reused across flushes via Reset, so the compressor isn't reallocated per batch — matching how the JSON encoder/buffer are already reused.
  • Compression happens after the empty-body check and before request construction, so the 413 auto-reduce/chunking path compresses each chunk independently. Sending via bytes.Reader still lets net/http set Content-Length (compressed size).
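
A minimal sketch of the reuse pattern described above, not the code from this PR. The `worker` type, the `compress`, `newIngestRequest`, and `gunzip` helpers, and the URL are all hypothetical names chosen for illustration; only the `compress/gzip`, `bytes`, and `net/http` calls are real standard-library APIs.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
)

// worker is a hypothetical stand-in for the client's per-worker state.
type worker struct {
	zbuf bytes.Buffer // compressed output, reused across flushes
	zw   *gzip.Writer // allocated once, then rebound with Reset
}

// compress gzips body into w.zbuf. The returned slice aliases the buffer,
// so it is only valid until the next call on the same worker.
func (w *worker) compress(body []byte) ([]byte, error) {
	w.zbuf.Reset()
	if w.zw == nil {
		w.zw = gzip.NewWriter(&w.zbuf)
	} else {
		w.zw.Reset(&w.zbuf) // reuse the compressor instead of reallocating it
	}
	if _, err := w.zw.Write(body); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip trailer.
	if err := w.zw.Close(); err != nil {
		return nil, err
	}
	return w.zbuf.Bytes(), nil
}

// newIngestRequest builds the POST for one already-compressed chunk.
func newIngestRequest(url string, compressed []byte) (*http.Request, error) {
	// A *bytes.Reader body lets net/http fill in ContentLength.
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Encoding", "gzip")
	return req, nil
}

// gunzip reverses compress; it is here only to check the round trip.
func gunzip(b []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	w := &worker{}
	// Two flushes through the same worker; the second reuses the writer.
	for _, batch := range []string{"{\"n\":1}\n{\"n\":2}\n", "{\"n\":3}\n"} {
		z, err := w.compress([]byte(batch))
		if err != nil {
			panic(err)
		}
		// Placeholder URL, not a real endpoint.
		req, err := newIngestRequest("http://quickwit.example/api/v1/logs/ingest", z)
		if err != nil {
			panic(err)
		}
		plain, err := gunzip(z)
		if err != nil {
			panic(err)
		}
		fmt.Println(req.ContentLength == int64(len(z)), string(plain) == batch)
	}
}
```

Because `compress` runs once per call, a chunking path that splits a rejected batch and calls it again for each piece would produce an independent gzip stream per chunk, which is the behavior the last bullet describes.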

Test plan

  • New TestIngest_GzipCompressesBody: the test server requires Content-Encoding: gzip, decompresses the body, and asserts that every record is delivered in order. (If the header were missing, the server would return 400, so the test genuinely exercises the encoding path.)
  • All existing tests implicitly cover the default-off path (they read the raw body without decompression).
  • go vet ./... clean
  • go test ./... and go test -race ./... pass
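
The server-side check in the first bullet could look roughly like the sketch below. This is not the PR's test: `ingestHandler` and `send` are hypothetical helpers, and the sketch drives the handler directly with `httptest.NewRecorder` rather than going through the real client.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ingestHandler appends each received NDJSON line to *got.
// Requests without Content-Encoding: gzip are rejected with a 400.
func ingestHandler(got *[]string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Content-Encoding") != "gzip" {
			http.Error(w, "expected gzip body", http.StatusBadRequest)
			return
		}
		zr, err := gzip.NewReader(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		defer zr.Close()
		sc := bufio.NewScanner(zr)
		for sc.Scan() {
			*got = append(*got, sc.Text())
		}
		if err := sc.Err(); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
		}
	}
}

// send drives the handler once, optionally gzip-encoding the body,
// and returns the response status code.
func send(h http.Handler, body []byte, gz bool) int {
	payload := body
	if gz {
		var buf bytes.Buffer
		zw := gzip.NewWriter(&buf)
		zw.Write(body) // writes to a bytes.Buffer do not fail
		zw.Close()
		payload = buf.Bytes()
	}
	req := httptest.NewRequest(http.MethodPost, "/ingest", bytes.NewReader(payload))
	if gz {
		req.Header.Set("Content-Encoding", "gzip")
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	var got []string
	h := ingestHandler(&got)
	body := []byte("{\"n\":1}\n{\"n\":2}\n{\"n\":3}\n")

	fmt.Println("plain:", send(h, body, false)) // rejected: header missing
	fmt.Println("gzip: ", send(h, body, true), got)
}
```

Rejecting the uncompressed request is what keeps the test honest: a client that silently skipped compression would fail instead of passing by accident.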

Notes

  • Gzip always compresses when enabled; very small payloads can expand slightly, but this is opt-in for callers who know their batch profile.
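
To make the small-payload caveat concrete, here is a rough sketch that compares raw and gzip sizes for a one-record body and for a repetitive log-shaped batch. `gzipLen` and `sampleBatch` are hypothetical helpers and the sample records are made up; exact sizes depend on the data and the compression level, so none are quoted here.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// gzipLen returns the size of body after gzip compression at the default level.
func gzipLen(body []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(body) // writes to a bytes.Buffer do not fail
	zw.Close()     // flushes and appends the gzip trailer
	return buf.Len()
}

// sampleBatch builds n identical NDJSON records, a best case for compression.
func sampleBatch(n int) []byte {
	return []byte(strings.Repeat("{\"level\":\"info\",\"msg\":\"request served\"}\n", n))
}

func main() {
	small := []byte("{\"n\":1}\n")
	large := sampleBatch(1000)

	// The gzip header and trailer add a fixed overhead, so the tiny body grows.
	fmt.Printf("small: %d raw, %d gzip\n", len(small), gzipLen(small))
	// The repetitive batch shrinks to a small fraction of its raw size.
	fmt.Printf("large: %d raw, %d gzip\n", len(large), gzipLen(large))
}
```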

🤖 Generated with Claude Code

Add SetGzip(true) to opt into gzip-compressing the NDJSON ingest body.
When enabled, each batch is compressed and sent with a
Content-Encoding: gzip header; the endpoint must accept gzip-encoded
ingest. Off by default, so existing behavior is unchanged.

The gzip writer is per-worker and reused across flushes via Reset to
avoid reallocating the compressor on every batch. Compression happens
after the empty-body check and before request construction, so the
413 auto-reduce/chunking path compresses each chunk independently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@acoshift merged commit 5a877ef into main on May 25, 2026
1 check passed
@acoshift deleted the perf/ingest-gzip branch on May 25, 2026 14:00
