iggy currently persists everything to local disk: per-partition append-only `.log` segments, sparse `.index` files, an append-only state log, per-consumer offset files, system info, and tokens. This issue proposes an opt-in mode where an S3-compatible object store is the only persistence medium — useful for ephemeral / scale-to-zero compute, durable-by-default archive, and deployments where local NVMe provisioning is the bottleneck.
Local-disk mode is unchanged and remains the default; the S3 backend ships behind a default-off cargo feature so fs-only deployments aren't affected.
Component
Iggy server
Proposed solution
A new `ObjectStorage` trait abstracts persistence, with a 10-phase incremental rollout. Each phase migrates one persistence subsystem onto the trait and is independently mergeable.
The S3 client is compio-native: built on `rusty-s3` (sans-IO SigV4 signing + request shaping) plus `cyper` (a compio HTTP client with rustls TLS). Critically, this avoids reintroducing the tokio dependency that #2020 just removed from the data path.
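For orientation, a minimal sketch of what the seam could look like. The method names and signatures below are illustrative assumptions, not the final Phase 1 API:

```rust
// Illustrative sketch of the persistence seam — names/signatures are not final.
use std::ops::Range;

#[derive(Debug)]
pub enum ObjectStorageError {
    NotFound,
    AlreadyExists, // conditional PUT lost the race (HTTP 412 on S3)
    Other(String),
}

/// Implemented by `CompioFsStorage` (local disk), `InMemoryStorage` (tests)
/// and `S3Storage` (behind the `object-storage` feature). No `Send` bound:
/// compio is thread-per-core, so futures stay on their core.
pub trait ObjectStorage {
    /// Write a whole object.
    async fn put(&self, key: &str, data: &[u8]) -> Result<(), ObjectStorageError>;

    /// Ranged read — backs sealed-segment reads (Phase 4).
    async fn get_range(&self, key: &str, range: Range<u64>)
        -> Result<Vec<u8>, ObjectStorageError>;

    /// Create-only write (`If-None-Match: *`) — backs the Phase 8 lease.
    async fn put_if_absent(&self, key: &str, data: &[u8])
        -> Result<(), ObjectStorageError>;

    /// List keys under a prefix — backs bootstrap and directory ops (Phase 5).
    async fn list(&self, prefix: &str) -> Result<Vec<String>, ObjectStorageError>;

    async fn delete(&self, key: &str) -> Result<(), ObjectStorageError>;
}
```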
Phase plan

| Phase | Scope |
| --- | --- |
| 1 | `ObjectStorage` trait + `CompioFsStorage` + `InMemoryStorage` (test) + `S3Storage` (feature-gated) + `BufferedMultipartWriter`. Seam only — no production callers yet. ← this issue's first deliverable |
| 2 | State log + `FileSystemInfoStorage` + tokens onto the trait (journal + snapshot model on the object backend). |
| 3 | Segment writes via multipart upload. |
| 4 | Segment reads via ranged GET + LRU byte cache (active-segment reads keep using the in-flight buffer); see the read-path sketch below the table. |
| 5 | Bootstrap, directory ops, per-partition versioned manifests for fast boot. |
| 6 | Consumer offsets repacked into one binary object per partition. |
| 7 | Retention + segment deletion. |
| 8 | Per-partition lease object (S3 conditional PUT) for split-brain safety. |
| 9 | Hardening — retries, Prometheus metrics, IAM template, perf benchmarks. |
| 10 | Documentation, sample config, release notes. |
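The Phase 4 read path, sketched against the illustrative trait above. The chunk size and the toy cache are placeholders; the real cache policy and keying are a Phase 4 decision:

```rust
// Sketch of the Phase 4 read path: sealed-segment reads go through a ranged
// GET with a byte cache keyed by (segment, chunk index). A HashMap stands in
// for a real LRU with a byte budget.
use std::collections::HashMap;
use std::ops::Range;

const CHUNK_SIZE: u64 = 1024 * 1024; // placeholder: 1 MiB cache granularity

struct SegmentReader<S> {
    storage: S,
    cache: HashMap<(String, u64), Vec<u8>>, // toy LRU stand-in
}

impl<S: ObjectStorage> SegmentReader<S> {
    /// Read `range` from a sealed segment, chunk-aligned so repeat reads hit cache.
    async fn read(
        &mut self,
        segment_key: &str,
        range: Range<u64>,
    ) -> Result<Vec<u8>, ObjectStorageError> {
        let mut out = Vec::with_capacity((range.end - range.start) as usize);
        let mut pos = range.start;
        while pos < range.end {
            let chunk_idx = pos / CHUNK_SIZE;
            let chunk_start = chunk_idx * CHUNK_SIZE;
            let key = (segment_key.to_owned(), chunk_idx);
            if !self.cache.contains_key(&key) {
                // One ranged GET per missing chunk.
                let chunk = self
                    .storage
                    .get_range(segment_key, chunk_start..chunk_start + CHUNK_SIZE)
                    .await?;
                self.cache.insert(key.clone(), chunk);
            }
            let chunk = &self.cache[&key];
            let lo = (pos - chunk_start) as usize;
            let hi = ((range.end.min(chunk_start + CHUNK_SIZE)) - chunk_start) as usize;
            out.extend_from_slice(&chunk[lo..hi.min(chunk.len())]);
            pos = chunk_start + hi as u64;
        }
        Ok(out)
    }
}
```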
The S3 backend ships behind a default-off `object-storage` cargo feature, so fs-only deployments don't pull `rusty-s3` / `cyper` / `url` into their dependency graph.
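For illustration, a hypothetical shape for the server config once the feature is enabled. The key names are made up here; the real schema lands with Phase 10's sample config:

```toml
# Hypothetical config sketch — key names are illustrative, not a final schema.
# Assumes a server built with `--features object-storage`.
[system.storage]
backend = "s3" # default: "fs" — local disk, unchanged

[system.storage.s3]
endpoint = "https://s3.us-east-1.amazonaws.com"
region = "us-east-1"
bucket = "iggy-data"
multipart_part_size = "8 MiB" # AWS minimum is 5 MiB for all but the final part
ack_after_upload = true       # ack producers only after the part upload succeeds
```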
Phase 0 spike outcome
A throwaway feasibility spike (~330 LoC) ran against real AWS S3 in an ephemeral bucket (us-east-1; 1-day lifecycle backstop; bucket torn down on exit). All five scenarios passed:
| Scenario | Latency |
| --- | --- |
| PUT 1 KiB | 108 ms |
| Range-GET 256 B | 33 ms |
| Multipart 12 MiB upload (3 parts) | 1555 ms |
| Full GET + byte-compare 12 MiB | 919 ms |
| Conditional PUT race (`If-None-Match: *`) | 62 ms; loser fenced cleanly with HTTP 412 |
Three correctness findings, baked into Phase 1 (a writer sketch folding in findings 2 and 3 follows this list):

- rustls 0.23 needs an explicit `CryptoProvider::install_default()` call when cyper is built with `default-features = false`.
- AWS ETags arrive wrapped in quotes, and `rusty-s3`'s `complete_multipart_upload` re-wraps them when serializing the XML body, so each ETag must be stripped with `.trim_matches('"')` before being passed back. Otherwise S3 rejects the completion with `400 InvalidPart`.
- The multipart minimum part size is 5 MiB for every part except the final one; iggy's typical sub-MiB flushes need a buffering layer (`BufferedMultipartWriter`) to coalesce them into legal parts.
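A minimal sketch of that coalescing writer. `upload_part` and `complete` are placeholders for the real rusty-s3 request-building + cyper execution, and `ObjectStorageError` is the illustrative error type from the trait sketch above:

```rust
// Finding 1 lives at process start-up, not here: with cyper built without
// default features, rustls 0.23 needs e.g.
//   rustls::crypto::ring::default_provider().install_default()
// to run once before the first TLS connection.

const MIN_PART_SIZE: usize = 5 * 1024 * 1024; // S3 minimum for all but the final part

// Placeholders for the real UploadPart / CompleteMultipartUpload requests
// (rusty-s3 builds the signed request, cyper executes it).
async fn upload_part(_n: u16, _data: &[u8]) -> Result<String, ObjectStorageError> { todo!() }
async fn complete(_etags: &[String]) -> Result<(), ObjectStorageError> { todo!() }

struct BufferedMultipartWriter {
    buf: Vec<u8>,
    part_number: u16, // S3 part numbers run 1..=10_000
    etags: Vec<String>,
}

impl BufferedMultipartWriter {
    /// Called on every iggy flush; sub-MiB writes just grow the buffer.
    async fn write(&mut self, data: &[u8]) -> Result<(), ObjectStorageError> {
        self.buf.extend_from_slice(data);
        // Finding 3: only ship a part once it is a legal (>= 5 MiB) part.
        while self.buf.len() >= MIN_PART_SIZE {
            let part: Vec<u8> = self.buf.drain(..MIN_PART_SIZE).collect();
            self.part_number += 1;
            let etag = upload_part(self.part_number, &part).await?;
            // Finding 2: strip the quotes AWS wraps around the ETag, since
            // rusty-s3's complete_multipart_upload re-quotes it in the XML.
            self.etags.push(etag.trim_matches('"').to_owned());
        }
        Ok(())
    }

    /// Seal the object: the final part is allowed to be smaller than 5 MiB.
    async fn finish(mut self) -> Result<(), ObjectStorageError> {
        if !self.buf.is_empty() {
            self.part_number += 1;
            let etag = upload_part(self.part_number, &self.buf).await?;
            self.etags.push(etag.trim_matches('"').to_owned());
        }
        complete(&self.etags).await
    }
}
```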
Alternatives considered

- Build on `opendal` instead of `rusty-s3` + `cyper`. Tempting because opendal supports more backends out of the box (native GCS, native Azure), but opendal is tokio-native. Pulling it in would mean either reintroducing tokio into iggy's data-path runtime (undoing "feat(io_uring): replace tokio s3 crate", #2020) or running opendal behind a per-call thread bridge (channel-hop overhead on every S3 call, plus an extra runtime). Rejected.
- Use `rust-s3` (already a transitive dependency via the iceberg sink). Also tokio-based, via reqwest, so it has the same problem. Left in place for the existing iceberg connector; not used for the new path.
- Roll a thin compio-native HTTP client over `compio::net::TcpStream` + `rustls` directly. Minimal external surface, but ~300 LoC of HTTP/SigV4 plumbing to maintain in-tree. Reserved as a fallback if `cyper` later turns out to have sharp edges; the Phase 0 spike confirmed it doesn't today.
Open questions for maintainers
1. Issue cadence. Happy to file separate issues per phase (one issue, one PR) if you prefer that to a single umbrella. This issue is intended to anchor design discussion for the milestone; each phase still ships in its own PR(s).
2. Default `multipart_part_size`. Currently 8 MiB (configurable). The AWS minimum is 5 MiB for all but the final part. Smaller parts mean finer-grained durability but more S3 PUTs; larger parts mean fewer PUTs but larger memory buffers. 8 MiB is a starting guess; 5, 16, or 32 MiB are the obvious alternatives.
3. `ack_after_upload` default. True (the producer ack waits for part-upload success, so data is durable before the producer learns it was accepted) is the safe default. False is faster but loses messages on a crash before the next flush, so it is intended for testing only. Reasonable?
4. GCS S3-compat support. Phase 8 (fencing) uses the S3 `If-None-Match: *` conditional PUT; a sketch follows this list. AWS / MinIO / R2 / Tigris support it; GCS's S3-compat layer does not. Is it OK to document GCS as unsupported in Phase 8 and revisit later, or should we adopt a different fencing mechanism (a paid coordination service)?
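For concreteness, the Phase 8 lease in terms of the illustrative trait from above; the key layout here is hypothetical:

```rust
// Sketch of the Phase 8 per-partition lease. On S3, `put_if_absent` maps to a
// PUT with `If-None-Match: *`: exactly one writer wins, every racer is fenced
// with HTTP 412 (the behavior the Phase 0 spike measured at 62 ms).
async fn acquire_partition_lease<S: ObjectStorage>(
    storage: &S,
    stream_id: u32,
    partition_id: u32,
    node_id: &str,
) -> Result<bool, ObjectStorageError> {
    // Hypothetical key layout — the real layout is a Phase 5/8 decision.
    let key = format!("leases/{stream_id}/{partition_id}");
    match storage.put_if_absent(&key, node_id.as_bytes()).await {
        Ok(()) => Ok(true),                                  // we own the partition
        Err(ObjectStorageError::AlreadyExists) => Ok(false), // fenced by another node
        Err(e) => Err(e),
    }
}
```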