Releases: abyo-software/s4

v1.2.0 — day-2 operations: prove it, automate it, keep it healthy

11 Jun 13:43

v1.2 — day-2 operations: prove it, automate it, keep it healthy.
Four additive features — the savings ledger + s4 savings (measured
$ saved in production, the counterpart to s4 estimate's prediction),
s4 maintain (policy-driven migrate/recompact/storage-class
transitions, one-shot or resident), dictionary day-2 ops
(s4 dict-status + restart-less SIGHUP rotation), and opt-in s4fs
writes (pandas/pyarrow writing gateway-compatible objects without the
gateway) — hardened by a 3-round dual-reviewer audit (findings
17 → 4 → 5 doc-trivia; zero P1 across all rounds). The v1.0 freeze
contract holds: everything below is additive and default-off;
flag-less behavior is bit-for-bit identical to v1.1.0.

Fixed (v1.2 audit round 2 — adversarial verification of the round-1 fix wave)

  • P2 Replication replicas no longer carry the s4-ledger marker
    (they are never ledger-counted, so a gateway-routed delete of a
    replica would have re-opened the asymmetric-subtraction bug round 1
    closed). Both replication metadata capture sites route through a
    marker-stripping snapshot helper.
  • P2 Ledger add and subtract now resolve the same logical size on
    churn: ledger-enabled SSE/versioned multipart re-PUTs stamp
    s4-original-size/s4-compressed-size (the exact values the add
    used), REPLACE copies stamp the probe-resolved original, and
    COPY-directive copies probe the destination for the add. Previously
    each add→delete cycle of a sidecar-suppressed multipart object
    stranded phantom original bytes (overstated savings). New e2e pins a
    versioned-multipart churn returning bucket totals to exactly zero.
  • P3 Marker-without-add cases (cap-exceeded multipart Complete,
    ledger flag toggles, and replicas written between the round-1 and
    round-2 fixes — moot for releases since v1.2 ships both) are now
    disclosed in the module contract, the report notes, and the
    Complete-path WARN — not eliminated (zero-clamp + drift note remain
    the guard rails).
  • P3 Access-point copy sources: REPLACE copies now strip
    client-supplied s4-* metadata regardless of the source-addressing
    variant (forged s4-ledger/s4-original-size via AP ARN closed),
    and AP sources can no longer read reserved keys (.s4index /
    .s4dict/).
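
The marker-gated accounting that these round-2 fixes tighten can be modelled in a few lines. This is a minimal sketch and not the gateway's implementation: the `Ledger` class, its method names, the marker value and the in-memory totals are all hypothetical. Only the s4-ledger marker, the stamped s4-original-size / s4-compressed-size values, the skipped_unaccounted tally and the floor at zero come from the notes.

```python
from dataclasses import dataclass, field

LEDGER_MARKER = "s4-ledger"  # key named in the notes; the value "1" below is assumed


@dataclass
class BucketTotals:
    original: int = 0
    compressed: int = 0
    skipped_unaccounted: int = 0


@dataclass
class Ledger:
    """Toy model: only objects that carry the marker are ever subtracted."""
    buckets: dict = field(default_factory=dict)

    def _totals(self, bucket):
        return self.buckets.setdefault(bucket, BucketTotals())

    def add(self, bucket, metadata, original, compressed):
        totals = self._totals(bucket)
        totals.original += original
        totals.compressed += compressed
        # Stamp the exact sizes used for the add, so a later delete
        # subtracts the same logical size.
        metadata[LEDGER_MARKER] = "1"
        metadata["s4-original-size"] = str(original)
        metadata["s4-compressed-size"] = str(compressed)

    def subtract(self, bucket, metadata):
        totals = self._totals(bucket)
        if LEDGER_MARKER not in metadata:
            # Never added (backend-direct write, replica, ...): skip and tally.
            totals.skipped_unaccounted += 1
            return
        totals.original -= int(metadata["s4-original-size"])
        totals.compressed -= int(metadata["s4-compressed-size"])

    def saved_bytes(self, bucket):
        totals = self._totals(bucket)
        return max(0, totals.original - totals.compressed)  # floor at zero


def replica_metadata(metadata):
    """Snapshot for a replica: same metadata, marker stripped."""
    return {k: v for k, v in metadata.items() if k != LEDGER_MARKER}
```

In this model, deleting a replica written with `replica_metadata(...)` increments `skipped_unaccounted` and leaves the totals alone, while an add followed by a delete of the marked object returns the bucket totals to zero.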

Fixed (v1.2 audit round 1 — 4 reviewers over v1.1.0..HEAD, 2026-06-12)

  • P2 The savings ledger no longer subtracts objects it never added:
    gateway writes made with the ledger enabled stamp an unforgeable
    internal s4-ledger metadata marker, and deletes/overwrites of
    unmarked objects (backend-direct, s4fs-written, migrate /
    recompact output) skip subtraction with a per-bucket
    skipped_unaccounted tally + report note. Ratio and $/month floor at
    0 with a drift note. Previously a migrate-baked bucket could report
    negative savings after gateway-routed deletes (incl. lifecycle
    expiry).
  • P2 s4 maintain transition copies are pinned with
    x-amz-copy-source-if-match: a concurrent overwrite between the
    attribute HEAD and the CopyObject now makes the backend refuse with
    412 (counted etag-raced) instead of stamping the old object's
    s4-* manifest onto the new bytes (which made the key unreadable
    through the gateway until the next rewrite).
  • P2 CompleteMultipartUpload no longer runs the ledger's
    frame-scan accounting when the ledger flag is off (CPU-only
    regression on large multiparts; output bytes were unaffected).
  • P2 (s4fs) a sidecar PUT failure after a successful body write now
    raises a typed S4SidecarWriteError and still invalidates the
    per-path caches (try/finally), so same-instance reads see the new
    object; previously stale caches could serve the old manifest against
    the new body.
  • P2 The README SemVer freeze table now states the Python binding
    contract as a guaranteed-minimum set plus CHANGELOG-recorded additive
    exports (it still claimed "exactly" the v1.0 names while v1.1/v1.2
    shipped additive helpers).
  • P3 Maintain transition re-sends Expires /
    WebsiteRedirectLocation; report notes now state precisely what a
    REPLACE-directive class copy changes (backend-SSE re-encryption under
    the bucket default, multipart→single-part checksum/ETag change with
    sidecar full-read fallback). Resident s4 maintain no longer risks
    sleeping out a full --interval when SIGTERM lands in the
    flag-check gap (notify_one permit).
  • P3 s4 dict-status: 10 s HTTP timeout (was unbounded), the
    Prometheus text parser no longer panics on multibyte escape
    sequences in third-party output, and the cumulative-counter semantics
    (post-rotation STALE persistence, removed-prefix series lingering)
    are documented. s4 savings against a missing state file now says so
    in a note instead of silently reporting zeros.
  • P3 Ledger internals: Prometheus gauges are stamped inside the
    write lock (no transient reordering), SIGUSR1 dumps route through the
    ledger's own flush() (no .tmp race with the event flush), and
    .s4dict/ propagation objects are explicitly excluded from
    accounting (internal keys are never ledger-counted; documented
    contract). (s4-codec-py) encode_s4_object passthrough CRC32C now
    releases the GIL.
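
The ETag-pinned transition copy described above follows a HEAD, then conditional copy pattern. The sketch below models it against a toy in-memory backend rather than a real S3 client: `ToyBackend`, `PreconditionFailed`, `transition`, the `before_copy` hook and the `transitioned` counter are hypothetical names, and the `if_match` argument stands in for the x-amz-copy-source-if-match header. Only the 412 refusal and the etag-raced count come from the notes.

```python
class PreconditionFailed(Exception):
    """Stands in for the backend's 412 response."""


class ToyBackend:
    """In-memory stand-in for an S3 backend (hypothetical, illustration only)."""

    def __init__(self):
        self.objects = {}  # key -> (etag, body, metadata)
        self._counter = 0

    def put(self, key, body, metadata=None):
        self._counter += 1
        etag = f"etag-{self._counter}"
        self.objects[key] = (etag, body, dict(metadata or {}))
        return etag

    def head(self, key):
        etag, _, metadata = self.objects[key]
        return etag, dict(metadata)

    def copy_in_place(self, key, new_metadata, if_match):
        etag, body, _ = self.objects[key]
        if etag != if_match:
            # The object changed since the HEAD: refuse instead of copying.
            raise PreconditionFailed(key)
        return self.put(key, body, new_metadata)


def transition(backend, key, storage_class, stats, before_copy=None):
    """HEAD the object, then copy it onto itself pinned to the ETag just seen."""
    etag, metadata = backend.head(key)
    metadata["storage-class"] = storage_class
    if before_copy is not None:
        before_copy()  # hook used here to simulate a concurrent writer
    try:
        backend.copy_in_place(key, metadata, if_match=etag)
        stats["transitioned"] = stats.get("transitioned", 0) + 1
    except PreconditionFailed:
        # The old manifest is never stamped onto the new bytes.
        stats["etag-raced"] = stats.get("etag-raced", 0) + 1
```

If a writer replaces the object between `head` and `copy_in_place`, the copy is refused and the new object keeps its own metadata; a later pass can retry against the fresh ETag.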

Added

  • s4fs write support (opt-in): S4FileSystem(write_enabled=True)
    (or storage_options={"write_enabled": True} through fsspec/pandas)
    enables pipe_file / put_file / open(path, "wb"), writing
    gateway-compatible S4 objects directly to the backend:
    df.to_parquet("s4://bucket/key") now works without the gateway.
    The encoder (s4_codec.encode_s4_object, new in the Python binding
    alongside bind_index + pick_chunk_size) reproduces the gateway's
    single-PUT path byte-compatibly — S4F2 cpu-zstd frames using the
    gateway chunk-size policy (1 MiB / 4 MiB / 16 MiB thresholds), the
    five manifest metadata keys (s4-codec / s4-original-size /
    s4-compressed-size / s4-crc32c / s4-framed), and, for
    multi-frame bodies, a <key>.s4index sidecar bound to the backend
    ETag + size after the body PUT (single-frame overwrites clean up a
    stale sidecar). Verified end-to-end against MinIO + the real
    gateway binary: gateway GET and Range GET return the original
    bytes, s4 verify-sidecar reports OK with the version binding
    intact, and pandas/pyarrow round-trip through both s4fs and the
    gateway. write_codec="passthrough" stores raw stamped bodies;
    everything else (cpu-gzip / cpu-zstd-dict / nvcomp-* / SSE /
    append / versioning) is refused with a typed error pointing at the
    gateway. Underlying filesystems that cannot stamp S3 user metadata
    are refused with S4MetadataUnsupportedError (an unstamped framed
    body would be served raw by the gateway); s3fs is supported out of
    the box. Default behaviour without the flag is unchanged
    (read-only).

  • Per-prefix dictionary metrics: the --zstd-dict PUT branch now
    exports s4_dict_put_total{prefix,outcome="win"|"loss"} and
    s4_dict_put_bytes_total{prefix,kind="original"|"dict"|"plain"}.
    Both compression results were already measured per PUT, so the byte
    counters are exact on wins and losses alike. Cardinality is bounded
    by the configured prefix count; with no dict configuration the
    series are never registered (default behaviour bit-for-bit
    unchanged). The gateway also self-monitors: a prefix whose rolling
    win rate over its last 100 dict-path PUTs drops below 0.5 logs a
    stale-dictionary WARN, at most once per prefix per hour.

  • s4 dict-status --metrics-url <URL> [--warn-win-rate 0.5] [--format table|json]: scrapes a running gateway's /metrics
    (built-in minimal Prometheus text parser, no new dependencies) and
    reports per-prefix dictionary win rate, effective compression ratio
    (dict bytes / original bytes) and lazy s4_dict_fetch_total error
    counts. Prefixes below the win-rate threshold get a "dictionary may
    be stale; consider retraining (s4 train-dict)" warning and the
    command exits 1 — cron-able drift monitoring.

  • --zstd-dict-map <FILE> + SIGHUP reload: TOML [mappings]
    table ("<bucket>/<prefix>" = "<dict-id>") as the reloadable twin
    of repeated --zstd-dict flags — identical validation, boot-time
    fetch + fingerprint verification and 1 MiB dictionary cap; a prefix
    configured in both places is a boot error. On SIGHUP the file is
    re-read, new dictionaries are fetched + verified (already-loaded
    ones are reused), and the store is swapped atomically (arc-swap RCU
    — in-flight requests finish on the generation they started with),
    so rotation is s4 train-dict → edit map → kill -HUP, no gateway
    restart. A failed reload keeps the current mappings live (ERROR log
    + s4_dict_reload_total{result="err"}; success bumps
    result="ok"). Without the flag, SIGHUP does not touch dictionary
    configuration. New library surface:
    S4Service::with_shared_zstd_dicts,
    dict::{SharedDictStore, DictWinTracker, parse_zstd_dict_map, merge_dict_entries, build_dict_status, parse_prom_sample}.

  • Docs: README "Operating dictionaries" section (dict-status /
    rotation runbook), plus an explicit note that multipart uploads are
    out of the dictionary path by design — parts never consult the
    dict store, and S3's 5 MiB minimum part size sits far above the
    small-object ceiling (default 1 MiB --zstd-dict-max-bytes).

  • E2E: tests/dict_ops_minio.rs (Docker-gated, real s4 binary)
    — win/loss counters on /metrics, dict-status exit codes 0/1
    with the retrain warning, map-file boot, SIGHUP rotation picking up
    a new prefix without a restart, and the fail-safe on a broken map
    (previous store keeps serving).

  • s4 maintain --policy <FILE> [--execute] [--interval <DUR>] [--format table|json]: policy-driven bucket maintenance. A TOML
    file of [[rule]] entries (unique name, bucket, optional
    prefix, common older-than age gate) runs sequentially top to
    bottom; `action = "...

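
Returning to the s4 dict-status item above: the per-prefix win rate and the cron-friendly exit code can be derived from the s4_dict_put_total counter alone. The following is a rough sketch of that calculation, not the command's built-in parser. The function names and regular expressions are hypothetical, label escapes are matched but not unescaped, and the exact label values a gateway emits are not shown in these notes.

```python
import re

# One Prometheus text-format sample: name, optional {labels}, value.
SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# label="value" pairs; backslash escapes are tolerated but left as-is.
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def win_rates(metrics_text):
    """Return {prefix: wins / (wins + losses)} from s4_dict_put_total samples."""
    counts = {}
    for line in metrics_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE.match(line)
        if match is None or match.group("name") != "s4_dict_put_total":
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        per_prefix = counts.setdefault(
            labels.get("prefix", ""), {"win": 0.0, "loss": 0.0}
        )
        outcome = labels.get("outcome")
        if outcome in per_prefix:
            per_prefix[outcome] += float(match.group("value"))
    rates = {}
    for prefix, c in counts.items():
        total = c["win"] + c["loss"]
        if total > 0:
            rates[prefix] = c["win"] / total
    return rates


def exit_code(rates, warn_win_rate=0.5):
    """1 if any prefix sits below the threshold, else 0."""
    return 1 if any(rate < warn_win_rate for rate in rates.values()) else 0
```

Because the counters are cumulative, a rate computed this way reflects the whole process lifetime, which matches the documented caveat that a stale reading can persist after a rotation.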

v1.1.0 — adoption tooling + small-object compression

11 Jun 03:42

v1.1 — adoption tooling + small-object compression. Six additive
features (s4 estimate / s4 migrate / zstd dictionaries +
s4 train-dict / s4fs fsspec adapter / s4 recompact / GPU batched
small-PUT compression) hardened by a 3-round dual-reviewer audit
(Claude ×3 + Codex; findings 20 → 7 → 5, P1/P2 zero at round 3). The
v1.0 freeze contract holds: every change below is additive and
default-off; flag-less PUT/GET behavior is bit-for-bit unchanged.

Fixed (audit round 2 — adversarial verification of the round-1 fix wave)

  • P2 CreateMultipartUpload now strips client-supplied s4-*
    metadata like put_object does — a forged x-amz-meta-s4-encrypted
    could otherwise survive onto a completed multipart object and 5xx a
    flag-less GET (multipart re-open of the round-1 PUT fix).
  • P2 migrate / recompact no longer hard-fail every object when
    GetObjectTagging is denied or unimplemented: such objects skip as
    tags-unreadable (data is never rewritten tag-less), NoSuchTagSet
    counts as "no tags", and a new --no-tags flag opts out of tag
    inheritance entirely. Transient tagging errors still fail hard.
  • P2 Version-pinned CopyObject (?versionId=) probes the pinned
    source version — not the latest — for both the REPLACE metadata merge
    and cross-bucket dictionary propagation.
  • P3 Dictionary size cap (1 MiB) is now one consistent contract:
    train-dict --max-dict-bytes and --zstd-dict boot preload reject
    what a flag-less gateway's lazy fetch would refuse.
  • P3 Boot-preloaded dictionaries are bucket-scoped, fetched per
    (bucket, id) with s4-dict-sha256 verification, and the server
    refuses to boot when one dict-id resolves to different bytes across
    buckets (16-hex prefix collision).
  • P3 s4 estimate excludes already-S4 objects (gateway metadata or
    S4F2/S4P1/S4E* magic) from sampling so re-estimating a
    gateway-operated bucket doesn't measure framed/encrypted bytes as if
    they were compressible plaintext (already_s4 count + note).
  • P3 (s4fs) the sidecar staleness check reuses a cached live-info
    snapshot instead of issuing a second backend HEAD per info().
    Trade-off disclosed: external overwrites during one filesystem
    instance's lifetime are detected on the next invalidate_cache() /
    new instance, not per-read (same contract as the metadata cache).
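
The boot-time dictionary checks in this round (1 MiB cap, s4-dict-sha256 verification, and refusing to boot when one dict-id resolves to different bytes across buckets) can be illustrated as follows. This is a sketch under stated assumptions: `preload`, `DictBootError` and the `fetch` callback are hypothetical, and how the real server derives or stores a dict-id is not described in these notes.

```python
import hashlib

MAX_DICT_BYTES = 1024 * 1024  # the 1 MiB cap named in the notes


class DictBootError(Exception):
    """Raised where the sketch's 'server' would refuse to boot."""


def preload(fetch, wanted):
    """Load dictionaries per (bucket, dict_id).

    fetch(bucket, dict_id) -> (body, stamped_sha256_hex_or_None) is a
    hypothetical callback standing in for the backend GET.
    """
    store = {}         # (bucket, dict_id) -> bytes, i.e. bucket-scoped
    digest_by_id = {}  # dict_id -> sha256 hex of the first copy seen
    for bucket, dict_id in wanted:
        body, stamped = fetch(bucket, dict_id)
        if len(body) > MAX_DICT_BYTES:
            raise DictBootError(f"{bucket}/{dict_id}: exceeds the 1 MiB cap")
        digest = hashlib.sha256(body).hexdigest()
        if stamped is not None and stamped != digest:
            raise DictBootError(f"{bucket}/{dict_id}: sha256 mismatch")
        first_seen = digest_by_id.setdefault(dict_id, digest)
        if first_seen != digest:
            # Same id, different bytes in another bucket: refuse to boot.
            raise DictBootError(f"{dict_id}: differs across buckets")
        store[(bucket, dict_id)] = body
    return store
```

Keying the store by (bucket, dict_id) keeps lookups bucket-scoped, while the separate per-id digest map is what catches an id collision between buckets.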

Fixed (audit round 3 — convergence check)

  • P3 s4 estimate's already-S4 body detection is structurally
    validated (known codec id + payload fits the object for S4F2,
    plausible padding length for S4P1) so customer data that merely
    starts with the 4-byte magic isn't silently dropped from sampling.
  • P3 README/CHANGELOG drift from the round-1/2 fixes corrected:
    dictionary 1 MiB cap is documented as one three-surface contract,
    migrate/recompact sample outputs show the full current skip taxonomy,
    --no-tags / tags-unreadable / already-s4 estimate exclusions
    documented.

Fixed (audit round 1 — 4 reviewers over v1.0.0..HEAD, 2026-06-11)

  • P1 s4 migrate could rewrite .s4dict/<id> dictionary objects as
    S4F2-framed data, breaking every cpu-zstd-dict object in the bucket
    (lazy fetch fails fingerprint verification). All three bulk tools
    (estimate / migrate / recompact) now exclude S4-internal keys:
    *.s4index, .s4dict/, and *.__s4ver__/* versioning shadows.
  • P1 A client-supplied x-amz-meta-s4-dict-id on a plain PUT made
    the subsequent GET fail 5xx even with --zstd-dict unset (default-off
    behavior regression). The GET dict branch is now gated on the
    gateway-managed manifest codec (cpu-zstd-dict), and put_object
    strips client-supplied s4-* metadata keys up front.
  • P1 (s4fs) SSE-encrypted objects could return AES-GCM ciphertext
    bytes silently (passthrough + SSE). s4fs now refuses with
    NotImplementedError via three layers: s4-encrypted metadata,
    sidecar SSE binding, and S4E1–S4E6 magic sniff.
  • P1 (s4fs) <key>.__s4ver__/<version> shadow objects were not
    hidden from ls/find/glob (prefix check instead of infix), so
    directory dataset scans could silently include stale versions.
  • P2 migrate / recompact rewrites dropped the source object's
    storage class (silent promotion to STANDARD) and object tags; both
    are now inherited. ACLs / Object Lock retention remain uninherited
    (stated in report notes).
  • P2 migrate treated a roundtrip-verify failure as a skip
    (exit 0); it is now a hard failure (exit 1), matching recompact.
    The skipped_verify_failed JSON field remains (always 0) for shape
    compatibility.
  • P2 Cross-bucket CopyObject of a dict-compressed object now
    propagates .s4dict/<id> to the destination bucket (idempotent,
    content-addressed); previously the copy succeeded but every GET on
    the destination failed 5xx.
  • P2 .s4dict/ joined the reserved-key guard: gateway PUT / DELETE
    are rejected with InvalidObjectName (reads still allowed) so a
    bucket-wide dictionary can't be destroyed through the data path.
  • P2 (s4fs) info() no longer trusts a stale sidecar for object
    size (staleness-checked first), and binding-less legacy v1 sidecars
    are no longer used for size or partial range reads.
  • P2 (s4fs) dependency floor corrected to s4-codec>=1.1.0,<2:
    the binding APIs s4fs imports don't exist in the 1.0.0 wheel.
  • P3 estimate no longer aborts the whole run when a sampled
    object 404s mid-run (skip + note); module/report now disclose the
    single-stream measurement bias vs the server's 4 MiB chunking.
  • P3 migrate / recompact enforce --max-body-bytes from the
    GET Content-Length before buffering; migrate now also cleans up a
    stale multi-frame sidecar when its rewrite comes out single-frame.
  • P3 recompact no longer auto-promotes backend-written framed
    objects that lack gateway metadata (unstamped-framed skip; opt back
    in with --assume-unstamped-framed).
  • P3 Dict hardening: DictCache is bucket-scoped, train-dict
    stamps s4-dict-sha256 (full-digest verification when present), and
    lazy fetch caps dictionaries at 1 MiB. (s4fs) open() on a framed
    object with inexact size raises instead of silently truncating
    (allow_inexact_open=True restores the old clamp).
  • P3 nvcomp_batched validates device-reported chunk sizes on the
    host before the unsafe copy (typed per-item error instead of a
    potential OOB read on driver misbehavior).
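
Several fixes in this round come down to recognising S4-internal keys consistently: *.s4index sidecars, .s4dict/ dictionary objects, and <key>.__s4ver__/<version> shadows. A minimal sketch of such a predicate is below. The function name is hypothetical, and treating .s4dict/ as a bucket-root prefix is an assumption; the notes only give the three patterns.

```python
def is_s4_internal_key(key: str) -> bool:
    """True for keys the bulk tools and listings should skip (illustrative)."""
    if key.endswith(".s4index"):
        return True  # sidecar index next to a framed object
    if key.startswith(".s4dict/"):
        return True  # dictionary objects (assumed to live at the bucket root)
    # Infix check: a shadow's marker sits in the middle of the key, so a
    # prefix test on the key would miss it.
    return ".__s4ver__/" in key
```

The infix test is the part the s4fs listing fix turned on: a version shadow such as `data/part-0.parquet.__s4ver__/<version>` never starts with the marker.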

Added

  • --gpu-batch-small-puts (opt-in, requires the nvcomp-gpu build +
    a CUDA-capable GPU at boot — the server refuses to start otherwise) —
    batch concurrent small PUTs into a single nvCOMP batched-zstd
    kernel launch so the GPU pays its fixed launch + PCIe cost once per
    batch instead of once per object. Eligibility: sampling dispatcher
    picked cpu-zstd, no --zstd-dict prefix match, declared
    Content-Length in [--gpu-batch-floor-bytes (default 4 KiB), --gpu-min-bytes (default 1 MiB)). Companion knobs:
    --gpu-batch-max-items (flush at N pending bodies, default 32) and
    --gpu-batch-window-ms (flush after T ms, default 4 — also the
    worst-case latency the batch path adds to a PUT). Wire format is
    unchanged: batched objects are byte-layout-identical standard
    nvcomp-zstd bodies (same FCG1 framing + CodecKind::NvcompZstd
    manifest as the per-object GPU path; no new codec id, no new
    metadata) and the GET path has zero batch awareness — proven by
    GPU-gated tests that decompress batch output through the unmodified
    per-object path, plus a MinIO e2e (tests/gpu_batch_e2e.rs).
    Fail-open semantics: queue full (backpressure), GPU error, or a
    batched result that is not smaller than the input all fall back to
    the pre-existing cpu-zstd framed path — observable via the new
    s4_gpu_batch_total{result="batched"|"fallback"} counter. Measured
    on 1000 × 8 KiB log-like objects (RTX 4070 Ti SUPER, nvCOMP
    5.2.0.10): batched GPU = 29.7 ms vs 702 ms per-object GPU (~24×) vs
    15.7–19.5 ms single-thread cpu-zstd-3; GPU output ~10% smaller
    (12.31× vs 11.14× ratio). Honest verdict in README §"GPU small-PUT
    batching": this offloads CPU and improves ratio — it does not beat a
    free CPU core on raw wall time at 8 KiB. New public surface:
    s4_codec::nvcomp_batched::NvcompZstdBatchEncoder (feature-gated),
    s4_server::gpu_batch (aggregator + GpuBatchHandle),
    S4Service::with_gpu_batch, and the gpu_small_batch bench. Flag
    off (default) = bit-for-bit unchanged PUT behaviour.
  • s4 recompact <bucket>[/prefix] --endpoint-url <BACKEND> [--execute]:
    rewrite cpu-zstd framed objects at a higher zstd level during a quiet
    window (LSM-compaction for S3). The gateway's PUT path favours latency
    (--zstd-level, default 3); recompact decodes each S4-framed cpu-zstd
    object in-process (same FrameIter walk as the GET path — doubles as
    an integrity check on the stored frames), re-frames the original bytes
    with the same streaming_compress_to_frames + pick_chunk_size pair
    the PUT path uses at --target-zstd-level (default 19), and overwrites
    only when the new frames shrink the stored bytes by
    --min-gain-percent (default 3%). Rewritten objects are stamped with
    new s4-zstd-level metadata (recompact-only stamp — the gateway
    neither reads nor writes it), making re-runs idempotent
    (already-compacted skip) with no checkpoint file.
    --older-than <DUR> (30d / 12h / 45m / 90s) restricts the run
    to cold objects by backend LastModified. Dry-run by default;
    mandatory decompress-roundtrip byte comparison before every write (no
    off switch) and a pre-PUT HEAD ETag re-check (narrows, does not close,
    the concurrent-writer race). Skip taxonomy: not-s4 (run s4 migrate
    first) / already-compacted / unsupported-codec (passthrough,
    cpu-gzip, nvcomp-*, cpu-zstd-dict — this tool is cpu-zstd →
    cpu-zstd only) / `unst...
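
Two of the recompact gates described above are simple enough to sketch: the --older-than age filter (30d / 12h / 45m / 90s) and the --min-gain-percent overwrite threshold (default 3%). The helper names below are hypothetical, and treating both comparisons as inclusive (">=") is an assumption; the notes do not say how the boundary cases are handled.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text):
    """Parse '<int><d|h|m|s>' (e.g. '30d', '90s') into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhms])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def is_cold(last_modified, older_than, now=None):
    """True when the object's LastModified is at least `older_than` ago."""
    now = now or datetime.now(timezone.utc)
    return now - last_modified >= older_than


def should_overwrite(stored_bytes, new_bytes, min_gain_percent=3.0):
    """True when re-framing shrinks the stored bytes by the required margin."""
    if stored_bytes <= 0:
        return False
    gain = (stored_bytes - new_bytes) * 100.0 / stored_bytes
    return gain >= min_gain_percent
```

With the default threshold, a 1000-byte object re-framed to 970 bytes would be overwritten in this model and one re-framed to 980 bytes would be left alone.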

v1.0.0 — SemVer-stable surface freeze

08 Jun 17:32

[1.0.0] — 2026-06-09

v1.0 — SemVer-stable surface freeze. From v1.0 onward the items
enumerated in README.md §"Stability — v1.0 guarantees"
are frozen for the v1.x line; any incompatible change to them ships
in a v2.0.0 release with migration recipes under docs/migration/.
v1.0 is not a marketing claim that "S4 has been battle-tested at
every Fortune 500." It is a contract that downstream consumers can
pin s4-server = "1" (or s4-codec = "1", or s4-config = "1", or
ghcr.io/abyo-software/s4:1) and rely on the surface listed in
README.md. The first public production deployment reference is still
being collected; file an issue tagged production-reference if
you are running S4 at TB scale.

Surface freeze — what's in the v1.0 contract

See README.md for the table. Briefly:

  • Wire formats: S4F2 framed body, S4P1 padding, S4IX v1/v2/v3
    sidecars, S4E1/S4E2/S4E3/S4E4/S4E5/S4E6 SSE envelopes
  • s4 binary subcommands (verify-sidecar, repair-sidecar,
    sweep-orphan-sidecars, verify-audit-log, plus the server's
    documented --<flag> set)
  • s4_server::repair::* public API (verify/repair/sweep + all
    related error / report / policy types)
  • s4_server::service::S4Service shape — new(backend, registry, dispatcher) constructor + every pub fn with_* builder signature
    (23 of them — exact list in README); + the SharedService newtype
    at s4_server::service_arc::SharedService; + SigV4aGate /
    SigV4aGateError / resolve_range / DEFAULT_MAX_BODY_BYTES /
    DEFAULT_REPLICATION_MAX_CONCURRENT
  • s4_server::sse public surface (frozen types, functions, constants)
  • s4_server::streaming public surface (frozen constants + functions)
  • s4-codec codec trait + format constants (Codec trait shape;
    CodecKind / CodecError / IndexError / FrameError / GpuSelectError /
    CompareOp enums all #[non_exhaustive]; index module's pub structs
    + functions + constants; multipart::FrameHeader layout)
  • s4-config: CompressionMode enum (#[non_exhaustive]) +
    BackendConfig / S4Config struct field sets
  • HTTP API surface: s3s 0.13 trait set (S3 wire compatibility)
  • Container image tags + Helm chart values.yaml key set (full
    enumeration of 28 top-level keys in README)

Added

  • Stability section in README.md (§"Stability — v1.0 guarantees")
    enumerating the v1.0 freeze surface with explicit scope rules.
  • docs/security/cargo-audit-ignores.md — per-advisory rationale +
    mitigation + upstream-tracking for the 4 accepted RUSTSEC ignores
    (2026-0098 / 2026-0099 / 2026-0104 / 2025-0134), with verification
    commands to re-check each fact.
  • README "Backend compatibility matrix" sub-section inside §Stability
    documenting CI-verified state honestly: ✓ gating for MinIO; ⚠ opt-in
    for AWS/B2/R2/Wasabi (gate only when operator-configured secrets
    are set); ⚠ claimed-but-not-CI-verified for Garage + Ceph RGW with
    the specific drift symptoms documented.
  • README "Modules NOT in the freeze list" sub-section enumerating
    the 25 s4_server::* modules that exist as pub mod for binary
    + tests needs but are NOT part of the v1.0 contract.
  • README "How to read the freeze table — scope of 'frozen'"
    sub-section: items named in the table ARE the v1.0 contract; other
    pub items in those modules are NOT; pin =1.x.y if depending on
    unlisted items.
  • README "v0.x → v1.0 source compatibility note" sub-section listing
    all 34 enums annotated #[non_exhaustive] (6 s4-codec + 27
    s4-server + 1 s4-config) + the mechanical consumer-side fix
    (add _ => arm) for exhaustive matches.

Changed

  • 34 public enums on the frozen surface gained #[non_exhaustive]
    for forward-compat additive variants. Source-level breaking change
    for downstream code with exhaustive match arms; fix is
    mechanical (add _ =>). See README §"v0.x → v1.0 source
    compatibility note" for the full enum list and rationale.
  • pub fn encode_index_v1_for_test (and other _for_test helpers)
    gated out of the v1.0 public API via #[cfg(test)] pub(crate)
    visibility + #[doc(hidden)].
  • crates/s4-codec-py/pyproject.toml PyPI trove classifier bumped
    from Development Status :: 3 - Alpha → 5 - Production/Stable
    to match the v1.0 frozen-API contract.
  • SECURITY.md Supported Versions section rewritten from "pre-1.0,
    latest commit on main" → "v1.x rolling window of latest minor +
    previous minor; patch releases on the affected minor's release
    branch".
  • Backend compat matrix table in compat-matrix.yml now reflects
    the round-trip-vs-provisioning gate distinction; Garage and Ceph
    round-trips are continue-on-error with explicit warning steps
    documenting the wire-shape drift symptoms.
  • README disclaimers updated from alpha / early-access / pre-1.0
    framing to the v1.0 "surface freeze ≠ production track record"
    narrative.
  • Helm chart values.yaml key set is now frozen at v1.0; key shape
    changes are v2.0 territory. Chart's own version stays in 0.2.x
    (Helm-side SemVer, independent of appVersion); appVersion bumps
    to 1.0.0.
  • crates/s4-codec-py/README.md + Cargo.toml + pyproject.toml
    metadata updated from "GPU/CPU compression" to "CPU compression"
    to match what the Python module actually exports in v1.0
    (CpuZstd + CpuGzip only; GPU codec classes are intentionally
    NOT exposed in v1.0).
  • crates/s4-codec-wasm/README.md status header updated from
    "v0.4 #24 — initial cut" to "v1.0 — frozen public API".
  • .github/workflows/ci.yml security-audit job comment corrected:
    rustls-pemfile is a runtime dep (used by the production HTTPS
    listener in tls.rs), not "dev-only" as the prior comment claimed.

Fixed

  • compat-matrix.yml Garage start step: replaced over-broad
    awk '/HEALTHY|UNHEALTHY|NO ROLE/' that matched the
    ==== HEALTHY NODES ==== table header line in
    dxflrs/garage:v1.1.0 output (producing NODE_ID="====" and a
    hard-fail at layout assign). Now uses garage node id -q
    directly, which returns <hex>@<addr>.
  • compat-matrix.yml Ceph RGW + Garage round-trip steps: marked
    continue-on-error because quay.io/ceph/demo:latest-quincy is
    unmaintained upstream (XAmzContentSHA256Mismatch) and
    dxflrs/garage:v1.1.0 rejects current aws-sdk-rust's
    STREAMING-AWS4-HMAC-SHA256-PAYLOAD (Invalid payload signature).
    Provisioning steps still gate for both.

Roadmap candidates (v1.x, additive only)

  • Chunked SSE-KMS envelope (provisional S4E7) + chunked SSE-C
    (provisional S4E8) for Range GET partial-fetch fast-path.
  • S4F3 streaming frame format enabling streaming PUT checksum
    verify for multipart upload_part.
  • 32-bit runtime smoke promoted from advisory to required CI gate.
  • Per-action SHA pinning on GHA workflows.
  • Cross-region replication promoted from experimental scaffolding
    to production-grade with Jepsen-style consistency tests.
  • Re-introducing Garage + Ceph as ✓ gating once upstream signature
    / image issues resolve.
  • GPU codec exposure in the Python module.
  • Streaming decoder API in the WASM module.
  • npm publish automation for the WASM package.
  • Japanese README (README.ja.md) brought current to v1.0.

Audit history

7 rounds of dual-reviewer (Opus + Codex) adversarial audit drove
~30 individual findings to closure across this cycle:

  • R0 (pre-session, on v1.0 draft README): Opus + Codex, 13 findings
    spanning enum non_exhaustive coverage, README freeze accuracy,
    s3s 0.13 policy, cargo-audit ignores doc, compat-matrix evidence,
    cross-major back-compat caveats.
  • R1: Cluster A (F1 + F2 + F3 sub-agent parallel fixes) + Cluster B
    (main-session README + audit-ignores doc rewrite) + Cluster C
    (compat-matrix manual triggers + Garage / Ceph best-effort wrap).
  • R2: NF-1 — SharedService path correction (s4_server::service
    → s4_server::service_arc).
  • R3 (dual reviewer): 11 new findings → fix wave including
    S4Service::default fabrication removal, cloud-backend opt-in
    honest qualifiers, S4Service builder-param contradiction caveat,
    FrameIndex inner-type freeze, v0.x→v1.0 source-break caveat.
  • R4: 4 P2 + 1 P3 — Python class name correction, enum list
    completeness, SECURITY.md update, FrameIndex own-field freeze.
  • R5 (dual): scope-explicit freeze sub-section + Python exception
    enumeration + binding README updates.
  • R6 (dual, split verdict): Codex P1/P2/P3 closures — Python pkg
    GPU marketing removal, PyPI classifier bump, CompressionMode
    non_exhaustive.
  • R7 (dual): s4 = "1" → s4-server / s4-codec / s4-config = "1",
    freeze-scope enum-list wording correction, Python README GPU
    build-recipe v1.0 caveat, EOF whitespace, SOCIAL_POSTS.md
    historical-artifact banner.

Cut-commit changes

  • Cargo.toml: workspace.version 0.11.0 → 1.0.0
  • crates/s4-server/Cargo.toml: internal-dep pins
    s4-codec, s4-config "0.11" → "1"
  • crates/s4-codec-wasm/Cargo.toml: internal-dep pin
    s4-codec "0.11" → "1"
  • crates/s4-codec-py/Cargo.toml: internal-dep pin
    s4-codec-rs "0.11" → "1" (already landed in round-7 wave;
    noted here for completeness)
  • charts/s4/Chart.yaml: appVersion 0.11.0 → 1.0.0;
    chart's own version 0.2.2 → 0.2.3 (appVersion bump only,
    no chart-shape change)

v0.11.0 — polish + maintenance (32-bit + Node 24 + compat matrix, 6-round audit clean)

08 Jun 03:06

Third v0.1x-line cut. Polish + maintenance theme — no production code changes, all 9 GHA workflows + docs + composite actions only. Three-theme wave-1 delivery converged by a 6-round integrated audit (4 P2 + 1 P1 real fixes, 2 false-positive rounds caused by Codex review sandbox network limits — documented inline).

Net diff vs v0.10.0: ~12 files / ~1,400 lines across .github/, docs, charts. Published to crates.io as s4-server@0.11.0 + 3 sibling crates. Container images on ghcr.io: ghcr.io/abyo-software/s4:0.11.0 (multi-arch CPU) + :0.11.0-gpu (nvCOMP amd64) — built automatically by the v0.11.0 tag push.

Wave-1 themes

  • #A4 — 32-bit s4-server runtime end-to-end PUT/GET smoke (ci.yml i686-runtime-smoke job). The v0.10 #A4 --help/--version smoke is now a full MinIO-backed PUT/GET round-trip exercising the i686 hyper/rustls listener, aws-sdk-rust SigV4 signer, and CPU-zstd codec paths. The PUT/GET step lands as advisory (continue-on-error: true) so a first-time 32-bit runtime bug surfaces in the job log + uploaded server artifact without flipping CI red; promote to required after a stretch of green main pushes. README §"Supported targets" 32-bit row: ⚠️ → ✅.

  • #A5 — GitHub Actions Node.js 24 migration. 11 JavaScript actions bumped to their Node 24-ready majors (closing the 2026-09-16 deprecation gate GHA logs have been warning about):

    | Action | v0.10 | v0.11 |
    |---|---|---|
    | actions/checkout | v4 | v5 |
    | actions/upload-artifact | v4 | v6 |
    | actions/download-artifact | v4 | v7 |
    | actions/github-script | v7 | v8 |
    | codecov/codecov-action | v4 | v5 |
    | docker/build-push-action | v5 | v7 |
    | docker/login-action | v3 | v4 |
    | docker/setup-buildx-action | v3 | v4 |
    | docker/metadata-action | v5 | v6 |
    | aws-actions/configure-aws-credentials | v4 | v6 |
    | azure/setup-helm | v4 | v5 |

    Unchanged (already Node 24 at floating tag): Swatinem/rust-cache@v2, benchmark-action/github-action-benchmark@v1, dtolnay/rust-toolchain@stable|@nightly (composite). actionlint clean across all 9 workflows.

  • #A7 — Backend compatibility matrix CI (compat-matrix.yml, weekly schedule + workflow_dispatch). Exercises a PUT/GET + sidecar HEAD round-trip per S3-compatible backend S4 claims support for:

    • Docker tier (no secrets): MinIO + Garage + Ceph RGW (best-effort, upstream demo image unmaintained)
    • Real-cloud tier (operator-provided vars + secrets, silent skip when absent): Backblaze B2 + Cloudflare R2 + Wasabi

    Composite local action .github/actions/compat-roundtrip/action.yml factors the per-backend step. README §"How it Compares" gains a 7-row compat matrix (✅ verified / ⚠️ best-effort / 🔧 configurable in operator CI).

Audit closeout (v0.10.0..v0.11.0)

| Round | Severity | Fix |
|---|---|---|
| R1 | P2 | 3fceddd — restore SLSA + SBOM on per-arch builds (imagetools create can't retroactively patch) |
| R2 | P2 | c29d69f — restore OCI image labels on per-arch builds + scope compat-matrix TEST_KEY to ${{ github.run_id }} |
| R3 | P2 | 08545ba — propagate test-key to composite action + flavor-independent merge (CPU arm64 fail no longer skips GPU publish) |
| R4 | P1 | 157d7e7 — expected-digest-count guard: refuse partial multi-arch publish (CPU arm64 fail must not overwrite :<version> as amd64-only) |
| R5 / R6 | false-positive | eebc7e2 — action-version policy comment documents the Codex sandbox network limitation that hallucinated "action versions unpublished" twice |

Two false-positive rounds count as effective 2-round clean — every flagged action major (actions/checkout@v5, upload-artifact@v6, download-artifact@v7, github-script@v8, etc.) was verified via gh api /repos/<owner>/<repo>/releases/latest AND every CI run since wave-1 ship (commit 3332f3e) resolves them cleanly.
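
The R4 guard (157d7e7) can be pictured as a count check that runs before the merged manifest is pushed. The following is a minimal Python sketch of that idea only; the function name, the platform list, and the digest strings are illustrative assumptions and are not taken from docker.yml.

```python
# Hypothetical sketch of an expected-digest-count guard; not the real workflow step.
EXPECTED_PLATFORMS = ("linux/amd64", "linux/arm64")  # assumed CPU-flavor platforms


def check_digests(digests, expected=EXPECTED_PLATFORMS):
    """Return the digests to merge, or raise if any per-arch build is missing.

    `digests` maps a platform string to the image digest its build produced.
    """
    missing = [p for p in expected if not digests.get(p)]
    if missing or len(digests) != len(expected):
        raise RuntimeError(
            f"refusing partial publish: expected {len(expected)} digests, "
            f"got {len(digests)} (missing: {missing})"
        )
    # Only a complete set is handed on to the manifest-merge step.
    return [digests[p] for p in expected]
```

Failing closed here means a broken arm64 build leaves the existing tag untouched instead of replacing it with an amd64-only manifest.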

Cleanup recipe for already-shipped v0.9.0 / v0.10.0 images

The imagetools create shape introduced in v0.10.0 lost OCI labels + SLSA + SBOM. To re-attach them to the existing tags:

gh workflow run docker.yml --ref main \
  -f build_ref=v0.10.0 \
  -f image_tag_override=0.10.0 \
  -f push=true

gh workflow run docker.yml --ref main \
  -f build_ref=v0.9.0 \
  -f image_tag_override=0.9.0 \
  -f push=true

Each per-arch rebuild attaches the labels + attestations now that the build step has them; the merged manifest under each tag overwrites the prior labels-less manifest.

Coverage

  • Workspace tests unchanged (~720 pass, 0 fail) — production code untouched.
  • CI workflows: 1 new (compat-matrix.yml) + 9 modified (Node 24 bumps + i686 PUT/GET).
  • v0.11.0 compat-matrix first weekly fire: Sunday 06:00 UTC.

v0.12+ candidates (deferred)

  • Chunked SSE-KMS envelope (provisional S4E7) + chunked SSE-C (S4E8) → Range GET partial-fetch for those modes.
  • S4F3 streaming frame format → streaming PUT checksum verify for multipart upload_part.
  • 32-bit s4-server runtime end-to-end smoke promoted from advisory to required (after green-main stretch observed).
  • Per-action SHA pinning instead of floating major tags (security hardening).

Full changelog

See CHANGELOG.md for the per-finding detail.

🤖 Generated with Claude Code

v0.10.0 — encryption-aware completion + Docker distribution + hardening (4-round audit clean)

07 Jun 14:51

Second v0.10-line cut (= first v0.10). Two-wave delivery of the encryption-aware sidecar completion + Docker image distribution + hardening theme, converged by a 4-round integrated audit (2 P2 fixes, clean R3 + R4).

Net diff vs v0.9.0: ~12 files / ~1,800 lines across s4-server, the Helm chart, the distribution workflows, and the docs.

Published to crates.io as s4-server@0.10.0, s4-codec@0.10.0, s4-config@0.10.0, s4-codec-py@0.10.0. Container images on ghcr.io: ghcr.io/abyo-software/s4:0.10.0 (multi-arch CPU) + ghcr.io/abyo-software/s4:0.10.0-gpu (nvCOMP, amd64) — built automatically by .github/workflows/docker.yml on this tag push. Install via cargo install s4-server or helm install s4 ./charts/s4 --set image.tag=0.10.0 --set backend.endpointUrl=https://....

Wave-1 — encryption-aware completion + Docker distribution

  • s4 repair-sidecar --sse-s4-key <PATH> (--sse-s4-key-rotated id=N,key=PATH) plumbing closes the v0.9 EncryptedSidecarUnsupported reject path. The CLI now decrypts SSE-S4 chunked (S4E6) bodies in-process via the keyring, frame-scans the recovered plaintext, and stamps a v3 sidecar so subsequent Range GETs hit the encryption-aware partial-fetch fast-path. New lib entry s4_server::repair::repair_sidecar_with_keyring; RepairReport::sse_v3_binding exposes the rebuilt SSE binding. RepairError::SseDecryptFailed for keyring mismatches. Hardened against attacker-controlled S4E6 header inflation via SSE_S4_REPAIR_MAX_OVERHEAD_BYTES + SSE_S4_REPAIR_MAX_CHUNK_SLACK_BYTES caps.

  • Official container images on GitHub Container Registry. New .github/workflows/docker.yml builds + pushes ghcr.io/abyo-software/s4:<version> (CPU multi-arch linux/amd64 + linux/arm64) and ghcr.io/abyo-software/s4:<version>-gpu (nvCOMP GPU, amd64 only — nvCOMP redist x86_64-only) on every v*.*.* tag push. SLSA build provenance (mode=max) + SPDX SBOM via Buildx. GHA-backed layer cache scoped per flavor. Mutable tags (latest, <major>.<minor>) gated on stable tag-push events only so prereleases (-rc1) and back-fill workflow_dispatch runs can't move them backward. workflow_dispatch supports build_ref + image_tag_override for back-filling images for tags that pre-date the workflow. Helm chart default image.repository flipped to ghcr (chart version 0.1.0 → 0.2.1, appVersion → 0.10.0); docker-compose.{,gpu}.yml add image: alongside build:.

  • SSE partial-fetch AEAD constraint documentation — new docs/security/sse-partial-fetch-constraint.md walks the AEAD authenticated-encryption contract (NIST SP 800-38D §7.2 quoted), per-mode wire layout, why only S4E6 escapes the constraint (per-chunk nonce + tag), and provisional S4E7 (chunked-KMS) / S4E8 (chunked-SSE-C) roadmap candidates. README §"Server-side encryption — Range GET fast-path matrix" makes the support matrix explicit.
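
The mutable-tag gating described above (latest and <major>.<minor> move only on stable tag pushes) could be sketched as follows. This is an assumption-laden illustration: the function name, the event strings, and the way the GPU suffix is appended to mutable tags are hypothetical and may differ from what docker.yml actually does.

```python
import re

# Assumed shape of a stable release tag: vMAJOR.MINOR.PATCH with no prerelease suffix.
STABLE_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def tags_to_publish(event, ref_name, suffix=""):
    """Return the image tags a build may push for a given trigger (hypothetical)."""
    version = ref_name[1:] if ref_name.startswith("v") else ref_name
    tags = [version + suffix]  # the immutable version tag is always published
    match = STABLE_TAG.match(ref_name)
    # Mutable tags move only on a stable tag push, never on a prerelease
    # or a back-fill workflow_dispatch run.
    if event == "push" and match:
        major, minor = match.group(1), match.group(2)
        tags.append(f"{major}.{minor}{suffix}")
        tags.append("latest" + suffix)
    return tags
```

Under these assumptions a v0.10.0 tag push yields the version tag plus the two mutable tags, while v0.10.0-rc1 or a dispatched back-fill of v0.9.0 yields only its own version tag.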

Wave-2 — hardening

  • i686 runtime smoke CI — new i686-runtime-smoke job in .github/workflows/ci.yml installs gcc-multilib + libc6:i386, runs cargo test --target i686-unknown-linux-gnu -p s4-codec -p s4-config --release, builds the s4 binary for i686 (continue-on-error for the aws-sdk-rust / rustls / ring stack), and invokes s4 --help / s4 --version on the i686 ELF. README §"Supported targets" cell flips from "⚠️ compiles, untested at runtime" to "✅ compiles + --help / --version smoke (CI)".

  • Docker / Helm distribution smoke CI — new .github/workflows/docker-smoke.yml validates the v0.10 #B1 distribution surface on every push that touches it (path-filtered to charts/**, Dockerfile*, docker-compose*.yml, plus the docker / docker-smoke workflow files). Three independent jobs: helm-lint-template (helm lint + three helm template invocations: default, pinned tag, GPU suffix), docker-compose-config (both compose files + assert ghcr image refs present), image-smoke (docker pull ghcr.io/abyo-software/s4:latest + --help / --version, continue-on-error: true on pull for the not-yet-published case).

  • Streaming PUT checksum coverage matrix doc — new docs/security/streaming-checksum-coverage.md documents the codec-API constraint that limits the v0.9 #streaming-checksum tee-into-hasher fast-path to single-PUT cpu-zstd / nvcomp-zstd (Codec::supports_streaming_compress() == true). Same "fundamental contract, not deferred plumbing" framing as the SSE-side #A2-doc. Three preconditions for streaming win (streaming codec + streaming downstream + no full-body framing dependency) + which paths meet how many + roadmap candidates (S4F3 streaming frame, streaming nvCOMP, multipart streaming upload_part) with the upstream API blockers for each.
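
For readers unfamiliar with the tee-into-hasher pattern named above, here is a minimal Python sketch of the general technique. It is not S4's Rust implementation; the function name, the sink callback, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def tee_into_hasher(chunks, sink, expected_sha256_hex):
    """Forward each chunk to `sink` while hashing it, then verify the digest.

    `chunks` is any iterable of bytes; `sink` is a callable that consumes
    each chunk (for example a streaming compressor's write method).
    """
    hasher = hashlib.sha256()
    total = 0
    for chunk in chunks:
        hasher.update(chunk)  # hash the bytes as they stream past
        sink(chunk)           # hand the same bytes downstream, no full-body buffer
        total += len(chunk)
    if hasher.hexdigest() != expected_sha256_hex:
        # The mismatch is only known after the last byte, so the caller
        # must still be able to abort the downstream write at this point.
        raise ValueError("checksum mismatch")
    return total
```

The pattern only pays off when the downstream consumer can itself work chunk by chunk, which matches the "streaming codec + streaming downstream" preconditions listed in the bullet.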

Audit posture

  • 6 per-feature audits (15 Codex CLI rounds total): A1 = 5R, B1 = 4R, B2 = 1R, A2-doc = 1R, A3-doc = 0R, A4 = 0R.
  • 4-round integrated cross-feature audit on the full v0.9.0..main range. 2 P2 fixes (Dockerfile s4 s4 --help arg dup in the docker-smoke workflow; docker.yml back-fill :main + :sha-<x> mis-tag from dispatcher ref). Clean R3 + R4 — 2 consecutive convergence rounds.
  • Zero P1 across all rounds. Both P2 integrated-audit findings caught BEFORE the corresponding image actually shipped (back-fill v0.9.0 image build was in-flight at the time R2 caught the mis-tag; v0.10.0 ships with the fix in place).
  • cargo audit clean (same 4 documented ignores as v0.9.0 / v0.8.22: RUSTSEC-2026-0098/0099/0104 in the upstream aws-sdk-rust TLS stack, RUSTSEC-2025-0134 unmaintained dev-only rustls-pemfile).

Coverage

  • ~720 workspace tests pass, 0 failed (unchanged from v0.9.0).
  • v0.9.0 baseline plus: 4 new A1 unit tests in s4_server::repair, 3 new A1 MinIO E2E tests in sidecar_repair_via_minio.rs. Lib unit count in repair module now ~21.
  • New CI workflows: docker-smoke.yml (3 jobs), i686-runtime-smoke (added to ci.yml).

v0.11+ follow-up (deferred, scope-out)

  • Chunked SSE-KMS envelope (provisional S4E7) + chunked SSE-C (S4E8) → would enable Range GET partial-fetch for those modes.
  • S4F3 streaming frame format → would enable streaming PUT checksum verify for multipart upload_part.
  • Streaming nvcomp-bitcomp / nvcomp-gdeflate (= GPU codec API rework upstream of S4).
  • 32-bit s4-server runtime: end-to-end PUT/GET smoke (today's smoke is --help / --version only).
  • v0.9.0 ghcr.io back-fill: workflow_dispatch in flight at cut time will publish ghcr.io/abyo-software/s4:0.9.0 + :0.9.0-gpu. The 2 P2 fixes in the v0.10 integrated audit mean future back-fills don't mis-tag :main / :sha-<x> — but the v0.9.0 back-fill ran on the pre-fix workflow and may have published those mis-tags. Operator cleanup recipe: trigger gh workflow run docker.yml --ref main -f push=true (no inputs) once the back-fill finishes to refresh :main / :main-gpu to current main HEAD content, overwriting the mis-tagged v0.9.0 entries.

Full changelog

See CHANGELOG.md for the per-finding detail.

v0.9.0 — six-feature roadmap landing + 7-round integrated audit (clean)

07 Jun 11:31

First v0.9 minor cut. Six roadmap items shipped in this release line, followed by a 7-round integrated cross-feature audit that converged on round 7 (clean bill of health). Net diff vs v0.8.22: 26 files / +8,500 lines across s4-codec and s4-server, all behind opt-in flags or new subcommands — no behavioral change on existing CLI surface or default-config deployments.

Published to crates.io as s4-server@0.9.0, s4-codec@0.9.0, s4-config@0.9.0, s4-codec-py@0.9.0. Install via cargo install s4-server.

Headline additions

  • Operator tooling — s4 verify-sidecar / s4 repair-sidecar / s4 sweep-orphan-sidecars subcommands close the gap that v0.8.x docs/orphan-sidecar-recovery.md left as a manual aws-cli recipe. Library API s4_server::repair::{verify_sidecar, repair_sidecar, sweep_orphan_sidecars} available for programmatic use. DeletePolicy::{DryRun, PairBoundOnly, IncludeUndecodable} tiers protect legacy reserved-name user data (the v0.8.17 --allow-legacy-reserved-key-reads migration scenario) from accidental sweep delete.

  • Performance regression gate — criterion-based bench targets (~30 bench points across codec_roundtrip / frame_codec / index_codec) + GitHub-Pages-backed trend chart via benchmark-action/github-action-benchmark. Bench workflow auto-bootstraps the gh-pages branch on first push.

  • Encryption-aware sidecar (SSE-S4 chunked / S4E6) — Range GET on --sse-chunk-size > 0 objects now hits a partial-fetch fast path via the new v3 sidecar format (extends v2 with a 30-byte SSE binding block: chunk_size + chunk_count + key_id + salt + plaintext_len + header_bytes). SSE-KMS / SSE-C / SSE-S4 buffered (S4E2) / multipart remain on the v0.8.12 #120 buffered fallback (deferred to v0.10+).

  • True streaming PUT checksum verify (tee-into-hasher) for cpu-zstd / nvcomp-zstd single-PUT — closes the v0.8.13 #127 regression that v0.8.14 #129 reverted to a buffered fallback. Honors Content-MD5 + x-amz-checksum-{crc32, crc32c, sha1, sha256, crc64nvme} headers AND SigV4-streaming x-amz-trailer claims. Multipart upload_part keeps the buffered per-part verify (bytes are already in memory there for framing).

  • Chaos infrastructure — 5 deterministic backend-fault scenarios (mid-stream GET error, HEAD latency timeout, concurrent overwrite, SSE keyring rotation mid-PUT, multipart Complete failure) replace the v0.8.18 P7 scaffold. In-memory mock backend; no Docker dep, no flake.

  • 32-bit cross-compile (i686-unknown-linux-gnu) across every workspace crate. Runtime is NOT claimed — cargo check --target parity only. Closes the v0.8.21 R5-8 regression where the 5 GiB usize const overflowed on 32-bit.
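
The partial-fetch idea behind the chunked SSE-S4 fast path can be illustrated with simple index arithmetic: when each chunk decrypts independently, a Range GET only needs the chunks that overlap the requested plaintext range. The sketch below assumes fixed-size plaintext chunks and ignores the header and per-chunk overhead that the real v3 sidecar accounts for; the function and parameter names are hypothetical.

```python
def chunks_for_range(start, end, chunk_size, plaintext_len):
    """Return (first_chunk, last_chunk) covering plaintext bytes [start, end].

    Both bounds are inclusive, matching HTTP Range semantics.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not (0 <= start <= end < plaintext_len):
        raise ValueError("range outside object")
    # Integer division maps a byte offset to the index of the chunk holding it.
    return start // chunk_size, end // chunk_size
```

With 64-byte chunks, bytes 0-99 touch chunks 0 and 1, while bytes 128-191 sit entirely inside chunk 2, so only that one chunk would need to be fetched and decrypted.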

Audit posture

  • 6 per-feature audits (11 Codex CLI rounds total) on the roadmap commits.
  • 7-round integrated cross-feature audit on the full v0.9 range (142e50e..main). Catches gaps per-feature audits couldn't see: encrypted-body handling in sidecar tooling, trailer-verify dispatch consistency, OOM hardening, HEAD→GET TOCTOU on the bounded sidecar fetch.
  • Zero P1 findings across all 18 rounds. 7 P2 + 1 self-review fix in the integrated audit, all landed.
  • cargo audit clean (same 4 documented ignores as v0.8.22: RUSTSEC-2026-0098/0099/0104 in the upstream aws-sdk-rust TLS stack, RUSTSEC-2025-0134 unmaintained dev-only rustls-pemfile).

Coverage

  • ~720 workspace tests pass, 0 failed.
  • 17 new lib unit tests in s4_server::repair (parsing, ETag normalization, Option<&str> equality semantics, DeletePolicy::allows truth table, status truth table including MissingHarmless / MissingDivergent / MissingUnknown, body-cap constant pinning, NotFramed / SidecarTooLarge / EncryptedSidecarUnsupported / OverwrittenDuringRepair wire shapes).
  • 14 new MinIO E2E tests covering verify-clean, repair-after-delete, repair-after-clobber, sweep-finds-and-deletes-orphan, sweep-pair-bound-only-preserves-undecodable, post-PUT race detector (best-effort), MissingHarmless on small single-PUT, encrypted-body reject, P2-R3 NotFramed reject (empty + raw body), P2-R4 verify-side MissingHarmless, P2-R5 oversized-sidecar sweep classification, plus 4 server-side encryption-aware sidecar tests (chunked range-GET uses v3 partial-fetch, round-trip correctness, buffered fallback unchanged, non-SSE PUT still emits v2).
  • 6 deterministic chaos scenarios + scaffold smoke.
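
One plausible reading of the DeletePolicy tiers is sketched below in Python. The release notes name the three tiers and state that PairBoundOnly preserves undecodable sidecars, but the classification labels and the exact shape of the allows check are assumptions here, not the s4_server::repair API.

```python
from enum import Enum


class DeletePolicy(Enum):
    DRY_RUN = 1
    PAIR_BOUND_ONLY = 2
    INCLUDE_UNDECODABLE = 3


def allows(policy, sidecar_kind):
    """Decide whether a sweep may delete an orphan candidate.

    `sidecar_kind` is "pair_bound" or "undecodable" (hypothetical labels).
    """
    if policy is DeletePolicy.DRY_RUN:
        return False  # report only, never delete
    if sidecar_kind == "pair_bound":
        return True   # decodable sidecar bound to a missing primary object
    # Undecodable objects may be legacy user data that merely share the
    # reserved suffix, so only the most permissive tier removes them.
    return policy is DeletePolicy.INCLUDE_UNDECODABLE and sidecar_kind == "undecodable"
```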

v0.10 follow-up (deferred, scope-out)

  • Encrypted sidecar repair via CLI keyring plumbing (--sse-s4-key <path>).
  • Encryption-aware sidecar for SSE-KMS / SSE-C / S4E2 / multipart.
  • Streaming PUT checksum verify for multipart upload_part + GPU codec non-streaming branch.
  • 32-bit runtime smoke test of s4-server (currently cargo check --target parity only).

Full changelog

See CHANGELOG.md for the per-finding detail.

v0.8.22 — eighth-round convergence (clean bill of health)

06 Jun 17:44

Convergence reached after eight consecutive Codex CLI + Claude
Code review rounds against this codebase, totalling 130+ fixes
across 5 security audit cycles + 1 production-readiness sweep +
3 doc-accuracy sweeps. Round 8 returned a clean bill of health.

Skipped intermediate versions: v0.8.20 was never published to
crates.io (Round 6 caught a silent-truncation regression in
v0.8.20 R5-8 → reverted in v0.8.21 R6-1) and v0.8.21 was
never published (Round 7 caught that v0.8.21 R6-6 introduced
a fresh fabrication in the SIGUSR1 grep recipe → fixed in
v0.8.22 R7-1). End users go straight from v0.8.19 → v0.8.22.

Published to crates.io as s4-server@0.8.22, s4-codec@0.8.22,
s4-config@0.8.22, s4-codec-py@0.8.22. Install via
cargo install s4-server.

What converged

Round 7 → v0.8.22 (#200-#202)

  • #200 R7-1 — Runbook §1 SIGUSR1 grep target corrected to
    "S4 SIGUSR1: dumped attached-manager snapshots" (the real
    substring in main.rs:1830). v0.8.21 R6-6 used a
    hand-written string that never matched.
  • #201 R7-2 — README §roadmap "v0.8.8 released
    (2026-05-20)" replaced with a moving-target reference to
    CHANGELOG + GitHub Releases. The pinned bullet was 13
    patches stale.
  • #202 R7-3 — Threat-model + runbook "Last reviewed"
    stamps both bumped to v0.8.22 with a one-line Stamp
    policy declaring future cuts bump both in lockstep.

Round 6 → v0.8.21 (#194-#199, rolled into v0.8.22)

  • #194 R6-1 — Reverted v0.8.20 R5-8's silent-truncation
    regression. --max-body-bytes default stays as the bare
    5 * 1024 * 1024 * 1024 literal, which is a loud compile
    error on 32-bit — the correct failure mode.
  • #195 R6-2 — Runbook trailing "Metric-naming note"
    s4_requests_total{status=~"5..."} → result="err".
  • #196 R6-3 — Runbook "Last reviewed" stamp bumped
    (R7-3 then re-bumped in lockstep with threat-model).
  • #197 R6-4 — AWS SigV4 vectors docstring reverted
    get-utf8-path → get-utf8 (R5-7 walk-back; AWS upstream
    name is get-utf8).
  • #198 R6-5 — Orphan-sidecar roadmap aligned with
    README #106 (v0.9 s4-tool repair-sidecar / verify).
  • #199 R6-6 — Runbook §1 SIGUSR1 recipe drops sleep 1
    in favour of journalctl ... | grep -m1 ... + sleep 5
    fallback. (R7-1 then fixed the grep target itself.)
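
The arithmetic behind R6-1 is easy to check. The Python sketch below models a 32-bit usize as a value that wraps modulo 2**32; that wraparound matches how a lossy integer cast behaves in Rust, but whether the v0.8.20 code truncated in exactly this way is an assumption, and the helper name is hypothetical.

```python
FIVE_GIB = 5 * 1024 * 1024 * 1024  # the --max-body-bytes default, in bytes
USIZE_MAX_32 = 2**32 - 1           # largest value a 32-bit usize can hold


def truncate_to_u32(value):
    """Model a lossy cast to a 32-bit unsigned integer (wraps modulo 2**32)."""
    return value & 0xFFFFFFFF


# 5 GiB does not fit in 32 bits, so a bare literal fails to compile there,
# while a cast would quietly turn the 5 GiB cap into a 1 GiB cap.
overflows = FIVE_GIB > USIZE_MAX_32
silently_truncated = truncate_to_u32(FIVE_GIB)
```

A cap that silently shrinks from 5 GiB to 1 GiB would reject valid uploads with no build-time signal, which is why the loud compile error is described as the correct failure mode.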

Round 5 → v0.8.20 (#186-#193, rolled into v0.8.22)

  • #186 R5-1 — Runbook §1 "graceful shutdown dumps state"
    claim removed. Only SIGUSR1 dumps; shutdown only drains
    the access-log buffer.
  • #187 R5-2 — Runbook §2 / §3 / §7 / §8 metric names
    canonicalised. v0.8.19 D-6 only covered §12's dedicated
    alert table; the other 4 sections shipped fabricated
    names. Real names now: s4_gpu_oom_total,
    s4_requests_total{result="err"},
    s4_replication_{dropped,replicated,status_swept}_total,
    s4_tls_cert_reload_total{result="err"}.
  • #188 R5-3 — README + SOCIAL_POSTS drop the fabricated
    s4_codec_chosen_total{codec} — the codec label lives
    on the real s4_requests_total counter.
  • #189 R5-4 — docs/orphan-sidecar-recovery.md shell
    recipe defines BACKEND_ENDPOINT alongside ENDPOINT
    (the recipe used $BACKEND_ENDPOINT without ever
    defining it).
  • #190 R5-5 — Orphan-sidecar stale "v0.8.17 may add"
    claim advanced (then R6-5 / R7 re-aligned to v0.9).
  • #191 R5-6 — Threat-model stamp bumped (R7-3 then
    re-bumped in lockstep with runbook).
  • #192 R5-7 — AWS SigV4 vectors docstring (R6-4 walked
    this back since the AWS upstream name is get-utf8, not
    get-utf8-path).
  • #193 R5-8 — --max-body-bytes default through u64
    cast. Reverted in R6-1 — silent truncation on
    32-bit was the wrong direction.

Cumulative scope (all 8 audit cycles)

| Round | Issues fixed | Cumulative cuts |
|---|---|---|
| R1 (security cycle 1) | CRIT 5 + HIGH 9 + MED 4 + hotfix | v0.8.11–v0.8.14 |
| R2 (security cycle 2) | HIGH+MED 18 | v0.8.15 |
| R3 (security cycle 3) | follow-up 15 + 5 | v0.8.16, v0.8.17 |
| R4 (production readiness) | P1-P7 + #172 | v0.8.18 |
| R4 (doc audit) | 12 | v0.8.19 |
| R5 (metric fabrication sweep) | 9 | v0.8.20 ⛔ skipped publish |
| R6 (silent-truncation regression) | 6 | v0.8.21 ⛔ skipped publish |
| R7 (fresh-fabrication sweep) | 3 | v0.8.22 ✅ published |
| R8 (convergence check) | 0 | — clean |

Operator-visible knobs cumulative

--trust-x-forwarded-for (v0.8.11),
--prefer-columnar-gpu (v0.8.13),
--allow-legacy-reserved-key-reads (v0.8.17),
--max-body-bytes (v0.8.19).

Tests

449 lib + 45 integration + 11 AWS SigV4 vectors + 2 server
bolero fuzz + 1 chaos = total target count ≈ 540, all green
under RUSTFLAGS="-D warnings"; cargo clippy --workspace --all-targets clean; cargo fmt --all --check clean; MinIO
E2E + coverage + bench-smoke jobs all green on CI.

Upgrade notes

  • No new operator-visible knobs since v0.8.19. The same four
    opt-ins above.
  • The v0.8.20 → v0.8.21 skip on crates.io means end users on
    v0.8.19 get every fix in #186 through #202 in a single
    upgrade.

Recommended pre-launch reading order

  1. docs/security/threat-model.md
  2. docs/ops/runbook.md
  3. README.md
  4. Per-version per-issue notes: CHANGELOG.md

v0.8.19 — fourth-round doc-accuracy sweep + --max-body-bytes CLI flag

06 Jun 17:15

Fourth-round Codex CLI + Claude Code review of v0.8.18 caught
fabrications in the v0.8.18 runbook + threat-model + bolero
module doc (written from memory rather than verified against
the source tree) and one missing CLI flag the threat model
already advertised. v0.8.19 closes all 12 items.

Published to crates.io as s4-server@0.8.19, s4-codec@0.8.19,
s4-config@0.8.19, s4-codec-py@0.8.19. Install via
cargo install s4-server (CPU build).

What's new since v0.8.18

Added (#174)

  • #174 D-1 — --max-body-bytes <BYTES> CLI flag. The cap
    was builder-only before v0.8.19 (with_max_body_bytes), but
    the threat model already advertised it as an operator-
    tunable defence — the doc was right; the missing piece was
    the CLI flag. Default 5 GiB matches the AWS S3 single-PUT
    max.

Fixed (#175-#185, doc / minor)

  • #175 D-2 — docs/security/threat-model.md no longer
    references a non-existent --state-dir. Replaced with the
    per-manager --<x>-state-file list (versioning,
    object_lock, mfa_delete, cors, inventory, notifications,
    tagging, replication, lifecycle).
  • #176 D-3 — Runbook §1 (disk full) rewritten. The pre-D-3
    text told operators that systemctl reload would "stop
    accepting new connections" — SIGHUP only rotates TLS
    certificates. Mitigation path now correctly says front S4
    with a load balancer + drain there, or change
    --max-concurrent-connections and restart (not reload).
  • #177 D-4 — Runbook §6 (MFA-Delete recovery) now points
    at the --mfa-delete-state-file <PATH> operator-supplied
    file, not the fictional mfa.json under a fictional
    --state-dir.
  • #178 D-5 — Runbook §12 (signals) SIGUSR1 description was
    wrong: pre-D-5 it claimed access-log flush; reality is the
    v0.8.5 #86 helper atomically dumps every in-memory state
    manager (versioning / object_lock / mfa_delete / cors /
    inventory / notifications / tagging / replication /
    lifecycle) to its --<x>-state-file. Access-log buffer
    drains on shutdown, not on SIGUSR1.
  • #179 D-6 — Runbook metric reference table renamed every
    metric to its canonical name in
    crates/s4-server/src/metrics.rs. The pre-D-6 table cited
    s4_backend_error_total, s4_replication_pending_total,
    s4_replication_completed_total,
    s4_replication_failed_total,
    s4_tls_cert_reload_failed_total,
    s4_gpu_compress_oom_total; none of those exist.
    Real names: s4_replication_dropped_total,
    s4_replication_replicated_total,
    s4_tls_cert_reload_total{result="err"},
    s4_gpu_oom_total.
  • #180 D-7 — Runbook PromQL alert syntax corrected:
    action="s3:Bypass*" (literal *, never matches) →
    action=~"s3:Bypass.*" (regex matcher).
  • #181 D-8 — Runbook §4 SSE-S4 rotation typo retiredsl →
    retired slots.
  • #182 D-9 — crates/s4-server/tests/fuzz_bolero.rs
    module doc trimmed to the 2 targets actually shipped
    (sigv4a_auth_header_bolero, policy_json_bolero). The
    pre-D-9 text claimed 4 targets (including a
    pub(crate)-re-export-based one that doesn't exist). The
    two missing targets are tagged honestly as v0.8.19+
    roadmap.
  • #183 D-10 — crates/s4-server/tests/chaos.rs placeholder
    smoke test now carries concrete assert_eq! checks so a
    future refactor can't accidentally leave the file
    compiling-but-useless.
  • #184 D-11 — AWS SigV4 vectors module doc no longer
    claims every vector comes from the AWS-published suite.
    Split honestly into AWS-published (4 vectors) and
    S3 spec-derived edge vectors (7 vectors, motivated by the
    v0.8.16 #150 byte-level fix).
  • #185 D-12 — Threat-model residual risk #4 (versioned
    multipart Range GET fall-back to full read) now includes
    the cost note about large multipart objects + range-heavy
    workloads.
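
The D-7 fix turns on the difference between PromQL's exact matcher (=) and its regex matcher (=~), which is fully anchored. The Python sketch below mimics the two matchers to show why the pre-fix selector never matched; the helper is hypothetical, and Python's re module stands in for the RE2 engine Prometheus uses, which is close enough for this pattern.

```python
import re


def label_matches(op, pattern, value):
    """Mimic PromQL's `=` (exact) and `=~` (fully anchored regex) label matchers."""
    if op == "=":
        return value == pattern  # the asterisk is just a literal character here
    if op == "=~":
        return re.fullmatch(pattern, value) is not None
    raise ValueError(f"unsupported matcher: {op}")


action = "s3:BypassGovernanceRetention"
before_fix = label_matches("=", "s3:Bypass*", action)    # never fires
after_fix = label_matches("=~", "s3:Bypass.*", action)   # matches the prefix
```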

Tests

449 lib + 45 integration + 11 SigV4 vectors + 2 bolero + 1
chaos = unchanged from v0.8.18; all green under
RUSTFLAGS="-D warnings"; cargo clippy --workspace --all-targets clean; cargo fmt --all --check clean; MinIO
E2E + coverage + bench-smoke jobs all green on CI.

Notes

  • v0.8.19 closes the fourth-round audit. Five full audit
    cycles (3 security + 1 production-readiness + 1
    doc-accuracy) have now run against this codebase. The doc
    fabrications (#175–#180) were a reminder that runbooks
    written from memory are unreliable; future doc work will be
    verified against the source tree before each commit.
  • --max-body-bytes is the only new operator-visible knob
    since v0.8.17. The four opt-ins now available are:
    --trust-x-forwarded-for (v0.8.11),
    --prefer-columnar-gpu (v0.8.13),
    --allow-legacy-reserved-key-reads (v0.8.17), and
    --max-body-bytes (v0.8.19).

Full per-issue notes: CHANGELOG.md.

v0.8.18 — production-readiness sweep (threat model + runbook + AWS test vectors + server fuzz + coverage CI)

06 Jun 16:37

Production-readiness sweep. Three audit cycles (v0.8.11-v0.8.17)
closed every CRIT / HIGH / MED security finding. v0.8.18 lifts the
operational maturity, AWS conformance posture, and
quality-gate infrastructure to match. No code-correctness
changes outside what already shipped in v0.8.17 — this release is
docs, tests, and CI.

Published to crates.io as s4-server@0.8.18, s4-codec@0.8.18,
s4-config@0.8.18, s4-codec-py@0.8.18. Install via
cargo install s4-server (CPU build).

What's new since v0.8.17

Added

  • #165 P1 — docs/security/threat-model.md.
    STRIDE-shape threat model covering 5 attack surfaces (public S3
    wire, compressed payload at rest, key handling, backend trust
    boundary, Object Lock posture). Every mitigation traces to a
    shipped issue number from the three audit cycles. Explicit
    non-goals + known residual risks (the rustls-webpki CVE
    chain etc.) documented so reviewers don't reverse-engineer
    them.
  • #166 P2 — docs/ops/runbook.md.
    12 operational procedures (disk full, GPU OOM, backend 5xx
    storm, SSE key rotation, KMS KEK loss, MFA secret loss,
    replication backlog, TLS rotation, orphan sweep, legacy
    reserved-key migration, audit advisory, graceful shutdown)
    — each in Symptom → Diagnose → Mitigate → Recover →
    Prevent shape.
  • #167 P3 — AWS SigV4 canonical-request test vectors
    (crates/s4-server/src/routing.rs::aws_sigv4_canonical_vectors).
    11 vectors pinning the v0.8.16 #150 byte-level helpers to
    AWS-published expected outputs (vanilla / vanilla-query-order
    key + value / utf8 / non-UTF8 byte round-trip / reserved-char
    encoding / mixed-case percent normalisation / bare key /
    unreserved set / S3 ListObjectsV2 / path with spaces).
  • #168 P4 — server-side bolero fuzz targets
    (crates/s4-server/tests/fuzz_bolero.rs):
    sigv4a_auth_header_bolero (SigV4a Authorization parser),
    policy_json_bolero (IAM bucket-policy JSON parser). Pairs
    with the existing 7 codec-layer bolero targets so the fuzz
    farm now covers every untrusted parser on the listener edge.
  • #170 P6 — code coverage CI job (cargo-llvm-cov + Codecov
    upload, push-to-main only) + bench smoke job (runs the three
    examples/bench_* binaries to surface bit-rot; not a
    regression gate).
  • #171 P7 — chaos / fault-injection test scaffold
    (crates/s4-server/tests/chaos.rs). Placeholder establishing
    the target; backend-method-level fault injection populates
    v0.8.19+.
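
Several of the #167 vector categories (query order by key and value, reserved-character encoding, bare key, unreserved set) exercise the canonical query string step of SigV4. The Python sketch below shows that step as AWS documents it; it is an illustration and not the routing.rs helper, and the function name is hypothetical.

```python
from urllib.parse import quote


def canonical_query_string(pairs):
    """Build a SigV4-style canonical query string from decoded (key, value) pairs."""
    # Percent-encode everything except the unreserved set A-Z a-z 0-9 - _ . ~
    # (quote emits uppercase hex digits, as SigV4 requires).
    encoded = [
        (quote(key, safe="-_.~"), quote(value, safe="-_.~"))
        for key, value in pairs
    ]
    # Sort by encoded key, then by encoded value for repeated keys.
    encoded.sort()
    # A bare key such as ?acl is emitted with an empty value: "acl=".
    return "&".join(f"{key}={value}" for key, value in encoded)
```

For example, a ListObjectsV2 request with prefix "a b/c" canonicalises to list-type=2&prefix=a%20b%2Fc regardless of the order the client sent the parameters in.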

Changed

  • #169 P5 — README proptest claim corrected from 38 → 39
    properties.
  • #172 — .github/workflows/ci.yml notify-on-failure
    step now deduplicates by SHA prefix before opening an issue;
    companion .github/workflows/ci-close-resolved.yml
    auto-closes ci-failure issues once a subsequent main commit
    lands green. Closes the auto-issue spam observed during the
    v0.8.13 / v0.8.14 retry cycle.

Fixed

  • Stale ci-failure GitHub issues #115 / #116 / #117 closed
    with the v0.8.13 / v0.8.14 supersession trail.

Tests

449 lib + 45 integration + 11 SigV4 vectors + 2 bolero + 1
chaos scaffold = total test target count climbs from 519 to
~540, all green under RUSTFLAGS="-D warnings"; cargo clippy --workspace --all-targets clean; cargo fmt --all --check clean; MinIO E2E job green on CI; coverage job
green on CI.

Roadmap (deferred from this release)

  • criterion regression-tracking benches — needs baseline
    storage like benchmark-action/github-action-benchmark. The
    v0.8.18 bench-smoke job is the floor; the regression gate is
    the ceiling.
  • Full chaos scenarios — 5+ tests against backend-method-
    level fault injection. Scaffold ships here; scenarios
    populate v0.8.19+.
  • Supply-chain hardening — sigstore release signing,
    reproducible builds, SBOM badge.

Upgrade notes

  • No new operator-visible knobs since v0.8.17. The three
    opt-ins from prior releases (--trust-x-forwarded-for,
    --prefer-columnar-gpu, --allow-legacy-reserved-key-reads)
    are the entire knob surface.
  • Recommended pre-launch reading order:
    1. docs/security/threat-model.md
    2. docs/ops/runbook.md
    3. README.md

Full per-issue notes: CHANGELOG.md.

v0.8.17 — third-round audit closeout (5 follow-up items on v0.8.16) + migration hatch

06 Jun 15:22

Third-round audit closeout. A follow-up Codex CLI + Claude
Code review of v0.8.16 caught 5 residual items (2 MED + 3 LOW).
No CRIT / HIGH after the prior two cycles. This is the version
to target for the Reddit launch — three full multi-agent audit
cycles have closed every CRIT / HIGH / MED finding from the
pre-release review.

Published to crates.io as s4-server@0.8.17, s4-codec@0.8.17,
s4-config@0.8.17, s4-codec-py@0.8.17. Install via
cargo install s4-server (CPU build).

What's new since v0.8.16

Fixed (#160-#162)

  • #160 G-1 — F-5 presigned-URL 501 is now unconditional.
    The v0.8.16 check ran AFTER let gate = gate?;, so
    deployments without --sigv4a-credentials had
    ?X-Amz-Algorithm=AWS4-ECDSA-P256-SHA256 URLs silently fall
    through to the SigV4 path (which doesn't understand SigV4a
    query auth either). The presigned-detect call now runs
    before the gate guard, so every deployment emits the
    deterministic 501.
  • #161 G-2 — reserved-name guard extended to 8 adjacent
    per-object endpoints: get_object_acl, put_object_acl,
    get_object_attributes, get_object_tagging,
    put_object_tagging, delete_object_tagging,
    restore_object, and upload_part_copy (both source +
    destination). The v0.8.16 F-13 fix only covered GET / HEAD /
    DELETE — a curious client could still
    GetObjectAcl(<key>.s4index) or
    PutObjectAcl(<key>.s4index, public-read) to bypass the
    read-reject via the backend's public-URL path. New shared
    helper S4Service::check_not_reserved_key(...) +
    ReservedKeyMode enum so every site uses the same code; the
    three pre-existing F-13 sites + the M-1 PUT / Copy /
    CreateMultipart sites refactor through the same helper.
  • #162 G-3 — post_magic_entropy_high short-sample guard
    is now reachable. The v0.8.16 F-12 check inside the helper
    defaulted to false for <= 48-byte samples but the upstream
    MIN_SAMPLE_BYTES = 128 short-circuit in pick_from_sample
    filtered every such sample before it could reach F-12. The
    magic-byte arm now runs above the MIN_SAMPLE_BYTES gate, so
    a 40-byte BZh:loglog: user log actually hits the post-magic
    entropy check and gets routed to the default codec
    (compressed) rather than passed through uncompressed. Closes
    the v0.8.15 M-7 motivation that v0.8.16 F-12 thought it had
    closed.
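
The G-1 fix is an ordering change: the presigned SigV4a check has to run before the code looks at whether a SigV4a gate is configured. A minimal Python sketch of that ordering, with hypothetical function and return values standing in for the real Rust handler, might look like this:

```python
SIGV4A_ALGORITHM = "AWS4-ECDSA-P256-SHA256"


def route_auth(query, sigv4a_gate):
    """Pick an auth outcome for a request (hypothetical names and return values).

    `query` is a dict of query parameters; `sigv4a_gate` is None when the
    server was started without SigV4a credentials.
    """
    # Presigned SigV4a detection comes first, so it fires on every
    # deployment, including those with no gate configured.
    if query.get("X-Amz-Algorithm") == SIGV4A_ALGORITHM:
        return "501"
    if sigv4a_gate is None:
        return "sigv4"  # ordinary SigV4 path
    return "sigv4a-header"
```

Swapping the first two checks reproduces the v0.8.16 behaviour, where a deployment without the gate would send the presigned SigV4a URL down the SigV4 path.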

Added (#163-#164)

  • #163 G-4 — --allow-legacy-reserved-key-reads CLI flag.
    Migration escape hatch for operators upgrading from
    pre-v0.8.15 deployments that may carry legitimate user-owned
    objects whose key ends in .s4index. When set, the
    reserved-name guard does NOT block GET / HEAD / DELETE on
    .s4index keys; writes (PUT / Copy / Create-Multipart /
    tagging-write / ACL-write) stay blocked regardless of the
    flag so an attacker can't inject into the namespace. Default
    false matches v0.8.16 behaviour; boot-time info-log is
    loud when the flag is on so the operator notices the
    migration window is open.
  • #164 G-5 — docs/orphan-sidecar-recovery.md operator
    recipe for sweeping the orphan <key>.s4index artifacts
    that v0.8.15 H-g left on versioning-Enabled buckets. v0.8.16
    #151 F-7 stopped emitting new orphans by skipping the
    sidecar block on versioned multipart Complete; this recipe
    handles the one-time cleanup of pre-F-7 leftovers. A future
    release may ship a s4 admin sweep-orphan-sidecars
    subcommand that automates the same loop.
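
The asymmetry in G-4 (reads may be unblocked, writes never are) can be sketched as a small decision function. The helper and enum names echo S4Service::check_not_reserved_key and ReservedKeyMode from the G-2 note, but the enum variants, the signature, and the boolean return are assumptions made for this Python illustration.

```python
from enum import Enum


class ReservedKeyMode(Enum):
    READ = "read"    # GET / HEAD / DELETE per the flag description
    WRITE = "write"  # PUT / Copy / Create-Multipart / tagging-write / ACL-write


RESERVED_SUFFIX = ".s4index"


def check_not_reserved_key(key, mode, allow_legacy_reads=False):
    """Return True when the request may proceed, False when it must be rejected."""
    if not key.endswith(RESERVED_SUFFIX):
        return True  # ordinary keys are never affected by the guard
    if mode is ReservedKeyMode.READ and allow_legacy_reads:
        return True  # migration window: legacy user objects stay reachable
    return False     # writes to the reserved namespace stay blocked regardless
```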

Upgrade notes

  • --allow-legacy-reserved-key-reads is the only new
    operator-visible knob since v0.8.16. The cumulative audit
    surface area still totals three opt-ins:
    --trust-x-forwarded-for (v0.8.11 CRIT-4),
    --prefer-columnar-gpu (v0.8.13 #125), and this v0.8.17
    migration hatch.
  • No behavioural breaks since v0.8.16. The G-2 reserved-name
    guard extension closes a leak that wasn't reachable via the
    aws s3 cp happy path anyway.

Tests

438 lib + 45 integration tests green under
RUSTFLAGS="-D warnings"; cargo clippy --workspace --all-targets clean; cargo fmt --all --check clean; MinIO
E2E job (cargo test --workspace --release -- --ignored --test-threads=1) green on CI.

Full per-issue notes: CHANGELOG.md.
Recovery recipe for v0.8.15 orphan sidecars:
docs/orphan-sidecar-recovery.md.