162 changes: 162 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,162 @@
# Changelog

All notable changes to HyperCache are recorded here. The format follows
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project
adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [2.0.0] — 2026-05-04

A modernization release. The headline themes:

- Eviction is now sharded by default for concurrency-friendly throughput.
- The distributed-memory backend (`DistMemory`) gained body limits, TLS,
bearer-token auth, lifecycle-context cancellation, and surfaced
listener errors.
- A typed wrapper (`Typed[T, V]`) is available for compile-time
type-safe access without the caller-side type assertions of the
untyped API.
- The legacy `pkg/cache` v1 store and the `longbridgeapp/assert` test
dependency are gone.

The full course-correction plan (Phase 0 baseline → Phase 6 file split,
plus Phase 5a–5e DistMemory hardening) is in commit history. The two
RFCs that informed the design decisions live under [docs/rfcs/](docs/rfcs/).

### Breaking changes

- **`pkg/cache` v1 removed.** All callers must use `pkg/cache/v2`.
- **`longbridgeapp/assert` test dependency removed.** Tests now use
`stretchr/testify/require`. Internal test code only — no impact on
library consumers, but downstream contributors authoring tests
against this codebase must use `require`.
- **`sentinel.ErrMgmtHTTPShutdownTimeout` removed.**
`ManagementHTTPServer.Shutdown` now calls `app.ShutdownWithContext`
and returns the underlying ctx error directly. Callers comparing
against the removed sentinel must switch to `errors.Is(err,
context.DeadlineExceeded)` or equivalent.
- **Sharded eviction is default-on (32 shards).** Items no longer
evict in strict global LRU/LFU order — the algorithm operates
independently within each shard. Total capacity is honored within
±32 (one slot of slack per shard). Use `WithEvictionShardCount(1)`
to restore strict-global ordering at the cost of single-mutex
contention.
- **`hypercache.go` decomposed into 6 files** (`hypercache.go`,
`hypercache_io.go`, `hypercache_eviction.go`,
`hypercache_expiration.go`, `hypercache_dist.go`,
`hypercache_construct.go`). No public API change; third-party
patches against line numbers in the prior single-file layout will
not apply.
- **`ManagementHTTPServer` constructor order fix.**
`WithMgmtReadTimeout` and `WithMgmtWriteTimeout` previously mutated
struct fields *after* `fiber.New` had locked in the defaults — the
options were silent no-ops. Construction order is now correct, so
any code relying on the silent no-op (e.g., setting absurd values
knowing they would be ignored) will see those values take effect.

### Performance

Measured on an Apple M4 Pro with `go test -bench` (`-count=5`), compared with benchstat.
Full release snapshot captured in [bench-v2.0.0.txt](bench-v2.0.0.txt).

- **Per-shard atomic `Count`.** `BenchmarkConcurrentMap_Count`:
53 → ~10 ns/op. `_CountParallel`: 1181 → ~13 ns/op. Eliminates the
lock-storm that previously serialized on a single mutex during
eviction-loop count checks.
- **Sharded eviction algorithm** (`pkg/eviction/sharded.go`).
Replaces the global eviction-algorithm mutex with 32 per-shard
mutexes routed by the same hash `ConcurrentMap` uses, so a key's
data shard and eviction shard align (cache-locality on Set).
- **`iter.Seq2` migration** replacing channel-based `IterBuffered`.
`BenchmarkConcurrentMap_All` (renamed from `_IterBuffered`):
757µs → 26.5µs/op (-96.51%). Bytes/op: 1.73 MiB → 0 B/op.
Allocs/op: 230 → 0. Eliminated 32 goroutines + 32 channels per
iteration.
- **xxhash consolidation** (`pkg/cache/v2/hash.go`). Replaced inlined
FNV-1a with `xxhash.Sum64String` folded to 32 bits.
`BenchmarkConcurrentMap_GetShard`: 10.07 → 3.46 ns/op (-65.63%).
- **Sharded item-aware eviction was tried and rejected** per
[RFC 0001](docs/rfcs/0001-backend-owned-eviction.md). The
hypothesis (duplicate-map overhead is the bottleneck) was
falsified — sharded contention dominates. Code removed; lessons
preserved in the RFC for future contributors.

### Features

- **`hypercache.Typed[T, V]` wrapper** for compile-time type-safe
cache access. Wraps an existing `HyperCache[T]`; multiple `Typed`
views can share one underlying cache over disjoint keyspaces.
Includes `Set`, `Get`, `GetTyped` (explicit `ErrTypeMismatch`),
`GetWithInfo`, `GetOrSet`, `GetMultiple`, `Remove`, `Clear`. See
[hypercache_typed.go](hypercache_typed.go) and
[RFC 0002 Phase 1](docs/rfcs/0002-generic-item-typing.md). Phase 2
(deep `Item[V]` generics) is v3 territory, conditional on adoption
signal.
- **`WithDistHTTPLimits(DistHTTPLimits)` option** for the dist
transport: server `BodyLimit` / `ReadTimeout` / `WriteTimeout` /
`IdleTimeout` / `Concurrency`, plus client `ResponseLimit` /
`ClientTimeout`. Defaults: 16 MiB request/response body cap, 5 s
read/write/client timeout, 60 s idle, and fiber's default concurrency
cap of 256 × 1024 (262,144) connections. Partial overrides are
honored: zero-valued fields inherit the defaults.
- **`WithDistHTTPAuth(DistHTTPAuth)` option** for bearer-token auth on
`/internal/*` and `/health` (`Token` for the common case;
`ServerVerify`/`ClientSign` hooks for JWT, mTLS-derived identity,
HMAC, etc.). Constant-time token compare on the server side. The
auto-created HTTP client signs every outgoing request with the
same token. Mismatched-token peers are rejected with HTTP 401
(`sentinel.ErrUnauthorized`).
- **TLS support** via `DistHTTPLimits.TLSConfig`. The server wraps
its listener with `tls.NewListener`; the auto-created HTTP client
attaches the same `*tls.Config` to its `Transport.TLSClientConfig`
with ALPN forced to `http/1.1` (fiber/fasthttp doesn't speak h2).
Same `*tls.Config` configures both sides — operators applying it
consistently across the cluster get encrypted intra-cluster
traffic out of the box. Plaintext peers handshake-fail.
- **Dist server lifecycle context.** `DistMemory.LifecycleContext()`
exposes a context, derived from the one passed to the constructor,
that is canceled on `Stop()`. Replaces the prior pattern where
handlers captured the constructor's `context.Background()` and never
observed cancellation. In-flight handlers and replica forwards see
`Done()` the moment `Stop` is called.
- **`LastServeError()` accessor** on both `distHTTPServer` and
`ManagementHTTPServer`. Replaces the prior `_ = serveErr` pattern
that silently swallowed listener-loop crashes — operators can now
surface the failure to logs/alerts.
- **`Stop()` goroutine-leak fix.** Both `distHTTPServer.stop` and
`ManagementHTTPServer.Shutdown` now call
`app.ShutdownWithContext(ctx)` directly instead of wrapping
`app.Shutdown()` in a goroutine and racing it against ctx done
(which leaked the goroutine when ctx fired first).
- **New sentinels:** `sentinel.ErrTypeMismatch`,
`sentinel.ErrUnauthorized`.

### Internal

Worth surfacing for contributors:

- **v2 module layout** is the file split listed under "Breaking
changes" above — readability win, no API change.
- **Test helpers** introduced under `tests/`:
`tests/dist_cluster_helper.go::SetupInProcessCluster[RF]`,
`tests/merkle_node_helper.go`,
`pkg/backend/dist_memory_test_helpers.go::EnableHTTPForTest`
(build tag `test`).
- **Lint discipline:** 35 `nolint` directives total across the repo,
each with a one-line justification. golangci-lint v2.12.1 runs
clean with `--build-tags test`.

### Removed

- `pkg/cache` v1 (see "Breaking changes").
- `longbridgeapp/assert` test dependency (see "Breaking changes").
- `sentinel.ErrMgmtHTTPShutdownTimeout` (see "Breaking changes").
- Experimental `WithItemAwareEviction` option / `IAlgorithmItemAware`
interface / `LRUItemAware` / `ShardedItemAware` types — landed
briefly during the RFC 0001 spike, then torn out per the RFC's
own discipline when the perf gate failed. The
[RFC document](docs/rfcs/0001-backend-owned-eviction.md) preserves
the measurement and the lessons.

[Unreleased]: https://github.com/hyp3rd/hypercache/compare/v2.0.0...HEAD
[2.0.0]: https://github.com/hyp3rd/hypercache/releases/tag/v2.0.0
70 changes: 67 additions & 3 deletions README.md
@@ -137,6 +137,12 @@ Available algorithm names you can pass to `WithEvictionAlgorithm`:

Note: ARC is experimental and isn’t included in the default registry. If you choose to use it, register it manually or enable it explicitly in your build.

#### Sharded eviction (default since v2.0.0)

The configured algorithm is wrapped by a 32-shard router (`pkg/eviction/sharded.go`) that uses the same key hash as `ConcurrentMap`, so a key's data shard and eviction shard line up. This removes the global-mutex contention that single-instance algorithms (LRU/LFU/Clock/CAWOLFU) suffer from. Total capacity is honored within ±32 (one slot of slack per shard), and items evict per shard rather than in strict global LRU/LFU order.

Use `WithEvictionShardCount(1)` to disable sharding when you need strict-global ordering at the cost of single-mutex contention. Pass any other positive power of two to tune (e.g. `WithEvictionShardCount(64)`).
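
A rough sketch of how power-of-two shard routing and per-shard capacity can work. Everything here is an assumption for illustration: FNV-1a from the standard library stands in for the project's xxhash-based hash, and rounding the per-shard capacity up is only one possible source of the slack described above.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardIndex maps a key to one of n shards. n must be a power of two so
// that the mask is equivalent to a modulo.
func shardIndex(key string, n uint32) uint32 {
	h := fnv.New32a()
	_, _ = h.Write([]byte(key)) // hash.Hash writes never fail
	return h.Sum32() & (n - 1)
}

// perShardCapacity splits a total capacity across n shards, rounding up.
// Rounding is what lets the effective total drift from the requested one.
func perShardCapacity(total, n int) int {
	return (total + n - 1) / n
}

func main() {
	const shards = 32

	// The same key always routes to the same shard.
	fmt.Println(shardIndex("u:42", shards) == shardIndex("u:42", shards)) // true

	per := perShardCapacity(1000, shards)
	fmt.Println(per, per*shards) // 32 1024
}
```

With a shard count of 1 the mask is zero, every key lands in shard 0, and ordering is global again, which matches the intent of `WithEvictionShardCount(1)`.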

## API

`NewInMemoryWithDefaults(ctx, capacity)` is the quickest way to start:
@@ -177,6 +183,7 @@ if err != nil {
| `WithExpirationTriggerBuffer` | Buffer size for coalesced expiration trigger channel. |
| `WithExpirationTriggerDebounce` | Drop rapid-fire triggers within a window. |
| `WithEvictionAlgorithm` | Select eviction algorithm (lru, lfu, clock, cawolfu, arc*). |
| `WithEvictionShardCount` | Number of eviction-algorithm shards (default 32; 1 disables sharding). |
| `WithMaxEvictionCount` | Cap number of items evicted per cycle. |
| `WithMaxCacheSize` | Max cumulative serialized item size (bytes). |
| `WithStatsCollector` | Choose stats collector implementation. |
@@ -191,9 +198,33 @@ if err != nil {
| `WithDistSeeds` | (DistMemory) Static seed addresses to pre-populate membership. |
| `WithDistTombstoneTTL` | (DistMemory) Retain delete tombstones for this duration before compaction (<=0 = infinite). |
| `WithDistTombstoneSweep` | (DistMemory) Interval to run tombstone compaction (<=0 disables). |
| `WithDistHTTPLimits` | (DistMemory) Body / response / timeout / concurrency caps for the dist HTTP server + auto-client. |
| `WithDistHTTPAuth` | (DistMemory) Bearer-token auth (`Token`) plus optional `ServerVerify` / `ClientSign` hooks. |

*ARC is experimental (not registered by default).

### Type-safe access (`Typed[T, V]`)

The untyped `HyperCache.Get` returns `(any, bool)`, so callers must
type-assert at every call site. `hypercache.NewTyped[T, V]` wraps an
existing `*HyperCache[T]` to provide a compile-time-typed surface
without changing the underlying storage:

```go
hc, _ := hypercache.NewInMemoryWithDefaults(ctx, 10_000)
sessions := hypercache.NewTyped[backend.InMemory, *Session](hc)

_ = sessions.Set(ctx, "u:42", &Session{UserID: "u-42"}, time.Hour)
s, ok := sessions.Get(ctx, "u:42") // s is *Session — no type assert
```

Multiple `Typed[T, V1]` and `Typed[T, V2]` views can share one underlying
cache over disjoint keyspaces. Wrong-type reads return `(zero, false)`
by default (fail-soft); use `GetTyped` for an explicit
`sentinel.ErrTypeMismatch`. See
[docs/rfcs/0002-generic-item-typing.md](docs/rfcs/0002-generic-item-typing.md)
for the design and the v3 deep-generics roadmap.

### Redis / Redis Cluster notes

When using Redis or Redis Cluster, item size accounting uses the configured serializer (e.g. msgpack) to align in-memory and remote representations. Provide the serializer via backend options (`WithSerializer` / `WithClusterSerializer`).
@@ -222,16 +253,49 @@ Current capabilities (implemented):
- Lightweight gossip snapshot exchange (in-process only).
- Rebalancing (primary change & lost ownership migrations) with batching and concurrency throttling metrics.
- Latency histograms for Get/Set/Remove.
- HTTP transport hardening: bounded request/response bodies, idle-connection timeout, concurrency cap, bearer-token auth, TLS / mTLS via `*tls.Config`, lifecycle-context cancellation on `Stop`, and surfaced listener errors. See "Transport hardening" below.

Limitations / not yet implemented:

- Replica-only ownership diff migrations.
- Full gossip-based dynamic membership & indirect probing.
- Advanced versioning (HLC / vector clocks).
- Tracing spans for distributed operations.
- Security (TLS/mTLS, auth) & compression.
- Compression on the wire.
- Persistence / durability (out of scope presently).

#### Transport hardening (since v2.0.0)

The dist HTTP server and the auto-created HTTP client share a single configuration surface — apply the same option to every node in the cluster.

```go
// 1) Limits: body caps, timeouts, concurrency, optional TLS.
limits := backend.DistHTTPLimits{
    BodyLimit:     16 * 1024 * 1024, // server inbound cap
    ResponseLimit: 16 * 1024 * 1024, // client inbound cap
    IdleTimeout:   60 * time.Second,
    TLSConfig:     tlsConfig, // non-nil enables HTTPS on both sides
}

// 2) Auth: a shared bearer token covers most clusters; ServerVerify /
//    ClientSign hooks are escape hatches for JWT, mTLS-derived
//    identity, HMAC, etc.
auth := backend.DistHTTPAuth{Token: "shared-cluster-secret"}

bi, _ := backend.NewDistMemory(ctx,
    backend.WithDistNode("nodeA", "127.0.0.1:7001"),
    backend.WithDistReplication(3),
    backend.WithDistHTTPLimits(limits),
    backend.WithDistHTTPAuth(auth),
)
```

Operational helpers:

- `DistMemory.LifecycleContext()`: a context derived from the one passed to the constructor and canceled on `Stop()`. In-flight handlers and replica forwards observe `Done()`.
- `LastServeError()` on both `distHTTPServer` and `ManagementHTTPServer`: surfaces listener-loop crashes that were previously swallowed silently.

Defaults if `WithDistHTTPLimits` is not supplied: 16 MiB body cap, 5 s read/write/client timeouts, 60 s idle, and fiber's default concurrency cap of 256 × 1024 (262,144) connections. Auth is disabled by default.

#### Rebalancing & Ownership Migration (Experimental Phase 3)

The DistMemory backend includes an experimental periodic rebalancer that:
@@ -283,7 +347,7 @@ Test helpers `AddPeer` and `RemovePeer` simulate join / leave events that trigge
| Advanced versioning (HLC/vector) | Planned |
| Client SDK (direct routing) | Planned |
| Tracing spans | Planned |
| Security (TLS/auth) | Planned |
| Security (TLS/auth) | Done (since v2.0.0; see "Transport hardening") |
| Compression | Planned |
| Persistence | Out of scope (current phase) |
| Chaos / fault injection | Planned |
7 changes: 7 additions & 0 deletions cspell.config.yaml
@@ -35,10 +35,13 @@ dictionaries: []
words:
- acks
- ALPN
- assertable
- autosync
- backpressure
- baselining
- benchmarkdist
- benchmem
- benchstat
- benchtime
- bitnami
- bodyclose
@@ -54,6 +57,7 @@ words:
- cmap
- Cmder
- codacy
- codemod
- containedctx
- contextcheck
- cpuprofile
@@ -85,6 +89,7 @@ words:
- Fprintln
- freqs
- funlen
- geomean
- gerr
- gitversion
- GITVERSION
@@ -95,6 +100,7 @@ words:
- goconst
- gofiber
- GOFILES
- gofmt
- gofumpt
- goimports
- golangci
@@ -145,6 +151,7 @@ words:
- NOVENDOR
- paralleltest
- Pipeliner
- pluggability
- popd
- Prealloc
- protoc