`Loader.Load` allocates the `batchRequest` (and its `done` channel) at the very top of the function, before the cache lookup:
```go
func (l *Loader[K, V]) Load(originalContext context.Context, key K) Thunk[V] {
	ctx, finish := l.tracer.TraceLoad(originalContext, key)
	req := &batchRequest[K, V]{
		key:  key,
		done: make(chan struct{}),
	}
	...
	if v, ok := l.cache.Get(ctx, key); ok {
		...
		return v // req is never used on this path
	}
```
`req` is only used on a cache miss: it is sent to the batcher (`l.curBatcher.input <- req`) and read by the returned thunk. On a cache hit it is allocated and immediately discarded, so every cached `Load` pays for a struct allocation and a channel allocation it never uses. In read-heavy workloads where the same keys are loaded repeatedly, this is pure waste and shows up prominently in allocation profiles.
Moving the allocation below the cache-hit early-return eliminates it for all cache hits.
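As a rough illustration of the reordering, here is a minimal sketch on a toy model rather than the real `Loader`. The `request` and `loader` types, the map-backed cache, and the buffered `input` channel are all hypothetical stand-ins; only the ordering (cache check first, allocation after) reflects the proposed change.

```go
package main

import "fmt"

// request is a hypothetical stand-in for batchRequest[K, V].
type request struct {
	key  string
	done chan struct{}
}

// loader is a toy model: a map plays the cache and a buffered
// channel plays the batcher input.
type loader struct {
	cache map[string]string
	input chan *request
}

// load checks the cache first and builds the request only on a miss.
// It returns the cached value (if any) and the queued request,
// which is nil on a cache hit.
func (l *loader) load(key string) (string, *request) {
	if v, ok := l.cache[key]; ok {
		return v, nil // hit: no struct and no channel are created
	}
	// Miss: only now pay for the struct and its done channel.
	req := &request{key: key, done: make(chan struct{})}
	l.input <- req
	return "", req
}

func main() {
	l := &loader{
		cache: map[string]string{"a": "cached"},
		input: make(chan *request, 1),
	}

	v, req := l.load("a")
	fmt.Println(v, req == nil) // cached true

	_, req = l.load("b")
	fmt.Println(req.key, len(l.input)) // b 1
}
```

The miss path is unchanged in behavior; the only difference is that the request is constructed after the early return instead of before it.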
Benchmark (`Load` on a primed/cached key, `-benchmem`):

| | allocs/op | B/op | ns/op |
|---|---|---|---|
| before | 3 | 160 | 45.7 |
| after | 1 | 16 | 20.5 |
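For anyone who wants to see the effect in isolation, the following is a small sketch of a hit-path measurement using `testing.Benchmark`. It is not the benchmark that produced the numbers above: `loadEager`, `loadLazy`, the package-level `cache`, and the `pending` slice are hypothetical stand-ins for the before and after shapes of `Load`, so the absolute figures it prints will differ from the table.

```go
package main

import (
	"fmt"
	"testing"
)

// request is a hypothetical stand-in for batchRequest[K, V].
type request struct {
	key  string
	done chan struct{}
}

var (
	cache   = map[string]int{"a": 1} // primed cache: "a" is always a hit
	pending []*request               // miss path stores requests here, so they escape to the heap
)

// loadEager models the "before" shape: allocate, then check the cache.
func loadEager(key string) int {
	req := &request{key: key, done: make(chan struct{})}
	if v, ok := cache[key]; ok {
		return v // req is discarded here
	}
	pending = append(pending, req)
	return 0
}

// loadLazy models the "after" shape: check the cache, then allocate.
func loadLazy(key string) int {
	if v, ok := cache[key]; ok {
		return v
	}
	pending = append(pending, &request{key: key, done: make(chan struct{})})
	return 0
}

// benchHit benchmarks load on a key that is already cached.
func benchHit(load func(string) int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			load("a")
		}
	})
}

func main() {
	before := benchHit(loadEager)
	after := benchHit(loadLazy)
	fmt.Printf("before: %d allocs/op, %d B/op, %d ns/op\n",
		before.AllocsPerOp(), before.AllocedBytesPerOp(), before.NsPerOp())
	fmt.Printf("after:  %d allocs/op, %d B/op, %d ns/op\n",
		after.AllocsPerOp(), after.AllocedBytesPerOp(), after.NsPerOp())
}
```

Because the request escapes on the miss path, the eager version has to heap-allocate it on every call, including hits. The toy model has no tracer, so it does not reproduce the one remaining allocation reported for the real hit path.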
The remaining allocation on the hit path is the no-op `finish` closure returned by `NoopTracer.TraceLoad`: a fresh `func(Thunk[V]){}` is created per call, which isn't free inside a generic function. Happy to file that separately if of interest.
The full package test suite passes with the change. Fix proposed in #123.