Skip to content

🐛 bug: remove unbounded background goroutines from logger, cache, and internal storage memory#4378

Open
Copilot wants to merge 17 commits into
mainfrom
copilot/fix-goroutine-leaks
Open

🐛 bug: remove unbounded background goroutines from logger, cache, and internal storage memory#4378
Copilot wants to merge 17 commits into
mainfrom
copilot/fix-goroutine-leaks

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 28, 2026

Description

This change removes per-instance background goroutine patterns from logger and cache, while preserving the background GC in internal/storage/memory and making it cleanly stoppable through the storage’s existing lifecycle.

  • Logger

    • Replaced the per-middleware ${time} updater with shared timestamp state keyed by TimeFormat, timezone, and TimeInterval.
    • Reuses one background updater per unique logger timestamp configuration instead of a central scheduler or one goroutine per middleware instance.
    • Preserves configured formatting, timezone, and sub-second interval semantics by refreshing a shared pre-formatted timestamp value.
    • Keeps the request path to a single atomic load by storing the loaded value into per-request Data.Timestamp as a plain string.
  • Cache

    • Removed the per-instance cache timestamp updater.
    • Aligned the hot path with the repo-wide coarse clock in gofiber/utils by starting the shared timestamp updater in New() and using utils.Timestamp() for request-time second reads.
    • Falls back to an exact time.Now() read only near expiration and Age boundary cases so cache TTL and Age behavior stay unchanged while avoiding the per-request wall-clock call in the common case.
    • Keeps the timestamp read before taking the cache mutex to avoid extra work inside the critical section.
  • Internal storage memory

    • Kept the background GC loop in internal/storage/memory.
    • Made shutdown safe with a buffered done channel and an idempotent Close() so storage instances can stop their GC goroutine without blocking or double-close issues.
  • Tests

    • Added focused coverage for logger timestamp refresh after the configured interval.
    • Added focused coverage for shared logger timestamp state reuse across matching middleware configurations.
    • Reworked the logger timestamp test to use require.Eventually instead of a fixed sleep for more reliable CI behavior.
    • Added coverage that repeated internal/storage/memory.Close() calls do not block.
    • Added focused cache helper coverage for the exact-timestamp fallback guards.
// cache now uses the shared coarse clock on the hot path
utils.StartTimeStampUpdater()
ts := uint64(utils.Timestamp())

Changes introduced

List the new features or adjustments introduced in this pull request. Provide details on benchmarks, documentation updates, changelog entries, and if applicable, the migration guide.

  • Benchmarks: Re-ran the existing logger and cache benchmarks locally to confirm the cache hot path stays aligned with the shared timestamp updater and the simplified logger follow-up keeps the ${time} request path at a single atomic load without regressing behavior-sensitive paths.
  • Documentation Update: Detail the updates made to the documentation and links to the changed files.
  • Changelog/What's New: Remove unbounded per-instance background goroutines from logger and cache, simplify logger ${time} to shared keyed timestamp state instead of a central scheduler, make internal/storage/memory GC shutdown safe with a buffered stop signal and idempotent Close(), and align cache with the repo-wide shared timestamp updater while preserving expiration and Age semantics with exact-time fallback guards.
  • Migration Guide: If necessary, provide a guide or steps for users to migrate their existing code to accommodate these changes.
  • API Alignment with Express: Explain how the changes align with the Express API.
  • API Longevity: The change is internal-only and avoids introducing new public lifecycle APIs while fixing leaked goroutine and shutdown behavior within existing middleware/storage lifecycles.
  • Examples: Logger ${time} now uses shared keyed timestamp state with one atomic load per request, cache uses utils.Timestamp() on the hot path with exact-time fallback near expiration boundaries, and internal/storage/memory continues to reclaim expired entries in the background and now stops cleanly via Close().

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • Enhancement (improvement to existing features and functionality)
  • Documentation update (changes to documentation)
  • Performance improvement (non-breaking change which improves efficiency)
  • Code consistency (non-breaking change which improves code reliability and robustness)

Checklist

Before you submit your pull request, please make sure you meet these requirements:

  • Followed the inspiration of the Express.js framework for new functionalities, making them similar in usage.
  • Conducted a self-review of the code and provided comments for complex or critical parts.
  • Updated the documentation in the /docs/ directory for Fiber's documentation.
  • Added or updated unit tests to validate the effectiveness of the changes or new features.
  • Ensured that new and existing unit tests pass locally with the changes.
  • Verified that any new dependencies are essential and have been agreed upon by the maintainers/community.
  • Aimed for optimal performance with minimal allocations in the new code.
  • Provided benchmarks for the new code to analyze and improve upon.

Commit formatting

Please use emojis in commit messages for an easy way to identify the purpose or intention of a commit. Check out the emoji cheatsheet here: CONTRIBUTING.md

Copilot AI changed the title [WIP] Fix goroutine leaks in background processes 🐛 bug: remove unbounded background goroutines from logger, cache, and internal memory May 28, 2026
Copilot AI requested a review from ReneWerner87 May 28, 2026 20:36
@ReneWerner87 ReneWerner87 added this to v3 May 29, 2026
@ReneWerner87 ReneWerner87 added this to the v3 milestone May 29, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 29, 2026

Codecov Report

❌ Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.34%. Comparing base (57cf8d2) to head (2c37ea6).

Files with missing lines Patch % Lines
middleware/logger/logger.go 89.47% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4378      +/-   ##
==========================================
- Coverage   91.40%   91.34%   -0.06%     
==========================================
  Files         132      132              
  Lines       13120    13125       +5     
==========================================
- Hits        11992    11989       -3     
- Misses        711      717       +6     
- Partials      417      419       +2     
Flag Coverage Δ
unittests 91.34% <90.90%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ReneWerner87
Copy link
Copy Markdown
Member

@copilot Thanks, this does remove the goroutine leak, but please reconsider the mechanism per case instead of removing all three goroutines uniformly.

Logger + cache timestamp: the original bug was N goroutines for N middleware instances. The cleanest fix is a single process-wide coarse clock: one background updater started once via sync.Once, shared by all instances, refreshing an atomically stored timestamp about once per second. Per-request cost stays a single atomic load (cheaper than calling time.Now() on every request), and there is no per-instance goroutine to leak. A fully lazy time.Now()-per-request also works, but the singleton coarse clock keeps the hot path cheapest. Either is acceptable. Please do NOT add a Close() here: middleware constructors only return fiber.Handler and there is no lifecycle hook that would ever call it, so a Close() would be dead code.

internal/storage/memory GC: this case is different. That Storage already has a real lifecycle (Close() exists), so keep the GC goroutine and make it cleanly stoppable instead of removing it: a buffered done channel plus an idempotent Close() that guards against double close and against sending to a goroutine that has already exited (this is exactly the deadlock called out as finding #5 in #4364). The fully lazy prune you have now means an idle store never reclaims expired entries until the next Get/Set, which is a semantic regression for the default storage.

Summary: coarse-clock singleton (or lazy) for logger/cache, stoppable goroutine (buffered done + idempotent Close) for the memory storage GC.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented May 29, 2026

@copilot Thanks, this does remove the goroutine leak, but please reconsider the mechanism per case instead of removing all three goroutines uniformly.

Logger + cache timestamp: the original bug w...

Addressed in b7a14fa (plus 8cef4d3 for the follow-up field alignment). Logger/cache stay on the lazy request-time path, internal/memory was reverted, and internal/storage/memory now keeps its GC goroutine with a buffered done channel and idempotent Close() coverage.

Copilot AI changed the title 🐛 bug: remove unbounded background goroutines from logger, cache, and internal memory 🐛 bug: remove unbounded background goroutines from logger, cache, and internal storage memory May 29, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes per-middleware timestamp updater goroutines from logger and cache paths, while making the internal memory storage GC lifecycle safer through idempotent close behavior.

Changes:

  • Logger timestamps now refresh lazily during request handling after the configured interval.
  • Cache expiration/Age logic now reads current Unix seconds on demand instead of using a ticker-maintained timestamp.
  • Internal memory storage Close() is now idempotent and non-blocking, with focused regression tests added.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
middleware/logger/logger.go Replaces background timestamp updater with lazy atomic refresh logic.
middleware/logger/logger_test.go Adds coverage for ${time} refreshing after TimeInterval.
middleware/cache/cache.go Removes cache timestamp ticker and computes timestamps at request/store time.
internal/storage/memory/memory.go Adds sync.Once and buffered stop signaling for safe repeated Close() calls.
internal/storage/memory/memory_test.go Adds regression coverage for non-blocking repeated Close().

@gaby
Copy link
Copy Markdown
Member

gaby commented Jun 1, 2026

@copilot unit tests are failing.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Comment thread middleware/logger/logger.go Outdated
Comment thread middleware/cache/cache.go Outdated
Comment thread middleware/logger/logger_test.go Outdated
Comment thread middleware/logger/logger.go Outdated
Comment thread middleware/cache/cache.go Outdated
Comment thread middleware/logger/logger_test.go Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@middleware/logger/logger.go`:
- Around line 52-59: The refresh logic advances nextTimestampUpdate before
storing the new formatted time, which can let a concurrent caller skip refresh
and read the stale value; fix by reordering so the timestamp is published first
and then the deadline is advanced: call
timestamp.Store(now.In(cfg.timeZoneLocation).Format(cfg.TimeFormat)) before
performing nextTimestampUpdate.CompareAndSwap(...). Apply the same reorder to
the other refresh block (the section around lines 123-126) where
nextTimestampUpdate is advanced prior to storing the timestamp.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2543a7bb-413a-4767-b5f0-808f375359d6

📥 Commits

Reviewing files that changed from the base of the PR and between 0283e18 and 4ce22bb.

📒 Files selected for processing (5)
  • internal/storage/memory/memory.go
  • internal/storage/memory/memory_test.go
  • middleware/cache/cache.go
  • middleware/logger/logger.go
  • middleware/logger/logger_test.go

Comment thread middleware/logger/logger.go Outdated
@ReneWerner87
Copy link
Copy Markdown
Member

@copilot check the review hints and solve the comments, after this you mark the comment as resolved

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Jun 1, 2026

@copilot check the review hints and solve the comments, after this you mark the comment as resolved

Addressed in 4bacbcb: logger now uses a monotonic time.Time refresh deadline, cache computes ts before locking, and the logger interval test now uses require.Eventually. Targeted go test -race -count=1 ./middleware/logger ./middleware/cache ./internal/storage/memory passed, and final validation completed with no new review or CodeQL findings.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

@ReneWerner87
Copy link
Copy Markdown
Member

@copilot follow up on the cache/logger timestamp choice. The lazy time.Now() per request is correct but it is the slowest of the three options we discussed, and it is also inconsistent with how the rest of this repo already solves the same problem.

The codebase already has the coarse-clock-singleton primitive in gofiber/utils:

// gofiber/utils/time.go
func Timestamp() uint32                  // ~1 ns atomic load
func StartTimeStampUpdater()             // single process-wide goroutine via sync.Once, idempotent
func StopTimeStampUpdater()              // safe to call repeatedly

Existing call sites in this repo that already use it:

  • middleware/limiter/limiter_fixed.go:29 and limiter_sliding.go:31 -> utils.StartTimeStampUpdater() in the constructor, then utils.Timestamp() on the hot path.
  • internal/memory/memory.go:44 and internal/storage/memory/memory.go:45 (the file you already touch in this PR) -> same pattern.

That means the cache change in this PR is the only middleware that diverges from the established pattern. Please align it:

middleware/cache/cache.go

Replace the per-request safeUnixSeconds(time.Now()) with the same primitive used by the limiter:

// in New(), once per constructor:
utils.StartTimeStampUpdater()

// on the hot path (before taking the lock, as already done):
ts := uint64(utils.Timestamp())

Reason: atomic.Load is roughly 1 to 2 ns, time.Now() via vDSO clock_gettime is roughly 20 to 50 ns, and the cache TTL granularity is already in whole seconds, so there is zero behavior loss. It also keeps the cache consistent with limiter and the storage layer.

middleware/logger/logger.go

This one is a bit different because the logger needs the formatted timestamp string (TimeFormat + Location), not just unix seconds, and formatting is more expensive than the load. Two acceptable options:

  1. Mirror the same pattern locally in the logger package: one process-wide sync.Once-guarded goroutine that refreshes an atomic.Pointer[string] containing the pre-formatted timestamp at TimeInterval cadence, and the per-request path becomes a single atomic load.
  2. If you prefer to keep the current lazy refresh, at minimum drive the interval check from utils.Timestamp() instead of calling time.Now() twice per request, so the cheap path costs one atomic load instead of two time.Now() calls.

Option 1 is the better long-term answer because logger is genuinely a hot path. Option 2 is acceptable as a stopgap.

General guidance going forward

When you have a choice between equivalent correctness options on a hot path, please pick the cheapest one and look around the repo first to see if there is already an established primitive for it. In this codebase that means: check gofiber/utils for shared building blocks (timestamps, byte/string conversion, etc.) and check sibling middleware for the established pattern before introducing a new one. Performance is part of correctness for a web framework, especially in middleware that runs on every request.

After the change, please re-run go test -race ./middleware/cache ./middleware/logger ./internal/storage/memory and confirm benchmarks before marking the comment resolved.

@ReneWerner87
Copy link
Copy Markdown
Member

@copilot small follow up on the logger change, easy to fold into the Option 1 refactor:

The per-request line

data.Timestamp.Store(timestamp.Load())

adds an atomic Load + atomic Store on every request, where the old code did a plain field copy (data.Timestamp = timestamp). If data.Timestamp and timestamp are both atomic.Pointer[string], the old assignment was actually a govet copylocks smell, so I understand why this was changed, but the Load+Store on the hot path is the wrong fix.

When you restructure to the sync.Once + process-wide goroutine + shared atomic.Pointer[string] (Option 1), please also revisit data.Timestamp:

  • If data is per-request scratch (no concurrent reads from another goroutine), make data.Timestamp a plain string field and assign data.Timestamp = timestamp.Load() (one atomic load, no store, no second atomic).
  • If data is shared and consumers expect data.Timestamp to be an atomic.Pointer[string], expose the process-wide pointer directly (e.g. pass the shared pointer in data once during setup, not per request) so consumers can Load() it themselves without an extra Store on every request.

Goal: per-request timestamp cost should end up at exactly one atomic.Load, nothing more. Today it is time.Now() + Load + branch + (rare CAS + format) + Load + Store. After the refactor it should be just Load.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Jun 3, 2026

@copilot follow up on the cache/logger timestamp choice. The lazy time.Now() per request is correct but it is the slowest of the three options we discussed, and it is also inconsistent with ...

Addressed in 05dffa8 (with c9c6e6c).

Cache now starts the shared utils timestamp updater in New() and uses utils.Timestamp() on the request path, with an exact time.Now() fallback only when the coarse 1s clock would otherwise change expiration/Age behavior near cache boundaries. I also added focused helper coverage for those guard conditions.

Logger stays on the monotonic lazy refresh path because its default TimeInterval is still sub-second (500ms), so switching it to the repo-wide 1s coarse clock would change ${time} semantics. The current logger path still keeps the per-request work to a single time.Now() only when ${time} is enabled.

Re-ran go test -race ./middleware/cache ./middleware/logger ./internal/storage/memory and go test -run=^$ -bench='Benchmark_(Logger|Cache($|_))' -benchmem ./middleware/logger ./middleware/cache successfully.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Jun 3, 2026

@copilot small follow up on the logger change, easy to fold into the Option 1 refactor:

The per-request line...

Addressed in aa60782. Logger now uses a shared timestamp scheduler keyed by format/timezone/interval, request data carries a plain string so the hot path is a single atomic load, and go test -race ./middleware/cache ./middleware/logger ./internal/storage/memory plus go test -run=^$ -bench='Benchmark_(Logger|Cache($|_))' -benchmem ./middleware/logger ./middleware/cache passed locally.

@ReneWerner87
Copy link
Copy Markdown
Member

@copilot follow-up on the logger timestamp design.

The current timestamp.go (143 lines: timestampScheduler, sharedTimestamp, central goroutine with dynamic time.Timer, wake channel, atomic.Pointer[string], sync.Once bootstrap) is significantly more machinery than this problem needs. The core requirement from issue #4330 is "stop accumulating one goroutine per logger.New() call". The simplest way to meet that is to take the existing master goroutine and just deduplicate it by (format, zone, interval).

Suggested minimal design

// timestamp.go
package logger

import (
    "sync"
    "sync/atomic"
    "time"
)

type tsKey struct {
    format, zone string
    interval     time.Duration
}

var (
    tsMu    sync.Mutex
    tsCache = map[tsKey]*atomic.Value{}
)

func sharedTimestamp(format string, loc *time.Location, interval time.Duration) *atomic.Value {
    key := tsKey{format, loc.String(), interval}
    tsMu.Lock()
    defer tsMu.Unlock()
    if v, ok := tsCache[key]; ok {
        return v
    }
    v := &atomic.Value{}
    v.Store(time.Now().In(loc).Format(format))
    tsCache[key] = v
    go func() {
        for {
            time.Sleep(interval)
            v.Store(time.Now().In(loc).Format(format))
        }
    }()
    return v
}

And in New():

var timestamp *atomic.Value
if strings.Contains(cfg.Format, "${"+TagTime+"}") {
    timestamp = sharedTimestamp(cfg.TimeFormat, cfg.timeZoneLocation, cfg.TimeInterval)
}
// hot path:
data.Timestamp = timestamp.Load().(string)

(Or keep data.Timestamp as a plain string and assign timestamp.Load().(string) once per request, exactly as you already do.)

Why this is enough

  • Goroutine body is the same Sleep + Store as on master, just wrapped in a key-keyed factory.
  • Number of goroutines is bounded by the number of unique (format, zone, interval) tuples used in the whole process. In practice that is 1 - 3, regardless of how many times logger.New() is called.
  • go test -count=5 ./... no longer accumulates goroutines: calls 2 - 5 reuse the cached entry.
  • No central scheduler, no dynamic timer reset, no wake channel, no shared mutex iterated per tick.
  • About 20 lines instead of 143.

What this does not do (and that is OK)

Issue #4330 also asks for an explicit done-channel and Close()/Stop() method on the logger. That would require changing the middleware API (today New() returns a fiber.Handler, not a lifecycled object), so it is out of scope for this PR. The dedupe pattern eliminates the actual reproducible symptom (goroutines accumulating per New()-call), which is what the issue is fundamentally about. A real Close() API would be a separate proposal.

What to delete

  • timestamp.go (current 143-line scheduler) -> replace with the 20-line version above
  • sharedTimestamps singleton + timestampScheduler + newTimestampScheduler
  • Test_Logger_SharedTimestampState can stay as a sanity check that two New() calls with the same config share the same *atomic.Value, just simplified
  • Test_Logger_TimeUpdatesAfterInterval can stay

General principle: prefer the smallest change that fixes the reported symptom. A 143-line scheduler with dynamic timer juggling to manage at most 1 - 3 goroutines is the wrong shape of solution.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Jun 4, 2026

@copilot follow-up on the logger timestamp design.

The current timestamp.go (143 lines: timestampScheduler, sharedTimestamp, central goroutine with dynamic time.Timer, wake channel, `atomic...

Addressed in a763a54.

Logger timestamp sharing is now reduced to a keyed atomic.Value cache with one background updater per unique (format, zone, interval) tuple, and the request path stays at a single atomic load into Data.Timestamp. I also kept the shared-state and interval-refresh tests aligned with the simpler implementation, and re-ran go test -race ./middleware/cache ./middleware/logger ./internal/storage/memory plus go test -run=^$ -bench='Benchmark_(Logger|Cache($|_))' -benchmem ./middleware/logger ./middleware/cache successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

🐛 [Bug]: Three goroutine leaks — background goroutines with no shutdown mechanism

4 participants