Skip to content

fix: balance CFSocket context retain to prevent server leak (#7)#9

Merged
atom2ueki merged 2 commits into
mainfrom
fix/socket-context-refcount-leak
May 1, 2026
Merged

fix: balance CFSocket context retain to prevent server leak (#7)#9
atom2ueki merged 2 commits into
mainfrom
fix/socket-context-refcount-leak

Conversation

@atom2ueki
Copy link
Copy Markdown
Owner

Summary

Fixes #7. SwiftWebServer.listen(_:host:completion:) used
Unmanaged.passRetained(self) for the CFSocket context's info pointer
with both retain and release callbacks nil, so the +1 retain was
never balanced. The server could never deinit once listen() ran, even
after close().

This PR switches to passUnretained(self) and provides matching
retain / release callbacks. CFSocketCreate retains via retain per
socket, and CFSocketInvalidate releases via release per socket, so
both the IPv4 and IPv6 sockets contribute balanced retain/release pairs
against self.

Why this approach over the suggestion in the issue

The issue suggested using passRetained(self) plus a release callback
that calls Unmanaged<SwiftWebServer>.fromOpaque(ptr).release(). That
would over-release: a single passRetained produces +1 retain, but the
same context is registered with two CFSockets, and CFSocket calls the
release callback once per socket on invalidation — net result, two
releases against one retain. Pairing passUnretained with both
retain and release callbacks lets CFSocket manage the lifecycle
cleanly per-socket.

Test plan

  • swift build — clean
  • swift test — all 44 tests pass, including new regression
  • New testServerDeinitsAfterCloseDoesNotLeak: listens on an
    ephemeral loopback port, closes, asserts the server deinits via a
    weak reference (with a brief run-loop spin to let CF's deferred
    release fire).

Reviewers

Tagging @claude for review. Will also run /codex review on this PR.

Closes #7

`listen()` used `Unmanaged.passRetained(self)` for the CFSocket context's
`info` pointer with both `retain` and `release` callbacks set to nil, so
nothing balanced the +1 retain. Once `listen()` ran, the server could
never deinit — even after `close()`.

Switch to `passUnretained(self)` and provide matching `retain`/`release`
callbacks. CFSocketCreate now retains via `retain` per socket, and
CFSocketInvalidate releases via `release` per socket, so the IPv4 and
IPv6 sockets each contribute a balanced retain/release pair against
`self`.

Adds a regression test that listens on an ephemeral loopback port,
closes, and asserts the server deinits via a weak reference.
@atom2ueki
Copy link
Copy Markdown
Owner Author

@claude Please review this PR. Specific things to verify:

  1. The retain/release callback contract — am I correct that CFSocketCreate calls retain on the info pointer for each socket, and CFSocketInvalidate calls release once per socket?
  2. With passUnretained(self) + paired retain/release callbacks, is the ARC lifecycle correct on every code path: success (both IPv4+IPv6), partial-failure mid-startup (failStartup), and close()?
  3. Does the regression test reliably catch a regression? Spinning the run loop briefly should let CF's deferred release fire — is 1s too short / too long?

Repo context: Sources/SwiftWebServer/Core/SwiftWebServer.swift is the only source change. Tests are in Tests/SwiftWebServerTests/SwiftWebServerTests.swift.

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Claude finished @atom2ueki's task in 2m 21s —— View job


PR Review

Tasks

  • Gather context and read changed files
  • Verify CFSocket retain/release callback contract (question 1)
  • Verify ARC lifecycle on all code paths (question 2)
  • Evaluate regression test reliability (question 3)
  • Post full review

1. CFSocket retain/release callback contract

Your understanding is correct. CFSocketCreate calls the retain callback exactly once on the info pointer at creation time, and CFSocketInvalidate calls the release callback exactly once on info at invalidation time. This is confirmed by the CF open source; roughly:

// on create
if (context->retain) (context->retain)(context->info);

// on invalidate
if (context->release) (context->release)(context->info);

With two sockets (IPv4 + IPv6) sharing one context:

  • CFSocketCreate × 2 → retain × 2 → net +2
  • CFSocketInvalidate × 2 → release × 2 → net −2

Total: 0. Balanced.

Your explanation in the PR description of why the alternative (passRetained + release callback) over-releases is also correct: one passRetained → +1, two sockets call release → −2, net −1 → crash.


2. ARC lifecycle on every code path

Tracing each path with the current code:

Success (both IPv4 + IPv6)

Event Δ
passUnretained (no retain) 0
CFSocketCreate IPv4 → retain +1
CFSocketCreate IPv6 → retain +1
close()CFSocketInvalidate IPv4 → release −1
close()CFSocketInvalidate IPv6 → release −1
Net 0 ✓

failStartup — IPv4 ok, IPv6 create fails
IPv6 CFSocketCreate returns nil: the retain callback is never called for a failed creation. failStartup invalidates only ipv4cfsocket, triggering one release. Net: 0. ✓

failStartup — both sockets created, bind fails
Both sockets called retain. failStartup invalidates both → two release calls. Net: 0. ✓

Non-fatal IPv6 bind failure (falls back to IPv4-only) — lines 344–358

CFSocketInvalidate(ipv6cfsocket)
ipv6cfsocket = nil

IPv6 was created (→ retain +1), then immediately invalidated here (→ release −1). IPv4 is balanced by close(). Net: 0. ✓

close() called before listen()
Both socket ivars are nil; the nil-guards in close() short-circuit everything. Net: 0. ✓

Double close()
After the first close(), ivars are set to nil. Second call is a no-op. No double-release. ✓

One pre-existing issue (not introduced by this PR)Sources/SwiftWebServer/Core/SwiftWebServer.swift:219–221:

let unsafeIntTrue = withUnsafePointer(to: intTrue) { truePointer in
    return truePointer   // ← pointer escapes the closure
}

The returned truePointer is only valid inside the closure. After the closure returns, unsafeIntTrue is technically a dangling pointer. It works in practice because intTrue lives on the same stack frame, but it is undefined behavior in Swift's ownership model. Worth fixing separately with withUnsafePointer(to: intTrue) { setsockopt(..., $0, ...) } directly. Fix this →


3. Regression test reliability

The test is well-constructed and will reliably catch a regression.

Why it catches the old bug: If passRetained(self) with no release callback were restored, the retain fires once per socket but never a release. The CFSocket's internal hold on info is never dropped, so weakServer stays non-nil through the entire 1-second spin and XCTAssertNil fails. The detection is deterministic.

Timing — is 1 second right? The actual deinit should happen synchronously: CFSocketInvalidate fires release in-line (not deferred to the run loop), and the autoreleasepool {} drains immediately when the block closes. So weakServer should become nil before the spin loop even starts. The 1-second spin is purely a safety margin. Since the loop exits early on nil, the passing case runs in microseconds. 1 second is fine — generous, but not wasteful.

One observation: The test uses host: "127.0.0.1" (IPv4 only, one socket). The dual-stack path (two sockets, two retain/release pairs) is not exercised. For the specific regression being tested this is sufficient, but you could add a second variant with host: "localhost" or host: nil to cover the "shared context, two sockets" case. Not a blocker.


Verdict

Approve. The retain/release analysis is correct, all code paths are balanced, and the regression test will reliably catch a reintroduction of the leak. The pre-existing unsafeIntTrue pointer-escape is worth a follow-up fix but is out of scope for this PR.

Two follow-ups from Claude's review:

1. The pre-existing `withUnsafePointer(to: intTrue) { return truePointer }`
   pattern escapes the pointer past the closure scope (UB in Swift's
   ownership model). Move both setsockopt calls inside a single
   `withUnsafePointer(to: &intTrue)` block so the pointer never escapes.

2. The original regression test only exercised the IPv4-only path (one
   CFSocket → one retain/release pair). Refactor it into a helper and
   add a `host: "localhost"` variant so the dual-stack case (two
   sockets sharing one context → two retain/release pairs) is also
   covered — that's the path the issue called out as the worst-case
   leak.
@atom2ueki
Copy link
Copy Markdown
Owner Author

@claude Pushed a follow-up that addresses both suggestions from your review:

  1. SO_REUSEADDR setsockopt calls are now inside a single withUnsafePointer(to: &intTrue) block (no pointer escape).
  2. Renamed the regression test into a helper assertServerDeinitsAfterClose(host:) and added a dual-stack variant via host: "localhost" so the two-sockets-sharing-one-context case is exercised.

Could you re-validate? If clean, I'll merge.

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Claude finished @atom2ueki's task in 1m 14s —— View job


Re-validation

Tasks

  • Read follow-up trigger comment
  • Validate SO_REUSEADDR setsockopt pointer-escape fix (SwiftWebServer.swift)
  • Validate refactored dual-stack regression test (SwiftWebServerTests.swift)
  • swift build + swift test — blocked (tool requires user approval in this environment)
  • Post final verdict

1. SO_REUSEADDR pointer-escape fix — ✅ Correct

Before (UB — pointer escapes closure):

let intTrue: UInt32 = 1
let unsafeIntTrue = withUnsafePointer(to: intTrue) { truePointer in
    return truePointer   // dangling after closure returns
}
// ... used later at setsockopt call sites

After (Sources/SwiftWebServer/Core/SwiftWebServer.swift:299–307):

var intTrue: UInt32 = 1
withUnsafePointer(to: &intTrue) { ptr in
    let optlen = socklen_t(MemoryLayout<UInt32>.size)
    if ipv4cfsocket != nil {
        setsockopt(CFSocketGetNative(ipv4cfsocket), SOL_SOCKET, SO_REUSEADDR, ptr, optlen)
    }
    if ipv6cfsocket != nil {
        setsockopt(CFSocketGetNative(ipv6cfsocket), SOL_SOCKET, SO_REUSEADDR, ptr, optlen)
    }
}

Both setsockopt calls execute inside the closure — ptr is valid for their entire lifetime and never escapes. The var is required for the & borrow sigil. This is the canonical correct pattern. ✓


2. Dual-stack regression test — ✅ Correct

The refactor is clean:

  • assertServerDeinitsAfterClose(host:file:line:) is a private helper with proper #filePath/#line propagation, so XCTest failure messages will point to the individual call site (not the helper). ✓
  • testServerDeinitsAfterClose_IPv4Only()host: "127.0.0.1" — one CFSocket, one retain/release pair. ✓
  • testServerDeinitsAfterClose_DualStackLoopback()host: "localhost" — two CFSockets sharing one context, two retain/release pairs — this is the dual-stack path the issue was most concerned about. ✓

The helper body is unchanged from the original, so its reliability characteristics (deterministic catch of the old bug, 1s run-loop spin as safety margin, early exit on nil) remain as analyzed in my first review.

One minor note: host: "localhost" resolves to both ::1 (IPv6) and 127.0.0.1 (IPv4) on most platforms, which is the intended dual-stack behavior. In a CI sandbox that has IPv6 disabled, the IPv6 socket creation will fail gracefully (the code already handles this with a fallback to IPv4-only), and the test will still pass — just exercising the single-socket path instead. This is acceptable behavior; adding an explicit note in the test doc comment would be nice but is not a blocker.


Verdict

Both changes are correct. Ship it. The pointer-escape UB is eliminated, and the test suite now exercises both the single-socket and dual-socket retain/release paths.

Note: I was unable to run swift build / swift test locally in this run — the CI workflow run linked below will be the authoritative signal that the suite passes green.

@atom2ueki atom2ueki merged commit 59dd507 into main May 1, 2026
@atom2ueki atom2ueki deleted the fix/socket-context-refcount-leak branch May 1, 2026 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refcount leak: Unmanaged.passRetained(self) in listen() never released

1 participant