perf(broadcast): build PreparedMessage once per broadcast, not per re…#153
Open
tomi77 wants to merge 4 commits into
Open
perf(broadcast): build PreparedMessage once per broadcast, not per re…#153tomi77 wants to merge 4 commits into
tomi77 wants to merge 4 commits into
Conversation
…cipient Today every recipient of a broadcast pays the cost of ws.NewPreparedMessage / webtransport.NewPreparedMessage even though the frame bytes are identical for the entire broadcast. With N matching sockets that is N redundant frame allocations and an N-fold loss of the per-frame cache that gorilla/websocket keeps inside PreparedMessage. Introduce an opaque, transport-keyed, build-once cache (packet.PreparedFrame). The default adapter allocates one cache per broadcast in _encode (next to where it sets WsPreEncodedFrame) and threads it through Options. The websocket and webtransport transports build their PreparedMessage on first send and reuse the cached value for every remaining recipient. Backward compatible: when PreparedFrame is nil (unicast emits, custom adapters that don't set it, anything outside the broadcast path) the behavior is identical. The cache is allocated only inside the existing single-string fast path, so binary and polling paths are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hot path Two follow-ups to the per-broadcast PreparedMessage cache that target the remaining per-recipient allocations on the broadcast hot path: 1. types.Buffer.View() — a sibling of Clone() that shares the underlying byte slice and only allocates the small wrapper struct. The adapter's broadcast path is read-only over the encoded payload, so an independent read offset is enough to keep recipients isolated. 2. Use it from socket.io's Client.WriteToEngine when WsPreEncodedFrame is set (i.e., the adapter has pre-encoded a shared payload). Unicast emits and buffers without a View() implementation keep the existing Clone behavior. 3. Drop the defensive Options clone-and-normalize in engine.io's sendPacket. Options is read-only downstream — every transport already derives the effective compression flag from a possibly-nil Compress pointer. Polling's send loop is updated to treat a nil Options / Compress as the documented default (true) so behavior is preserved end-to-end. In broadcast benchmarks against a real fpserver feed (see profile in the upstream PR description), this removes ~17% of broadcast-path alloc_space (Buffer.Clone) and ~15% of alloc_objects (sendPacket Options clone), which translates to noticeably smoother CPU under fan-out bursts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8cc900d to
88d572f
Compare
Owner
|
I need more time to evaluate this PR. Please be patient. |
ef405a9 to
88db3de
Compare
zishang520
reviewed
May 8, 2026
zishang520
left a comment
Owner
There was a problem hiding this comment.
Looking forward to your response.
Comment on lines
+61
to
+80
| // View returns a Buffer that shares the underlying byte slice with b | ||
| // but has independent read state (off / lastRead). It is intended for | ||
| // hot paths like broadcast send, where the same encoded payload is | ||
| // consumed by many recipients concurrently. | ||
| // | ||
| // Callers MUST treat the returned view (and b itself, while any view | ||
| // is in flight) as read-only with respect to the underlying bytes — | ||
| // any write/grow/truncate on either side will race. Reads through | ||
| // independent views are safe because each view has its own off. | ||
| func (b *Buffer) View() *Buffer { | ||
| if b == nil { | ||
| return nil | ||
| } | ||
| return &Buffer{ | ||
| buf: b.buf, | ||
| off: b.off, | ||
| lastRead: b.lastRead, | ||
| } | ||
| } | ||
|
|
Owner
There was a problem hiding this comment.
View() returns a writable *Buffer sharing mutable state with no synchronization — the fix is to return a ReadableBuffer interface (read-only subset) so the compiler enforces the read-only contract instead of relying on documentation.
Comment 1 — View() return type (buffer-ext.go):
Introduce ReadableBuffer, a read-only subset of BufferInterface containing
only the read-side methods (io.Reader/Seeker/WriterTo/ByteScanner/
RuneScanner, Bytes, Stringer, Peek, Len, Size, Cap, Available, Next,
ReadBytes, ReadString). BufferInterface now embeds ReadableBuffer and adds
the write-side methods on top.
(*Buffer).View(), (*BytesBuffer).View(), and (*StringBuffer).View() now
return ReadableBuffer instead of *Buffer / BufferInterface. The compiler
therefore prevents callers from invoking write/mutate methods (Write,
Reset, Truncate, Grow, …) on a view whose underlying byte slice is shared
with the original — enforcing the read-only contract at compile time
rather than through documentation alone.
bufferViewer in client.go is updated to match, and the local data variable
in WriteToEngine is narrowed to io.Reader (which both ReadableBuffer and
BufferInterface satisfy), so no type assertion is needed at the call site.
Comment 2 — nil Options in sendPacket (socket.go):
Expand the comment to name the exact nil-guard in each transport:
websocket / webtransport: `if packet.Options != nil` before every field
access, so nil passes through safely.
polling: `opt == nil || opt.Compress == nil || *opt.Compress` in send(),
and `options != nil && options.Compress != nil` in DoWrite() —
both short-circuit correctly on nil.
The previous defensive clone-and-normalize (always producing a non-nil
Options with a non-nil Compress) is therefore no longer needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
…cipient
Today every recipient of a broadcast pays the cost of ws.NewPreparedMessage / webtransport.NewPreparedMessage even though the frame bytes are identical for the entire broadcast. With N matching sockets that is N redundant frame allocations and an N-fold loss of the per-frame cache that gorilla/websocket keeps inside PreparedMessage.
Introduce an opaque, transport-keyed, build-once cache (packet.PreparedFrame). The default adapter allocates one cache per broadcast in _encode (next to where it sets WsPreEncodedFrame) and threads it through Options. The websocket and webtransport transports build their PreparedMessage on first send and reuse the cached value for every remaining recipient.
Backward compatible: when PreparedFrame is nil (unicast emits, custom adapters that don't set it, anything outside the broadcast path) the behavior is identical. The cache is allocated only inside the existing single-string fast path, so binary and polling paths are unaffected.