internal/socket: reuse closure in Recv/SendMmsg #126

Closed

@matzf matzf commented Jan 26, 2022

The closure for the callback to RawConn.Read/Write is responsible for
multiple allocations per call to RecvMmsg and SendMmsg.
The batched read and write are used primarily to avoid per-call
overhead, so any such overhead negates the advantage of using these
functions.

This change introduces a struct type holding all the variables
captured by the closure passed to RawConn.Read/Write. The struct is
reused to amortize the allocations by means of a sync.Pool.
A suitable global sync.Pool instance already existed, for buffers used
to pack mmsg headers.
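
As a rough illustration of the approach described above (not the code in this change), the sketch below keeps the callback's state in a pooled struct and binds the method value once, when the pool creates the struct. The names `rawRead`, `readState`, `statePool` and `recvPooled` are hypothetical, and `rawRead` is only a stand-in for `RawConn.Read`.

```go
package main

import (
	"fmt"
	"sync"
)

// rawRead is a stand-in for syscall.RawConn.Read: it calls f with a file
// descriptor until f reports that it is done.
func rawRead(f func(fd uintptr) bool) error {
	for !f(0) {
	}
	return nil
}

// readState (hypothetical name) holds everything the callback needs, so no
// closure has to capture local variables on each call.
type readState struct {
	buf   []byte
	n     int
	bound func(fd uintptr) bool // method value, created once per struct
}

// callback plays the role of the syscall wrapper; here it only copies a
// fixed payload into the caller's buffer.
func (s *readState) callback(fd uintptr) bool {
	s.n = copy(s.buf, "payload")
	return true
}

var statePool = sync.Pool{
	New: func() interface{} {
		s := new(readState)
		s.bound = s.callback // allocated once, reused on every later call
		return s
	},
}

// recvPooled reads into buf using pooled state instead of a fresh closure.
func recvPooled(buf []byte) (int, error) {
	s := statePool.Get().(*readState)
	s.buf = buf
	err := rawRead(s.bound)
	n := s.n
	s.buf, s.n = nil, 0 // do not retain the caller's buffer in the pool
	statePool.Put(s)
	return n, err
}

func main() {
	buf := make([]byte, 16)
	n, err := recvPooled(buf)
	fmt.Println(n, err, string(buf[:n])) // prints: 7 <nil> payload
}
```

With a real `RawConn` the callback is passed through an interface method, so the compiler generally cannot keep a per-call closure and its captured variables on the stack; that is the cost a pooled struct is meant to amortize. In this toy version the compiler may optimize differently, so it shows the shape of the technique rather than its measured effect.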

With this change, WriteBatch can reuse all of its allocations. In ReadBatch,
only the returned net.Addr instances must still be allocated for each
message; avoiding that would require fundamental changes to the
package interface.

```
name             old time/op    new time/op    delta
UDP/Batch-1-8      5.34µs ± 1%    5.40µs ± 3%     ~     (p=0.173 n=8+10)
UDP/Batch-2-8      9.74µs ± 1%    9.24µs ± 9%   -5.21%  (p=0.035 n=9+10)
UDP/Batch-4-8      16.2µs ± 4%    16.2µs ± 1%     ~     (p=0.758 n=9+7)
UDP/Batch-8-8      30.0µs ± 4%    30.0µs ± 4%     ~     (p=0.971 n=10+10)
UDP/Batch-16-8     57.3µs ± 3%    60.9µs ±16%   +6.43%  (p=0.031 n=9+9)
UDP/Batch-32-8      115µs ± 5%     119µs ± 6%   +3.15%  (p=0.043 n=10+10)
UDP/Batch-64-8      234µs ±16%     237µs ± 4%     ~     (p=0.173 n=10+8)
UDP/Batch-128-8     447µs ± 4%     470µs ± 7%   +5.22%  (p=0.002 n=10+10)
UDP/Batch-256-8     960µs ±10%     966µs ±19%     ~     (p=0.853 n=10+10)
UDP/Batch-512-8    1.00ms ± 7%    0.99ms ± 7%     ~     (p=0.387 n=9+9)

name             old alloc/op   new alloc/op   delta
UDP/Batch-1-8        232B ± 0%       52B ± 0%  -77.59%  (p=0.000 n=10+10)
UDP/Batch-2-8        280B ± 0%      104B ± 0%  -62.86%  (p=0.000 n=10+10)
UDP/Batch-4-8        384B ± 0%      208B ± 0%  -45.83%  (p=0.000 n=10+10)
UDP/Batch-8-8        592B ± 0%      416B ± 0%  -29.73%  (p=0.000 n=10+10)
UDP/Batch-16-8     1.01kB ± 0%    0.83kB ± 0%  -17.46%  (p=0.000 n=10+10)
UDP/Batch-32-8     1.84kB ± 0%    1.66kB ± 0%   -9.57%  (p=0.002 n=8+10)
UDP/Batch-64-8     3.51kB ± 0%    3.33kB ± 0%   -5.00%  (p=0.000 n=10+8)
UDP/Batch-128-8    6.84kB ± 0%    6.66kB ± 0%   -2.57%  (p=0.001 n=7+7)
UDP/Batch-256-8    13.5kB ± 0%    13.3kB ± 0%   -1.33%  (p=0.000 n=10+10)
UDP/Batch-512-8    14.7kB ± 0%    14.5kB ± 0%   -1.19%  (p=0.000 n=8+8)

name             old allocs/op  new allocs/op  delta
UDP/Batch-1-8        8.00 ± 0%      2.00 ± 0%  -75.00%  (p=0.000 n=10+10)
UDP/Batch-2-8        10.0 ± 0%       4.0 ± 0%  -60.00%  (p=0.000 n=10+10)
UDP/Batch-4-8        14.0 ± 0%       8.0 ± 0%  -42.86%  (p=0.000 n=10+10)
UDP/Batch-8-8        22.0 ± 0%      16.0 ± 0%  -27.27%  (p=0.000 n=10+10)
UDP/Batch-16-8       38.0 ± 0%      32.0 ± 0%  -15.79%  (p=0.000 n=10+10)
UDP/Batch-32-8       70.0 ± 0%      64.0 ± 0%   -8.57%  (p=0.000 n=10+10)
UDP/Batch-64-8        134 ± 0%       128 ± 0%   -4.48%  (p=0.000 n=10+10)
UDP/Batch-128-8       262 ± 0%       256 ± 0%   -2.29%  (p=0.000 n=10+10)
UDP/Batch-256-8       518 ± 0%       512 ± 0%   -1.16%  (p=0.000 n=10+10)
UDP/Batch-512-8       562 ± 0%       556 ± 0%   -1.07%  (p=0.000 n=10+10)
```
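
One way to read the allocs/op numbers: the sketch below recomputes the delta column from the old and new values and prints the absolute saving per call. The figures are copied from the table above; the names `row`, `rows` and `deltaPercent` are made up for this illustration, and the delta formula is assumed to be the plain relative change of the two means.

```go
package main

import "fmt"

// row holds allocs/op before and after the change for one batch size,
// copied from the allocs/op table above.
type row struct {
	batch         int
	before, after float64
}

var rows = []row{
	{1, 8, 2}, {2, 10, 4}, {4, 14, 8}, {8, 22, 16}, {16, 38, 32},
	{32, 70, 64}, {64, 134, 128}, {128, 262, 256}, {256, 518, 512}, {512, 562, 556},
}

// deltaPercent is the relative change of after versus before, in percent
// (assumed here to match the table's delta column).
func deltaPercent(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	for _, r := range rows {
		fmt.Printf("batch %3d: saved %.0f allocs/op, delta %+.2f%%\n",
			r.batch, r.before-r.after, deltaPercent(r.before, r.after))
	}
}
```

Every row saves the same 6 allocations per call, independent of batch size, which is why the relative improvement shrinks as the batch grows and the per-message allocations dominate.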

Contributes to golang/go#26838

@gopherbot

This PR (HEAD: d1dda93) has been imported to Gerrit for code review.

Please visit https://go-review.googlesource.com/c/net/+/380934 to see it.

@gopherbot

Message from Ian Lance Taylor:

Patch Set 1: Run-TryBot+1


Please don’t reply on this GitHub thread. Visit golang.org/cl/380934.
After addressing review feedback, remember to publish your drafts!

@gopherbot

Message from Gopher Robot:

Patch Set 1:

(1 comment)


@gopherbot

Message from Gopher Robot:

Patch Set 1: TryBot-Result+1

(1 comment)


@gopherbot

Message from Ian Lance Taylor:

Patch Set 1: Code-Review+2

(1 comment)


@gopherbot

Message from Michael Knyszek:

Patch Set 1: Trust+1


gopherbot pushed a commit that referenced this pull request Jan 27, 2022
Change-Id: I16ecfc38dbb5a4d9b1ceacd1dd99fda38f346807
GitHub-Last-Rev: d1dda93
GitHub-Pull-Request: #126
Reviewed-on: https://go-review.googlesource.com/c/net/+/380934
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
@gopherbot

This PR is being closed because golang.org/cl/380934 has been merged.

@gopherbot gopherbot closed this Jan 27, 2022
@matzf matzf deleted the mmsg-closure-allocs branch January 28, 2022 08:09
WeiminShang added a commit to WeiminShang/net that referenced this pull request Nov 16, 2022
Change-Id: I16ecfc38dbb5a4d9b1ceacd1dd99fda38f346807
GitHub-Last-Rev: d1dda931f61bd08cab782fa50406574d5e227154
GitHub-Pull-Request: golang/net#126
Reviewed-on: https://go-review.googlesource.com/c/net/+/380934
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>