Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/binary: using Write to write a []byte causes unnecessary allocation #27757

Closed
dominikh opened this issue Sep 19, 2018 · 9 comments
Closed
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@dominikh
Copy link
Member

What version of Go are you using (go version)?

go version go1.11 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

go test -benchmem -bench=. https://play.golang.org/p/mkwTIJMlyJE

What did you expect to see?

I expect binary.Write to write the input []byte directly to the writer, without causing an allocation the size of the input

What did you see instead?

goos: linux
goarch: amd64
BenchmarkWrite-8   	    3000	    351964 ns/op	 2007084 B/op	       3 allocs/op
PASS
ok  	_/tmp	1.122s

Write has a fast path for simple types that it supports directly. At the beginning of the fast path, a buffer the size of the input value is allocated (1 byte for uint8, 2 bytes for uint16, ..., n bytes for []byte). In the specific case of []byte, however, this buffer is unnecessary. We don't even copy the data into it, we just set the variable to the input.

There is no direct need for using binary.Write with a known input of []byte since endianness doesn't matter and one could just say w.Write(b) – however. inputs may be dynamic, and it should probably be binary.Write that handles []byte specially, not every caller. A lot of people also use binary.Write on []byte for the sake of consistency with surrounding code.

/cc @griesemer

@dominikh dominikh added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Sep 19, 2018
@dominikh dominikh added this to the Go1.12 milestone Sep 19, 2018
@rsc
Copy link
Contributor

rsc commented Sep 19, 2018

If binary.Write took a ...interface{} then I could see adding a []byte special case. I'm really confused about why you care about a []byte special case for a single parameter though. Just call w.Write(data) instead of binary.Write(w, binary.BigEndian, data). It will not only allocate less, it will run faster since it doesn't have to convert the []byte to an interface{} and then type switch to find it back. I'm very skeptical this needs to be optimized.

But feel free to send a CL if it's important to you.

@rsc rsc added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels Sep 19, 2018
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Sep 19, 2018
@agnivade
Copy link
Contributor

After 2179e49, this has improved slightly.

goos: linux
goarch: amd64
pkg: stdtest
BenchmarkWriteBin-4   	   10000	    234153 ns/op	 2007072 B/op	       2 allocs/op
PASS

@josharian
Copy link
Contributor

IIUC, the request is to do this TODO:

2179e49#diff-e41aaf059e0604e922505a771fd5d190R292

This is a relatively easy first contribution.

@josharian josharian added the Suggested Issues that may be good for new contributors looking for work to do. label Oct 11, 2018
@gopherbot
Copy link

Change https://golang.org/cl/142938 mentions this issue: encoding/binary: avoid allocation when writing slice of uint8

@fmstephe
Copy link

fmstephe commented Oct 23, 2018

I would like to work on this issue.

If this is the right way of doing it. I have a PR which is almost ready. Should I wait for a reply here before I submit that?

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Oct 23, 2018

@fmstephe You can send the change without waiting for a reply. Thanks.

But in the comment stream I see that 6 days ago https://golang.org/cl/142938 was sent, so you should look at that change first and perhaps comment on that one if yours is better.

@fmstephe
Copy link

I must be blind :) It's right there above my comment.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.12, Unplanned Dec 11, 2018
shuLhan added a commit to shuLhan/go that referenced this issue Mar 6, 2019
Continuing CL 133135, this commit allocated backing slice only if type
of data to be written is not slice of uint8.

Also, since the size of backing slice is known for basic types, we merge
the call of intDataSize into one switch statement.

Fixes golang#27757

Benchmark stats with count=10 using Go +f563652399,

name                     old time/op    new time/op     delta
ReadSlice1000Int32s-8      5.83µs ± 6%     5.86µs ± 7%      ~     (p=0.475 n=48+49)
ReadStruct-8               1.39µs ± 4%     1.38µs ± 1%      ~     (p=0.553 n=48+43)
ReadInts-8                  336ns ± 1%      340ns ± 1%    +1.23%  (p=0.000 n=42+47)
WriteInts-8                 361ns ± 5%      344ns ± 2%    -4.87%  (p=0.000 n=49+47)
WriteSlice1000Int32s-8     5.80µs ± 3%     5.86µs ± 4%    +1.17%  (p=0.000 n=48+49)
WriteSlice1024Uint8-8       371ns ± 5%       74ns ± 5%   -80.01%  (p=0.000 n=49+50)
PutUint16-8                0.84ns ± 0%     0.84ns ± 2%    +0.54%  (p=0.000 n=43+46)
PutUint32-8                0.84ns ± 0%     0.82ns ± 2%    -2.96%  (p=0.000 n=49+47)
PutUint64-8                0.95ns ± 0%     0.95ns ± 2%    +0.38%  (p=0.000 n=45+42)
LittleEndianPutUint16-8    0.71ns ± 1%     0.65ns ± 5%    -9.46%  (p=0.000 n=50+48)
LittleEndianPutUint32-8    0.63ns ± 0%     0.64ns ± 5%    +1.65%  (p=0.000 n=44+50)
LittleEndianPutUint64-8    0.71ns ± 0%     0.63ns ± 3%   -10.82%  (p=0.000 n=40+50)
PutUvarint32-8             27.2ns ± 1%     27.2ns ± 1%      ~     (p=0.743 n=49+46)
PutUvarint64-8             67.5ns ± 0%     67.5ns ± 0%      ~     (p=0.293 n=50+40)

name                     old speed      new speed       delta
ReadSlice1000Int32s-8     686MB/s ± 5%    683MB/s ± 6%      ~     (p=0.468 n=48+49)
ReadStruct-8             54.0MB/s ± 4%   54.4MB/s ± 1%      ~     (p=0.546 n=48+43)
ReadInts-8               89.3MB/s ± 1%   88.1MB/s ± 2%    -1.32%  (p=0.000 n=42+50)
WriteInts-8              82.9MB/s ± 5%   87.1MB/s ± 2%    +5.08%  (p=0.000 n=49+47)
WriteSlice1000Int32s-8    690MB/s ± 3%    682MB/s ± 4%    -1.15%  (p=0.000 n=48+49)
WriteSlice1024Uint8-8    2.76GB/s ± 4%  13.83GB/s ± 5%  +401.42%  (p=0.000 n=48+50)
PutUint16-8              2.37GB/s ± 0%   2.37GB/s ± 1%    -0.35%  (p=0.000 n=44+45)
PutUint32-8              4.75GB/s ± 0%   4.91GB/s ± 1%    +3.43%  (p=0.000 n=49+46)
PutUint64-8              8.43GB/s ± 1%   8.40GB/s ± 2%    -0.46%  (p=0.000 n=45+43)
LittleEndianPutUint16-8  2.80GB/s ± 0%   3.09GB/s ± 5%   +10.45%  (p=0.000 n=49+48)
LittleEndianPutUint32-8  6.33GB/s ± 0%   6.24GB/s ± 4%    -1.41%  (p=0.000 n=44+49)
LittleEndianPutUint64-8  11.2GB/s ± 1%   12.6GB/s ± 1%   +12.52%  (p=0.000 n=46+46)
PutUvarint32-8            147MB/s ± 1%    147MB/s ± 0%      ~     (p=0.947 n=49+43)
PutUvarint64-8            119MB/s ± 0%    119MB/s ± 0%      ~     (p=0.338 n=49+46)

name                     old alloc/op   new alloc/op    delta
ReadSlice1000Int32s-8      4.13kB ± 0%     4.13kB ± 0%      ~     (all equal)
ReadStruct-8                 200B ± 0%       200B ± 0%      ~     (all equal)
ReadInts-8                  32.0B ± 0%      32.0B ± 0%      ~     (all equal)
WriteInts-8                 64.0B ± 0%      64.0B ± 0%      ~     (all equal)
WriteSlice1000Int32s-8     4.13kB ± 0%     4.13kB ± 0%      ~     (all equal)
WriteSlice1024Uint8-8      1.06kB ± 0%     0.03kB ± 0%   -96.97%  (p=0.000 n=50+50)
PutUint16-8                 0.00B           0.00B           ~     (all equal)
PutUint32-8                 0.00B           0.00B           ~     (all equal)
PutUint64-8                 0.00B           0.00B           ~     (all equal)
LittleEndianPutUint16-8     0.00B           0.00B           ~     (all equal)
LittleEndianPutUint32-8     0.00B           0.00B           ~     (all equal)
LittleEndianPutUint64-8     0.00B           0.00B           ~     (all equal)
PutUvarint32-8              0.00B           0.00B           ~     (all equal)
PutUvarint64-8              0.00B           0.00B           ~     (all equal)

name                     old allocs/op  new allocs/op   delta
ReadSlice1000Int32s-8        2.00 ± 0%       2.00 ± 0%      ~     (all equal)
ReadStruct-8                 16.0 ± 0%       16.0 ± 0%      ~     (all equal)
ReadInts-8                   8.00 ± 0%       8.00 ± 0%      ~     (all equal)
WriteInts-8                  14.0 ± 0%       14.0 ± 0%      ~     (all equal)
WriteSlice1000Int32s-8       2.00 ± 0%       2.00 ± 0%      ~     (all equal)
WriteSlice1024Uint8-8        2.00 ± 0%       1.00 ± 0%   -50.00%  (p=0.000 n=50+50)
PutUint16-8                  0.00            0.00           ~     (all equal)
PutUint32-8                  0.00            0.00           ~     (all equal)
PutUint64-8                  0.00            0.00           ~     (all equal)
LittleEndianPutUint16-8      0.00            0.00           ~     (all equal)
LittleEndianPutUint32-8      0.00            0.00           ~     (all equal)
LittleEndianPutUint64-8      0.00            0.00           ~     (all equal)
PutUvarint32-8               0.00            0.00           ~     (all equal)
PutUvarint64-8               0.00            0.00           ~     (all equal)

Change-Id: I09a310d9b90a72e95fdaa8a3868f0acae7d69879
shuLhan added a commit to shuLhan/go that referenced this issue Mar 6, 2019
Continuing CL 133135, this commit allocated backing slice only if type
of data to be written is not slice of uint8.

Also, since the size of backing slice is known for basic types, we merge
the call of intDataSize into one switch statement.

Fixes golang#27757

Benchmark stats with count=10 using Go +f563652399,

name                     old time/op    new time/op     delta
ReadSlice1000Int32s-8      5.83µs ± 6%     5.86µs ± 7%      ~     (p=0.475 n=48+49)
ReadStruct-8               1.39µs ± 4%     1.38µs ± 1%      ~     (p=0.553 n=48+43)
ReadInts-8                  336ns ± 1%      340ns ± 1%    +1.23%  (p=0.000 n=42+47)
WriteInts-8                 361ns ± 5%      344ns ± 2%    -4.87%  (p=0.000 n=49+47)
WriteSlice1000Int32s-8     5.80µs ± 3%     5.86µs ± 4%    +1.17%  (p=0.000 n=48+49)
WriteSlice1024Uint8-8       371ns ± 5%       74ns ± 5%   -80.01%  (p=0.000 n=49+50)
PutUint16-8                0.84ns ± 0%     0.84ns ± 2%    +0.54%  (p=0.000 n=43+46)
PutUint32-8                0.84ns ± 0%     0.82ns ± 2%    -2.96%  (p=0.000 n=49+47)
PutUint64-8                0.95ns ± 0%     0.95ns ± 2%    +0.38%  (p=0.000 n=45+42)
LittleEndianPutUint16-8    0.71ns ± 1%     0.65ns ± 5%    -9.46%  (p=0.000 n=50+48)
LittleEndianPutUint32-8    0.63ns ± 0%     0.64ns ± 5%    +1.65%  (p=0.000 n=44+50)
LittleEndianPutUint64-8    0.71ns ± 0%     0.63ns ± 3%   -10.82%  (p=0.000 n=40+50)
PutUvarint32-8             27.2ns ± 1%     27.2ns ± 1%      ~     (p=0.743 n=49+46)
PutUvarint64-8             67.5ns ± 0%     67.5ns ± 0%      ~     (p=0.293 n=50+40)

name                     old speed      new speed       delta
ReadSlice1000Int32s-8     686MB/s ± 5%    683MB/s ± 6%      ~     (p=0.468 n=48+49)
ReadStruct-8             54.0MB/s ± 4%   54.4MB/s ± 1%      ~     (p=0.546 n=48+43)
ReadInts-8               89.3MB/s ± 1%   88.1MB/s ± 2%    -1.32%  (p=0.000 n=42+50)
WriteInts-8              82.9MB/s ± 5%   87.1MB/s ± 2%    +5.08%  (p=0.000 n=49+47)
WriteSlice1000Int32s-8    690MB/s ± 3%    682MB/s ± 4%    -1.15%  (p=0.000 n=48+49)
WriteSlice1024Uint8-8    2.76GB/s ± 4%  13.83GB/s ± 5%  +401.42%  (p=0.000 n=48+50)
PutUint16-8              2.37GB/s ± 0%   2.37GB/s ± 1%    -0.35%  (p=0.000 n=44+45)
PutUint32-8              4.75GB/s ± 0%   4.91GB/s ± 1%    +3.43%  (p=0.000 n=49+46)
PutUint64-8              8.43GB/s ± 1%   8.40GB/s ± 2%    -0.46%  (p=0.000 n=45+43)
LittleEndianPutUint16-8  2.80GB/s ± 0%   3.09GB/s ± 5%   +10.45%  (p=0.000 n=49+48)
LittleEndianPutUint32-8  6.33GB/s ± 0%   6.24GB/s ± 4%    -1.41%  (p=0.000 n=44+49)
LittleEndianPutUint64-8  11.2GB/s ± 1%   12.6GB/s ± 1%   +12.52%  (p=0.000 n=46+46)
PutUvarint32-8            147MB/s ± 1%    147MB/s ± 0%      ~     (p=0.947 n=49+43)
PutUvarint64-8            119MB/s ± 0%    119MB/s ± 0%      ~     (p=0.338 n=49+46)

name                     old alloc/op   new alloc/op    delta
ReadSlice1000Int32s-8      4.13kB ± 0%     4.13kB ± 0%      ~     (all equal)
ReadStruct-8                 200B ± 0%       200B ± 0%      ~     (all equal)
ReadInts-8                  32.0B ± 0%      32.0B ± 0%      ~     (all equal)
WriteInts-8                 64.0B ± 0%      64.0B ± 0%      ~     (all equal)
WriteSlice1000Int32s-8     4.13kB ± 0%     4.13kB ± 0%      ~     (all equal)
WriteSlice1024Uint8-8      1.06kB ± 0%     0.03kB ± 0%   -96.97%  (p=0.000 n=50+50)
PutUint16-8                 0.00B           0.00B           ~     (all equal)
PutUint32-8                 0.00B           0.00B           ~     (all equal)
PutUint64-8                 0.00B           0.00B           ~     (all equal)
LittleEndianPutUint16-8     0.00B           0.00B           ~     (all equal)
LittleEndianPutUint32-8     0.00B           0.00B           ~     (all equal)
LittleEndianPutUint64-8     0.00B           0.00B           ~     (all equal)
PutUvarint32-8              0.00B           0.00B           ~     (all equal)
PutUvarint64-8              0.00B           0.00B           ~     (all equal)

name                     old allocs/op  new allocs/op   delta
ReadSlice1000Int32s-8        2.00 ± 0%       2.00 ± 0%      ~     (all equal)
ReadStruct-8                 16.0 ± 0%       16.0 ± 0%      ~     (all equal)
ReadInts-8                   8.00 ± 0%       8.00 ± 0%      ~     (all equal)
WriteInts-8                  14.0 ± 0%       14.0 ± 0%      ~     (all equal)
WriteSlice1000Int32s-8       2.00 ± 0%       2.00 ± 0%      ~     (all equal)
WriteSlice1024Uint8-8        2.00 ± 0%       1.00 ± 0%   -50.00%  (p=0.000 n=50+50)
PutUint16-8                  0.00            0.00           ~     (all equal)
PutUint32-8                  0.00            0.00           ~     (all equal)
PutUint64-8                  0.00            0.00           ~     (all equal)
LittleEndianPutUint16-8      0.00            0.00           ~     (all equal)
LittleEndianPutUint32-8      0.00            0.00           ~     (all equal)
LittleEndianPutUint64-8      0.00            0.00           ~     (all equal)
PutUvarint32-8               0.00            0.00           ~     (all equal)
PutUvarint64-8               0.00            0.00           ~     (all equal)

Change-Id: I09a310d9b90a72e95fdaa8a3868f0acae7d69879
@agnivade
Copy link
Contributor

@dominikh - The CL sent to fix this was closed because folks didn't like the new make calls. Do you still want to keep this open and look for other ways to fix this ?

@dominikh
Copy link
Member Author

There's not a lot of enthusiasm for fixing this, and maybe it's easier for callers not to call binary.Write after all... I'll close the issue; we can reopen it if more people complain about this.

@golang golang locked and limited conversation to collaborators Jun 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

8 participants