Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: make encoding/binary.LittleEndian.* have negligible inlining cost on amd64 #42958

Open
josharian opened this issue Dec 2, 2020 · 11 comments

Comments

@josharian
Copy link
Contributor

@josharian josharian commented Dec 2, 2020

encoding/binary.LittleEndian.* typically compile down to a single instruction on amd64. Their inlining cost is very high. I suggest we add a hack to the inlining budget to consider them to be an intrinsic, for the purposes of cost. Probably the same for other platforms as well.

cc @mdempsky

@mvdan
Copy link
Member

@mvdan mvdan commented Dec 3, 2020

Are there any other such std APIs that we should teach the inliner about? For example, I imagine many math/bits APIs also compile to one or two amd64 instructions.

@cespare
Copy link
Contributor

@cespare cespare commented Dec 3, 2020

Unsafe conversions (see #42739).

@egonelbre
Copy link
Contributor

@egonelbre egonelbre commented Dec 3, 2020

Related issue (#42788)

@randall77
Copy link
Contributor

@randall77 randall77 commented Dec 3, 2020

Most of the math/bits APIs are semantically inlined, so they aren't really in the same bucket.

"semantically inlined" = the compiler treats these APIs specially. It throws away their Go implementation and uses a compiler-provided one. The good part is that it always optimizes (unless you give -N, I think?). The bad part is that if you copy the implementation into your own function, you don't get the optimizations.

encoding/binary is not semantically inlined. They just reduce to good code using standard optimizations. The good part of that is if you copy an encoding/binary body into your own function, the optimization still happens.

Not particularly relevant to this issue, other than to note that math/bits probably doesn't need to be included here.

@bcmills
Copy link
Member

@bcmills bcmills commented Dec 3, 2020

encoding/binary is not semantically inlined. They just reduce to good code using standard optimizations.

My understanding is that inlining cost is currently based on the function AST. Would it make sense to compute a “compiled inlining cost” and stash that somewhere (maybe in the export data) for more accurate cost estimates? (Is there already a separate issue for that, beyond #17566?)

@randall77
Copy link
Contributor

@randall77 randall77 commented Dec 3, 2020

I don't think there is a separate issue.
Yes, cost estimates based on AST size certainly suck.
We could record with each function the count or bytes of instructions that it compiles to, which would be better.
Within a package this can get tricky because we inline everything before we compile anything, but cross-package that works.

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Dec 3, 2020

Would it make sense to compute a “compiled inlining cost” and stash that somewhere (maybe in the export data) for more accurate cost estimates?

The long term aspiration is to do inlining after we're in SSA form and have had a chance to apply some initial optimizations. I think that's a ways off still though.

In the mean time, I think some ad-hoc special cases for notable standard library functions seems fine. Or maybe a std-only //go:inlinecost directive that allows use to override the inlining cost.

We could record with each function the count or bytes of instructions that it compiles to, which would be better.

That's an interesting idea.

@josharian
Copy link
Contributor Author

@josharian josharian commented Dec 3, 2020

We could record with each function the count or bytes of instructions that it compiles to, which would be better.

I’ve experimented with this.

It yields fairly drastically different inlining results—appends are suddenly crazy expensive, as is code that can panic, because the cold code paths count too. It’s probably still better, but I didn’t pursue it because of the inconsistency vs intra-package inlining.

@dsnet
Copy link
Member

@dsnet dsnet commented Dec 4, 2020

Whatever happens, I think we should avoid special casing just encoding/binary.LittleEndian.*. There are a number of equivalent functions that avoid the encoding/binary package as a dependency that may want to benefit from this as well (e.g., "compress/flate".load64)

@josharian
Copy link
Contributor Author

@josharian josharian commented Dec 25, 2020

@dsnet I hear you but special casing is easy and can happen ~immediately. Recognizing equivalent forms is definitely not easy in anything resembling the current inliner. These routines are used heavily in high performance code because it is one way to do something like vectorization. I'd rather not make the perfect to the enemy of the good here.

danderson added a commit to inetaf/netaddr that referenced this issue Dec 25, 2020
This allows IP manipulation operations to operate on IPs as two registers, which empirically
leads to significant speedups, in particular on OOO superscalars where the two halves can
be processed in parallel.

You might expect that we could keep the representation as [16]byte, and do a cycle of
BigEndian.Uint64+tweak+BigEndian.PutUint64, and that would compile down to efficient
code. Unfortunately, due to a variety of missing optimizations in the Go compiler,
that is not the case and that code turns into byte-wise operations. On the other hand,
converting to a uint64 pair at construction results in efficient construction (a pair of
MOVQ+BSWAPQ) and efficient twiddling operations (single-cycle arithmetic on 64-bit
pairs). See also:
 - golang/go#41663
 - golang/go#41684
 - golang/go#42958
 - The discussion in #63

name                              old time/op    new time/op    delta
StdIPv4-8                            146ns ± 2%     141ns ± 2%    -3.42%  (p=0.016 n=5+5)
IPv4-8                               120ns ± 1%     107ns ± 2%   -10.65%  (p=0.008 n=5+5)
IPv4_inline-8                        120ns ± 0%     118ns ± 1%    -1.67%  (p=0.016 n=4+5)
StdIPv6-8                            211ns ± 2%     215ns ± 1%    +2.18%  (p=0.008 n=5+5)
IPv6-8                               281ns ± 1%     252ns ± 1%   -10.19%  (p=0.008 n=5+5)
IPv4Contains-8                      11.8ns ± 4%     4.7ns ± 2%   -60.00%  (p=0.008 n=5+5)
ParseIPv4-8                         68.1ns ± 4%    78.8ns ± 1%   +15.74%  (p=0.008 n=5+5)
ParseIPv6-8                          419ns ± 1%     409ns ± 0%    -2.40%  (p=0.016 n=4+5)
StdParseIPv4-8                      73.7ns ± 1%    88.8ns ± 2%   +20.50%  (p=0.008 n=5+5)
StdParseIPv6-8                       132ns ± 2%     134ns ± 1%      ~     (p=0.079 n=5+5)
IPPrefixMasking/IPv4_/32-8          36.3ns ± 3%     4.8ns ± 4%   -86.72%  (p=0.008 n=5+5)
IPPrefixMasking/IPv4_/17-8          39.0ns ± 0%     4.8ns ± 3%   -87.78%  (p=0.008 n=5+5)
IPPrefixMasking/IPv4_/0-8           36.9ns ± 2%     4.8ns ± 4%   -87.07%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/128-8         32.7ns ± 1%     4.7ns ± 2%   -85.47%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/65-8          39.8ns ± 1%     4.7ns ± 1%   -88.13%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/0-8           40.7ns ± 1%     4.7ns ± 2%   -88.41%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/128-8     136ns ± 3%       5ns ± 2%   -96.53%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8      142ns ± 2%       5ns ± 1%   -96.65%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8       143ns ± 2%       5ns ± 3%   -96.67%  (p=0.008 n=5+5)
IPSetFuzz-8                         22.7µs ± 2%    16.4µs ± 2%   -27.84%  (p=0.008 n=5+5)

name                              old alloc/op   new alloc/op   delta
StdIPv4-8                            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv4-8                               0.00B          0.00B           ~     (all equal)
IPv4_inline-8                        0.00B          0.00B           ~     (all equal)
StdIPv6-8                            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv6-8                               16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv4Contains-8                       0.00B          0.00B           ~     (all equal)
ParseIPv4-8                          0.00B          0.00B           ~     (all equal)
ParseIPv6-8                          48.0B ± 0%     48.0B ± 0%      ~     (all equal)
StdParseIPv4-8                       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
StdParseIPv6-8                       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPPrefixMasking/IPv4_/32-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv4_/17-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv4_/0-8            0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/128-8          0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/65-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/0-8            0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_zone_/128-8     16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8      16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8       16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPSetFuzz-8                         2.60kB ± 0%    2.60kB ± 0%      ~     (p=1.000 n=5+5)

name                              old allocs/op  new allocs/op  delta
StdIPv4-8                             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv4-8                                0.00           0.00           ~     (all equal)
IPv4_inline-8                         0.00           0.00           ~     (all equal)
StdIPv6-8                             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv6-8                                1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv4Contains-8                        0.00           0.00           ~     (all equal)
ParseIPv4-8                           0.00           0.00           ~     (all equal)
ParseIPv6-8                           3.00 ± 0%      3.00 ± 0%      ~     (all equal)
StdParseIPv4-8                        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
StdParseIPv6-8                        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPPrefixMasking/IPv4_/32-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv4_/17-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv4_/0-8             0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/128-8           0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/65-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/0-8             0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_zone_/128-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8       1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8        1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPSetFuzz-8                           33.0 ± 0%      33.0 ± 0%      ~     (all equal)

Signed-off-by: David Anderson <dave@natulte.net>
danderson added a commit to inetaf/netaddr that referenced this issue Dec 26, 2020
This allows IP manipulation operations to operate on IPs as two registers, which empirically
leads to significant speedups, in particular on OOO superscalars where the two halves can
be processed in parallel.

You might expect that we could keep the representation as [16]byte, and do a cycle of
BigEndian.Uint64+tweak+BigEndian.PutUint64, and that would compile down to efficient
code. Unfortunately, due to a variety of missing optimizations in the Go compiler,
that is not the case and that code turns into byte-wise operations. On the other hand,
converting to a uint64 pair at construction results in efficient construction (a pair of
MOVQ+BSWAPQ) and efficient twiddling operations (single-cycle arithmetic on 64-bit
pairs). See also:
 - golang/go#41663
 - golang/go#41684
 - golang/go#42958
 - The discussion in #63

name                              old time/op    new time/op    delta
StdIPv4-8                            146ns ± 2%     141ns ± 2%    -3.42%  (p=0.016 n=5+5)
IPv4-8                               120ns ± 1%     107ns ± 2%   -10.65%  (p=0.008 n=5+5)
IPv4_inline-8                        120ns ± 0%     118ns ± 1%    -1.67%  (p=0.016 n=4+5)
StdIPv6-8                            211ns ± 2%     215ns ± 1%    +2.18%  (p=0.008 n=5+5)
IPv6-8                               281ns ± 1%     252ns ± 1%   -10.19%  (p=0.008 n=5+5)
IPv4Contains-8                      11.8ns ± 4%     4.7ns ± 2%   -60.00%  (p=0.008 n=5+5)
ParseIPv4-8                         68.1ns ± 4%    78.8ns ± 1%   +15.74%  (p=0.008 n=5+5)
ParseIPv6-8                          419ns ± 1%     409ns ± 0%    -2.40%  (p=0.016 n=4+5)
StdParseIPv4-8                      73.7ns ± 1%    88.8ns ± 2%   +20.50%  (p=0.008 n=5+5)
StdParseIPv6-8                       132ns ± 2%     134ns ± 1%      ~     (p=0.079 n=5+5)
IPPrefixMasking/IPv4_/32-8          36.3ns ± 3%     4.8ns ± 4%   -86.72%  (p=0.008 n=5+5)
IPPrefixMasking/IPv4_/17-8          39.0ns ± 0%     4.8ns ± 3%   -87.78%  (p=0.008 n=5+5)
IPPrefixMasking/IPv4_/0-8           36.9ns ± 2%     4.8ns ± 4%   -87.07%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/128-8         32.7ns ± 1%     4.7ns ± 2%   -85.47%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/65-8          39.8ns ± 1%     4.7ns ± 1%   -88.13%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/0-8           40.7ns ± 1%     4.7ns ± 2%   -88.41%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/128-8     136ns ± 3%       5ns ± 2%   -96.53%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8      142ns ± 2%       5ns ± 1%   -96.65%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8       143ns ± 2%       5ns ± 3%   -96.67%  (p=0.008 n=5+5)
IPSetFuzz-8                         22.7µs ± 2%    16.4µs ± 2%   -27.84%  (p=0.008 n=5+5)

name                              old alloc/op   new alloc/op   delta
StdIPv4-8                            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv4-8                               0.00B          0.00B           ~     (all equal)
IPv4_inline-8                        0.00B          0.00B           ~     (all equal)
StdIPv6-8                            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv6-8                               16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv4Contains-8                       0.00B          0.00B           ~     (all equal)
ParseIPv4-8                          0.00B          0.00B           ~     (all equal)
ParseIPv6-8                          48.0B ± 0%     48.0B ± 0%      ~     (all equal)
StdParseIPv4-8                       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
StdParseIPv6-8                       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPPrefixMasking/IPv4_/32-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv4_/17-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv4_/0-8            0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/128-8          0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/65-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/0-8            0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_zone_/128-8     16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8      16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8       16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPSetFuzz-8                         2.60kB ± 0%    2.60kB ± 0%      ~     (p=1.000 n=5+5)

name                              old allocs/op  new allocs/op  delta
StdIPv4-8                             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv4-8                                0.00           0.00           ~     (all equal)
IPv4_inline-8                         0.00           0.00           ~     (all equal)
StdIPv6-8                             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv6-8                                1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv4Contains-8                        0.00           0.00           ~     (all equal)
ParseIPv4-8                           0.00           0.00           ~     (all equal)
ParseIPv6-8                           3.00 ± 0%      3.00 ± 0%      ~     (all equal)
StdParseIPv4-8                        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
StdParseIPv6-8                        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPPrefixMasking/IPv4_/32-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv4_/17-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv4_/0-8             0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/128-8           0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/65-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/0-8             0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_zone_/128-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8       1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8        1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPSetFuzz-8                           33.0 ± 0%      33.0 ± 0%      ~     (all equal)

Signed-off-by: David Anderson <dave@natulte.net>
majst01 added a commit to majst01/netaddr that referenced this issue Dec 26, 2020
This allows IP manipulation operations to operate on IPs as two registers, which empirically
leads to significant speedups, in particular on OOO superscalars where the two halves can
be processed in parallel.

You might expect that we could keep the representation as [16]byte, and do a cycle of
BigEndian.Uint64+tweak+BigEndian.PutUint64, and that would compile down to efficient
code. Unfortunately, due to a variety of missing optimizations in the Go compiler,
that is not the case and that code turns into byte-wise operations. On the other hand,
converting to a uint64 pair at construction results in efficient construction (a pair of
MOVQ+BSWAPQ) and efficient twiddling operations (single-cycle arithmetic on 64-bit
pairs). See also:
 - golang/go#41663
 - golang/go#41684
 - golang/go#42958
 - The discussion in inetaf#63

name                              old time/op    new time/op    delta
StdIPv4-8                            146ns ± 2%     141ns ± 2%    -3.42%  (p=0.016 n=5+5)
IPv4-8                               120ns ± 1%     107ns ± 2%   -10.65%  (p=0.008 n=5+5)
IPv4_inline-8                        120ns ± 0%     118ns ± 1%    -1.67%  (p=0.016 n=4+5)
StdIPv6-8                            211ns ± 2%     215ns ± 1%    +2.18%  (p=0.008 n=5+5)
IPv6-8                               281ns ± 1%     252ns ± 1%   -10.19%  (p=0.008 n=5+5)
IPv4Contains-8                      11.8ns ± 4%     4.7ns ± 2%   -60.00%  (p=0.008 n=5+5)
ParseIPv4-8                         68.1ns ± 4%    78.8ns ± 1%   +15.74%  (p=0.008 n=5+5)
ParseIPv6-8                          419ns ± 1%     409ns ± 0%    -2.40%  (p=0.016 n=4+5)
StdParseIPv4-8                      73.7ns ± 1%    88.8ns ± 2%   +20.50%  (p=0.008 n=5+5)
StdParseIPv6-8                       132ns ± 2%     134ns ± 1%      ~     (p=0.079 n=5+5)
IPPrefixMasking/IPv4_/32-8          36.3ns ± 3%     4.8ns ± 4%   -86.72%  (p=0.008 n=5+5)
IPPrefixMasking/IPv4_/17-8          39.0ns ± 0%     4.8ns ± 3%   -87.78%  (p=0.008 n=5+5)
IPPrefixMasking/IPv4_/0-8           36.9ns ± 2%     4.8ns ± 4%   -87.07%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/128-8         32.7ns ± 1%     4.7ns ± 2%   -85.47%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/65-8          39.8ns ± 1%     4.7ns ± 1%   -88.13%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_/0-8           40.7ns ± 1%     4.7ns ± 2%   -88.41%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/128-8     136ns ± 3%       5ns ± 2%   -96.53%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8      142ns ± 2%       5ns ± 1%   -96.65%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8       143ns ± 2%       5ns ± 3%   -96.67%  (p=0.008 n=5+5)
IPSetFuzz-8                         22.7µs ± 2%    16.4µs ± 2%   -27.84%  (p=0.008 n=5+5)

name                              old alloc/op   new alloc/op   delta
StdIPv4-8                            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv4-8                               0.00B          0.00B           ~     (all equal)
IPv4_inline-8                        0.00B          0.00B           ~     (all equal)
StdIPv6-8                            16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv6-8                               16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPv4Contains-8                       0.00B          0.00B           ~     (all equal)
ParseIPv4-8                          0.00B          0.00B           ~     (all equal)
ParseIPv6-8                          48.0B ± 0%     48.0B ± 0%      ~     (all equal)
StdParseIPv4-8                       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
StdParseIPv6-8                       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
IPPrefixMasking/IPv4_/32-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv4_/17-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv4_/0-8            0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/128-8          0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/65-8           0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_/0-8            0.00B          0.00B           ~     (all equal)
IPPrefixMasking/IPv6_zone_/128-8     16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8      16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8       16.0B ± 0%      0.0B       -100.00%  (p=0.008 n=5+5)
IPSetFuzz-8                         2.60kB ± 0%    2.60kB ± 0%      ~     (p=1.000 n=5+5)

name                              old allocs/op  new allocs/op  delta
StdIPv4-8                             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv4-8                                0.00           0.00           ~     (all equal)
IPv4_inline-8                         0.00           0.00           ~     (all equal)
StdIPv6-8                             1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv6-8                                1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPv4Contains-8                        0.00           0.00           ~     (all equal)
ParseIPv4-8                           0.00           0.00           ~     (all equal)
ParseIPv6-8                           3.00 ± 0%      3.00 ± 0%      ~     (all equal)
StdParseIPv4-8                        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
StdParseIPv6-8                        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
IPPrefixMasking/IPv4_/32-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv4_/17-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv4_/0-8             0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/128-8           0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/65-8            0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_/0-8             0.00           0.00           ~     (all equal)
IPPrefixMasking/IPv6_zone_/128-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/65-8       1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPPrefixMasking/IPv6_zone_/0-8        1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
IPSetFuzz-8                           33.0 ± 0%      33.0 ± 0%      ~     (all equal)

Signed-off-by: David Anderson <dave@natulte.net>
Signed-off-by: Stefan Majer <stefan.majer@gmail.com>
@dsnet
Copy link
Member

@dsnet dsnet commented Dec 27, 2020

Random thought: Should little-endian and big-endian functionality be moved to math/bits?

With the acceptance of #395, you could imagine an API that operates on fixed-width arrays rather than pointers:

func LoadUint32LE(v [4]byte) uint32
func StoreUint32LE(v uint32) [4]byte
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
9 participants
You can’t perform that action at this time.