pkg/loadbalancer: Optimize L3n4Addr.Hash for performance #14617

gandro · 2021-01-14T15:44:36Z

This commit optimizes the L3n4Addr.Hash function. Its result is used as a key
in Go maps and therefore needs to be unique, but it does not need to be a
cryptographically secure hash.

The previous implementation was slow because it relied on defer
statements, reflection (due to the use of %+v), and SHA256. This
function is used during service lookup in Hubble's hot path, therefore
this optimization should reduce the overhead of Hubble:

Before this commit:

BenchmarkL3n4Addr_Hash_IPv4-8         	  608378	      1856 ns/op
BenchmarkL3n4Addr_Hash_IPv6_Short-8   	  620024	      2080 ns/op
BenchmarkL3n4Addr_Hash_IPv6_Long-8    	  535290	      2168 ns/op

After this commit:

BenchmarkL3n4Addr_Hash_IPv4-8         	15079537	        81.1 ns/op
BenchmarkL3n4Addr_Hash_IPv6_Short-8   	 8348496	       149 ns/op
BenchmarkL3n4Addr_Hash_IPv6_Long-8    	 4463416	       259 ns/op

brb

Nice! Do you have a flamegraph for "after"?

gandro · 2021-01-14T15:59:32Z

> Nice! Do you have a flamegraph for "after"?

This is the flamegraph for just the service lookup afterwards. Most time is still spent in Hash, but now it's mostly memory allocation.

Edit: Flamegraph for Hubble overall:

Before: GetServiceByAddr takes 17.9% of CPU overall.

After: GetServiceByAddr is the first child of Parser.Decode() here, it takes 1.45% of CPU overall.

borkmann · 2021-01-14T16:13:29Z

pkg/loadbalancer/loadbalancer.go

+	b = append(b, '|')
+	b = strconv.AppendUint(b, uint64(a.Scope), 10)
+
+	return string(b)


One more thing, how does this relate to L3n4AddrID? E.g.

cilium/pkg/maps/lbmap/lbmap.go

Line 563 in 2b9485d

func (svcs svcMap) addFE(fe *loadbalancer.L3n4AddrID) *loadbalancer.SVC {

is calling the Hash() from an L3n4AddrID type where the Sprintf() was on %+v.

Calling fe.Hash() (where fe is of type L3n4AddrID) is equivalent to calling fe.L3n4Addr.Hash(). In other words, there is no inheritance. The %+v part was still only printing the contents of L3n4Addr, the ID is and was omitted.

Not sure that is intended, but this commit does not change that.

L3n4AddrID.Hash is simply L3n4Addr.Hash and thus the ID field will not be used. If that is an issue, we probably want to define L3n4AddrID.Hash as L3n4Addr.Hash()|ID or something along those lines.
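The method promotion discussed in this thread can be sketched as follows. The `addr` and `addrID` types are hypothetical stand-ins for L3n4Addr and L3n4AddrID, and `hashWithID` is only one possible reading of the "L3n4Addr.Hash()|ID" idea, not something this PR implements.

```go
package main

import (
	"fmt"
	"strconv"
)

// addr is a hypothetical stand-in for L3n4Addr.
type addr struct {
	Key string // placeholder for the IP, port, protocol and scope fields
}

// Hash returns the key of the embedded address only.
func (a addr) Hash() string { return a.Key }

// addrID is a hypothetical stand-in for L3n4AddrID. It embeds addr, so
// addr.Hash is promoted and callable as addrID.Hash; the ID is ignored.
type addrID struct {
	addr
	ID uint16
}

// hashWithID folds the ID into the key (assumed format: hash, '|', ID).
func hashWithID(a addrID) string {
	b := append([]byte(a.addr.Hash()), '|')
	b = strconv.AppendUint(b, uint64(a.ID), 10)
	return string(b)
}

func main() {
	x := addrID{addr: addr{Key: "10.0.0.1:80"}, ID: 1}
	y := addrID{addr: addr{Key: "10.0.0.1:80"}, ID: 2}
	fmt.Println(x.Hash() == y.Hash())           // same address, different ID
	fmt.Println(hashWithID(x) == hashWithID(y)) // ID now part of the key
}
```

With the promoted method, two values that differ only in ID collide on the same map key; an explicit method on the outer type would be needed to distinguish them.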

aditighag

🚀

aanm · 2021-01-14T16:22:26Z

pkg/loadbalancer/loadbalancer.go

+	// Note: 32 bytes is not enough for long IPv6 addresses, but it is cheaper
+	// to reallocate on overflow than to check if a.IP is IPv6
+	b := make([]byte, 0, 32)
+
+	b = append(b, a.IP.String()...)


why don't we do something like:

    ip := a.IP.String()
    if bytes.IndexRune(ip, ':') < 0 {
        b = make([]byte, 0, maxV4 /*(=32)*/)
    } else {
        b = make([]byte, 0, maxV6 /*(=39? + 5 + sizeOfScope)*/)
    }

~~This seems to do the trick without a measurable overhead. Will change.~~

Actually, after some more benchmarking, simply relying on reallocation is still faster in most cases than trying to be smart about it:

Fastest / Most imprecise:

    const lenIPv4 = 15
    const lenProto = 1
    const lenPort = 6
    const lenScope = 2

    // Note: The capacity might not be enough for long IPv6 addresses, but it is cheaper
    // to reallocate on overflow than to check the length of a.IP
    b := make([]byte, 0, lenIPv4+lenProto+lenPort+lenScope)

    BenchmarkL3n4Addr_Hash_IPv4-8          16097442     70.7 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Short-8     9431970    125 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Long-8      5561029    215 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Max-8       3719448    314 ns/op

Middle ground:

    const lenIPv4 = 15
    const lenIPv6 = 39
    const lenProto = 1
    const lenPort = 6
    const lenScope = 2

    var b []byte
    ip := a.IP.String()
    if strings.IndexRune(ip, ':') < 0 {
        b = make([]byte, 0, lenIPv4+lenProto+lenPort+lenScope)
    } else {
        b = make([]byte, 0, lenIPv6+lenProto+lenPort+lenScope)
    }

    BenchmarkL3n4Addr_Hash_IPv4-8          14471590     80.9 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Short-8     9139532    132 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Long-8      6556764    204 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Max-8       3862281    313 ns/op

Slowest / Most-precise:

    const lenProto = 1
    const lenPort = 6
    const lenScope = 2

    ip := a.IP.String()
    b := make([]byte, 0, len(ip)+lenProto+lenPort+lenScope)

    BenchmarkL3n4Addr_Hash_IPv4-8          12182581     97.3 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Short-8     7979997    147 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Long-8      5991332    215 ns/op
    BenchmarkL3n4Addr_Hash_IPv6_Max-8       3743278    317 ns/op

kaworu · 2021-01-14T16:31:09Z

pkg/loadbalancer/loadbalancer.go

+	b = append(b, '|')
+	b = strconv.AppendUint(b, uint64(a.Scope), 10)
+
+	return string(b)


L3n4AddrID.Hash is simply L3n4Addr.Hash and thus the ID field will not be used. If that is an issue, we probably want to define L3n4AddrID.Hash as L3n4Addr.Hash()|ID or something along those lines.

gandro · 2021-01-14T17:36:09Z

Pushed an additional benchmark and tightened the length calculation (the default capacity is now 24 bytes instead of 32). See also #14617 (comment)

gandro · 2021-01-14T17:37:52Z

test-me-please

christarazi

🚀

gandro · 2021-01-15T08:14:04Z

Net-Next hit #14598
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.13-net-next/460/
Two errors in the image build that look spurious and unrelated:
https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Kernel/4448/
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.20-kernel-4.9/408/

gandro · 2021-01-15T08:14:12Z

retest-net-next

gandro · 2021-01-15T08:14:32Z

retest-4.9

gandro · 2021-01-15T08:14:38Z

retest-4.19

aanm · 2021-01-15T09:34:28Z

pkg/loadbalancer/loadbalancer.go

 func (a L3n4Addr) Hash() string {
+	const lenIPv4 = 15


const lenIPv4 = 16 // len("255.255.255.255|")

aanm · 2021-01-15T09:37:27Z

pkg/loadbalancer/loadbalancer.go

+	const lenIPv4 = 15
+	const lenProto = 1
+	const lenPort = 6
+	const lenScope = 2


const lenScope = 1 // len("0") or len("1"), there isn't a "|" as part of the scope

The sum should be correct. I did not count the | as part of the IP address; I counted it as part of the scope instead.
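The arithmetic behind this exchange can be checked with a short sketch. The constant names `capInDiff` and `capSuggested` are hypothetical; the values are the ones quoted in the diff and in the two review suggestions above.

```go
package main

import "fmt"

const (
	// Constants as they appear in the diff under review:
	// lenIPv4 + lenProto + lenPort + lenScope, with the '|' counted in lenScope.
	capInDiff = 15 + 1 + 6 + 2

	// Constants as suggested in review: the '|' counted in lenIPv4 instead.
	capSuggested = 16 + 1 + 6 + 1
)

func main() {
	fmt.Println(len("255.255.255.255")) // longest dotted-quad IPv4 string
	fmt.Println(capInDiff, capSuggested)
}
```

Both ways of attributing the separator give the same total of 24 bytes, so the preallocated capacity is unaffected by where the `|` is counted.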

kaworu

Thanks LGTM!

gandro · 2021-01-18T09:26:38Z

net-next hit #14598 again, as well as an error similar to #12690 - but in a different context: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.13-net-next/468/testReport/junit/Suite-k8s-1/13/K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_Tests_with_direct_routing_With_host_policy_Tests_NodePort/

gandro · 2021-01-18T09:26:49Z

retest-net-next

gandro added the release-note/misc and sig/loadbalancing labels Jan 14, 2021

gandro requested review from a team and aditighag January 14, 2021 15:44

maintainer-s-little-helper bot added this to In progress in 1.10.0 Jan 14, 2021

maintainer-s-little-helper bot assigned aditighag Jan 14, 2021

brb approved these changes Jan 14, 2021

borkmann approved these changes Jan 14, 2021

borkmann reviewed Jan 14, 2021

aditighag approved these changes Jan 14, 2021

aditighag removed their assignment Jan 14, 2021

aanm reviewed Jan 14, 2021

kaworu reviewed Jan 14, 2021

gandro added 2 commits January 14, 2021 18:33

pkg/loadbalancer: Add simple benchmark for L3n4Addr.Hash

6465f85

This adds a simple benchmark for different length IPv4- and IPv6-based L3n4Addr hashes.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

gandro force-pushed the pr/gandro/lb-hash-performance-improvement branch from d779912 to 9e32e1f Compare January 14, 2021 17:34

christarazi approved these changes Jan 14, 2021

aanm reviewed Jan 15, 2021

kaworu approved these changes Jan 15, 2021

maintainer-s-little-helper bot added the ready-to-merge label Jan 18, 2021

gandro merged commit 6754e8e into master Jan 18, 2021

gandro deleted the pr/gandro/lb-hash-performance-improvement branch January 18, 2021 12:38
