Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.10 backports 2022-07-21 #20620

Merged
merged 9 commits into from
Jul 28, 2022
Merged

Conversation

gandro
Copy link
Member

@gandro gandro commented Jul 21, 2022

Once this PR is merged, you can update the PR labels via:

$ for pr in 19843 20491 20383 20537 20538 20569; do contrib/backporting/set-labels.py $pr done 1.10; done

christarazi and others added 4 commits July 21, 2022 14:35
[ upstream commit 9242f05 ]

This will come in handy for upcoming commits that will try to improve
the performance of these functions.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 189d4d3 ]

Use strings.Builder instead of fmt.Sprintf() which is known to be not
performant.

This optimization was identified by inspecting a pprof (memory profile)
of a cluster with a wildcard L7 DNS rule policy (matchPattern: '*') and
a workload selected by the policy making a large amount of unique DNS
requests.

```
$ go test -v -run '^$' -bench 'Benchmark_maskedIPNetToLabelString' -benchtime 50000x -benchmem ./pkg/labels/cidr > old.txt
$ go test -v -run '^$' -bench 'Benchmark_maskedIPNetToLabelString' -benchtime 50000x -benchmem ./pkg/labels/cidr > new.txt
$ benchcmp old.txt new.txt
benchcmp is deprecated in favor of benchstat:
https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                old ns/op     new ns/op     delta
Benchmark_maskedIPNetToLabelString-8     8122          4450          -45.21%

benchmark                                old allocs     new allocs     delta
Benchmark_maskedIPNetToLabelString-8     39             32             -17.95%

benchmark                                old bytes     new bytes     delta
Benchmark_maskedIPNetToLabelString-8     488           368           -24.59%
```

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 46e3d07 ]

Avoid using fmt.Sprintf() when a simple string concatentation does the
same job, and preallocate slices where possible.

These optimizations were identified by inspecting a pprof (memory
profile) of a cluster with a wildcard L7 DNS rule policy (matchPattern:
'*') and a workload selected by the policy making a large amount of
unique DNS requests.

Without the previous commit that optimizes the maskedIPToLabelString():

```
$ go test -v -run '^$' -bench 'Benchmark_GetCIDRLabels' -benchtime 5000x -benchmem ./pkg/labels/cidr > old.txt
$ go test -v -run '^$' -bench 'Benchmark_GetCIDRLabels' -benchtime 5000x -benchmem ./pkg/labels/cidr > new.txt
$ benchcmp old.txt new.txt
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                     old ns/op     new ns/op     delta
Benchmark_GetCIDRLabels-8     391590        390163        -0.36%

benchmark                     old allocs     new allocs     delta
Benchmark_GetCIDRLabels-8     1455           1424           -2.13%

benchmark                     old bytes     new bytes     delta
Benchmark_GetCIDRLabels-8     64143         63189         -1.49%
```

With previous commit:

```
benchmark                     old ns/op     new ns/op     delta
Benchmark_GetCIDRLabels-8     390163        295259        -24.32%

benchmark                     old allocs     new allocs     delta
Benchmark_GetCIDRLabels-8     1424           1131           -20.58%

benchmark                     old bytes     new bytes     delta
Benchmark_GetCIDRLabels-8     63189         58707         -7.09%
```

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 372407f ]

The DNS proxy has a configurable semaphore, which rejects requests if
there are more than the configured number of requests in flight. There
is a metric to measure how long a DNS proxy waited on the semaphore, but
no metric to track if requests were dropped because of the semaphore.

The existing monitoring isn't quite sufficient. The metric
proxy_upstream_time{"timeout", "dns", "semaphoreTime"} will measure
requests rejected due to timeouts, but also any other error which
implements neterror.Timeout (e.g. an actual timeout communicating with
upstream DNS).

Signed-off-by: Rahul Joshi <rkjoshi@google.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro requested a review from a team as a code owner July 21, 2022 12:59
@gandro gandro added backport/1.10 kind/backports This PR provides functionality previously merged into master. labels Jul 21, 2022
Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These LGTM. The latter was a test build fix for the former which was sent by an external contributor, so I've reviewed that PR's backport as well.

@michi-covalent
Copy link
Contributor

#20517 -- add an option to wait for kube-proxy (@michi-covalent)

thanks sebastian, but let me open a separate pull request to backport this. helm chart changed quite a bit since v1.10, and it's complaining about cilium.image template function missing:

Error: template: cilium/templates/cilium-agent-daemonset.yaml:508:18: executing "cilium/templates/cilium-agent-daemonset.yaml" at <include "cilium.image" .Values.image>: error calling include: template: no template "cilium.image" associated with template "gotpl"

@michi-covalent
Copy link
Contributor

opened #20628

@gandro
Copy link
Member Author

gandro commented Jul 26, 2022

opened #20628

Thanks! I'll remove the commit from this one then.

joamaki and others added 5 commits July 26, 2022 10:11
[ upstream commit a0c1ad6 ]

updateVersion only set EndpointSliceV1, but not EndpointSlice.
Fix this, so that tests can set the capabilities correctly via Force()
for older k8s versions.

Fixes: 7a1039f ("k8s: Consolidate check for EndpointSlice support")
Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 11b678b ]

Commit 372407f exported errFailedAcquireSemaphore and
errTimedOutAcquireSemaphore but didn't update the tests to use the new
types. This wasn't caught because privileged tests weren't run on the
corresponding PR cilium#20491.

Fixes: 372407f ("Add metric on number of requests rejected by DNS Proxy semaphore")

Signed-off-by: Tobias Klauser <tobias@cilium.io>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 908316c ]

Signed-off-by: Raphaël Pinson <raphael@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 522da61 ]

Signed-off-by: Raphaël Pinson <raphael@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 25e35f1 ]

When stopping the EndpointSlice Kubernetes watchers we should also
cancel the waiting to sync this group resource. In failing doing it so,
Cilium will timeout on waiting for these resources on Kubernetes
versions that should have EndpointSlice v1beta1 available but it's not
enabled.

Fixes: a0c1ad6 ("pkg/k8s/version: Set EndpointSlice cap when version >=1.17")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro force-pushed the pr/v1.10-backport-2022-07-21 branch from 149aa2c to 1a1001a Compare July 26, 2022 08:12
@gandro gandro removed the request for review from michi-covalent July 26, 2022 08:12
@gandro
Copy link
Member Author

gandro commented Jul 26, 2022

/test-backport-1.10

@gandro
Copy link
Member Author

gandro commented Jul 27, 2022

@gandro
Copy link
Member Author

gandro commented Jul 27, 2022

/ci-aks-1.10

@gandro
Copy link
Member Author

gandro commented Jul 27, 2022

/ci-eks-1.10

@gandro
Copy link
Member Author

gandro commented Jul 27, 2022

/ci-gke-1.10

@gandro
Copy link
Member Author

gandro commented Jul 28, 2022

I'm marking this ready to merge. The two outstanding reviews are for rather trivial PRs and one of the authors (Jussi) is on PTO.

As mentioned before, l4lb is known broken at the moment, and so is CI AKS (it was also failing on the last PR #20509)

@gandro gandro added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jul 28, 2022
@aanm aanm merged commit 2308095 into cilium:v1.10 Jul 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants