-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.9 backports 2020-10-26 #13751
v1.9 backports 2020-10-26 #13751
Conversation
[ upstream commit 63ced97 ] This requirement is not specific to kubernetes, and the kubernetes guide already points towards the main system requirements document. Move the BPFFS requirement there. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 4a7a0db ] Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 4e29130 ] Since Cilium receives CNI events when a pod is created, Cilium can calculate the lag for kube-apiserver events by checking the time an ADD event for that Pod was received and subtracting by the time the CNI event for that pod was received. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit c40dfb0 ] - Use `GOOGLE_CREDENTIALS` instead of `GOOGLE_APPLICATION_CREDENTIALS`. - Refer to https://github.com/openshift/installer/blob/master/docs/user/gcp/iam.md to assign appropriate roles to service account. - Make it a bit clearer that the firewall rule creation is time-sensitive. Ref: #13627 (comment) Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit a0ec2ad ] Previously, the same IPv4 addr was used for SNAT-ing packets sent from local endpoints to outside (BPF-masq) and requests sent from outside clients to a NodePort service to an intermediate node. The commit [1] made the IPV4_NODEPORT selection to be based on k8s NodeIP. I.e. if a device has multiple IPv4 addrs, then the one which is used as a k8s NodeIP is preferred. This change might have broken environments in which a single device has two IP addrs: private (k8s NodeIP) used to communicate among k8s nodes and public - with outside. In this scenario, packets sent from local endpoints to outside are SNAT-ed to the private IP addr which is obviously wrong. An example of such environment is packet.net in "Layer3 networking mode" in which two NICs are bonded and have public and private IP addrs assigned. To fix this, introduce IPV4_MASQUERADE which is decoupled from IPV4_NODEPORT, and is determined by the same "firstGlobalV4Addr" routine as the latter, but with the preference for public IP addrs. One downside of the fix is that it introduces a redundant check on the "to-netdev" fastpath when IPV4_NODEPORT==IPV4_MASQUERADE which the BPF verifier is not able to optimize away (both IPV4_{NODEPORT,MASQUERADE} are defined via static data): 29: (61) r2 = *(u32 *)(r4 +26) 30: (18) r0 = 0x1501a8c0 32: (18) r3 = 0x1501a8c0 34: (67) r3 <<= 32 35: (77) r3 >>= 32 36: (1d) if r2 == r3 goto pc+55 37: (18) r0 = 0x1501a8c0 39: (18) r3 = 0x1501a8c0 41: (67) r3 <<= 32 42: (77) r3 >>= 32 43: (1d) if r2 == r3 goto pc+48 [1]: 870c02c ("Makes k8sNodeIP the preferred IP when initializing NodePort addresses".) Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit fd66932 ] Found using `git grep -P "\b([a-zA-Z]+)(?:\s+\1\b)+"` Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 7b52c08 ] The documentation should be refering to Hubble Relay when talking about cluster-wide visibility. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 541cbe9 ] Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 160d6b9 ] Let's consistently refer to Hubble Relay for cluster-wide visibility. While here, update the instructions for the new method of automatic TLS certificates generation (kubernetes cronjob). Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 1ec611e ] As of Hubble v0.7.0, which ships with Cilium 1.9, this environment variable is now deprecated. Let's encourage the use of the new one instead. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit eb832e4 ] Now that Hubble is enabled by default, Hubble installation instructions can be simplified to only instruct about enabling Hubble Relay for cluster-wide visibility. Details about the Hubble architecture are now omitted to ensure the instructions are easy to follow. A new link that points to the observability section of the documentation is added. In this section, users will find details about Hubble architecture should they be interested. Co-authored-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 72c1073 ] This commit is to print step on how to fix github action check for generated files. This will surely help new contributors to have better experience and reduce feedback loop. Suggested by: Kornilios Kourtis <kornilios@isovalent.com> Signed-off-by: Tam Mach <sayboras@yahoo.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 61100c5 ] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 91748c7 ] We haven't decoupled both yet. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 6b54fde ] This tag has no functional changes over previous one, however it includes a fix to the manifest format issue that was seen due to a bug in BuildKit (see #13429). Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 320b83f ] Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 253368a ] This commit contains no functional changes. It just reorders some of the functions by importance in this file so that it's easier to parse. This should hopefully reduce the number of times needed to scroll up and down. Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 1c0f00d ] This commit ensures that the EventQueue is fully drained, even when its not running its loop. When endpoints are being restored, their EventQueue is initialized, but non-running state (processing events). It is the job of the endpoint manager to kick off the event loop by calling Expose() on the endpoint. This commit fixes the following commits which causes Cilium to be stuck waiting for the EventQueue to drain (WaitToBeDrained()): 290d9e9 ("daemon: Init endpoint queue during validation") 79bf425 ("endpoint: Add function to initialize event queue") Cilium becoming stuck is described in the following flow: - Endpoints began restoration - Endpoint's EventQueue initialized (but never run) - Endpoint's metadata data resolver controller kicked off - Visilbity and bandwidth policy events enqueued - Endpoint fails restoration due to some issue (e.g. interface not found, etc) - Endpoint queued for deletion because it failed restoration - As part of endpoint deletion, the EventQueue is stopped and drained - Cilium deadlocks trying to drain, but the EventQueue run loop was never run, which would pop events off the `events` channel, and close the `eventsClosed` channel This commit fixes this deadlock by forcefully running the event loop to drain the queue. After the `events` channel is closed (from Stop()), the loop will terminate and the `eventsClosed` channel will close, thereby unblocking WaitToBeDrained(). Stacktrace from `gops`: ``` goroutine 632 [chan receive, 1 minutes]: github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).WaitToBeDrained(0xc00013c960) /go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:322 +0x1ad github.com/cilium/cilium/pkg/endpoint.(*Endpoint).Delete(0xc000ad6900, 0x27faee0, 0xc00062ac40, 0x27fba60, 0xc00099a120, 0x2877280, 0xc0005d2340, 0x430101, 0x0, 0x0, ...) /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:2194 +0x91 github.com/cilium/cilium/daemon/cmd.(*Daemon).deleteEndpointQuiet(...) /go/src/github.com/cilium/cilium/daemon/cmd/endpoint.go:674 github.com/cilium/cilium/daemon/cmd.(*Daemon).regenerateRestoredEndpoints.func2(0xc00062ac40, 0xc000b63214, 0xc000ad6900) /go/src/github.com/cilium/cilium/daemon/cmd/state.go:302 +0x7c created by github.com/cilium/cilium/daemon/cmd.(*Daemon).regenerateRestoredEndpoints /go/src/github.com/cilium/cilium/daemon/cmd/state.go:296 +0x8a0 ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 2643a77 ] Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit e93da19 ] Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 17c4f80 ] The whitelist argument needs to be added only to the kiam-agent daemonset. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit f4fc01e ] In Kubernetes watchers, we need to keep the resource version of objects so the watchers can keep track of the newest changes of a particular object otherwise any event received from Kubernetes will be considered "new". Fixes: 620b243 ("pkg/k8s: use slimmer CRDs and Table types") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit aec71a9 ] Ingress and IngressClass resources have graduated to networking.k8s.io/v1 in Kubernetes v1.19. This also introduced changes to the objects. Using Helm `.Capabilities`, the correct fields are chosen. Also in Helm, `GitVersion` is being deprecated and has thus been replaced with `Version`. See Cilium issue #13611 See Kubernetes issue #89778 Signed-off-by: Youssef Azrak <yazrak.tech@gmail.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good for my PRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good for my commit
FIY: The only conflict was in 6b54fde due to |
test-backport-1.9 |
Putting this back into draft. I noticed that two commits from PR #13674 are missing, because the scripts did not correlate the commits due to escaping errors in the commit message (which is used as a regex). Will move out of draft once I fixed that issue and made sure the script picks up the commits properly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My commit looks good, also checked Michi's OpenShift doc update - looks good too!
Looks good for my commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like backports for commits b32876d and c0b8a82 from #13674 are missing (while f8fe7b5 was properly backported). I didn't spot anything in the PR description regarding these. Were they overlooked, were there merge conflicts or did I miss something?
Sorry, only saw #13751 (comment) after I submitted this. LGTM once this is resolved.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM given #13751 (comment) is resolved.
[ upstream commit 11282c7 ] Use $API_SERVER_IP and $API_SERVER_PORT variables in example commands where they are previously defined. This makes it easier to define these in the shell session and then just copy-and-paste the given commands. In other cases, use REPLACE_WITH_API_SERVER_IP/REPLACE_WITH_API_SERVER_PORT as placeholders instead, so they are not confused with the env vars. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 933e4af ] Use the $AZURE_* variables in example commands. This makes it easier to define these in the shell session and then just copy-and-paste the given commands. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit fdc0918 ] Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 7b4d700 ] During the backport process, when cherry-picking commits to backport, the cherry-pick script may fail if there are conflicts. In that case, the temporary file holding the backported commit is not clean up. This commit fixes it to clean the temporary file even in case of failure. Suggested-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 1cab8f3 ] This package adds a wrapper around fsnotify to support watching for files which do not exist yet. See the inline comments for more details for how it is implemented and its limitations. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit d26ab29 ] This change fixes the certloader Watch function to use the new fswatcher package which supports the event sequence emitted when Kubernetes updates a volume mount for a ConfigMap or Secret. The previous code did not work as expected on Kubernetes. Kubernetes sets up a volume mount with symlinks containing a double indirection, see e.g. spf13/viper#415 (comment) Simply adding the certificate paths to fsnotify therefore is not sufficient, as fsnotify watches the symlink target, which gets removed once Kubernetes updates the volume. Therefore the certloader code would handle a single update before its fsnotify watches became stale and it would stop picking up new changes. This change uses the newly introduced fswatcher package instead. It wraps fsnotify, but is designed to work with non-existing paths and symlink structures which contain a double indirection. Fixes: #13621 Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 893ae41 ] When the certificate files are mounted in a K8s volume, it is common to observe a sequence of CHMOD, REMOVE, CREATE for each file. This causes the files being reloaded three times (once for each event). This commit introduces event coalescing, meaning that we delay the reload for a short amount of time. This means if there are additional events which would also cause a reload within the time frame, these additional events are coalesced with the inital event. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit 9af7a93 ] The bindata script in contrib/scripts/bindata.sh was removed in commit ff22f48 ("make: remove the need for go-bindata"). The corresponding make target was removed as well by that commit but crept back in through commit 2255260 ("k8s, policy: Add automated CRD generation"), but it wasn't ever used. Remove it for good. Fixes: 2255260 ("k8s, policy: Add automated CRD generation") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
10d1387
to
1f9130f
Compare
Pushed the two missing commits. It's now 33 commits, same as the output of |
test-backport-1.9 Edit: GKE seems stuck |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for my commits
Once this PR is merged, you can update the PR labels via: