v1.8 backports 2020-11-09 #13951

Merged
merged 15 commits into from
Nov 10, 2020

Conversation

tklauser
Member

@tklauser tklauser commented Nov 9, 2020

Once this PR is merged, you can update the PR labels via:

$ for pr in 13892 13927 13907 13886 13914 13921 13922 13878 12658 12865 13206; do contrib/backporting/set-labels.py $pr done 1.8; done

Dropped:

@tklauser tklauser requested a review from a team as a code owner November 9, 2020 13:18
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.8 kind/backports This PR provides functionality previously merged into master. labels Nov 9, 2020
@tklauser tklauser added backport/1.8 kind/backports This PR provides functionality previously merged into master. labels Nov 9, 2020
@tklauser
Member Author

tklauser commented Nov 9, 2020

test-backport-1.8

Failure in privileged tests:

--- FAIL: TestGetFlows (0.00s)
    --- FAIL: TestGetFlows/Observe_0_flows_from_1_peer_without_address (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x17cd2b8]
goroutine 12 [running]:
testing.tRunner.func1.1(0x19afd00, 0x2e63370)
	/home/travis/.gimme/versions/go1.14.10.linux.amd64/src/testing/testing.go:999 +0x319
testing.tRunner.func1(0xc000145b00)
	/home/travis/.gimme/versions/go1.14.10.linux.amd64/src/testing/testing.go:1002 +0x402
panic(0x19afd00, 0x2e63370)
	/home/travis/.gimme/versions/go1.14.10.linux.amd64/src/runtime/panic.go:969 +0x166
github.com/cilium/cilium/pkg/hubble/testutils.(*FakeGRPCServerStream).Context(0x0, 0x0, 0x0)
	/home/travis/gopath/src/github.com/cilium/cilium/pkg/hubble/testutils/grpc.go:61 +0x28
github.com/cilium/cilium/pkg/hubble/relay/observer.(*Server).GetFlows(0xc00004fb30, 0xc0001dcc40, 0x1f2fea0, 0xc00047b290, 0x0, 0x0)
	/home/travis/gopath/src/github.com/cilium/cilium/pkg/hubble/relay/observer/server.go:81 +0x71
github.com/cilium/cilium/pkg/hubble/relay/observer.TestGetFlows.func11(0xc000145b00)
	/home/travis/gopath/src/github.com/cilium/cilium/pkg/hubble/relay/observer/server_test.go:337 +0x351
testing.tRunner(0xc000145b00, 0xc0004e7000)
	/home/travis/.gimme/versions/go1.14.10.linux.amd64/src/testing/testing.go:1050 +0xdc
created by testing.(*T).Run
	/home/travis/.gimme/versions/go1.14.10.linux.amd64/src/testing/testing.go:1095 +0x28b
FAIL	github.com/cilium/cilium/pkg/hubble/relay/observer	0.053s
FAIL

Looks like this is caused by the backport of #12865

@tklauser
Member Author

tklauser commented Nov 9, 2020

test-backport-1.8

Failed due to unused parameters in BPF code (https://jenkins.cilium.io/job/Cilium-PR-Runtime-4.9/2480/). Fixed by backporting #13921.

Member

@pchaigno pchaigno left a comment


My PRs look good 👍

@tklauser
Member Author

tklauser commented Nov 9, 2020

test-backport-1.8

aanm and others added 5 commits November 10, 2020 11:42
[ upstream commit 73be2c1 ]

To check whether images are published across all repositories, the
`check-docker-images.sh` script can now perform this check for a
particular release.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 6ae59f1 ]

Running test/bpf/verifier-test.sh on net-next kernels fails with the
following error:

    Tail call map owned by prog type 3, but prog type is 6!

This happens because the verifier compares the type of the BPF program that
created each pinned map to the type of the new program that is trying to use
those maps. It errors if the two types (original map creator vs. map user)
don't match. Since previously loaded programs are of TC type, we need to
remove all maps before creating them again from XDP programs.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 86e419e ]

In clusters that have a high churn of pods being created and deleted
with different security identities, garbage collecting 250 identities
per minute might not be sufficient. Thus, we are increasing the default
limit to 2500 identities per minute.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit af95561 ]

The security id lookup could return nil if the identity cache
isn't initialized during endpoints restore time, resulting in a crash.
Hence, add a nil check before populating log record values.

Signed-off-by: Aditi Ghag <aditi@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
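
The nil check described in the commit above can be sketched as follows. This is an illustration only: the `identity` and `logRecord` types and the `lookupIdentity` and `fillRecord` functions are hypothetical stand-ins, not the Cilium types or functions touched by the commit.

```go
package main

import "fmt"

// identity and logRecord are hypothetical stand-ins for the real types.
type identity struct {
	ID     uint32
	Labels []string
}

type logRecord struct {
	Identity uint32
	Labels   []string
}

// lookupIdentity returns nil when the cache is not initialized yet or the
// ID is unknown. Reading from a nil map is safe in Go and yields the zero
// value, here a nil pointer.
func lookupIdentity(cache map[uint32]*identity, id uint32) *identity {
	return cache[id]
}

// fillRecord populates the label field only if the lookup succeeded, so an
// uninitialized cache leaves the field empty instead of causing a crash.
func fillRecord(rec *logRecord, cache map[uint32]*identity, id uint32) {
	rec.Identity = id
	if ident := lookupIdentity(cache, id); ident != nil {
		rec.Labels = ident.Labels
	}
}

func main() {
	var rec logRecord
	fillRecord(&rec, nil, 42) // cache not initialized yet
	fmt.Printf("%+v\n", rec)
}
```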
[ upstream commit 5923daf ]

Some applications, when a DNS name resolves to multiple IPs, store the
IPs and use them past their TTL.

For example:
 - name resolves to IP1,IP2
 - app connects to IP1
 - protocol error forces disconnect
 - app connects to IP2

This patch keeps the IPs that map to a name alive as long as one of the
IPs for the given name is alive, so that applications like the one above
will not fail.

Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
@tklauser
Member Author

tklauser commented Nov 10, 2020

test-backport-1.8

note: rebased on latest v1.8 to get Go bump, addressed @pchaigno's feedback and pulled in #12865 again with a fix for the test failure (#13206)

@tklauser
Member Author

tklauser commented Nov 10, 2020

test-missed-k8s

Previous failure: https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-K8s/3684/ All k8s 1.12 and 1.13 tests seem to have failed. Looks like the endpoint regeneration recovery controller is failing in some of (or all?) the tests.

Member

@aanm aanm left a comment


LGTM for my commits

@tklauser
Member Author

tklauser commented Nov 10, 2020

Previous failure (https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-K8s/3684/) was due to a complexity issue, likely introduced by backporting #13908 (thanks @pchaigno for investigating), see https://cilium.slack.com/archives/C7PE7V806/p1605016684205600 for discussion.

Will drop #13908 from this PR.

ArthurChiao and others added 4 commits November 10, 2020 15:17
[ upstream commit 492800e ]

Disabling endpoints' policy verdict notification, e.g. via CLI
`cilium endpoint config <ep_id> PolicyVerdictNotification=false`,
is broken: the compiler complains about unused variables
as well as a function signature mismatch.

Signed-off-by: Arthur Chiao <arthurchiao@hotmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 2b6ec7c ]

Up until this commit, POLICY_VERDICT_NOTIFY was always enabled in our
compile tests. We therefore missed a compile-time regression when policy
verdicts were disabled.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 3ea43ee ]

In GKE environments, if Cilium is deployed in 'kube-system' namespace
with the 'kubernetes.io/cluster-service: "true"' labels, GKE will delete
the DaemonSet within 1 minute [1]. To avoid this problem these labels
are no longer installed for new installations, however, if the user
tries to install this combination of options, we should fail the
installation and warn the user about the correct usage.

[1] kubernetes/kubernetes#51376
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 14ff743 ]

When deploying Cilium in its own namespace, it's required to define
resource quotas. For now we will create a ResourceQuota for 10k pods
that are node-critical and 15 pods that are cluster-critical.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
pchaigno and others added 6 commits November 10, 2020 15:17
[ upstream commit b2f1bf0 ]

WaitTerminatingPodsInNsWithFilter is similar to the existing
WaitTerminatingPodsInNs helper. It waits for a set of pods in state
terminating to be removed from a specific namespace, but only applies to
pods matching a given filter.

This commit also renames the helper functions
WaitCleanAllTerminatingPodsXXX to WaitTerminatingPodsXXX for brevity.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 6699f66 ]

WaitforPods unmarshals the output of 'kubectl get pods -o json' into a
PodList object. This unmarshalling fails when only a single pod is
returned.

This commit introduces a new helper function, a simplified version of
WaitforPods, to wait for a single pod.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
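
One way the list-versus-single-object mismatch described in the commit above can show up is sketched below. It assumes plain `encoding/json` decoding and uses simplified, hypothetical `pod` and `podList` types plus hand-written JSON samples. Whether the original helper returned an error or an empty list is not stated in the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pod and podList are simplified, hypothetical stand-ins for the
// Kubernetes API types.
type pod struct {
	Metadata struct {
		Name string `json:"name"`
	} `json:"metadata"`
}

type podList struct {
	Items []pod `json:"items"`
}

// Hand-written samples of the two JSON shapes: a list and a single object.
const listJSON = `{"kind":"List","items":[` +
	`{"kind":"Pod","metadata":{"name":"a"}},` +
	`{"kind":"Pod","metadata":{"name":"b"}}]}`
const singleJSON = `{"kind":"Pod","metadata":{"name":"a"}}`

// podNames decodes data as a list. A single-object document has no
// "items" key, so it decodes without error but yields no pods.
func podNames(data string) ([]string, error) {
	var list podList
	if err := json.Unmarshal([]byte(data), &list); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(list.Items))
	for _, p := range list.Items {
		names = append(names, p.Metadata.Name)
	}
	return names, nil
}

// singlePodName decodes data as one pod object.
func singlePodName(data string) (string, error) {
	var p pod
	if err := json.Unmarshal([]byte(data), &p); err != nil {
		return "", err
	}
	return p.Metadata.Name, nil
}

func main() {
	fromList, _ := podNames(listJSON)
	fromSingle, _ := podNames(singleJSON)
	name, _ := singlePodName(singleJSON)
	fmt.Println(len(fromList), len(fromSingle), name)
}
```

Under these assumptions, a dedicated single-pod decoder avoids silently treating one returned pod as an empty list.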
[ upstream commit 417cded ]

This commit moves our RuntimeVerifier test to K8sVerifier, to be
executed on our 4.9 VMs. This will allow us to execute it on our 4.19
and net-next VMs as well in the future.

The BPF programs are currently compiled to try and achieve the maximum
complexity possible on 4.9 (where e.g. BPF NodePort is not supported).
The Makefile will be extended in a subsequent patchset to include
max-complexity targets for 4.19 and net-next.

In our K8s test pipelines, we can only access VMs through kubeconfig.
Thus, to be able to compile and load the BPF programs on the VM, we
define a new privileged Pod which mounts the bpffs and the Cilium source
directory. All test commands are executed in this privileged Pod after
uninstalling Cilium from the cluster.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
…comes with the grpc request has valuable metadata information in it that is useful for things like authentication and authorization for downstream servers.

[ upstream commit 39a3422 ]

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
[ upstream commit 1026a0a ]

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Fixes: b60676a92500 ("test: Move RuntimeVerifier to K8sVerifier")
Suggested-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
@tklauser
Member Author

test-backport-1.8

@borkmann borkmann merged commit 943a93f into v1.8 Nov 10, 2020
@borkmann borkmann deleted the pr/v1.8-backport-2020-11-09 branch November 10, 2020 19:06
@aanm aanm mentioned this pull request Dec 4, 2020
Labels
kind/backports This PR provides functionality previously merged into master.

8 participants