Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.8 backports 2020-11-30 #14213

Merged
merged 9 commits into from
Dec 2, 2020
Merged

v1.8 backports 2020-11-30 #14213

merged 9 commits into from
Dec 2, 2020

Conversation

aanm
Copy link
Member

@aanm aanm commented Nov 30, 2020

Once this PR is merged, you can update the PR labels via:

$ for pr in 13923 14112 14145 14182 14090 14171 14110; do contrib/backporting/set-labels.py $pr done 1.8; done

[ upstream commit 53f35fb ]

Signed-off-by: Mandar U Jog <mjog@google.com>
Signed-off-by: André Martins <andre@cilium.io>
@aanm aanm requested a review from a team as a code owner November 30, 2020 11:30
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.8 kind/backports This PR provides functionality previously merged into master. labels Nov 30, 2020
@aanm aanm marked this pull request as draft November 30, 2020 11:31
@aanm aanm force-pushed the pr/v1.8-backport-2020-11-30 branch from ffcf922 to 3346230 Compare November 30, 2020 11:38
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for Hubble-related changes.

@aanm aanm marked this pull request as ready for review November 30, 2020 11:56
@aanm
Copy link
Member Author

aanm commented Nov 30, 2020

test-backport-1.8

@brb
Copy link
Member

brb commented Nov 30, 2020

@aanm You probably want to exclude #14044 (see #14059 (comment)).

@aanm
Copy link
Member Author

aanm commented Nov 30, 2020

@aanm You probably want to exclude #14044 (see #14059 (comment)).

Thanks, I forgot about it

brb and others added 6 commits November 30, 2020 13:29
[ upstream commit baeb61f ]

Add endpoint DebugPolicy option that, if enabled, logs endpoint policy
map update details to /var/run/cilium/state/endpoint-policy.log.

The new DebugPolicy option is enabled if the new flag
--debug-verbose=policy is set, but can be enabled also independently
via:

  cilium endpoint config <EPID> DebugPolicy=true

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 8704e85 ]

Endpoint's Mutex has been renamed as 'mutex'. Update comments to
reflect this and also the lock level requirement (Lock for writing,
RLock for reading).

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit baf84ad ]

Module listings can allow figuring out the availability of certain
functionality like iptables or aes modules which can be useful when
debugging certain types of problems.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 2a3e5d4 ]

The probe mode is expected to only run alongside kube-proxy as hybrid.
There was confusion that the kube-proxy log was throwing (harmless) warnings
to its log that it could not bind sockets to service ports in the hostns.
This is due to Cilium performing bind protection right out of the bind(2)
syscall with eBPF. To avoid this confusion, defer to kube-proxy to bind
sockets instead. This is less efficient and consuming more resources, but
if users want to avoid the overhead, they would run kube-proxy free in strict
mode anyway where Cilium does the bind protection by default anyway.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 1b29044 ]

This introduces a check that we do not overwrite the numeric security
identity provided by the datapath trace point. Only if the datapath did
not provide an identity (i.e. in `FROM_LXC` trace points) do we want to
fall back on the identity from the user-space ip cache or endpoint
manager.

The numeric identity from the datapath can differ from the one we obtain
from user-space (e.g. the endpoint manager or the IP cache), because the
identity could have changed between the time the datapath event was
created and the time the event reaches the Hubble parser. To aid in
troubleshooting, we want to preserve what the datapath observed when it
made the policy decision.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: André Martins <andre@cilium.io>
[ upstream commit 885a319 ]

Previously, the SetUpSuite() routine called netns.New(). It expected
that the latter only creates a new netns without setting it.  However,
according to the docs it's not the case:

    package netns // import "github.com/vishvananda/netns"

    func New() (ns NsHandle, err error)
        New creates a new network namespace, sets it as current and returns
        a handle to it.

This meant that we changed the netns before locking the OS thread which
could result in other Go runtime threads running in the test netns.

Fixes: b059c31 ("daemon: Add unit tests for device detection")
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: André Martins <andre@cilium.io>
@aanm
Copy link
Member Author

aanm commented Nov 30, 2020

test-backport-1.8

@aanm aanm force-pushed the pr/v1.8-backport-2020-11-30 branch from 3346230 to 1973d24 Compare November 30, 2020 12:30
@aanm
Copy link
Member Author

aanm commented Nov 30, 2020

test-backport-1.8

@aanm aanm force-pushed the pr/v1.8-backport-2020-11-30 branch from 1973d24 to 4af4ed5 Compare November 30, 2020 16:29
Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backport of my commit is trivially correct 👍

@joestringer
Copy link
Member

Several RuntimeFQDNPolicies test failures, suggests something is off with the backport of PR #14110. I looked around and it seems like the v1.7 and master versions hit no such issue so it's presumably either some missing commit or some unintended change specific to this backport:

https://jenkins.cilium.io/job/Cilium-PR-Runtime-4.9/2802/

Cannot access to allowed DNS name "http://world1.cilium.test"
Expected command: docker exec -i  app1 curl --path-as-is -s  -D /dev/stderr --output /dev/stderr -w '%{http_code}' --connect-timeout 5 http://world1.cilium.test 
To succeed, but it failed:
Exitcode: 28 
Err: Process exited with status 28
Stdout:
 	 000
Stderr:

@jrajahalme
Copy link
Member

jrajahalme commented Dec 1, 2020

Looks like the failure is releted to DNS poller which was removed in 1.9 (and master), so the upstream commit misses code that is needed. It is in v1.7 commits, found this via visual comparison of the commits:

This PR:

	_, _, err := poller.ruleManager.UpdateGenerateDNS(ctx, lookupTime, updatedDNSIPs)
	return err

v1.7:

	_, newlyAllocatedIdentities, err := poller.ruleManager.UpdateGenerateDNS(ctx, lookupTime, updatedDNSIPs)
	if err == nil {
		ipcache.UpsertGeneratedIdentities(newlyAllocatedIdentities)
	}
	return err

Looks like this happened during manual fixing of compile failures after backports were cherry-picked. I'll push a fix ASAP.

@jrajahalme
Copy link
Member

test-backport-1.8

[ upstream commit 60bd47f ]

Add a map for newly allocated identities to ipcache.AllocateCIDR
functions that the caller can use to upsert the IPs to ipcache later,
after affected endpoint policy maps have been updated.

Use this new functionality on the DNS proxy code path, that makes sure
that new policy map entries are in place before an IP received from a
DNS server is placed in ipcache. This is really straightforward as the
logic for waiting was already in place for delaying the forwarding of
the DNS response.

Policy update path is still allowing ipcache upserts at policy
ingestion time rather than waiting for the policy maps to be
updated. This means that new, more specific CIDRs (e.g., 10.0.0/24) in
policies can still cause momentary drops on traffic currently using a
less specific CIDR (e.g., 10.0/16).

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
… regenerated by endpoints.

[ upstream commit 8f20d3b ]

Move ipcache CIDR upserts and releases to the policy reaction queue,
where upserts can be executed after regenerations have been completed,
i.e. after endpoint policy maps have been updated. This way IP
addresses are mapped to newly allocated identities only after endpoint
policy maps are ready to classify them.

Correspondingly, on deletes the to-be-deleted CIDR identities are
first deleted from ipcache so that when they are deleted from endpoint
policy maps they are no longer used in classification. Releases of
CIDR identities must still be serialized with ipcache upserts via the
policy reaction queue so that they are executed in the same order
w.r.t. ipcache upserts as policy deletes and adds.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: André Martins <andre@cilium.io>
@joestringer
Copy link
Member

Fixed trivial build conflict, re-running CI.

@joestringer
Copy link
Member

test-backport-1.8

@aanm
Copy link
Member Author

aanm commented Dec 2, 2020

retest-runtime

1 similar comment
@jrajahalme
Copy link
Member

retest-runtime

@jrajahalme
Copy link
Member

Restarted runtime tests due to provisioning fail.

@aanm aanm merged commit c15fca6 into v1.8 Dec 2, 2020
@aanm aanm deleted the pr/v1.8-backport-2020-11-30 branch December 2, 2020 17:43
@brb brb mentioned this pull request Dec 2, 2020
@aanm aanm mentioned this pull request Dec 4, 2020
@brb brb mentioned this pull request Dec 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants