Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.7 backports 2020-11-23 #14140

Merged
merged 8 commits into from
Nov 30, 2020
Merged

v1.7 backports 2020-11-23 #14140

merged 8 commits into from
Nov 30, 2020

Conversation

christarazi
Copy link
Member

@christarazi christarazi commented Nov 23, 2020

Once this PR is merged, you can update the PR labels via:

$ for pr in 10440 11684 13912 12832; do contrib/backporting/set-labels.py $pr done 1.7; done

@christarazi christarazi requested a review from a team as a code owner November 23, 2020 21:39
@christarazi christarazi added backport/1.7 kind/backports This PR provides functionality previously merged into master. labels Nov 23, 2020
@christarazi christarazi requested a review from brb November 23, 2020 21:40
@christarazi
Copy link
Member Author

test-backport-1.7

@christarazi
Copy link
Member Author

test-backport-1.7

@christarazi
Copy link
Member Author

christarazi commented Nov 24, 2020

@brb Looks like #13912 requires a bit more diligence to backport. The latest compilation error I have is because of signal.SignalWakeGC which was introduced by #11684. Would you mind having a look into finishing this backport? Feel free to push directly to this branch.

@@ -162,7 +162,7 @@ func LookupElementFromPointers(fd int, structPtr unsafe.Pointer, sizeOfStruct ui
}

if ret != 0 || err != 0 {
return fmt.Errorf("Unable to lookup element in map with file descriptor %d: %s", fd, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%w is fairly recent addition, does version of Go use in 1.7 support it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like 1.7 is on Go 1.13, so luckily it was the first version where this feature was added!

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for my changes, thanks.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will take care of this backport.

@brb brb force-pushed the pr/v1.7-backport-2020-11-23 branch from cda179c to 11055b3 Compare November 25, 2020 13:16
@brb
Copy link
Member

brb commented Nov 25, 2020

test-backport-1.7

@brb
Copy link
Member

brb commented Nov 25, 2020

kube-system cilium-twv6q 0/1 ImagePullBackOff 0 10m 192.168.36.11 k8s1 <none> :-(

@brb
Copy link
Member

brb commented Nov 25, 2020

test-backport-1.7

3 similar comments
@brb
Copy link
Member

brb commented Nov 26, 2020

test-backport-1.7

@brb
Copy link
Member

brb commented Nov 26, 2020

test-backport-1.7

@brb
Copy link
Member

brb commented Nov 27, 2020

test-backport-1.7

@brb
Copy link
Member

brb commented Nov 27, 2020

Oh, snap, more stuff needs to be backported:

Stderr:
 	 # github.com/cilium/cilium/pkg/maps/ctmap [github.com/cilium/cilium/pkg/maps/ctmap.test]
	 pkg/maps/ctmap/ctmap_privileged_test.go:226:15: undefined: mapTypeIPv4AnyGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:229:14: undefined: newMap
	 pkg/maps/ctmap/ctmap_privileged_test.go:229:35: undefined: mapTypeIPv4AnyGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:235:15: undefined: mapTypeIPv4TCPGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:238:14: undefined: newMap
	 pkg/maps/ctmap/ctmap_privileged_test.go:238:35: undefined: mapTypeIPv4TCPGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:372:15: undefined: mapTypeIPv6AnyGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:375:16: undefined: newMap
	 pkg/maps/ctmap/ctmap_privileged_test.go:375:37: undefined: mapTypeIPv6AnyGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:381:15: undefined: mapTypeIPv6TCPGlobal
	 pkg/maps/ctmap/ctmap_privileged_test.go:381:15: too many errors
	 make: *** [tests-privileged] Error 1

tklauser and others added 8 commits November 27, 2020 16:46
[ upstream commit b563284 ]

All of these are not used outside the ctmap package.

Also make the mapTypeIP* consts typed to avoid type conversions when
using them.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit fc5a3bd ]

There are users on 4.9 kernels which are suffering connectivity loss since
the CT map is full and GC doesn't trigger yet. We can help improving the
situation with the same framework we set in place for NAT when under stress.
Upon insertion error, send a signal to the agent in order to trigger GC so
that it can free up old entries.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit 24fcbe6 ]

Rework the 1:1 relationship with signal to channel and instead allow different
signals from BPF datapath for the same go channel. This is useful so we can push
the different BPF signals into the metric collection. Wire-up Signal{CT,Nat}FillUp
signal into the SignalWakeGC channel.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
[ upstream commit c9810bf ]

[ Backporter's note: Resolved slight conflicts with (*Map).DumpEntries
                     and removed reference to NodePort hybrid mode which
                     doesn't exist in v1.7 branch. ]

This commit adds a mechanism to remove orphan SNAT entries.

We call an SNAT entry orphan if it does not have either a corresponding
CT entry or an SNAT entry in a reverse order. Both cases can happen due
to LRU eviction heuristics (both CT and NAT maps are of the LRU type).

The mechanism for the removal is based on the GC signaling in the
datapath. When the datapath SNAT routine fails to find a free mapping
after SNAT_SIGNAL_THRES attempts, it sends the signal via the perf ring
buffer. The consumer of the buffer is the daemon. After receiving the
signal it invokes the CT GC.

The newly implemented GC addition iterates over all SNAT entries and
checks whether a corresponding CT entry is found, and if not, it tries
to remove both SNAT entries (for original and reverse flows).

For now, I didn't add GC of orphan SNAT entries created by DSR to keep
complexity of changes as low as possible. This will come as a follow up.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit 0c83c28 ]

[ Backporter's notes: Resolved conflict with exporting MapType* consts
                      which were unexported in master branch. ]

Previously, after receiving the signal from the datapath, we iterated
NAT map twice: first to compare against CT TCP map, second - against CT
any map.

Obviously, doing the iterations two times was inefficient. This commit
fixes that by passing both CT {TCP,any} maps to the NAT GC routine. This
allows the NAT GC to iterate once.

Suggested-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit 0e220cf ]

[ Backporter's notes: The upstream commit is coming from v1.8 branch, as
                      the fix was not needed on v1.9 and master. See
                      #14022 (comment)
                      and
                      #13912 (comment) ]

To fix the ctmap privileged test failure, the following needs to be
applied (otherwise, if ...; errors.Is(err, unix.ENOENT) is always false
in the PurgeOrphanNATEntries(); the change was introduced in v1.9 by
2283103):

Signed-off-by: Tom Payne <tom@isovalent.com>
…onstant

[ upstream commit e4bf8ca ]

Signed-off-by: ArthurChiao <arthurchiao@hotmail.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
[ upstream commit 57784e3 ]

[ Backporter's notes: Resolved conflict with gcFamily const types. ]

Signed-off-by: ArthurChiao <arthurchiao@hotmail.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
@brb brb force-pushed the pr/v1.7-backport-2020-11-23 branch from 11055b3 to e708d24 Compare November 27, 2020 15:46
@brb
Copy link
Member

brb commented Nov 27, 2020

test-backport-1.7

1 similar comment
@brb
Copy link
Member

brb commented Nov 27, 2020

test-backport-1.7

@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Nov 27, 2020
@aanm aanm merged commit 54444f3 into v1.7 Nov 30, 2020
@aanm aanm deleted the pr/v1.7-backport-2020-11-23 branch November 30, 2020 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants