New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.6 backports 2020-02-17 #10223
v1.6 backports 2020-02-17 #10223
Conversation
@@ -116,8 +118,8 @@ func prefixMatchesKey(prefix, key string) bool { | |||
|
|||
// NewKVStoreBackend creates a pkg/allocator.Backend compatible instance. The | |||
// specific kvstore used is configured in pkg/kvstore. | |||
func NewKVStoreBackend(basePath, suffix string, typ allocator.AllocatorKey) (*kvstoreBackend, error) { | |||
if kvstore.Client() == nil { | |||
func NewKVStoreBackend(basePath, suffix string, typ allocator.AllocatorKey, backend kvstore.BackendOperations) (*kvstoreBackend, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported func NewKVStoreBackend returns unexported type *allocator.kvstoreBackend, which can be annoying to use
@@ -116,8 +118,8 @@ func prefixMatchesKey(prefix, key string) bool { | |||
|
|||
// NewKVStoreBackend creates a pkg/allocator.Backend compatible instance. The | |||
// specific kvstore used is configured in pkg/kvstore. | |||
func NewKVStoreBackend(basePath, suffix string, typ allocator.AllocatorKey) (*kvstoreBackend, error) { | |||
if kvstore.Client() == nil { | |||
func NewKVStoreBackend(basePath, suffix string, typ allocator.AllocatorKey, backend kvstore.BackendOperations) (*kvstoreBackend, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported func NewKVStoreBackend returns unexported type *allocator.kvstoreBackend, which can be annoying to use
@@ -116,8 +118,8 @@ func prefixMatchesKey(prefix, key string) bool { | |||
|
|||
// NewKVStoreBackend creates a pkg/allocator.Backend compatible instance. The | |||
// specific kvstore used is configured in pkg/kvstore. | |||
func NewKVStoreBackend(basePath, suffix string, typ allocator.AllocatorKey) (*kvstoreBackend, error) { | |||
if kvstore.Client() == nil { | |||
func NewKVStoreBackend(basePath, suffix string, typ allocator.AllocatorKey, backend kvstore.BackendOperations) (*kvstoreBackend, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported func NewKVStoreBackend returns unexported type *allocator.kvstoreBackend, which can be annoying to use
never-tell-me-the-odds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for my PRs
61f84a8
to
46df1c0
Compare
never-tell-me-the-odds |
46df1c0
to
fb55d0d
Compare
never-tell-me-the-odds |
fb55d0d
to
194d0ed
Compare
test-me-please |
194d0ed
to
d541170
Compare
test-me-please |
never-tell-me-the-odds |
test-me-please |
d541170
to
06c53ed
Compare
never-tell-me-the-odds |
test-me-please |
1 similar comment
test-me-please |
seems like Cilium was not running in runtime vm |
test-me-please |
1 similar comment
test-me-please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for my clustermesh stuff
@@ -36,7 +36,7 @@ var kvstoreDeleteCmd = &cobra.Command{ | |||
Fatalf("Unable to delete keys: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this spot, but this backport is missing deleting all the package global functions in pkg/kvstore/kvstore.go
(76c7fca#diff-fb5855a8577b97a3f31a2f1fe6c75009L49-L169)
@@ -143,15 +146,15 @@ func (k *kvstoreBackend) DeleteAllKeys() { | |||
kvstore.Client().DeletePrefix(k.basePrefix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops! I forgot to change this in the original PR. I'll make a follow up :/
test-me-please |
@nebril the tests have been failing for the runtime VM for the same reason, this is probably a regression. I would suggest to drop the PRs that had conflicts, rebase against 1.6, test again and open a new PR with the PRs that had conflicts. |
@aanm yeah, got really unlucky with provisioning errors on this PR, wanted to make sure this isn't the underlying issue |
[ upstream commit f868f97 ] We need to avoid using the package level functions but need to keep the tracing for debug. Moving the trace calls into the kvstore backends is verbose but is the simplest way to do this. Signed-off-by: Ray Bejjani <ray@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 76c7fca ] We need to use different backends at the same time. Currently we assume a single connected backend instance by using the package level functions. We now direclty call these functions on the default client instead, allowing future changes to use non-global instances of the backends. Signed-off-by: Ray Bejjani <ray@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 913ed22 ] We previously only used single pkg/kvstore/allocator.kvstoreBackend instance system-wide. This also used a single, global, etcd/consul backend instance (in this case a backend to allocator.kvstoreBackend, itself the kvstore backend for the identity allocator). With clustermesh, cilium-agent needs to sync with remote etcd instances. This means that multiple etcd clients can exist, and that the identity allocator may use different ones. This is now possible by keeping an internal reference instead of using the global one. Signed-off-by: Ray Bejjani <ray@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit f28f6cd ] Clustermesh remote-cluster watching accidentally used the local caching identity allocator in cilium-agent. This meant that while ipcache and nodes were sycned correctly, identities were not. This was because the watch was called on the main identity allocator, adding itself to itself as a remote cluster. This resulted in two connections to the local etcd, and double entries when listing identities. A new allocator instance is created to use the remote cluster etcd backend. This is then added to the main allocators remote clusters list. Signed-off-by: Ray Bejjani <ray@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 5e01fb6 ] If "cilium bpf nat list" fails, then we will be able to rely to the raw dumps when debugging. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 8408128 ] When Cilium performs a connection tracking garbage collection it will also purge NAT entries from the NAT bpf map. To perform this action, Cilium will execute BPF map lookups from userspace by giving the memory location of a golang structure, previously NatEntry4 now NatEntry6. The kernel will then write into this memory location the value of the NAT entry being looked up. As the size of NatEntry4 is only 38 bytes, as oppose the NatEntry6 which is 50 bytes, the kernel would then write 12 bytes in random memory locations of the Cilium agent. This could cause Cilium agent to crash or to have errors such as: ``` msg="Cannot create CEP" ... error="CiliumEndpoint.cilium.io \"��\\x00\\x00\\x00\\x00\\x00\\x00-multi-node-headless-854b65674d-lj45r\" is invalid: metadata.name: Invalid value: \"��\\x00\\x00\\x00\\x00\\x00\\x00-multi-node-headless-854b65674d-lj45r\": ``` Fixes: b9d2a0a ("maps/nat: Add NatKey{4,6} types") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 5b9002d ] Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 4630078 ] Fix the space hack which stopped working with make v4.3 (works with v4.2 though): [..] " [-DENABLE_HOST_REDIRECT-DENABLE_IPV4-DENABLE_IPV6-DENABLE_NAT46]"; clang -DENABLE_HOST_REDIRECT-DENABLE_IPV4-DENABLE_IPV6-DENABLE_NAT46 -I/home/brb/sandbox/gopath/src/github.com/cilium/cilium/bpf/include -I/home/brb/sandbox/gopath/src/github.com/cilium/cilium/bpf -D__NR_CPUS__=8 -O2 -g -target bpf -emit-llvm -Wall -Werror -Wno-address-of-packed-member -Wno-unknown-warning-option -c bpf_lxc.c -o bpf_lxc.ll; llc -march=bpf -mcpu=probe -mattr=dwarfris -o /dev/null bpf_lxc.ll; \ fi In file included from <built-in>:323: <command line>:1:20: error: ISO C99 requires whitespace after the macro name [-Werror,-Wc99-extensions] #define ENABLE_IPV4-DHAVE_LPM_MAP_TYPE 1 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 640b7a6 ] Use the top-of-tree VERSION file to generate the chart versions and update the pull policy using the following rules: * Set the helm chart versions to the VERSION in the file * If the VERSION file ends with ".90": - Set the cilium tag to 'latest' - Set the pullPolicy to 'Always' * If the VERSION file does not end with ".90": - Set the cilium tag to the VERSION in the file - Set the pullPolicy to 'IfNotPresent' * Set the managed-etcd version tag to the version specified at the top of this Makefile. This must be manually bumped, it does not appear to follow the standard Cilium docker image tag process. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit da2e652 ] spanStart of waitConsumeOffQueue could be read before being written in case the buffered event was executed before the execution of waitConsumeOffQueue.Start() in the modified lines of this commit. To fix this we should execute waitConsumeOffQueue.Start() even before the event is put into the queue. Although it does not give the correct span stat, the developer or user can derive it by subtracting the waitEnqueue span to retrieve the real waitConsumeOffQueue span. Fixes: add0d65 ("add eventqueue package") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 4b61dd5 ] getState function should only be called when the endpoint mutex is being held which was not always the case. Changed the function call so that it can be accessed concurrently for the data race detected bellow: ``` ================== WARNING: DATA RACE Write at 0x00c000544f30 by goroutine 56: github.com/cilium/cilium/pkg/endpoint.(*Endpoint).SetStateLocked() /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:1641 +0xe9 github.com/cilium/cilium/pkg/endpoint.(*Endpoint).LeaveLocked() /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:1386 +0x333 main.(*Daemon).deleteEndpointQuiet() /go/src/github.com/cilium/cilium/daemon/endpoint.go:651 +0x3e4 main.(*Daemon).regenerateRestoredEndpoints.func2() /go/src/github.com/cilium/cilium/daemon/state.go:388 +0x49 Previous read at 0x00c000544f30 by goroutine 165: github.com/cilium/cilium/pkg/endpointmanager.releaseID() /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:1549 +0xb5 github.com/cilium/cilium/pkg/endpointmanager.Remove.func1() /go/src/github.com/cilium/cilium/pkg/endpointmanager/manager.go:335 +0x88 Goroutine 56 (running) created at: main.(*Daemon).regenerateRestoredEndpoints() /go/src/github.com/cilium/cilium/daemon/state.go:382 +0x916 main.(*Daemon).initRestore() /go/src/github.com/cilium/cilium/daemon/state.go:454 +0xf3 main.runDaemon() /go/src/github.com/cilium/cilium/daemon/daemon_main.go:1441 +0xa17 main.NewDaemon() /go/src/github.com/cilium/cilium/daemon/daemon.go:894 +0x23a4 main.runDaemon() /go/src/github.com/cilium/cilium/daemon/daemon_main.go:1396 +0x314 main.glob..func1() /go/src/github.com/cilium/cilium/daemon/daemon_main.go:121 +0xbf github.com/cilium/cilium/vendor/github.com/spf13/cobra.(*Command).execute() /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:766 +0x8eb github.com/cilium/cilium/vendor/github.com/spf13/cobra.(*Command).ExecuteC() /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:850 +0x41b main.daemonMain() /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:800 +0x237 main.main() /go/src/github.com/cilium/cilium/daemon/main.go:18 +0x2f Goroutine 165 (finished) created at: github.com/cilium/cilium/pkg/endpointmanager.Remove() /go/src/github.com/cilium/cilium/pkg/endpointmanager/manager.go:324 +0x121 main.(*Daemon).deleteEndpointQuiet() /go/src/github.com/cilium/cilium/daemon/endpoint.go:614 +0x1a8 main.(*Daemon).regenerateRestoredEndpoints.func2() /go/src/github.com/cilium/cilium/daemon/state.go:388 +0x49 ================== ``` Fixes: f71d87a ("endpointmanager: signal when work is done") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 69a6fce ] Besides the ability to delete, the ipcache garbage collector also requires the ability to dump the table. Add this to the existing probe which feeds `SupportsDelete()`. level=debug msg="Detected IPCache delete operation support: true" subsys=map-ipcache level=debug msg="Detected IPCache dump operation support: true" subsys=map-ipcache Fixes: #10080 Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit 6b9559e ] Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
[ upstream commit c58d749 ] * Handling of the proxy redirection was missing entirely in the to-container section for IPv6. Add it. * The to-container section case was assuming that any proxy redirection is indicated by ipv{46}_policy() returning a non-zero proxy port. This is no longer true since commit 830adba. Fix this by using a separate return code to indicate proxy redirection and treating the proxy port as optional. The above deficits lead to proxy redirection being ineffective when the setting EnableEndpointRoutes was set. Fixes: 830adba ("bpf: Support proxy using original source address and port.") Fixes: 25a80df ("bpf: Add to-container section to bpf_lxc") Fixes: #10105 Signed-off-by: Thomas Graf <thomas@cilium.io> Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
06c53ed
to
63ac0dc
Compare
never-tell-me-the-odds |
Seems like #10168 backport broke the CI, removed it from this PR, will create separate PR with just this backport. |
Hit #10270 , retrying k8s tests. |
test-missed-k8s |
Once this PR is merged, you can update the PR labels via:
Note: dropped last commit from #10185, as
status
method was not available in 1.6 branchThis change is