policy: Fix selector identity release for FQDN #18166

Merged: 2 commits merged into cilium:master from submit/fix-18023 on Dec 9, 2021

Conversation

@joestringer (Member) commented Dec 8, 2021

Alexander reports in issue #18023 that establishing a connection
via an FQDN policy, then modifying that FQDN policy, would cause
subsequent traffic to the FQDN to be dropped, even if the new policy
still allowed the same traffic via a toFQDNs statement.

This was caused by overzealous release of CIDR identities while
generating a new policy. Although the policy calculation itself keeps
all SelectorCache entries alive during the policy generation phase (see
cachedSelectorPolicy.setPolicy()), after the new policy is inserted
into the PolicyCache, the distillery package would clean up the old
policy. As part of that cleanup, it would call RemoveSelectors() to
release the individual selectors of the old policy.

The previous implementation of this logic unintentionally released the
underlying identities any time a user of a selector was released, rather
than only releasing the underlying identities when the number of users
reached zero and the selector itself would be released. This meant that
rather than the SelectorCache retaining references to the underlying
identities when a policy was updated, the references would be
released (and all corresponding eBPF resources cleaned up) at the end of
the process. This then triggered subsequent connectivity outages.

Fix it by only releasing the identity references once the cached
selector itself is removed from the SelectorCache.

Fixes: f559cf1 ("selectorcache: Release identities on selector removal")
Reported-by: Alexander Block <ablock84@gmail.com>
Suggested-by: Jarno Rajahalme <jarno@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>

Fixes: #18023
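
For illustration, here is a minimal Go sketch (not Cilium's actual types; all names are hypothetical) of the release-on-last-user pattern that the fix restores: identity references held by a cached selector are dropped only when its last user goes away and the selector itself leaves the cache, never on each individual user removal.

```go
// Hypothetical sketch of the "release on last user" pattern; the names here
// (selectorCache, RemoveUser, identityAllocator, ...) are illustrative and
// are not Cilium's actual API.
package main

import "fmt"

// identityAllocator stands in for the component that owns identity references.
type identityAllocator interface {
	Release(ids []int)
}

// selector caches the identities selected on behalf of one or more policies.
type selector struct {
	users      map[string]struct{} // policies currently using this selector
	identities []int               // identity references held on their behalf
}

type selectorCache struct {
	alloc     identityAllocator
	selectors map[string]*selector
}

// RemoveUser drops one user of the named selector. Identity references are
// released only when the user count reaches zero and the selector itself is
// removed from the cache; releasing them on every call (the old behaviour)
// would tear down identities still needed by the replacement policy.
func (sc *selectorCache) RemoveUser(name, user string) {
	sel, ok := sc.selectors[name]
	if !ok {
		return
	}
	delete(sel.users, user)
	if len(sel.users) > 0 {
		return // another policy still references this selector; keep identities
	}
	delete(sc.selectors, name)
	sc.alloc.Release(sel.identities) // safe: no user references the selector anymore
}

type printAlloc struct{}

func (printAlloc) Release(ids []int) { fmt.Println("released identities:", ids) }

func main() {
	sc := &selectorCache{
		alloc: printAlloc{},
		selectors: map[string]*selector{
			"toFQDNs:vagrant-cache.ci.cilium.io": {
				users:      map[string]struct{}{"old-policy": {}, "new-policy": {}},
				identities: []int{16777217, 16777218}, // example CIDR identity IDs
			},
		},
	}
	sc.RemoveUser("toFQDNs:vagrant-cache.ci.cilium.io", "old-policy") // nothing released yet
	sc.RemoveUser("toFQDNs:vagrant-cache.ci.cilium.io", "new-policy") // last user gone: released
}
```

Under this scheme, a policy update that re-adds the same toFQDNs selector keeps the underlying CIDR identities (and their eBPF state) alive across the swap, which is the behaviour the connectivity test below verifies.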

@joestringer joestringer requested a review from a team as a code owner December 8, 2021 02:54
@joestringer joestringer requested a review from a team December 8, 2021 02:54
@joestringer joestringer added needs-backport/1.10 release-note/bug This PR fixes an issue in a previous release of Cilium. labels Dec 8, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 8, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 8, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.6 Dec 8, 2021
@joestringer (Member, Author):

/test

Test that when FQDN policy is updated to a new policy which still
selects the same old FQDN destination, connectivity continues to work.

Validated to fail without the previous commit:

    K8sFQDNTest
    /home/joe/git/cilium/test/ginkgo-ext/scopes.go:473
      Validate that FQDN policy continues to work after being updated [It]
      /home/joe/git/cilium/test/ginkgo-ext/scopes.go:527

      Can't connect to to a valid target when it should work
      Expected command: kubectl exec -n default app2-58757b7dd5-rh7dd --
        curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5
            --max-time 20 --retry 5 http://vagrant-cache.ci.cilium.io
            -w "time-> DNS: '%{time_namelookup}(%{remote_ip})',
                Connect: '%{time_connect}',
                Transfer '%{time_starttransfer}',
                total '%{time_total}'"
      To succeed, but it failed:
      Exitcode: 28
      Err: exit status 28
      Stdout:
             time-> DNS: '0.000016()', Connect: '0.000000',Transfer '0.000000', total '5.000415'
      Stderr:
             command terminated with exit code 28

Signed-off-by: Joe Stringer <joe@cilium.io>
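
For reference, a rough sketch of the shape of such a test, written with plain Ginkgo, Gomega, and kubectl rather than Cilium's own test framework (the pod name, policy manifest file names, and suite bootstrap are assumptions for illustration):

```go
// Illustrative sketch only; the real K8sFQDNTest uses Cilium's test framework.
// The pod name, policy manifests, and target URL below are assumptions.
package fqdn

import (
	"os/exec"
	"testing"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

func TestFQDNPolicyUpdate(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "FQDN policy update")
}

// kubectl runs a kubectl command and returns any execution error.
func kubectl(args ...string) error {
	return exec.Command("kubectl", args...).Run()
}

// curlFrom curls the target URL from inside the given pod.
func curlFrom(pod, url string) error {
	return kubectl("exec", "-n", "default", pod, "--",
		"curl", "--fail", "--connect-timeout", "5", "--max-time", "20", url)
}

var _ = Describe("FQDN policy update", func() {
	const (
		clientPod = "app2"                              // hypothetical client pod
		target    = "http://vagrant-cache.ci.cilium.io" // FQDN from the failure above
	)

	It("continues to allow traffic after the policy is replaced", func() {
		// Apply the initial toFQDNs policy and confirm connectivity.
		Expect(kubectl("apply", "-f", "fqdn-policy.yaml")).To(Succeed())
		Expect(curlFrom(clientPod, target)).To(Succeed())

		// Replace it with a policy that still selects the same FQDN
		// (the same policy plus one extra toFQDNs statement).
		Expect(kubectl("apply", "-f", "fqdn-policy-updated.yaml")).To(Succeed())

		// Before the fix, the selector's CIDR identities were released
		// during the update and this second check timed out.
		Expect(curlFrom(clientPod, target)).To(Succeed())
	})
})
```

Before the fix, the second connectivity check is the one that fails, matching the curl timeout (exit code 28) shown above.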
@joestringer (Member, Author):

Forgot to include the new policy since it was all working locally; the new version includes the replacement policy (same as the original one, with just one extra toFQDNs statement).

@joestringer (Member, Author):

/test

@christarazi (Member) left a comment:


🚀

@nbusseneau (Member) commented Dec 8, 2021

/test-gke

GKE had failed with timeouts while retrieving the Quay images, retriggering.

@nbusseneau (Member):

K8s 1.23 / 4.9 hit an issue that looks similar to #13552, though the full stacktrace is a bit different:

Stacktrace
/home/jenkins/workspace/Cilium-PR-K8s-1.23-kernel-4.9/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:465
cilium pre-flight checks failed
Expected
    <*errors.errorString | 0xc002160ca0>: {
        s: "Cilium validation failed: 4m0s timeout expired: Last polled error: controllers are failing: cilium-agent 'cilium-7cszc': controller ipcache-inject-labels is failing: Exitcode: 0 \nStdout:\n \t KVStore:                Ok   Disabled\n\t Kubernetes:             Ok   1.23+ (v1.23.0-rc.0) [linux/amd64]\n\t Kubernetes APIs:        [\"cilium/v2::CiliumClusterwideNetworkPolicy\", \"cilium/v2::CiliumNetworkPolicy\", \"cilium/v2::CiliumNode\", \"cilium/v2alpha1::CiliumEndpointSlice\", \"core/v1::Namespace\", \"core/v1::Node\", \"core/v1::Pods\", \"core/v1::Service\", \"discovery/v1::EndpointSlice\", \"networking.k8s.io/v1::NetworkPolicy\"]\n\t KubeProxyReplacement:   Disabled   \n\t Host firewall:          Disabled\n\t Cilium:                 Ok   1.11.90 (v1.11.90-96844eb)\n\t NodeMonitor:            Listening for events on 3 CPUs with 64x4096 of shared memory\n\t Cilium health daemon:   Ok   \n\t IPAM:                   IPv4: 6/254 allocated from 10.0.0.0/24, IPv6: 6/254 allocated from fd02::/120\n\t BandwidthManager:       Disabled\n\t Host Routing:           Legacy\n\t Masquerading:           IPTables [IPv4: Disabled, IPv6: Enabled]\n\t Controller Status:      39/40 healthy\n\t   Name                                          Last success   Last error   Count   Message\n\t   bpf-map-sync-cilium_lxc                       5s ago         never        0       no error                                                                                                                                                                    \n\t   cilium-health-ep                              59s ago        never        0       no error                                                                                                                                                                    \n\t   dns-garbage-collector-job                     2s ago         never        0       no error                                                                                                                                                                    \n\t   endpoint-1217-regeneration-recovery           never          never        0       no error                                                                                                                                                                    \n\t   endpoint-1248-regeneration-recovery           never          never        0       no error                                                                                                                                                                    \n\t   endpoint-1474-regeneration-recovery           never          never        0       no error                                                                                                                                                                    \n\t   endpoint-2615-regeneration-recovery           never          never        0       no error                                                                                                                                                                    \n\t   endpoint-2810-regeneration-recovery           never          never        0       no error                                                                                                                                                                    \n\t   endpoint-817-regeneration-recovery            never          never        0       no error                                                                            
                                                                                        \n\t   endpoint-gc                                   4m3s ago       never        0       no error                                                                                                                                                                    \n\t   ipcache-inject-labels                         4m0s ago       9s ago       22      faile...

Gomega truncated this representation as it exceeds 'format.MaxLength'.
Consider having the object provide a custom 'GomegaStringer' representation
or adjust the parameters in Gomega's 'format' package.

Learn more here: https://onsi.github.io/gomega/#adjusting-output

to be nil
/home/jenkins/workspace/Cilium-PR-K8s-1.23-kernel-4.9/src/github.com/cilium/cilium/test/k8sT/assertionHelpers.go:157

I have never seen this one; does it ring a bell to anyone?

@nbusseneau (Member):

CI 3.0 GKE workflow hit a Hubble flow listener timeout similar to #17907.

@nbusseneau (Member):

Travis ARM build hit #17444, retriggering as this looks like a transient infra issue.

@joestringer (Member, Author) commented Dec 9, 2021

@nbusseneau I looked into that failure. Taking the actual error report, unfortunately ginkgo ignores the formatting primitives inside the string and prints it as one giant long line that's hard to read... however, if we just interpret those formatting primitives (and add a bit more tasty, tasty spacing 😋), then we get:

        s: "Cilium validation failed:
        4m0s timeout expired:
        Last polled error: controllers are failing:
        cilium-agent 'cilium-7cszc':
        controller ipcache-inject-labels is failing: Exitcode: 0 
Stdout:
 	 KVStore:                Ok   Disabled
	 Kubernetes:             Ok   1.23+ (v1.23.0-rc.0) [linux/amd64]
	 Kubernetes APIs:        [\"cilium/v2::CiliumClusterwideNetworkPolicy\", \"cilium/v2::CiliumNetworkPolicy\", \"cilium/v2::CiliumNode\", \"cilium/v2alpha1::CiliumEndpointSlice\", \"core/v1::Namespace\", \"core/v1::Node\", \"core/v1::Pods\", \"core/v1::Service\", \"discovery/v1::EndpointSlice\", \"networking.k8s.io/v1::NetworkPolicy\"]
	 KubeProxyReplacement:   Disabled   
	 Host firewall:          Disabled
	 Cilium:                 Ok   1.11.90 (v1.11.90-96844eb)
	 NodeMonitor:            Listening for events on 3 CPUs with 64x4096 of shared memory
	 Cilium health daemon:   Ok   
	 IPAM:                   IPv4: 6/254 allocated from 10.0.0.0/24, IPv6: 6/254 allocated from fd02::/120
	 BandwidthManager:       Disabled
	 Host Routing:           Legacy
	 Masquerading:           IPTables [IPv4: Disabled, IPv6: Enabled]
	 Controller Status:      39/40 healthy
	   Name                                          Last success   Last error   Count   Message
	   bpf-map-sync-cilium_lxc                       5s ago         never        0       no error                                                                                                                                                                    
	   cilium-health-ep                              59s ago        never        0       no error                                                                                                                                                                    
	   dns-garbage-collector-job                     2s ago         never        0       no error                                                                                                                                                                    
	   endpoint-1217-regeneration-recovery           never          never        0       no error                                                                                                                                                                    
	   endpoint-1248-regeneration-recovery           never          never        0       no error                                                                                                                                                                    
	   endpoint-1474-regeneration-recovery           never          never        0       no error                                                                                                                                                                    
	   endpoint-2615-regeneration-recovery           never          never        0       no error                                                                                                                                                                    
	   endpoint-2810-regeneration-recovery           never          never        0       no error                                                                                                                                                                    
	   endpoint-817-regeneration-recovery            never          never        0       no error                                                                                                                                                                    
	   endpoint-gc                                   4m3s ago       never        0       no error                                                                                                                                                                    
	   ipcache-inject-labels                         4m0s ago       9s ago       22      faile...

So, it's failing (in pre-flight checks, i.e. before the actual test) due to error output in the cilium status controller fields. This is related to the kube-apiserver policy feature; Chris has been iterating to improve that feature lately, and we expect that #18150 will fix this issue.

@joestringer (Member, Author):

Given that the only failing jobs have been pinpointed to pre-existing issues (i.e. not introduced by this PR), the new test in the PR passes, and this is a bugfix, I'll merge this and kick off the backport.

@joestringer (Member, Author):

I'll fix up the checkpatch complaint in a separate PR; that one is trivial and also affects existing tests.

@joestringer joestringer merged commit 80ff05a into cilium:master Dec 9, 2021
@joestringer joestringer deleted the submit/fix-18023 branch December 9, 2021 01:03
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.6 Dec 9, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.6 Dec 9, 2021
@tklauser tklauser added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Dec 16, 2021
@joestringer joestringer added this to Backport done to v1.11 in 1.11.1 Jan 5, 2022
@christarazi christarazi added the sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. label Jun 10, 2022