Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operator gc incluster identities only #17589

Merged
merged 2 commits into from Nov 22, 2021

Conversation

ArthurChiao
Copy link
Contributor

A tentative change as a follow-up of #16825 #16825 (comment).

To make the changes clear, I added a new commit on top of the previous PR.
If all things are done, I can squash them. @jrajahalme @aanm Thanks!

@ArthurChiao ArthurChiao requested a review from a team as a code owner October 13, 2021 02:26
@ArthurChiao ArthurChiao requested a review from a team October 13, 2021 02:26
@ArthurChiao ArthurChiao requested a review from a team as a code owner October 13, 2021 02:26
@ArthurChiao ArthurChiao requested review from a team, nathanjsweet and tklauser October 13, 2021 02:26
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 13, 2021
@ArthurChiao
Copy link
Contributor Author

level=error msg="Command execution failed" cmd="[ipset create cilium_node_set_v4 iphash family inet -exist]" error="exec: \"ipset\": executable file not found in $PATH" subsys=iptables
level=error msg="Command execution failed" cmd="[ipset add cilium_node_set_v4 10.30.0.19 -exist]" error="exec: \"ipset\": executable file not found in $PATH" subsys=iptables
level=error msg="Failed to add IP to ipset cilium_node_set_v4" error="exec: \"ipset\": executable file not found in $PATH" ipAddr=10.30.0.19 subsys=iptables
level=info msg="Envoy: Starting xDS gRPC server listening on /tmp/cilium-test-run2675004348/xds.sock" subsys=envoy-manager
level=error msg="Command execution failed" cmd="[ipset create cilium_node_set_v6 iphash family inet6 -exist]" error="exec: \"ipset\": executable file not found in $PATH" subsys=iptables
level=info msg="Adding local node to cluster" node="{travis-job-dc0ed932-82b3-4d32-8c61-8cac45737807 default [{InternalIP 10.30.0.19} {InternalIP fc00::10ca:1} {CiliumInternalIP 1.1.1.2} {CiliumInternalIP cafe::2}] 10.19.0.0/16 f00d::a13:0:0:0/96 1.1.1.192 cafe::58c2 0 local 0 map[] 6 }" subsys=nodediscovery
level=error msg="Command execution failed" cmd="[ipset add cilium_node_set_v6 fc00::10ca:1 -exist]" error="exec: \"ipset\": executable file not found in $PATH" subsys=iptables
level=error msg="Failed to add IP to ipset cilium_node_set_v6" error="exec: \"ipset\": executable file not found in $PATH" ipAddr="fc00::10ca:1" subsys=iptables
level=info msg="Initializing identity allocator" subsys=identity-cache
level=fatal msg="Unable to initialize Identity Allocator with backend kvstore" error="maximum ID must be greater than minimum ID" subsys=identity-cache

Seems Cilium agent failed with the ipset binary not found?

@tklauser
Copy link
Member

@ArthurChiao could you please rebase the PR on top of latest master? The newest runtime Docker images and VM images should contain ipset.

@aanm
Copy link
Member

aanm commented Oct 13, 2021

@ArthurChiao I think CI had a legitimate failure:

ipAddr="fc00::10ca:1" subsys=iptables
level=info msg="Initializing identity allocator" subsys=identity-cache
level=fatal msg="Unable to initialize Identity Allocator with backend kvstore" error="maximum ID must be greater than minimum ID" subsys=identity-cache
level=info msg="Starting IP identity watcher" subsys=ipcache
level=info msg="Got lease ID 49ec7c778584d805" subsys=kvstore
level=info msg="Got lock lease ID 49ec7c778584d808" subsys=kvstore
level=info msg="Initial etcd session established" config= endpoints="[http://127.0.0.1:4002]" subsys=kvstore
level=info msg="Successfully verified version of etcd endpoint" config= endpoints="[http://127.0.0.1:4002]" etcdEndpoint="http://127.0.0.1:4002" subsys=kvstore version=3.3.20
FAIL	github.com/cilium/cilium/daemon/cmd	2.058s
FAIL
make: *** [Makefile:205: integration-tests] Error 1

@ArthurChiao ArthurChiao force-pushed the operator_gc_incluster_identities_only branch from b3a332c to a0dfaf8 Compare October 13, 2021 13:13
@ArthurChiao
Copy link
Contributor Author

Just rebased, I'll look at the CI failure, thanks! @aanm @tklauser

@ArthurChiao ArthurChiao force-pushed the operator_gc_incluster_identities_only branch 2 times, most recently from 4b9ff36 to c551747 Compare October 14, 2021 07:23
@aanm
Copy link
Member

aanm commented Oct 18, 2021

/test

Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patch. LGTM me overall.

Would it make sense to have a struct with the max and min IDs, instead of passing them around as arguments? The initialization and checking if an id is within range could be encoded in this structure.

@@ -265,11 +266,15 @@ type Backend interface {
// After creation, IDs can be allocated with Allocate() and released with
// Release()
func NewAllocator(typ AllocatorKey, backend Backend, opts ...AllocatorOption) (*Allocator, error) {
// Calculate initial ID range with cluster ID
minID := option.Config.ClusterID<<identity.ClusterIDShift + 1
maxID := minID + (1<<identity.ClusterIDShift - 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very familiar with this code, but do ids start at 1 (not 0)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments!

Would it make sense to have a struct with the max and min IDs, instead of passing them around as arguments? The initialization and checking if an id is within range could be encoded in this structure.

I'll have a look.

Not very familiar with this code, but do ids start at 1 (not 0)?

I saw the underlying allocator has a minID of 1: https://github.com/cilium/cilium/blob/master/pkg/allocator/allocator.go#L271.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments!

Would it make sense to have a struct with the max and min IDs, instead of passing them around as arguments? The initialization and checking if an id is within range could be encoded in this structure.

I'll have a look.

To be clear, this is a suggestion, not intended to block the PR. Feel free to ignore it or leave it as a followup.

Not very familiar with this code, but do ids start at 1 (not 0)?

I saw the underlying allocator has a minID of 1: https://github.com/cilium/cilium/blob/master/pkg/allocator/allocator.go#L271.

Ah, makes sense, thanks!

Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a janitor's perspective: LGTM.

@kkourt kkourt requested a review from a team October 25, 2021 13:12
Copy link
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tklauser tklauser requested a review from aanm October 29, 2021 06:14
@tklauser tklauser added area/clustermesh Relates to multi-cluster routing functionality in Cilium. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Oct 29, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 29, 2021
pkg/kvstore/allocator/allocator.go Outdated Show resolved Hide resolved
@nathanjsweet nathanjsweet removed their request for review November 4, 2021 21:28
@pchaigno pchaigno added the kind/bug This is a bug in the Cilium logic. label Nov 5, 2021
@pchaigno pchaigno requested a review from aanm November 8, 2021 10:35
pkg/kvstore/allocator/allocator.go Outdated Show resolved Hide resolved
@ArthurChiao ArthurChiao force-pushed the operator_gc_incluster_identities_only branch from c551747 to 123654b Compare November 9, 2021 02:02
Fix: cilium#16805

Signed-off-by: ArthurChiao <arthurchiao@hotmail.com>
1. Calculate identity range after well-known identities are initialized
2. Calculate identity range more accurately
3. Include the invalid min/max identity ID in error log

Ref: 7048088

Signed-off-by: ArthurChiao <arthurchiao@hotmail.com>
@ArthurChiao ArthurChiao force-pushed the operator_gc_incluster_identities_only branch from 123654b to e3bc9c6 Compare November 9, 2021 02:05
@pchaigno pchaigno requested a review from aanm November 9, 2021 15:04
@pchaigno
Copy link
Member

pchaigno commented Nov 9, 2021

/test

Job 'Cilium-PR-Runtime-net-next' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

RuntimePrivilegedUnitTests Run Tests

Failure Output

FAIL: Failed to run privileged unit tests

If it is a flake, comment /mlh new-flake Cilium-PR-Runtime-net-next so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-1.20-kernel-5.4' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sDatapathConfig Host firewall With VXLAN

Failure Output

FAIL: Failed to reach 10.0.0.32:80 from testclient-host-7qjzm

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.20-kernel-5.4 so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-1.21-kernel-4.19' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sDatapathConfig Host firewall With VXLAN

Failure Output

FAIL: Managed to reach 192.168.36.11:69 from testserver-host-hgmh9

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.21-kernel-4.19 so I can create a new GitHub issue to track it.

@aanm
Copy link
Member

aanm commented Nov 11, 2021

/test-1.21-4.19

@aanm
Copy link
Member

aanm commented Nov 11, 2021

/test-1.20-5.4

@aanm
Copy link
Member

aanm commented Nov 11, 2021

/test-runtime

Job 'Cilium-PR-K8s-1.21-kernel-4.19' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sServicesTest Checks service across nodes Tests NodePort BPF Tests with vxlan Tests with secondary NodePort device

Failure Output

FAIL: Request from k8s1 to service http://192.168.37.12:30915 failed

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.21-kernel-4.19 so I can create a new GitHub issue to track it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants