Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

operator: only GC identity keys of its own cluster #16825

Merged
merged 1 commit into from Oct 2, 2021

Conversation

ArthurChiao
Copy link
Contributor

Ref: #16805

Signed-off-by: ArthurChiao arthurchiao@hotmail.com

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 8, 2021
@pchaigno pchaigno added area/clustermesh Relates to multi-cluster routing functionality in Cilium. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Jul 8, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 8, 2021
@ArthurChiao ArthurChiao force-pushed the gc_incluster_identities_only branch 3 times, most recently from 3e03023 to 93a4162 Compare July 15, 2021 08:36
@ArthurChiao ArthurChiao marked this pull request as ready for review July 15, 2021 14:40
@ArthurChiao ArthurChiao requested a review from a team July 15, 2021 14:40
@ArthurChiao ArthurChiao requested review from a team as code owners July 15, 2021 14:40
@pchaigno pchaigno added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Jul 27, 2021
pkg/kvstore/allocator/allocator_test.go Outdated Show resolved Hide resolved
pkg/option/config.go Outdated Show resolved Hide resolved
@ArthurChiao ArthurChiao requested a review from twpayne July 28, 2021 07:45
pkg/option/config.go Outdated Show resolved Hide resolved
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! A few minor comments

pkg/allocator/allocator.go Outdated Show resolved Hide resolved
pkg/kvstore/allocator/allocator.go Outdated Show resolved Hide resolved
pkg/kvstore/allocator/allocator.go Outdated Show resolved Hide resolved
pkg/kvstore/allocator/allocator.go Outdated Show resolved Hide resolved
pkg/kvstore/allocator/allocator.go Outdated Show resolved Hide resolved
@maintainer-s-little-helper

This comment has been minimized.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Aug 17, 2021
@maintainer-s-little-helper

This comment has been minimized.

1 similar comment
@maintainer-s-little-helper

This comment has been minimized.

@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Aug 17, 2021
@pchaigno pchaigno removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Aug 17, 2021
@tklauser
Copy link
Member

test-me-please

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@joestringer
Copy link
Member

I note that all of the Jenkins-based CI runs were successful (except test-gke), while the GHA-based CI runs were not successful. @ArthurChiao would you mind rebasing once more so we can evaluate whether the GHA-based actions were just flaky at the time or whether they were potentially revealing a real issue with the PR?

@pchaigno
Copy link
Member

Rebasing should fix the current AKS error. Not sure for the rest. We do still have lots of flakes in GitHub workflows.

Fix: cilium#16805

Signed-off-by: ArthurChiao <arthurchiao@hotmail.com>
@aanm
Copy link
Member

aanm commented Sep 15, 2021

test-me-please

Job 'Cilium-PR-K8s-GKE' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sKafkaPolicyTest Kafka Policy Tests KafkaPolicies

Failure Output

FAIL: Kafka Pods are not ready after timeout

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-GKE so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-1.16-net-next' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sServicesTest Checks service across nodes Tests NodePort BPF Tests with direct routing Tests NodePort

Failure Output

FAIL: Can not connect to service "tftp://192.168.36.12:30669/hello" from outside cluster (1/10)

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.16-net-next so I can create a new GitHub issue to track it.

@aanm aanm merged commit e646d3e into cilium:master Oct 2, 2021
@jrajahalme
Copy link
Member

jrajahalme commented Oct 6, 2021

@aanm Unfortunately this breaks multi-cluster with a fatal warning in the logs:

level=fatal msg="Unable to initialize Identity Allocator with backend crd" error="maximum ID must be greater than minimum ID" subsys=identity-cache

This happens whenever --cluster-id is set to non-default value (I tested with --cluster-id=1).

On the first order this is due to the new WithMin() and WithMax() functions allowing max to be set to less than the min and min to be set greater than min.

But more fundamentally, NewAllocator() should not limit the range of the IDs by itself, as its callers are passing in options to set the min, max and mask as they see fit. It appears that the fix in this PR should have been made one level up in the call stack.

As this is now this PR needs to be reverted from master.

@jrajahalme jrajahalme mentioned this pull request Oct 6, 2021
9 tasks
nbusseneau added a commit to nbusseneau/cilium that referenced this pull request Oct 6, 2021
This reverts commit e646d3e.

Original PR: cilium#16825
Rationale: breaks multicluster functionality at install time with

```
level=fatal msg="Unable to initialize Identity Allocator with backend crd" error="maximum ID must be greater than minimum ID" subsys=identity-cache
```

when `--cluster-id` is set to a non-default value (e.g. `1`, as in our
CI).

Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
joestringer pushed a commit that referenced this pull request Oct 6, 2021
This reverts commit e646d3e.

Original PR: #16825
Rationale: breaks multicluster functionality at install time with

```
level=fatal msg="Unable to initialize Identity Allocator with backend crd" error="maximum ID must be greater than minimum ID" subsys=identity-cache
```

when `--cluster-id` is set to a non-default value (e.g. `1`, as in our
CI).

Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
@ArthurChiao
Copy link
Contributor Author

Thanks, I'll look at this tomorrow @jrajahalme

@ArthurChiao ArthurChiao deleted the gc_incluster_identities_only branch October 12, 2021 08:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants