Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MaintainIPPool refactor and support for aborting release handshake #18330

Merged
merged 3 commits into from
Jan 6, 2022

Conversation

hemanthmalla
Copy link
Member

@hemanthmalla hemanthmalla commented Dec 23, 2021

Imagine a scenario where a node has 2 unused IPs and pre-allocate set to one. Let's say one of the IPs is in the middle of a handshake and a new pod is scheduled on the node. The other unused IP would be allocated to the pod. Now, when the operator re-evaluates, the node is no longer considered to be in excess. Without this commit, the operator does not act further on IPs in this state. This results in a scenario where no new IPs are allocated to the node and agent cannot allocate the unused IPs because they're in the middle of a handshake.

Also refactors IP release and allocate functionality out of maintainIPPool()

Fix for a bug where unused IPs on the node cannot be allocated when IP release handshake is enabled. Adds support for aborting IP release, if the node doesn't have excess anymore.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 23, 2021
@hemanthmalla hemanthmalla marked this pull request as ready for review December 23, 2021 12:21
@hemanthmalla hemanthmalla requested review from a team and twpayne December 23, 2021 12:21
@gandro gandro added area/eni Impacts ENI based IPAM. needs-backport/1.11 release-note/bug This PR fixes an issue in a previous release of Cilium. labels Dec 23, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 23, 2021
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thank you.

Only some minor nits, nothing blocking.

pkg/ipam/node.go Outdated Show resolved Hide resolved
pkg/ipam/node.go Outdated Show resolved Hide resolved
pkg/ipam/node.go Outdated Show resolved Hide resolved
pkg/ipam/node.go Outdated Show resolved Hide resolved
@hemanthmalla
Copy link
Member Author

@gandro Thanks for the review 🙇
Incorporated all the feedback, updated the commit with changes.

@hemanthmalla
Copy link
Member Author

/ci-eks

@hemanthmalla
Copy link
Member Author

/test-runtime

@gandro
Copy link
Member

gandro commented Dec 23, 2021

/test-runtime

… node

Imagine a scenario where a node has 2 unused IPs and pre-allocate set to
1. Let's say one of the IPs is in the middle of a handshake and a new
pod is scheduled on the node. The other unused IP would be allocated to
the pod. Now, when the operator re-evaluates, the node is no longer
considered to be in excess. Without this commit, the operator does not
act further on IPs in this state. This results in a scenario where no
new IPs are allocated to the node and agent cannot allocate the unused IPs
because they're in the middle of a handshake.

Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com>
@twpayne twpayne added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 4, 2022
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. The changes LGTM, could you update the 2nd commit msg with the motivation behind the refactor?

@christarazi christarazi removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 5, 2022
@joestringer
Copy link
Member

Given the current title of the PR plus the labels, if we merge & backport & release this as part of v1.11.1, then the PR will show up in the release notes under "Bugfixes" as MaintainIPPool refactor and support for aborting release handshake. This doesn't sound like a bugfix. Could you improve the title or add a ```release-note::...``` section into the PR description that more succinctly describes the bugfix? Maybe something like "Fix bug where Cilium cannot allocate unused IPs from the ENI IPAM pool"?

…ol()

With the addition of IP release handhake, maintainIPPool() became too
long and not very readable. So, moving release and allocate logic into
their own functions.

Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com>
Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com>
@hemanthmalla
Copy link
Member Author

@christarazi @joestringer Updated the commit message and added a release note.

@joestringer
Copy link
Member

I see that this was previously marked ready-to-merge and you've only updated the commit message, so this should be good to merge now. Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/eni Impacts ENI based IPAM. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants