@rbtr rbtr commented Sep 28, 2022

Reason for Change:

This starts the design train for the Subnet Scarcity feature, a solution to the issue raised in #1605.

Namely: in SWIFT, due to the batch-wise scaling of the IP Pool, CNS reserves an overhead of IPs from the Subnet on every Node. This can lead to artificial Subnet IP Exhaustion where there are insufficient unreserved IPs left in the Subnet for Nodes to join the cluster or for Pods to schedule, even when the real total Pod IP usage is less than the total Subnet IP Capacity.
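To make the arithmetic concrete, here is a minimal sketch. It assumes each Node holds IPs in whole batches (its Pod count rounded up to a multiple of the batch size, with at least one batch). The function names, the subnet capacity, the batch size, and the Pod counts are all hypothetical values chosen for illustration; none of them come from #1605.

```python
import math


def reserved_per_node(pods: int, batch: int) -> int:
    """IPs held by one Node under the assumed rule: whole batches, at least one."""
    return max(1, math.ceil(pods / batch)) * batch


def subnet_usage(pods_per_node: list[int], batch: int, capacity: int) -> dict[str, int]:
    """Compare real Pod IP usage with the IPs reserved across all Nodes."""
    used = sum(pods_per_node)
    reserved = sum(reserved_per_node(p, batch) for p in pods_per_node)
    return {"used": used, "reserved": reserved, "unreserved": capacity - reserved}


# Hypothetical cluster: 15 Nodes with 5 Pods each, batch size 16, 250 usable IPs.
summary = subnet_usage([5] * 15, batch=16, capacity=250)
print(summary)
```

With these made-up numbers only 75 of 250 IPs are in use, yet 240 are reserved, so the 10 unreserved IPs cannot cover a full batch for a sixteenth Node.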

Issue Fixed:

Requirements:

Notes:

@rbtr rbtr added enhancement cns Related to CNS. docs Documentation only labels Sep 28, 2022
@rbtr rbtr self-assigned this Sep 28, 2022
@rbtr rbtr force-pushed the proposal/subnet-scarcity branch from 851e6e1 to b21e83f Compare September 28, 2022 22:34
Comment on lines +36 to +37
'400':
  description: bad input parameter
Collaborator Author

@rsagasthya can you fill in the missing response types and codes of this API as it has been written?

Contributor

The response codes are 400 for an invalid networkId or subnetName, and 500 when retrieving the cache from the controller fails. Success is 200.
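The mapping described above could be sketched roughly as follows. This is an illustration only, not the actual handler: `response_status`, `CacheError`, and the `load_cache` callback are hypothetical names, and "invalid" is assumed here to mean an empty value.

```python
from http import HTTPStatus
from typing import Callable


class CacheError(Exception):
    """Hypothetical error raised when the controller cache cannot be read."""


def response_status(network_id: str, subnet_name: str,
                    load_cache: Callable[[], object]) -> HTTPStatus:
    # Assumption: an empty networkId or subnetName counts as invalid input.
    if not network_id or not subnet_name:
        return HTTPStatus.BAD_REQUEST
    try:
        load_cache()
    except CacheError:
        # The cache could not be retrieved from the controller.
        return HTTPStatus.INTERNAL_SERVER_ERROR
    return HTTPStatus.OK


def failing_cache() -> object:
    raise CacheError("controller unavailable")


print(int(response_status("", "subnet-a", dict)))
print(int(response_status("net-1", "subnet-a", failing_cache)))
print(int(response_status("net-1", "subnet-a", dict)))
```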

When a Pod is created, the CNI calls CNS with a request to assign an IP. If CNS is out of IPs and cannot honor that request, the CNI returns an error to the CRI, which tears down the Pod sandbox and starts over. Because these retries are stateless, CNS can only reliably conclude that it needs _at least one more_ IP: it cannot tell whether subsequent requests are retries for the same Pod or requests for many different Pods. If _many_ Pods have been scheduled, CNS still requests only a single additional batch of IPs, assigns those IPs one at a time until it runs out, then requests another single batch, and so on.

A more predictive method of IP Pool scaling will be added to CNS: CNS will watch Pods for its Node, and will request/release IPs immediately based on the number of Pods scheduled. The Batching behavior will be unchanged, and CNS will continue to request IPs in Batches $B$ based on the local IP usage.
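One way to picture the predictive behavior is the sketch below. It assumes the target is simply the scheduled Pod count rounded up to a whole number of batches of size $B$, with at least one batch held. `target_ips` and `scaling_delta` are hypothetical names, and any thresholds the real scaler applies are not modeled.

```python
import math


def target_ips(scheduled_pods: int, batch: int) -> int:
    """Assumed target: scheduled Pods rounded up to whole batches, minimum one batch."""
    return max(1, math.ceil(scheduled_pods / batch)) * batch


def scaling_delta(requested: int, scheduled_pods: int, batch: int) -> int:
    """Positive result: IPs to request now. Negative result: IPs to release now."""
    return target_ips(scheduled_pods, batch) - requested


# Hypothetical Node with B = 16 that currently holds one batch.
print(scaling_delta(requested=16, scheduled_pods=40, batch=16))  # scale up
print(scaling_delta(requested=48, scheduled_pods=10, batch=16))  # scale down
```

In the first call the Node asks for two more batches at once because it can see all 40 scheduled Pods, rather than discovering the shortfall one failed request at a time.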

Contributor

Add in handling of the race condition that Ramiro brought up on Rahul's PR

Collaborator Author

noted, I have covered this in my local working draft and will include it in the next PR addition

nairashu previously approved these changes Sep 29, 2022
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Signed-off-by: GitHub <noreply@github.com>
@rbtr rbtr merged commit 50771ed into Azure:master Sep 29, 2022
@rbtr rbtr deleted the proposal/subnet-scarcity branch September 29, 2022 22:48
rjdenney pushed a commit to rjdenney/azure-container-networking that referenced this pull request Jan 19, 2023
* stub docs/design for design proposals

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* feature proposal: subnet scarcity phase 1

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* feature proposal: subnet scarcity phase 2

Signed-off-by: GitHub <noreply@github.com>

* feature proposal: subnet scarcity phase 3

Signed-off-by: GitHub <noreply@github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Signed-off-by: GitHub <noreply@github.com>