Skip to content

Conversation

@huntergregory
Copy link
Contributor

@huntergregory huntergregory commented Feb 16, 2023

Fix #1729. Improves on Pod markedForDelete idea in original commits of #1503 (see 398c99e).

Overview

NPM may incorrectly assign a Pod to an Endpoint. NPM needs to be able to reconcile this.

Scenario

Pod x/a, x/b, and NPM all restart at same time. Pod x/a had IP 10.224.0.1 before restart. Pod x/b has 10.224.0.1 after restart.

NPM can receive the following sequence of events:

  1. Pod x/a create --> Pod x/a delete --> Pod x/b create
  2. Pod x/a create --> Pod x/b create --> Pod x/a delete
  3. Pod x/b create --> Pod x/a create --> Pod x/a delete (considered impossible)

Changes

When receiving Pod B event, DP must make sure there are no Pod A ACLs on the Endpoint. Also, DP must not process Pod A events for the same IP afterwards.

Minor Changes

Remove complex, unnecessary stalePodKey logic. Stale Pod Key was designed to handle this (impossible) situation: a Pod's Update event comes in after its Delete event. It is safe to assume that the Pod Controller won't send an Update after a Delete event for the same Pod/IP pair.

@huntergregory huntergregory requested a review from a team as a code owner February 16, 2023 02:08
@huntergregory huntergregory requested review from ck319, matmerr and vakalapa and removed request for a team February 16, 2023 02:08
@huntergregory huntergregory force-pushed the hgregory/npm-pod-assignment branch from 44d97f1 to 468f1d0 Compare February 18, 2023 00:02
@huntergregory huntergregory force-pushed the hgregory/npm-pod-assignment branch from ccef902 to aade8fb Compare March 2, 2023 23:17
@huntergregory huntergregory force-pushed the hgregory/npm-pod-assignment branch from 6166bd1 to d20fb6c Compare March 3, 2023 18:00
@huntergregory huntergregory force-pushed the hgregory/npm-pod-assignment branch 2 times, most recently from 123fae7 to 3927fbf Compare March 3, 2023 19:16
@huntergregory
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@huntergregory
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@huntergregory
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@huntergregory
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@huntergregory
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@vakalapa vakalapa merged commit 60fcc5c into master Mar 8, 2023
@vakalapa vakalapa deleted the hgregory/npm-pod-assignment branch March 8, 2023 18:45
rjdenney pushed a commit that referenced this pull request Mar 13, 2023
* reset acls for pods marked for delete

* simplify: remove stalePodKey

* add log for associating pod with endpoint

* reorganize win dp uts

* UTs for issue 1729 (fail because podMarkedForDelete is not used)

* update controller mock to use deleted pod metadata

* rename podMarkedForDelete to deletedPod

* all UTs (noticing edge in ipsetmanager now)

* updatePod() works for all sequences

* work with updatePodCache and existing controlplane

* remove pod delete logic

* remove change incorrect comment

* fix lint

* fix gofumpt

* remove changes to dp.go logs/comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug npm Related to NPM. windows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Edge Case for memory-starved Windows Server 2022 Nodes with NPM

3 participants