Add a fast path for adding new node in node_authorizer #87688
Conversation
While reviewing this, I found an existing bug in the index maintenance code. Opened #87693 to resolve that first (required since this PR makes the index authoritative).
go ahead and squash, then lgtm pending #87693
Squashed, thanks!
Opened WIP #87696, which builds on this, switches to refcounting, and adds a fast path for edge removal.
/retest
@@ -302,8 +302,11 @@ func (r *NodeAuthorizer) hasPathFrom(nodeName string, startingType vertexType, s
 	}
 
 	// Fast check to see if we know of a destination edge
-	if r.graph.destinationEdgeIndex[startingVertex.ID()].has(nodeVertex.ID()) {
-		return true, nil
+	if index := r.graph.destinationEdgeIndex[startingVertex.ID()]; index != nil {
This will only hold items that had a destination edge and were added to the index. AddVolumeAttachment, SetNodeConfigMap, and AddPod all have places where they add destination edges and do not update the index.
AddPV explicitly doesn't have the ability to add a destination edge:
// since we don't know the other end of the pvc -> pod -> node chain (or it may not even exist yet), we can't decorate these edges with kubernetes node info
Unless all added edges are indexed, the presence of this index cannot be used to short-circuit the DFS fallback.
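To make the constraint concrete, here is a minimal sketch in Go (hypothetical names and simplified types, not the actual node_authorizer code) of the only safe structure under these conditions: an index hit may short-circuit, but an index miss must still fall through to the DFS.

package main

// Sketch only: vertices are identified by ints, and destinationEdgeIndex
// maps a starting vertex to the set of node vertices it is known to reach.
// The index is a positive cache: an entry proves reachability, but absence
// proves nothing, because some edges (e.g. those added by AddPV) are never
// reflected in the index.
type intSet map[int]struct{}

func (s intSet) has(id int) bool { _, ok := s[id]; return ok }

type graph struct {
	destinationEdgeIndex map[int]intSet
}

// hasPathFrom consults the index first; a hit returns immediately, while a
// miss (or a missing index) falls back to the full depth-first search.
func (g *graph) hasPathFrom(startID, nodeID int, dfs func(int, int) bool) bool {
	if index := g.destinationEdgeIndex[startID]; index != nil && index.has(nodeID) {
		return true // fast path: the index proves a path exists
	}
	return dfs(startID, nodeID) // index is incomplete, so always fall back
}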
The fact that unit tests pass with this change also means that not all relationships are well exercised in the presence of an index. For example: create enough node{1,2,3,...}<-pod->configmap1 relationships that configmap1 has an index, then create a nodeN<->configmap1 relationship with SetNodeConfigMap, and verify that authorizer.hasPathFrom finds a path from the node to the configmap:
diff --git a/plugin/pkg/auth/authorizer/node/graph_test.go b/plugin/pkg/auth/authorizer/node/graph_test.go
index a87d6c0a7b8..a3a16064196 100644
--- a/plugin/pkg/auth/authorizer/node/graph_test.go
+++ b/plugin/pkg/auth/authorizer/node/graph_test.go
@@ -348,20 +348,28 @@ func TestIndex(t *testing.T) {
g.SetNodeConfigMap("node1", "cm1", "ns")
g.SetNodeConfigMap("node2", "cm1", "ns")
g.SetNodeConfigMap("node3", "cm1", "ns")
+ g.SetNodeConfigMap("node4", "cm1", "ns")
+ if ok, err := a.hasPathFrom("node1", configMapVertexType, "ns", "cm1"); err != nil || !ok {
+ t.Errorf("expected path from %s to cm1, got %v, %v", "node1", ok, err)
+ }
+ if ok, err := a.hasPathFrom("node4", configMapVertexType, "ns", "cm1"); err != nil || !ok {
+ t.Errorf("expected path from %s to cm1, got %v, %v", "node4", ok, err)
+ }
expectGraph(map[string][]string{
"node:node1": {},
"node:node2": {},
"node:node3": {},
+ "node:node4": {},
"pod:ns/pod2": {"node:node2"},
"pod:ns/pod3": {"node:node3"},
"pod:ns/pod4": {"node:node1"},
- "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
+ "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3", "node:node4", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"configmap:ns/cm2": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"configmap:ns/cm3": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"serviceAccount:ns/sa1": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
})
expectIndex(map[string][]string{
- "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3"},
+ "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3", "node:node4"},
"configmap:ns/cm2": {"node:node1", "node:node2", "node:node3"},
"configmap:ns/cm3": {"node:node1", "node:node2", "node:node3"},
"serviceAccount:ns/sa1": {"node:node1", "node:node2", "node:node3"},
@@ -373,16 +381,17 @@ func TestIndex(t *testing.T) {
"node:node1": {},
"node:node2": {},
"node:node3": {},
+ "node:node4": {},
"pod:ns/pod2": {"node:node2"},
"pod:ns/pod3": {"node:node3"},
"pod:ns/pod4": {"node:node1"},
- "configmap:ns/cm1": {"node:node2", "node:node3", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
+ "configmap:ns/cm1": {"node:node2", "node:node3", "node:node4", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"configmap:ns/cm2": {"node:node1", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"configmap:ns/cm3": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"serviceAccount:ns/sa1": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
})
expectIndex(map[string][]string{
- "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3"},
+ "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3", "node:node4"},
"configmap:ns/cm2": {"node:node1", "node:node2", "node:node3"},
"configmap:ns/cm3": {"node:node1", "node:node2", "node:node3"},
"serviceAccount:ns/sa1": {"node:node1", "node:node2", "node:node3"},
@@ -394,16 +403,17 @@ func TestIndex(t *testing.T) {
"node:node1": {},
"node:node2": {},
"node:node3": {},
+ "node:node4": {},
"pod:ns/pod2": {"node:node2"},
"pod:ns/pod3": {"node:node3"},
"pod:ns/pod4": {"node:node1"},
- "configmap:ns/cm1": {"node:node2", "node:node3", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
+ "configmap:ns/cm1": {"node:node2", "node:node3", "node:node4", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"configmap:ns/cm2": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"configmap:ns/cm3": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
"serviceAccount:ns/sa1": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
})
expectIndex(map[string][]string{
- "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3"},
+ "configmap:ns/cm1": {"node:node1", "node:node2", "node:node3", "node:node4"},
"configmap:ns/cm2": {"node:node1", "node:node2", "node:node3"},
"configmap:ns/cm3": {"node:node1", "node:node2", "node:node3"},
"serviceAccount:ns/sa1": {"node:node1", "node:node2", "node:node3"},
That test fails on this PR with:
--- FAIL: TestIndex (0.00s)
graph_test.go:356: expected path from node4 to cm1, got false, node "node4" cannot get configmap ns/cm1, no relationship to this object was found in the node authorizer graph index
Wow, thanks for catching this!
Could you include this test case in your PR?
/hold
The destination index is not authoritative, since not all edges know their destination node. See #87688 (comment).
The fast-path index add is useful and should be kept (as should the corresponding refcount and fast-path remove in #87696).
Thanks for finding the issue with the destination index. I reverted this part.
PTAL |
This seems to improve the WriteIndexMaintenance benchmark:
Before: BenchmarkWriteIndexMaintenance-12    1034    1157922 ns/op    1906 B/op    41 allocs/op
After:  BenchmarkWriteIndexMaintenance-12    4891     239821 ns/op    1572 B/op    37 allocs/op
/retest
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: liggitt, mborsz
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest
/priority important-longterm
/retest
What type of PR is this?
/kind feature
What this PR does / why we need it:
This is a performance improvement for scheduling new pods that use secrets, configmaps, and other resources.
Without this patch, each new pod using secret X triggers an index recomputation of cost O(number of pods using that secret), so in clusters with highly replicated secrets this can be very expensive.
With this patch, each such pod is handled in O(1) time, as sketched below.
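As a rough illustration (hypothetical names and simplified types, not the actual implementation), the fast path amounts to an O(1) set insert into an existing index entry instead of a full recomputation:

package main

// graph is a simplified stand-in: destinationEdgeIndex maps a resource
// vertex (e.g. a secret) to the set of node vertices known to reach it.
type graph struct {
	destinationEdgeIndex map[int]map[int]struct{}
}

// addEdgeToIndex is a hypothetical helper showing the fast path: when a new
// pod running on node nodeID starts referencing resourceID and an index
// already exists for that resource, the node is inserted in place, in O(1),
// instead of recomputing the index from every pod using the resource.
func (g *graph) addEdgeToIndex(resourceID, nodeID int) {
	if index, ok := g.destinationEdgeIndex[resourceID]; ok {
		index[nodeID] = struct{}{} // O(1) insert into the existing set
	}
	// If no index exists yet, the existing slow path (building an index once
	// the vertex accumulates enough destination edges) still applies.
}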
We saw this impacting scalability tests: when we created pods with secrets at a high rate (100 pods/s), there was an O(minutes) delay between the time a pod was scheduled on a node and the time the node was able to fetch the associated secret.
A similar improvement can be implemented for removing secrets, but that will require introducing reference counting, so it is deferred for now (see the WIP in #87696 and the sketch below).
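For context, removal needs reference counting because a node may appear in the index on behalf of several pods; the entry can only be dropped when the last edge referencing it goes away. A sketch under the same simplified, hypothetical types as above (the idea behind the WIP, not its actual code):

package main

// refCounts[resourceID][nodeID] counts how many edges (pods on that node
// referencing the resource) currently justify the index entry.
type graph struct {
	destinationEdgeIndex map[int]map[int]struct{}
	refCounts            map[int]map[int]int
}

// removeEdgeFromIndex decrements the refcount for (resourceID, nodeID) and
// drops the index entry only when the last justifying edge disappears,
// keeping removal O(1) as well.
func (g *graph) removeEdgeFromIndex(resourceID, nodeID int) {
	counts, ok := g.refCounts[resourceID]
	if !ok {
		return
	}
	counts[nodeID]--
	if counts[nodeID] <= 0 {
		delete(counts, nodeID)
		if index, ok := g.destinationEdgeIndex[resourceID]; ok {
			delete(index, nodeID) // last edge gone: remove the fast-path entry
		}
	}
}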
This seems to improve the WriteIndexMaintenance benchmark:
Before: BenchmarkWriteIndexMaintenance-12    1034    1157922 ns/op    1906 B/op    41 allocs/op
After:  BenchmarkWriteIndexMaintenance-12    4891     239821 ns/op    1572 B/op    37 allocs/op
This also changes the way the index is used in node_authorizer: before this patch, even if a destinationEdgeIndex was present for a given vertex but contained no path between the two nodes, a DFS was still triggered. This can be simplified: if a destinationEdgeIndex is present, it must be complete, so we can fail fast when the index is present but the destination node is not in it. (As discussed in the review above, this completeness assumption turned out not to hold, and this part was reverted.)
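In sketch form (hypothetical names, same simplified types as above), the fail-fast variant described here looks like the following; it is only correct if every destination edge is indexed, which the review above showed is not the case:

package main

import "fmt"

type graph struct {
	destinationEdgeIndex map[int]map[int]struct{}
}

// hasPathFromFailFast treats an existing index as authoritative: a hit
// returns true, and a miss returns an error without falling back to DFS.
func (g *graph) hasPathFromFailFast(startID, nodeID int) (bool, error) {
	if index := g.destinationEdgeIndex[startID]; index != nil {
		if _, ok := index[nodeID]; ok {
			return true, nil
		}
		return false, fmt.Errorf("no relationship to this object was found in the node authorizer graph index")
	}
	// No index for this vertex: a full DFS would run here (omitted).
	return false, nil
}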
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/assign @wojtek-t
/assign @liggitt