Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a fast path for adding new node in node_authorizer #87688

Merged
merged 1 commit into from Feb 7, 2020

Conversation

mborsz
Copy link
Member

@mborsz mborsz commented Jan 30, 2020

What type of PR is this?
/kind feature

What this PR does / why we need it:
This is a performance improvement for scheduling new pod with secrets, configmaps and other resources.
Without this patch, each pod using secret X will trigger index recomputation of size O(pods using that secret), so in clusters with highly replicated secrets this can be a huge cost.
With this patch, each such pod will be handled in O(1) time.

We saw that this was impacting scalability tests when we created pods with secrets with high rate (100 pods/s), creating a O(minute) delay between the time when pod was scheduled on some node and time when node was able to fetch associated secret.

Similar improvement can be implemented for removing secrets, but this will require introducing reference counting, so maybe in future.

This seems to improve WriteIndexMaintenance benchmark:

Before:
BenchmarkWriteIndexMaintenance-12    	    1034	   1157922 ns/op	    1906 B/op	      41 allocs/op
After:
BenchmarkWriteIndexMaintenance-12    	    4891	    239821 ns/op	    1572 B/op	      37 allocs/op

This also changes the way the index is used in node_authorizer: before this patch, even if destinationEdgeIndex is present for given node, but there is no path between two nodes, DFS will be triggered. This can be simplified, as if destinationEdgeIndex is present, it must be complete so we can easily fail if destinationEdgeIndex is present, but destination node is not there.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/assign @wojtek-t
/assign @liggitt

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Jan 30, 2020
@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/auth Categorizes an issue or PR as relevant to SIG Auth. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 30, 2020
@liggitt
Copy link
Member

liggitt commented Jan 30, 2020

While reviewing this, I found an existing bug in the index maintenance code. Opened #87693 to resolve that first (required since this PR makes the index authoritative)

@liggitt
Copy link
Member

liggitt commented Jan 30, 2020

go ahead and squash, then lgtm pending #87693

@mborsz
Copy link
Member Author

mborsz commented Jan 30, 2020

Squashed, thanks!

@liggitt
Copy link
Member

liggitt commented Jan 30, 2020

opened WIP #87696 that builds on this and switches to refcount and adds a fastpath for edge removal.

benchmark                             old ns/op     new ns/op     delta
BenchmarkWriteIndexMaintenance-12     1935450       8629          -99.55%

benchmark                             old allocs     new allocs     delta
BenchmarkWriteIndexMaintenance-12     44             37             -15.91%

benchmark                             old bytes     new bytes     delta
BenchmarkWriteIndexMaintenance-12     2141          1521          -28.96%

@liggitt
Copy link
Member

liggitt commented Jan 31, 2020

/retest
/test pull-kubernetes-bazel-test

@@ -302,8 +302,11 @@ func (r *NodeAuthorizer) hasPathFrom(nodeName string, startingType vertexType, s
}

// Fast check to see if we know of a destination edge
if r.graph.destinationEdgeIndex[startingVertex.ID()].has(nodeVertex.ID()) {
return true, nil
if index := r.graph.destinationEdgeIndex[startingVertex.ID()]; index != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only hold items that had a destination edge and were added to the index. AddVolumeAttachment, SetNodeConfigMap, and AddPod all have places where they add destination edges and do not update the index.

AddPV explicitly doesn't have the ability to add a destination edge:

		// since we don't know the other end of the pvc -> pod -> node chain (or it may not even exist yet), we can't decorate these edges with kubernetes node info

Unless all added edges are indexed, we can't make the presence of this index short-circuit a DFS fallback

Copy link
Member

@liggitt liggitt Jan 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that unit tests are passing with this change as well means that not all relationships are well-exercised in the presence of an index. For example, create enough node{1,2,3,...}<-pod->configmap1 relationships that configmap1 has an index, then create a nodeN<->configmap1 relationship with SetNodeConfigMap and verify authorizer.hasPathFrom finds a path from the node to the configmap:

diff --git a/plugin/pkg/auth/authorizer/node/graph_test.go b/plugin/pkg/auth/authorizer/node/graph_test.go
index a87d6c0a7b8..a3a16064196 100644
--- a/plugin/pkg/auth/authorizer/node/graph_test.go
+++ b/plugin/pkg/auth/authorizer/node/graph_test.go
@@ -348,20 +348,28 @@ func TestIndex(t *testing.T) {
 	g.SetNodeConfigMap("node1", "cm1", "ns")
 	g.SetNodeConfigMap("node2", "cm1", "ns")
 	g.SetNodeConfigMap("node3", "cm1", "ns")
+	g.SetNodeConfigMap("node4", "cm1", "ns")
+	if ok, err := a.hasPathFrom("node1", configMapVertexType, "ns", "cm1"); err != nil || !ok {
+		t.Errorf("expected path from %s to cm1, got %v, %v", "node1", ok, err)
+	}
+	if ok, err := a.hasPathFrom("node4", configMapVertexType, "ns", "cm1"); err != nil || !ok {
+		t.Errorf("expected path from %s to cm1, got %v, %v", "node4", ok, err)
+	}
 	expectGraph(map[string][]string{
 		"node:node1":            {},
 		"node:node2":            {},
 		"node:node3":            {},
+		"node:node4":            {},
 		"pod:ns/pod2":           {"node:node2"},
 		"pod:ns/pod3":           {"node:node3"},
 		"pod:ns/pod4":           {"node:node1"},
-		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
+		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3", "node:node4", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"configmap:ns/cm2":      {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"configmap:ns/cm3":      {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"serviceAccount:ns/sa1": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 	})
 	expectIndex(map[string][]string{
-		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3"},
+		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3", "node:node4"},
 		"configmap:ns/cm2":      {"node:node1", "node:node2", "node:node3"},
 		"configmap:ns/cm3":      {"node:node1", "node:node2", "node:node3"},
 		"serviceAccount:ns/sa1": {"node:node1", "node:node2", "node:node3"},
@@ -373,16 +381,17 @@ func TestIndex(t *testing.T) {
 		"node:node1":            {},
 		"node:node2":            {},
 		"node:node3":            {},
+		"node:node4":            {},
 		"pod:ns/pod2":           {"node:node2"},
 		"pod:ns/pod3":           {"node:node3"},
 		"pod:ns/pod4":           {"node:node1"},
-		"configmap:ns/cm1":      {"node:node2", "node:node3", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
+		"configmap:ns/cm1":      {"node:node2", "node:node3", "node:node4", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"configmap:ns/cm2":      {"node:node1", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"configmap:ns/cm3":      {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"serviceAccount:ns/sa1": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 	})
 	expectIndex(map[string][]string{
-		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3"},
+		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3", "node:node4"},
 		"configmap:ns/cm2":      {"node:node1", "node:node2", "node:node3"},
 		"configmap:ns/cm3":      {"node:node1", "node:node2", "node:node3"},
 		"serviceAccount:ns/sa1": {"node:node1", "node:node2", "node:node3"},
@@ -394,16 +403,17 @@ func TestIndex(t *testing.T) {
 		"node:node1":            {},
 		"node:node2":            {},
 		"node:node3":            {},
+		"node:node4":            {},
 		"pod:ns/pod2":           {"node:node2"},
 		"pod:ns/pod3":           {"node:node3"},
 		"pod:ns/pod4":           {"node:node1"},
-		"configmap:ns/cm1":      {"node:node2", "node:node3", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
+		"configmap:ns/cm1":      {"node:node2", "node:node3", "node:node4", "pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"configmap:ns/cm2":      {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"configmap:ns/cm3":      {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 		"serviceAccount:ns/sa1": {"pod:ns/pod2", "pod:ns/pod3", "pod:ns/pod4"},
 	})
 	expectIndex(map[string][]string{
-		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3"},
+		"configmap:ns/cm1":      {"node:node1", "node:node2", "node:node3", "node:node4"},
 		"configmap:ns/cm2":      {"node:node1", "node:node2", "node:node3"},
 		"configmap:ns/cm3":      {"node:node1", "node:node2", "node:node3"},
 		"serviceAccount:ns/sa1": {"node:node1", "node:node2", "node:node3"},

that test fails on this PR with:

--- FAIL: TestIndex (0.00s)
    graph_test.go:356: expected path from node4 to cm1, got false, node "node4" cannot get configmap ns/cm1, no relationship to this object was found in the node authorizer graph index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for catching this!

Could you include this test case in your PR?

@liggitt
Copy link
Member

liggitt commented Jan 31, 2020

/hold

the destination index is not authoritative, since not all edges know their destination node. see #87688 (comment)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 31, 2020
@liggitt
Copy link
Member

liggitt commented Jan 31, 2020

the fastpath index add is useful and should be kept (and the corresponding refcount and fastpath remove in #87696)

@mborsz
Copy link
Member Author

mborsz commented Jan 31, 2020

Thanks for finding the issue with destination index. I reverted this part.

@mborsz
Copy link
Member Author

mborsz commented Feb 4, 2020

PTAL

This seems to improve WriteIndexMaintenance benchmark:

Before:
BenchmarkWriteIndexMaintenance-12    	    1034	   1157922 ns/op	    1906 B/op	      41 allocs/op
After:
BenchmarkWriteIndexMaintenance-12    	    4891	    239821 ns/op	    1572 B/op	      37 allocs/op
@mborsz
Copy link
Member Author

mborsz commented Feb 4, 2020

/retest

@liggitt
Copy link
Member

liggitt commented Feb 4, 2020

/hold cancel
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Feb 4, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, mborsz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 4, 2020
@wojtek-t
Copy link
Member

wojtek-t commented Feb 4, 2020

/retest

@liggitt
Copy link
Member

liggitt commented Feb 4, 2020

/priority important-longterm
/retest

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 4, 2020
@wojtek-t
Copy link
Member

wojtek-t commented Feb 4, 2020

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@wojtek-t
Copy link
Member

wojtek-t commented Feb 5, 2020

/retest

4 similar comments
@mborsz
Copy link
Member Author

mborsz commented Feb 5, 2020

/retest

@mborsz
Copy link
Member Author

mborsz commented Feb 5, 2020

/retest

@mborsz
Copy link
Member Author

mborsz commented Feb 5, 2020

/retest

@mborsz
Copy link
Member Author

mborsz commented Feb 5, 2020

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@wojtek-t
Copy link
Member

wojtek-t commented Feb 7, 2020

/retest

@k8s-ci-robot k8s-ci-robot merged commit 91738cb into kubernetes:master Feb 7, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants