Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify taint manager workqueue keys #65350

Merged
merged 1 commit into from
Oct 18, 2018

Conversation

liggitt
Copy link
Member

@liggitt liggitt commented Jun 22, 2018

This greatly reduces the memory size of the workqueue keys, and allows the queue to dedupe rapid events from the same objects, like other controllers do

Builds on #65339

Fixes #65325

/sig scheduling

improves memory use and performance when processing large numbers of pods containing tolerations

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 22, 2018
@liggitt liggitt force-pushed the simplify-taint-manager-key branch from 029d893 to cc8e67e Compare June 22, 2018 04:27
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 22, 2018
@liggitt
Copy link
Member Author

liggitt commented Jun 22, 2018

/assign @wojtek-t
@kubernetes/sig-scheduling-pr-reviews

@liggitt
Copy link
Member Author

liggitt commented Jun 22, 2018

/retest

@wojtek-t
Copy link
Member

/lgtm
/approve

Thanks!

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 22, 2018
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@liggitt
Copy link
Member Author

liggitt commented Jun 22, 2018

/retest

@dims
Copy link
Member

dims commented Jun 22, 2018

@liggitt the change in pkg/controller/nodelifecycle/scheduler/taint_manager.go is in this PR as well as #65339 i am guessing we don't need the 65339 then?

@dims
Copy link
Member

dims commented Jun 22, 2018

/test pull-kubernetes-e2e-kops-aws

@liggitt
Copy link
Member Author

liggitt commented Jun 22, 2018

i am guessing we don't need the 65339 then?

I am planning to cherrypick that one back to 1.8

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@liggitt
Copy link
Member Author

liggitt commented Jun 22, 2018

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@liggitt liggitt force-pushed the simplify-taint-manager-key branch from cc8e67e to d2d3149 Compare June 23, 2018 02:35
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 23, 2018
@wgliang
Copy link
Contributor

wgliang commented Jun 23, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 23, 2018
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@liggitt
Copy link
Member Author

liggitt commented Oct 16, 2018

the failures look related, which is concerning, because it seems to mean something is depending on level-driven processing of the nodes/pod updates. will keep digging as I have time

@spiffxp
Copy link
Member

spiffxp commented Oct 17, 2018

/milestone v1.13
I'm adding this to the v1.13 milestone since it relates to kubernetes/enhancements#166 which is currently planned for v1.13

@k8s-ci-robot k8s-ci-robot added this to the v1.13 milestone Oct 17, 2018
@AishSundar
Copy link
Contributor

/retest

@wojtek-t
Copy link
Member

the failures look related, which is concerning, because it seems to mean something is depending on level-driven processing of the nodes/pod updates. will keep digging as I have time

Yes - those seem to be real failures not flakes.
What is even more terrifying is that for e.g. gce-100-performance is (in terms of taints) preatty much relying only on making nodes ready/schedulable.

So something is really terrifying in this PR.

@AishSundar
Copy link
Contributor

AishSundar commented Oct 17, 2018

@wojtek-t @liggitt based on the comment above I am assuming this issue is serious enough to block Taint based eviction going to Beta in 1.13. @Huang-Wei 's comment indicated this PR to be a "nice to have" performance enhancement.

Considering the scalability regressions we saw with taint nodes in 1.12 I am concerned with the signals here already. Please let us know of how critical is this fix for the feature to go to Beta in 1.13. Thanks

@liggitt liggitt changed the title WIP - Simplify taint manager workqueue keys Simplify taint manager workqueue keys Oct 17, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 17, 2018
@liggitt
Copy link
Member Author

liggitt commented Oct 17, 2018

/retest

@liggitt
Copy link
Member Author

liggitt commented Oct 17, 2018

/test pull-kubernetes-e2e-gke

@liggitt
Copy link
Member Author

liggitt commented Oct 17, 2018

@AishSundar - the issues with this PR have been resolved, and tests are green. this should help with taint/toleration scaleability issues

@liggitt
Copy link
Member Author

liggitt commented Oct 17, 2018

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 17, 2018
@Huang-Wei
Copy link
Member

Thanks @liggitt !!

@AishSundar
Copy link
Contributor

Thanks @liggitt and @Huang-Wei

@wojtek-t
Copy link
Member

great.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 17, 2018
@Huang-Wei
Copy link
Member

/test pull-kubernetes-e2e-kops-aws

@liggitt
Copy link
Member Author

liggitt commented Oct 17, 2018

/retest

1 similar comment
@Huang-Wei
Copy link
Member

/retest

@k8s-ci-robot k8s-ci-robot merged commit 6f4b768 into kubernetes:master Oct 18, 2018
@@ -211,8 +198,8 @@ func (tc *NoExecuteTaintManager) Run(stopCh <-chan struct{}) {
glog.V(0).Infof("Starting NoExecuteTaintManager")

for i := 0; i < UpdateWorkerSize; i++ {
tc.nodeUpdateChannels = append(tc.nodeUpdateChannels, make(chan *nodeUpdateItem, NodeUpdateChannelSize))
tc.podUpdateChannels = append(tc.podUpdateChannels, make(chan *podUpdateItem, podUpdateChannelSize))
tc.nodeUpdateChannels = append(tc.nodeUpdateChannels, make(chan nodeUpdateItem, NodeUpdateChannelSize))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need channel group? The performance of TaintNodeByCondition is acceptable by one channel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say "now".
I think that once we will be able to bump default qps limits in large clusters, it may appear to be too slow. But yeah - it's mostly guessing...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TaintNodeByCondition is distributing a single update queue among multiple workers.

This has two update queues to coordinate, so the exact solution will look different. I think we can do something simpler than this two-layer async, but we need to do it in a way that still lets us fan out workers and give node update handling priority when handling pod update events.

@liggitt liggitt deleted the simplify-taint-manager-key branch October 19, 2018 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants