Take kube/system reserved cores into account #5

Merged: 1 commit merged into ConnorDoyle:cpu-manager on Jul 26, 2017

Conversation

@flyingcougar

Take kube/system reserved cores into account.

@flyingcougar (Author)

@ConnorDoyle - what do you think about that?

@ConnorDoyle (Owner)

Nice work, thanks! Do you think it's better to have a different flag or piggyback on allocatable? This looks fine to me but we'd have to update the config section in the proposal.

Could you update this to acquire CPUs from low-numbered physical cores topologically? E.g., 0,3,1,5,...

We can get this in now though if you want to do that in a separate PR.
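
For context, a rough sketch of what acquiring CPUs "from low-numbered physical cores topologically" could look like, using a made-up mapping from logical CPU ID to physical core ID; takeLowCoresFirst and the topology map are hypothetical illustrations, not the PR's cpuset/topology code:

```go
package main

import (
	"fmt"
	"sort"
)

// takeLowCoresFirst returns n logical CPUs ordered so that the hyperthread
// siblings of the lowest-numbered physical cores come first.
func takeLowCoresFirst(cpuToCore map[int]int, n int) []int {
	cpus := make([]int, 0, len(cpuToCore))
	for cpu := range cpuToCore {
		cpus = append(cpus, cpu)
	}
	// Sort primarily by physical core ID, then by logical CPU ID.
	sort.Slice(cpus, func(i, j int) bool {
		ci, cj := cpuToCore[cpus[i]], cpuToCore[cpus[j]]
		if ci != cj {
			return ci < cj
		}
		return cpus[i] < cpus[j]
	})
	if n > len(cpus) {
		n = len(cpus)
	}
	return cpus[:n]
}

func main() {
	// Hypothetical hyperthreaded box where core 0 has logical CPUs 0 and 3,
	// core 1 has 1 and 5, and so on.
	cpuToCore := map[int]int{0: 0, 3: 0, 1: 1, 5: 1, 2: 2, 6: 2, 4: 3, 7: 3}
	fmt.Println(takeLowCoresFirst(cpuToCore, 4)) // prints [0 3 1 5]
}
```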

@ConnorDoyle (Owner)

Cc @sjenning @balajismaniam

@flyingcougar (Author)

Will update this PR. I like piggybacking on allocatable :) IMHO another CPU manager flag (valid for only one policy) may introduce some confusion.

@ConnorDoyle (Owner)

OK, I can update the config section in the proposal (will make a PR this time). I was convinced on Friday that a separate flag is better, but on second look I am less sure. A solid argument for piggybacking is that best practice would involve setting those flags properly anyway!

@sjenning

@flyingcougar @ConnorDoyle +1 on this

```
@@ -88,6 +94,20 @@ func NewManager(policyName PolicyName, cr internalapi.RuntimeService, kletGetter
	}, nil
}

func getReserverdCores(nodeStatus v1.NodeStatus) int64 {
```
@ConnorDoyle (Owner)

Mind adding a test? Know we're already in test-debt but this seems like an easy add.
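
For illustration, a minimal sketch of what such a test might look like, assuming getReserverdCores reports capacity.cpu minus allocatable.cpu rounded up to whole cores; the expected values and import paths here are assumptions, not taken from this PR:

```go
import (
	"testing"

	"k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func TestGetReservedCores(t *testing.T) {
	testCases := []struct {
		capacity    string
		allocatable string
		want        int64 // assumed: ceiling of (capacity - allocatable)
	}{
		{"4", "4", 0},     // nothing reserved for kube/system
		{"4", "3500m", 1}, // a 500m reservation rounds up to a whole core
		{"16", "14", 2},   // whole-core reservation
	}
	for _, tc := range testCases {
		status := v1.NodeStatus{
			Capacity:    v1.ResourceList{v1.ResourceCPU: resource.MustParse(tc.capacity)},
			Allocatable: v1.ResourceList{v1.ResourceCPU: resource.MustParse(tc.allocatable)},
		}
		if got := getReserverdCores(status); got != tc.want {
			t.Errorf("capacity=%s allocatable=%s: want %d reserved cores, got %d",
				tc.capacity, tc.allocatable, tc.want, got)
		}
	}
}
```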

@flyingcougar (Author)

ok will do

```go
	}
	s.SetDefaultCPUSet(shared)
	// Figure out which cores shall not be used in shared pool
	reserved, _ := takeByTopology(p.topology, fullCpuset, p.topology.NumReservedCores)
```
@ConnorDoyle (Owner)

Could we leave a note here saying that this depends on the order in which cores are taken in that function (low-numbered cores first)? We have to make this explicit in the contract so operators can reliably configure their systems around the CPU manager.

Glad this could be reused by the way, 👍
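
A sketch of the kind of note being asked for, written as comments over the existing call; the wording is a suggestion, not the comment that was ultimately committed:

```go
// NOTE: which CPUs end up reserved depends on the order in which
// takeByTopology hands out CPUs; it is assumed to take low-numbered
// physical cores (and their siblings) first. This ordering is part of
// the operator-facing contract, so systems can be configured around
// which cores the CPU manager will set aside for kube/system use.
reserved, _ := takeByTopology(p.topology, fullCpuset, p.topology.NumReservedCores)
```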

@flyingcougar (Author)

ok

```go
	reserved.Sub(nodeStatus.Allocatable[v1.ResourceCPU])
	glog.Infof("[cpumanager] %v cores reserved for kube/system and shared pool", reserved.Value())
	if reserved.IsZero() {
		glog.Infof("[cpumanager] defaulting to 1 core")
```

@sjenning

We are going to need to change this at some point, as unilaterally reserving one core messes with the scheduler: on a 4-core node, the scheduler sees nothing wrong with scheduling 4 cores' worth of pinned pods to the node; however, the cpumanager can only accommodate 3.

I'm working on new content for the proposal to cover setting a CPUPressure=true condition on the node to prevent placement of new CPU-less pods (Burstable pods without CPU requests, or BestEffort pods) and to evict existing CPU-less pods. This lets us deplete the shared pool and the scheduler accounting Just Works.

This is just a "keep in mind", not an objection to this change. Thanks!

@ConnorDoyle (Owner)

That sounds great, Seth. There's a little blurb in the proposal about adding taints in the static manager scenarios (shared pool becomes empty, shared pool becomes nonempty) that could use some expansion.

@ConnorDoyle (Owner)

I'm wondering whether we'd be better off dropping this conditional for now... if the operator really wants zero CPUs to be reserved, then there's nothing really "wrong" with it IMO. I think reserving CPU 0 is more of an artifact of the stopgap we had in place before this PR.

@sjenning

I'd be ok with dropping it. Avoids us needing to remember to remove it later.

@ConnorDoyle (Owner) left a comment

@flyingcougar once comments are fixed please squash

@flyingcougar (Author) commented on Jul 25, 2017

@ConnorDoyle - done
Also adjusted the calculation to match ceiling(node.status.allocatable.cpu)
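
For reference, a minimal sketch of that rounding, assuming the reservation is derived as capacity minus allocatable and that resource.Quantity.Value(), which rounds fractional values up, supplies the ceiling; getReservedCPUs is a hypothetical stand-in here, not the exact code in this PR:

```go
import (
	"k8s.io/api/core/v1"
)

// getReservedCPUs is a hypothetical helper rounding the kube/system CPU
// reservation (capacity minus allocatable) up to whole cores.
func getReservedCPUs(nodeStatus v1.NodeStatus) int64 {
	reserved := nodeStatus.Capacity[v1.ResourceCPU].DeepCopy()
	reserved.Sub(nodeStatus.Allocatable[v1.ResourceCPU])
	// Quantity.Value() rounds fractional values up, so e.g. a 500m
	// reservation still sets aside one full core.
	return reserved.Value()
}
```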

@flyingcougar force-pushed the cpu-res branch 4 times, most recently from cbdd56a to cb76746 on July 25, 2017 11:01
@ConnorDoyle (Owner) left a comment

💯

@ConnorDoyle merged this pull request into ConnorDoyle:cpu-manager on Jul 26, 2017
@ConnorDoyle deleted the cpu-res branch on July 26, 2017 23:30
ConnorDoyle pushed a commit that referenced this pull request Nov 4, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Remove redundant call to StartLogging in service_controller. Fixes #54339



**What this PR does / why we need it**:
Removes redundant call to StartLogging introduced in 96b48d4#diff-1f7f903e25ab8bcbc514bb1e532e997e

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
kubernetes#54339 

**Special notes for your reviewer**:

**Release note**:

```release-note
```