Release CPUs reminders #475

Merged
merged 1 commit into oVirt:master from release_allocation on Jun 23, 2022
Conversation

liranr23
Member

When we allocate CPUs we are basing upon the list we return back. This
is correct for the scheduling phase. But now we run the same logic on
canSchedule. There, when having a group of VMs, we are basing on the
cpuTopology list, not the list we selected to allocate. The overtaken
CPU wasn't released from the pinning itself and that harmed the
canSchedule logic for VM groups.

Also, when grouping the VMs, when the group is having an affinity set,
it is reordered. Therefore, it is required to sort them there as well.

Change-Id: Idcae90fed953c4798f2c4bf8893bff0a1b7086ea
Signed-off-by: Liran Rotenberg <lrotenbe@redhat.com>

@liranr23
Member Author

/ost

@ahadas
Member

ahadas commented Jun 20, 2022

The code changes LGTM, but the description should clarify the reason/context for the changes and it's not easy to parse; see the comments below.

When we allocate CPUs we are basing upon the list we return back.

by 'we are basing upon' you mean 'we rely on'?

There, when having a group of VMs, we are basing on the cpuTopology list, not the list we selected to allocate.

do you mean "we use the cpuTopology list"?

The overtaken CPU wasn't released from the pinning itself and that harmed the canSchedule logic for VM groups.

"The overtaken CPU remained pinned"?

Also, when grouping the VMs, when the group is having an affinity set, it is reordered. Therefore, it is required to sort them there as well.

"when grouping the VMs and having an affinity set"? what does "there" refer to?

When we allocate CPUs we rely on the list we return. This is
correct for the scheduling phase. But now we run the same logic in
canSchedule. There, when handling a group of VMs, we use the cpuTopology
list, not the list we selected to allocate. The overtaken CPU remained
pinned and that broke the canSchedule logic for VM groups.

When grouping the VMs, if the group has an affinity set, it is
reordered. Therefore, the VMs need to be sorted again after grouping.

Change-Id: Idcae90fed953c4798f2c4bf8893bff0a1b7086ea
Signed-off-by: Liran Rotenberg <lrotenbe@redhat.com>
@liranr23
Member Author


Thanks, commit changed.
"there" referred to the start of groupVms, where we sort the VM list. But that is only correct when there is no affinity.

@ljelinkova
Collaborator

I still do not understand what this PR is about. Do you have an example of the incorrect behavior?

@liranr23
Member Author

liranr23 commented Jun 20, 2022

I still do not understand what this PR is about. Do you have an example of the incorrect behavior?

The example is the one in the bug: https://bugzilla.redhat.com/show_bug.cgi?id=2079351
The 1st problem: we thought we sort the VM list in groupVms (under SchedulingManager), and we do, just not when there is an affinity group. When an affinity group is involved, the sort needs to be done at the end of the function, as in the sketch below.
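
As a rough illustration of that first point, here is a minimal sketch with made-up names (not the actual SchedulingManager code): the initial sort of the VM list is lost when a group is rebuilt from affinity-group membership, so the sort has to be repeated after grouping.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real oVirt code: the up-front sort is discarded
// when affinity members are pulled in, so we sort again at the end.
public class GroupVmsSketch {

    record Vm(String name) {}

    static List<Vm> groupByAffinity(List<Vm> vms, Map<String, List<Vm>> affinityMembers) {
        // The VM list is sorted up front (this part was already there).
        List<Vm> sorted = new ArrayList<>(vms);
        sorted.sort(Comparator.comparing(Vm::name));

        // When a VM belongs to an affinity group, the whole group is pulled in,
        // in whatever order the group lists its members, which discards the
        // initial ordering.
        List<Vm> grouped = new ArrayList<>();
        for (Vm vm : sorted) {
            List<Vm> members = affinityMembers.get(vm.name());
            if (members != null) {
                members.stream().filter(m -> !grouped.contains(m)).forEach(grouped::add);
            } else if (!grouped.contains(vm)) {
                grouped.add(vm);
            }
        }

        // The fix described here: sort again at the end of the grouping step.
        grouped.sort(Comparator.comparing(Vm::name));
        return grouped;
    }

    public static void main(String[] args) {
        Vm a = new Vm("vm-a"), b = new Vm("vm-b"), c = new Vm("vm-c");
        // An affinity group that lists its members in non-sorted order.
        Map<String, List<Vm>> affinity = Map.of("vm-a", List.of(c, a), "vm-c", List.of(c, a));
        System.out.println(groupByAffinity(List.of(a, b, c), affinity));
        // Without the final sort the order would be vm-c, vm-a, vm-b.
    }
}
```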

The 2nd problem: when we enter CpuPinningPolicyUnit, we use the same cpuTopology for the whole group of VMs. When allocating (this may actually hit CPUPolicyUnit as well now), the logic is to allocate and "drop" the remainder CPU(s) if there are any; the returned list from the allocation function (cpusToAllocate) was the only place we looked.
But in canSchedule, using the same cpuTopology, we don't only look at cpusToAllocate; we keep passing cpuTopology along, and it never unPinned the remainders.

Example from the bug: a host with 4:4:1, VM1 with 4:3:1 and VM2 with 2:1:1.
In the allocation function we actually pin 15 CPUs (only the one used by VDSM is left out), while we only take 12 into cpusToAllocate (3 sockets each have 1 CPU left over as a remainder). With this change we also call unPin on those CPUs' VdsCpuUnit, so they can be used for the next VM (in our case VM2).
Does that make more sense?
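
To make the arithmetic of that example concrete, here is a toy sketch with assumed names (not the real VdsCpuUnit/cpuTopology API): 15 CPUs get pinned during selection, only 12 are kept in cpusToAllocate, and unless the 3 remainders are unpinned the shared topology leaves no room for VM2.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the example above, with assumed names (not the real oVirt API):
// host of 4 sockets x 4 cores x 1 thread = 16 CPUs, CPU 0 taken by VDSM,
// VM1 asks for 4 sockets x 3 cores = 12 CPUs, VM2 asks for 2 CPUs.
public class RemainderReleaseSketch {

    static final int SOCKETS = 4;
    static final int CORES_PER_SOCKET = 4;
    static final int VDSM_CPU = 0;

    public static void main(String[] args) {
        Set<Integer> pinned = new HashSet<>();
        List<Integer> cpusToAllocate = new ArrayList<>();

        for (int socket = 0; socket < SOCKETS; socket++) {
            List<Integer> takenInSocket = new ArrayList<>();
            for (int core = 0; core < CORES_PER_SOCKET; core++) {
                int cpu = socket * CORES_PER_SOCKET + core;
                if (cpu == VDSM_CPU) {
                    continue;                      // the VDSM CPU is left out
                }
                pinned.add(cpu);                   // the selection pass pins the CPU
                takenInSocket.add(cpu);
            }
            // VM1 needs only 3 cores per socket; anything beyond that is a
            // remainder and is dropped from the returned list.
            cpusToAllocate.addAll(takenInSocket.subList(0, Math.min(3, takenInSocket.size())));
        }

        // Before the fix, 15 CPUs stayed pinned in the shared cpuTopology even
        // though only 12 were returned, leaving 16 - 1 - 15 = 0 free CPUs for
        // VM2 in canSchedule. The fix unpins the remainders:
        Set<Integer> remainders = new HashSet<>(pinned);
        remainders.removeAll(cpusToAllocate);
        pinned.removeAll(remainders);

        System.out.println("pinned during selection: 15"
                + ", kept in cpusToAllocate: " + cpusToAllocate.size()
                + ", remainders released: " + remainders.size()
                + ", CPUs free for VM2: " + (SOCKETS * CORES_PER_SOCKET - 1 - pinned.size()));
    }
}
```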

@ljelinkova
Collaborator

Yes, it is starting to make sense to me. Do I understand correctly that at first we pin all CPUs in cpusToBeAllocated and then remove and unpin some of them? Why is that? What does coresReminder mean?

@liranr23
Member Author

Yes, it is starting to make sense to me. Do I understand correctly that at first we pin all CPUs in cpusToBeAllocated and then remove and unpin some of them? Why is that? What does coresReminder mean?

When selecting CPUs to allocate we go over cpuTopology and pin them; cpusToBeAllocated holds the selected ones we really take. First we select and pin, then we remove any remainder left from cpusToBeAllocated. Now we also need to unPin those CPUs.

Why it's like that is a good question; I don't remember the reason (it was discussed when this was implemented). To me it's more of a greedy approach: when we see that a core doesn't have enough free threads to be used, we just release it.
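
A rough sketch of that greedy idea, using made-up types rather than the real VdsCpuUnit/CpuPinningPolicyUnit logic: threads are pinned core by core, and whatever is pinned but not kept for the VM is released again right away instead of staying taken in the shared topology.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the real oVirt code: a core that cannot supply the
// VM's full threads-per-core count is treated as a remainder and unpinned.
public class GreedyCoreSketch {

    record Core(int id, List<Integer> freeThreads) {}

    static List<Integer> selectThreads(List<Core> cores, int threadsPerCore, Set<Integer> pinned) {
        List<Integer> selected = new ArrayList<>();
        for (Core core : cores) {
            pinned.addAll(core.freeThreads());          // pin while the core is considered
            if (core.freeThreads().size() >= threadsPerCore) {
                List<Integer> kept = core.freeThreads().subList(0, threadsPerCore);
                selected.addAll(kept);
                // Any extra threads on this core are remainders.
                core.freeThreads().stream().filter(t -> !kept.contains(t)).forEach(pinned::remove);
            } else {
                core.freeThreads().forEach(pinned::remove);  // the whole core is a remainder
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Set<Integer> pinned = new HashSet<>();
        List<Core> cores = List.of(
                new Core(0, List.of(0, 1)),   // enough free threads: kept
                new Core(1, List.of(2)));     // too few threads: released again
        System.out.println(selectThreads(cores, 2, pinned));  // [0, 1]
        System.out.println(pinned);                           // [0, 1]; thread 2 was unpinned
    }
}
```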

Collaborator

@ljelinkova left a comment


I still do not completely understand how cpusToBeAllocated is created, but since that part has already been tested and works, the changes in this PR make sense.

@liranr23
Member Author

liranr23 commented Jun 23, 2022

OST passed last time; the only change since then is to the commit message.

@ahadas ahadas merged commit 9347171 into oVirt:master Jun 23, 2022
@liranr23 liranr23 deleted the release_allocation branch June 23, 2022 09:01