scx_central: Break dispatch_to_cpu loop when running out of buffer slots #26
Conversation
Force-pushed from 56784ab to 107779f
Generally looks good to me. A couple minor suggestions.
@@ -142,6 +142,13 @@ static bool dispatch_to_cpu(s32 cpu)
	s32 pid;

	bpf_repeat(BPF_MAX_LOOPS) {
		/* We might run out of dispatch buffer slots if we continue dispatching
Maybe use fully winged comment style to be consistent with other comments?
Ack, will do.
		 * to the fallback DSQ, without dispatching to the local DSQ of the
		 * target CPU. In such a case, break the loop.
		 */
		if (!scx_bpf_dispatch_nr_slots())
Given that we know that there are slots on entry to the function, would it be better to check nr_slots in the FALLBACK_DSQ block right before continue?
Hmm, but I was thinking of a case where, say, you have 2 slots remaining: you pull two tasks from the central_q which go to the fallback DSQ, and then you pull a third which can go to the target CPU's local DSQ, but it will error out. So while it is true that we have slots on entry, it may not be true for the SCX_DSQ_LOCAL_ON case in later iterations.
Does that make sense?
Oh, I see what you mean, please ignore the above.
Hmm... so, we have either:
while (true) {
	if (no slot)
		break;
	find task;
	if (cpu doesn't match) {
		dispatch(FALLBACK);
		continue;
	}
	dispatch(target_cpu);
	break;
}
or
while (true) {
	find task;
	if (cpu doesn't match) {
		dispatch(FALLBACK);
		if (more slots)
			continue;
		else
			break;
	}
	dispatch(target_cpu);
	break;
}
If we know that there are slots on entry, the behavior should be identical between the two, right? The only difference is that the former would do an extra check which will always indicate more slots.
Yep, you're right, I was not parsing the English properly. I've addressed both comments.
Force-pushed from 107779f to a6566a7
I have addressed both comments. I was a bit confused about the second one, please ignore that.

For the case where many tasks being popped from the central queue cannot be dispatched to the local DSQ of the target CPU, we will keep bouncing them to the fallback DSQ and continue the dispatch_to_cpu loop until we find one which can be dispatched to the local DSQ of the target CPU.
In a contrived case, it might be that all tasks pin themselves to CPUs != target CPU, and due to their affinity cannot be dispatched to that CPU's local DSQ. If all of them are filling up the central queue, then we will keep looping in the dispatch_to_cpu loop and eventually run out of slots for the dispatch buffer. The nr_mismatched counter will quickly rise and sched-ext will notice the error and unload the BPF scheduler.
To remedy this, ensure that we break the dispatch_to_cpu loop when we can no longer perform a dispatch operation. The outer loop in central_dispatch for the central CPU should ensure the loop breaks when we run out of these slots and schedule a self-IPI to the central core, and allow sched-ext to consume the dispatch buffer before restarting the dispatch loop again.
A basic way to reproduce this scenario is to do:
taskset -c 0 perf bench sched messaging
The error in the kernel will be:
sched_ext: BPF scheduler "central" errored, disabling
sched_ext: runtime error (dispatch buffer overflow)
bpf_prog_6a473147db3cec67_dispatch_to_cpu+0xc2/0x19a
bpf_prog_c9e51ba75372a829_central_dispatch+0x103/0x1a5
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Force-pushed from a6566a7 to c4c994c
For the case where many tasks being popped from the central queue cannot be dispatched to the local DSQ of the target CPU, we will keep bouncing them to the fallback DSQ and continue the dispatch_to_cpu loop until we find one which can be dispatched to the local DSQ of the target CPU.
In a contrived case, it might be that all tasks pin themselves to CPUs != target CPU, and due to their affinity cannot be dispatched to that CPU's local DSQ. If all of them are filling up the central queue, then we will keep looping in the dispatch_to_cpu loop and eventually run out of slots for the dispatch buffer. The nr_mismatched counter will quickly rise and the sched_ext core will notice the error and unload the BPF scheduler.
To remedy this, ensure that we break the dispatch_to_cpu loop when we can no longer perform a dispatch operation. The outer loop in central_dispatch for the central CPU should ensure the loop breaks when we run out of these slots and schedule a self-IPI to the local core, and allow the sched-ext core to consume the dispatch buffer before restarting the dispatch loop again.
A basic way to reproduce this scenario is to do:
taskset -c 0 perf bench sched messaging
The error in the kernel will be:
sched_ext: BPF scheduler "central" errored, disabling
sched_ext: runtime error (dispatch buffer overflow)
bpf_prog_6a473147db3cec67_dispatch_to_cpu+0xc2/0x19a
bpf_prog_c9e51ba75372a829_central_dispatch+0x103/0x1a5