scx_rustland: per-task cpumask generation counter #133
Conversation
LGTM. Left a minor comment. I suppose the next step is making the userspace CPU selection code consider the allowed cpumask?
		return;

	/* Read current cpumask generation counter */
	curr_cpumask_cnt = __sync_fetch_and_add(&tctx->cpumask_cnt, 0);
nit: I don't think the __sync op is necessary here; curr_cpumask_cnt = tctx->cpumask_cnt should be sufficient. All the involved operations are already fully synchronized.
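For illustration, a minimal standalone sketch of the two equivalent reads the comment is contrasting (the `task_ctx` struct and helper names here are hypothetical, not the actual scx_rustland definitions):

```c
#include <stdint.h>

/* Hypothetical per-task context mirroring the tctx in the diff above */
struct task_ctx {
	uint64_t cpumask_cnt;	/* cpumask generation counter */
};

/* Read via the atomic read-modify-write used in the original diff:
 * fetches the current value by adding 0 */
uint64_t read_cnt_atomic(struct task_ctx *tctx)
{
	return __sync_fetch_and_add(&tctx->cpumask_cnt, 0);
}

/* Plain load: equivalent here because all accesses to the per-task
 * counter are already serialized by the scheduler callbacks */
uint64_t read_cnt_plain(struct task_ctx *tctx)
{
	return tctx->cpumask_cnt;
}
```

Both return the same value; the plain load simply avoids an unnecessary atomic operation when no concurrent writer exists.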
	tctx = lookup_task_ctx(p);
	if (!tctx)
		return;
	task->cpumask_cnt = __sync_fetch_and_add(&tctx->cpumask_cnt, 0);
Ditto.
	tctx = lookup_task_ctx(p);
	if (!tctx)
		return;
	__sync_fetch_and_add(&tctx->cpumask_cnt, 1);
And this can just be tctx->cpumask_cnt++.
Introduce a per-task generation counter to check the validity of the cpumask at dispatch time.

The logic is the following:
- the cpumask generation number is incremented every time a task calls .set_cpumask()
- when a task is enqueued, the current generation number is stored in the queued_task_ctx and relayed to the user-space scheduler
- the user-space scheduler can decide to dispatch the task on the CPU determined by the BPF layer in .select_cpu(), redirect the task to any other specific CPU, or redirect to the first CPU available (using NO_CPU)
- the task is then dispatched back to the BPF code along with its cpumask generation counter
- at dispatch time the BPF code checks whether the generation number is still the same and discards the dispatch attempt if the cpumask is no longer valid (the task will be automatically re-enqueued by the sched-ext core code, potentially selecting another CPU / cpumask)
- if the cpumask is valid, but the CPU selected by the user-space scheduler is invalid (according to the cpumask), the task will be transparently bounced by the BPF code to the shared DSQ (this way the user-space code can stay completely abstracted, and dispatches that target invalid CPUs are automatically fixed by the BPF layer)

This solution can prevent stalls due to dispatches targeting invalid CPUs, and it can also avoid redundant dispatch events, making the code more efficient and the cpumask interlocking more reliable.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Force-pushed 7d5efa6 to 61d1ed3.
Updated (you're right, we don't need __sync_fetch_and_add(): the cpumask counter is per-task and it's all synchronized already). And yes, the next step will be to expose the cpumask to user-space so that the scheduler can make better decisions (and at that point we could even make the auto-bounce to the shared DSQ logic optional, with a flag or similar). Thanks for the review!
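The dispatch-time validity check described in the commit message can be sketched as follows. This is a simplified, hypothetical model (the struct, enum, and `check_dispatch()` names are invented for illustration; the real scx_rustland BPF code uses the kernel's cpumask API and DSQ helpers):

```c
#include <stdint.h>

#define NO_CPU (-1)	/* user space asks for "first CPU available" */

/* Hypothetical per-task state; the real scx_rustland structures differ */
struct task_state {
	uint64_t cpumask_cnt;	/* current cpumask generation counter */
	uint64_t allowed_mask;	/* bitmask of CPUs the task may run on */
};

enum dispatch_result {
	DISPATCH_OK,		/* dispatch to the CPU chosen by user space */
	DISPATCH_SHARED_DSQ,	/* NO_CPU or invalid CPU: use the shared DSQ */
	DISPATCH_DISCARDED,	/* stale generation: sched-ext re-enqueues */
};

/*
 * Validate a dispatch request coming back from the user-space scheduler:
 * @gen is the generation counter captured at enqueue time, @cpu is the
 * CPU selected by user space (or NO_CPU).
 */
enum dispatch_result check_dispatch(const struct task_state *ts,
				    uint64_t gen, int cpu)
{
	/* cpumask changed since enqueue: discard, the core re-enqueues */
	if (gen != ts->cpumask_cnt)
		return DISPATCH_DISCARDED;

	/* NO_CPU, or a CPU the cpumask no longer allows: bounce to the
	 * shared DSQ so any allowed CPU can pick the task up */
	if (cpu == NO_CPU || cpu < 0 || cpu >= 64 ||
	    !(ts->allowed_mask & (1ULL << cpu)))
		return DISPATCH_SHARED_DSQ;

	return DISPATCH_OK;
}
```

Note the ordering: the generation check comes first, so a stale dispatch is always discarded regardless of the target CPU, and only dispatches with a still-valid cpumask can be bounced or honored.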