scx_rustland: enhancements #72

Merged (7 commits into main) on Jan 7, 2024

Conversation

@arighi (Contributor) commented Jan 6, 2024

Some fixes and enhancements to make scx_rustland more usable on a desktop / interactive system.

@Byte-Lab (Contributor) left a comment

Overall looks good to me. Left one optional nit, and one suggestion for how we can make _get_task_cpu_slow() a lot faster.

Comment on lines 294 to 302
int i;

for (i = 0; i < num_possible_cpus; i++) {
cpu = (cpu + i) % num_possible_cpus;
if (bpf_cpumask_test_cpu(cpu, p->cpus_ptr))
return cpu;
}

return -1;

@Byte-Lab (Contributor)

Something else you could do is call bpf_cpumask_any_distribute(p->cpus_ptr) to choose a random CPU from p->cpus_ptr? Note that it returns a cpu >= num_cpus if none of the bits are set, so you'd have to do something like:

cpu = bpf_cpumask_any_distribute(p->cpus_ptr);
return cpu < num_possible_cpus ? cpu : -1;

@arighi (Contributor, Author)

Nice suggestion! Thanks! I'll apply it and run some tests. I think we should still keep the logic that tries prev_cpu before picking a random one with bpf_cpumask_any_distribute(); that should reduce unnecessary migrations. I'll test and measure both and check whether one shows a clear benefit over the other.

@Byte-Lab (Contributor)

Yep, agreed. I was thinking that bpf_cpumask_any_distribute() could replace that for loop. But trying to choose prev_cpu first seems like a good idea.
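
A minimal sketch combining the two ideas discussed above: try prev_cpu first to reduce migrations, then fall back to bpf_cpumask_any_distribute(). The helper name is hypothetical and this is not the PR's actual code; num_possible_cpus is assumed to be the same global used by the loop quoted earlier.

static s32 pick_task_cpu(struct task_struct *p, s32 prev_cpu)
{
        s32 cpu;

        /* Prefer the previously used CPU, if still allowed, to reduce
         * unnecessary task migrations. */
        if (bpf_cpumask_test_cpu(prev_cpu, p->cpus_ptr))
                return prev_cpu;

        /* Otherwise pick a random CPU from the task's cpumask. */
        cpu = bpf_cpumask_any_distribute(p->cpus_ptr);

        /* bpf_cpumask_any_distribute() returns a CPU >= the number of
         * possible CPUs if no bit is set in the cpumask. */
        return cpu < num_possible_cpus ? cpu : -1;
}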

@@ -375,6 +411,14 @@ static bool is_task_cpu_available(struct task_struct *p, u64 enq_flags)
if (is_kthread(p) && p->nr_cpus_allowed == 1)
return true;

/*
* Do not bypass the user-space scheduler if there are pending
* activity, otherwise, we may risk to disrupt the scheduler's

@Byte-Lab (Contributor)

optional suggestion:

s/we may risk to disrupt/we may risk disrupting

@arighi (Contributor, Author) commented Jan 6, 2024

Moreover, I'm not yet fully convinced about the last 2 commits (changing the default timeout_ms and slice_ns).

While they can definitely provide some benefits for interactive workloads, I think there's also value in keeping the defaults as they are, because that can help to better measure future improvements of the scheduler.

Maybe I can make those changes opt-in, applied only when a certain command line option is passed...

Andrea Righi added 5 commits January 6, 2024 11:06
Change TaskTree.push() to accept a Task object directly, rather than
each individual attribute. Moreover, Task attributes don't need to be
public, since both TaskTree and Task are only used locally.

This makes the code more elegant and more readable.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Assign min_vruntime to the task before the weighted time slice is
evaluated, then add the time slice.

In this way we still ensure that the task's vruntime is in the range
(min_vruntime + 1, min_vruntime + max_slice_ns], but we don't nullify
the effect of the evaluated time slice if the starting vruntime of the
task is too small.

Also change update_enqueued() to return the evaluated weighted time
slice (that can be used in the future).

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
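
As a rough illustration of the ordering described above, here is a hedged sketch; the struct, the function signature, and the weighted-slice formula are assumptions for illustration, not the PR's actual code:

struct task_ctx {
        u64 vruntime;
        u64 weight;     /* 100 == default task weight */
};

static u64 update_enqueued(struct task_ctx *task, u64 min_vruntime,
                           u64 max_slice_ns)
{
        /* Evaluate the weighted time slice (assumed formula: a lower
         * weight produces a larger vruntime charge). */
        u64 slice = max_slice_ns * 100 / task->weight;

        if (slice > max_slice_ns)
                slice = max_slice_ns;

        /* First clamp the task's vruntime to the global minimum... */
        if (task->vruntime < min_vruntime)
                task->vruntime = min_vruntime;

        /* ...then add the evaluated slice, so the result always falls
         * in (min_vruntime, min_vruntime + max_slice_ns]. */
        task->vruntime += slice;

        /* Return the evaluated weighted time slice. */
        return slice;
}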
Instead of just trying the target CPU and the previously used CPU,
cycle among all the available CPUs (when neither of those two can be
used), before falling back to the global DSQ.

This avoids excessively de-prioritizing tasks that can't be scheduled
on the CPU selected by the scheduler (or on their previously used CPU),
since we can still dispatch them using SCX_DSQ_LOCAL_ON, like any other
task.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
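
A condensed sketch of this fallback chain (target CPU, then previously used CPU, then any allowed CPU, then the global DSQ); the helper names, including get_task_cpu_slow() echoing the _get_task_cpu_slow() mentioned in the review, are assumptions, not the PR's exact code:

static void dispatch_task(struct task_struct *p, s32 cpu, u64 slice_ns)
{
        /* Try the CPU selected by the user-space scheduler first. */
        if (!bpf_cpumask_test_cpu(cpu, p->cpus_ptr)) {
                /* Fall back to the task's previously used CPU. */
                cpu = scx_bpf_task_cpu(p);
                if (!bpf_cpumask_test_cpu(cpu, p->cpus_ptr))
                        /* Cycle among all the CPUs allowed by the
                         * task's cpumask. */
                        cpu = get_task_cpu_slow(p, cpu);
        }

        if (cpu >= 0) {
                /* Dispatch to the local DSQ of the selected CPU. */
                scx_bpf_dispatch(p, SCX_DSQ_LOCAL_ON | cpu, slice_ns, 0);
                return;
        }

        /* No usable CPU found: fall back to the global DSQ. */
        scx_bpf_dispatch(p, SCX_DSQ_GLOBAL, slice_ns, 0);
}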
When we fail to push a task to the queued BPF map we fall back to
direct dispatch, but we can't use SCX_DSQ_LOCAL_ON in that path. So,
make sure to use SCX_DSQ_GLOBAL in this case, to prevent scheduler
crashes.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
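
A minimal sketch of this error path, assuming a BPF queue map named queued and the surrounding enqueue-path variables:

/* Try to queue the task to the user-space scheduler. */
if (bpf_map_push_elem(&queued, &task, 0)) {
        /*
         * The queued map is full: dispatch the task directly.
         * SCX_DSQ_LOCAL_ON cannot be used from this path, so use
         * SCX_DSQ_GLOBAL to avoid crashing the scheduler.
         */
        scx_bpf_dispatch(p, SCX_DSQ_GLOBAL, slice_ns, enq_flags);
        return;
}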
We allow tasks to bypass the user-space scheduler and be dispatched
directly using a shortcut in the enqueue path, if their running CPU is
immediately available or if the task is a per-CPU kthread.

However, the shortcut is disabled if the user-space scheduler has some
pending activity to do (to avoid disrupting its decisions too much).

In this case the shortcut is disabled also for per-CPU kthreads, which
may cause priority-inversion problems in the system, triggering stalls
of some per-CPU kthreads (such as rcuog/N) and short system lockups, if
the system is overloaded.

Prevent this by always enabling the dispatch shortcut for per-CPU
kthreads.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
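
A condensed sketch of the resulting check order (variable and helper names are assumptions based on the hunk quoted earlier, and the final idle check is illustrative):

static bool is_task_cpu_available(struct task_struct *p, u64 enq_flags)
{
        /* Per-CPU kthreads always take the direct-dispatch shortcut,
         * even when the user-space scheduler has pending activity. */
        if (is_kthread(p) && p->nr_cpus_allowed == 1)
                return true;

        /*
         * Do not bypass the user-space scheduler if it has pending
         * activity, to avoid disrupting its decisions.
         */
        if (nr_queued || nr_scheduled)
                return false;

        /* Otherwise allow the shortcut if the task's previously used
         * CPU is currently idle. */
        return scx_bpf_test_and_clear_cpu_idle(scx_bpf_task_cpu(p));
}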
@arighi (Contributor, Author) commented Jan 6, 2024

Alright, I dropped some commits and updated this PR only with the "safe" fixes.

For the more experimental ones I'd like to do more testing (even if those changes make the scheduler more responsive, they also make it more unpredictable sometimes, so I would like to investigate that further).

Andrea Righi added 2 commits January 7, 2024 16:14
Always report nr_queued and nr_scheduled in the log messages. Moreover,
also report the task's name (comm) and CPU when possible.

All this extra information can be really helpful to trace and debug
scheduling issues.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Cache slice_ns into the main scheduler class to avoid accessing it via
self.bpf.skel.rodata().slice_ns every single time.

This also makes the scheduler code clearer and better abstracted from
the BPF details.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
@htejun merged commit 63fe690 into main on Jan 7, 2024 (2 checks passed)
@htejun deleted the scx-rustland-enhancements branch on January 10, 2024 19:51