forked from sched-ext/scx
lib: embedded ATQ #2
Closed
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Add a utility script, cgpath.sh, that takes a cgroup ID as a command-line argument and returns the full path of the cgroup. The cgroup ID is the inode number of the cgroup. This is for easy debugging of the cpu.max support. Signed-off-by: Changwoo Min <changwoo@igalia.com>
Define key data structures (scx_cgroup_ctx, scx_cgroup_llc_ctx) to support CPU bandwidth control (cpu.max) in cgroup v2. In addition, add an API skeleton for BPF schedulers. Signed-off-by: Changwoo Min <changwoo@igalia.com>
To avoid an -E2BIG error, tweak the code to reduce the BPF program size. Signed-off-by: Changwoo Min <changwoo@igalia.com>
Integrate the CPU bandwidth control with the scx_lavd scheduler. The library is initialized (scx_cgroup_bw_lib_init) when the scheduler is initialized. Also, ops.cgroup_init(), ops.cgroup_exit(), and ops.cgroup_move() are implemented; scx_cgroup_bw_reenqueue() is called at ops.dispatch(). A new option, `--enable-cpu-bw`, is added to enable the feature. Finally, replace __nr_cpu_ids with nr_cpu_ids, which is defined in the scx library. Signed-off-by: Changwoo Min <changwoo@igalia.com>
scx_cgroup_bw_lib_init() first initializes the config and the replenish timer. Signed-off-by: Changwoo Min <changwoo@igalia.com>
When a cgroup is initialized, the cgroup's context and its LLC contexts are initialized. Also, its parent now becomes non-leaf. If its parent is not threaded, it cannot have tasks, so we delete its LLC contexts. Signed-off-by: Changwoo Min <changwoo@igalia.com>
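The parent/child bookkeeping described in this commit can be illustrated with a small userspace sketch. The struct and function names (`cg_node`, `cg_node_init`) and the boolean `has_llc_ctx` flag are hypothetical stand-ins, not the library's actual data structures; the real code manages per-LLC context objects rather than a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cgroup context. */
struct cg_node {
	struct cg_node *parent;
	bool threaded;     /* a threaded cgroup may hold tasks even when non-leaf */
	bool is_leaf;
	bool has_llc_ctx;  /* stands in for "LLC contexts are allocated" */
};

/* Initialize a new cgroup and update its parent accordingly. */
void cg_node_init(struct cg_node *cg, struct cg_node *parent, bool threaded)
{
	assert(cg != NULL);

	cg->parent = parent;
	cg->threaded = threaded;
	cg->is_leaf = true;      /* a freshly created cgroup has no children */
	cg->has_llc_ctx = true;  /* so it may hold tasks and needs LLC contexts */

	if (!parent)
		return;

	/* The parent just gained a child, so it is no longer a leaf. */
	parent->is_leaf = false;

	/* A non-threaded, non-leaf cgroup cannot have tasks: drop its LLC contexts. */
	if (!parent->threaded)
		parent->has_llc_ctx = false;
}
```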
When a cgroup's bandwidth is updated, we should update the nquota_lb of all its descendants too. Signed-off-by: Changwoo Min <changwoo@igalia.com>
Destroy the cgroup context and its LLC contexts, then drain and free the BTQs associated with those LLC contexts. Signed-off-by: Changwoo Min <changwoo@igalia.com>
We reserve the budget in a bottom-up manner: first at the LLC level, then at the cgroup level, and finally at the subroot cgroup level. Before requesting budget from the subroot cgroup, update this cgroup's runtime_total_sloppy to avoid spending the budget over its upper bound (nquota_ub). We traverse the cgroup hierarchy in a post-order manner (left, right, then root) and check the cgroup's level to efficiently sum the runtime_total of all its descendants. Signed-off-by: Changwoo Min <changwoo@igalia.com>
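As a rough illustration of the bottom-up order only, the following userspace sketch models each level as a single pool of nanoseconds. The names `budget_pool`, `pool_transfer`, and `reserve_bottom_up` are hypothetical, and the sketch deliberately omits the runtime_total_sloppy update, the nquota_ub check, and the post-order traversal described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified budget pool; the real contexts hold more state. */
struct budget_pool {
	int64_t remaining_ns;
};

/* Move up to want_ns from src to dst; never moves more than src holds. */
static void pool_transfer(struct budget_pool *src, struct budget_pool *dst,
			  int64_t want_ns)
{
	int64_t got = src->remaining_ns < want_ns ? src->remaining_ns : want_ns;

	if (got <= 0)
		return;
	src->remaining_ns -= got;
	dst->remaining_ns += got;
}

/*
 * Try the LLC pool first, then pull from the cgroup pool, and only then
 * from the subroot pool. Returns 0 on success or -EAGAIN when even the
 * subroot cannot cover the request (i.e., the cgroup is throttled).
 */
int reserve_bottom_up(struct budget_pool *llc, struct budget_pool *cgrp,
		      struct budget_pool *subroot, int64_t need_ns)
{
	assert(need_ns > 0);

	if (llc->remaining_ns < need_ns) {
		int64_t missing = need_ns - llc->remaining_ns;

		/* The cgroup level is short too: ask the subroot level. */
		if (cgrp->remaining_ns < missing)
			pool_transfer(subroot, cgrp, missing - cgrp->remaining_ns);
		pool_transfer(cgrp, llc, missing);
	}

	/* In this sketch, budget already pulled down stays at the LLC level. */
	if (llc->remaining_ns < need_ns)
		return -EAGAIN;

	llc->remaining_ns -= need_ns;
	return 0;
}
```

Keeping the fast path at the LLC level means most reservations touch only LLC-local state, which is presumably why the levels are consulted in this order.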
After executing a task, we update runtime_total and compensate for budget_remaining_before by comparing the planned vs. actual time usage. Signed-off-by: Changwoo Min <changwoo@igalia.com>
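The planned-versus-actual compensation can be sketched in plain C as follows. The struct `llc_budget` and the function `account_consumed` are hypothetical names for illustration; only the idea of adding the actual runtime to a running total and returning (or charging) the difference comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified accounting state for one LLC context. */
struct llc_budget {
	int64_t remaining_ns;      /* may go negative when a task overruns */
	int64_t runtime_total_ns;  /* time actually consumed so far */
};

/*
 * Called after a task ran. reserved_ns is what was set aside before it ran;
 * consumed_ns is what it actually used.
 */
void account_consumed(struct llc_budget *b, int64_t reserved_ns,
		      int64_t consumed_ns)
{
	assert(reserved_ns >= 0 && consumed_ns >= 0);

	b->runtime_total_ns += consumed_ns;

	/*
	 * Underuse gives budget back (positive difference); overuse charges
	 * the extra time (negative difference).
	 */
	b->remaining_ns += reserved_ns - consumed_ns;
}
```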
When a cgroup is throttled (i.e., scx_cgroup_bw_reserve() returns -EAGAIN), a task that is in the ops.enqueue() path should be set aside in the BTQ of its associated LLC context. When the cgroup becomes unthrottled again, the registered enqueue_cb() will be called to re-enqueue the task for execution. Signed-off-by: Changwoo Min <changwoo@igalia.com>
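A minimal userspace sketch of the set-aside and re-enqueue flow is shown below. It assumes a fixed-size FIFO of PIDs, which is a simplification chosen for illustration; the actual BTQ's ordering and storage are not described here. `backlog`, `backlog_push`, `backlog_drain`, and `record_enqueue_cb` are hypothetical names, with `record_enqueue_cb` standing in for a scheduler's registered enqueue_cb().

```c
#include <assert.h>
#include <stddef.h>

#define BACKLOG_CAP 8

/* Callback type standing in for the scheduler's registered enqueue_cb(). */
typedef void (*enqueue_cb_t)(int pid);

/* Hypothetical fixed-size FIFO of throttled tasks (assumed ordering). */
struct backlog {
	int pids[BACKLOG_CAP];
	size_t head;
	size_t len;
};

/* Set a throttled task aside. Returns 0, or -1 when the queue is full. */
int backlog_push(struct backlog *q, int pid)
{
	if (q->len == BACKLOG_CAP)
		return -1;
	q->pids[(q->head + q->len) % BACKLOG_CAP] = pid;
	q->len++;
	return 0;
}

/* On unthrottle: hand every backlogged task to cb; returns how many. */
size_t backlog_drain(struct backlog *q, enqueue_cb_t cb)
{
	size_t n = 0;

	assert(cb != NULL);
	while (q->len > 0) {
		int pid = q->pids[q->head];

		q->head = (q->head + 1) % BACKLOG_CAP;
		q->len--;
		cb(pid);
		n++;
	}
	return n;
}

/* Example callback that only records which tasks were re-enqueued. */
int reenqueued[BACKLOG_CAP];
size_t nr_reenqueued;

void record_enqueue_cb(int pid)
{
	if (nr_reenqueued < BACKLOG_CAP)
		reenqueued[nr_reenqueued++] = pid;
}
```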
We replenish the cgroup's budget every 100 ms (nperiod). Only subroot cgroups (level == 1) distribute the budget to their descendants; we only replenish the budget for subroot cgroups with a limited quota. Unused budget can accumulate up to the specified burst. Overused budget, on the other hand, is charged against the following intervals. The replenish timer is split into two parts: the top half and the bottom half. The top half -- the actual BPF timer function (replenish_timerfn) -- runs the essential, critical part, such as refilling the time budget. The bottom half -- scx_cgroup_bw_reenqueue() -- runs in a BPF scheduler's ops.dispatch() and requeues the backlogged tasks to the proper DSQs. Signed-off-by: Changwoo Min <changwoo@igalia.com>
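One plausible reading of the burst and overuse rules is sketched below as a pure function. The name `replenish_budget` and the exact arithmetic are assumptions made for illustration: leftover budget is carried forward but capped at the burst, and a negative balance (debt from overuse) is carried forward in full so that it is paid off over later periods.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the budget at the start of the next period.
 *   remaining_ns: balance at the end of the current period
 *                 (positive = unused, negative = overused)
 *   quota_ns:     budget granted per period
 *   burst_ns:     maximum unused budget that may be carried over
 */
int64_t replenish_budget(int64_t remaining_ns, int64_t quota_ns,
			 int64_t burst_ns)
{
	int64_t carried = remaining_ns;

	assert(quota_ns >= 0 && burst_ns >= 0);

	/* Unused budget accumulates only up to the burst. */
	if (carried > burst_ns)
		carried = burst_ns;

	/* Debt (carried < 0) is kept as-is and reduces the new budget. */
	return quota_ns + carried;
}
```

For example, with a quota of 50 and a burst of 20, a leftover of 30 yields 70 for the next period, while a debt of 80 yields -30, which the following period then brings back up to 20.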
Under CPU bandwidth control using cpu.max, we also need to report how much time was actually consumed compared to the reserved time. Signed-off-by: Changwoo Min <changwoo@igalia.com>
Update the ops.enqueue(), enqueue callback, and ops.dispatch() paths. Under CPU bandwidth control using cpu.max, we should first reserve time for execution. If we succeed in reserving the time, we go ahead. Otherwise, we should set the task aside for later execution. When triggered by lavd_enqueue_cb(), we should still enqueue the task even if the time reservation fails (-EAGAIN). Note that, to guarantee forward progress, we do not throttle the scheduler process itself. Signed-off-by: Changwoo Min <changwoo@igalia.com>
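The decision described in this commit can be condensed into a small pure function. This is a sketch, not the scx_lavd code: `enqueue_action` and `decide_enqueue` are hypothetical names, and the function assumes the reservation result is either 0 or -EAGAIN, since no other return values are described here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum enqueue_action {
	ENQ_DISPATCH,  /* go ahead and enqueue the task for execution */
	ENQ_BACKLOG,   /* set the task aside until the cgroup is unthrottled */
};

/*
 * reserve_ret:       result of the time reservation (0 or -EAGAIN assumed)
 * from_enqueue_cb:   true when called from the re-enqueue callback path
 * is_scheduler_task: true for the scheduler process itself
 */
enum enqueue_action decide_enqueue(int reserve_ret, bool from_enqueue_cb,
				   bool is_scheduler_task)
{
	assert(reserve_ret == 0 || reserve_ret == -EAGAIN);

	/* Never throttle the scheduler itself, to guarantee forward progress. */
	if (is_scheduler_task)
		return ENQ_DISPATCH;

	/* The reservation succeeded: proceed normally. */
	if (reserve_ret == 0)
		return ENQ_DISPATCH;

	/* On the callback path, enqueue even though the reservation failed. */
	if (from_enqueue_cb)
		return ENQ_DISPATCH;

	return ENQ_BACKLOG;
}
```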
This reverts commit ae41fd9.
Author
Closing since the merge was accidentally pointed to main.
Add the necessary changes to the rbtree and ATQ APIs to enable no-allocation ATQ operations.
NOTE: There is currently a verification failure due to the stack depth being exceeded. This is incidental to the patch and just requires some more debugging that I will be doing over the weekend.