SQL syntax for scheduler nodes #63249

serxa · 2024-05-01T18:13:07Z

The scheduling hierarchy and workload classifiers, although good building blocks, are too complicated to define and require config.xml changes. An SQL alternative for defining these entities is required.

Describe the solution you'd like
We need an SQL syntax for defining scheduling nodes that is conceptually easier to grasp. Here are the features I'd like to see:

1. Multiple node types should be avoided. Instead, it is better to have a single unified node type that is expanded into a section of the tree.
2. This unified node should have a weight and a priority. Leaf nodes are assigned the fifo type. Non-leaf nodes are expanded into a parent -> prio -> fair -> children subtree of policies. The prio node can be omitted if all children have the same priority. The fair node can be omitted if all children have different priorities. This way, children with the same priority compete according to their weight ratio.
3. A unified node can have a few constraints: max_requests, max_cost, max_speed and max_burst. Every constraint is inserted into the tree between the node and its parent. For a leaf node: parent -> constraint -> fifo. For inner nodes: parent -> constraint -> prio -> fair -> children. Multiple constraints are chained one after another. (We could also allow max_speed_2 and max_burst_2 to make it possible to define two leaky buckets.)
4. Different trees for different resources (<clickhouse><resources><resource_name>) should be avoided for simplicity (remember cgroups v1 vs cgroups v2 in Linux). Instead, the same unified tree is expanded into an independent scheduler tree per resource. All trees are identical.
5. Workload classifiers should also be integrated into the same unified tree, to avoid unnecessary entities. A leaf node name must be unique across the whole unified tree and is used as the workload name (<clickhouse><workload_classifiers><workload_name>). This is much simpler, but it does not allow sending different workloads to the same fifo queue. This is fine: an intermediate node can easily be created. It won't be identical, but it is good enough.
6. Given p.4 and p.5, the list of resources is no longer required, and the scheduler tree for any resource can be constructed at runtime on the first request for that resource.
7. Sometimes it is important to have different settings for different resources, most importantly weights. For example, this is needed to specify different shares of IO and CPU for a workload (e.g. 90% of CPU and 50% of IO). To achieve this, a unified node may have cpu_weight and io_weight set to different values alongside the plain weight, overriding its value for the specific resources cpu and io.
8. The same logic applies to constraints: max_requests, max_cost, max_speed and max_burst can be overridden. For example, io_max_requests overrides max_requests for the resource named io.
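
To make the expansion rules above concrete, here is a minimal Python sketch of how one unified tree could be expanded into a per-resource policy tree. It is only an illustration, not a proposed implementation: the names UnifiedNode, resolve and expand are hypothetical, as are the (label, children) tuple representation, the default weight of 1 and priority of 0, and the order in which multiple constraints are chained (the issue does not specify one).

```python
from dataclasses import dataclass, field

# Constraint names from the proposal; the chaining order is an assumption.
CONSTRAINTS = ("max_requests", "max_cost", "max_speed", "max_burst")


@dataclass
class UnifiedNode:  # hypothetical in-memory form of a unified node
    name: str
    # weight, priority, max_* and <resource>_-prefixed overrides
    settings: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def resolve(node, resource, key, default=None):
    """Look up a setting; <resource>_<key> overrides the plain <key>."""
    plain = node.settings.get(key, default)
    return node.settings.get(f"{resource}_{key}", plain)


def expand(node, resource):
    """Expand a unified node into a nested (label, children) policy subtree."""
    if not node.children:
        subtree = (f"fifo {node.name}", [])
    else:
        # Children with equal priority share one fair node.
        groups = {}
        for child in node.children:
            prio = resolve(child, resource, "priority", 0)
            groups.setdefault(prio, []).append(child)
        branches = []
        for prio in sorted(groups):  # sorted only for deterministic output
            members = groups[prio]
            expanded = [expand(c, resource) for c in members]
            if len(members) == 1:
                branches.append(expanded[0])  # fair omitted: no sibling to share with
            else:
                weights = ", ".join(
                    f"{c.name}={resolve(c, resource, 'weight', 1)}" for c in members
                )
                branches.append((f"fair {weights}", expanded))
        # prio omitted when all children have the same priority
        subtree = branches[0] if len(branches) == 1 else ("prio", branches)
    # Chain constraints between the node and its parent (first name outermost).
    for key in reversed(CONSTRAINTS):
        value = resolve(node, resource, key)
        if value is not None:
            subtree = (f"{key}={value}", [subtree])
    return subtree


# Example: one unified tree, expanded independently for two resources.
tree = UnifiedNode("all", {"max_requests": 100, "io_max_requests": 10}, [
    UnifiedNode("production", {"weight": 3, "cpu_weight": 9}),
    UnifiedNode("development", {"weight": 1}),
    UnifiedNode("admin", {"priority": 1}),
])
io_tree = expand(tree, "io")
cpu_tree = expand(tree, "cpu")
```

In this sketch the io tree is rooted at a max_requests=10 constraint and the cpu tree at max_requests=100, and production competes with development at a 3:1 weight ratio for io but 9:1 for cpu. Both trees share the same shape, which matches the "all trees will be identical" requirement up to per-resource overrides.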

Permissions
It should be possible to allow users to modify only a subtree. The first implementation will be simpler, though: it will either allow or deny all modifications.

SQL Syntax (TBD)
CREATE SCHEDULER NODE
ALTER SCHEDULER NODE
DROP SCHEDULER NODE
SHOW CREATE SCHEDULER NODE
SHOW SCHEDULER NODES


serxa added the feature label May 1, 2024
