Skip to content

Fix SCHED_FLAG_KEEP_PARAMS use for sched_runtime#148

Closed
petretudor-arm wants to merge 1 commit into
scheduler-tools:masterfrom
petretudor-arm:sched-other-keep-runtime
Closed

Fix SCHED_FLAG_KEEP_PARAMS use for sched_runtime#148
petretudor-arm wants to merge 1 commit into
scheduler-tools:masterfrom
petretudor-arm:sched-other-keep-runtime

Conversation

@petretudor-arm
Copy link
Copy Markdown
Contributor

SCHED_FLAG_KEEP_PARAMS is too coarse-grained for preserving only the sched_runtime value set by system managers since it will also prevent other parameters from being changed. The result is that the SCHED_OTHER nice value can now only be changed when sched_runtime != 0.

Instead, when sched_runtime is not requested by the user, set it to the value read by sched_getattr.

As @deggeman pointed out on the commit, this is a very narrow fix. AFAIK systemd can also set sched_policy, sched_nice, sched_priority, the RESET_ON_FORK field in sched_flags and uclamp values with cgroup v2. And this is not accounting for other things systemd can control like affinity, NUMA etc.

Maybe we should revisit the strategy for how we decide default values for scheduling parameters? Should everything behave like "if not set by the user, set it to an inherited value"?

SCHED_FLAG_KEEP_PARAMS is too coarse-grained for preserving only the
sched_runtime value set by system managers since it will also prevent
other parameters from being changed. The result is that the SCHED_OTHER
nice value can now only be changed when sched_runtime != 0.

Instead, when sched_runtime is not requested by the user, set it to the
value read by sched_getattr.

Signed-off-by: Petre Tudor <petre-ionut.tudor@arm.com>
Comment thread src/rt-app.c
perror("sched_getattr: failed to get SCHED_OTHER attributes");
exit(EXIT_FAILURE);
}
sa_params.sched_runtime = current_sa_params.sched_runtime;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You only update sched_runtime but what happen for sched_nice, sched_deadline ...

should we not always call sched_getattr() and then only apply the change

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess this is the way to do this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree, but won't it break a few scripts? At this point there probably is an expectation that when the user does not set a scheduling-related value, it gets a known default from rt-app.

Examples:

  • user sets SCHED_OTHER expecting a default nice value of 0, but they get some other default.
  • user sets SCHED_FIFO as the policy, but not priority, expecting to get SCHED_FIFO with priority 10 (DEFAULT_THREAD_PRIORITY), but what they get is the system default priority from kernel or systemd.
  • user sets SCHED_DEADLINE and a valid deadline triple, but the system sets a priority != 0 and sched_setattr() fails. Now they have to also set the priority to 0 on every deadline task

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean something like below.
`--- a/src/rt-app.c
+++ b/src/rt-app.c
@@ -969,6 +969,16 @@ static void __set_thread_sched_other_attrs(thread_data_t *data,
sched_data->runtime);

    tid = gettid();
  •   /* sched_runtime was not requested - keep the current value */
    
  •   ret = sched_getattr(tid, &sa_params,
    
  •                       sizeof(struct sched_attr), flags);
    
  •   if (ret) {
    
  •           log_critical("[%d] sched_getattr returned %d",
    
  •                        data->ind, ret);
    
  •           errno = ret;
    
  •           perror("sched_getattr: failed to get SCHED_OTHER attributes");
    
  •           exit(EXIT_FAILURE);
    
  •   }
      sa_params.size = sizeof(struct sched_attr);
      sa_params.sched_policy = sched_data->policy;
      sa_params.sched_priority = __sched_priority(data, sched_data);
    

@@ -982,8 +992,6 @@ static void __set_thread_sched_other_attrs(thread_data_t *data,

    if(sched_data->runtime)
            sa_params.sched_runtime = sched_data->runtime;
  •   else
    
  •           sa_params.sched_flags = SCHED_FLAG_KEEP_PARAMS;
    
      ret = sched_setattr(tid, &sa_params, flags);
      if (ret) {
    

`

Beside this, we probably need to think about a way to describe that we don't want to make any change instead of applying default values. I think that adding the below and setting "default_policy": "SCHED_SAME" in the "global" node should be enough (not compiled, not tested)

diff --git a/src/rt-app_utils.c b/src/rt-app_utils.c
index 209d99176b3b..81d6ab1f228c 100644
--- a/src/rt-app_utils.c
+++ b/src/rt-app_utils.c
@@ -186,6 +186,8 @@ string_to_policy(const char *policy_name, policy_t *policy)
*policy = fifo;
else if (strcmp(policy_name, "SCHED_DEADLINE") == 0)
*policy = deadline;

  •   else if (strcmp(policy_name, "SCHED_SAME") == 0)
    
  •           *policy =  same;
      else
              return 1;
      return 0;
    

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this behaviour should affect the sched_attr we set regardless of which scheduling class the policy belongs to. So for v2 of this PR I propose:

  • We introduce a new global parameter, called "defaults" which controls the source of default values and can take two values: "system" and "rt-app". By default it would be set to "rt-app" to preserve existing behaviour.
  • Change sched_setattr (or add a new function) to check for this flag and in the "system" case do what you showed in your comment above, only extended to all individual sched_attr fields rather than just priority/nice and runtime. In horrible pseudocode:
fn sched_setattr_2(..., sched_data_t *sched_data, ...)
{
	if defaults == "system" {
		sa_params = sched_getattr();
		for each $field in struct sched_data_t {
			if sched_data.$field is set {
				// Need to find values which describe what it means to be "unset".
				sa_params.$field = sched_data.$field;
			}
		}
	} else {
		// current behaviour: overwrite with union of rt-app defaults and user values
		sa_params = sched_data;
	}

return sched_setattr(..., sa_params, ...);
}

This shouldn't interfere with policy_t::same as it takes effect before set_thread_param starts applying scheduling parameters.

Does this make sense?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be ok, i suppose

@deggeman
Copy link
Copy Markdown
Contributor

deggeman commented Apr 24, 2026

Do you now want to fix more than this nice or runtime overwrite in case sched_data doesn't have the appropriate value? I thought a fix like this will cure the issue just fine?

0001-rt-app.c-Allow-to-set-only-nice-or-runtime-value-for.patch

Hope you can see the patch? The source code directly in the comments is rendered awfully.

@petretudor-arm
Copy link
Copy Markdown
Contributor Author

petretudor-arm commented Apr 24, 2026

That's what I was proposing. IMO this behaviour should not be specific to fair policies, but extend to RT, deadline and idle as well. This plus a flag to make it more explicit means the user always knows what to expect from these values and we shouldn't break anything.

With your fix, if the user explicitly sets priority to 0 they get the system value for nice. Same with the runtime value. Also, since rt-app's default values for nice and runtime are also both 0, it effectively eliminates rt-app defaults from the equation, so now you can only set nice and runtime to 0 if those are the system defaults. I would say this is not what users expect and it's also inconsistent with how the other set_thread_ functions behave.

@petretudor-arm
Copy link
Copy Markdown
Contributor Author

I should mention my patch also has the same problems, but only affecting the runtime value. Hence why I proposed the approach above.

@deggeman
Copy link
Copy Markdown
Contributor

Right, 'priority == 0' doesn't work as unspecified. So then lets use THREAD_PRIORITY_UNCHANGED and enable it to be routed into __set_thread_sched_other_attrs(). I guess runtime == 0 is fine since it should only be allowed to be set between 0.5 and 100ms. Can you share you patch as a file here or as a patch I can fetch for testing? I attach my new version here (only Fair tasks):

IMHO, you can't set nice or slice via sched_setattr for SCHED_IDLE:

__sched_setscheduler()

   if (unlikely(policy == p->policy)) {
            if (fair_policy(policy) &&        
                (attr->sched_nice != task_nice(p) ||
                 (attr->sched_runtime != p->se.slice)))
                    goto change;

fair_policy() only returns true for SCHED_NORMAL (SCHED_EXT) & SCHED_BATCH.

0001-rt-app.c-Allow-to-set-only-nice-or-runtime-value-for.patch

@petretudor-arm
Copy link
Copy Markdown
Contributor Author

If I understand correctly, with your patch the user should set the prio to THREAD_PRIORITY_UNCHANGED to get the system/inherited value, otherwise they get the rt-app default.

if (sched_data->prio)
		sa_params.sched_nice = sched_data->prio;
else
		sa_params.sched_nice = _sa_params.sched_nice;

If __set_thread_sched_other_attrs() gets a sched_data->prio of THREAD_PRIORITY_UNCHANGED then won't this set the nice value to INT_MAX for SCHED_{NORMAL,BATCH,IDLE}? So I guess this would need changing to something like

if (sched_data->prio == THREAD_PRIORITY_UNCHANGED)
                sa_params.sched_nice = _sa_params.sched_nice;
else
                /* Already validated; set by rt-app to 0 if not present in the config. */
               sa_params.sched_nice = sched_data->prio;

Can you share you patch as a file here or as a patch I can fetch for testing?

I was refering to the v1 patch I submitted with this PR, sorry for the confusion.

@deggeman
Copy link
Copy Markdown
Contributor

Can't you see the patch I attached to my last comment?

parse_sched_data() sets 'prior_def = THREAD_PRIORITY_UNCHANGED' in case 'policy == same'. My patch just makes sure we route this into__set_thread_sched_other_attrs().

@vingu-linaro
Copy link
Copy Markdown
Member

Can't you see the patch I attached to my last comment?

parse_sched_data() sets 'prior_def = THREAD_PRIORITY_UNCHANGED' in case 'policy == same'. My patch just makes sure we route this into__set_thread_sched_other_attrs().

Would be easier if you sent your proposal either in this PR or create a new one so we can comment directly.

@deggeman
Copy link
Copy Markdown
Contributor

Yeah, you're right, let me do this.

@deggeman
Copy link
Copy Markdown
Contributor

#149

@vingu-linaro
Copy link
Copy Markdown
Member

This #148 and #149 are fixing the same problem.
Am I correct to consider that #149 includes also these changes ?

@petretudor-arm
Copy link
Copy Markdown
Contributor Author

This #148 and #149 are fixing the same problem. Am I correct to consider that #149 includes also these changes ?

Yes. I'll go ahead and close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants