sched/fair: Record the average duration of a task [v7]
Record the average duration of a task, as there is a requirement
to leverage this information for better task placement.

At first glance, (p->se.sum_exec_runtime / p->nvcsw) could be
used to measure the task duration. However, such a formula
weights the distant past too heavily. Ideally, old activity
should decay and not affect the current status too much.
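
To illustrate the difference, here is a minimal user-space sketch, not kernel code. It compares the lifetime mean with a decaying mean. The 1/8 weight is an assumption modeled on the scheduler's update_avg() helper, and the names lifetime_avg, decaying_avg_update, avg_pair and phase_change_demo are hypothetical.

```c
#include <stdint.h>

/* Lifetime mean: total runtime divided by the number of voluntary switches. */
static uint64_t lifetime_avg(uint64_t sum_exec_runtime, uint64_t nvcsw)
{
	return nvcsw ? sum_exec_runtime / nvcsw : 0;
}

/* Decaying mean: each sample closes 1/8 of the gap (assumed weight). */
static void decaying_avg_update(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)sample - (int64_t)*avg;

	*avg = (uint64_t)((int64_t)*avg + diff / 8);
}

struct avg_pair {
	uint64_t lifetime;
	uint64_t decaying;
};

/* Hypothetical workload: 1000 long sections, then 20 short ones. */
static struct avg_pair phase_change_demo(void)
{
	const uint64_t long_ns = 10000000;	/* 10 ms per section */
	const uint64_t short_ns = 100000;	/* 0.1 ms per section */
	uint64_t sum = 0, nvcsw = 0, decaying = 0;
	struct avg_pair r;
	int i;

	for (i = 0; i < 1000; i++) {
		sum += long_ns;
		nvcsw++;
		decaying_avg_update(&decaying, long_ns);
	}
	for (i = 0; i < 20; i++) {
		sum += short_ns;
		nvcsw++;
		decaying_avg_update(&decaying, short_ns);
	}

	r.lifetime = lifetime_avg(sum, nvcsw);
	r.decaying = decaying;
	return r;
}
```

After the task switches to short sections, the lifetime mean in this sketch stays above 9 ms while the decaying mean drops below 1 ms, so only the latter tracks the recent behavior.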

Although something based on PELT could be used, se.util_avg might
not be appropriate to describe the task duration. Suppose task p1
and task p2 do frequent ping-pong scheduling on one CPU: both have
a short duration, yet the util_avg of each can be up to 50%, which
is inconsistent with the task duration.
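
The ping-pong case can be sketched with a deliberately simplified model. It uses a plain time share instead of PELT's geometric decay, so it only approximates what util_avg would report. The function name pingpong_util_pct and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Two tasks alternate on one CPU; each runs burst_ns and then blocks.
 * Returns p1's share of wall-clock time in percent. This is a plain
 * ratio, not PELT.
 */
static unsigned int pingpong_util_pct(uint64_t burst_ns, unsigned int rounds)
{
	uint64_t wall_ns = 0, p1_run_ns = 0;
	unsigned int i;

	for (i = 0; i < rounds; i++) {
		/* p1 runs one section, then blocks */
		p1_run_ns += burst_ns;
		wall_ns += burst_ns;
		/* p2 runs one section while p1 sleeps */
		wall_ns += burst_ns;
	}

	return wall_ns ? (unsigned int)(p1_run_ns * 100 / wall_ns) : 0;
}
```

In this model the share is 50% whether each section lasts 50 us or 5 ms, so a utilization-style metric cannot tell a short-duration task from a long-duration one.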

A similar feature once existed to track the duration of a task:
commit ad4b78b ("sched: Add new wakeup preemption mode: WAKEUP_RUNNING")
Unfortunately, it was reverted because it was an experiment. Pick the
patch up again and record the average duration when a task voluntarily
switches out.

For example, suppose on CPU1, task p1 and p2 run alternately:

 --------------------> time

 | p1 runs 1ms | p2 preempt p1 | p1 switch in, runs 0.5ms and blocks |
               ^               ^                                     ^
 |_____________|               |_____________________________________|
                                                                     ^
                                                                     |
                                                                  p1 dequeued

p1's duration in one section is (1 + 0.5)ms, because p1 would have
run for 1.5ms had p2 not preempted it. This reflects the nature of
a task: how long it wishes to run at most.
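
The timeline above can be replayed with a small user-space model of the bookkeeping. This is a sketch under assumptions: struct task_model, model_run and model_switch_out are hypothetical names, and the 1/8 averaging weight is assumed to match the scheduler's update_avg() helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three sched_entity fields involved. */
struct task_model {
	uint64_t sum_exec_runtime;		/* total CPU time consumed, ns */
	uint64_t prev_sleep_sum_runtime;	/* snapshot taken at the last sleep */
	uint64_t dur_avg;			/* decaying average of section length */
};

/* Account ns of CPU time to the task. */
static void model_run(struct task_model *t, uint64_t ns)
{
	t->sum_exec_runtime += ns;
}

/*
 * Called when the task stops running. Only a voluntary sleep closes a
 * section; a preemption leaves the snapshot untouched so the runtime
 * keeps accumulating. Returns the section length, or 0 if none closed.
 */
static uint64_t model_switch_out(struct task_model *t, bool task_sleep)
{
	uint64_t dur;
	int64_t diff;

	if (!task_sleep)
		return 0;

	dur = t->sum_exec_runtime - t->prev_sleep_sum_runtime;
	t->prev_sleep_sum_runtime = t->sum_exec_runtime;

	/* assumed weight: move the average by 1/8 of the difference */
	diff = (int64_t)dur - (int64_t)t->dur_avg;
	t->dur_avg = (uint64_t)((int64_t)t->dur_avg + diff / 8);

	return dur;
}
```

Running p1 for 1 ms, switching it out without sleeping, running it for another 0.5 ms and then letting it block yields a single 1.5 ms section in this model.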

Suggested-by: Tim Chen <tim.c.chen@intel.com>
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Alexandre Frade <kernel@xanmod.org>
yu-chen-surf authored and xanmod committed Apr 26, 2023
1 parent 5c47604 commit 4d4586a
Showing 4 changed files with 19 additions and 0 deletions.
3 changes: 3 additions & 0 deletions include/linux/sched.h
@@ -557,6 +557,9 @@ struct sched_entity {
u64 prev_sum_exec_runtime;

u64 nr_migrations;
u64 prev_sleep_sum_runtime;
/* average duration of a task */
u64 dur_avg;

#ifdef CONFIG_FAIR_GROUP_SCHED
int depth;
2 changes: 2 additions & 0 deletions kernel/sched/core.c
@@ -4442,6 +4442,8 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
p->se.prev_sum_exec_runtime = 0;
p->se.nr_migrations = 0;
p->se.vruntime = 0;
p->se.dur_avg = 0;
p->se.prev_sleep_sum_runtime = 0;
INIT_LIST_HEAD(&p->se.group_node);

#ifdef CONFIG_FAIR_GROUP_SCHED
1 change: 1 addition & 0 deletions kernel/sched/debug.c
@@ -1024,6 +1024,7 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
__PS("nr_involuntary_switches", p->nivcsw);

P(se.load.weight);
P(se.dur_avg);
#ifdef CONFIG_SMP
P(se.avg.load_sum);
P(se.avg.runnable_sum);
13 changes: 13 additions & 0 deletions kernel/sched/fair.c
@@ -6318,6 +6318,18 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)

static void set_next_buddy(struct sched_entity *se);

static inline void dur_avg_update(struct task_struct *p, bool task_sleep)
{
	u64 dur;

	if (!task_sleep)
		return;

	dur = p->se.sum_exec_runtime - p->se.prev_sleep_sum_runtime;
	p->se.prev_sleep_sum_runtime = p->se.sum_exec_runtime;
	update_avg(&p->se.dur_avg, dur);
}

/*
* The dequeue_task method is called before nr_running is
* decreased. We remove the task from the rbtree and
@@ -6390,6 +6402,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)

dequeue_throttle:
util_est_update(&rq->cfs, p, task_sleep);
dur_avg_update(p, task_sleep);
hrtick_update(rq);
}
