rusty: Rework deadline as a signed sum #309

Merged
merged 2 commits into from
Jul 25, 2024

Commits on Jul 25, 2024

  1. rusty: Rework deadline as a signed sum

    Currently, a task's deadline is computed as its vtime + a scaled function of
    its average runtime (with its deadline being scaled down if it's more
    interactive). This makes sense intuitively, as we do want an interactive task
    to have an earlier deadline, but it also has some flaws.
    
    For one thing, we're currently ignoring duty cycle when determining a task's
    deadline. This has a few implications. Firstly, because we reward tasks with
    higher waker and blocked frequencies due to considering them to be part of a
    work chain, we implicitly penalize tasks that rarely ever use the CPU because
    those frequencies are low. While those tasks are likely not part of a work
    chain, they also should get an interactivity boost just by pure virtue of not
    using the CPU very often. This should in theory be addressed by vtime, but
    because we cap the amount of vtime that a task can accumulate to one slice, it
    may not be adequately reflected after a task runs for the first time.
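
A minimal sketch of why the one-slice cap hides long idle periods. It assumes the cap is applied as a clamp against a global vtime when a task wakes; `SLICE_NS`, `clamp_vtime`, and the numbers are hypothetical and not taken from the rusty sources.

```python
SLICE_NS = 20_000_000  # hypothetical slice length (20 ms)


def clamp_vtime(task_vtime: int, global_vtime: int, slice_ns: int = SLICE_NS) -> int:
    """Limit how far behind the global vtime a waking task may sit.

    Assumption: a task may bank at most one slice of vtime credit.
    """
    return max(task_vtime, global_vtime - slice_ns)


now = 10_000_000_000
# A task that has been idle for 10 s and one that has been idle for 30 ms
# both end up with the same vtime after the clamp, so vtime alone cannot
# tell the rarely-running task apart.
long_sleeper = clamp_vtime(0, now)
short_sleeper = clamp_vtime(now - 30_000_000, now)
```

Under these assumptions both tasks resume at `now - SLICE_NS`, which is why duty cycle has to be considered separately.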
    
    Another problem is that we minimize a task's deadline if it's interactive,
    but we don't really penalize a task that's a super CPU hog by increasing
    its deadline. We do so slightly by applying a higher niceness, which gives
    it a lower weight and thus a higher deadline, but the effect is somewhat
    minimal considering that we're using niceness, and that the best an
    interactive task can do is minimize its deadline to near zero relative to
    its vtime.
    
    What we really want to do is "negatively" scale an interactive task's deadline
    with the same magnitude as we "positively" scale a CPU-hogging task's deadline.
    To do this, we make two major changes to how we compute deadline:
    
    1. Instead of using niceness, we now use our own straightforward scaling
       factor. This was chosen arbitrarily to be a scaling by 1000, but we can
       and should improve this in the future.
    
    2. We now create a _signed_ linear latency priority factor as a sum of the
       following three inputs:
       - Work-chain factor (log_2 of the product of blocked freq and waker freq)
       - Inverse duty cycle factor (log_2 of the inverse of a task's duty cycle;
         a higher duty cycle means a lower factor)
       - Average runtime factor (a higher average runtime means a higher factor)
    
    We then compute the latency priority as:
    
    	lat_prio := average runtime factor - (work-chain factor + inverse duty cycle factor)
    
    This gives us a signed value that can be negative. From its absolute value
    we compute a non-negative weight, which we use to scale slice_ns. If
    lat_prio is negative, a task's deadline is its vtime MINUS its scaled
    slice_ns; if it's positive, the deadline is the task's vtime PLUS its
    scaled slice_ns.
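
The computation described above can be sketched roughly as follows. This is an illustration only: the units, the `log2_factor` helper, the `1 + |lat_prio|` weight mapping, and the handling of `lat_prio == 0` are all assumptions, and the commit's actual scaling factor is not modeled.

```python
import math

SLICE_NS = 20_000_000  # hypothetical slice length (20 ms)


def log2_factor(x: float) -> int:
    """Integer log2 of x, floored at 0 so small inputs do not go negative."""
    return int(math.log2(x)) if x > 1 else 0


def lat_prio(blocked_freq: float, waker_freq: float,
             duty_cycle: float, avg_runtime_us: float) -> int:
    """Signed latency priority: negative is interactive, positive is a CPU hog."""
    work_chain = log2_factor(blocked_freq * waker_freq)
    inv_duty = log2_factor(1.0 / max(duty_cycle, 1e-6))  # duty_cycle in (0, 1]
    avg_run = log2_factor(avg_runtime_us)
    return avg_run - (work_chain + inv_duty)


def deadline(vtime: int, prio: int, slice_ns: int = SLICE_NS) -> int:
    """Move the deadline before or after vtime depending on the sign of prio."""
    weight = 1 + abs(prio)  # placeholder mapping, not the commit's weight table
    offset = slice_ns * weight
    # Assumption: a prio of exactly 0 is treated like the positive case.
    return vtime - offset if prio < 0 else vtime + offset


interactive = lat_prio(blocked_freq=64, waker_freq=64,
                       duty_cycle=0.03125, avg_runtime_us=128)
cpu_hog = lat_prio(blocked_freq=1, waker_freq=1,
                   duty_cycle=1.0, avg_runtime_us=16384)
```

With these made-up inputs the interactive task gets a negative `lat_prio` and a deadline earlier than its vtime, while the CPU hog gets a positive `lat_prio` and a deadline later than its vtime.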
    
    This ends up working well because you get a higher weight both for highly
    interactive tasks, and highly CPU-hogging / non-interactive tasks, which lets
    you scale a task's deadline "more negatively" for interactive tasks, and "more
    positively" for the CPU hogs.
    
    With this change, we get a significant improvement in FPS. On a 7950X, if I run
    the following workload:
    
    	$ stress-ng -c $((8 * $(nproc)))
    
    1. I get 60 FPS when playing Stellaris (while time is progressing at max
       speed), whereas EEVDF gets 6-7 FPS.
    
    2. I get ~15-40 FPS while playing Civ6, whereas EEVDF seems to get < 1 FPS. The
       Civ6 benchmark doesn't even start after over 4 minutes in the initial frame
       with EEVDF, but gets us 13s / turn with rusty.
    
    3. It seems that EEVDF has improved with Terraria in v6.9. It was able to
       maintain ~30-55 FPS, as opposed to the ~5-10 FPS we've seen in the past.
       rusty is still able to maintain a solid 60-62 FPS consistently with no
       problem, however.
    Byte-Lab committed Jul 25, 2024
    933ea9b
  2. rusty: Transfer latency priority between CPU-intensive and interactive tasks
    
    In some scenarios, a CPU-intensive task may be on the critical path for
    interactive workloads. For example, you may have a game with CPU-intensive
    tasks that are crunching the logic for the game, and that's required for the
    game to proceed without being choppy.
    
    To support such workloads, this change adds logic to allow a non-interactive
    task to inherit the lower (i.e. stronger) latency priority of another task if
    it wakes or is woken by that task.
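
One way to picture the inheritance is the sketch below. The `Task` type and `transfer_lat_prio` function are hypothetical, and the sketch assumes the transfer happens at wakeup time and simply keeps the lower of the two values; the actual change may bound or decay the inherited priority differently.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    lat_prio: int  # signed: lower (more negative) means stronger latency priority


def transfer_lat_prio(waker: Task, wakee: Task) -> None:
    """On a wakeup, let both ends of the wake share the stronger (lower) priority."""
    strongest = min(waker.lat_prio, wakee.lat_prio)
    waker.lat_prio = strongest
    wakee.lat_prio = strongest


# A CPU-intensive game-logic thread wakes an interactive render thread.
logic = Task("logic", lat_prio=12)
render = Task("render", lat_prio=-8)
transfer_lat_prio(waker=logic, wakee=render)
```

After the wakeup the logic thread carries the render thread's latency priority, so it is no longer pushed behind other CPU hogs while the game depends on it.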
    
    Signed-off-by: David Vernet <void@manifault.com>
    Byte-Lab committed Jul 25, 2024
    c1ad602