Fix: prevent meaningless scheduling iterations of simpleschedule with penalty. #164

Merged
merged 1 commit into from
Oct 1, 2022

Contributor

@jheo4 jheo4 commented Sep 15, 2022

Description

TL;DR - Too simple simplescheduler

The current simplescheduler reschedules each kernel as fast as the CPU allows, with no delay between attempts. This is fine when a source kernel produces its stream at a very high rate. However, when the upstream kernel runs at a lower frequency, the scheduling loop itself dominates CPU core usage even though the scheduled kernel's run() is not executed.

This bug is demonstrated by two cases based on poc.cpp. The modified poc.cpp demonstration code is here.

  • Case 1: I made kernel A sleep for 1 second and measured the energy consumption of the CPUs. The result is below.
 Performance counter stats for 'system wide':
            346.47 Joules power/energy-cores/                                         
      10.008960046 seconds time elapsed
  • Case 2: I made kernel C sleep for 1 second and measured the energy consumption of the CPUs. The result is below.
 Performance counter stats for 'system wide':
            132.33 Joules power/energy-cores/                                         
      10.008950988 seconds time elapsed

Although the two cases do almost the same work, their CPU usage was very different, which in turn caused the difference in energy consumption.

To prevent this inefficiency, I check whether the scheduled kernel's run() was actually executed and, if not, penalize the next scheduling attempt with a 5 ms sleep. It is quite hacky and simple, but I think that is fine for the "simple" scheduler.
For future schedulers, a fancier backoff mechanism could be applied, such as the exponential backoff adopted by the networking community.
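The idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this commit: `MockKernel`, `BackoffPolicy`, `drive_kernel`, `fixed_penalty`, and `exponential_penalty` are hypothetical names, the sketch assumes one scheduling loop per kernel, and the 64 ms cap on the exponential variant is an arbitrary choice.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical stand-in for a kernel; NOT RaftLib's actual kernel interface.
struct MockKernel
{
    std::function<bool()> has_input; // true when run() can make progress
    std::function<void()> run;       // the kernel's work
};

// Hypothetical policy: how long to sleep after `idle_attempts` consecutive
// scheduling attempts in which run() was not executed.
using BackoffPolicy = std::function<std::chrono::milliseconds(std::size_t)>;

// Fixed penalty as described in this PR: always 5 ms.
inline std::chrono::milliseconds fixed_penalty(std::size_t /*idle_attempts*/)
{
    return std::chrono::milliseconds(5);
}

// Capped exponential backoff: 1, 2, 4, ... ms, never more than 64 ms.
// Both the base and the cap are assumptions made for this sketch.
inline std::chrono::milliseconds exponential_penalty(std::size_t idle_attempts)
{
    assert(idle_attempts > 0);
    const std::size_t max_shift = 6;
    const std::size_t shift =
        (idle_attempts - 1 < max_shift) ? idle_attempts - 1 : max_shift;
    return std::chrono::milliseconds(1LL << shift);
}

// Makes `attempts` scheduling attempts for one kernel and returns how many
// of them actually executed run(). An attempt that does not execute run()
// is penalized with a sleep instead of retrying immediately.
inline std::size_t drive_kernel(MockKernel& kernel,
                                std::size_t attempts,
                                const BackoffPolicy& penalty)
{
    std::size_t executed = 0;
    std::size_t idle_attempts = 0;
    for (std::size_t i = 0; i < attempts; ++i)
    {
        if (kernel.has_input())
        {
            kernel.run();
            ++executed;
            idle_attempts = 0; // progress was made, so reset the backoff
        }
        else
        {
            ++idle_attempts;
            std::this_thread::sleep_for(penalty(idle_attempts));
        }
    }
    return executed;
}
```

Calling `drive_kernel(kernel, n, fixed_penalty)` mirrors the 5 ms penalty described above, while passing `exponential_penalty` instead lets a long-idle kernel back off further at the cost of higher wake-up latency.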

With my small modification, I re-tested the above cases, and the results are as follows:

  • Case 1: kernel A sleeps for 1 second, with my commit applied.
 Performance counter stats for 'system wide':
            138.02 Joules power/energy-cores/                                         
      10.012098173 seconds time elapsed
  • Case 2: kernel C sleeps for 1 second, with my commit applied.
 Performance counter stats for 'system wide':
            138.48 Joules power/energy-cores/   
      10.013501176 seconds time elapsed
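For a rough sense of scale, the four perf readings quoted in this description can be converted to average power and compared. The helpers below are hypothetical and only do the arithmetic; they are not part of the commit or of perf.

```cpp
#include <cassert>

// Average power in watts from an energy reading (joules) over an interval (seconds).
inline double average_power_watts(double joules, double seconds)
{
    assert(seconds > 0.0);
    return joules / seconds;
}

// Relative change from `before` to `after`; negative values mean a reduction.
inline double relative_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before;
}
```

With the numbers above, `average_power_watts(346.47, 10.008960046)` is roughly 34.6 W for Case 1 before the change, against roughly 13.8 W afterwards (138.02 J over 10.012098173 s). `relative_change(346.47, 138.02)` is about -0.60, a reduction of around 60%, while Case 2 moves from 132.33 J to 138.48 J, an increase of under 5%.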

Fixes #163

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Runs locally on Windows
  • Runs locally on Linux
  • Runs locally on OS X

Details

I tested my commit on Ubuntu 20.04 and 22.04.

Please list test cases created to ensure your feature or addition
works.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

@jonathan-beard
Member

Looks like there's an issue with the Windows test case for 'reduction_test', will diagnose. The Mac OS version fails on known issues so I'm not gonna worry about that one. Once the Win test case passes I'll merge.

@jonathan-beard
Member

OK, I need to fix the test case script; it's not a code issue. Will merge.

@jonathan-beard jonathan-beard merged commit 41cefc9 into RaftLib:master Oct 1, 2022
@jheo4 jheo4 deleted the fix/simple_schedule branch October 5, 2022 01:17
Successfully merging this pull request may close these issues.

CPU Consumption in a dormant system