Initial energy model RFC by bjackman · Pull Request #246 · ARM-software/lisa

bjackman · 2016-11-30T18:45:20Z

This class provides a model of systems with:

- CPU capacity at different frequencies.
- Power usage at different frequencies.
- Power usage in different idle states.

The model is aware of topologically shared resources (clusters),
topological dependencies for idle states (power domains) and frequency
domains.

The intended use case for this model is for testing energy-aware
scheduling in a platform-agnostic way.
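
As a rough illustration of the kind of data described above, the following is a minimal sketch and not the code from this PR. Only the `ActiveState(capacity=..., power=...)` shape and the 307200 entry appear in the diff under review; the other frequencies, the idle state names, all remaining numbers and the helper `most_efficient_freq` are made up for the example.

```python
from collections import namedtuple, OrderedDict

# Field names follow the ActiveState(capacity=..., power=...) call seen in the diff.
ActiveState = namedtuple("ActiveState", ["capacity", "power"])

# Frequency (kHz) -> ActiveState. Only the first entry is taken from the diff;
# the other two entries are invented for illustration.
cpu_active_states = OrderedDict([
    (307200, ActiveState(capacity=149, power=93)),
    (1000000, ActiveState(capacity=512, power=400)),
    (2000000, ActiveState(capacity=1024, power=1500)),
])

# Idle state name -> power. Names and values are hypothetical.
cpu_idle_states = OrderedDict([
    ("WFI", 5),
    ("cpu-sleep", 0),
])


def most_efficient_freq(active_states):
    """Return the frequency with the highest capacity-per-power ratio."""
    return max(
        active_states,
        key=lambda freq: active_states[freq].capacity / active_states[freq].power,
    )
```

Keying the active states by frequency in an `OrderedDict` keeps the OPPs in ascending order, so the lowest and highest operating points are simply the first and last entries.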

This is an RFC of the EnergyModel class. This doesn't include the complex stuff, just the expression of the data. A later version will introduce code to estimate energy usage for a workload, both by imagining ideal kernel behaviour and by examining TRAPpy traces. This will be used to compare traces against "ideal" behaviour for automated tests of Energy Aware Scheduling.

The first commit (energy_model: Add EnergyModel class) contains the interesting bit, the rest of the commits demonstrate some intended uses.

This leaves lots of room for refactoring around LISA; for example, the TestEnv::platform attribute becomes redundant. However, I'm leaving that until the design is finalised here, as the way that refactoring is done will probably need quite a lot of discussion.

derkling

Despite missing some support material (https://goo.gl/2wiF5b), I've tried to go through all the patches.
Overall a really commendable effort; on the data structures side I think we are almost there.

I would just suggest adding a little documentation and perhaps a few dummy wrapper functions to make the code more uniform on the EM initialization side.

I have some doubts about the way we represent Pixel-like systems and how the EM methods affect workload generation (e.g. number of tasks based on the number of little CPUs).

Regarding tests, well done. Perhaps we should try harder to consolidate hard-coded values, but I'm not sure the effort is worth it. In the end the test is written once, so having the same hard-coded value in multiple places is perhaps not a big issue. A middle-ground solution could be to use global constants.

derkling · 2016-12-07T17:27:21Z

libs/utils/energy_model.py

+# limitations under the License.
+#
+
+import logging


That's not used by this patch, can we remove it?

derkling · 2016-12-07T17:28:08Z

libs/utils/energy_model.py

+#
+
+import logging
+from collections import namedtuple, OrderedDict


OrderedDict also not used.

derkling · 2016-12-07T17:32:17Z

libs/utils/energy_model.py

+    # TODO check that this is the highest cap available
+    capacity_scale = 1024
+
+    def __init__(self, levels=None):


What is levels? We should add at least minimal documentation to describe the parameters.

derkling · 2016-12-07T17:34:07Z

libs/utils/energy_model.py

+    in various configurations.
+
+    The topology is stored in "levels", currently hard-coded to be "cpu" and
+    "cluster". Each level is a list of EnergyModelNode objects. An EnergyModel


Do we want to use the same naming used by TRAPpy? I mean: CPUs and Clusters, while in kernel space usually we talk about CORE and DIE...

Yeah, that's a good point. I think probably they should have no name at all and just be numerically indexed. We can assume that level 0 is the logical CPU. I'll play around with that and see if it works.

I'm all for using a logical index, perhaps with the option to pass in a map that defines labels to be used for logging and/or reporting.
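
The idea discussed above (numerically indexed levels plus an optional label map) could look roughly like the sketch below. The class name `Topology`, its constructor arguments and the `label`/`describe` methods are all hypothetical and are not part of the PR; level 0 is assumed to hold the logical CPUs.

```python
class Topology:
    """Hypothetical container: levels are indexed by number, labels are optional."""

    def __init__(self, levels, labels=None):
        # levels[0] is assumed to be the logical CPUs; higher levels group them.
        self.levels = levels
        # Optional map of level index -> human-readable name for logging/reporting.
        self.labels = labels or {}

    def label(self, index):
        """Return the label for a level, falling back to a generic name."""
        return self.labels.get(index, "level{}".format(index))

    def describe(self):
        """Return one line of text per level, suitable for a log message."""
        return [
            "{}: {}".format(self.label(i), groups)
            for i, groups in enumerate(self.levels)
        ]


topo = Topology(
    levels=[[[0], [1], [2], [3]], [[0, 1], [2, 3]]],
    labels={0: "cpu", 1: "cluster"},
)
```

With this shape the model itself never depends on names such as "cpu" or "cluster"; a kernel-oriented caller could pass `{0: "CORE", 1: "DIE"}` instead without touching anything else.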

derkling · 2016-12-07T17:36:14Z

libs/utils/env.py


 import devlib

+import platforms.juno_energy


What about:

from platforms.juno_energy import juno_energy

which also simplifies the following assignment?

derkling · 2016-12-07T19:00:35Z

tests/eas/acceptance.py

    }

+    # Set to true to run a test only on heterogeneous systems
+    skip_on_smp = False


Since, for the time being, all the tests we have are only for !smp systems, and most of the tests we will develop will also be for !smp systems, wouldn't it be easier to set this default to True?
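
To make the trade-off concrete, here is a minimal sketch of how a class-level `skip_on_smp` flag might gate a test class, with the default set to True as suggested. This is not how the PR implements it: the base class `EasTest`, the `cpu_max_capacities` attribute and the `is_smp`/`should_skip` helpers are hypothetical stand-ins for whatever the real tests read from the target.

```python
import unittest


class EasTest(unittest.TestCase):
    # Hypothetical default: skip on SMP unless a test opts out.
    skip_on_smp = True
    # Stand-in for per-CPU maximum capacities read from the target.
    cpu_max_capacities = [1024, 1024]

    @classmethod
    def is_smp(cls):
        """Treat the system as SMP when every CPU has the same max capacity."""
        return len(set(cls.cpu_max_capacities)) == 1

    @classmethod
    def should_skip(cls):
        return cls.skip_on_smp and cls.is_smp()

    @classmethod
    def setUpClass(cls):
        if cls.should_skip():
            raise unittest.SkipTest("test requires a heterogeneous system")


class HeterogeneousOnly(EasTest):
    # Invented capacities for a two-cluster heterogeneous system.
    cpu_max_capacities = [447, 447, 1024, 1024]

    def test_runs(self):
        self.assertFalse(self.is_smp())
```

With a True default, a test that is also meaningful on SMP systems would set `skip_on_smp = False` explicitly, which is the rarer case according to the comment above.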

derkling · 2016-12-07T19:02:56Z

tests/eas/acceptance.py


-        sched_assert = self.get_multi_assert(experiment)
+        sched_assert = SchedMultiAssert(
+            experiment.out_dir, self.te.topology, tasks)


This seems to belong to a different patch, doesn't it?

Don't think this should be here at all.

derkling · 2016-12-07T19:07:51Z

libs/utils/platforms/pixel_energy.py

+])
+
+gold_cpu_active_states = OrderedDict([
+    (307200,     ActiveState(capacity=149, power=93)),


Here it is: we have the same lowest-OPP capacities but different max-OPP capacities for SILVER and GOLD.
Isn't EnergyModel::littlest_cpus() returning all the CPUs, while EnergyModel::biggest_cpus() returns only the GOLD ones? Does this affect the workloads generated by some tests?
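
The concern can be reproduced with a small sketch. How the PR actually implements `littlest_cpus()` and `biggest_cpus()` is not shown here, so the functions below are hypothetical; they only illustrate what happens if "littlest" is decided by the lowest capacity a CPU can run at. Apart from the 307200 GOLD entry from the diff, every number is invented.

```python
from collections import namedtuple, OrderedDict

ActiveState = namedtuple("ActiveState", ["capacity", "power"])

# Same lowest-OPP capacity, different max-OPP capacity (values mostly invented).
silver = OrderedDict([
    (307200, ActiveState(capacity=149, power=90)),
    (1593600, ActiveState(capacity=763, power=560)),
])
gold = OrderedDict([
    (307200, ActiveState(capacity=149, power=93)),
    (2150400, ActiveState(capacity=1024, power=1600)),
])
cpu_states = {0: silver, 1: silver, 2: gold, 3: gold}


def min_capacity(states):
    return min(s.capacity for s in states.values())


def max_capacity(states):
    return max(s.capacity for s in states.values())


def littlest_by_min(cpu_states):
    """CPUs whose lowest capacity equals the system-wide lowest capacity."""
    lowest = min(min_capacity(s) for s in cpu_states.values())
    return sorted(c for c, s in cpu_states.items() if min_capacity(s) == lowest)


def littlest_by_max(cpu_states):
    """CPUs whose max capacity equals the smallest max capacity in the system."""
    lowest = min(max_capacity(s) for s in cpu_states.values())
    return sorted(c for c, s in cpu_states.items() if max_capacity(s) == lowest)


def biggest_by_max(cpu_states):
    """CPUs whose max capacity equals the largest max capacity in the system."""
    highest = max(max_capacity(s) for s in cpu_states.values())
    return sorted(c for c, s in cpu_states.items() if max_capacity(s) == highest)
```

Under these assumptions `littlest_by_min` returns every CPU while `biggest_by_max` returns only the GOLD ones, so a test that sizes its workload from the number of little CPUs would see four of them. Comparing max capacities on both sides, as in `littlest_by_max`, gives disjoint sets.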

derkling · 2016-12-07T19:09:13Z

tests/eas/acceptance.py

    @experiment_test
    def test_big_cpus_fully_loaded(self, experiment, tasks):
        """Offload Migration and Idle Pull: Big cpus are fully loaded as long as there are tasks left to run in the system"""
-        num_big_cpus = len(self.target.bl.bigs)


Maybe this patch can be squashed with these previous two:

Remove bl reference in first CPU test

Remove big.LITTLE assumption from SmallTaskPacking

Yeah, sounds good to just have one patch that updates the acceptance tests to remove big.LITTLE assumptions.

derkling · 2016-12-07T19:13:10Z

libs/utils/energy_model.py

+        if self.parent:
+            self.parent.add_cpus(self.cpus)
+
+    def __repr__(self):


A similar method would be appreciated for both the EnergyModelNode and EnergyModel classes.

EnergyModelNode inherits a __repr__ from namedtuple. Good point on EnergyModel: I do have one in a subsequent patch, but I'll need to think about how it should look. It's a little bit awkward because the class holds quite a lot of data, so the same style as the namedtuple __repr__ might be totally unreadable.
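
A small sketch of the difference being described: a namedtuple gets a readable `__repr__` for free, while a container holding many nodes probably wants a summary instead of a full dump. `Node` and `Model` are hypothetical stand-ins for EnergyModelNode and EnergyModel, and the summary format is only one possible choice.

```python
from collections import namedtuple

# Hypothetical stand-in for EnergyModelNode; the real fields are not shown here.
Node = namedtuple("Node", ["name", "cpus"])


class Model:
    """Hypothetical stand-in for EnergyModel holding a list of levels."""

    def __init__(self, levels):
        self.levels = levels

    def __repr__(self):
        # Summarise rather than dumping every node, which would be unreadable.
        return "Model(nodes_per_level={})".format(
            [len(level) for level in self.levels]
        )


node = Node(name="cpu0", cpus=[0])
model = Model([
    [Node("cpu0", [0]), Node("cpu1", [1])],
    [Node("cluster0", [0, 1])],
])
```

Here `repr(node)` is the automatic namedtuple form, `Node(name='cpu0', cpus=[0])`, whereas `repr(model)` stays on one short line however many nodes the model contains.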

bjackman · 2016-12-08T14:32:21Z

OK thanks a lot for the review, I'm going to do some more work on this, especially on the testing, and do a v2 PR.

bjackman · 2017-01-03T16:43:09Z

superseded by #263

Brendan Jackman added 10 commits November 30, 2016 18:19

libs/utils/platforms: Add __init__.py

d82c7a7

platforms: Add energy model for Juno r0

9c8d162

platforms: Add energy model for HiKey

9773278

platforms: add energy model for Pixel

3c1623c

tests/eas/acceptance: Remove bl reference in first CPU test

202d589

tests/eas/acceptance: Remove big.LITTLE assumption from SmallTaskPacking

dd505f9

tests/eas/acceptance: Add skip_on_smp flag

b92d95c

tests/eas/acceptance: Only check early tasks in Offoad+IdlePull

2c7eb62

The migrators probably won't start on big CPUs.

tests/eas/acceptance: Remove big.LITTLE assumption for Offload+IdlePull

a833aa1

bjackman assigned derkling Dec 2, 2016

derkling suggested changes Dec 7, 2016

derkling added this to the 17.01 milestone Dec 13, 2016

bjackman mentioned this pull request Dec 14, 2016

Add to support tri-clusters #255

Closed

bjackman closed this Jan 3, 2017
