# Modeling and Simulation of Power Consumption on Heterogeneous CPU Cores under Varying Workloads and Operating Conditions

Atharv Arun Desai

Department of CSA

Indian Institute of Science (IISc)

Bangalore, India

atharvarun@iisc.ac.in

Boul Chandra Garai

Department of CSA

Indian Institute of Science (IISc)

Bangalore, India

chandraboul@iisc.ac.in

Himanshu Srivastava

Department of CSA

Indian Institute of Science (IISc)

Bangalore, India

himanshusriv@iisc.ac.in

Vaisakh P S

Department of CSA

Indian Institute of Science (IISc)

Bangalore, India

vaisakhp@iisc.ac.in

Abstract—This document is the phase-1 report for the E0-240 Modeling and Simulation course project. The main objective of the project is to apply concepts learned in the E0-240 course to the modeling and simulation of a real-world system, in this case a multi-core, heterogeneous CPU. The project focuses on developing a power consumption model for a simulated full system [1] under varying workloads. The model will take into account CPU operating conditions such as dynamic frequency scaling and heterogeneous cores [2].

*Index Terms*–**Modeling, simulation, heterogeneous CPU cores, power consumption**

## I. BACKGROUND

Power consumption is one of the key performance indices of any embedded or mobile device that operates on a power budget, as it directly affects the user experience and usability of such devices. The need for accurate power models in simulation environments has grown accordingly, so that designers and manufacturers can measure the impact of a new functionality or optimization while it is still being prototyped. Insights from such models allow stakeholders in embedded product development to evaluate designs without waiting for hardware fabrication and rollout, thereby saving resources and investment.

One of the main motivations for this project is the top-down power modeling approach [3], which uses Performance Monitoring Counters (PMCs) on actual hardware together with overall power consumption data to develop an empirical power model for the gem5 simulator [4]. The average error of this approach is reported to be less than 6%. We will explore enhancements to this approach by factoring in additional CPU performance metrics.
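As a rough illustration of the top-down idea, the sketch below fits a linear model that maps PMC event rates to measured power using ordinary least squares, and then reports the mean absolute percentage error on held-out samples. It is not the model from [3]: the three event-rate features, the synthetic data, the coefficient values, and all function names are assumptions made purely for illustration.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_power_model(event_rates, power_w):
    """Ordinary least squares via the normal equations.

    Returns [intercept, w1, w2, ...], one weight per event-rate feature.
    """
    rows = [[1.0] + list(r) for r in event_rates]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * p for row, p in zip(rows, power_w)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_power(coeffs, rates):
    """Evaluate intercept + sum(weight_i * rate_i) for one sample."""
    return coeffs[0] + sum(w * r for w, r in zip(coeffs[1:], rates))


def mean_abs_pct_error(measured, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / len(measured)


# Synthetic stand-in for (PMC event rates, measured power) pairs.
# TRUE_COEFFS is a hypothetical ground truth, not measured data.
rng = random.Random(0)
TRUE_COEFFS = [0.5, 1.2, 0.3, 0.8]
rates = [[rng.uniform(0.1, 2.0) for _ in range(3)] for _ in range(200)]
power = [predict_power(TRUE_COEFFS, r) + rng.gauss(0.0, 0.02) for r in rates]

# Fit on the first 150 samples, validate on the remaining 50.
coeffs = fit_power_model(rates[:150], power[:150])
predicted = [predict_power(coeffs, r) for r in rates[150:]]
mape = mean_abs_pct_error(power[150:], predicted)
print(f"fitted coefficients: {[round(c, 3) for c in coeffs]}")
print(f"held-out mean absolute percentage error: {mape:.2f}%")
```

The normal equations are solved by hand here only to keep the example dependency-free; with real PMC data, where features are often strongly correlated, a library least-squares routine would likely be the more robust choice.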

## II. METHODOLOGY

Following previous work [3], a top-down power model will be developed and integrated into gem5. The ODROID-XU4 [5] big.LITTLE development board is chosen as the first target for experimentation, data gathering, and validation. An overview of this hardware is shown in Fig. 1. The ODROID-XU4 offers a minimalistic development platform built around a Samsung Exynos 5422 octa-core SoC with four ARM Cortex<sup>TM</sup>-A15 cores at 2 GHz and four Cortex<sup>TM</sup>-A7 cores at 1.3 GHz, together with 2 GB of LPDDR3 RAM operating at 933 MHz, stacked with the CPU package. Each Cortex-A15 and Cortex-A7 core has a 32 KB instruction cache and a 32 KB data cache. For the L2 cache, the Cortex-A15 and Cortex-A7 clusters use 2 MB and 512 KB, respectively.



Fig. 1. ODROID-XU4 board overview

### A. Modeling and Development Strategy

A simulation of the ODROID-XU4/Exynos 5422 that closely resembles its CPU operating parameters will be integrated into gem5. A SmartPower3 [6] power monitor unit will be used alongside the ODROID-XU4, as shown in Fig. 2, to measure overall power consumption on the hardware, while most of the peripheral modules on the board will be kept disabled to reduce any variation or impact on the measured data. In addition, the perf tool [7] will be used to gather PMC data points. A summary of the data points being gathered for this modeling exercise is listed in Table I.

Fig. 2. Experiment setup for power data gathering from ODROID-XU4 [5] hardware

TABLE I
POWER AND PERFORMANCE FEATURES GATHERED FROM THE EXPERIMENT SETUP

| Feature type         | Source          | Details                                                                              |
|----------------------|-----------------|--------------------------------------------------------------------------------------|
| CPU Clock Cycles     | perf [7]        | CPU cycles, bus cycles, instructions, CPU frequency, CPU idle state statistics       |
| Instruction Branches | perf [7]        | Branch instruction and speculative operation statistics                              |
| Caches               | perf [7]        | Data/instruction cache references and misses at the L1 and last-level cache levels   |
| Board-Level Power    | SmartPower3 [6] | Current and power drawn from the power supply                                        |
| Misc. Performance    | perf [7]        | CPU migrations, context switches, virtual memory                                     |
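To make the PMC side of Table I more concrete, the following sketch parses interval-mode CSV output of the kind produced by `perf stat` with the `-I` and `-x,` options and derives instructions per cycle for each interval. The field order assumed here (timestamp, counter value, unit, event name, then further fields) and the sample text are illustrative assumptions, not output captured from the board; the layout should be checked against the perf version installed on the target. The function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in the assumed layout:
# timestamp,value,unit,event,run_time,pct_running,metric,metric_unit
SAMPLE_PERF_OUTPUT = """\
1.000104,1500000000,,cycles,1000104000,100.00,,
1.000104,900000000,,instructions,1000104000,100.00,0.60,insn per cycle
1.000104,<not counted>,,cache-misses,0,0.00,,
2.000251,1400000000,,cycles,1000147000,100.00,,
2.000251,1120000000,,instructions,1000147000,100.00,0.80,insn per cycle
2.000251,52000,,cache-misses,1000147000,100.00,,
"""


def parse_perf_intervals(text):
    """Return {interval_end_s: {event_name: count}} from perf stat CSV text."""
    samples = defaultdict(dict)
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 4:
            continue  # blank or malformed line
        try:
            timestamp = float(row[0])
            count = float(row[1])
        except ValueError:
            continue  # comment lines, "<not counted>", "<not supported>"
        samples[timestamp][row[3]] = count
    return dict(samples)


def instructions_per_cycle(events):
    """Derive IPC for one interval, or None if a counter is missing or zero."""
    cycles = events.get("cycles")
    instructions = events.get("instructions")
    if not cycles or instructions is None:
        return None
    return instructions / cycles


intervals = parse_perf_intervals(SAMPLE_PERF_OUTPUT)
for t in sorted(intervals):
    print(t, intervals[t], instructions_per_cycle(intervals[t]))
```

Counters that perf could not collect in an interval are skipped rather than recorded as zero, so that a missing measurement is not mistaken for an idle one when the data is later used for model fitting.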

A set of preliminary workloads that load the CPU and memory will be executed on the ODROID-XU4 device while power consumption and PMC data are recorded simultaneously. The workloads being considered are listed in Table II. As of now, a total of 5 workloads have been employed. Furthermore, integrating SPEC CPU2017 will allow the inclusion of up to 43 benchmarks, where feasible, to increase the quantity of data.

TABLE II
LIST OF WORKLOADS BEING USED FOR DATA GATHERING AND VALIDATION

| Type             | Workloads                               | Status  |
|------------------|-----------------------------------------|---------|
| Stress Test      | stress command [8]                      | √       |
| Video Encoding   | ffmpeg encode [8]                       | √       |
| File Compression | gzip, bzip2, xz on complex datasets [9] | √       |
| Benchmark Suite  | SPEC CPU2017 benchmarks [10]            | Planned |
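Because power and PMC data are recorded simultaneously while each workload runs, the two streams have to be aligned in time before a model can be fitted. The sketch below averages power samples over each PMC sampling window. It assumes that both streams share a common time base and that the power log has already been reduced to `(timestamp_s, watts)` pairs; the actual SmartPower3 log format is not modeled here, and the function name and data are hypothetical.

```python
from bisect import bisect_left


def mean_power_per_window(power_samples, window_ends, window_s):
    """Average power over each half-open window [end - window_s, end).

    power_samples: list of (timestamp_s, watts), sorted by timestamp.
    window_ends:   end timestamps of the PMC sampling intervals.
    Returns {window_end: mean_watts or None if no sample fell in the window}.
    """
    times = [t for t, _ in power_samples]
    result = {}
    for end in window_ends:
        lo = bisect_left(times, end - window_s)  # first sample inside the window
        hi = bisect_left(times, end)             # first sample at or after the end
        chunk = [w for _, w in power_samples[lo:hi]]
        result[end] = sum(chunk) / len(chunk) if chunk else None
    return result


# Hypothetical power log: one sample every 0.1 s, 2 W during the first
# second and 3 W during the second.
power_log = [(i / 10, 2.0 if i < 10 else 3.0) for i in range(20)]
means = mean_power_per_window(power_log, window_ends=[1.0, 2.0], window_s=1.0)
print(means)
```

Returning `None` for empty windows keeps gaps in the power log visible, so that such intervals can be dropped from the training data instead of silently contributing a zero.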

## III. PHASE-2 PROGRESS


## IV. PHASE-2 OBSERVATIONS AND RESULTS


## V. DISCUSSION ON PHASE-2 OUTCOMES

## VI. NEXT STEPS

- A. Modeling and Empirical Model Generation
- B. Power Model Integration into gem5

## REFERENCES

- [1] A. Akram and L. Sawalha, "A survey of computer architecture simulation techniques and tools," *IEEE Access*, vol. 7, pp. 78120–78145, 2019.
- [2] ARM Inc., "big.LITTLE technology: The future of mobile," white paper, 2013. [Online]. Available: https://www.arm.com/
- [3] B. K. Reddy, M. J. Walker, D. Balsamo, S. Diestelhorst, B. M. Al-Hashimi, and G. V. Merrett, "Empirical CPU power modelling and estimation in the gem5 simulator," *2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS)*, pp. 1–8, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:10100084
- [4] N. Binkert, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi, A. Basu, J. Hestness, D. R. Hower, T. Krishna, S. Sardashti, R. Sen, K. Sewell, M. Shoaib, N. Vaish, M. D. Hill, and D. A. Wood, "The gem5 simulator," *SIGARCH Comput. Archit. News*, vol. 39, no. 2, pp. 1–7, Aug. 2011. [Online]. Available: https://doi.org/10.1145/2024716.2024718
- [5] "Odroid-xu4: Big.little development board," Apr 2023. [Online]. Available: https://wiki.odroid.com/odroid-xu4/odroid-xu4
- [6] "Smartpower3: Power monitor," May 2023. [Online]. Available: https://wiki.odroid.com/accessory/power\_supply\_battery/smartpower3
- [7] "Chapter 8 perf," in Power and Performance, J. Kukunas, Ed. Boston: Morgan Kaufmann, 2015, pp. 137–165. [Online]. Available: https://www.sciencedirect.com/science/article/pii/B9780128007266000082
- [8] Arch Linux Wiki, "Stress testing," Aug. 2023. [Online]. Available: https://wiki.archlinux.org/title/Stress\_testing
- [9] P. PeaZip project TOS, "Compression benchmark: 7-zip, peazip, winrar, winzip comparison," ONLINE, Aug 2023. [Online]. Available: https://peazip.github.io/peazip-compression-benchmark.html
- [10] J. Bucek, K.-D. Lange, and J. v. Kistowski, "SPEC CPU2017: Next-generation compute benchmark," in *Companion of the 2018 ACM/SPEC International Conference on Performance Engineering*, ser. ICPE '18. New York, NY, USA: Association for Computing Machinery, 2018, pp. 41–42. [Online]. Available: https://doi.org/10.1145/3185768.3185771