ELMO is our first go at a statistical leakage simulator for the ARM M0 family: https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/mccann
This project originated from the research of David McCann (on modelling energy efficiency), originally funded by NCSC and then by EPSRC (LADA) under the supervision of Elisabeth Oswald. The modelling technique was further refined by Elisabeth Oswald in the context of the SEAL (ERC Cog) proposal. Eventually the research team was expanded by Carolyn Whitnall and Si Gao, both funded via REASSURE (H2020 funding, www.reassure.eu).
What is ELMO?
ELMO consists of two components: an emulator (our custom version of the Thumbulator) for the ARM M0 architecture and a set of leakage models. The emulator takes in Thumb assembly and 'emulates' the workings of an M0 core (in particular the 3-stage pipeline). The power models use the instruction flow generated by the emulator to produce cycle-accurate or instruction-accurate, noise-free predictions of the power consumption. In its base version, ELMO emulates the power consumption of many instructions, but only the contribution of the CPU core. There is a simple 'memory extension', which gives a very basic model of the power consumption of the memory bus.
In contrast to other power simulators, ELMO is not built on fixed assumptions about the power model of the processor. Instead, we used a statistical technique called model building, where we test 'promising' variables w.r.t. their contribution to the power consumption, and include them if our statistical analysis concludes that they are indeed significant. All models take the 'processor context' into account, i.e. any model is based on a triplet of instructions consisting of a target instruction, the instruction before the target, and the instruction after the target.
Technically speaking, ELMO is a standalone tool written in C. It should be possible to compile it on various platforms, as long as all dependent system libraries are correctly included. However, we have only tested it on Ubuntu and Mac: support for Windows is not provided. There is a pre-compiled binary in the repository, which may work on many Linux-based systems. However, we recommend re-compiling the ELMO source code to generate a new binary for your system. To compile ELMO, you need the GCC compiler collection (tested version 7.3.0 on Ubuntu) and the 'make' utility:
- Ubuntu: The easiest route is to install GCC and make together with "sudo apt install build-essential"
- Mac: Download and install “Command Line Tools for Xcode”, which is available on Apple’s developer page. After installation is completed, run “gcc -v” in a terminal to see whether you can see the correct version information.
Running your own code
To run any user-defined test case, users need to compile their implementations to binary programs (a ".bin" file). Technically, this is NOT a step in ELMO, as ELMO starts from analyzing the binary representation. ELMO is not linked to any ARM specific tools, so users should be fine to utilise whatever they want for this purpose.
A minimal working platform for compiling your code into an ARM Thumb binary would be to use the GNU ARM Embedded Toolchain (tested version: arm-none-eabi-gcc version 7.3.1 20180622, it can be downloaded from https://developer.arm.com/open-source/gnu-toolchain/gnu-rm). According to ARM's install guide:
- Ubuntu: Unpack the tarball to the install directory: "$ cd $install_dir && tar xjf gcc-arm-none-eabi--yyyymmdd-linux.tar.bz2"; then invoke ARM toolchain as "$ export PATH=$PATH:$install_dir/gcc-arm-none-eabi-/bin; $ arm-none-eabi-gcc" (an older version is available through apt-get)
- Mac: Unpack the tarball to the install directory, like this "$ cd $install_dir && tar xjf gcc-arm-none-eabi--yyyymmdd-mac.tar.bz2"; then invoke ARM toolchain as "$ export PATH=$PATH:$install_dir/gcc-arm-none-eabi-/bin; $ arm-none-eabi-gcc"
For most users, we believe it is sufficient to follow the exact workflow provided in the manual: replace the target code section with your own code and revise the Makefile so that the ARM toolchain can successfully produce the binary. Note that only the critical part of the program should be examined. The remaining non-security-related code should be left out, as running ELMO on it not only adds unnecessary workload but also increases the risk of encountering bugs or unsupported features.
The following extensions have been developed within the H2020 project REASSURE:
Models from another M0 core (NXP LPC1114)
We have rebuilt ELMO's power model with another M0 core and compared the similarities and differences between the NXP one and the ST one; this work was presented at the ARM Research Summit 2018. The resulting model coefficient file can be found in the repository. ELMO will produce traces using this model if users rename this file to "coeffs.txt" or change the COEFFSFILE macro in elmodefines.h to "coeffs_LPC.txt".
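As a rough illustration of the second option, the macro change might look like the fragment below. Only the macro name and the file name come from the text above; the exact form of the original line in elmodefines.h is an assumption, so check the header before editing it.

```c
/* elmodefines.h (excerpt, assumed form): point ELMO at the
 * NXP LPC1114 coefficient file instead of the default "coeffs.txt". */
#define COEFFSFILE "coeffs_LPC.txt"
```

Since this is a compile-time macro, ELMO has to be rebuilt with 'make' for the change to take effect.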
Sound leakage detection
Whilst ELMO is primarily a leakage emulator, a natural extension is to integrate leakage detection functionality so users can more easily identify leaking instructions. The 'default' methodology for this is 'TVLA' https://www.rambus.com/wp-content/uploads/2015/08/TVLA-DTR-with-AES.pdf. Unfortunately, the TVLA description fails to recognise a number of statistical challenges https://eprint.iacr.org/2019/703.pdf, and current adaptions of it in standards lead to statistically unsound techniques: https://eprint.iacr.org/2019/1013.pdf.
To facilitate a somewhat more sound use of leakage detection, the in-built feature is capable of determining the necessary sample size (based on a defined significance level, power, and effect size). If this leads to impractical workloads, users can still override it.
We have also added a small extension which can change the number of traces without re-compiling the ARM binary (.bin file). To achieve this, one needs to write
then use N as number of traces. This API will read from ELMO's command line, the value after
will be written to N
In depth: power models
ELMO comes with several sets of power models which were captured from different implementations of M0 cores. The philosophy of model building was to test promising variables and to include them if they turn out statistically significant. A precise discussion and description of this process can be found in the associated USENIX paper, but in a nutshell these are the factors/leakage sources that feature in the models:
- The weighted Hamming weight/distance on the data bus for operand 1 /operand 2 in the ALU (3-instruction time window)
- Second-order bit-interaction between bits/bitflips within the same operand (only for shift instructions and multiplications)
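The two kinds of terms listed above can be sketched as follows. This is an illustrative assumption about their general shape, not an excerpt from ELMO: the function names and the layout of the coefficient arrays are hypothetical, and the actual coefficients come from the fitted models described in the USENIX paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Weighted Hamming weight: each set bit i contributes its own
 * coefficient w[i], instead of a uniform weight of 1. */
static double weighted_hw(uint32_t x, const double w[32])
{
    assert(w != NULL);
    double sum = 0.0;
    for (int i = 0; i < 32; i++)
        if ((x >> i) & 1u)
            sum += w[i];
    return sum;
}

/* Weighted Hamming distance: the same sum over the bits that flip
 * between the previous and the current value on a bus. */
static double weighted_hd(uint32_t prev, uint32_t cur, const double w[32])
{
    return weighted_hw(prev ^ cur, w);
}

/* Second-order interactions: a pair of bits (i, j) with i < j
 * contributes w2[i][j] when both bits are set. Only the upper
 * triangle of w2 is read. */
static double pairwise_interactions(uint32_t x, const double w2[32][32])
{
    assert(w2 != NULL);
    double sum = 0.0;
    for (int i = 0; i < 32; i++)
        for (int j = i + 1; j < 32; j++)
            if (((x >> i) & 1u) && ((x >> j) & 1u))
                sum += w2[i][j];
    return sum;
}
```

Passing `prev ^ cur` to pairwise_interactions gives the corresponding interaction term between bit flips rather than bits.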
The following terms looked promising but turned out not to be significant:
- The bitflip of the target register (although this is by chance, duplicated by the bitflip of a certain operand bus)
- Bit-interaction of more than two bits (not significant enough in the dataset)
Note that the latter is actually core-dependent: sometimes even the same manufacturer may change micro-architectures in their product lines.
In depth: model extension
The model building left out effects related to the memory subsystem. This is a clear limitation, which we discuss partially in our paper on assembly-based masked AES implementation. Independently, the ROSITA paper also recognises similar issues on another M0 core.
Finding ways to automatically reverse engineer a specific bus architecture, with the aim of characterising it, is an ongoing effort for us. In the meantime we have settled for a conservative workaround based on the assumptions that there is one bus/buffer for reads and one for writes, and that neither of these buses is cleared until the next memory access. Based on these assumptions:
- Two 32-bit variables have been added to ELMO's data flow, data_flow (elmodefines.h), representing the current read/write bus
- In elmo.c, all load/store instructions will update the corresponding variables in the data flow. For any other instructions, these two variables are copied from the previous instruction.
- A Hamming distance term is added in elmopowermodel (powermodel.h). This term is zero for all non-load/store instructions. The coefficient is simply borrowed from one of the previous HD coefficients: readers can change it to any non-zero value they want, as long as they do not use the resulting traces "quantitatively".
- Finally, a new macro called "MEMORY_EXTENSION" is added in elmodefines.h. When defined, ELMO will consider the contribution from the memory system.
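The bookkeeping described in the list above can be sketched as follows. This is a simplified stand-in under the stated assumptions (one read bus, one write bus, values held until the next memory access), not the code in elmo.c or powermodel.h: the type bus_state, the enum op_kind, and the function memory_term are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two 32-bit fields added to the data flow. */
typedef struct {
    uint32_t read_bus;
    uint32_t write_bus;
} bus_state;

typedef enum { OP_OTHER, OP_LOAD, OP_STORE } op_kind;

/* Number of set bits in a 32-bit word. */
static unsigned popcount32(uint32_t x)
{
    unsigned count = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Advance the bus state by one instruction and return the memory term:
 * coeff * HD(old bus value, new bus value) for loads and stores,
 * zero for every other instruction. */
static double memory_term(const bus_state *prev, bus_state *cur,
                          op_kind kind, uint32_t value, double coeff)
{
    assert(prev != NULL && cur != NULL);
    *cur = *prev;     /* buses keep their value until the next access */
    switch (kind) {
    case OP_LOAD:
        cur->read_bus = value;
        return coeff * popcount32(prev->read_bus ^ value);
    case OP_STORE:
        cur->write_bus = value;
        return coeff * popcount32(prev->write_bus ^ value);
    default:
        return 0.0;
    }
}
```

Because the coefficient is borrowed rather than fitted, the resulting term indicates where memory-bus leakage may occur but says little about its magnitude.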
March 26th 2020, Si Gao
- Renamed the macro for changing the number of traces via ELMO command-line arguments: AUTOTVLA ---> NTRACE
- ELMO command line argument renaming: '-autotvla' ---> 'Ntrace'
- Renaming the corresponding ELMO library function: 'LoadNForTVLA()' --->'LoadN()'
- ELMO now always reads the effect size from the model (last line of "coeffs.txt") and outputs the power of the test
- Adding examples used in the REASSURE whitepaper