
A repository with the code of the crypto library from the CHES 2023 paper "SCA-secure ECC in software – mission impossible?"


sca-secure-library-sca25519/sca25519


The sca25519 library ("SCA-secure ECC in software – mission impossible?")

This repository accompanies the paper SCA-secure ECC in software – mission impossible? available at

Authors:

  • Lejla Batina lejla@cs.ru.nl
  • Łukasz Chmielewski chmiel@fi.muni.cz and lukaszc@cs.ru.nl
  • Björn Haase Bjoern.M.Haase@web.de
  • Niels Samwel niels.samwel@ru.nl
  • Peter Schwabe peter@cryptojedi.org

This repository contains three implementations of X25519 in C and assembly for the Cortex-M4, with countermeasures against side-channel and fault injection attacks. The first implementation is unprotected, the second contains the countermeasures required for ephemeral scalar multiplication, and the third contains the most countermeasures and targets static scalar multiplication. The three implementations are located in similarly named subdirectories. The repository also includes a common directory with code shared by the three implementations and a hostside directory with Python code to communicate with the board.

The Cortex-M4 implementations are based on this STM32: getting started repository.

Installation

This code assumes you have the arm-none-eabi toolchain installed and accessible. Confusingly, the tools available in the (discontinued) embedian project have identical names - be careful to select the correct toolchain (or consider re-installing if you experience unexpected behaviour). On most Linux systems, the correct toolchain gets installed when you install the arm-none-eabi-gcc (or gcc-arm-none-eabi) package. Besides a compiler and assembler, you may also want to install arm-none-eabi-gdb. On Linux Mint, be sure to explicitly install libnewlib-arm-none-eabi as well (to fix an error relating to stdint.h).

This project relies on the libopencm3 firmware library. It is included as a submodule, and we have also included it directly in the libopencm3 folder in the main directory. When using git from the command line, you might need to execute "git submodule init" and "git submodule update" in the root directory first. Compile the library (e.g. by calling make lib within the STM32F407-unprotected directory) before attempting to compile any of the other targets. On systems that have no python executable (that is, no symlink from python to the python3 binary), you might need to replace the line #!/usr/bin/env python in the files gendoxylayout.py and genlink.py (subdirectory libopencm3/scripts) with #!/usr/bin/env python3. If you observe problems with building libopencm3 (e.g. reports regarding "unterminated quotes"), it might help to fix line 27 in the file libopencm3/Makefile by replacing the assignment
SRCLIBDIR:= $(subst $(space),\$(space),$(realpath lib))
with
SRCLIBDIR:= $(subst $(space),\$(space),$(realpath ./))/lib
or if your directory path does not contain spaces with
SRCLIBDIR:= $(subst $(space),/$(space),$(realpath lib))

The binary can be compiled by calling make in each respective subdirectory (unprotected, ephemeral, static). The binary can then be flashed onto the board using stlink, as follows: st-flash write main.bin 0x8000000. Depending on your operating system, stlink may be available in your package manager -- otherwise refer to their GitHub page for instructions on how to compile it from source (in that case, be careful to use libusb-1.0.0-dev, not libusb-0.1).
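The build and flash steps above can be scripted. The following is a minimal sketch, assuming that make and st-flash are on your PATH and that libopencm3 has already been built with make lib; the helper names build_and_flash_commands and build_and_flash are hypothetical and not part of this repository.

```python
import subprocess
from pathlib import Path

# Directory names as listed under "Structure of this repository".
TARGETS = ("STM32F407-unprotected", "STM32F407-ephemeral", "STM32F407-static")
FLASH_ADDRESS = "0x8000000"  # same address as in the st-flash command above


def build_and_flash_commands(target):
    """Return (command, working directory) pairs for one target (hypothetical helper)."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    return [
        (["make"], Path(target)),
        (["st-flash", "write", "main.bin", FLASH_ADDRESS], Path(target)),
    ]


def build_and_flash(target, repo_root="."):
    """Run make, then st-flash, for one target; stops at the first failing command."""
    for command, workdir in build_and_flash_commands(target):
        subprocess.run(command, cwd=Path(repo_root) / workdir, check=True)
```

Calling build_and_flash("STM32F407-static") from the repository root would build and flash the static implementation, provided the board is connected.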

The host-side Python 3 code requires the pyserial module. Your package repository might offer python-serial or python-pyserial directly (as of writing, this is the case for Ubuntu, Debian and Arch). Alternatively, it can be installed from PyPI by calling pip install pyserial (or pip3, depending on your system). If you do not have pip installed yet, you can typically find it as python3-pip using your package manager. Use the host_unidirectional.py script to receive data from the board.
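For orientation, receiving data from the board with pyserial could look roughly like the sketch below. It is not the code of host_unidirectional.py; the device path /dev/ttyUSB0, the baud rate 115200 and the function names are assumptions that you will likely need to adapt to your setup.

```python
def read_lines(port, max_lines=None):
    """Yield decoded text lines from a file-like port until a read returns nothing."""
    count = 0
    while max_lines is None or count < max_lines:
        raw = port.readline()
        if not raw:  # empty read: timeout (pyserial) or end of stream
            break
        yield raw.decode("ascii", errors="replace").rstrip("\r\n")
        count += 1


def main(device="/dev/ttyUSB0", baudrate=115200):
    # Assumed values: adjust the device path and baud rate to your setup.
    import serial  # provided by the pyserial package

    with serial.Serial(device, baudrate, timeout=5) as port:
        for line in read_lines(port):
            print(line)
```

Call main() once the board is flashed and the UART-USB connector is attached. read_lines only needs an object with a readline method, so it can be tried without hardware.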

Requirements Summary

  • arm-none-eabi-gcc version 9.2.1 20191025 (with the -O2 optimization flag); for the performance evaluation, we compiled all related work with the same compiler and the -O2 flag.
  • python3 with pyserial
  • For Cortex-M4F:
    • Board stm32f4discovery (we use floating-point registers)
    • stlink

libopencm3

For our evaluation we have used libopencm3 (https://github.com/libopencm3/libopencm3/) with the commit id: 7daa6f15bf8db77b3225df01e427777b202b4e4e (from February 5th, 17:22:55, 2019). We have also included it in the libopencm3 directory in our repository.

Hooking up an STM32 discovery board

Connect the board to your machine using the mini-USB port. This provides it with power, and allows you to flash binaries onto the board. It should show up in lsusb as STMicroelectronics ST-LINK/V2.

If you are using a UART-USB connector that has a PL2303 chip on board (which appears to be the most common), the driver should be loaded in your kernel by default. If it is not, it is typically called pl2303. On macOS, you will still need to install it (and reboot). When you plug in the device, it should show up as Prolific Technology, Inc. PL2303 Serial Port when you type lsusb.

Using dupont / jumper cables, connect the TX/TXD pin of the USB connector to the PA3 pin on the board, and connect RX/RXD to PA2. Depending on your setup, you may also want to connect the GND pins.

Troubleshooting

At some point the boards might behave differently than one would expect, to a point where simply power-cycling the board does not help. In these cases, it is useful to be aware of a few trouble-shooting steps.

Problems related to the tools

If you're using Ubuntu, a common issue when using stlink is an error saying you are missing libstlink-shared.so.1. In this case, try running ldconfig.

If you are running into permission errors when trying to access the serial devices as a non-root user, you could consider adding your current user to the dialout (Debian, Ubuntu) or uucp (Arch) group, using something along the lines of sudo usermod -a -G [group] [username].

If you are getting Python errors when running the host-side scripts, make sure you are using Python 3.

Some issues can be caused by symbolic links not working correctly. We recommend cloning the git repository rather than downloading the zip file from the GitHub web interface.

Problems related to the board

First, check if all the cables are attached properly. For the boards supported in this repository, connect TX to PA3, RX to PA2 and GND to GND. Power is typically supplied using the mini-USB connector that is also used to flash code onto the board.

If the code in this repository does not appear to work correctly after flashing it on to the board, try pressing the RST button (optionally followed by re-flashing).

If you cannot flash new code onto the board, but are instead confronted with WARN src/stlink-common.c: unknown chip id!, try shorting the BOOT0 and VDD pins and pressing RST. This selects the DFU bootloader. After that, optionally use st-flash erase before re-flashing the board.

If you cannot flash the code onto the board, and instead get Error: Data length doesn't have a 32 bit alignment: +2 byte., make sure you are using a version of stlink for which this issue has been resolved. This affected L0 and L1 boards.

Structure of this repository

  • STM32F407-ephemeral contains our implementation for the ephemeral X25519; this implementation contains some side-channel protections.
  • STM32F407-static contains our implementation for the static X25519; this implementation contains multiple side-channel protections.
  • STM32F407-unprotected contains our implementation for the unprotected X25519; this implementation contains no side-channel protections besides being constant-time.
  • common contains common files for the implementations. Currently these files are copied to the implementations but symbolic links can be used instead.
  • hostside contains simple python code to communicate with the device. We use it to read the test results (for the performance evaluation) from the board.

Relevant Flags

The following flags are relevant and useful for performance evaluation:

  • ITOH_COUNTERMEASURE and ITOH_COUNTERMEASURE64 (in file scalarmult_25519.c) specify whether the address randomization is turned on in the static multiplication; both are turned on by default;
  • UPDATABLE_STATIC_SCALAR (in file scalarmult_25519.c) specifies whether the static scalar is updated on each scalar multiplication call, and SCALAR_RANDOMIZATION (also in scalarmult_25519.c) specifies whether the scalar is updated just before the scalar multiplication (turning these countermeasures off is important for template attacks); both flags are turned on by default;
  • COUNT_CYCLES_EXTRA_SM (in file crypto_scalarmult.h) is only useful when measuring clock cycles that the extra 64-bit scalar multiplication takes in the static implementation; otherwise, the flag should be turned off because it affects the efficiency of the scalar multiplication; it is turned off by default;
  • WITH_PERFORMANCE_BENCHMARKING defines whether the code for measuring clock cycles is present in the library; by default it is enabled in the Makefile: -DWITH_PERFORMANCE_BENCHMARKING;
  • MULTIPLICATIVE_CSWAP (in file crypto_scalarmult.h) selects between two different implementations of the cswaperr procedure (for both the ephemeral and the static implementations). All results in the paper were obtained with MULTIPLICATIVE_CSWAP enabled, since this implementation turned out to be safer and more efficient.

Code Formatting

All the code base was cleaned using the following commands:

  • clang-format --style=Google -i `find ./STM32F407-* common | grep "\.h"`
  • clang-format --style=Google -i `find ./STM32F407-* common | grep "\.c"`

Execution time

The tests for the unprotected and ephemeral implementations run relatively quickly (less than a minute). The test for the static implementation takes much longer, approximately 16-17 minutes. The number of cycles might differ slightly from the results presented in the paper because the execution times are randomized (due to the protection of the inversion, for example). The version of the compiler might also slightly influence this number.

Traces

As mentioned in the paper, we ran our side-channel experiments using a modified Cortex-M4 (several capacitors removed) on an STM32F407IGT6 board clocked at 168 MHz. The resulting comparison of traces for all three implementations is shown below:

[Three trace figures, one per implementation]

The scalar multiplications are marked in red.

To help reproduce the side-channel evaluation, for each of the implementations we upload 100 traces under the following link: https://www.dropbox.com/scl/fo/3zjd5m4oc77c5qgob7kxf/h?dl=0&rlkey=59774fmckgv5k8z87dbk7gpqk.

The first byte of the data attached to each trace indicates the TVLA set (0: fixed, 1: random input and fixed key, 2: fixed input and random key). The rest of the data corresponds to the input point and the scalar.
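The data layout described above can be decoded as in the following minimal sketch. The function name split_trace_data is hypothetical, and the sketch deliberately does not split the remaining bytes into point and scalar, since their exact lengths and order are not stated here. How the raw bytes are obtained from a trace depends on the trsfile version you use.

```python
# Labels follow the description of the first data byte given above.
TVLA_SETS = {
    0: "fixed",
    1: "random input, fixed key",
    2: "fixed input, random key",
}


def split_trace_data(data):
    """Split per-trace data into (TVLA set label, remaining bytes)."""
    if len(data) < 1:
        raise ValueError("trace data is empty")
    set_id = data[0]
    if set_id not in TVLA_SETS:
        raise ValueError(f"unexpected TVLA set id: {set_id}")
    # The remaining bytes hold the input point and the scalar (left unsplit here).
    return TVLA_SETS[set_id], bytes(data[1:])
```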

Format

Each trace set is in the TRS format that is described under the following links:

The python trsfile package can be used to read the traces and it can be installed using pip: https://pypi.org/project/trsfile/.
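As a starting point, the sketch below prints a few basic statistics for the first traces of a trace set. The file name unprotected.trs and the function names are assumptions; substitute the path of a downloaded trace set. Only trsfile.open, slicing of the trace set and iteration over the samples of a trace are used.

```python
def summarize(traces, limit=5):
    """Return (index, sample count, min, max) for the first `limit` traces."""
    rows = []
    for index, trace in enumerate(traces):
        if index >= limit:
            break
        samples = list(trace)  # a trace is treated as a sequence of samples
        rows.append((index, len(samples), min(samples), max(samples)))
    return rows


def main(path="unprotected.trs"):  # hypothetical file name
    import trsfile  # installed with: pip install trsfile

    with trsfile.open(path, "r") as traces:
        for row in summarize(traces[0:5]):
            print(row)
```

summarize accepts any iterable of sample sequences, so it can be checked on plain lists before pointing main at a real trace set.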

License

All our code is covered by CC0. libopencm3 is licensed under GPL version 3, see https://github.com/libopencm3/libopencm3.
