Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

40 tests fail on ARM32 (armv7hf): AssertionError: Got unexpected return code -4 #1483

Closed
vt-alt opened this issue May 26, 2023 · 12 comments · Fixed by #1485
Closed

40 tests fail on ARM32 (armv7hf): AssertionError: Got unexpected return code -4 #1483

vt-alt opened this issue May 26, 2023 · 12 comments · Fixed by #1485

Comments

@vt-alt
Copy link
Contributor

vt-alt commented May 26, 2023

Describe the bug
I try to build 0.8.0-rc1 for ALT Linux and found that 40 tests failed on armv7hf arch.

[00:00:48] + timeout 222 /usr/bin/ninja -j32 -C build run_tests
[00:00:48] ninja: Entering directory `build'
[00:00:49] [0/1] cd /usr/src/RPM/BUILD/liboqs-0.8.0.rc1 && /usr/bin/cmake -E env OQS_BUILD_DIR=/usr/src/RPM/BUILD/liboqs-0.8.0.rc1/build python3 -m pytest --verbose --numprocesses=auto --ignore=scripts/copy_from_upstream/repos
[00:00:55] ============================= test session starts ==============================
[00:00:55] platform linux -- Python 3.10.8, pytest-7.3.1, pluggy-1.0.0 -- /usr/bin/python3
[00:00:55] cachedir: .pytest_cache
[00:00:55] rootdir: /usr/src/RPM/BUILD/liboqs-0.8.0.rc1
[00:00:55] plugins: xdist-3.3.1
[00:00:55] created: 32/32 workers
[00:00:55] 32 workers [402 items]
...
[00:01:30] =================================== FAILURES ===================================
[00:01:30] ______________________ test_kem[Classic-McEliece-348864] _______________________
[00:01:30] [gw26] linux -- Python 3.10.8 /usr/bin/python3
[00:01:30] 
[00:01:30] kem_name = 'Classic-McEliece-348864'
[00:01:30] 
[00:01:30]     @helpers.filtered_test
[00:01:30]     @pytest.mark.parametrize('kem_name', helpers.available_kems_by_name())
[00:01:30]     def test_kem(kem_name):
[00:01:30]         kats = helpers.get_kats("kem")
[00:01:30]         if not(helpers.is_kem_enabled_by_name(kem_name)): pytest.skip('Not enabled')
[00:01:30] >       helpers.run_subprocess( [helpers.path_to_executable('speed_kem'), kem_name, "-f"] )
[00:01:30] 
[00:01:30] tests/test_speed.py:14: 
[00:01:30] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[00:01:30] 
[00:01:30] command = ['/usr/src/RPM/BUILD/liboqs-0.8.0.rc1/build/tests/speed_kem', 'Classic-McEliece-348864', '-f']
[00:01:30] working_dir = '.'
[00:01:30] env = {'G_BROKEN_FILENAMES': '1', 'HISTFILESIZE': '9999', 'HISTSIZE': '999', 'HOME': '/usr/src', ...}
[00:01:30] expected_returncode = 0, input = None, ignore_returncode = False
[00:01:30] 
[00:01:30]     def run_subprocess(command, working_dir='.', env=None, expected_returncode=0, input=None, ignore_returncode=False):
[00:01:30]         """
[00:01:30]         Helper function to run a shell command and report success/failure
[00:01:30]         depending on the exit status of the shell command.
[00:01:30]         """
[00:01:30]         env_ = os.environ.copy()
[00:01:30]         if env is not None:
[00:01:30]             env_.update(env)
[00:01:30]         env = env_
[00:01:30]     
[00:01:30]         # Note we need to capture stdout/stderr from the subprocess,
[00:01:30]         # then print it, which pytest will then capture and
[00:01:30]         # buffer appropriately
[00:01:30]         print(working_dir + " > " + " ".join(command))
[00:01:30]     
[00:01:30]         result = subprocess.run(
[00:01:30]                 command,
[00:01:30]                 input=input,
[00:01:30]                 stdout=subprocess.PIPE,
[00:01:30]                 stderr=subprocess.STDOUT,
[00:01:30]                 cwd=working_dir,
[00:01:30]                 env=env,
[00:01:30]             )
[00:01:30]     
[00:01:30]         if not(ignore_returncode) and (result.returncode != expected_returncode):
[00:01:30]             print(result.stdout.decode('utf-8'))
[00:01:30] >           assert False, "Got unexpected return code {}".format(result.returncode)
[00:01:30] E           AssertionError: Got unexpected return code -4
[00:01:30] 
[00:01:30] tests/helpers.py:42: AssertionError
[00:01:30] ----------------------------- Captured stdout call -----------------------------
[00:01:30] . > /usr/src/RPM/BUILD/liboqs-0.8.0.rc1/build/tests/speed_kem Classic-McEliece-348864 -f
[00:01:30] 
...
...
[00:01:30] =========================== short test summary info ============================
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-348864] - AssertionErro...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-348864f] - AssertionErr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-460896f] - AssertionErr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-6960119] - AssertionErr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-460896] - AssertionErro...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-6688128f] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-6960119f] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-8192128] - AssertionErr...
[00:01:30] FAILED tests/test_speed.py::test_kem[HQC-192] - AssertionError: Got unexpecte...
[00:01:30] FAILED tests/test_speed.py::test_kem[HQC-128] - AssertionError: Got unexpecte...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-8192128f] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Kyber512] - AssertionError: Got unexpect...
[00:01:30] FAILED tests/test_speed.py::test_kem[HQC-256] - AssertionError: Got unexpecte...
[00:01:30] FAILED tests/test_speed.py::test_kem[Kyber768] - AssertionError: Got unexpect...
[00:01:30] FAILED tests/test_speed.py::test_kem[sntrup761] - AssertionError: Got unexpec...
[00:01:30] FAILED tests/test_speed.py::test_kem[FrodoKEM-640-AES] - AssertionError: Got ...
[00:01:30] FAILED tests/test_speed.py::test_kem[FrodoKEM-640-SHAKE] - AssertionError: Go...
[00:01:30] FAILED tests/test_speed.py::test_kem[FrodoKEM-976-AES] - AssertionError: Got ...
[00:01:30] FAILED tests/test_speed.py::test_kem[FrodoKEM-1344-AES] - AssertionError: Got...
[00:01:30] FAILED tests/test_speed.py::test_sig[Dilithium3] - AssertionError: Got unexpe...
[00:01:30] FAILED tests/test_speed.py::test_kem[FrodoKEM-1344-SHAKE] - AssertionError: G...
[00:01:30] FAILED tests/test_speed.py::test_kem[FrodoKEM-976-SHAKE] - AssertionError: Go...
[00:01:30] FAILED tests/test_speed.py::test_sig[Dilithium5] - AssertionError: Got unexpe...
[00:01:30] FAILED tests/test_speed.py::test_sig[Falcon-512] - AssertionError: Got unexpe...
[00:01:30] FAILED tests/test_speed.py::test_sig[Dilithium2] - AssertionError: Got unexpe...
[00:01:30] FAILED tests/test_speed.py::test_sig[Falcon-1024] - AssertionError: Got unexp...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHA2-128s-simple] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHA2-128f-simple] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHA2-192f-simple] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHA2-192s-simple] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHA2-256f-simple] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHA2-256s-simple] - AssertionEr...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHAKE-128f-simple] - AssertionE...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHAKE-128s-simple] - AssertionE...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHAKE-192f-simple] - AssertionE...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHAKE-256f-simple] - AssertionE...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHAKE-256s-simple] - AssertionE...
[00:01:30] FAILED tests/test_speed.py::test_kem[Classic-McEliece-6688128] - AssertionErr...
[00:01:30] FAILED tests/test_speed.py::test_kem[Kyber1024] - AssertionError: Got unexpec...
[00:01:30] FAILED tests/test_speed.py::test_sig[SPHINCS+-SHAKE-192s-simple] - AssertionE...
[00:01:30] ================= 40 failed, 218 passed, 144 skipped in 40.35s =================

Environment (please complete the following information):

  • OS: ALT Linux
  • OpenSSL version 1.1.1t
  • Compiler version used gcc 12.2.1
  • Build variables used -DBUILD_SHARED_LIBS=ON -DOQS_DIST_BUILD=ON -DOQS_OPT_TARGET=generic
  • liboqs version 0.8.0.rc1

Full build log is temporary available there https://git.altlinux.org/tasks/321944/build/100/armh/log

@vt-alt
Copy link
Contributor Author

vt-alt commented May 26, 2023

My (not expert in arm) debugging attempts:

builder@armh:~/RPM/BUILD/liboqs-0.8.0.rc1$ gdb --args  /usr/src/RPM/BUILD/liboqs-0.8.0.rc1/build/tests/speed_kem Kyber768 -f
GNU gdb (GDB) 11.2-alt1 (ALT Sisyphus)
(gdb) r
Configuration info
==================
Target platform:  armv8l-Linux-6.1.27-std-def-alt1
Compiler:         gcc (12.2.1)
Compile options:  [-Wa,--noexecstack;-Wstrict-overflow;-ggdb3;-fno-ipa-modref;-fno-ipa-pure-const;-Wbad-function-cast]
OQS version:      0.8.0-rc1
Git commit:       unknown
OpenSSL enabled:  Yes (OpenSSL 1.1.1t  7 Feb 2023)
AES:              OpenSSL
SHA-2:            OpenSSL
SHA-3:            C
OQS build flags:  BUILD_SHARED_LIBS OQS_DIST_BUILD OQS_OPT_TARGET=generic CMAKE_BUILD_TYPE=RelWithDebInfo
CPU exts active:
Speed test
==========
Started at 2023-05-26 15:39:49
Operation                            | Iterations | Total time (s) | Time (us): mean | pop. stdev | CPU cycles: mean          | pop. stdev
------------------------------------ | ----------:| --------------:| ---------------:| ----------:| -------------------------:| ----------:
Kyber768                             |            |                |                 |            |                           |

Program received signal SIGILL, Illegal instruction.
kem_speed_wrapper (method_name=<optimized out>, duration=<optimized out>, printInfo=<optimized out>, printInfo@entry=false, doFullCycle=<optimized out>, doFullCycle@entry=true) at /usr/src/RPM/BUILD/liboqs-0.8.0.rc1/tests/speed_kem.c:68
68                      TIME_OPERATION_SECONDS(fullcycletest(kem, public_key, secret_key, ciphertext, shared_secret_e, shared_secret_d), "fullcycletest", duration)
(gdb) bt
#0  kem_speed_wrapper (method_name=<optimized out>, duration=<optimized out>, printInfo=<optimized out>, printInfo@entry=false, doFullCycle=<optimized out>,
    doFullCycle@entry=true) at /usr/src/RPM/BUILD/liboqs-0.8.0.rc1/tests/speed_kem.c:68
#1  0x00400da6 in main (argc=<optimized out>, argv=0xbefff564) at /usr/src/RPM/BUILD/liboqs-0.8.0.rc1/tests/speed_kem.c:183
(gdb) i r
r0             0x0                 0
r1             0x2dc6c0            3000000
r2             0x17                23
r3             0xf4240             1000000
r4             0x0                 0
r5             0x1                 1
r6             0x0                 0
r7             0x3                 3
r8             0x412f5c            4271964
r9             0x416160            4284768
r10            0x4157f8            4282360
r11            0x415350            4281168
r12            0xb6ff3020          3070177312
sp             0xbefff1f8          0xbefff1f8
lr             0x4013a7            4199335
pc             0x4013c2            0x4013c2 <kem_speed_wrapper+226>
cpsr           0x200f0030          537854000
fpscr          0x0                 0
(gdb) disas/s kem_speed_wrapper
Dump of assembler code for function kem_speed_wrapper:
...
/usr/src/RPM/BUILD/liboqs-0.8.0.rc1/tests/speed_kem.c:
64                      TIME_OPERATION_SECONDS(OQS_KEM_keypair(kem, public_key, secret_key), "keygen", duration)
   0x004013ba <+218>:   str     r0, [sp, #60]   ; 0x3c

63              if (!doFullCycle) {
   0x004013bc <+220>:   cmp     r5, #0
   0x004013be <+222>:   beq.w   0x4015cc <kem_speed_wrapper+748>

/usr/src/RPM/BUILD/liboqs-0.8.0.rc1/tests/ds_benchmark.h:
175             __asm__ volatile("mcr p15, 0, %0, c9, c12, 0\t\n" ::"r"(value));
=> 0x004013c2 <+226>:   mcr     15, 0, r2, cr9, cr12, {0}
...
(gdb) l *(kem_speed_wrapper+226)
0x4013c2 is in kem_speed_wrapper (/usr/src/RPM/BUILD/liboqs-0.8.0.rc1/tests/ds_benchmark.h:175).
170             }
171
172             value |= 16;
173
174             /* Program the performance-counter control-register */
175             __asm__ volatile("mcr p15, 0, %0, c9, c12, 0\t\n" ::"r"(value));
176
177             /* Enable all counters */
178             __asm__ volatile("mcr p15, 0, %0, c9, c12, 1\t\n" ::"r"(0x8000000f));
179

@dstebila
Copy link
Member

Can you try adding -DOQS_USE_RASPBERRY_PI to your CMake command? We have some code in ds_benchmark.h that is allowed on ARM but guarded against use on Raspberry Pi since we know it doesn't work there, I'm wondering if the problem extends further to ARM32.

@vt-alt
Copy link
Contributor Author

vt-alt commented May 29, 2023

I added -DOQS_USE_RASPBERRY_PI to CFLAGS[1] and after this all tests are passed.

====================== 258 passed, 144 skipped in 37.18s =======================

Thanks!

[1] Adding to cmake command does not work:

CMake Error: Parse error in command line argument: OQS_USE_RASPBERRY_PI

@dstebila
Copy link
Member

Okay, so that's the code that was causing the problem. Do you know how to do cycle counting on ARM32? If not, then ideally we would update the CMake configuration to automatically detect ARM32 and apply the flag you did.

@vt-alt
Copy link
Contributor Author

vt-alt commented May 29, 2023

I'm not an expert in ARM32, so I don't have personal knowledge, but I have heard that djb does libcpucycles, which appears to support some ARM32.: https://cpucycles.cr.yp.to/libcpucycles-20230115/cpucycles/arm32-cortex.c.html (not tested by me.)

@vt-alt
Copy link
Contributor Author

vt-alt commented May 29, 2023

not tested by me.

I tested and this (call to ticks and enable) produces Illegal instruction either. (And yes enable seems the same as your code.)

ps. I fear there cannot be single solution for ARM32 since they are very diverse.

@dstebila
Copy link
Member

not tested by me.

I tested and this (call to ticks and enable) produces Illegal instruction either. (And yes enable seems the same as your code.)

ps. I fear there cannot be single solution for ARM32 since they are very diverse.

So probably we'll need to revert to using time rather than cycles. Do you happen to know a way to detect ARM32 in either CMake or C preprocessor macros?

@vt-alt
Copy link
Contributor Author

vt-alt commented May 29, 2023

By this https://developer.arm.com/documentation/dui0774/e/Other-Compiler-specific-Features/Predefined-macros I think __arm__ is for ARM32, and __aarch64__ for ARM64. I tested this.

JFYI:

builder@armh:/.in$ gcc -dM -E - </dev/null |grep -i -e aarch -e arm
#define __ARM_SIZEOF_WCHAR_T 4
#define __ARM_FEATURE_SAT 1
#define __ARM_ARCH_ISA_ARM 1
#define __ARMEL__ 1
#define __ARM_FP 12
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_PCS_VFP 1
#define __ARM_FEATURE_LDREX 15
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FEATURE_QBIT 1
#define __ARM_ARCH_PROFILE 65
#define __ARM_32BIT_STATE 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_ARCH_ISA_THUMB 2
#define __ARM_ARCH 7
#define __arm__ 1
#define __ARM_ARCH_7A__ 1
#define __ARM_FEATURE_SIMD32 1
#define __ARM_FEATURE_COPROC 15
#define __ARM_FEATURE_DSP 1
#define __ARM_EABI__ 1
builder@aarch64:/.in$ gcc -dM -E - </dev/null |grep -i -e aarch -e arm
#define __ARM_SIZEOF_WCHAR_T 4
#define __ARM_FEATURE_IDIV 1
#define __ARM_FP 14
#define __AARCH64_CMODEL_SMALL__ 1
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_ALIGN_MAX_PWR 28
#define __ARM_FP16_FORMAT_IEEE 1
#define __ARM_FP16_ARGS 1
#define __ARM_FEATURE_CLZ 1
#define __aarch64__ 1
#define __AARCH64EL__ 1
#define __ARM_64BIT_STATE 1
#define __ARM_FEATURE_FMA 1
#define __ARM_ARCH_PROFILE 65
#define __ARM_PCS_AAPCS64 1
#define __ARM_ARCH 8
#define __ARM_ARCH_8A 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_NEON 1
#define __ARM_ALIGN_MAX_STACK_PWR 16
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_ARCH_ISA_A64 1

@vt-alt
Copy link
Contributor Author

vt-alt commented May 29, 2023

About cmake, my guess is if CMAKE_SYSTEM_PROCESSOR contains substring "arm" then it some kind of ARM32 (for example even if it's "armv8l"), if matches "aarch64" then it's ARM64.

@dstebila
Copy link
Member

I'm not sure how to characterize which platforms it is acceptable on and which aren't, so the only strategy I have would be to turn it off on all ARM32 platforms.

@dstebila
Copy link
Member

I've created #1485 as an attempt to fix it.

But I wonder if it's actually a different problem -- it's not that the instructions aren't supported, but that they aren't available to user-space programs by default. It's my understanding this is a problem on ARMv8 (https://github.com/mupq/pqax#enable-access-to-performance-counters) but it might also be a problem on ARMv7 (https://neocontra.blogspot.com/2013/05/user-mode-performance-counters-for.html). The workaround on ARMv8 is to install a kernel module (see my first link) to turn on the Performance Monitor Unit (PMU) which then makes the required instructions available to user-space programs. In our benchmarking code we have a configure-time option that lets you decide whether to use those or not (but still depends on you having the kernel module enabled.)

@vt-alt
Copy link
Contributor Author

vt-alt commented May 31, 2023

I think our infrastructure admins will not agree to install some 3rd party out-of-tree kernel module on build farm (not even all in-tree modules, due to security reasons) just to run some tests.

Also (I tried to run but) tests do not work in aarch32 KVM virtual machine.

So the best approach sounds like to have configure-time option. (And by default to disable this on all ARM32.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants