This is a suite of simple Arm assembly kernels for testing the performance and functionality of Arm CPUs.
Requirements: A compiler that supports Armv9 with the i8mm extension (e.g. GCC 12 or later).
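One way to check the requirement before building is to ask the compiler to accept the target architecture on a trivial program. The sketch below is only an illustration: the helper name `supports_march` is hypothetical, and the `armv9-a+i8mm` value is an assumption about how the requirement maps to a `-march` flag, not necessarily what the Makefile passes.

```shell
# Hypothetical helper: succeeds if compiler $1 accepts the -march value $2.
supports_march() {
    cc="$1"
    march="$2"
    # Compile a trivial program read from stdin and discard the object file.
    echo 'int main(void) { return 0; }' | \
        "$cc" -march="$march" -x c -c -o /dev/null - >/dev/null 2>&1
}

# Assumption: armv9-a+i8mm corresponds to "Armv9 with i8mm".
if supports_march "${CC:-gcc}" armv9-a+i8mm; then
    echo "compiler accepts -march=armv9-a+i8mm"
else
    echo "compiler rejects -march=armv9-a+i8mm"
fi
```

Set CC to test a compiler other than gcc, for example a cross-compiler.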
Quick Start:
$ make -j
$ ./arithmetic/fp64_neon_fmla.x
This code is provided under the BSD 3-Clause License.
Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.