I see very poor results on a modern ARM (aarch64) server. Some openlibm implementations are up to 48.68x slower than their system libm (glibc) counterparts; the worst case is hypot. This was a checkout of the main branch at 12f5ffc. This is also related to #234, but the performance difference seems to be even more dramatic.
For interest and potential usefulness to #203, I also compared it against an optimized build of musl 1.2.4:
function | bench-syslibm | bench-openlibm | bench-musl
pow      |  78.6387 MPS  |  17.3955 MPS   |  57.9493 MPS
hypot    | 232.7852 MPS  |   4.7823 MPS   | 139.4793 MPS
exp      | 317.8124 MPS  | 119.9932 MPS   | 215.2262 MPS
log      | 228.3188 MPS  |  97.0294 MPS   | 181.7701 MPS
log10    | 118.6787 MPS  |  73.0237 MPS   |  76.8402 MPS
sin      | 133.0101 MPS  | 135.6112 MPS   | 165.5926 MPS
cos      | 144.4003 MPS  | 127.8527 MPS   | 150.5435 MPS
tan      | 105.8875 MPS  |  68.8512 MPS   |  78.9428 MPS
asin     | 178.2302 MPS  |   9.6621 MPS   |  88.3722 MPS
acos     | 154.1304 MPS  |   9.9192 MPS   |  98.5818 MPS
atan     | 190.8853 MPS  |  91.6229 MPS   |  97.0451 MPS
atan2    |  56.6821 MPS  |  42.4876 MPS   |  47.6644 MPS
GNU libc version: 2.35
GNU libc release: stable
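To make the gap easier to read, here is a small sketch that recomputes the syslibm/openlibm throughput ratio for each function from the numbers above. It is not part of the openlibm benchmark harness; the `results` dictionary and the `slowdown` helper are names made up for this illustration, and the values are copied by hand from the benchmark output.

```python
# Throughput in MPS copied from the benchmark output above:
# (syslibm, openlibm, musl). Names here are illustrative only.
results = {
    "pow":   (78.6387, 17.3955, 57.9493),
    "hypot": (232.7852, 4.7823, 139.4793),
    "exp":   (317.8124, 119.9932, 215.2262),
    "log":   (228.3188, 97.0294, 181.7701),
    "log10": (118.6787, 73.0237, 76.8402),
    "sin":   (133.0101, 135.6112, 165.5926),
    "cos":   (144.4003, 127.8527, 150.5435),
    "tan":   (105.8875, 68.8512, 78.9428),
    "asin":  (178.2302, 9.6621, 88.3722),
    "acos":  (154.1304, 9.9192, 98.5818),
    "atan":  (190.8853, 91.6229, 97.0451),
    "atan2": (56.6821, 42.4876, 47.6644),
}


def slowdown(reference_mps, candidate_mps):
    """How many times slower the candidate is than the reference."""
    return reference_mps / candidate_mps


# Ratio of system libm throughput to openlibm throughput per function.
ratios = {
    name: slowdown(sys_mps, open_mps)
    for name, (sys_mps, open_mps, _musl_mps) in results.items()
}

# Function with the largest slowdown.
worst = max(ratios, key=ratios.get)

# Print the functions from worst to best.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:6s} {ratio:6.2f}x")
```

The outliers are hypot (about 48.68x), asin (about 18x) and acos (about 16x); pow is next at roughly 4.5x, and sin is the only function where openlibm is marginally ahead of the system libm.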
The openlibm compilation line looks like:
cc -fno-gnu89-inline -fno-builtin -O3 -fPIC -std=c99 -Wall -I/home/user/openlibm -I/home/user/openlibm/include -I/home/user/openlibm/aarch64 -I/home/user/openlibm/src -DASSEMBLER -D__BSD_VISIBLE -Wno-implicit-function-declaration -I/home/user/openlibm/ld128 -c src/e_j0.c -o src/e_j0.c.o
I have tried compiling openlibm with just a bare make, and also specifying the architecture directly with make ARCH=aarch64, with identical results.
Is there something we can do about this?