clang++ miscompiles std::pow with mixed types when using -O3 -fno-math-errno -fveclib=libmvec #164642

@davidbatetc

I am using the Linux x86_64 release of LLVM 21.1.4.

Reduced test case (also in godbolt):
test.cpp

#include <cmath>
#include <iostream>

using T = double;
using U = float;

void __attribute__((noinline)) computePow(T *dst, T *base, U *exponent, int n)
{
	for (int i = 0; i < n; ++i) {
		dst[i] = static_cast<T>(std::pow(base[i], exponent[i]));
	}
}

int main()
{
	constexpr int N = 4;

	T x[N] = {2, 4, 6, 8};
	U y[N] = {7, 5, 3, 1};

	T z[N];
	computePow(z, x, y, N);

	for (int i = 0; i < N; ++i) {
		std::cout << "pow(" << x[i] << ", " << y[i] << ") = " << z[i]
				  << std::endl;
	}
}

Compile and run:

$ clang++ test.cpp -o test -O3 -fveclib=libmvec -fno-math-errno
$ ./test
pow(2, 7) = 64
pow(4, 5) = 65536
pow(6, 3) = 0
pow(8, 1) = 0

However, the result should be

pow(2, 7) = 128
pow(4, 5) = 1024
pow(6, 3) = 216
pow(8, 1) = 8

The same wrong results appear for other type combinations with T != U, for example T = double and U = int. The results are correct when T = U = double or T = U = float.
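For reference, here is a minimal sketch of what the scalar call is expected to compute in the mixed-type case. It relies on the standard rule that std::pow with mixed arithmetic arguments promotes them to double (or long double if one argument is long double). The helper name referencePow is hypothetical and not part of the test case above.

```cpp
#include <cmath>
#include <type_traits>

// With mixed arithmetic argument types, std::pow promotes both arguments to
// double, so the scalar call in computePow is effectively pow(double, double).
static_assert(std::is_same<decltype(std::pow(double{}, float{})), double>::value,
              "pow(double, float) is expected to return double");
static_assert(std::is_same<decltype(std::pow(double{}, int{})), double>::value,
              "pow(double, int) is expected to return double");

// Hypothetical helper (not from the report): a scalar reference that makes
// the float -> double promotion of the exponent explicit.
double referencePow(double base, float exponent)
{
	return std::pow(base, static_cast<double>(exponent));
}
```

Comparing each dst[i] against referencePow(base[i], exponent[i]) is one way to check whether a given build is affected.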


From what I have investigated, the compiled program calls the libmvec function _ZGVdN4vv_pow to compute four powers at once, but the arguments do not seem to be placed in the right registers.

In the generated assembly (see godbolt), the registers xmm0 and xmm1 are used for the base and the registers xmm2 and xmm3 for the exponent. I think that ymm0 should instead be used for the base and ymm1 for the exponent.
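To illustrate the expectation described above, here is a portable scalar model of the data flow. It is only a sketch and is not the code clang generates: Vec4d stands in for one 256-bit register holding four doubles, and pow4Model, widen, and powMixed4 are hypothetical names used in place of the real libmvec entry point. The point is that the four float exponents have to be widened to four doubles before the call, so that base and exponent are each passed as a single 4 x double value.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec4d = std::array<double, 4>; // models one 256-bit register (4 x double)
using Vec4f = std::array<float, 4>;  // four floats fill only 128 bits

// Hypothetical scalar stand-in for a 4-lane vector pow: lane i of the result
// is pow(base[i], exponent[i]).
Vec4d pow4Model(const Vec4d &base, const Vec4d &exponent)
{
	Vec4d result{};
	for (std::size_t i = 0; i < result.size(); ++i) {
		result[i] = std::pow(base[i], exponent[i]);
	}
	return result;
}

// Widen four floats to four doubles, lane by lane.
Vec4d widen(const Vec4f &v)
{
	Vec4d wide{};
	for (std::size_t i = 0; i < wide.size(); ++i) {
		wide[i] = static_cast<double>(v[i]);
	}
	return wide;
}

// Mixed-type case from the report: double bases, float exponents.
Vec4d powMixed4(const Vec4d &base, const Vec4f &exponent)
{
	return pow4Model(base, widen(exponent));
}
```

Calling powMixed4 with the bases {2, 4, 6, 8} and exponents {7, 5, 3, 1} from the test case models the four lanes of a single vector call.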
