I ran into an issue with xsimd::erfc. For batches whose elements all hold the same value, the results look correct. For batches whose elements hold different values, however, erfc may be computed inaccurately for some elements. The error can be large, above 100 ULP.
Repro
Version: xsimd/11.1.0 (conan)
ISA: SSE2, x86-64, AMD Ryzen
#include <iostream>
#include <xsimd/xsimd.hpp>

int main()
{
    // Mixed batch: four different inputs.
    xsimd::batch<float, xsimd::sse2> v_ = { 0.5f, 0.6f, 0.7f, 0.8f };
    // Uniform batches: each input above repeated in all four lanes.
    xsimd::batch<float, xsimd::sse2> c1 = { 0.5f, 0.5f, 0.5f, 0.5f };
    xsimd::batch<float, xsimd::sse2> c2 = { 0.6f, 0.6f, 0.6f, 0.6f };
    xsimd::batch<float, xsimd::sse2> c3 = { 0.7f, 0.7f, 0.7f, 0.7f };
    xsimd::batch<float, xsimd::sse2> c4 = { 0.8f, 0.8f, 0.8f, 0.8f };
    std::cout << xsimd::erfc(v_) << std::endl;
    std::cout << xsimd::erfc(c1) << std::endl;
    std::cout << xsimd::erfc(c2) << std::endl;
    std::cout << xsimd::erfc(c3) << std::endl;
    std::cout << xsimd::erfc(c4) << std::endl;
}
Observed vs. expected behaviour:
The code above produces the following output:
(0.4795, 0.396144, 0.3222, 0.257916)
(0.4795, 0.4795, 0.4795, 0.4795)
(0.396144, 0.396144, 0.396144, 0.396144)
(0.322199, 0.322199, 0.322199, 0.322199)
(0.257899, 0.257899, 0.257899, 0.257899)
The last element of erfc(v_) should equal every element of erfc(c4), but 0.257916 != 0.257899. According to Wolfram Mathematica, Erfc[0.8] = 0.2578990352923395..., so it is the result from the heterogeneous batch that is wrong.