cblas_zdotc_sub returns incorrect results on ppc64le #2837

Closed
susilehtola opened this issue Sep 15, 2020 · 6 comments
@susilehtola
Contributor

Bug reported on Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1878449

cblas_zdotc_sub returns incorrect results on ppc64le on Fedora 32 and above (all with GCC 10.2.1). The correct result, however, is obtained on Fedora 31 (GCC 9.3.1).

The test is the following: since zdotc_sub(x,y) = conjugate(x) . y, one should have zdotc_sub(y,x) = conjugate(zdotc_sub(x,y)), but the computed results violate this property.
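
As a short derivation of why the property must hold (standard notation, with $n$ the vector length and the overline denoting complex conjugation; the name $\mathrm{zdotc}$ here stands for the value returned by zdotc_sub):

$$
\mathrm{zdotc}(x,y) = \sum_{i=1}^{n} \overline{x_i}\, y_i
\quad\Longrightarrow\quad
\overline{\mathrm{zdotc}(x,y)} = \sum_{i=1}^{n} x_i\, \overline{y_i} = \sum_{i=1}^{n} \overline{y_i}\, x_i = \mathrm{zdotc}(y,x).
$$

So the two printed results should have equal real parts and imaginary parts of opposite sign, up to rounding.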

Test program from the original report:

#include <stdio.h>
#include <cblas.h>

int
main(void)
{
	const double x[16] = {
		-0.4759506499688366,	+0.4656291053809647,
		-0.6826320556910688,	-0.7694515466337011,
		-0.4437469610112805,	-0.2274498627312811,
		-0.0813662255708667,	+0.25700235907942326,
		-0.35799891895966574,	-0.7498841473288012,
		+0.036785641195074215,	+0.9670972102872823,
		-0.4761141488697107,	-0.11355026270974378,
		+0.9521705697548672,	+0.5791166840478064,
	};
	const double y[16] = {
		+0.5882371515831732,	-7.0347725597508237e-04,
		-0.27747685516513854,	-6.4286264567294582e-01,
		-0.167792120858004,	-1.7371739715068224e-01,
		+0.1685162562206428,	-6.0160952958980318e-01,
		+0.5203435476117149,	+6.3398807174073424e-02,
		-0.6243831891471687,	+6.6474140613596511e-01,
		-0.42366570437592865,	-6.2949810985591026e-01,
		+0.34043771632046815,	+9.1471844734252139e-01,
	};
	double ret[2];
	cblas_zdotc_sub(8, x, 1, y, 1, ret);
	printf("Result: %.15f%+.15fj\n", ret[0], ret[1]);
	cblas_zdotc_sub(8, y, 1, x, 1, ret);
	printf("Result: %.15f%+.15fj\n", ret[0], ret[1]);
	return 0;
}
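
To judge which of the two printed results is wrong, it can help to compare against a plain C reference that does not depend on the BLAS library. The following is a minimal sketch, not code from OpenBLAS: `ref_zdotc` and `is_conjugate_pair` are hypothetical helper names, the arrays are assumed to hold interleaved (real, imaginary) pairs with unit stride as in the test program above, and the tolerance is left to the caller.

```c
#include <assert.h>

/* Hypothetical reference: out = conj(x) . y for n complex elements stored
 * as interleaved (re, im) doubles with unit stride. */
void ref_zdotc(int n, const double *x, const double *y, double out[2])
{
    assert(n >= 0);
    double re = 0.0, im = 0.0;
    for (int i = 0; i < n; i++) {
        double xr = x[2 * i], xi = x[2 * i + 1];
        double yr = y[2 * i], yi = y[2 * i + 1];
        /* conj(xr + i*xi) * (yr + i*yi) */
        re += xr * yr + xi * yi;
        im += xr * yi - xi * yr;
    }
    out[0] = re;
    out[1] = im;
}

/* Hypothetical check: returns 1 when b equals conj(a) within tol in each
 * component, i.e. equal real parts and opposite imaginary parts. */
int is_conjugate_pair(const double a[2], const double b[2], double tol)
{
    double dre = a[0] - b[0];
    double dim = a[1] + b[1];
    if (dre < 0.0) dre = -dre;
    if (dim < 0.0) dim = -dim;
    return dre <= tol && dim <= tol;
}
```

In the test program, one could call `ref_zdotc(8, x, y, expected)` and compare `expected` with the output of each `cblas_zdotc_sub` call, then pass the two cblas results to `is_conjugate_pair` with a small tolerance.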
@martin-frbg
Collaborator

martin-frbg commented Sep 15, 2020

Seems to be an optimizer bug that comes and goes... I think I saw this with gcc 10.1 but the patch (pragma GCC optimize "O1") seemed to be unnecessary with gcc 10.2 by the time my openpower account was reactivated. I'll take another look.

@martin-frbg
Collaborator

Not reproducible with current develop and gcc 8.4.0, gcc 9.3.0 or gcc 10.2.0 on Debian 11 ppc64le. (Also not the problem it first reminded me of.)

@martin-frbg
Collaborator

Not reproduced with develop, but reproduced with 0.3.10 on Fedora 32, though the only relevant change would seem to be my replacing the poorly portable __real, __imag initializations with OPENBLAS_MAKE_COMPLEX_FLOAT.

@martin-frbg
Collaborator

It was definitely the switch to OPENBLAS_MAKE_COMPLEX_FLOAT in the PPC zdot.c that fixed this.
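
For readers unfamiliar with the two styles being contrasted, here is a hedged illustration of building a complex return value from separately accumulated real and imaginary sums. It is a sketch only and not the code from the PPC zdot.c: `SKETCH_MAKE_COMPLEX` and `sketch_pack_result` are hypothetical names, the actual definition of OPENBLAS_MAKE_COMPLEX_FLOAT is not reproduced here, and the part-wise assignment shown uses the GNU C `__real__`/`__imag__` extension keywords.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical stand-in for a portable "make complex" macro: prefer the
 * C11 CMPLX macro when <complex.h> provides it, else fall back to
 * arithmetic with I. */
#if defined(CMPLX)
#define SKETCH_MAKE_COMPLEX(re, im) CMPLX((re), (im))
#else
#define SKETCH_MAKE_COMPLEX(re, im) ((double)(re) + (double)(im) * I)
#endif

/* Pack two finite accumulators into one complex result. */
double complex sketch_pack_result(double re_sum, double im_sum)
{
    double complex portable = SKETCH_MAKE_COMPLEX(re_sum, im_sum);
#if defined(__GNUC__)
    /* GNU C extension: assign the parts of a complex lvalue one by one.
     * This is the less portable style; it is compiled only where the
     * extension is available. */
    double complex gnu_style;
    __real__ gnu_style = re_sum;
    __imag__ gnu_style = im_sum;
    /* For finite inputs both constructions are expected to agree. */
    assert(gnu_style == portable);
#endif
    return portable;
}
```

The macro form builds the value in a single expression, which also makes it easy to swap in a different representation (for example a struct of two doubles) on compilers without C99 complex support.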

@susilehtola
Contributor Author

You mean 661c6bf and 21072e5?

@martin-frbg
Collaborator

Yes (where the exclusion of the microkernel has nothing to do with this particular problem)

@martin-frbg martin-frbg added this to the 0.3.11 milestone Sep 27, 2020