Latest Galera 4 (26.4.6) and Galera 3 (25.3.31) fail to build on arm64: CRC32 illegal instruction #582

Closed
ottok opened this issue Nov 1, 2020 · 7 comments

@ottok
Contributor

ottok commented Nov 1, 2020

While preparing Galera 4 and Galera 3 for upload to Debian, I noticed they failed to build on arm64:

From https://launchpadlibrarian.net/504795654/buildlog_ubuntu-groovy-arm64.galera-4_26.4.6-1~ubuntu20.10.1~1604258955.5df1ff6+master_BUILDING.txt.gz

Running suite(s): CRC32C implementations
75%: Checks: 4, Failures: 0, Errors: 1
galerautils/tests/gu_crc32c_test.c:66:E:gu_crc32c_hw_arm64:test_gu_crc32c_arm64:0: (after this point) Received signal 4 (Illegal instruction)

From https://launchpadlibrarian.net/504795631/buildlog_ubuntu-groovy-arm64.galera-3_25.3.31-1~ubuntu20.10.1~1604258374.f284fcf+master_BUILDING.txt.gz

Running suite(s): CRC32C implementations
gcc -o galerautils/tests/gu_to_test.o -c -std=c99 -fno-strict-aliasing -pipe -g -O2 -fdebug-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -g -O3 -DNDEBUG -pthread -fPIC -Wall -Wextra -Wno-unused-parameter -Wdate-time -D_FORTIFY_SOURCE=2 -D_XOPEN_SOURCE=600 -DHAVE_COMMON_H -DGALERA_USE_GU_NETWORK -DHAVE_BYTESWAP_H -DHAVE_ENDIAN_H -DHAVE_EXECINFO_H -DHAVE_TR1_ARRAY -DHAVE_BOOST_SHARED_PTR_HPP -DHAVE_TR1_UNORDERED_MAP -DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG=1 -DHAVE_ASIO_HPP -DOPENSSL_HAS_SET_ECDH_AUTO -DGALERA_ONLY_ALIGNED -Iasio -Iwsrep/src -I. -Igalerautils/src -Icommon galerautils/tests/gu_to_test.c
75%: Checks: 4, Failures: 0, Errors: 1
galerautils/tests/gu_crc32c_test.c:66:E:gu_crc32c_hw_arm64:test_gu_crc32c_arm64:0: (after this point) Received signal 4 (Illegal instruction)

If you have a quick fix for this, I can hold back the upload to Debian and try to include your fix.

@ottok
Contributor Author

ottok commented Nov 2, 2020

I went ahead and uploaded Galera 3 to Debian unstable, and it failed as expected. See https://buildd.debian.org/status/package.php?p=galera-3 -> https://buildd.debian.org/status/fetch.php?pkg=galera-3&arch=arm64&ver=25.3.31-1&stamp=1604344898&raw=0

Please advise.

@temeo
Contributor

temeo commented Nov 3, 2020

This looks like a bad CPU capabilities test and should be relatively easy to work around/fix. We will take a look at this asap.

@ottok
Contributor Author

ottok commented Nov 3, 2020 via email

@ayurchen
Member

ayurchen commented Nov 4, 2020

Otto,

I tried this with the Sid 20201104 public image on AWS on all three types of arm64 instances and could not reproduce it.

The error most likely means that the test for hardware CRC32 support returns true, but the attempt to use the corresponding function then fails. To me this looks like a problem with the environment: perhaps the build user or the container does not have sufficient privileges, or the kernel erroneously reports hardware CRC32 support.

Could you elaborate on any peculiarities of the environment that may come to mind?
Maybe you could get a stacktrace?
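
For readers following along, the failure mode described above (capability check says yes, instruction traps) can be pictured with a minimal sketch of runtime dispatch on Linux/arm64. This is not Galera's code: the names `cpu_has_crc32`, `crc32c_sw`, `crc32c_hw` and `crc32c_select` are hypothetical, and the sketch assumes the kernel's answer is read through `getauxval(AT_HWCAP)` and the `HWCAP_CRC32` bit. On other platforms it simply falls back to the software path.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_CRC32 */
#endif

/* Returns 1 if the kernel reports the ARMv8 CRC32 extension, else 0.
 * Assumption: the hardware-capability word is the source of truth. */
int cpu_has_crc32(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_CRC32)
    return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
#else
    return 0;
#endif
}

/* Portable bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78). */
uint32_t crc32c_sw(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#if defined(__aarch64__) && defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* __crc32cb() intrinsic */
/* Hardware path: only compiled when the compiler targets the CRC extension. */
uint32_t crc32c_hw(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    crc = ~crc;
    while (len--)
        crc = __crc32cb(crc, *p++);
    return ~crc;
}
#define HAVE_CRC32C_HW 1
#else
#define HAVE_CRC32C_HW 0
#endif

typedef uint32_t (*crc32c_fn)(uint32_t, const void *, size_t);

/* Pick the hardware routine only when both compiler and kernel agree. */
crc32c_fn crc32c_select(void)
{
#if HAVE_CRC32C_HW
    if (cpu_has_crc32())
        return crc32c_hw;
#endif
    return crc32c_sw;
}
```

Calling `crc32c_select()(0, "123456789", 9)` should yield the standard CRC32C check value `0xE3069283` on either path. A unit test that calls the hardware routine directly, without going through a check like `cpu_has_crc32()`, would receive SIGILL on a CPU that lacks the instruction.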

@ottok
Contributor Author

ottok commented Nov 4, 2020

As reported above, it is failing both on Launchpad.net builds and on buildd. I am sure they have different hardware. The logs at both Launchpad and Debian buildd are fully public and contain all details of the build environment.

I (or anybody, it's public and free) can upload new builds to Launchpad.net. Is there some debug code you would like me to run at the end of the build to check for hardware stuff?
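
One kind of debug code that could answer this on a build machine is a probe that runs a candidate routine under a temporary SIGILL handler and reports whether it trapped. The sketch below is only an illustration under stated assumptions: `runs_without_sigill`, `probe_noop` and `probe_raise_sigill` are hypothetical names, and the stand-in that calls `raise(SIGILL)` merely simulates an unsupported instruction so the probe can be exercised on any POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* Handler: jump back to the probe instead of letting the process die. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if fn() ran to completion, 0 if it raised SIGILL,
 * -1 if the handler could not be installed. Not thread-safe. */
int runs_without_sigill(void (*fn)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask = 1 so the signal mask is restored after siglongjmp(). */
    if (sigsetjmp(probe_env, 1) == 0) {
        fn();
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return ok;
}

/* Stand-ins for a supported and an unsupported routine. */
void probe_noop(void) { }
void probe_raise_sigill(void) { raise(SIGILL); }
```

In a real check, the function passed to `runs_without_sigill()` would execute a single hardware CRC32 instruction; comparing the result with what the kernel advertises would show whether the capability report and the actual CPU behaviour disagree.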

@ayurchen ayurchen self-assigned this Nov 6, 2020
@ayurchen ayurchen added the bug label Nov 6, 2020
@ayurchen ayurchen added this to the 3.32, 4.7 milestone Nov 6, 2020
@ayurchen
Member

ayurchen commented Nov 6, 2020

@ottok this patch should fix the issue: https://gist.github.com/ayurchen/77251795940805f21d4dd53fa04f087e

@ottok
Contributor Author

ottok commented Nov 7, 2020

I applied a modified version of the patch (so that it would apply cleanly) at https://salsa.debian.org/mariadb-team/galera-3/-/commit/cdc1cbba7420246bac4c9455d4492496d3ce5640 for Galera 3.

Launchpad builds pass now.

Same goes for Galera 4:
https://salsa.debian.org/mariadb-team/galera-4/-/commit/b20136f3d1f066946d20cddc54355cab8966a19c

So the issue seems indeed to be fixed.

Finally, the official Galera 3 build in Debian also confirmed this:

https://buildd.debian.org/status/package.php?p=galera-4

Thanks!

@ottok ottok closed this as completed Nov 7, 2020
raspbian-autopush pushed a commit to raspbian-packages/galera-3 that referenced this issue Nov 12, 2020
…ept to run the hardware unit test when not supported.

Refs: codership/galera#582

Gbp-Pq: Name arm-crc32-fix.patch