Weekly build triage for the week starting 2023/04/15 #3330

Closed
adamfarley opened this issue Apr 17, 2023 · 8 comments

@adamfarley
Contributor

This week's build triage summary:

This is the week (we assume/hope) of the April-2023 Temurin release. As such, no new builds were run over the weekend.

To get a feel for the build status prior to the release, I'll be focusing on the Temurin build jobs whose last execution failed. Link.

I'll list links to the failing builds below, each followed by the cause of failure.

@adamfarley
Contributor Author

adamfarley commented Apr 17, 2023

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk20u/job/jdk20u-linux-arm-temurin/9/
TEST FAIL: sanity.perf and extended.system failed on test-sxa-armv7l-ubuntu2004-odroid-1 with: AccessDeniedException: /ssd/jenkins

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk20u/job/jdk20u-aix-ppc64-temurin/5/console
BUILD FAIL: Several instances of this error in different files during build:
The builtin "__builtin_mul_overflow" is not supported.

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-linux-arm-temurin/192/console
TEST FAIL: sanity.perf and extended.functional failed on test-sxa-armv7l-ubuntu2004-odroid-1 with: AccessDeniedException: /ssd/jenkins

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-mac-x64-temurin/207/
INSTALLER FAIL: Apple refused to notarize the installer because:

Failed to notarize the requested file (status=invalid). Error code=OptionalInt.empty. Reason: Optional.empty
...
path\": \"OpenJDK11U-jdk_x64_mac_hotspot_2023-04-11-18-05-431574756988957424.pkg/net.temurin.11.jdk.pkg Contents/Payload/Library/Java/JavaVirtualMachines/temurin-11.jdk/Contents/Home/jmods/java.desktop.jmod/lib/libosxui.dylib\"
...
message\": \"The binary is not signed with a valid Developer ID certificate.

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-linux-arm-temurin/199/
TEST FAIL: sanity.perf and extended.functional failed on test-sxa-armv7l-ubuntu2004-odroid-1 with: AccessDeniedException: /ssd/jenkins

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-temurin/221/console
TEST FAIL: Four tests failed with a fatal error during java -version.

16:41:57  #  Internal Error (threadCritical_aix.cpp:45), pid=32309510, tid=258
16:41:57  #  guarantee(ret == 0) failed: fatal error with pthread_mutex_lock()

All failures occurred on test-osuosl-aix715-ppc64-<1-4>. Two other tests ran on test-osuosl-aix72-ppc64-<1-2> and didn't have this problem. Machine issue? Unknown, but we appear to have hit this issue (or something that looks exactly like it) four years ago on JDK13. That problem was specific to the OS version, and it doesn't look like it was ever fixed (we just ran on an earlier OS version instead), so this could well be the same issue.

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-solaris-x64-temurin/307/console
BUILD FAIL: Known Solaris issue with fix merged. PR link.

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-solaris-sparcv9-temurin/295/console
BUILD FAIL: Ditto.

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-linux-arm-temurin/266/consoleFull
TEST FAIL: sanity.perf and sanity.system failed on test-sxa-armv7l-ubuntu2004-odroid-1 with: AccessDeniedException: /ssd/jenkins

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-alpine-linux-x64-temurin/245/consoleFull
TEST FAIL: All tests failed because of a known issue.

DETECTED_JDK_VERSION value is 17, settled JDK_VERSION value is 8

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-aix-ppc64-temurin/276/
TEST FAIL: As above, the 715 machines all fail to run java -version with a fatal error in threadCritical_aix.cpp
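
The "DETECTED_JDK_VERSION value is 17, settled JDK_VERSION value is 8" message in the Alpine failure above is the kind of output a version sanity check produces. Below is a minimal sketch of such a check, assuming the usual `java -version` banner format. The function names and sample banners are illustrative and are not taken from the test framework itself.

```python
import re


def major_version(java_version_output: str) -> int:
    """Extract the major JDK version from `java -version` style output."""
    match = re.search(r'version "([^"]+)"', java_version_output)
    if match is None:
        raise ValueError("no quoted version string found")
    parts = re.split(r"[._+-]", match.group(1))
    # Legacy scheme: "1.8.0_362" denotes major version 8.
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])


def check_jdk_version(java_version_output: str, expected: int) -> None:
    """Raise if the JDK that actually ran is not the one the job expected."""
    detected = major_version(java_version_output)
    if detected != expected:
        raise RuntimeError(
            f"detected JDK version is {detected}, expected JDK version is {expected}"
        )


if __name__ == "__main__":
    # Illustrative banners only, not copied from the failing job's log.
    check_jdk_version('openjdk version "1.8.0_362"', expected=8)  # passes silently
    try:
        check_jdk_version('openjdk version "17.0.6" 2023-01-17', expected=8)
    except RuntimeError as err:
        print(err)
```

A check like this fails fast when the wrong JDK is on the path, which matches the symptom of every test failing at once.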

@adamfarley
Contributor Author

adamfarley commented Apr 17, 2023

3 biggest issues:

  • ssd/jenkins access denied issue on arm (Fixed).
  • Solaris illegal argument error (Fixed).
  • aix715 machines issue (Maybe not fixed. Pursue). (Update 2023/04/18: Fixed.)

@sxa
Member

sxa commented Apr 17, 2023

(Wrote this hours ago but apparently didn't click the button to add it to the issue!)

Ref the AIX ones:

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk20u/job/jdk20u-aix-ppc64-temurin/5/console BUILD FAIL: Several instances of this error in different files during build: The builtin "__builtin_mul_overflow" is not supported.

That /may/ be related to adoptium/jdk@f5c8b68, based on when it started failing in jdk21 builds.

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-aix-ppc64-temurin/276/ TEST FAIL: As above, the 715 machines all fail to run java -version with a fatal error in threadCritical_aix.cpp

https://ci.adoptium.net/view/Failing%20Temurin%20jobs/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-aix-ppc64-temurin/276 was run on an AIX 7.2 machine, so I would not expect it to run properly on AIX 7.1. That change was implemented in https://github.com/adoptium/ci-jenkins-pipelines/pull/622/files, and the testing is supposed to target only 7.2 machines, but the error suggests that's not taking effect as it should. Can you check that the test jobs and machines are labelled in a way consistent with that PR?
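
One quick way to see whether failures track the OS level is to bucket test results by the `aix…` token in the machine names quoted earlier in this thread (test-osuosl-aix715-ppc64-<1-4> and test-osuosl-aix72-ppc64-<1-2>). The sketch below assumes that token identifies the OS level. The result list is hypothetical sample data shaped like the report above, not real Jenkins output, and the helper names are made up for illustration.

```python
import re
from collections import defaultdict


def os_token(node_name: str) -> str:
    """Return the 'aix<digits>' token from a node name, or 'unknown'."""
    match = re.search(r"-(aix\d+)-", node_name)
    return match.group(1) if match else "unknown"


def failures_by_os(results):
    """Count passes and failures per OS token.

    `results` is an iterable of (node_name, passed) pairs.
    """
    summary = defaultdict(lambda: {"passed": 0, "failed": 0})
    for node_name, passed in results:
        summary[os_token(node_name)]["passed" if passed else "failed"] += 1
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical data: four failures on the 715 machines, two passes on 72.
    sample = [
        ("test-osuosl-aix715-ppc64-1", False),
        ("test-osuosl-aix715-ppc64-2", False),
        ("test-osuosl-aix715-ppc64-3", False),
        ("test-osuosl-aix715-ppc64-4", False),
        ("test-osuosl-aix72-ppc64-1", True),
        ("test-osuosl-aix72-ppc64-2", True),
    ]
    for token, counts in failures_by_os(sample).items():
        print(token, counts)
```

If every failure lands in one bucket and every pass in the other, machine labelling is a more likely culprit than the build itself.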

@adamfarley
Contributor Author

@sxa - I'm confused by the second PR link listed above. From your description, I was expecting some form of code that affects the selection of test machines based on AIX version. Instead, it seems to be a link to a harfbuzz version update.

Please verify that that's the right PR link. Thank you. :)

@adamfarley
Contributor Author

Also, my analysis suggests that the failed test jobs were launched by a build job that originally ran before your test label change was integrated.

Builds (and tests of those builds) after that point seem to run with the correct labels.

@sxa
Member

sxa commented Apr 18, 2023

Please verify that that's the right PR link. Thank you. :)

Edited to have the correct link.

@adamfarley
Contributor Author

adamfarley commented Apr 18, 2023

Here are the failing builds, grouped by issue, with a summary table at the top.

Summary

| Build failures | Upstream issue raised; possible problem | Issue raised; nonblocker | Fixed |
| --- | --- | --- | --- |
| 11 | 1 | 2 | 8 |
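
The summary counts can be cross-checked against the Details section of this comment with a small tally. This is only a sketch: the status labels are shorthand invented here, and assigning the mac and alpine jobs to "nonblocker" is an inference from the counts rather than something stated explicitly.

```python
from collections import Counter

# (job/build, status) pairs transcribed from the Details section.
# Status labels are shorthand for this sketch, not official categories.
DETAILS = [
    ("jdk20u-linux-arm-temurin/9", "fixed"),
    ("jdk17u-linux-arm-temurin/192", "fixed"),
    ("jdk11u-linux-arm-temurin/199", "fixed"),
    ("jdk8u-linux-arm-temurin/266", "fixed"),
    ("jdk20u-aix-ppc64-temurin/5", "upstream"),
    ("jdk11u-mac-x64-temurin/207", "nonblocker"),  # inferred category
    ("jdk11u-aix-ppc64-temurin/221", "fixed"),
    ("jdk8u-aix-ppc64-temurin/276", "fixed"),
    ("jdk8u-solaris-x64-temurin/307", "fixed"),
    ("jdk8u-solaris-sparcv9-temurin/295", "fixed"),
    ("jdk8u-alpine-linux-x64-temurin/245", "nonblocker"),  # inferred category
]


def tally(details):
    """Return the total number of failing builds and a count per status."""
    return len(details), Counter(status for _, status in details)


if __name__ == "__main__":
    total, by_status = tally(DETAILS)
    print("Build failures:", total)
    for status, count in by_status.items():
        print(f"{status}: {count}")
```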

Details

jdk20u-linux-arm-temurin/9
jdk17u-linux-arm-temurin/192
jdk11u-linux-arm-temurin/199
jdk8u-linux-arm-temurin/266
TEST FAIL: AccessDeniedException: /ssd/jenkins
Fixed. Issue link.

jdk20u-aix-ppc64-temurin/5
BUILD FAIL: The builtin "__builtin_mul_overflow" is not supported.
Not fixed. Upstream issue caused by this. Issue already exists here.

jdk11u-mac-x64-temurin/207
SIGN_INSTALLER FAIL: libosxui.dylib is unsigned
Cannot investigate due to this issue. Issue is intermittent though, so I'd advise rebuilding if the build fails, and solving the "sometimes" issue after the release (rather than risk introducing an issue that fails all the time).

jdk11u-aix-ppc64-temurin/221
jdk8u-aix-ppc64-temurin/276
TEST FAIL: Four tests failed with a fatal error in threadCritical_aix.cpp:45 when tested on aix715
Fixed. PR Link.

jdk8u-solaris-x64-temurin/307
jdk8u-solaris-sparcv9-temurin/295
BUILD FAIL: Known Solaris issue with fix merged. PR link.
Fixed. PR link.

jdk8u-alpine-linux-x64-temurin/245
TEST FAIL: DETECTED_JDK_VERSION value is 17, settled JDK_VERSION value is 8
Issue raised. A manual workaround exists, see final comment in the issue. Issue link.
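
The short identifiers used in this comment (for example jdk20u-linux-arm-temurin/9) are the trailing job name and build number of the Jenkins URLs posted earlier in the thread. A minimal sketch of that shortening, using only the standard library, follows; the function name is made up for illustration, and it assumes the URL contains a numeric build number after the job name, optionally followed by a view such as `console` or `consoleFull`.

```python
from urllib.parse import unquote, urlparse


def short_job_id(url: str) -> str:
    """Reduce a Jenkins build URL to '<job-name>/<build-number>'."""
    segments = [s for s in unquote(urlparse(url).path).split("/") if s]
    # Drop trailing views such as "console" until a build number is last.
    while segments and not segments[-1].isdigit():
        segments.pop()
    if len(segments) < 2:
        raise ValueError(f"no build number found in {url!r}")
    return f"{segments[-2]}/{segments[-1]}"


if __name__ == "__main__":
    base = (
        "https://ci.adoptium.net/view/Failing%20Temurin%20jobs/"
        "job/build-scripts/job/jobs/job/"
    )
    for tail in (
        "jdk20u/job/jdk20u-linux-arm-temurin/9/",
        "jdk20u/job/jdk20u-aix-ppc64-temurin/5/console",
        "jdk8u/job/jdk8u-linux-arm-temurin/266/consoleFull",
    ):
        print(short_job_id(base + tail))
```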

@adamfarley
Contributor Author

Triage complete. No more builds this week. Release in progress. Closing.
