JDK22: Riscv: Test jobs randomly "Terminated" on specific riscv machines #3669
Comments
@sxa - Tagging because Stewart may already be aware of this.
Seen also in dry-run triage for JDK22. Example from https://ci.adoptium.net/job/Test_openjdk22_hs_sanity.openjdk_riscv64_linux_testList_3/16/console
test-rise-ubuntu2404-riscv64-5 through to 7 were all created around the time these failures were first observed (Issue: #3598 (comment)). They are running the same kernel as the earlier numbered 2404 machines, and have the same amount of swap as the earlier ones so there should be no difference in behavior. Having said that, based on the information in this issue I'm going to remove
Had a (perhaps obvious) brainwave ... The issue is that some of the agent definitions are pointing to duplicate machines, which is why we're getting terminations: two test jobs running on a single machine will be trying to clear up each other's processes. The config in Jenkins is correct, but I suspect that when the definition was duplicated, the new agent started up connecting to the machine that was the source of the copy and didn't get a disconnect/reconnect cycle.
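For anyone wanting to check for this kind of overlap, here is a rough sketch of the idea, not the tooling actually used here. It assumes you have already collected, by whatever means, the hostname each connected agent reports for itself (for example by running `hostname` through each agent). The function name `find_shared_hosts` and the agent and host names in the sample data are made up for illustration.

```python
from collections import defaultdict


def find_shared_hosts(reported_hosts):
    """Return {host: [agent names]} for every host reported by more than one agent.

    reported_hosts maps an agent name to the hostname that agent reports for
    itself. How that mapping is gathered is outside this sketch.
    """
    by_host = defaultdict(list)
    for agent, host in reported_hosts.items():
        # Normalise case and whitespace so "Box-1 " and "box-1" compare equal.
        by_host[host.strip().lower()].append(agent)
    return {
        host: sorted(agents)
        for host, agents in by_host.items()
        if len(agents) > 1
    }


if __name__ == "__main__":
    # Hypothetical data: agent-b was copied from agent-a and is still
    # connected to agent-a's machine.
    sample = {
        "agent-a": "box-1",
        "agent-b": "box-1",
        "agent-c": "box-3",
    }
    for host, agents in find_shared_hosts(sample).items():
        print(f"{host} is shared by: {', '.join(agents)}")
```

Any host that shows up in the result is a candidate for two jobs killing each other's processes; disconnecting and reconnecting the affected agents should make the duplicates disappear from the output if the theory above is right.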
Summary
The JDK22 Riscv test jobs seem to fail near-constantly with makefile "Terminated" messages.
Details
Most JDK22 Riscv test jobs seem to fail near-constantly, but only after July 3rd mid-afternoon-ish, and only on specific machines.
This may or may not be a related issue, as the test jobs that pass are only run on a specific subset of the riscv machines, and the jobs that fail are only run on a separate subset. There doesn't appear to be a difference in tag, so it may just be random chance.
Machines where failures are seen:
Machines where passes are seen:
Example of a job where one testlist always seems to pass, and the other always seems to fail:
https://ci.adoptium.net/job/Test_openjdk22_hs_extended.perf_riscv64_linux/27/
Error message
Many look like this:
But there are variations, like this one:
I'm lumping these all in together due to the similarity in jdk version, platform, and the "Terminated" part, but these may prove to be separate issues after the first problem is resolved.
URLs