Replace Ubuntu 23.10 Scaleway machines with 24.04 LTS #3598

Open
Tracked by #3588
sxa opened this issue Jun 14, 2024 · 8 comments

@sxa
Member

sxa commented Jun 14, 2024

Existing machines will be out of support soon, so using the LTS release would be preferable now that Scaleway can provide Ubuntu 24.04

Part of #3589

@sxa
Member Author

sxa commented Jun 14, 2024

First machine is set up and tests are being run.

Once the results are verified as good we can start migrating the others (currently 10 in total). Note that Scaleway now also offers Debian (unstable) and Fedora 37, so it may be worth provisioning one of each for testing too.

@sxa
Member Author

sxa commented Jun 14, 2024

Added https://ci.adoptium.net/computer/test-rise-fedora37-riscv64-1
Running playbook now (various issues, covered in #3599)
AQA pipeline: https://ci.adoptium.net/job/AQA_Test_Pipeline/293/

@sxa
Member Author

sxa commented Jun 14, 2024

Ubuntu 24.04 / JDK21 / riscv64 (Nightly)

Test suite Result ✅⚠️
sanity.functional
extended.functional Unrecognized VM option 'EnableExtendedHCR' from IllegalAccessProtectedMethodTest_0 suite Issue raised adoptium/aqa-tests#5393 - regrind after potential fix - Re-run of full test run at https://ci.adoptium.net/job/Test_openjdk21_hs_extended.functional_riscv64_linux/53/ (which will hopefully exclude the target) YES, PASS
special.functional
sanity.openjdk java_lang VarHandleTestAccessShort Grinder*100#10389 PASSED 98/100 Failure is test VarHandleTestAccessShort.testAccess("VarHandle -> Array", VarHandle -> Array):: java.lang.AssertionError: success weakCompareAndSetPlain short expected [true] but found [false]
extended.openjdk
sanity.system
extended.system
sanity.perf
extended.perf dacapo-fop_0 failure - core dump Fatal glibc error: pthread_mutex_lock.c:94 (___pthread_mutex_lock): assertion failed: mutex->__data.__owner == 0 Grinder*10#10390 PASSED 10/10

Fedora37 / JDK21 / riscv64 (nightly)

Test suite Result ✅⚠️
sanity.functional
extended.functional IllegalAccessProtectedMethodTest_0 Unrecognized VM option 'EnableExtendedHCR' Issue raised
special.functional
sanity.openjdk java/lang/Error in java_util MultipleProducersSingleConsumerLoops.java Grinder*100#10391 Passed 99/100
extended.openjdk 39 failures Grinder*10#10393 Only ran first test with hotspot_custom - running with jdk_custom at G#10407
sanity.system
extended.system
sanity.perf
extended.perf

Most of the extended.openjdk failures on F37 are Datagram/Multicast tests, so they are likely related to the network configuration on their deployed system (maybe IPv6-related in some cases?).

Ubuntu 24.04 / JDK17 / riscv64 (Nightly)

Test suite Result ✅⚠️
sanity.functional ⚠️ cmdLineTester_libpathTestRtfChild_0 failure due to missing libawt_xawt.so (Headless build) (xml file) Grinder re-run link. Related issue Grinder*10@10414 FAILED 10/10
extended.functional
special.functional
sanity.openjdk
extended.openjdk 14 jpackage failures (existing issue) plus failure in SSLSocketAlpnTest Grinder*100#10428 PASS 99/100
sanity.system
extended.system Failed LockingLoadTest_0 (Hung process) Re-grind*100@10412 PASSED 99/100
sanity.perf
extended.perf Failed renaissance-dec-tree_0 Crash - fatal error: refcount underflow Internal Error (symbol.cpp:335) Regrind*10@10413 PASSED 10/10

Fedora37 / JDK17 / riscv64 (Nightly)

Test suite Result ✅⚠️
sanity.functional Same cmdLineTester_libpathTestRtfChild_0 failure as Ubuntu JDK17
extended.functional IllegalAccessProtectedMethodTest_0 (J9 test failure) but didn't fail on Ubuntu? Maybe fixed via adoptium/aqa-tests#5393
special.functional
sanity.openjdk ❌ Failed java/lang PublicMethodsTest.java (crash), java/util CurrencyTests.java and java/util SpinedBufferTest.java - Re-grinding*100#10429 ALL THREE PASSED 100/100
extended.openjdk ❌ 38 failures - similar to JDK21
sanity.system
extended.system
sanity.perf
extended.perf ❌ Terminating after 17h Running again Failed dacapo_jython_0 and renaissance-gauss-mix_0 - Re-running with 20 iterations

@sxa
Member Author

sxa commented Jun 26, 2024

Four new machines provisioned and used to replace the 6-10 numbered ubuntu2310 systems.
Installed via playbooks with a hosts entry test-rise-ubuntu2410-riscv64-[1:4]
Two extras also provisioned for temurin-compliance that will be set up in parallel.
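
For reference, an Ansible inventory range such as `[1:4]` is inclusive at both ends, so the hosts entry above covers four machines. The following is a minimal sketch of that expansion; `expand_host_range` is a hypothetical helper written for illustration, not part of Ansible, and it ignores features such as zero-padded or stepped ranges.

```python
import re


def expand_host_range(pattern: str) -> list[str]:
    """Expand a single numeric '[start:end]' range into hostnames.

    Hypothetical helper: mimics the inclusive behaviour of an Ansible
    inventory range for the simple case of one unpadded numeric range.
    """
    match = re.search(r"\[(\d+):(\d+)\]", pattern)
    if match is None:
        # No range in the pattern, so it names exactly one host.
        return [pattern]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    # Both ends of the range are included.
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]


hosts = expand_host_range("test-rise-ubuntu2410-riscv64-[1:4]")
for host in hosts:
    print(host)
```

This prints the four names ending in -1 through -4, one per line.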

Noting:

  • crontab wasn't on the systems so the Crontab role failed. Installed manually for now
  • The playbook tried to run the gcc_7 role (and probably other versions) and the adoptopenjdk_install role for JDK11, which are not applicable on RISC-V, so that should be mitigated in future
  • The jdk21 "special case" may be broken - now that it is GA we should aim to replace that with the regular logic (also noting that the JDK19 section in the adoptopenjdk_install role for riscv64 can be removed as we don't need it for anything)
  • Eventually also skipped adoptopenjdk_install (GPG signature verification failed) and ant (which worked on 1/4 machines in one of the runs but had an issue with the certfile directive)
  • nagios_plugins role did not run successfully, so that has been skipped (although that skipped Get_Vendor_Files, which broke the jenkins user role, so I commented out that tag from Get_Vendor_Files)

@sxa
Member Author

sxa commented Jun 27, 2024

Most of the problems above were resolved by switching to an Ubuntu 24.04 base with the versions of Python and Ansible installed with the OS. The underlying message was "An unknown error occurred: HTTPSConnection.__init__() got an unexpected keyword argument 'cert_file'", which was introduced in ansible-core 2.12 and rendered it incompatible with Ubuntu 20.04's Python 3.8. I had been using 2.13.3 installed via pip on Ubuntu 20.04.

From https://docs.python.org/3.12/library/http.client.html#http.client.HTTPSConnection
Changed in version 3.12: The deprecated key_file, cert_file and check_hostname parameters have been removed.

Ansible reference: ansible/ansible#83213 (comment)
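
To illustrate the Python change quoted above, here is a minimal sketch of how a caller can check whether the running interpreter still accepts `cert_file`, and how a client certificate is supplied through an `ssl.SSLContext` instead. The function names `https_connection_accepts_cert_file` and `make_connection` are hypothetical and chosen for this example; this is not how Ansible itself addressed the problem.

```python
import http.client
import inspect
import ssl


def https_connection_accepts_cert_file() -> bool:
    """Report whether HTTPSConnection still takes the cert_file keyword."""
    params = inspect.signature(http.client.HTTPSConnection.__init__).parameters
    return "cert_file" in params


def make_connection(host, cert_file=None, key_file=None):
    """Build an HTTPSConnection without using the removed keywords.

    The client certificate, if any, is loaded into an SSLContext, which
    works both before and after Python 3.12.
    """
    context = ssl.create_default_context()
    if cert_file is not None:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # Constructing the object does not open a network connection.
    return http.client.HTTPSConnection(host, context=context)


print("cert_file keyword accepted:", https_connection_accepts_cert_file())
conn = make_connection("example.invalid")
print("connection object created for", conn.host)
```

On Python 3.12 and later the first line reports False, which matches the error message seen when older code passes `cert_file` directly.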

@sxa
Member Author

sxa commented Jun 27, 2024

All four Ubuntu 24.04 machines are now live in Jenkins and will be used from now on. I have marked all of the 23.10 ones offline for now, with the intention of running a full AQA test run on the 24.04 ones over the weekend and then decommissioning the older ones on Monday, replacing them with more 24.04 ones.

Full list of the machines

@sxa
Member Author

sxa commented Jun 28, 2024

aqa_test_pipelines submitted for -3 and -4:

@sxa
Member Author

sxa commented Jul 2, 2024

New 24.04 machines 5-7 created to replace 23.10 machines 3-5. Added to the PR at 2cc5cf8

Projects
Status: In Progress