Ansible request for ensuring jenkins agents are run with JDK11+ #2763

Closed
sxa opened this issue Oct 4, 2022 · 23 comments

@sxa
Member

sxa commented Oct 4, 2022

Delete as appropriate from this list:

  • Missing install

Details:
At the end of June, Jenkins announced in a blog post that Jenkins 2.357 and the forthcoming 2.361.1 LTS would require Java 11 or 17 on the server side (we are now running ours with Temurin 17).

With those new versions, using Java 8 for the jenkins agent systems will NOT BE SUPPORTED.

For this reason, we need to upgrade the jenkins agents so that they all run Java 11 (or later) before we perform the next Jenkins LTS upgrade. We are currently on 2.346.3 (the previous LTS) and will need to look at upgrading to 2.361.1. But before that, we need to ensure all the jenkins agents are running a suitable version.

Solaris in particular only has Java 8 on it (Temurin does not produce a later one) but there are versions of JDK11 for Solaris available from:

Or we could try building our own JDK11, but I don't think our machine configurations supported that the last time I tried.

FYA @karianna and I'll tag @steelhead31 too since he likes playing with Solaris!

@sxa
Member Author

sxa commented Oct 4, 2022

So far none of the above options are proving to be feasible for Solaris 10 due to the dependency on the Solaris 11 posix_fallocate symbol as referenced in this article.

@sxa
Member Author

sxa commented Oct 13, 2022

We should aim for JDK17 on all systems (For AIX we can use a nightly build or GA candidate)

@sxa
Member Author

sxa commented Dec 5, 2022

> So far none of the above options are proving to be feasible for Solaris 10 due to the dependency on the Solaris 11 posix_fallocate symbol as referenced in this article.

Fixed. A prebuilt Java 11 will run with a 'fake' posix_fallocate (not a very common function) if you set LD_PRELOAD in the environment to a trivial library created as follows. (It should print a message if the function is called, but the jenkins agent does not appear to trigger it.)

cat > fallocate.c << EOT
#include <stdio.h>
#include <sys/types.h>
/* Stub that provides the posix_fallocate symbol missing on Solaris 10 */
int posix_fallocate(int fd, off_t offset, off_t len)
{
  fprintf(stderr, "posix_fallocate() called but stubbed out\n");
  return 0; /* report success without allocating anything */
}
EOT
cc -G -m64 -o fallocate.so -Kpic fallocate.c
$ LD_PRELOAD=$PWD/fallocate.so /usr/lib/jvm/zulu11.60.19-ca-jdk11.0.17-solaris/bin/java -version
openjdk version "11.0.17" 2022-10-18 LTS
OpenJDK Runtime Environment Zulu11.60+19-CA (build 11.0.17+8-LTS)
OpenJDK 64-Bit Server VM Zulu11.60+19-CA (build 11.0.17+8-LTS, mixed mode)

A couple of extra adjustments were needed to make git work properly because it was, by default, inheriting the LD_PRELOAD setting from the java process. Since git was a 32-bit application and was trying to use the symbol, I created a small wrapper around git to squash LD_PRELOAD:

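mkdir -p /usr/local/sxabin
cat > /usr/local/sxabin/git << 'EOT'
#!/bin/sh
# Clear LD_PRELOAD so the 32-bit git does not try to load the 64-bit stub
unset LD_PRELOAD
/usr/local/bin/git "$@"
EOT
chmod +x /usr/local/sxabin/git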

I then set the jenkins configuration for the agent so that Tool locations -> git points to /usr/local/sxabin/git, and also put /usr/local/sxabin first in the PATH. This appears to have worked: https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-solaris-x64-temurin/201/console

@steelhead31
Contributor

This looks like a good solution to me. The only real impact is on the jenkins agent, and it's probably cleaner (and less prone to errors) than running via ssh forwarding or similar.

@sxa
Member Author

sxa commented Dec 6, 2022

Unfortunately I hadn't noticed that the linked job ran on the old Solaris machine (I thought it had been disabled) so this isn't fully working yet as the LD_PRELOAD is going to sub-processes and since most of them are 32-bit the library causes a failure:

ld.so.1: sh: fatal: /usr/lib/jvm/fallocate-preload.so: wrong ELF class: ELFCLASS64
Killed
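
The "wrong ELF class" failure comes from a 32-bit process being asked to load a 64-bit library. A minimal sketch for checking which class a binary or library is, assuming a POSIX `od` is available; `elf_class` is a hypothetical helper name, not something used on the machines:

```shell
#!/bin/sh
# elf_class FILE: print 32 or 64 depending on the ELF class of FILE.
# Hypothetical helper for illustration only.
elf_class() {
  # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  if [ "$magic" != "7f454c46" ]; then
    echo unknown
    return 1
  fi
  # The byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.
  cls=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
  case "$cls" in
    1) echo 32 ;;
    2) echo 64 ;;
    *) echo unknown; return 1 ;;
  esac
}
```

For example, `elf_class /usr/lib/jvm/fallocate-preload.so` and `elf_class /usr/local/bin/git` would be expected to report different classes on the machine described above.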

@sxa
Member Author

sxa commented Dec 6, 2022

Considering not pursuing this and going down another route: build JDK11 on a Solaris 11 system but adjust the code so it doesn't require posix_fallocate.

@sxa
Member Author

sxa commented Dec 8, 2022

@ptribble's 11.0.2 works on Solaris 10/x64: https://pkgs.tribblix.org/openjdk/openjdk11.0.2-s10-x86_64.tar.bz2
sha256sum a5484bd35ed15ea7dc97870cea470aedf0c713ecca8075e57954a70e8b32cd89
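
A minimal sketch of checking a downloaded tarball against the checksum above. It assumes a `sha256sum` command is on the PATH (a Solaris 10 machine may need a different tool), and `verify_sha256` is a hypothetical helper name:

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX: succeed only if FILE's SHA-256 matches.
# Hypothetical helper; assumes sha256sum is installed.
verify_sha256() {
  actual=$(sha256sum "$1" | awk '{print $1}')
  if [ "$actual" = "$2" ]; then
    echo "OK: $1"
  else
    echo "MISMATCH: $1 (got $actual)" >&2
    return 1
  fi
}
```

Usage would be along the lines of `verify_sha256 openjdk11.0.2-s10-x86_64.tar.bz2 a5484bd35ed15ea7dc97870cea470aedf0c713ecca8075e57954a70e8b32cd89` before extracting the archive.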

@ptribble

ptribble commented Dec 8, 2022

Using LD_PRELOAD_64 rather than a bare LD_PRELOAD ought to restrict it to 64-bit processes, if you want to pursue that route.

@sxa
Member Author

sxa commented Dec 9, 2022

> Using LD_PRELOAD_64 rather than a bare LD_PRELOAD ought to restrict it to 64-bit processes, if you want to pursue that route.

Thank you! I don't think I've used that before but it's EXACTLY what I needed and https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-solaris-x64-temurin/208/ has successfully built with the job running via a JDK11u jenkins agent on the machine.

So, for reference:

  • I built the shared library from the earlier comment on the machines and put it in /usr/lib/jvm
  • Extracted the X64 or SPARC version of Zulu under /usr/lib/jvm
  • Adjusted the jenkins agent startup to have a JavaPath of /usr/lib/jvm/zulu11.60.19-ca-jdk11.0.17-solaris/bin/java and a Prefix Start Agent Command of export LD_PRELOAD_64=/usr/lib/jvm/fallocate.so &&. Or, if using a startup script, use LD_PRELOAD_64=/usr/lib/jvm/fallocate-preload.so /usr/lib/jvm/zulu11.60.19-ca-jdk11.0.17-solaris/bin/java [...]
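
The startup-script variant of the steps above could be sketched roughly as follows. The two paths are the ones quoted in the list; the names `AGENT_JAVA`, `PRELOAD_LIB` and `start_agent` are hypothetical, and the agent arguments are deliberately left to the caller:

```shell
#!/bin/sh
# Sketch of an agent startup wrapper (hypothetical names, illustration only).
AGENT_JAVA="${AGENT_JAVA:-/usr/lib/jvm/zulu11.60.19-ca-jdk11.0.17-solaris/bin/java}"
PRELOAD_LIB="${PRELOAD_LIB:-/usr/lib/jvm/fallocate-preload.so}"

start_agent() {
  if [ ! -x "$AGENT_JAVA" ]; then
    echo "java not found at $AGENT_JAVA" >&2
    return 1
  fi
  # LD_PRELOAD_64 is set only for this command. On Solaris it applies to
  # 64-bit processes, so 32-bit children such as sh or git ignore the stub.
  LD_PRELOAD_64="$PRELOAD_LIB" "$AGENT_JAVA" "$@"
}
```

Calling `start_agent` with whatever arguments the agent is normally launched with keeps the preload setting in one place instead of in each node's configuration.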

There were a few messages showing in the startup log on SPARC, but that appears to be building ok too.

FYI @speakjava - we might have a solution :-)

@sxa
Member Author

sxa commented Jan 11, 2023

We still have over 50 systems running JDK8 as the agent. Thanks to @steelhead31 for collating this list. We can check these off as they are fixed. Also need to ensure that the playbooks deploy new ones with JDK11/17 available (and ideally as the default)

Windows (14)

  • build-alibaba-win2012r2-x64-1
  • build-alibaba-win2012r2-x64-2
  • build-azure-win2012r2-x64-1
  • build-azure-win2012r2-x64-2
  • build-azure-win2012r2-x64-4-sxa
  • build-azure-win2016-x64-1
  • build-ibmcloud-win2012r2-x64-1
  • build-ibmcloud-win2012r2-x64-2
  • test-azure-win2012r2-x64-1
  • test-azure-win2012r2-x64-3
  • test-azure-win2016-x64-1
  • test-azure-win2019-x64-1
  • test-ibmcloud-win2012r2-x64-1
  • test-ibmcloud-win2012r2-x64-2

Linux build + test (22)

  • build-digitalocean-centos69-x64-2
  • build-osuosl-centos74-ppc64le-1X
  • build-osuosl-centos74-ppc64le-2
  • docker-osuosl-ubuntu2004-ppc64le-1
  • test-aws-rhel76-armv8-1
  • test-aws-rhel8-x64-1
  • test-docker-alpine316-aarch64-1 (removed from Jenkins)
  • test-docker-ubuntu1804-armv8l-2 (removed from Jenkins)
  • test-docker-ubuntu1804-armv8l-4
  • test-docker-ubuntu2110-armv8l-1
  • test-docker-ubuntu2204-armv8l-2
  • test-ibmcloud-rhel7-x64-1
  • test-ibmcloud-ubuntu1604-x64-1
  • test-osuosl-centos74-ppc64le-1
  • test-osuosl-centos74-ppc64le-2
  • test-osuosl-ubuntu1604-ppc64le-1
  • test-osuosl-ubuntu1604-ppc64le-2
  • test-osuosl-ubuntu1804-ppc64le-1
  • test-osuosl-ubuntu1804-ppc64le-2
  • test-osuosl-ubuntu2004-ppc64le-1
  • test-scaleway-ubuntu1604-x64-1
  • test-skytap-ubuntu2004-ppc64le-1

Others (16)

  • build-osuosl-aix71-ppc64-1
  • build-osuosl-aix71-ppc64-2
  • build-packet_esxi-solaris10u11-x64-1
  • build-siteox-solaris10u11-sparcv9-1
  • infra-ibmcloud-vagrant-x64-1.1
  • infra-ibmcloud-vagrant-x64-1.2
  • infra-ibmcloud-vagrant-x64-1.3
  • infra-ibmcloud-vagrant-x64-1.4
  • infra-ibmcloud-vagrant-x64-1.5
  • test-osuosl-aix715-ppc64-1 p9-aix1-adopt05.osuosl.org
  • test-osuosl-aix715-ppc64-2 adopt06
  • test-osuosl-aix715-ppc64-3 adopt07
  • test-osuosl-aix715-ppc64-4 adopt08
  • test-osuosl-aix72-ppc64-1 adopt03
  • test-osuosl-aix72-ppc64-2 adopt04
  • test-siteox-solaris10u11-sparcv9-1

Note that, from the comments in #2847, the Solaris 10 machines will require Bellsoft Liberica 11 to be used (Azul's seems to result in high CPU load after a while on Solaris 10), and it requires the fallocate preload mentioned in an earlier comment.

@sxa
Member Author

sxa commented Feb 2, 2023

@steelhead31 @Haroon-Khel I've just adjusted the last comment to categorise the machines (Windows/Linux build+test/Other). I reckon at this point we should set an hour aside and do this. Would you be interested in doing it with an open call to discuss any issues? FYI @karianna in case you have someone who has access and would like to take on the windows subset :-)

@steelhead31 Can you easily tell how many of the ones already migrated are using a jenkins config that points to a specific java as opposed to the default version? (I'm not sure if the command lines you were getting would say just java in the latter case)

@steelhead31
Contributor

@sxa / @Haroon-Khel sounds like a good idea to me. I'll produce an up-to-date (and complete) list of the current state this afternoon in preparation for doing this. I'm not sure how easy it is to tell default java from specified, as it just shows a path. Let's see if the updated list offers any insight.

@sxa
Member Author

sxa commented Feb 2, 2023

I'm somewhat in two minds about whether to change the default or override in jenkins (and if we override we need to bear in mind me raising #2912 recently!) But I think an override for the remaining ones, then possibly look at adjusting the defaults and resetting the override after 2912 goes in is probably my preferred approach ...

@Haroon-Khel
Contributor

+1 from me on overriding. From what I've seen, the default is usually 8, so unless you guys have an objection I see no reason why we shouldn't override it with 17 (on platforms that have 17).

@sxa
Member Author

sxa commented Feb 2, 2023

@steelhead31 I guess the table from #2879 (comment) would suggest that the ones with an "empty" Java version column are using the default on the system.

@steelhead31
Contributor

> @steelhead31 I guess the table from #2879 (comment) would suggest that the ones with an "empty" Java version column are using the default on the system.

I believe so, for the online ones at least. They will be using the system default, or they may have been connected from the node itself with a different JDK from the default. I'll audit a few of them and see if it's easy to tell those two cases apart.
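
A minimal sketch of one way such an audit could classify an agent's Java, assuming the `java -version` output contains a `version "..."` string like the Zulu output earlier in this thread; `java_major` is a hypothetical helper name:

```shell
#!/bin/sh
# java_major TEXT: print the major Java version found in `java -version` output.
# Hypothetical helper for illustration only.
java_major() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  [ -n "$ver" ] || return 1
  case "$ver" in
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_352).
    1.*) ver=${ver#1.}; echo "${ver%%.*}" ;;
    # Java 9 and later lead with the major version (for example 11.0.17).
    *) echo "${ver%%[.+-]*}" ;;
  esac
}
```

Since `java -version` writes to stderr, a call would look like `java_major "$(java -version 2>&1)"`, and anything below 11 would flag a machine that still needs attention.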

@sxa
Member Author

sxa commented Feb 2, 2023

Looks like there's plenty of precedent and no objections to tweaking the jenkins agent config so I suggest we go for that, using /usr/lib/jvm/jdk-17 where available and a suitable platform-specific alternative elsewhere.
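
The "jdk-17 where available, alternative elsewhere" rule could be sketched as a simple first-match lookup. Only /usr/lib/jvm/jdk-17 comes from this thread; `pick_java` and any other candidate directories passed to it are hypothetical:

```shell
#!/bin/sh
# pick_java DIR...: print the first DIR/bin/java that exists and is executable.
# Hypothetical helper for illustration only.
pick_java() {
  for home in "$@"; do
    if [ -x "$home/bin/java" ]; then
      echo "$home/bin/java"
      return 0
    fi
  done
  echo "no usable java found" >&2
  return 1
}
```

A call such as `pick_java /usr/lib/jvm/jdk-17 "$PLATFORM_FALLBACK"` (with `PLATFORM_FALLBACK` standing in for whatever JDK location a given platform uses) would return the path to put in the agent's JavaPath setting.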

@Haroon-Khel
Contributor

Haroon-Khel commented Feb 3, 2023

@sxa I'm guessing build-azure-win2012r2-x64-4-sxa is on your home network? I can't get into it.

Since I deleted the jenkins service on build-alibaba-win2012r2-x64-1 and I can't get it back up, for now I have the jenkins agent running in a background process with java17.

Steps to change the java path for the jenkins service:

  • In the jenkins user folder (C:\Users\jenkins usually) there should be a jenkins-slave.xml file
  • Open it with an editor and change the executable path to your desired java path
  • Save and close the file
  • Restart the jenkins service in Services
  • Check the System Information page for the jenkins node to see if the update has taken place
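
For anyone scripting the edit rather than using an editor, a rough sketch follows. It assumes a POSIX shell is available on the Windows machine (for example Cygwin), that the file holds the java path in a single-line `<executable>` element, and that the new path uses forward slashes and contains no `|` or `&` characters. `set_agent_java` is a hypothetical helper name:

```shell
#!/bin/sh
# set_agent_java XML JAVA_PATH: rewrite the <executable> element in XML.
# Hypothetical helper; keeps a .bak copy of the original file.
set_agent_java() {
  xml="$1"
  java="$2"
  cp "$xml" "$xml.bak" || return 1
  # Replace whatever path is currently inside <executable>...</executable>.
  sed "s|<executable>[^<]*</executable>|<executable>$java</executable>|" \
    "$xml.bak" > "$xml"
}
```

The service still has to be restarted afterwards, and the System Information page remains the way to confirm the change took effect.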

@sxa
Member Author

sxa commented Feb 3, 2023

> @sxa I'm guessing build-azure-win2012r2-x64-4-sxa is on your home network? I cant get into it

No, the only ones hosted by me are the test-sxa ones. I'll check that definition as I think it's one of a set that I used when working with Andrew on reproducible builds. It probably just needs to be deleted.

@sxa
Member Author

sxa commented Feb 3, 2023

Actions arising from today's activities:

release
You are trying to get resource http://47.111.84.87:8080/jnlpJars/remoting.jar but it is not in cache and could not be downloaded. Attempting to continue, but you may expect failure
JAR http://47.111.84.87:8080/jnlpJars/remoting.jar not found. Continuing.
JAR http://47.111.84.87:8080/jnlpJars/remoting.jar not found. Continuing.
netx: Initialization Error: Could not initialize application. (Fatal: Initialization Error: Unknown Main-Class. Could not determine the main class for this application.)

Follow-on actions: Change all the definitions once we implement #2912 ;-)

@sxa
Member Author

sxa commented Feb 23, 2023

Based on the new column from the plugin in #2950 there are five machines still running JDK8 for the agent:

@steelhead31 steelhead31 moved this to In Progress in Adoptium 1Q 2023 Plan Feb 28, 2023
@steelhead31
Contributor

The final machines have all been upgraded to JDK17 for jenkins agents.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Adoptium 1Q 2023 Plan Mar 1, 2023