-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8256127: Add cross-compiled foreign architectures builds to submit workflow #1147
Conversation
👋 Welcome back shade! A progress list of the required criteria for merging this PR into |
c8b17b8
to
fe71198
Compare
5016df7
to
bed8d47
Compare
Webrevs
|
Awesome! Thanks a lot for doing this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aleksey,
I'm not at all sure I want to accept these changes. I understand that you care about these exotic platforms, and so do I, but adding them to the submit workflow for all changes seem like the wrong way to proceed.
My concern vary all the way from increased build time, to increased usage of the Github CPU quota (which, as I understand it, is fairly large but not unlimited), to the elephant in the room: the question on who the burden lays to fix bugs in these exotic platforms. It is not at all obvious that these builds should be tested before commit.
I think this change warrants a much wider and deeper discussion about the purpose of the Github commit hook actions, how the free Github CPU quota should be spent, and on whom the responsibility lays to either make sure no. commit is pushed that breaks an exotic platform, or to rectify the problem with a follow-up issue.
I recommend you withdraw this PR and instead initiate a wider discussion on a suitable mailing list, e.g. jdk-dev.
Maybe it would be possible to build hotspot only as this is the part where things often break? This would save a lot of CPU time. |
I understand the first reaction, but please read the updated description. It includes the discussion what bugs it tries to catch, and updated cost estimates. Submit workflow is a convenience thing for better testing coverage. Having the early warnings for fixable issues in foreign architectures helps to avoid follow-up churn without increasing the costs all too much. Do note that it does not require contributors to fix the runtime issues in foreign architectures, it only asks for passing a rather low bar of not breaking the builds for them, which is, in my experience, an easily achievable goal. It is easily achievable because -- I am speaking from experience of fixing a lot of them over the years! -- the overwhelming majority of arch-specific build breakages are about silly things that are obvious from the build logs. So I am leaving this PR open, letting others to chime in. If there is really a need for a wider discussion here, I could start a thread at jdk-dev@, but so far it does not look as a good use of our collective time. Please read through the description, and see if that soothes your concerns. Aside: the issues about costs were raised before, but I think all of them resolve as "enjoy your free tier". If we target to optimize the GH action costs, IMO we should instead consider to avoid triggering them on every push, as I hinted before, and which I think requires better Skara integration (i.e. for |
I think this is a really useful feature and I'm a little surprised that it gets rejected. Please find my comments inline.
I think "on who's the burden to fix issues" on exotic platforms and "testing changes on all platforms" at commit time are two totally distinct questions. While I agree that problems on a exotic platform shouldn't block reviewed changes for an unreasonable amount of time, I also think it is very valuable for both, authors of changes and maintainers of exotic platforms to at least get early notifications of problems on other platforms. My experience as a maintainer of the AIX7PowerPC/s390 ports has always been very positive with regards to the collaboration with authors of changes which impact these ports. Authors are anxious to not break other platforms but usually don't have the possibility to test on these platforms. Aleksey's change comes in very handy here. It will help authors to fix trivial build problems right in their initial submission which I'm sure they'll appreciate. For more complex problems it gives Authors the possibility to notify port maintainers early, before a change gets pushed. I'm sure this alerting could even be automatized in the future if builds/tests fail on certain platforms. Of course such problems shouldn't block changes which are otherwise reviewed and ready to push for an unreasonable amount of time, so I have nothing against making the test on such platform optional submit requirements. But overall I think Aleksey's enhancement are a great means to improve the development experience and to keep the OpenJDK repos clean of trivial follow-up fixes. PS: I specially emphasized exotic platforms because it seems to me that your comment implies that all not Oracle-supported platforms are considered exotic for you. I think in an open source project it should be decided by the community which platforms are considered main (or primary) and which are considered secondary platforms.
I agree that we should we frugal with resources, especially if these resources are being offered to us at no charge. But from my understanding, the resources for GitHub actions are accounted on a personal account base for every developer and not to the upstream project. So every developer who clones OpenJDK and submits changes to his fork will use his personal "GitHub Actions" budget. According to the documentation and @rwestberg's explanations on skara-dev this budget contains 20 parallel jobs (with up to 6 hours runtime) 24/7. Even with Aleksey's changes I think we're well in this budget. Are you afraid tha GitHub/Microsoft will cut down this free-tier in general or for for OpenJDK users? If that's the case we should probably try to get an offical statement from GitHub/Microsoft on this topic before unnecessarily restricting ourselves. As Aleksey explained, the additional builds should still finish before the corresponding Windows/MacOS builds and tests terminate, so there shouldn't be any usability issue either.
I really like this change and I'd appreciate if we could get it merged. I'm of course always open to discussion on details (e.g. if the new biuilds/tests should be mandatory or. optional, how to notify port maintainers, etc...). |
The change to only build hotspot is good. It alleviates some of my concern over this patch. I still think there is a wide difference in how different contributors view these Github Actions submit hooks. Some would never consider to push anything that is not green everywhere, while you think it could be okay to push with broken platform if it takes an unreasonable amount of time to fix. I do think we need to align ourself as a community on how to relate to these submit hook tests. Perhaps getting some ground rules down in the new Developer's Guide, if we can agree on such ground rules. (It's not like there's an old history of oral tradition on how to do things that needs to be written down like the rest of the Guide, it's more like we decide now on how to deal with them.) Finally, I'm worried about reproducibility and debuggability of the Github Actions. We've seen that Github Actions can be updated right under our feet, and have already had trouble with that. And if the logs are not enough, there is no simple way to reproduce the build environment locally do examine the situation. But these are general concerns about the Github Actions themselves, and not specifically aimed at this patch. And I've initiated an off-list discussion on how we can try to address some of these issues. |
@shipilev This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 5 new commits pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I did a similar attempt at cross building a while ago, but never got around to finishing it, so it's nice to see it materializing! I do have a general comment on reducing the amount of duplicated content though. Since all these cross-build platforms share the same prerequisites, they can be expressed as matrix builds. Here's what I did: https://github.com/rwestberg/jdk/blob/947c934621c3013c055152356615e0120382cedf/.github/workflows/submit.yml#L102
You'd have to adjust the details obviously, but I think it could help with future maintainability.
Another minor comment is that it may be faster to use http://debian-archive.trafficmanager.net/debian/ instead of http://httpredir.debian.org/debian/ (the former is Azure-specific but I don't think it's part of the list that the latter uses). But since it will be cached after first use it probably doesn't matter much.
Right. AFAIU your code, it bootstraps the chroot and builds x86_64 build JDK for every config and every run, something this PR is able to avoid. We'd need to figure that out. I think that is pretty doable, but it would require a few days worth of pipeline testing to work out the kinks. So, would you mind we do that in the follow-ups? |
Yeah, it would certainly need a bit of adaptation to capture the differences properly, so perfectly fine to look into later! |
Thanks folks! I'll merge the master and see if it is still green, and if it is, I'll integrate. |
Cross-compiled builds still look green. Integrating. /integrate |
@shipilev Since your change was applied there have been 5 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit e9956fe. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This adds the cross-compiled build only, as no Windows+Arm64 machines are available on GitHub Action that we could use to run the tests. Due to cross-compilation a build JDK is required. Initially I added EA builds to be downloaded from https://jdk.java.net/16/ and used for that, but then I saw how @shipiliv attempted it for the linux cross-compilation builds in openjdk#1147. That is using the JDK image produced by the x64 variant. This however add more stress to the "critical path", as now two more jobs depend on the x64 build first. Let's see how it works out in the long-run. A Windows+AArch64 build takes 40-50min.
It is possible to efficiently cross-compile to foreign architectures on current GH actions that are driven by Ubuntu. I have been using this method for years to produce binaries at builds.shipilev.net.
These cross-compilation targets frequently find arch-specific build bugs. Hotspot is rather well insulated from the fundamental problems in arch-support, so the overwhelming majority of those arch-specific bugs are simple omissions of
#include
-s, inconsistent#ifdef
-ing, missed method renames / signature changes, or incorrect removals of methods that are used by arch-specific code. Those are straightforward to fix, if the contributor knows about them. My experience says the common attitude to these fixes is: "Oh, I would have fixed it in the original change if I knew about this."This improvement adds several foreign architectures to GH actions workflow to provide the automatic safety net to give early warning for these cases. In fact, many Hotspot contributors already build AArch64, ARM, PPC64 targets as pre-integration sanity checks, and this does the important checks automatically for them.
To optimize workflow costs, the change does the following:
submit.yml
hash. So it would regenerate only if the workflow itself changes. This saves about 10 minutes per arch.Space-wise, sysroots in GH cache take about 580M uncompressed, and 270M zstd-compressed bundle. In the end, this adds 4x270 = 1080M into local cache.
Time-wise, ball-parking the current workflow budget looks like this:
In other words, new workflow takes about 715 Linux-host-minutes, 410 Windows-host-minutes, 240 Mac-host-minutes. Out of which new jobs take about 85 Linux-host-minutes. So the cost of new jobs is roughly:
In other words, the additional runner costs are pale in comparison with what is already done in build and test jobs.
Progress
Testing
Issue
Reviewers
Download
$ git fetch https://git.openjdk.java.net/jdk pull/1147/head:pull/1147
$ git checkout pull/1147