-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8355698: JDK not supporting sleef could cause exception at runtime after JDK-8353786 #24914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back mli! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
@Hamlin-Li To determine the appropriate audience for reviewing this pull request, one or more labels corresponding to different subsystems will normally be applied automatically. However, no automatic labelling rule matches the changes in this pull request. In order to have an "RFR" email sent to the correct mailing list, you will need to add one or more applicable labels manually using the /label pull request command. Applicable Labels
|
iwanowww
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption is if vector math library is present, it provides full set of stubs.
Do we really want to complicate things even more by supporting arbitrary subsets of vector math stubs?
IMO a better way to fix the problem is to avoid building/bundling SLEEF when any stubs are missing.
|
This PR can fix the failure of JDK-8355656 |
|
I think, first JDK-8353786 changes the behaviour of jdk (previously it does not throw exception or fail in other ways). second it should fall back to a java implementation rather than fail, this is more friendly to jdk vendors and users. As you can see, @sendaoYan report another issue related to it on x64 platform. For the jdk vendors, I suppose they should try their best to build a jdk with full vector math library support, but they can fail to do so for many reasons, in that situation we should fall back to a default version which could be slow in performance, but should not fail, because we already has a java scalar version implemented, and the motivation of this version is for falling back when there is no optimal native version.
Yes. The issues here is the released jdk could be lack of support of such library for various reasons.
I'm not sure what you mean here, but seems to me the fix is simple, we just not throw exception and allow it fall back to java scalar version, and this behaviour is consitent with previous jdk behaviour. Or can you clarify further?
This behavour was decided in previous PRs #20781, #21083, #21502, and some other uncommited PRs prior to these ones. How do you think about it? |
|
/solves JDK-8355656 |
|
@Hamlin-Li |
Can you point me, please, to the relevant parts of the discussions? I wasn't part of them.
Both SVML and SLEEF-based vector math native libraries are built and bundled with the JDK. JDK is in full control of their sources and should be able to rely on the exact API they expose. Such behavior has a number of advantages, in particular:
API variability complicates things, so should be avoided unless there are compelling reasons to justify it. Can you summarize, please, what's the motivation behind such behavior? If build platform doesn't fully satisfy library requirements, the library can be excluded. If the library is missing, default (Java) implementations are used. |
Sorry, I can not find out this specific answer for you, as there were too long discussion there, the whole process took several months.
Seems both jdk vendors and jdk end users don't have a full control of the environment to opposite sides.
It's arguable if missing entries is a bug, please check JDK-8355656 for example
I'm not familar with this feature, but could it be simple to catch a missing API, then do some necessary things?
One advantage of it is fault tolerance, please check JDK-8355656 for example. And I don't think this compliates things.
Does JDK-8355656 fall in the above scenario? Why should we exclude the whole library when only part of the library is not supported? In summary, I'm concerned that there could always be some potential issue, and we should consider to make it fall back when we could, as after a jdk release to end users it's out of your control. If we want to consider these things ( like missing entries as bugs, we still could do it in other ways (like checking the log in a test), rather than totally stop the jdk to run in end users' environment. |
Personally, I'm surprised SVML is affected in a similar way because stubs are provided in assembly form, so they are simply linked into a shared library during assembly phase. I can only guess what causes such inconsistencies during JDK build phase. And it just reassures me that current behavior is a source of bugs in the future. Overall, it still looks like a JDK build issue to me. Hiding problems occurred during the build is not good. If some toolchains can't successfully build the library, the library shouldn't be included in JDK.
JDK code is aware of that and doesn't try to look up corresponding entries. Ideally, it shouldn't be included into the native library, since it increases its size for no good reason. But having unused code is a redundancy and not a bug per se. (For SVML it's a bit more complicated, because JDK is the only place where sources of SVML stubs live (there's no upstream project/distribution)). |
|
/label core-libs, hotspot |
|
@iwanowww The |
No, in riscv case (possiblely also on arm?) I don't think it's a build issue, the jdk vendor can choose to support it or not, it's just the way passing the information ( whether it's supported or not) are different. How about we consider this in another way, we could fix this issue first, as it fails regularly on riscv and x64 in some situations. Then if you still consider the existing behaviour should be changed and could be improved further, it can be done in another pr. After all, it already broke the existing jdk in some scenarios. |
|
I want to understand the issue with missing entries in vector math native libraries first before making a decision how to proceed.
Sorry, I don't get it. How does it affect the contents of the native library? JDK vendors do have an option to bundle the library or drop it from their distribution. But when SLEEF-based and SVML math libraries are built by JDK there's no distinction between entries being included in the library. If a vendor modifies make files or native library code, it's up to them to adjust JDK accordingly. Upstream JDK doesn't have to take such scenarios into account. I looked through SVML and SLEEF-based vector math code and noticed there are some capability check [1] [2] [3] guarding library code. It means that if some library entry is missing, then the whole library is empty. Can you confirm it's the case you see? [1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/linux/native/libjsvml/globals_vectorApiSupport_linux.S.inc#L35 |
Sorry for confusing you. No, what I mean is vendors can choose their build environment (e.g. with/without support of compiler, I discuss this a bit below), and different environment will lead to whether sleef is supported or not (i.e. whether there are entries in the libsleef.so).
In riscv sleef case, yes, it checks native compiler version and a flat, this is required because only these versions (or higher) support vector intrinsics which are used in sleef header files. ( I think arm is the same case, but in a bit simpler way only with a flag). |
I am not convinced that supporting such a divergence between builds of the JDK is something we should support. Sure, we can choose at runtime whether to link with SLEEF or not, but having some builds of (say) risc OpenJDK with SLEEF and some without is a Bad Thing. It is a build issue. |
|
Mailing list message from Andrew Haley on core-libs-dev: On 5/2/25 10:11, Hamlin Li wrote:
Again, this is an unhelpful thing to do. If the environment doesn't -- |
|
Thanks, Li. I think I have a better understanding now how it all works across platforms. I agree with Andrew that it's highly undesireable to have divergence across JDK builds, but I don't have a good understanding how mature toolchain support is when it comes to vector intrinsics SLEEF library relies on. As of now, producing empty native library is the only way to communicate that a vector math library isn't properly built. So, 2 problems to address: Speaking of SVML, it requires GCC 4.9+ and MSVC 2017. According to JDK build documentation [1], newer toolchain versions are required for JDK build. So, it should be fine to remove GCC version checks. (JDK-8355656 is reported for a JDK build using Clang. I have no idea whether Clang can build SVML stubs or not.) So, the stop-the-gap fix for JDK-8355698 is to disable SLEEF build when toolchain doesn't have proper support. Alternatively, VectorMathLibrary can probe the library to ensure it is not empty [2]. [1] https://github.com/openjdk/jdk/blob/master/doc/building.md [2] |
|
Thanks, I'll check it. |
|
@Hamlin-Li This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
|
I'll close this pr, as seems to us it's also helpful to push the jdk vendors to support sleef when they release. |
Hi,
Can you help to review this patch?
Before JDK-8353786, when a released jdk not supportting sleef (for any reason, e.g. low gcc version, intrinsic not supported, rvv not supported, and so on) runs on machine support vector operation (e.g. on riscv, it supports rvv), it can not call into sleef, but will not fail either, it falls back to java scalar version implementation.
But after JDK-8353786, it will cause an exception thrown at runtime.
This change the behaviour of existing jdk, and it should not throw exception anyway.
@iwanowww @RealFYang
Thanks!
Progress
Issues
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24914/head:pull/24914$ git checkout pull/24914Update a local copy of the PR:
$ git checkout pull/24914$ git pull https://git.openjdk.org/jdk.git pull/24914/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24914View PR using the GUI difftool:
$ git pr show -t 24914Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24914.diff
Using Webrev
Link to Webrev Comment