@jianglizhou
Contributor

@jianglizhou jianglizhou commented Jan 17, 2024

Please review this PR, which offers a simple solution for resolving the duplicate Thread symbol issue. In the #14808 comments, there was an alternative suggestion to redefine the symbol at build time, for example with -DThread=HotSpotThread. That would not address cases where the symbol is referenced in string literals. #14808 also discussed using a namespace for hotspot code, which can have multiple benefits and motivations. We could explore namespaces further once there is more consensus on that approach.
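To illustrate the string-literal caveat, here is a minimal C++ sketch. It assumes a macro rename equivalent to -DThread=HotSpotThread; the class body and the helpers lookup_name, renamed_name and HS_STR are hypothetical and not taken from the PR.

```cpp
#include <string>

// Assumption: stands in for passing -DThread=HotSpotThread to the compiler.
#define Thread HotSpotThread

// After preprocessing this declares class HotSpotThread, so its mangled
// member names are based on HotSpotThread rather than Thread.
class Thread {
 public:
  int id() const { return 42; }  // hypothetical member, for illustration only
};

// Macro replacement never looks inside string literals, so a name spelled
// as text keeps the old spelling.
inline std::string lookup_name() { return "Thread"; }

// Stringizing through a helper macro does pick up the rename. HS_STR is a
// hypothetical helper, not something from the HotSpot sources.
#define HS_STR_IMPL(x) #x
#define HS_STR(x) HS_STR_IMPL(x)
inline std::string renamed_name() { return HS_STR(Thread); }
```

Any code that looks the class up by a hard-coded string would therefore need a separate fix, which is the gap described above.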

Contributed by Chuck Rasbold and @jianglizhou.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Integration blockers

 ⚠️ Too few reviewers with at least role reviewer found (have 0, need at least 1) (failed with updated jcheck configuration in pull request)
 ⚠️ Whitespace errors (failed with updated jcheck configuration in pull request)

Issue

  • JDK-8311846: Resolve duplicate 'Thread' related symbols with JDK static linking (Enhancement - P4)

Contributors

  • Chuck Rasbold <rasbold@google.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17456/head:pull/17456
$ git checkout pull/17456

Update a local copy of the PR:
$ git checkout pull/17456
$ git pull https://git.openjdk.org/jdk.git pull/17456/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17456

View PR using the GUI difftool:
$ git pr show -t 17456

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17456.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jan 17, 2024

👋 Welcome back jiangli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@jianglizhou
Contributor Author

jianglizhou commented Jan 17, 2024

/contributor add Chuck Rasbold rasbold@google.com

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 17, 2024
@openjdk

openjdk bot commented Jan 17, 2024

@jianglizhou Chuck Rasbold rasbold@google.com is not a valid name and email string.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <duke@openjdk.org>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

@openjdk

openjdk bot commented Jan 17, 2024

@jianglizhou The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Jan 17, 2024
@jianglizhou
Contributor Author

jianglizhou commented Jan 17, 2024

/contributor add Chuck Rasbold <rasbold@google.com>

@openjdk

openjdk bot commented Jan 17, 2024

@jianglizhou
Contributor Chuck Rasbold <rasbold@google.com> successfully added.

@mlbridge

mlbridge bot commented Jan 17, 2024

Webrevs

@theRealAph
Contributor

Hooboy, this is an ugly solution, with some nasty side effects such as confusing error messages for developers and a very confusing debugger experience. Let's try to find a solution with a smaller blast radius.

@jianglizhou
Contributor Author

jianglizhou commented Jan 17, 2024

Hi @theRealAph Thanks for looking into this! #14808 comments touched on several options:

  1. Using namespace, in smaller scope for specific class such as StringTable or for all hotspot code in a global scope.

    Most seem to prefer using a specific namespace for all hotspot code, but there were still concerns.

  2. Using #define to redefine the symbol (used in the current PR)

    This is a somewhat hacky solution. It requires only small changes and avoids touching much source code for renaming.

  3. Redefine symbol at build/compile time. This is similar to the above.

  4. Direct rename in the source

Earlier discussions and feedback seem to prefer options that do not require large-scale changes (except for the hotspot namespace solution). If acceptable to everyone, direct renaming would be the least confusing option. Any other suggestions and ideas for resolving the Thread issue?

Thanks!

@coleenp
Contributor

coleenp commented Jan 17, 2024

I was reading through the other PR for StringTable and was wondering how difficult it would be to wrap all of hotspot in namespace hotspot {}; using namespace hotspot; It would need a JEP as discussed in the other PR.

Alternatively, if there's an #ifdef you can use for renaming Thread to HotspotThread for static linking only, it might make this change less worrisome.
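A rough sketch of what such a guarded rename could look like in C++. The flag name SUPPORTS_STATIC_LINK is borrowed from a later comment in this thread and is an assumption here, as are the class body and the helper thread_is_renamed; none of this is the PR's actual code.

```cpp
// Assumption: SUPPORTS_STATIC_LINK is a hypothetical build flag that is set
// only for builds whose objects may end up in a static link.
#ifdef SUPPORTS_STATIC_LINK
#define Thread HotspotThread
#endif

// With the flag set this declares HotspotThread; otherwise it stays Thread,
// so ordinary builds keep the familiar name in debuggers and error messages.
class Thread {
 public:
  explicit Thread(int id) : _id(id) {}
  int id() const { return _id; }

 private:
  int _id;
};

// Hypothetical helper that reports which variant was compiled.
#ifdef SUPPORTS_STATIC_LINK
inline bool thread_is_renamed() { return true; }
#else
inline bool thread_is_renamed() { return false; }
#endif
```

Source code keeps writing Thread in both configurations; only the emitted symbol names differ when the flag is defined.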

@dholmes-ora
Member

> Linking failures were observed when statically linking the launcher executable with hotspot and user native code together:

So the problem is that the user native code defines Thread as well - right? So this could keep happening for name after name depending on what native code is being linked.

I second what @theRealAph said! This is really ugly. The way disparate libraries just get munged into a single namespace with static linking just seems wrong to me.

At a minimum this hack should only be used when doing static linking as Coleen suggested. But I'd much prefer a solution that came from the tools doing the linking.

@dholmes-ora
Member

Okay so now that I have context switched in the discussion from:

#14808

what happened to doing a JEP for namespaces?

@jianglizhou
Contributor Author

jianglizhou commented Jan 18, 2024

Thanks @coleenp, @dholmes-ora. For using a hotspot namespace, there are probably similar complications, like the symbol usages that the current PR addresses in src/hotspot/os_cpu/linux_aarch64/threadLS_linux_aarch64.S and src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Thread.java. There might also be some complications with accessing hotspot code in JNI code. Those issues could probably be resolved relatively easily, though I haven't experimented with it. It seems that we may be converging on using a hotspot namespace?

As for redefining the symbol only when doing static linking, it adds more differences between the static and non-static support. It's more useful if we can create both .so and .a from the same set of .o files without having to build two different .o files for each C/C++ source file.

@dholmes-ora
Member

> It seems that we may be converging on using a hotspot namespace?

Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form.

@jianglizhou
Contributor Author

jianglizhou commented Jan 19, 2024

> > It seems that we may be converging on using a hotspot namespace?

> Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form.

Ah, I see. Thanks for the clarification. I had an offline conversation with @iklam about the namespace and JEP topic during last August's JVM Language Summit. Based on the feedback from that conversation, it was not very clear whether the namespace approach was broadly acceptable.

@theRealAph
Contributor

> > > It seems that we may be converging on using a hotspot namespace?

> > Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form.

> Ah, I see. Thanks for the clarification. I had an offline conversation with @iklam about the namespace and JEP topic during last August's JVM Language Summit. Based on the feedback from that conversation, it was not very clear whether the namespace approach was broadly acceptable.

Using a default namespace for everything could have bad effects on debugging and other maintenance tools. It's unlikely to be a low-cost option. Perhaps it could be restricted to the static linking case, but that further complicates testing.

@coleenp
Contributor

coleenp commented Jan 19, 2024

You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming.

I don't know how bad "namespace hotspot" would be for debugging. At least for some of the common names. I suppose breakpoints would have to be specified in gdb as break at hotspot::Thread::is_owning_thread or something like that, and with a using namespace hotspot, it wouldn't be visible looking at the source code in that form.
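As a small C++ sketch of the namespace idea discussed here: the member is_owning_thread mirrors the name used in the gdb example above, while the global Thread class, user_field and both_coexist are hypothetical stand-ins for user native code. It is meant only to show that the two classes get distinct qualified (and hence mangled) names.

```cpp
namespace hotspot {
// VM-side class; its full name is hotspot::Thread.
class Thread {
 public:
  // Hypothetical member mirroring the name used in the gdb example above.
  bool is_owning_thread() const { return true; }
};
}  // namespace hotspot

// Stand-in for user native code that also defines a global Thread.
class Thread {
 public:
  int user_field = 7;
};

// Both definitions can live in one program because their qualified names
// differ. A `using namespace hotspot;` directive in a file that also sees
// the global Thread would make the unqualified name ambiguous, so this
// sketch qualifies each use explicitly.
inline bool both_coexist() {
  hotspot::Thread vm_thread;  // explicit qualification selects the VM class
  ::Thread user_thread;       // the global class is untouched
  return vm_thread.is_owning_thread() && user_thread.user_field == 7;
}
```

This is also why a breakpoint would need the qualified spelling hotspot::Thread::is_owning_thread, as noted above.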

@jianglizhou
Contributor Author

jianglizhou commented Jan 19, 2024

> > > > It seems that we may be converging on using a hotspot namespace?

> > > Based on previous discussions I had been expecting to see a JEP on this after last US summer. I was surprised to see this PR pop up in this form.

> > Ah, I see. Thanks for the clarification. I had an offline conversation with @iklam about the namespace and JEP topic during last August's JVM Language Summit. Based on the feedback from that conversation, it was not very clear whether the namespace approach was broadly acceptable.

> Using a default namespace for everything could have bad effects on debugging and other maintenance tools. It's unlikely to be a low-cost option. Perhaps it could be restricted to the static linking case, but that further complicates testing.

Thanks. Agreed on both points. It seems to add too much complexity if namespace usage is restricted to the static linking case only.

@jianglizhou
Contributor Author

jianglizhou commented Jan 19, 2024

> You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming.

Thanks, @coleenp. I think that could work for all different cases. I'll reflect that in this PR.

For the longer term, we probably still want to find a cleaner solution when static support becomes more popular.

@theRealAph
Contributor

> > You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming.

> Thanks, @coleenp. I think that could work for all different cases. I'll reflect that in this PR.

> For the longer term, we probably still want to find a cleaner solution when static support becomes more popular.

I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export.

@jianglizhou
Contributor Author

jianglizhou commented Jan 22, 2024

> > > You could support one build by adding something like -DSUPPORTS_STATIC_LINK for both .so and .a builds for Google, then use that to protect the renaming.

> > Thanks, @coleenp. I think that could work for all different cases. I'll reflect that in this PR.
> > For the longer term, we probably still want to find a cleaner solution when static support becomes more popular.

> I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export.

We also discussed objcopy in #14808 (comment) and #14808 (comment). My main concern was the portability of the objcopy approach.

@theRealAph
Contributor

> > I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export.

> We also discussed objcopy in #14808 (comment) and #14808 (comment). My main concern was the portability of the objcopy approach.

I replied:

> OK, but it is the right thing to do on Linux. If some other operating systems don't provide useful tools, that's on them.
> I haven't checked, but I strongly suspect that LLVM can do it too, so all that remains is Windows, and maybe they can't have static linking (or maybe they have to use something like this PR) until the right tooling is provided.

If Windows really can't do it, that's no reason to burden systems that can. Namespaces are not a low-cost solution for developers.

@jianglizhou
Contributor Author

jianglizhou commented Jan 24, 2024

> > > I think you should be able to use ld and objcopy to merge the .o files and hide all of the symbols you don't want to export.

> > We also discussed objcopy in #14808 (comment) and #14808 (comment). My main concern was the portability of the objcopy approach.

> I replied:

> > OK, but it is the right thing to do on Linux. If some other operating systems don't provide useful tools, that's on them. I haven't checked, but I strongly suspect that LLVM can do it too, so all that remains is Windows, and maybe they can't have static linking (or maybe they have to use something like this PR) until the right tooling is provided.

> If Windows really can't do it, that's no reason to burden systems that can. Namespaces are not a low-cost solution for developers.

Thanks, @theRealAph.

Yeah, I was mainly concerned about non-Unix-like systems, Windows in particular. It might not work with all potentially supported compilers (gcc) on Linux, however. To localize symbols in libjvm using objcopy, we can first partially link (with -r) all hotspot .o files into a single object file, then run objcopy on the output object file to localize the affected symbols. The partial linking work (ifeq ($$($1_ENABLE_PARTIAL_LINKING), true)) has been added already. However, during the #14064 work, we ran into issues with partial linking on older gcc for linux-aarch64. The details were captured in the #14064 (comment) discussion with @erikj79. Only clang currently works well with the partial linking and symbol localizing solution.

Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like objcopy can be the longer-term and cleaner solution, instead of using a namespace. What are your thoughts on that?

@theRealAph
Contributor

> Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like objcopy can be the longer-term and cleaner solution, instead of using a namespace. What are your thoughts on that?

I suppose so, but why?

Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there.

@jianglizhou
Contributor Author

jianglizhou commented Jan 30, 2024

> > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like objcopy can be the longer-term and cleaner solution, instead of using a namespace. What are your thoughts on that?

> I suppose so, but why?

> Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there.

We ran into issues with older gcc on linux-aarch64 for partial linking, but the problem may not be limited to older gcc(?). At the current stage, limiting static/hermetic Java runtime support to only the platforms that support partial linking and objcopy seems to be overly restrictive (it does simplify the requirements significantly however :-)):

  • The duplicate symbol problems are mostly found in JDK natives and have been resolved already. We've found very few symbol issues with hotspot code so far. As there are portable alternative solutions that can resolve the symbol issues in hotspot, choosing a less portable solution does not seem very attractive currently.

  • As we haven't found many duplicate symbol issues with hotspot code, resolving them case by case may still be a good choice. We don't have to tie ourselves to any permanent solution during the early stage.

  • Based on what we learned from the static/hermetic Java prototyping and investigations, the majority of the work is not OS- or CPU-specific. If we carefully handle the platform-specific part with portable solution(s), we can support static/hermetic Java on the different supported platforms as a more general solution.

That is my reasoning. :-)

@theRealAph
Contributor

theRealAph commented Jan 30, 2024

> > > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like objcopy can be the longer-term and cleaner solution, instead of using a namespace. What are your thoughts on that?

> > I suppose so, but why?
> > Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there.

> We ran into issues with older gcc on linux-aarch64 for partial linking, but the problem may not be limited to older gcc(?). At the current stage, limiting static/hermetic Java runtime support to only the platforms that support partial linking and objcopy seems to be overly restrictive (it does simplify the requirements significantly however :-)):

> The duplicate symbol problems are mostly found in JDK natives and have been resolved already. We've found very few symbol issues with hotspot code so far. As there are portable alternative solutions that can resolve the symbol issues in hotspot, choosing a less portable solution does not seem very attractive currently.

I believe this to be a mistake. HotSpot, by design, exports only the symbols intended for use by other components. Many of the symbol names are highly generic, and will conflict with application code.

Sure, you have enough to be able to do some prototyping, but for real-world deployment you must be able to control symbol exports.

@iklam
Member

iklam commented Jan 30, 2024

> > > > Maybe we could live with symbol redefinition using #define (conditionally for static linking in OpenJDK, as Coleen suggested earlier) for now, until the tooling can support symbol localizing better. Then localizing symbols using tools like objcopy can be the longer-term and cleaner solution, instead of using a namespace. What are your thoughts on that?

> > > I suppose so, but why?
> > > Why should any of this have to work on old systems? If their binutils is broken, static linking of openjdk won't work there.

> > We ran into issues with older gcc on linux-aarch64 for partial linking, but the problem may not be limited to older gcc(?). At the current stage, limiting static/hermetic Java runtime support to only the platforms that support partial linking and objcopy seems to be overly restrictive (it does simplify the requirements significantly however :-)):
> > The duplicate symbol problems are mostly found in JDK natives and have been resolved already. We've found very few symbol issues with hotspot code so far. As there are portable alternative solutions that can resolve the symbol issues in hotspot, choosing a less portable solution does not seem very attractive currently.

> I believe this to be a mistake. HotSpot, by design, exports only the symbols intended for use by other components. Many of the symbol names are highly generic, and will conflict with application code.

> Sure, you have enough to be able to do some prototyping, but for real-world deployment you must be able to control symbol exports.

I agree with Andrew. We don't want the perfect to be the enemy of the good.

The only "perfect" solution is putting the HotSpot code in a namespace. This is going to be a huge undertaking. I don't think we have enough interest in the OpenJDK community to make such a change now.

I think partial linking with objcopy is a clean solution that's good enough for the actual use cases.

If someone wants to use #define, they can just make a local branch and add a few #define lines in their globalDefinitions.hpp. I suspect the configure script also allows adding C compiler options like -DThread=HSThread.

#define is going to be a whack-a-mole hack. Google may need to isolate the Thread symbol, but other people may need to isolate things like Symbol, etc. It's not a good idea to add arbitrary #define in the HotSpot source code just because someone doesn't like it.

@theRealAph
Contributor

> The only "perfect" solution is putting the HotSpot code in a namespace. This is going to be a huge undertaking. I don't think we have enough interest in the OpenJDK community to make such a change now.

I don't think that putting all of the HotSpot code in a namespace is a perfect solution. At least, I hope not: it'll mess up debugging so much that it'll be intolerable, IMO, and there will be other side effects.

@iklam
Member

iklam commented Jan 30, 2024

> > The only "perfect" solution is putting the HotSpot code in a namespace. This is going to be a huge undertaking. I don't think we have enough interest in the OpenJDK community to make such a change now.

> I don't think that putting all of the HotSpot code in a namespace is a perfect solution. At least, I hope not: it'll mess up debugging so much that it'll be intolerable, IMO, and there will be other side effects.

I forgot to qualify "perfect" only in the sense of isolating the HotSpot symbols. It's obviously not perfect at all in other aspects.

@magicus
Member

magicus commented Jan 30, 2024

/label add build

@openjdk openjdk bot added the build build-dev@openjdk.org label Jan 30, 2024
@openjdk

openjdk bot commented Jan 30, 2024

@magicus
The build label was successfully added.

@jianglizhou
Contributor Author

jianglizhou commented Jan 30, 2024

We (@AlanBateman, @cushon, @magicus, @jerboaa, @pron, @jianglizhou) discussed this topic via Zoom as part of the regular static/hermetic Java discussions. The outcome favors using partial linking and objcopy to localize symbols for hotspot. Here is a summary:

  • A general solution is preferred over resolving symbol issues case by case.
  • We can address this for Unix-like platforms with tooling that supports partial linking and objcopy for now. @magicus will provide additional information on supported gcc versions and considerations for Windows support.
  • There is also a preference to localize symbols automatically, without editing the symbol list manually. In our prototype for handling freetype symbols (as mentioned in 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking #14808 (comment)), @cjmoon1 looked into using nm to generate the symbol list and feed it into objcopy. That might be doable for hotspot symbols.

@bridgekeeper

bridgekeeper bot commented Feb 28, 2024

@jianglizhou This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper bridgekeeper bot added the oca Needs verification of OCA signatory status label Feb 28, 2024
@openjdk

openjdk bot commented Feb 28, 2024

@jianglizhou this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8311846
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed rfr Pull request is ready for review labels Feb 28, 2024
@bridgekeeper bridgekeeper bot removed the oca Needs verification of OCA signatory status label Feb 28, 2024
@openjdk

openjdk bot commented Mar 13, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@bridgekeeper

bridgekeeper bot commented Apr 11, 2024

@jianglizhou This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented May 9, 2024

@jianglizhou This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

