8317510: Change Windows debug symbol files naming to avoid losing info when an executable and a library share the same name #16039

Closed
wants to merge 4 commits into from

Conversation

fthevenet
Member

@fthevenet fthevenet commented Oct 4, 2023

When building OpenJDK on Windows using "--with-native-debug-symbols=external", the resulting debug symbols are saved in files located in the same folder as the corresponding executable or library and named by swapping the ".exe" or ".dll" extension for ".pdb" (or ".diz" if "--with-native-debug-symbols=zipped" is used). This means that when an exe and a dll file share the same target folder and file name (e.g. bin\java.exe and bin\java.dll), we have to choose whether the symbols in bin\java.pdb refer to the exe or the dll; we can't have both.

This PR addresses this issue by adopting a different naming strategy for the resulting symbol files: we keep the full name of every file (including its .dll or .exe extension) and then append the appropriate .pdb, .map or .diz extension.

For instance, jvm.dll symbols are no longer called jvm.pdb but instead jvm.dll.pdb. Similarly, it is now jvm.dll.diz when using zipped symbols, and "jvm.dll.stripped.pdb" for stripped symbols (i.e. when "--with-external-symbols-in-bundles=public" is used).

The PR also removes the existing filtering for java.pdb, jimage.pdb and jpackage.pdb, which was used to guarantee that the dll symbols were bundled rather than the ones from the exe, since we no longer need that.
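The naming change described above can be illustrated with a small sketch. This is not the build system's code (the actual change lives in the makefiles); the helper names `old_symbol_name` and `new_symbol_name` are hypothetical and only model the two naming rules to show where the collision comes from.

```python
from pathlib import PureWindowsPath


def old_symbol_name(binary: str, ext: str = ".pdb") -> str:
    """Old scheme: swap the .exe/.dll extension for the symbol extension."""
    return str(PureWindowsPath(binary).with_suffix(ext))


def new_symbol_name(binary: str, ext: str = ".pdb") -> str:
    """New scheme: keep the full file name and append the symbol extension."""
    return binary + ext


binaries = [r"bin\java.exe", r"bin\java.dll"]
old_names = [old_symbol_name(b) for b in binaries]
new_names = [new_symbol_name(b) for b in binaries]

print(old_names)  # both entries are the same file name, so one set of symbols is lost
print(new_names)  # one distinct symbol file per binary
```

With the old rule both binaries map to bin\java.pdb, while the new rule yields bin\java.exe.pdb and bin\java.dll.pdb.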


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8317510: Change Windows debug symbol files naming to avoid losing info when an executable and a library share the same name (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16039/head:pull/16039
$ git checkout pull/16039

Update a local copy of the PR:
$ git checkout pull/16039
$ git pull https://git.openjdk.org/jdk.git pull/16039/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16039

View PR using the GUI difftool:
$ git pr show -t 16039

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16039.diff

…fo when an executable and a library share the same name
@bridgekeeper

bridgekeeper bot commented Oct 4, 2023

👋 Welcome back fthevenet! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@fthevenet
Member Author

The following build scenarios were tested on Windows, and the symbol file names were manually verified for correctness:

  • ./configure --with-native-debug-symbols=external
  • ./configure --with-native-debug-symbols=zipped
  • ./configure --with-native-debug-symbols=external --with-external-symbols-in-bundles=full
  • ./configure --with-native-debug-symbols=external --with-external-symbols-in-bundles=public

As a sanity check, I also ran the following on Linux (where we don't expect to see any differences):

  • ./configure --with-native-debug-symbols=external
  • ./configure --with-native-debug-symbols=zipped
  • ./configure --with-native-debug-symbols=external --with-external-symbols-in-bundles=full

@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 4, 2023
@openjdk

openjdk bot commented Oct 4, 2023

@fthevenet The following label will be automatically applied to this pull request:

  • build

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the build build-dev@openjdk.org label Oct 4, 2023
@mlbridge

mlbridge bot commented Oct 4, 2023

Webrevs

@erikj79
Member

erikj79 commented Oct 4, 2023

I tried to build this internally and hit this error:

make[3]: *** No rule to make target '/cygdrive/c/sb/prod/1696445346/workspace/build/windows-x64/hotspot/variant-server/libjvm/gtest/jvm.pdb', needed by '/cygdrive/c/sb/prod/1696445346/workspace/build/windows-x64/images/test/hotspot/gtest/server/jvm.pdb'.  Stop.

Looks like the gtest build is failing.

@fthevenet fthevenet changed the title from "8317510: Change Windows debug symbol files naming to avoid loosing info when an executable and a library share the same name" to "8317510: Change Windows debug symbol files naming to avoid losing info when an executable and a library share the same name" Oct 5, 2023
@@ -81,13 +81,11 @@ endif
ifneq ($(CMDS_DIR), )
DEPS += $(call FindFiles, $(CMDS_DIR))
ifeq ($(call isTargetOs, windows)+$(SHIP_DEBUG_SYMBOLS), true+public)
Contributor

Suggested change
-ifeq ($(call isTargetOs, windows)+$(SHIP_DEBUG_SYMBOLS), true+public)
+ifeq ($(call isCompiler, microsoft)+$(SHIP_DEBUG_SYMBOLS), true+public)

Stripped pdbs (and pdbs in general) are only ever used with the VS toolchain.

Member Author

@fthevenet fthevenet Oct 9, 2023

I understand the point of this change, but it is not directly related to the issue addressed here (i.e. this condition wasn't introduced in this PR).
Should it be included in the PR anyway?

Contributor

On second thought, hold on for a while

Member

I agree with @fthevenet, such a change should be separate from this.

Contributor

I'll retract my suggestion then, and submit this change on my own after this PR is integrated.

@@ -81,13 +81,11 @@ endif
ifneq ($(CMDS_DIR), )
DEPS += $(call FindFiles, $(CMDS_DIR))
ifeq ($(call isTargetOs, windows)+$(SHIP_DEBUG_SYMBOLS), true+public)
-# For public debug symbols on Windows, we have to use stripped pdbs, rename them
-# and filter out a few launcher pdbs where there's a lib that goes by the same name
+# For public debug symbols on Windows, we have to use stripped pdbs and rename them
Contributor

Suggested change
-# For public debug symbols on Windows, we have to use stripped pdbs and rename them
+# For public debug symbols on VS, we have to use stripped pdbs and rename them

@fthevenet
Member Author

I have added a basic test that verifies that symbols can be resolved by the internal JDK tooling that makes use of them, even after the name change for the .pdb files.
It relies on the fact that on Windows/MSVC, native frames contain no function names when dumped in hs_err if the pdb files are not available or cannot be loaded.
This is not true for all platforms (e.g. on Linux, function names are present in hs_err even when full symbols aren't available), so I have restricted this test to run on Windows only.

@erikj79
Member

erikj79 commented Oct 9, 2023

> I have added a basic test that verifies that symbols can be resolved by the internal JDK tooling that makes use of them, even after the name change for the .pdb files. It relies on the fact that on Windows/MSVC, native frames contain no function names when dumped in hs_err if the pdb files are not available or cannot be loaded. This is not true for all platforms (e.g. on Linux, function names are present in hs_err even when full symbols aren't available), so I have restricted this test to run on Windows only.

Adding a test seems nice, but I will have to defer to Hotspot people for reviewing the validity, placement and tier/group inclusion of the new test. My interpretation is that with the current placement and TEST.groups it will be part of tier1.

@fthevenet
Member Author

fthevenet commented Oct 9, 2023

Yes, of course.
Please note that, like many other tests in the same group, it will not be run in non-debug builds; this is because it relies on -XX:ErrorHandlerTest to cause a controlled crash of the JVM (so that the hs_err file is produced) and this is only available in debug builds (fast or slow).
I think this precludes this test from being part of tier1 (which is what I intended), but I may very well be wrong.

@fthevenet
Member Author

OK, so unsurprisingly I was indeed wrong... The test ran as part of tier1 in the GHA checks - and failed.
I'll let others advise on whether running this in tier1 is appropriate, but in the meantime I'll try to figure out why it fails in GHA.

@tstuefe
Member

tstuefe commented Oct 9, 2023

> Yes, of course. Please note that, like many other tests in the same group, it will not be run in non-debug builds; this is because it relies on -XX:ErrorHandlerTest to cause a controlled crash of the JVM (so that the hs_err file is produced) and this is only available in debug builds (fast or slow). I think this precludes this test from being part of tier1 (which is what I intended), but I may very well be wrong.

They will run in tier1.

Test looks good.

Cheers, Thomas

@tstuefe
Member

tstuefe commented Oct 9, 2023

> OK, so unsurprisingly I was indeed wrong... The test ran as part of tier1 in the GHA checks - and failed. I'll let others advise on whether running this in tier1 is appropriate, but in the meantime I'll try to figure out why it fails in GHA.

Looks like symbol resolution of the crash address failed:

TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: '# V \[jvm.dll.*\].*(crash_with_segfault|controlled_crash).*' missing from stdout/stderr

Looking at the output of the crashing process, it just prints out a PC, but not the expected frame:

# Problematic frame:
# V  [jvm.dll+0xfdd069]

where we expected to see a function name.
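The check that failed can be reproduced in isolation with a minimal sketch. The pattern is the one quoted in the test failure above, with one stated change: the whitespace between "V" and "[" is relaxed to `\s+`, because the quoted pattern and the quoted frame differ in spacing in this thread. The `resolved` line is a made-up example of what a symbolized frame might look like; it is not taken from any log.

```python
import re

# Pattern from the test failure message, with whitespace relaxed (assumption).
FRAME_PATTERN = re.compile(
    r"# V\s+\[jvm.dll.*\].*(crash_with_segfault|controlled_crash).*"
)

# Frame line as reported in the GHA run: only a PC offset, no function name.
unresolved = "# V  [jvm.dll+0xfdd069]"

# Hypothetical frame line, invented for illustration only.
resolved = "# V  [jvm.dll+0xfdd069]  controlled_crash+0x59"

for line in (unresolved, resolved):
    status = "symbolized" if FRAME_PATTERN.search(line) else "no function name"
    print(f"{line!r}: {status}")
```

The unresolved frame does not match because nothing after the closing bracket names one of the expected functions, which is exactly the situation when the pdb file cannot be found.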

@fthevenet
Member Author

fthevenet commented Oct 9, 2023

According to the GHA workflow, tier1 tests are run with a JDK build using the following configure line, which doesn't set --with-native-debug-symbols:

bash configure --with-conf-name=windows-x64 --with-debug-level=fastdebug --with-version-opt=${GITHUB_ACTOR}-${GITHUB_SHA} --with-boot-jdk=/d/a/jdk/jdk/bootjdk/jdk --with-jtreg=jtreg/installed --with-gtest=gtest --with-msvc-toolset-version=14.29 --with-jmod-compress=zip-1 

The documentation isn't much help in determining what the default is in this case, but if it is equivalent to --with-native-debug-symbols=none, then this explains why the test fails.
I'll dig a bit deeper to confirm that.

@erikj79
Member

erikj79 commented Oct 9, 2023

> I have added a basic test that verifies that symbols can be resolved by the internal JDK tooling that makes use of them, even after the name change for the .pdb files. It relies on the fact that on Windows/MSVC, native frames contain no function names when dumped in hs_err if the pdb files are not available or cannot be loaded. This is not true for all platforms (e.g. on Linux, function names are present in hs_err even when full symbols aren't available), so I have restricted this test to run on Windows only.

Adding a test seems nice, but I will have to defer to Hotspot people for reviewing the validity, placement and tier/group inclusion of the new test. My interpretation is that with the current placement and TEST.groups it will be part of tier1.

Verified that the test passes in our internal build and test environment (which runs hotspot tests in tier1 on debug builds).

@erikj79
Member

erikj79 commented Oct 9, 2023

> According to the GHA workflow, tier1 tests are run with a JDK build using the following configure line, which doesn't set --with-native-debug-symbols:
>
> bash configure --with-conf-name=windows-x64 --with-debug-level=fastdebug --with-version-opt=${GITHUB_ACTOR}-${GITHUB_SHA} --with-boot-jdk=/d/a/jdk/jdk/bootjdk/jdk --with-jtreg=jtreg/installed --with-gtest=gtest --with-msvc-toolset-version=14.29 --with-jmod-compress=zip-1
>
> The documentation isn't much help in determining what the default is in this case, but if it is equivalent to --with-native-debug-symbols=none, then this explains why the test fails. I'll dig a bit deeper to confirm that.

I would expect the default to be to include pdbs on Windows (external).

@TheShermanTanker
Contributor

The default for all platforms is none on a static build, and external otherwise

with_native_debug_symbols="external"

@TheShermanTanker
Contributor

P.S. I suggest changing the comment in the new test that states "where the function names are available even with no symbols". This is not very accurate: it works because the symbols are inside libjvm.so itself, unlike with Microsoft Visual C, which places all debug information (including symbol names) into the pdb files. (For instance, if you compile the Windows Java VM with gcc, the symbol names would also be inside the VM itself, and not in the debug files.)

@fthevenet
Member Author

From what I can tell, the test fails because of an unrelated issue with the "test-prebuilt" target used to run the test by GHA.

According to the logs for the failed test, the path to the symbols folder "/d/a/jdk/jdk/bundles/symbols/jdk-22/fastdebug" is passed explicitly via the env variable SYMBOLS_IMAGE_DIR:

make test-prebuilt TEST='test/hotspot/jtreg/:tier1_runtime' BOOT_JDK=/d/a/jdk/jdk/bootjdk/jdk JT_HOME=jtreg/installed JDK_IMAGE_DIR=/d/a/jdk/jdk/bundles/jdk/jdk-22/fastdebug SYMBOLS_IMAGE_DIR=/d/a/jdk/jdk/bundles/symbols/jdk-22/fastdebug TEST_IMAGE_DIR=/d/a/jdk/jdk/bundles/tests JTREG='JAVA_OPTIONS=-XX:-CreateCoredumpOnCrash;VERBOSE=fail,error,time;KEYWORDS=!headful'

But looking into the hs_err report, we see the path for the symbols folder is not considered by the symbol engine:

symbol engine: initialized successfully - sym options: 0x614 - pdb path: .;D:\a\jdk\jdk\bundles\jdk\jdk-22\fastdebug\bin;C:\Windows\SYSTEM32;C:\Windows\WinSxS\amd64_microsoft.windows.common-controls_6595b64144ccf1df_6.0.17763.4851_none_de72d1b65349cfc4;D:\a\jdk\jdk\bundles\jdk\jdk-22\fastdebug\bin\server

That would still imply that the pdb files were stripped from the JDK bundle that is uploaded by the build task; otherwise they would be picked up by the symbol engine from there. Right now, I don't know whether this is indeed the case and, if so, whether it is caused by the renaming of pdb files in this PR.

@erikj79
Member

erikj79 commented Oct 10, 2023

> From what I can tell, the test fails because of an unrelated issue with the "test-prebuilt" target used to run the test by GHA.
>
> According to the logs for the failed test, the path to the symbols folder "/d/a/jdk/jdk/bundles/symbols/jdk-22/fastdebug" is passed explicitly via the env variable SYMBOLS_IMAGE_DIR:
>
> make test-prebuilt TEST='test/hotspot/jtreg/:tier1_runtime' BOOT_JDK=/d/a/jdk/jdk/bootjdk/jdk JT_HOME=jtreg/installed JDK_IMAGE_DIR=/d/a/jdk/jdk/bundles/jdk/jdk-22/fastdebug SYMBOLS_IMAGE_DIR=/d/a/jdk/jdk/bundles/symbols/jdk-22/fastdebug TEST_IMAGE_DIR=/d/a/jdk/jdk/bundles/tests JTREG='JAVA_OPTIONS=-XX:-CreateCoredumpOnCrash;VERBOSE=fail,error,time;KEYWORDS=!headful'
>
> But looking into the hs_err report, we see the path for the symbols folder is not considered by the symbol engine:
>
> symbol engine: initialized successfully - sym options: 0x614 - pdb path: .;D:\a\jdk\jdk\bundles\jdk\jdk-22\fastdebug\bin;C:\Windows\SYSTEM32;C:\Windows\WinSxS\amd64_microsoft.windows.common-controls_6595b64144ccf1df_6.0.17763.4851_none_de72d1b65349cfc4;D:\a\jdk\jdk\bundles\jdk\jdk-22\fastdebug\bin\server
>
> That would still imply that the pdb files were stripped from the JDK bundle that is uploaded by the build task; otherwise they would be picked up by the symbol engine from there. Right now, I don't know whether this is indeed the case and, if so, whether it is caused by the renaming of pdb files in this PR.

I must confess that I'm not very well versed in the GitHub Actions configuration and execution model, nor our implementation. In our internal build-and-test system, we provide SYMBOLS_IMAGE_DIR as an env variable when invoking test-prebuilt, and the test passes in this setup. From your description it seems our GHA are supposed to be set up in the same way, but maybe there is a bug in there somewhere.

@fthevenet
Member Author

> In our internal build-and-test system, we provide SYMBOLS_IMAGE_DIR as an env variable when invoking test-prebuilt, and the test passes in this setup. From your description it seems our GHA are supposed to be set up in the same way, but maybe there is a bug in there somewhere.

If it isn't too much of a bother, could you please verify that the path you pass via SYMBOLS_IMAGE_DIR is indeed listed in the symbol engine search path in the resulting hs_err produced as part of the test?
I have been doing some testing locally (without GHA) and I'm seeing weird things in the symbol engine logs in hs_err, like the path passed to SYMBOLS_IMAGE_DIR not being converted to a Windows path, while JDK_IMAGE_DIR is, e.g.:

Command line:
make test-prebuilt BOOT_JDK=/cygdrive/c/java-21-openjdk-21.0.0.0.35-7.win.jdk.x86_64/ JT_HOME=../../upstream/jtreg/build/images/jtreg JDK_IMAGE_DIR=build/prebuilt_images/jdk/ SYMBOLS_IMAGE_DIR=build/prebuilt_images/symbols/ TEST_IMAGE_DIR=build/prebuilt_images/test/ TEST="test/hotspot/jtreg/runtime/errorhandling/TestSymbolsInHsErrFile.java"

hs_err:
symbol engine: initialized successfully - sym options: 0x614 - pdb path: .; build/prebuilt_images/symbols//bin:build/prebuilt_images/symbols//bin/server;C:\openjdk\jdk\build\prebuilt_images\jdk\bin;C:\Windows\SYSTEM32;C:\Windows\WinSxS\amd64_microsoft.windows.common-controls_6595b64144ccf1df_6.0.19041.1110_none_60b5254171f9507e;C:\\openjdk\jdk\build\prebuilt_images\jdk\bin\server

Also please note that if pdb files are present in the folder that you pass via JDK_IMAGE_DIR, the symbol engine will likely pick them up from there even if the path set in SYMBOLS_IMAGE_DIR is empty or incorrect (and therefore the test will pass).
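One way to spot the unconverted entry in a "pdb path" value like the one in the hs_err excerpt above is to split it on ";" and flag every entry that is not an absolute Windows path. The sketch below is only an illustration: `suspicious_entries` is a hypothetical helper, and the sample value is a shortened copy of the local hs_err line.

```python
from pathlib import PureWindowsPath


def suspicious_entries(pdb_path: str) -> list[str]:
    """Return search-path entries that are not absolute Windows paths."""
    entries = [entry.strip() for entry in pdb_path.split(";")]
    return [
        entry
        for entry in entries
        if entry and entry != "." and not PureWindowsPath(entry).is_absolute()
    ]


# Shortened copy of the "pdb path" value from the local hs_err excerpt
# (the WinSxS and bin\server entries are left out for brevity).
pdb_path = (
    r".; build/prebuilt_images/symbols//bin:build/prebuilt_images/symbols//bin/server;"
    r"C:\openjdk\jdk\build\prebuilt_images\jdk\bin;"
    r"C:\Windows\SYSTEM32"
)

for entry in suspicious_entries(pdb_path):
    print("not a Windows path:", entry)
```

Here the only flagged entry is the relative, colon-separated pair derived from SYMBOLS_IMAGE_DIR, which the symbol engine treats as a single, unusable path.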

@fthevenet
Member Author

fthevenet commented Oct 10, 2023

All that to say that it looks like there might be a bug in test-prebuilt (which doesn't mean there couldn't be one in GHA too, of course...)

Apologies if I'm just shooting in the dark here, but at a cursory glance, I notice that in RunTestsPrebuilt.gmk, SYMBOLS_IMAGE_DIR isn't checked for validity like JDK_IMAGE_DIR or TEST_IMAGE_DIR are; do you know why that is? Although from what I understand, it wouldn't transform the path, it would have caught the problem, wouldn't it?

https://github.com/openjdk/jdk/blob/80232b7e753129ca7a4f1ca9b70844e0c7d8eabf/make/RunTestsPrebuilt.gmk#L121C9-L124

@erikj79
Member

erikj79 commented Oct 10, 2023

> All that to say that it looks like there might be a bug in test-prebuilt (which doesn't mean there couldn't be one in GHA too, of course...)
>
> Apologies if I'm just shooting in the dark here, but at a cursory glance, I notice that in RunTestsPrebuilt.gmk, SYMBOLS_IMAGE_DIR isn't checked for validity like JDK_IMAGE_DIR or TEST_IMAGE_DIR are; do you know why that is? Although from what I understand, it wouldn't transform the path, it would have caught the problem, wouldn't it?
>
> https://github.com/openjdk/jdk/blob/80232b7e753129ca7a4f1ca9b70844e0c7d8eabf/make/RunTestsPrebuilt.gmk#L121C9-L124

We aren't checking SYMBOLS_IMAGE_DIR because we consider it optional. Tests are expected to be able to run without it, you just get less information on crashes.

_NT_SYMBOL_PATH is set up with Windows-style paths based on SYMBOLS_IMAGE_DIR here:

# Setup _NT_SYMBOL_PATH on Windows, which points to our pdb files.

In our run-test-prebuilt scenario, our JDK_IMAGE_DIR does not have any pdb files. The _NT_SYMBOL_PATH setup was introduced to provide pdb files for symbol lookup in hs_err files, so I'm fairly certain that part works.

@fthevenet
Member Author

I see, thanks!
I'll try to figure out why the symbol path isn't converted to a Windows path in my own local tests.

Member

@magicus magicus left a comment

The code itself looks good. We just need to figure out why it is breaking on GHA.

@openjdk

openjdk bot commented Oct 10, 2023

@fthevenet This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8317510: Change Windows debug symbol files naming to avoid losing info when an executable and a library share the same name

Reviewed-by: ihse, erikj

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 10 new commits pushed to the master branch:

  • 10427c0: 8318613: ChoiceFormat patterns are not well tested
  • ca3bdfc: 8318186: ChoiceFormat inconsistency between applyPattern() and setChoices()
  • a520887: 8318487: Specification of the ListFormat.equals() method can be improved
  • cf4ede0: 8317360: Missing null checks in JfrCheckpointManager and JfrStringPool initialization routines
  • 9e98ee6: 8318735: RISC-V: Enable related hotspot tests run on riscv
  • 29d462a: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests
  • 43f31d7: 8318607: Enable parallelism in vmTestbase/nsk/stress/jni tests
  • cee44a6: 8318608: Enable parallelism in vmTestbase/nsk/stress/threads tests
  • b026d0b: 8312980: C2: "malformed control flow" created during incremental inlining
  • 3abd772: 8316017: Refactor timeout handler in PassFailJFrame

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@magicus, @erikj79) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 10, 2023
@TheShermanTanker
Contributor

TheShermanTanker commented Oct 11, 2023

Once integrated, this is going to spam everyone on GitHub with Windows test failures until the issue is fixed separately :/
(Could we perhaps split the test out into another change?)

@tstuefe
Member

tstuefe commented Oct 11, 2023

> Once integrated, this is going to spam everyone on GitHub with Windows test failures until the issue is fixed separately :/ (Could we perhaps split the test out into another change?)

I don't think @fthevenet plans on checking in a broken test. He will figure this out or disable the test.

@fthevenet
Member Author

fthevenet commented Oct 11, 2023

Indeed, I see no reason to rush the integration for this before we've resolved the failing test.

I can confirm that outside of the context of GHA, the test-prebuilt target does not properly resolve relative paths that are passed to SYMBOLS_IMAGE_DIR, but works fine with absolute paths.
This is therefore likely not the reason for the test failing in GHA, since GHA uses absolute paths (i.e. /d/a/jdk/jdk/bundles/symbols/jdk-22/fastdebug).
This is still annoying, as the other paths used as parameters by the same test-prebuilt target do accept relative paths, but it is beside the point here, so I'll leave it at that for now and maybe come back to it separately.

Meanwhile, I have instrumented the GHA test workflow as @magicus suggested and am waiting for the results.

@magicus
Member

magicus commented Oct 20, 2023

@fthevenet Did the instrumentation give anything?

@fthevenet
Member Author

@magicus As a matter of fact, it did yield some interesting results, but then I got quite caught up with the release earlier this week and didn't find the time to update this thread.

The listing of the folder that is passed as a parameter to the test run shows it contains the pdb files with the expected names (e.g. jvm.dll.pdb). You can see an example of that in the "Check symbols" step in the run below:
https://github.com/fthevenet/jdk/actions/runs/6480669248/job/17599856238

I also added a printout of the values of SYMBOL_PATH and _NT_SYMBOL_PATH when they are first assigned (see fthevenet@5655ca7#diff-041bf69ea79b333b9ce99c1f879e398d698538530a35c361500b72631f059233R70), but to my surprise I could not see those in the test run logs from GHA, while they are indeed printed when I run the test locally.

One notable difference I noticed is that I run all my local tests using Cygwin, while GHA uses MSYS2; could this explain anything?

@magicus
Member

magicus commented Oct 21, 2023

> One notable difference I noticed is that I run all my local tests using Cygwin, while GHA uses MSYS2; could this explain anything?

Yes, msys2/cygwin are sufficiently different that it could possibly explain this.

@fthevenet
Member Author

The root cause of this test failure turns out to be indeed related to MSYS2: the checks made to determine whether or not we're running on Windows return "false", so all Windows-specific code is skipped.
In this case, we simply jump over the part where _NT_SYMBOL_PATH is set.

I opened https://bugs.openjdk.org/browse/JDK-8318669 to make the logic that auto-detects the target OS when running prebuilt tests in "RunTestsPrebuilt.gmk" aware of MSYS2, but I am now wondering whether it might be best to just set OPENJDK_TARGET_OS explicitly to "windows" in the command line that launches the tests?
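The class of bug described above can be sketched as follows. This is not the logic of RunTestsPrebuilt.gmk (which is a makefile and is not quoted in this thread); both helper functions are hypothetical, and the `uname -s` strings are typical example values for Cygwin and MSYS2 shells, assumed here for illustration.

```python
def is_windows_cygwin_only(uname_s: str) -> bool:
    """Hypothetical detection that only recognizes Cygwin."""
    return uname_s.upper().startswith("CYGWIN")


def is_windows_msys2_aware(uname_s: str) -> bool:
    """Hypothetical detection that also recognizes MSYS2 environments."""
    return uname_s.upper().startswith(("CYGWIN", "MSYS", "MINGW"))


# Example `uname -s` values (assumed, not taken from the GHA logs).
samples = ["CYGWIN_NT-10.0", "MSYS_NT-10.0-17763", "MINGW64_NT-10.0-17763", "Linux"]

for name in samples:
    print(name, is_windows_cygwin_only(name), is_windows_msys2_aware(name))
```

With the Cygwin-only check, an MSYS2 shell is classified as non-Windows, so any Windows-only step, such as setting _NT_SYMBOL_PATH, would be skipped.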

@fthevenet
Member Author

I finally opted to address the underlying issue by patching RunTestsPrebuilt.gmk, rather than GHA; #16343.

As for this PR, I see two possible ways forward; one is to remove the test and integrate the change without it as part of the current PR, and add the test back in a follow up once the RunTestsPrebuilt patch is integrated.
The other is to convert this PR to a draft, wait for the separate fix to be integrated, and then rebase this PR on top of it and resume its review.

I like the first one better; a few more steps but overall less fussy. I'm also open to another solution.

@tstuefe
Member

tstuefe commented Oct 24, 2023

> I finally opted to address the underlying issue by patching RunTestsPrebuilt.gmk, rather than GHA; #16343.
>
> As for this PR, I see two possible ways forward; one is to remove the test and integrate the change without it as part of the current PR, and add the test back in a follow up once the RunTestsPrebuilt patch is integrated. The other is to convert this PR to a draft, wait for the separate fix to be integrated, and then rebase this PR on top of it and resume its review.
>
> I like the first one better; a few more steps but overall less fussy. I'm also open to another solution.

+1 Vote for first option.

@magicus
Member

magicus commented Oct 24, 2023

You don't have to rebase, in fact, you should not rebase an open PR. Just merge from master, once the msys fix is in.

Your fix for JDK-8318669 was simple; just fix the "else ifeq" as Erik suggests, and you're good to go. So just delay pushing this a bit more -- you don't have to move it to Draft either. I think that is preferable, since it will keep the test with this PR, where it belongs.

@fthevenet
Member Author

fthevenet commented Oct 25, 2023

@magicus So, just so that I get this straight, what you're suggesting is that now that #16343 is integrated, I merge master into my branch for the PR and push it, right?
(sorry for being a bit dense 😅)

@erikj79
Member

erikj79 commented Oct 25, 2023

> @magicus So, just so that I get this straight, what you're suggesting is that now that #16343 is integrated, I merge master into my branch for the PR and push it, right? (sorry for being a bit dense 😅)

Yes, but perhaps wait for the GHA to finish to verify that it all really works together.

@fthevenet
Member Author

Test now passes in GHA.

@fthevenet
Member Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Oct 25, 2023
@openjdk

openjdk bot commented Oct 25, 2023

@fthevenet
Your change (at version bf75aff) is now ready to be sponsored by a Committer.

@erikj79
Member

erikj79 commented Oct 25, 2023

/sponsor

@openjdk

openjdk bot commented Oct 25, 2023

Going to push as commit d96f38b.
Since your change was applied there have been 10 commits pushed to the master branch:

  • 10427c0: 8318613: ChoiceFormat patterns are not well tested
  • ca3bdfc: 8318186: ChoiceFormat inconsistency between applyPattern() and setChoices()
  • a520887: 8318487: Specification of the ListFormat.equals() method can be improved
  • cf4ede0: 8317360: Missing null checks in JfrCheckpointManager and JfrStringPool initialization routines
  • 9e98ee6: 8318735: RISC-V: Enable related hotspot tests run on riscv
  • 29d462a: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests
  • 43f31d7: 8318607: Enable parallelism in vmTestbase/nsk/stress/jni tests
  • cee44a6: 8318608: Enable parallelism in vmTestbase/nsk/stress/threads tests
  • b026d0b: 8312980: C2: "malformed control flow" created during incremental inlining
  • 3abd772: 8316017: Refactor timeout handler in PassFailJFrame

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 25, 2023
@openjdk openjdk bot closed this Oct 25, 2023
@openjdk openjdk bot removed the ready, rfr and sponsor labels Oct 25, 2023
@openjdk

openjdk bot commented Oct 25, 2023

@erikj79 @fthevenet Pushed as commit d96f38b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@TheShermanTanker
Contributor

Aw, I wanted to sponsor this one in particular :(

Oh well, it's great that it's integrated now!
