8238649: Call new Win32 API SetThreadDescription in os::set_native_thread_name #4297

Closed
wants to merge 2 commits into from

Conversation

dholmes-ora
Member

@dholmes-ora dholmes-ora commented Jun 2, 2021

Starting with Windows 10 and Windows Server 2016, we have a direct API for setting the thread name/description. Use of this API was suggested by Markus Gaisbauer:

http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030366.html

Using the new API was quite straightforward, but verifying that it had worked correctly was far more challenging. It seems there are no tools that use the new GetThreadDescription API to display thread names, so there was no easy way to check that this had worked. While Visual Studio will use it, it also uses the old debugger mechanism, so we wouldn't be able to tell the difference.

Writing a Windows-only test was one possibility, but the conversion to/from Unicode and java.lang.String would make that test very cumbersome in itself (for something that should be trivial!).

So instead for debug builds I read back the thread name using GetThreadDescription and check that the name we set and the name we read are the same. I'm a bit concerned about the impact this may have on performance so I'm going to run some benchmarks.

I will also run benchmarks to watch for issues with the Unicode conversion costs related to this.
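For readers who want to picture the debug-build self-check, here is a minimal sketch of the set-then-read-back idea. It is not the code from this PR: `set_and_verify_thread_name` and `ThreadNameCheck` are hypothetical names, the lookup is done inline for brevity, and the non-Windows branch exists only so the sketch compiles elsewhere.

```cpp
#include <cwchar>

#ifdef _WIN32
#include <windows.h>
// Pointer types matching the documented signatures of the two functions.
typedef HRESULT (WINAPI *SetThreadDescriptionFn)(HANDLE, PCWSTR);
typedef HRESULT (WINAPI *GetThreadDescriptionFn)(HANDLE, PWSTR*);
#endif

// Hypothetical result type for this sketch.
enum class ThreadNameCheck { NotChecked, Mismatch, Verified };

// Sets the current thread's description, reads it back and compares.
// Returns NotChecked when the API is missing or any call fails.
ThreadNameCheck set_and_verify_thread_name(const wchar_t* name) {
#ifdef _WIN32
  HMODULE kernelbase = LoadLibraryW(L"kernelbase.dll");
  if (kernelbase == nullptr) return ThreadNameCheck::NotChecked;

  SetThreadDescriptionFn set_fn = reinterpret_cast<SetThreadDescriptionFn>(
      GetProcAddress(kernelbase, "SetThreadDescription"));
  GetThreadDescriptionFn get_fn = reinterpret_cast<GetThreadDescriptionFn>(
      GetProcAddress(kernelbase, "GetThreadDescription"));
  if (set_fn == nullptr || get_fn == nullptr) return ThreadNameCheck::NotChecked;

  HANDLE self = GetCurrentThread();
  if (FAILED(set_fn(self, name))) return ThreadNameCheck::NotChecked;

  PWSTR read_back = nullptr;
  if (FAILED(get_fn(self, &read_back)) || read_back == nullptr) {
    return ThreadNameCheck::NotChecked;
  }
  bool same = (std::wcscmp(name, read_back) == 0);
  LocalFree(read_back);  // the returned description buffer is released with LocalFree
  return same ? ThreadNameCheck::Verified : ThreadNameCheck::Mismatch;
#else
  (void)name;  // no such API here; the sketch only compiles on this platform
  return ThreadNameCheck::NotChecked;
#endif
}
```

A caller in a debug build could assert that the result is never `Mismatch`, while treating `NotChecked` as "fall back to the older mechanism".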

The logging strategy is as follows:

  • info: show whether the new API is available or not
  • debug: report failures that are ignored (as we fall back to the debugger mechanism)
  • trace: report successes for full tracking
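The three levels above could be mapped roughly as in the sketch below. This is only an illustration: `LogLevel`, `LogEvent`, `log_api_lookup` and `log_set_result` are hypothetical names and the message texts are made up; the actual change uses HotSpot's own logging framework, which is not reproduced here.

```cpp
#include <string>

// Hypothetical stand-ins for the three log levels named in the description.
enum class LogLevel { Info, Debug, Trace };

struct LogEvent {
  LogLevel level;
  std::string message;
};

// Reported once at startup: is the new API present on this system?
LogEvent log_api_lookup(bool api_available) {
  return { LogLevel::Info,
           api_available ? "SetThreadDescription API is available"
                         : "SetThreadDescription API is not available" };
}

// Reported per call: successes are trace-only, ignored failures are debug.
LogEvent log_set_result(bool succeeded, const std::string& thread_name) {
  if (succeeded) {
    return { LogLevel::Trace, "set thread name to '" + thread_name + "'" };
  }
  return { LogLevel::Debug,
           "setting thread name '" + thread_name +
           "' failed; falling back to the debugger mechanism" };
}
```

Keeping successes at trace level means normal runs stay quiet, while a failure that is silently worked around can still be found at debug level.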

Testing:

  • internal self-verification in debug builds as previously described
  • verified the logging output on different Windows systems that have, and don't have, the new API
  • sanity testing for tiers 1-3

Thanks,
David


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8238649: Call new Win32 API SetThreadDescription in os::set_native_thread_name

Reviewers

Contributors

  • Markus GaisBauer <markus.gaisbauer@dynatrace.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4297/head:pull/4297
$ git checkout pull/4297

Update a local copy of the PR:
$ git checkout pull/4297
$ git pull https://git.openjdk.java.net/jdk pull/4297/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4297

View PR using the GUI difftool:
$ git pr show -t 4297

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4297.diff

@bridgekeeper

bridgekeeper bot commented Jun 2, 2021

👋 Welcome back dholmes! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@dholmes-ora dholmes-ora changed the title from "8238649: Call new Win32 API SetThreadDescription in os::set_native_thread_name()" to "8238649: Call new Win32 API SetThreadDescription in os::set_native_thread_name" Jun 2, 2021
@openjdk

openjdk bot commented Jun 2, 2021

@dholmes-ora The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jun 2, 2021
@dholmes-ora
Member Author

dholmes-ora commented Jun 2, 2021

/contributor add Markus GaisBauer markus.gaisbauer@dynatrace.com

@openjdk

openjdk bot commented Jun 2, 2021

@dholmes-ora
Contributor Markus GaisBauer <markus.gaisbauer@dynatrace.com> successfully added.

@dholmes-ora dholmes-ora marked this pull request as ready for review Jun 2, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 2, 2021
@mlbridge

mlbridge bot commented Jun 2, 2021

Webrevs

@dholmes-ora
Member Author

dholmes-ora commented Jun 2, 2021

I've checked the performance impact and there are no regressions across a set of sanity footprint and performance benchmarks in product mode.

For fastdebug there is a very slight regression in footprint, but that seems likely to be a false result and is not important for fastdebug anyway.

@dholmes-ora
Member Author

dholmes-ora commented Jun 4, 2021

I wonder if I could entice any of our Microsoft contributors to give this a technical review (even if not Reviewers) - @luhenry ?

Thanks

@mlbridge

mlbridge bot commented Jun 11, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

Trying again to get some attention on this PR. :)

Thanks,
David

On 4/06/2021 11:06 am, David Holmes wrote:

Member

@tstuefe tstuefe left a comment

Hi David,

I think the original code was one of the first contributions I made to OpenJDK: https://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015528.html

I would not call it "weird" though. At that point it was the only available and officially documented way to set thread names on Windows.

I am not sure if the error check is worth the complexity, especially since it only kind of checks itself (we may just set and read an invisible variable for all we know). We had no such checks for the old version, nor for the Linux implementation.

I read https://docs.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2019. I believe the documentation is incorrect insofar as the old version worked for WinDbg too, not only VS; I tested this in 2014.

Cheers, Thomas

@@ -885,8 +885,62 @@ uint os::processor_id() {
return (uint)GetCurrentProcessorNumber();
}

// For dynamic lookup of SetThreadDescription API
typedef HRESULT (WINAPI *SetThreadDescriptionFnPtr)(HANDLE, PCWSTR);
typedef HRESULT (WINAPI *GetThreadDescriptionFnPtr)(HANDLE, PWSTR*);
Member

@tstuefe tstuefe Jun 11, 2021

DEBUG_ONLY?

Member Author

@dholmes-ora dholmes-ora Jun 11, 2021

As it is only a typedef I didn't think the DEBUG_ONLY ugliness was warranted.

Member

@tstuefe tstuefe Jun 11, 2021

Hmm, I find it inconsistent, but it's also not important. Okay.

-1, // null-terminated
thread_name,
-1 // null-terminated
);
Member

@tstuefe tstuefe Jun 11, 2021

Why not just use wcscmp? (WCHAR == wchar_t, and also you use %ls below which implies wchar_t, so you may use wcscmp here too).

Member Author

@dholmes-ora dholmes-ora Jun 11, 2021

Simply because I wasn't aware of it. I'm not at all familiar with Unicode/string APIs on Windows and just looked around for something that would seem to do the job. All these types seem to be wchar_t in some form under the covers.

Member

@tstuefe tstuefe Jun 11, 2021

Yes, it's not that well documented. But we fall back to wcs... APIs in a couple of places. But if your code works there is no need to change it.
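For reference, the `wcscmp` alternative raised in this thread would look roughly like the sketch below. `same_thread_name` is a hypothetical helper, not a function from the patch, and it assumes both strings are null-terminated `wchar_t` buffers as discussed above.

```cpp
#include <cwchar>

// Hypothetical helper: exact, ordinal comparison of two wide strings.
// Two null pointers compare equal; one null pointer never matches.
bool same_thread_name(const wchar_t* expected, const wchar_t* actual) {
  if (expected == nullptr || actual == nullptr) {
    return expected == actual;
  }
  return std::wcscmp(expected, actual) == 0;
}
```

Unlike a locale-aware comparison, `wcscmp` compares code units one by one, which is arguably what a set-then-read-back check wants.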

@tstuefe
Member

tstuefe commented Jun 11, 2021

Another issue with the check just occurred to me: there may be an undocumented limit on the thread name length. There is one on Linux. That would mean the check needs to do a substring comparison.
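If such a limit did exist, the check could accept a truncated read-back along these lines. This is a sketch under that assumption only: `matches_possibly_truncated` is a hypothetical name, and nothing in the Windows documentation cited in this PR says that names are truncated.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper: true if `actual` equals `requested`, or is a
// non-empty leading prefix of it (as it would be after truncation).
bool matches_possibly_truncated(const wchar_t* requested, const wchar_t* actual) {
  std::size_t requested_len = std::wcslen(requested);
  std::size_t actual_len = std::wcslen(actual);
  if (actual_len > requested_len) return false;           // cannot be a prefix
  if (actual_len == 0) return requested_len == 0;         // empty matches only empty
  return std::wcsncmp(requested, actual, actual_len) == 0;
}
```

The empty-string case is handled separately so that a read-back that returned nothing is not mistaken for a valid truncation.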

@mlbridge

mlbridge bot commented Jun 11, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

Hi Thomas,

Thanks for taking a look at this.

On 11/06/2021 4:05 pm, Thomas Stuefe wrote:

Hi David,

I think the original code was one of the first contributions I made to OpenJDK: https://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015528.html

I would not call it "weird" though. At that point it was the only available and officially documented way to set thread names on Windows.

To be clear it isn't our use of this mechanism that is being labelled
"weird" but the actual win32 mechanism itself. :)

I am not sure if the error check is worth the complexity, especially since it only kind of checks itself (we may just set and read an invisible variable for all we know). We had no such checks for the old version, nor for the Linux implementation.

For the old version and other platforms we can easily check the result
using external tools, like debuggers and process viewers. But that is
not the case for this new API. While the newer VS versions support it I
don't think there would be a way to know that we are observing use of
the old or the new API for certain (and I don't have direct access to VS
anyway). Hence the attempt to at least sanity check - though you are
right I may not actually have set the correct name in the first place. I
thought about comparing with the passed in char* but the unicode
gymnastics was more than I could stomach. :)
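For what it is worth, comparing the read-back against the original `char*` could be sketched as below. The assumptions are mine: the incoming name is treated as UTF-8, `widen_name` and `read_back_matches` are hypothetical helpers, and the non-Windows branch is only a stand-in (it goes through the current C locale) so that the sketch compiles off Windows.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: widen a narrow thread name to a wide string.
std::wstring widen_name(const char* name) {
#ifdef _WIN32
  // First call asks for the required length, including the terminator.
  int n = MultiByteToWideChar(CP_UTF8, 0, name, -1, nullptr, 0);
  if (n <= 0) return std::wstring();
  std::wstring out(static_cast<std::size_t>(n), L'\0');
  MultiByteToWideChar(CP_UTF8, 0, name, -1, &out[0], n);
  out.resize(static_cast<std::size_t>(n) - 1);  // drop the terminator
  return out;
#else
  // Stand-in only: converts using the current C locale.
  std::size_t n = std::mbstowcs(nullptr, name, 0);
  if (n == static_cast<std::size_t>(-1)) return std::wstring();
  std::wstring out(n, L'\0');
  std::mbstowcs(&out[0], name, n);
  return out;
#endif
}

// Compares the name originally requested with what was read back.
bool read_back_matches(const char* requested, const wchar_t* read_back) {
  if (requested == nullptr || read_back == nullptr) return false;
  return widen_name(requested) == read_back;
}
```

This still converts with the same routine used when setting the name, so it would not catch a conversion bug shared by both paths.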

I'll respond to other comments in the PR UI.

Thanks,
David

@mlbridge

mlbridge bot commented Jun 11, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

On 11/06/2021 4:21 pm, Thomas Stuefe wrote:

Another issue with the check just occurred to me: there may be an undocumented limit on the thread name length. There is one on Linux. That would mean the check needs to do a substring comparison.

I'm assuming that if no limitations are documented then they don't
exist. AFAIK the debugger hook mechanism has no limits so I don't see
why this should either.

Cheers,
David

Member

@luhenry luhenry left a comment

Looks good.

@@ -4311,6 +4367,24 @@ jint os::init_2(void) {
jdk_misc_signal_init();
}

// Lookup SetThreadDescription - the docs state we must use runtime-linking of
// kernelbase.dll, so that is what we do.
HINSTANCE _kernelbase = LoadLibrary(TEXT("kernelbase.dll"));
Member

@luhenry luhenry Jun 11, 2021

kernel32.dll would be the better dll to look into (per https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription and the .NET source code).

@mlbridge

mlbridge bot commented Jun 11, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

On 11/06/2021 6:31 pm, Ludovic Henry wrote:

Looks good.

Thanks for reviewing it!

src/hotspot/os/windows/os_windows.cpp line 4372:

4370: // Lookup SetThreadDescription - the docs state we must use runtime-linking of
4371: // kernelbase.dll, so that is what we do.
4372: HINSTANCE _kernelbase = LoadLibrary(TEXT("kernelbase.dll"));

`kernel32.dll` would be the better dll to look into (per https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription and the .NET source code).

Those docs state:

"Windows Server 2016, Windows 10 LTSB 2016 and Windows 10 version 1607:
SetThreadDescription is only available by Run Time Dynamic Linking in
KernelBase.dll."

so if I use kernel32 does that mean it will fail on the above Windows
versions? Is this suggesting I need to use a different dll depending on
the actual Windows version?

Thanks,
David

Member

@tstuefe tstuefe left a comment

Hi David, I am fine with your change in its current form (if you want to reshape it, I'll look again). I see no other concerns.

@openjdk

openjdk bot commented Jun 11, 2021

@dholmes-ora This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8238649: Call new Win32 API SetThreadDescription in os::set_native_thread_name

Co-authored-by: Markus GaisBauer <markus.gaisbauer@dynatrace.com>
Reviewed-by: stuefe, luhenry

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 203 new commits pushed to the master branch:

  • 49112fa: 8265909: build.tools.dtdbuilder.DTDBuilder.java failed detecting missing path of dtd_home
  • 94d0b0f: 8268565: runtime/records/RedefineRecord.java should be run in driver mode
  • df65237: 8267930: Refine code for loading hsdis library
  • 2e900da: 8268574: ProblemList tests failing due to UseBiasedLocking going away
  • 4fd2a14: 8267556: Enhance class paths check during runtime
  • 8c8422e: 8267893: Improve jtreg test failure handler do get native/mixed stack traces for cores and live processes
  • 1e1039a: 8268223: Problemlist vmTestbase/nsk/jdi/HiddenClass/events/events001.java
  • 78cb677: 8268539: several serviceability/sa tests should be run in driver mode
  • 7267227: 8268361: Fix the infinite loop in next_line
  • b018c45: 8267630: Start of release updates for JDK 18
  • ... and 193 more: https://git.openjdk.java.net/jdk/compare/379376f0783facba93e1d11db9b184ef8183a13b...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 11, 2021
@luhenry
Member

luhenry commented Jun 12, 2021

Let’s go with kernelbase.dll then since it’s a superset of kernel32.dll.
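As an aside on the "different DLL depending on the Windows version" question, a lookup that tries both libraries might be sketched as follows. This is not what the PR does (it looks in kernelbase.dll only); `ThreadNameApi` and `lookup_set_thread_description` are hypothetical names, and the non-Windows typedef is a placeholder so the sketch compiles elsewhere.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
typedef HRESULT (WINAPI *SetThreadDescriptionFn)(HANDLE, PCWSTR);
#else
// Placeholder type so the sketch compiles on other platforms.
typedef long (*SetThreadDescriptionFn)(void*, const wchar_t*);
#endif

// Hypothetical result: the resolved function and the DLL it came from.
struct ThreadNameApi {
  SetThreadDescriptionFn set_fn;  // nullptr when the API is unavailable
  const char* source_dll;         // nullptr when nothing was resolved
};

// Tries kernelbase.dll first, then kernel32.dll, and stops at the first
// library that exports SetThreadDescription.
ThreadNameApi lookup_set_thread_description() {
  ThreadNameApi api = { nullptr, nullptr };
#ifdef _WIN32
  static const char* const candidates[] = { "kernelbase.dll", "kernel32.dll" };
  for (const char* dll : candidates) {
    HMODULE module = LoadLibraryA(dll);
    if (module == nullptr) continue;
    FARPROC proc = GetProcAddress(module, "SetThreadDescription");
    if (proc != nullptr) {
      api.set_fn = reinterpret_cast<SetThreadDescriptionFn>(proc);
      api.source_dll = dll;
      return api;  // keep the module loaded; the pointer must stay valid
    }
    FreeLibrary(module);
  }
#endif
  return api;
}
```

If kernelbase.dll really does export everything kernel32.dll does, the second candidate is never reached, which is consistent with looking in kernelbase.dll alone.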

@dholmes-ora
Member Author

dholmes-ora commented Jun 15, 2021

Thanks again for the reviews.

/integrate

@openjdk

openjdk bot commented Jun 15, 2021

Going to push as commit 9f3c7e7.
Since your change was applied there have been 227 commits pushed to the master branch:

  • 2e70bc3: 8268626: Remove native pre-jdk9 support for jtreg failure handler
  • e59acd9: 8268699: Shenandoah: Add test for JDK-8268127
  • 17295b1: Merge
  • b318535: 8267579: Thread::cooked_allocated_bytes() hits assert(left >= right) failed: avoid underflow
  • fe48ea9: 8268342: java/foreign/channels/TestAsyncSocketChannels.java fails with "IllegalStateException: This segment is already closed"
  • 6171ae4: 8268630: ProblemList serviceability/jvmti/CompiledMethodLoad/Zombie.java on linux-aarch64
  • 01054e6: 8268470: CDS dynamic dump asserts with JFR RecordingStream
  • e39346e: 8268093: Manual Testcase: "sun/security/krb5/config/native/TestDynamicStore.java" Fails with NPE
  • cce8da2: 8268602: a couple runtime/os tests don't check exit code
  • da043e9: 8268555: Update HttpClient tests that use ITestContext to jtreg 6+1
  • ... and 217 more: https://git.openjdk.java.net/jdk/compare/379376f0783facba93e1d11db9b184ef8183a13b...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Jun 15, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated labels Jun 15, 2021
@openjdk

openjdk bot commented Jun 15, 2021

@dholmes-ora Pushed as commit 9f3c7e7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jun 15, 2021
@dholmes-ora dholmes-ora deleted the 8238649 branch Jun 15, 2021
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
3 participants