8311993: Test serviceability/sa/UniqueVtableTest.java failed: duplicate vtables detected #20684
Not an expert on this code by any means, but this seems a reasonable way to tackle the problem.
Thanks.
// when requested for decorated "class" or "class*" (i.e. "??_7class@@6B@"/"??_7class*@@6B@").
// As a workaround check if returned symbol contains requested symbol.
ULONG64 disp = 0L;
char buf[512];
s/512/SYMBOL_BUFSIZE/ ?
Thank you, fixed
Why do we rely on GetOffsetByName() when we can already look up directly from the dll, and this is in fact the fallback already in place when GetOffsetByName() fails?
I'm not an expert in this area, but as far as I can see this is caused by the SA agent design: the vtable for a class is treated as an ordinary symbol, and the standard symbol lookup routine is used (on Windows, that is GetOffsetByName, with a search through the DLL's exported symbols as a fallback).
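For readers following along, the lookup order described above can be sketched roughly as below. This is only an illustration, not the SA agent's actual code: the `Lookup` alias and `lookupWithFallback` are hypothetical names, with the primary callable standing in for the WinDbg engine and the fallback standing in for the search through the DLL's exported symbols.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical signature: returns the symbol's address, or 0 when not found.
using Lookup = std::function<std::uint64_t(const std::string&)>;

// Try the primary lookup first (standing in for the WinDbg engine) and
// consult the fallback (standing in for the DLL export table) only on a miss.
std::uint64_t lookupWithFallback(const std::string& symbol,
                                 const Lookup& primary,
                                 const Lookup& fallback) {
    std::uint64_t addr = primary(symbol);
    return addr != 0 ? addr : fallback(symbol);
}
```

Note that with this ordering the fallback is never consulted when the primary lookup returns a wrong but non-zero answer, which is the situation this PR deals with.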
I found this:
I'm not sure what is meant by "take advantage of '.pbp' files". Is it perhaps a more reliable or complete database of symbols, or perhaps it is faster? In any case, I'd be interested in seeing if all our tests still pass when useNativeLookup is false.
As far as I understand .pdb files contain symbol information.
I'm not sure why this is failing. Based on the existing code and comments, it seems at some point SA worked without relying on the windbg native symbol support. Code like isSharingEnabled() probably got added afterwards and was never tested with it disabled.
It seems the
I wonder if the call to
ULONG64 disp = 0L;
char buf[SYMBOL_BUFSIZE];
memset(buf, 0, sizeof(buf));
if (ptrIDebugSymbols->GetNameByOffset(offset, buf, sizeof(buf), 0, &disp) == S_OK) {
Nit: Extra space before S_OK.
Alex's changes neither introduce nor fix the isSharingEnabled() bug. It was just discovered during an investigation to better understand how SA does symbol lookups on Windows. Basically we are trying to understand why it tries both windbg and dll lookups. Currently it won't work with just dll lookups, as the isSharingEnabled() failures show, but if that is the case, how can dll lookups be a reliable fallback when windbg lookups fail?
"UseSharedSpaces" is exported from jvm.dll, but the issue here is that on Windows the symbol lookup searches for decorated symbols (i.e. "??_7UseSharedSpaces@@6B@") and the symbol is exported undecorated.
Update: usually constants are made available to SA through VMStructs; I'm not sure why "UseSharedSpaces" is exported this way.
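The mismatch described in this comment can be illustrated with a small sketch. It assumes the `"??_7" + type + "@@6B@"` pattern quoted in the code comment of this PR; `vtableSymbolFor` and `isExported` are hypothetical helpers, and the `std::set` is only a stand-in for the DLL's export table, not how SA reads it.

```cpp
#include <set>
#include <string>

// Builds the decorated vtable symbol name using the pattern quoted in the PR.
std::string vtableSymbolFor(const std::string& typeName) {
    return "??_7" + typeName + "@@6B@";
}

// Hypothetical stand-in for an export table: exact, case-sensitive match.
bool isExported(const std::set<std::string>& exports, const std::string& name) {
    return exports.count(name) != 0;
}
```

With an export set that contains the plain name `UseSharedSpaces`, a lookup of `vtableSymbolFor("UseSharedSpaces")` misses, while a lookup of the undecorated name succeeds.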
What would happen if it was not extern "C"? Would the windbg lookup succeed in that case?
Actually I'm a bit confused on this. The extern "C" is making it so the dll lookup fails, but is it helping the windbg lookup to succeed? If yes, why did it ever get added if it was always first succeeding with the dll lookup?
Probably the author just didn't know about the VMStruct approach.
Without extern "C" the variable is exported as "?UseSharedSpaces@@3ea". So it wouldn't work with either the DLL lookup or windbg.
Anyway, the failures with disableNativeLookup and UseSharedSpaces are a different issue.
Where do you see runtime flags in vmstructs?
I do not see them :) I was thinking of constants, but they are different.
I can't find any other place where we directly look up a global symbol like this. I did find this other reference to the lookup() API:
So it seems isSharingEnabled() could be using this instead, although that is not solving any of the problems we are seeing with non-native lookups. clhsdb findsym uses this findSymbol() API.
We have one test that does a findsym of MaxJNILocalCapacity (ClhsdbFindPC.java). I'm guessing this would fail if native lookups were disabled. Note findsym was added less than 3 years ago.
Another thing I noticed is that isSharingEnabled() used to be the following:
public boolean isSharingEnabled() {
So it used to use getCommandLineFlag(). It was modified to use lookup() as part of https://bugs.openjdk.org/browse/JDK-8277481, which changed UseSharedSpaces from a command line flag to just a regular global. This change was done 3 years ago.
getCommandLineFlag() seems to get the flag from VMStructs, which references the following:
JVMFlag* JVMFlag::flags = flagTable;
So there is no dll or windbg symbol lookup here, just access to the VMStructs database. It seems then that theoretically the dll lookup should never be needed, but due to the bug this PR is fixing, it is needed as a fallback when the windbg lookup fails to do the right thing. I wonder if there are also other types of symbols we look up that would only be found with the dll lookup.
Changes look good.
The test fails on isSharingEnabled, but I suppose it will fail to find MaxJNILocalCapacity too.
As far as I understand, the dll lookup is needed to get vtables for classes (and I suppose it's the only case).
I thought it odd that we look up symbols including the *, e.g.
Duplicate vtable: 0x00007ffd233633d8:
- CompiledMethod (extends CodeBlob)
- CompiledMethod* (extends null)
And iterating over agent.getTypeDataBase().getTypes(); we see e.g. jbyte, jbyte*, jbyte** which looks odd and unnecessary.
We know what a pointer is, so not needing to look up "jbyte*" might be good also.
But that might be more change than we want to do right now, so this double check looks good.
/integrate
Going to push as commit a7120e2.
Your commit was automatically rebased without conflicts.
@alexmenkov Pushed as commit a7120e2. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
On Windows, the SA agent gets a class vtable from symbols exported from jvm.dll (it exports symbols like "??_7" + type + "@@6B@").
However, the symbol lookup function first asks WinDbg for the symbol.
Sometimes the WinDbg routine IDebugSymbols::GetOffsetByName() returns an offset for both the class and the class pointer types. The returned offsets correspond to symbols like "jvm!class_name::`vftable'".
The behavior is intermittent, and I was not able to find the cause.
The fix adds a workaround for this case: if GetOffsetByName succeeds, we check whether the corresponding symbol contains the requested one.
So it returns the expected offset for non-vtable symbols like "MaxJNILocalCapacity" (GetOffsetByName returns the offset for "jvm!MaxJNILocalCapacity"), but returns 0 for a vtable lookup.
Additionally, added a check for the results of IDebugSymbols::SetImagePath/SetSymbolPath.
Testing: tier1,tier2,hs-tier5-svc
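The containment check described in the summary above can be sketched in portable C++ as follows. This is a simplified illustration rather than the code in the PR: `filterOffset` is a hypothetical helper, and `resolvedName` stands in for the name the debug engine reports for the offset (the real fix obtains it through IDebugSymbols::GetNameByOffset, which is not called here).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: keep the engine's offset only if the name it resolves
// back to contains the requested symbol; otherwise report "not found" as 0.
std::uint64_t filterOffset(const std::string& requested,
                           const std::string& resolvedName,
                           std::uint64_t offset) {
    return resolvedName.find(requested) != std::string::npos ? offset : 0;
}
```

Under this check a request for "MaxJNILocalCapacity" that resolves to "jvm!MaxJNILocalCapacity" keeps its offset, while a decorated vtable request that resolves to a "jvm!class_name::`vftable'" style name is rejected, which leaves the decision to the lookup of symbols exported from jvm.dll.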
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20684/head:pull/20684
$ git checkout pull/20684
Update a local copy of the PR:
$ git checkout pull/20684
$ git pull https://git.openjdk.org/jdk.git pull/20684/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 20684
View PR using the GUI difftool:
$ git pr show -t 20684
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20684.diff