8333639: ubsan: cppVtables.cpp:81:55: runtime error: index 14 out of bounds for type 'long int [1]' #19623
👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into |
@TheRealMDoerr This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 83 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@TheRealMDoerr The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
This fixes the ubsan warning. |
Yes, it's a flexible array member. I'm not changing the allocation. |
It seems to break the Windows build.
It looks like the MSVC compiler is 'thinking' the same as I was thinking :-) . |
I thought flexible array members were a C-only thing. I did something along the lines of this when I was experimenting with UBsan. Not sure if it is any better, but it does not use language extensions. Not sure if it is ok to look beyond the object through a
diff --git a/src/hotspot/share/cds/cppVtables.cpp b/src/hotspot/share/cds/cppVtables.cpp
index c339ce9c0de..55332dc484e 100644
--- a/src/hotspot/share/cds/cppVtables.cpp
+++ b/src/hotspot/share/cds/cppVtables.cpp
@@ -66,19 +66,19 @@
 class CppVtableInfo {
   intptr_t _vtable_size;
-  intptr_t _cloned_vtable[1];
+  intptr_t _cloned_vtable;
 public:
   static int num_slots(int vtable_size) {
     return 1 + vtable_size; // Need to add the space occupied by _vtable_size;
   }
   int vtable_size() { return int(uintx(_vtable_size)); }
   void set_vtable_size(int n) { _vtable_size = intptr_t(n); }
-  intptr_t* cloned_vtable() { return &_cloned_vtable[0]; }
-  void zero() { memset(_cloned_vtable, 0, sizeof(intptr_t) * vtable_size()); }
+  intptr_t* cloned_vtable() { return &_cloned_vtable; }
+  void zero() { memset(&_cloned_vtable, 0, sizeof(intptr_t) * vtable_size()); }
   // Returns the address of the next CppVtableInfo that can be placed immediately after this CppVtableInfo
   static size_t byte_size(int vtable_size) {
     CppVtableInfo i;
-    return pointer_delta(&i._cloned_vtable[vtable_size], &i, sizeof(u1));
+    return pointer_delta(&i.cloned_vtable()[vtable_size], &i, sizeof(u1));
   }
 };
|
Ah, flexible array members are C99, but not C++. GCC supports it, but obviously not your MSVC. |
@xmas92: Thanks! I have implemented a similar emulation for "flexible array members". Not sure which one is better. |
I like yours because it does not look beyond the object through a pointer into the object. It instead creates a pointer beyond the object and uses that. As an attached storage. Just like always, need to be careful with alignment and padding when adding storage beyond the objects representation. But everything is |
Thanks for your review! |
Hi Martin, ubsan (on my Linux x86_64 test machine) is still happy with your latest patch. |
There are a number of "fake" VLA usages in HotSpot. Some of them have come up in recent ubsan cleanups for similar |
@kimbarrett: Thanks for taking a look! It makes sense to unify all VLA emulations. The implementation in |
I think it's not UB. Or if it is, then I don't see how the mechanism used in the current version of this change isn't |
The implementation in |
They are different.
This approach was recently used to fix an identical ubsan issue:
|
Got it, thanks! I had missed that BufferNode uses an address computation instead of array member access. So, it's basically the same trick as I was using.
I've updated my implementation to use |
-  void zero() { memset(_cloned_vtable, 0, sizeof(intptr_t) * vtable_size()); }
+  // Using _cloned_vtable[i] for i > 0 causes undefined behavior. We use address calculation instead.
+  intptr_t* cloned_vtable() { return (intptr_t*)((char*)this + offset_of(CppVtableInfo, _cloned_vtable)); }
+  void zero() { memset(cloned_vtable(), 0, sizeof(intptr_t) * vtable_size()); }
   // Returns the address of the next CppVtableInfo that can be placed immediately after this CppVtableInfo
The description of this function is wrong, as it returns an offset rather than
an address.
It returns a pointer which is computed by base + offset. I've factored out the offset computation.
The comment says byte_size() returns an address, but it actually returns a size_t offset.
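For readers following this thread, here is a minimal sketch of the address-calculation accessor, assuming a simplified stand-in for the class. `VtableInfoSketch`, `slots_offset()` and `fill_and_sum()` are hypothetical names, and standard `offsetof` replaces HotSpot's `offset_of`. The payload is reached as base plus offset instead of by indexing the one-element array past 0, which is what the bounds check objected to. Whether this is strictly conforming C++ is still debated later in the review.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for CppVtableInfo; the names are illustrative only.
struct VtableInfoSketch {
  intptr_t _size;
  intptr_t _slots[1];  // Pseudo flexible array member, never indexed past 0.

  // Byte offset of the payload from the start of the object.
  static size_t slots_offset() { return offsetof(VtableInfoSketch, _slots); }

  // Base-plus-offset address computation instead of &_slots[i].
  intptr_t* slots() {
    return reinterpret_cast<intptr_t*>(reinterpret_cast<char*>(this) + slots_offset());
  }

  void zero() { std::memset(slots(), 0, sizeof(intptr_t) * static_cast<size_t>(_size)); }
};

// Allocates room for n slots, fills slot i with i, and returns the sum of all slots.
intptr_t fill_and_sum(int n) {
  size_t bytes = VtableInfoSketch::slots_offset() + sizeof(intptr_t) * static_cast<size_t>(n);
  if (bytes < sizeof(VtableInfoSketch)) bytes = sizeof(VtableInfoSketch);
  void* mem = std::malloc(bytes);
  if (mem == nullptr) return -1;
  VtableInfoSketch* info = new (mem) VtableInfoSketch;
  info->_size = n;
  info->zero();
  intptr_t* p = info->slots();
  intptr_t sum = 0;
  for (int i = 0; i < n; i++) {
    p[i] = i;
    sum += p[i];
  }
  std::free(mem);
  return sum;
}
```

Calling `fill_and_sum(15)` touches index 14, the index named in the ubsan report, through the computed pointer rather than through the array member.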
src/hotspot/share/cds/cppVtables.cpp
Outdated
   // Returns the address of the next CppVtableInfo that can be placed immediately after this CppVtableInfo
   static size_t byte_size(int vtable_size) {
     CppVtableInfo i;
-    return pointer_delta(&i._cloned_vtable[vtable_size], &i, sizeof(u1));
+    return pointer_delta(&i.cloned_vtable()[vtable_size], &i, sizeof(u1));
Rather than making a dummy CppVtableInfo and doing pointer arithmetic, better would be something like
offset_of(CppVtableInfo, _cloned_vtable) + (sizeof(intptr_t) * vtable_size)
It might be that some of the subexpressions of that should be broken out into helper functions that can also be used in clone_vtable() and zero().
Also, the really paranoid might align_up that to alignof(CppVtableInfo). Currently that's a nop. Up to you.
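A rough sketch of that suggestion, under stated assumptions: `InfoLayoutSketch`, `align_up_sketch()` and the free functions below are hypothetical names, standard `offsetof` stands in for `offset_of`, and the alignment helper only handles power-of-two alignments. It computes the size as header offset plus payload bytes, without creating a dummy object.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in with the same field layout as the class under review.
struct InfoLayoutSketch {
  intptr_t _size;
  intptr_t _slots[1];  // Pseudo flexible array member.
};

// Assumed helper: rounds value up to a power-of-two alignment.
constexpr size_t align_up_sketch(size_t value, size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Byte offset of the payload from the start of the object.
constexpr size_t slots_offset() { return offsetof(InfoLayoutSketch, _slots); }

// Size in bytes of a header followed by vtable_size slots. This is an offset, not an address.
constexpr size_t byte_size(size_t vtable_size) {
  return align_up_sketch(slots_offset() + sizeof(intptr_t) * vtable_size,
                         alignof(InfoLayoutSketch));
}

// With exactly one slot the computed size matches the declared struct.
static_assert(byte_size(1) == sizeof(InfoLayoutSketch), "one slot equals the declared size");
```

Because every term is already a multiple of `sizeof(intptr_t)`, the alignment step changes nothing for this layout, which matches the remark that it is currently a nop.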
Right. The pointer_delta variant was not so nice.
gcc doc calls these VLAs, but C99 calls them FAM. FAM is fine; I'm just used to the older VLA terminology. |
Some more nits or pre-existing issues.
src/hotspot/share/cds/cppVtables.cpp
Outdated
@@ -66,19 +66,20 @@
 class CppVtableInfo {
   intptr_t _vtable_size;
-  intptr_t _cloned_vtable[1];
+  intptr_t _cloned_vtable[1]; // Pseudo flexible array member.
+  static size_t cloned_vtable_offs() { return offset_of(CppVtableInfo, _cloned_vtable); }
I'd really prefer spelling out "offset" rather than saving two characters with the "offs" abbreviation.
Sure. I don't mind.
@@ -66,19 +66,20 @@
 class CppVtableInfo {
   intptr_t _vtable_size;
-  intptr_t _cloned_vtable[1];
+  intptr_t _cloned_vtable[1]; // Pseudo flexible array member.
+  static size_t cloned_vtable_offs() { return offset_of(CppVtableInfo, _cloned_vtable); }
 public:
   static int num_slots(int vtable_size) {
     return 1 + vtable_size; // Need to add the space occupied by _vtable_size;
   }
   int vtable_size() { return int(uintx(_vtable_size)); }
There's a bunch of pre-existing weirdness around the type of _vtable_size. (I think every use involves a
conversion.) Doing anything about that doesn't really belong in this change, but consider a followup cleanup.
Right. Please note that I usually don't touch code in this area. If you would like it to get improved, I suggest filing an RFE and discussing with the CDS folks. My intention is to get rid of UB which is terrible.
src/hotspot/share/cds/cppVtables.cpp
Outdated
@@ -66,19 +66,20 @@
 class CppVtableInfo {
   intptr_t _vtable_size;
-  intptr_t _cloned_vtable[1];
+  intptr_t _cloned_vtable[1]; // Pseudo flexible array member.
+  static size_t cloned_vtable_offs() { return offset_of(CppVtableInfo, _cloned_vtable); }
 public:
   static int num_slots(int vtable_size) {
     return 1 + vtable_size; // Need to add the space occupied by _vtable_size;
Pre-existing: Maybe this ought to be byte_size() / sizeof(intptr_t) or something like that? And the name num_slots seems confusing for what this is doing. Do we actually need both byte_size and num_slots?
num_slots is unused. Removed.
Hi Martin, thanks for fixing the issue (btw. I tested with the commit from this morning, ubsan was still 'happy'). |
Looks good, other than pre-existing issues like the description of byte_size.
Those can be addressed later.
This approach seems alright to me.
However I am not sure I understood what was the problem with the initial solution where you make the metadata and the payload disjoint. Nor why the following statement no longer applies.
The current code takes the address of the array member and uses that as the
base of an array access. So it's effectively doing obj._buffer[i] for i > 0.
And that is UB, and ubsan rightly complains.
All of these implementations are now reinterpreting a field of type T[1] as a T[]. It is unclear to me why using offset_of changes anything.
Side note:
The thing I liked with the disjoint approach is that you create two objects, one with the metadata (the length in this case) and one which is the payload allocated with and created next to the metadata. As a form of attached storage. Using something like this:
T* payload() { return reinterpret_cast<T*>(align_up(reinterpret_cast<char*>(this + 1), align_of(T))); }
Which for CppVtableInfo boiled down to &_vtable_size + 1.
It would be nicer to have a more explicit lifetime for these, such that you allocated the memory, get the char* metadata and char* payload addresses and then create the objects new (metadata) MetadataT(...) followed by new (payload) PayloadT(...).
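One possible reading of this side note, as a hedged sketch rather than anything proposed for the patch: `HeaderSketch`, `align_up_size()` and `attached_storage_demo()` are hypothetical names, and plain `malloc` stands in for whatever allocator the real code uses. The metadata and the payload are created as two separate objects in one allocation, each with its own placement new.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical metadata object; the payload lives in separate storage right after it.
struct HeaderSketch {
  intptr_t _length;
};

// Assumed helper: rounds value up to a power-of-two alignment.
constexpr size_t align_up_size(size_t value, size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Creates a header plus n attached payload slots, stores 2 * i in slot i,
// and returns the sum of the stored values.
intptr_t attached_storage_demo(int n) {
  if (n <= 0) return 0;
  const size_t payload_offset = align_up_size(sizeof(HeaderSketch), alignof(intptr_t));
  const size_t bytes = payload_offset + sizeof(intptr_t) * static_cast<size_t>(n);
  char* raw = static_cast<char*>(std::malloc(bytes));
  if (raw == nullptr) return -1;

  // Explicit lifetimes: first the metadata object, then the payload array.
  HeaderSketch* header = new (raw) HeaderSketch{n};
  intptr_t* payload = new (raw + payload_offset) intptr_t[static_cast<size_t>(n)]();

  intptr_t sum = 0;
  for (intptr_t i = 0; i < header->_length; i++) {
    payload[i] = 2 * i;
    sum += payload[i];
  }
  std::free(raw);  // Both types are trivially destructible, so no destructor calls are needed.
  return sum;
}
```

In this layout no access goes through a member of the header, so there is no T[1] field being treated as a longer array.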
Thanks for the reviews and all comments! I also like your side note @xmas92. But, let's ship it. |
Going to push as commit 0199fee.
Your commit was automatically rebased without conflicts. |
@TheRealMDoerr Pushed as commit 0199fee. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
We shouldn't specify a wrong array length which causes undefined behavior. Using a "flexible array member".
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19623/head:pull/19623
$ git checkout pull/19623
Update a local copy of the PR:
$ git checkout pull/19623
$ git pull https://git.openjdk.org/jdk.git pull/19623/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 19623
View PR using the GUI difftool:
$ git pr show -t 19623
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19623.diff