Fix 23978 - ICE: dip1021 memory corruption #15302
Thanks for your pull request and interest in making D better, @dkorpel! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "stable + dmd#15302" |
Honestly we should just deprecate dip1021, it's a broken design with a poor implementation. There's not even a migration path needed: all it did was add errors that don't accomplish anything, so it could just be treated as a no-op. |
To prevent this class of bugs in the future, would it make sense to wrap all |
I think |
Ubuntu 22.04 x64, GDC fails on
I suspect one of the |
Kill it |
I just noticed that xrealloc is aware of the GC:
static void* xrealloc(void* p, size_t size) pure nothrow
{
    if (isGCEnabled)
        return GC.realloc(p, size);
But the documentation of GC.realloc says: "If p was not allocated by the GC, points inside a block, or is null, no action will be taken." If that's true, how would it have ever begun to allocate an array when combining -preview=dip1021 and -lowmem? I think the documentation is incorrect and it allocated anyway, but the block is not being scanned. |
Yes, which is why I'm not sure whether this is correct. |
compiler/src/dmd/escape.d
// Clear the new section
memset(newPtr + escapeBy.length, 0, (len - escapeBy.length) * EscapeBy.sizeof);
escapeBy = newPtr[0 .. len];
escapeBy.length = len;
_d_arraysetlength extends the array by using allocate+copy without marking the old data as "free" in the GC.
So this is no different from reimplementing xrealloc to allocate and copy inline.
--- a/compiler/src/dmd/root/rmem.d
+++ b/compiler/src/dmd/root/rmem.d
@@ -71,7 +71,18 @@ extern (C++) struct Mem
     static void* xrealloc(void* p, size_t size) pure nothrow
     {
         if (isGCEnabled)
-            return GC.realloc(p, size);
+        {
+            if (auto p2 = GC.malloc(size))
+            {
+                const psize = GC.sizeOf(p);
+                if (psize < size)
+                    size = psize;
+                //p2[0 .. size] = p[0 .. size];
+                memcpy(p2, p, size);
+                return p2;
+            }
+            return null;
+        }
         if (!size)
         {
[The druntime function is probably a bit smarter than that and tries to reuse the existing block.]
The memset-to-zero is missing now, so the existing elements would be reused in the loop below, that's probably problematic. Edit: Oh sorry, it previously just memset the newly allocated elements, all good.
It does seem to be related to the |
Not beyond the pointer dump I hacked together and posted an abridged version of (the 14MB file) in the issue. Running it through valgrind might help locate who is referencing a given memory address. I have my doubts it'd be capable though. |
Maybe the failure is more reproducible if the reset-loop at the end of the function actually clears the arrays in EscapeByResults (it currently only resets the length) and also clears EscapeBy.param. |
I suspect the test is more reproducible if you insert |
The switch to dmd/compiler/src/dmd/root/rmem.d Lines 237 to 244 in f2b3b97
-lowmem ), all class instances in the compiler are now allocated by the (disabled) GC, not with the bump-pointer scheme anymore.
|
That's a good point about rmem, it seems like dmd now only half-overrides druntime's allocating functions. |
Yep, good call. I completely overlooked that. Both I'd be more inclined to remove all uses of both these |
As mentioned in the bug report, I've finally tracked it down. The data being Naturally then, the Long story short. Don't call |
|
The short answer is no. Why? Because only The |
Not sure I follow, I'm talking in general about this unmovable dmd/compiler/src/dmd/root/array.d Lines 52 to 57 in f2b3b97
Array field. It's a real bad boy.
|
GC array expansions won't explicitly mark the old memory as "free" though. It still leaves that up for the next garbage collection run to clear that up. And the GC won't do that because it still sees live references via the new expanded memory. |
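To make the point about GC array expansion concrete, here is a minimal sketch in D. It assumes the default druntime GC, and the helper name oldSliceSurvivesGrowth is made up for illustration. Growing .length may extend the block in place or move the data to a new block; either way, a slice taken before the growth stays valid because nothing frees the old block explicitly.

```d
// Hypothetical helper (not part of dmd): checks that a slice taken before
// growing a GC array is still usable afterwards.
bool oldSliceSurvivesGrowth()
{
    int[] arr = new int[4];
    arr[0] = 7;
    int[] old = arr;      // second reference to the original block

    arr.length = 1 << 20; // large growth: the runtime may extend in place or move the data
    arr[0] = 8;

    // Either the block was extended in place (old aliases arr and sees the 8),
    // or the data moved and old still holds the untouched original (7).
    return old.length == 4 && (old.ptr is arr.ptr || old[0] == 7);
}

void main()
{
    assert(oldSliceSurvivesGrowth());
}
```

The old block only becomes collectable once nothing references it anymore, which is the behaviour described in the comment above.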
Yeah okay, but then later we get the |
No, we don't. Because the |
[Well, we did here for CI with the intermediate commit where Dennis extended the GC array length. And now don't anymore with the dropped cache, and allocating a fresh new array each time now (for |
Ah, I didn't see that intermediate commit, was it perhaps using |
Yes, you commented on it ;) - #15302 (comment) |
Thank you Iain, Dennis, et al. for the debugging and fix. Amazing. I submitted the original bug report to gdc; Iain minimized it further and did extensive debugging.
I didn't actually intend to use dip1021, but I used |
…e::lookup

Backports patch from upstream dmd mainline for fixing PR110113.

The data being Mem.xrealloc'd contains many Array(T) fields, some of which have self references in their data.ptr field thanks to the smallarray optimization used by Array. Naturally then, the memcpy from old GC data to new retains those self referenced addresses, and the GC marks the old data as "free". Some time later GC.malloc will return a pointer to said "free" data. So now we have two GC references to the same memory. One that is treating the data as an Array(VarDeclaration) in dmd.escape.escapeByStorage, and the other as an AA in the symtab of a dmd.dsymbol.ScopeDsymbol.

Fix this memory corruption by not storing the data in a global variable for reuse. If there are no more live references, the GC will free it.

PR d/110113

gcc/d/ChangeLog:
* dmd/escape.d (checkMutableArguments): Always allocate new buffer for computing escapeBy.

gcc/testsuite/ChangeLog:
* gdc.test/compilable/test23978.d: New test.

Reviewed-on: dlang/dmd#15302
(cherry picked from commit ae3a4ce)
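For readers who want to see the self-reference problem in isolation, here is a minimal sketch in D. SmallArray is a hypothetical stand-in for dmd's Array(T), not the real type, and plain malloc plus memcpy stands in for a moving realloc. It only shows that a byte-wise copy keeps the pointer aimed at the old block's inline storage; it does not reproduce the GC reuse that follows.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Hypothetical stand-in for dmd's Array(T): a couple of elements live inline,
// and `data` points at that inline storage until the array outgrows it.
struct SmallArray
{
    int[] data;
    int[2] small;

    void useInlineStorage()
    {
        data = small[]; // self reference: data.ptr points into this very struct
    }
}

// Returns true if a byte-wise copy of a SmallArray still points into the
// block it was copied from (hypothetical helper, for illustration only).
bool movedCopyAliasesOldBlock()
{
    auto oldBlock = cast(SmallArray*) malloc(SmallArray.sizeof);
    auto newBlock = cast(SmallArray*) malloc(SmallArray.sizeof);
    if (oldBlock is null || newBlock is null)
    {
        free(oldBlock);
        free(newBlock);
        return false;
    }

    *oldBlock = SmallArray.init;
    oldBlock.useInlineStorage();

    // Emulate a moving realloc: allocate a new block and copy the raw bytes.
    memcpy(newBlock, oldBlock, SmallArray.sizeof);

    // The copy's data.ptr was not fixed up, so it still refers to the old block.
    const aliasesOld = newBlock.data.ptr is oldBlock.small.ptr
        && newBlock.data.ptr !is newBlock.small.ptr;

    free(newBlock);
    free(oldBlock);
    return aliasesOld;
}

void main()
{
    assert(movedCopyAliasesOldBlock());
}
```

Once the old block is handed out again by the allocator, the stale pointer and the new owner refer to the same memory, which matches the corruption described in the commit message.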
@ibuclaw It's not actually a regression, -preview=dip1021 has been broken from the start. It just happened to manifest in the report's piece of code after the _d_newclassT PR.
I don't want to add a slow test for this, so I added a comment instead of PERMUTE_ARGS to run it 256 times.