core/exception: use less TLS storage for staticError #3802
Thanks for your pull request, @ljmf00!
Bugzilla references: Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally: If you don't have a local development environment set up, you can use Digger to test this PR: dub run digger -- build "master + druntime#3802"
f1ff6c2 to
874dd53
Compare
Weird, this is side-effecting on |
It should be the maximum of all class instance alignments, for which there's no trait yet ( |
874dd53 to
d9bc37f
Compare
That trait is available in Phobos. I'm going to try to add the correct alignment.
I very much doubt it's correct.
A unittest would IMO perfectly suffice and be simpler (just using the TypeInfos). |
d9bc37f to
1ed5018
Compare
Right, that is considering that the
I don't get it. The problem here is that I can't use TypeInfo at compile time. Or do you mean at the |
Ouch, okay. [A runtime unittest would otherwise have been perfectly fine IMO, just verifying that no class has a greater alignment.] |
Ah, for sanity checking, I understand. Well, I'll put https://issues.dlang.org/show_bug.cgi?id=16508 on my TODO list. I will also file a bug report about the Apart from it, since |
That's surely intended this way, just like |
src/core/exception.d
Outdated
  // TLS storage shared for all errors, chaining might create circular reference
- private align(2 * size_t.sizeof) void[256] _store;
+ private void[alignedStoreSize()] _store;
Replacing proper alignment by increasing the allocation size? This would be a regression for LDC at least.
The problem surely isn't the approximation - for x86_64, it's 16, what the GC guarantees. So if you hit a unittest regression somewhere else after optimizing the TLS size here with DMD, that's probably a sign that another allocation hasn't been properly aligned and just happened to work because of the old round 256 size (or a DMD backend bug).
[And AFAICT, you're still using _store.ptr directly, no manual alignment.]
> Replacing proper alignment by increasing the allocation size? This would be a regression for LDC at least.
> The problem surely isn't the approximation - for x86_64, it's 16, what the GC guarantees. So if you hit a unittest regression somewhere else after optimizing the TLS size here with DMD, that's probably a sign that another allocation hasn't been properly aligned and just happened to work because of the old round 256 size (or a DMD backend bug).
This is pretty much the same as this: https://github.com/ldc-developers/phobos/blob/ldc/std/typecons.d#L8165 . This is doing the alignment by hand.
I don't see the problem with this as it is now. It can be trusted to work unless a custom alignment is set on one of those classes, and to handle that case we need the trait you mentioned above.
Oh wow, that line is 10 years old. Back then, align(N) might have been buggy. There shouldn't be any reason for manual alignments anymore if the alignment is known at compile-time, as in these cases.
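To make the contrast in this comment concrete, here is a minimal sketch of the two approaches: rounding a pointer up by hand inside an over-sized buffer versus letting align(N) place the buffer. The names alignment, payload, rawStore, alignedStart and alignedStore are hypothetical and are not taken from druntime or Phobos.

```d
// Illustrative values only; neither is taken from the PR.
enum size_t alignment = 16;
enum size_t payload = 64;

// Manual approach: over-allocate by (alignment - 1) bytes so that an
// aligned start address is guaranteed to exist inside the buffer.
void[payload + alignment - 1] rawStore;

// Round the buffer's start address up to the next multiple of `alignment`.
void* alignedStart() @nogc nothrow
{
    const addr = cast(size_t) rawStore.ptr;
    return cast(void*) ((addr + alignment - 1) & ~(alignment - 1));
}

// Declarative approach: the compiler is asked to align the storage itself,
// so no extra bytes and no pointer arithmetic are needed.
align(alignment) void[payload] alignedStore;

void main()
{
    // The manually aligned region starts on a 16-byte boundary ...
    assert(cast(size_t) alignedStart() % alignment == 0);
    // ... and still fits entirely inside the over-sized buffer.
    assert(cast(size_t) alignedStart() + payload
        <= cast(size_t) rawStore.ptr + rawStore.length);
    // Holds only if the backend honours align(N) for thread-local data.
    assert(cast(size_t) alignedStore.ptr % alignment == 0);
}
```

The manual variant costs up to 15 extra bytes per buffer here, which is the size increase the review objects to. The last assertion depends on the compiler and target honouring align(N) for TLS variables.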
I thought void[] had some special case. I'll change it then.
Ah, right, I forgot to change it on the emplace side 🤦 . I think I found the side effect. The weirdest part is that it was only failing on macOS, so we probably got lucky all this time because of some other allocation introduced in TLS on the other platforms.
I think now things align in my brain too.
. It uses the class ref alignment, but 8 should still be fine for that little class; so it might be a DMD backend bug after all.
Yup, probably, given that A only has 8-byte aligned memory. I need to dig a bit more to spot the issue, but someone who has macOS would probably be better placed to reproduce this in a sandbox with a minimal example.
> I need to dig a bit more to spot the issue, but probably someone who has macOS would be more helpful to reproduce this in a sandbox with a minimal example.
I've just tried cross-compiling with DMD and then inspecting the Mach-O object file with llvm-objdump, for a trivial:
module mod;
int a;
align(64) int b;

$ dmd -betterC -c testalign.d -target=x86_64-apple-darwin
$ llvm-objdump --section-headers testalign.o
testalign.o: file format Mach-O 64-bit x86-64
Sections:
Idx Name Size VMA Type
0 __text 00000000 0000000000000000 TEXT
1 __data 00000000 0000000000000000 DATA
2 __const 00000000 0000000000000000 DATA
3 __bss 00000000 0000000000000030 BSS
4 __const 00000000 0000000000000000 DATA
5 __thread_bss 00000008 0000000000000030 DATA
6 __thread_vars 00000030 0000000000000000 DATA
With LDC however:
1 __thread_bss 00000044 0000000000000040 DATA
So with DMD, the __thread_bss section is only 8 bytes large, and its VMA is only 16-byte aligned. With LDC, the size is 68 bytes, and the VMA is 64-byte aligned.
With DMD for Linux x64, same 68 bytes size:
14 .tbss 00000044 0000000000000000 BSS
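The object-file inspection above can be complemented by a runtime check. The following is a minimal sketch that reuses the two declarations from the test case and asserts the address of b in the main thread; the program around them is an assumption for illustration and was not part of the original experiment.

```d
import core.stdc.stdio : printf;

int a;            // module-level variables are thread-local by default in D
align(64) int b;  // requested alignment, as in the test case

void main()
{
    // Address of this thread's instance of b.
    const addr = cast(size_t) &b;
    printf("&b %% 64 = %zu\n", addr % 64);

    // Fails on a toolchain that drops the alignment request for TLS data.
    assert(addr % 64 == 0, "TLS variable b is not 64-byte aligned");
}
```

Whether the assertion passes depends on the compiler and target, which is exactly what the check is meant to reveal. It only looks at the main thread's TLS block.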
So aligned TLS on macOS is clearly not working 🙃 . Maybe alignment in general.
The same way |
1ed5018 to
9e5cd56
Compare
Since the classInstanceSize trait is available, we can now know, at compile time, the size needed for the TLS storage used in the staticError template. This should reduce the allocated size in TLS significantly. On 64-bit it results in about 221 bytes and on 32-bit about 135 bytes. This patch also fixes an interdependent misalignment in the core.lifetime unittests. Signed-off-by: Luís Ferreira <contact@lsferreira.net>
9e5cd56 to
da29f9e
Compare
Druntime has been merged into DMD. Please re-submit your PR to |
Since the classInstanceSize trait is available, we can now know, at compile time,
the size needed for the TLS storage used in the staticError template. This should
reduce the allocated size in TLS significantly. On 64-bit it results in about
221 bytes and on 32-bit about 135 bytes.
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
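As an illustration of the idea in the description, here is a minimal sketch of sizing a shared buffer from __traits(classInstanceSize, ...) at compile time. SmallError, LargeError, maxInstanceSize, storeSize and store are hypothetical names; the actual list of error classes and the code in core.exception may look different.

```d
import core.lifetime : emplace;

// Hypothetical stand-ins for the error classes that share the buffer.
class SmallError : Error
{
    this() @nogc nothrow pure @safe { super("small"); }
}

class LargeError : Error
{
    size_t[4] extra; // extra fields make instances of this class larger
    this() @nogc nothrow pure @safe { super("large"); }
}

// Largest instance size among the given classes, evaluated at compile time.
template maxInstanceSize(Classes...)
{
    enum size_t maxInstanceSize = () {
        size_t result = 0;
        static foreach (C; Classes)
        {
            if (__traits(classInstanceSize, C) > result)
                result = __traits(classInstanceSize, C);
        }
        return result;
    }();
}

enum storeSize = maxInstanceSize!(SmallError, LargeError);

// Thread-local buffer sized for the largest class instead of a round 256.
// The alignment mirrors the 2 * size_t.sizeof approximation from the diff.
align(2 * size_t.sizeof) void[storeSize] store;

void main()
{
    static assert(storeSize == __traits(classInstanceSize, LargeError));
    static assert(storeSize >= __traits(classInstanceSize, SmallError));

    // Construct an instance in place inside the shared buffer.
    auto e = emplace!LargeError(store[]);
    assert(e.msg == "large");
}
```

The buffer size follows the class definitions automatically, so adding a field to one of the classes cannot silently overflow the storage. The alignment is still an approximation, since the thread notes there is no compiler trait yet for the maximum class instance alignment.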