-
-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compare class name signatures #16032
Conversation
|
Thanks for your pull request, @WalterBright! Bugzilla referencesYour PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog. Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "master + dmd#16032" |
bfc199a
to
2748576
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
added changelog entry |
| ulong[2] nameSig; /// unique signature for `name` | ||
|
|
||
| immutable(void)* m_RTInfo; // data for precise GC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New fields should only ever be added at the end for strict abi compatibility. Might be a good idea to add a new class flag too hasNameSig = 0x200.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it to the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole reason for nameSig is performance. Having to test a flag for it won't kill the performance, but it will degrade it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole reason for nameSig is performance. Having to test a flag for it won't kill the performance, but it will degrade it.
It'll still be better than string comparison, but it at least gives a migration/fallback in case someone forgot to recompile all their code. Shouldn't need to keep it in for too many releases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I won't be the one explaining to someone why their project crashed on the NG. You've already explained the risk in the changelog, I'll leave it to you whether we also try avoid any known breakage in the runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the last field moving, it's only supposed to be accessed through the member function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can that function do something sneaky such that it won't link if linked with the wrong ABI? Could one inject a symbol reference after the return statement or something devious like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way I think the risk of this is very low, (say) dub will rebuild per compiler version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had no idea that last field was variadic - I was under the impression that it was just 0x0 or 0x1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked this, and it's a pointer to a symbol, not variadic. So there's no blocking reason to have it at the end.
3bb3d67
to
25cb859
Compare
|
Are we confident we don't need to worry about hash collisions for different names? |
druntime/src/rt/cast_.d
Outdated
| // take care of potential duplicates across binaries | ||
| return a.name == b.name; | ||
| // same class if signatures match, works with potential duplicates across binaries | ||
| return a.m_flags & TypeInfo_Class.ClassFlags.hasNameSig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Would be cleaner to do the old way in an if statement that returns early.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new way requires two comparisons max. The old way requires one and then a string comparison. I want to get rid of the old way, but @ibuclaw suggested keeping it for a release or two.
The reason for the change is deadalnix has identified it as serious performance bottleneck.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gathered (I've discussed this amaury in person), it just reads nicer to do not do the whole shebang in the same expression (this is my main gripe with a lot of DMD, too many subexpressions your majesty!)
|
FWIW I haven't seen many if any people knowingly relying on ABI compatibility in the wild. I certainly haven't ever deliberately at least. |
| pointers with a fallback to doing a string compare on the name, which can | ||
| be rather slow. | ||
|
|
||
| The result is both druntime and phobos will need to be recompiled to be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably misleading if a user sees this in the changelog of a release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how it's misleading. What do you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that someone may see this as a recommendation that they have to do something to druntime or similar, which won't be true because they're using said release.
So something more explicit about where any breakage would actually come about would better, that and that the size has changed.
Probabilistically I think we're fine — there are better hashes if one cares, but the shortest md5 collisions I'm aware of are pretty big |
https://stackoverflow.com/questions/201705/how-many-random-elements-before-md5-produces-collisions |
|
BTW, dmd uses md5 to merge identical string literals at link time, and has for many years. |
Yes, I find it very unlikely we'd see one unless intentional |
|
LGTM other than minor style point. Confirmation from LDC/GDC then merge, one presumes. |
| (a.m_flags & TypeInfo_Class.ClassFlags.hasNameSig | ||
| ? (a.nameSig[0] == b.nameSig[0] && | ||
| a.nameSig[1] == b.nameSig[1]) // new fast way | ||
| : (a is b || a.name == b.name)); // old slow way for temporary binary compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a is b will always be false.
|
Is it worth switching to BLAKE3 (no known vulnerabilities, faster)? https://github.com/BLAKE3-team/BLAKE3 |
|
@WalterBright did you forget to set |
This speeds up class name matches by comparing name md5 signatures rather than doing a string compare, as names can get really long. It matches names with only two compares. It also matches names across shared libraries and DLLs.
Since it also changes the layout of ClassInfo, you'll need to update library builds.
(md5 is also used to merge identical strings across compilation units.)