-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Remove the extra-inhabitant value witness functions #21204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please benchmark. |
Note that a number of IRGen tests are still failing. |
Build comment file:Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibs
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! For a follow up, can we kill the SinglePayloadGeneric runtime functions entirely?
It's possible, but we'd have to do a lot of work inline in the cases that are currently relying on them (dynamic-layout single-payload enums and dynamic-layout structs). |
@swift-ci test compiler performance |
e6b39e0
to
53e0362
Compare
@swift-ci Please test. |
Build failed |
Build failed |
I might change the dynamic multi-payload implementation to use the fixed-style pattern, since the extra-inhabitants logic is totally static and the only dynamic thing is the offset of the tag. |
I wanted to make it so that single-payload enums with an added tag gave extra inhabitants from the tag like multi-payload enums do too, but ran out of time. |
Yeah, I saw the FIXMEs there. I didn't want to touch that, either. |
...the multi-payload tagging system, even in the purely dynamic, no-spare-bits case, is way weirder than I gave it credit for. I do not understand all this stuff with rotation, but I'm basically going to have to leave the current practice intact. |
This is essentially a long-belated follow-up to Arnold's swiftlang#12606. The key observation here is that the enum-tag-single-payload witnesses are strictly more powerful than the XI witnesses: you can simulate the XI witnesses by using an extra case count that's <= the XI count. Of course the result is less efficient than the XI witnesses, but that's less important than overall code size, and we can work on fast-paths for that. The extra inhabitant count is stored in a 32-bit field (always present) following the ValueWitnessFlags, which now occupy a fixed 32 bits. This inflates non-XI VWTs on 32-bit targets by a word, but the net effect on XI VWTs is to shrink them by two words, which is likely to be the more important change. Also, being able to access the XI count directly should be a nice win.
This allows callers to avoid needing to reload these tags in common cases.
The rotation stuff should only impact spare-bits-using multi-payload enums, not any place where we're falling back to the dynamic implementation. For a dynamically laid out multipayload enum, it really should just be a tag at a byte offset. |
53e0362
to
724c192
Compare
@swift-ci Please test. |
Build failed |
Build failed |
This is essentially a long-belated follow-up to Arnold's #12606. The key observation here is that the enum-tag-single-payload witnesses are strictly more powerful than the XI witnesses: you can simulate
the XI witnesses by using an extra case count that's <= the XI count. Of course the result is less efficient than the XI witnesses when actually running generic code, but that's less important than overall code size, and we can work on fast-paths in the future.
The extra inhabitant count is stored in a 32-bit field (always present) following the
ValueWitnessFlags
, which now occupy a fixed 32 bits. This inflates non-XI VWTs on 32-bit targets by a word, but the net effect on XI VWTs is to shrink them by two words, which is likely to be the more important change. Also, being able to access the XI count directly should be a nice win.