-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] Dispatch to concrete implementations of known types for generic BinaryFloatingPoint and FixedWidthInteger initialisers #33799
Conversation
…aryFloatingPoint> and init?<BinaryFloatingPoint>(exactly:)
…yFloatingPoint> and init?<BinaryFloatingPoint>(exactly:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It we must we must, but increasing the amount of gyb in the standard library (or anywhere) should be a last resort, only used when all other language-native options have been explored.
|
(including improving the optimizer instead of working around it) |
|
The standard library owners explicitly required I agree that our first approach should be to improve the optimizer. |
Note that this is not fixable with just an optimizer change (barring heroics specific to this one case, which may also be worthwhile). It probably requires either a language change (allowing a function to be marked as a specialization of another), or something like the polymorphic builtins support that has been suggested elsewhere. Note also that this PR still doesn't resolve the problem completely for CGFloat, which is actually the source of the vast majority of conversions, and this approach cannot fully do so, because the standard library doesn't know about CGFloat's existence =/ |
@stephentyrone A semantics annotation that this is a numeric conversion routine would be something to explore, I think? Regarding |
|
For the purposes of Two |
It's a white lie, but would be fine in practice. |
Hmm, tell me more: where does it break down in theory? Could we eliminate any discrepancy between theory and practice by fixing the behavior of any corner cases as a semantic requirement of Alternatively, is there a "magic" value (say, |
It may be worth recapping that initiative and where we stand with it on the forums as it’s mostly been point to point discussion. |
|
Just to provide some context: the prompting forum thread (https://forums.swift.org/t/numeric-type-conversion-marking-functions-as-equivalent-for-generic-specialisation/39810) did at least explore the mark-as-specialisation language change (or underscored annotation) and initially got some reasonably strong pushback. |
Since generic functions in Swift cannot behave differently under specialisation, a number of numeric conversion behaviours were producing poor assembly (calling out to the generic
_convertmethods) in a generic context. This can be seen in scenarios like the following:which produces this when optimised and fully specialised.
This PR explicitly branches and specialises where necessary within the generic conversion methods for both
init()andinit?(exactly:)forFixedWidthIntegerandBinaryFloatingPointby GYBing the checks for all known types. This ensures that fully specialised code is able to call the concrete conversions, resulting in much better assembly.This PR will regress performance slightly for non-stdlib types that pass through these generic initialisers and may negatively impact code-size. These are unfortunate consequences of this approach. An alternative approach might be to mark the concrete initialisers as being somehow substitutable for the generic initialiser such that the optimiser is able to substitute in the more-specific conversion in fully specialised code; however, that comes at the cost of potential behaviour differences between specialised and unspecialised code. Both of these options are discussed in the context of this thread: https://forums.swift.org/t/numeric-type-conversion-marking-functions-as-equivalent-for-generic-specialisation/39810
I'm not sure whether the existing benchmarks cover this, since I couldn't see clear differences locally. If it turns out they don't, I'll write and submit a benchmark in a separate PR.
cc @xwu – you mentioned taking a look at this, but hopefully this covers what you were thinking of.