(This is an umbrella bug covering several pieces of work across different components of Swift)
Swift does not currently have much support for taking advantage of CPU features (we don't even build an x86_64h slice of the standard library on macOS). It is possible to compile a module with specific features enabled via an arcane and undiscoverable incantation, but this is not something that we can recommend to users in good faith, and it operates at the wrong granularity for most uses, anyway.
A very rough outline of a plan (which will surely not withstand contact with reality):
1. Design a source-level function attribute similar to clang and GCC's `__attribute__((target("OPTIONS")))` that allows one to specify that a function is to be compiled with specific target options enabled. This should (I think) change the name mangling, to support having multiple versions of a function for different subtargets. It should propagate onto anything that gets inlined into the current function (but should not propagate in the other direction). This feature should allow you to tag a function with multiple sets of CPU features to use (e.g. I should be able to specify that a single source function gets compiled with +avx2+fma, +avx512f+avx512vl, and +avx512vnni).
2. Add a runtime info object to the standard library that provides a common denominator of information about CPU features, cache sizes, etc., portably across OSes. (Hard questions that I'm punting on: what is this "common denominator", and what do we do in a heterogeneous environment? See, e.g., a hilarious bug where Samsung packaged cores with 64B and 128B cachelines together. Hopefully people know not to do that, but it will happen again.)
3. When a function is called that has multiple implementations available, we should, either by default or when indicated by some source annotation, use the information from this struct to dispatch to the appropriate implementation at the call site. The interaction of this with code size is subtle: CPU features often let us use less code, but the dispatch itself can increase size if it isn't done right.
4. Consider auto-generating dyld resolvers (Apple platforms) and ifuncs (Linux) for versioned functions that are dylib symbols. We don't really want people to write these things themselves, because they're easy to screw up and the set of operations you're actually allowed to use in them becomes quite restrictive as you move down the system.
This is all very rough, and I expect to add more to it over time. This is just a stick in the sand to point to and a place to collect more thoughts on the subject.