Contents
- Abstract
- SIL in the Swift Compiler
- Syntax
- Dataflow Errors
- Runtime Failure
- Undefined Behavior
- Calling Convention
- Type Based Alias Analysis
- Value Dependence
- Instruction Set
- Allocation and Deallocation
- Debug Information
- Accessing Memory
- Reference Counting
- strong_retain
- strong_release
- set_deallocating
- strong_copy_unowned_value
- strong_retain_unowned
- unowned_retain
- unowned_release
- load_weak
- store_weak
- load_unowned
- store_unowned
- fix_lifetime
- mark_dependence
- is_unique
- is_escaping_closure
- copy_block
- copy_block_without_escaping
- builtin "unsafeGuaranteed"
- builtin "unsafeGuaranteedEnd"
- Literals
- Dynamic Dispatch
- Function Application
- Metatypes
- Aggregate Types
- retain_value
- retain_value_addr
- unmanaged_retain_value
- strong_copy_unmanaged_value
- copy_value
- release_value
- release_value_addr
- unmanaged_release_value
- destroy_value
- autorelease_value
- tuple
- tuple_extract
- tuple_element_addr
- destructure_tuple
- struct
- struct_extract
- struct_element_addr
- destructure_struct
- object
- ref_element_addr
- ref_tail_addr
- Enums
- Protocol and Protocol Composition Types
- init_existential_addr
- init_existential_value
- deinit_existential_addr
- deinit_existential_value
- open_existential_addr
- open_existential_value
- init_existential_ref
- open_existential_ref
- init_existential_metatype
- open_existential_metatype
- alloc_existential_box
- project_existential_box
- open_existential_box
- open_existential_box_value
- dealloc_existential_box
- Blocks
- Unchecked Conversions
- upcast
- address_to_pointer
- pointer_to_address
- unchecked_ref_cast
- unchecked_ref_cast_addr
- unchecked_addr_cast
- unchecked_trivial_bit_cast
- unchecked_bitwise_cast
- ref_to_raw_pointer
- raw_pointer_to_ref
- ref_to_unowned
- unowned_to_ref
- ref_to_unmanaged
- unmanaged_to_ref
- convert_function
- convert_escape_to_noescape
- thin_function_to_pointer
- pointer_to_thin_function
- classify_bridge_object
- value_to_bridge_object
- ref_to_bridge_object
- bridge_object_to_ref
- bridge_object_to_word
- thin_to_thick_function
- thick_to_objc_metatype
- objc_to_thick_metatype
- objc_metatype_to_object
- objc_existential_metatype_to_object
- Checked Conversions
- Runtime Failures
- Terminators
- Differentiable Programming
- Assertion configuration
SIL is an SSA-form IR with high-level semantic information designed to implement the Swift programming language. SIL accommodates the following use cases:
- A set of guaranteed high-level optimizations that provide a predictable baseline for runtime and diagnostic behavior.
- Diagnostic dataflow analysis passes that enforce Swift language requirements, such as definitive initialization of variables and constructors, code reachability, and switch coverage.
- High-level optimization passes, including retain/release optimization, dynamic method devirtualization, closure inlining, promoting heap allocations to stack allocations, promoting stack allocations to SSA registers, scalar replacement of aggregates (splitting aggregate allocations into multiple smaller allocations), and generic function instantiation.
- A stable distribution format that can be used to distribute "fragile" inlineable or generic code with Swift library modules, to be optimized into client binaries.
In contrast to LLVM IR, SIL is a generally target-independent representation that can be used for code distribution, but it can also express target-specific concepts as well as LLVM can.
For more information on developing the implementation of SIL and SIL passes, see SILProgrammersManual.md.
At a high level, the Swift compiler follows a strict pipeline architecture:
- The Parse module constructs an AST from Swift source code.
- The Sema module type-checks the AST and annotates it with type information.
- The SILGen module generates raw SIL from an AST.
- A series of Guaranteed Optimization Passes and Diagnostic Passes are run over the raw SIL both to perform optimizations and to emit language-specific diagnostics. These are always run, even at -Onone, and produce canonical SIL.
- General SIL Optimization Passes optionally run over the canonical SIL to improve performance of the resulting executable. These are enabled and controlled by the optimization level and are not run at -Onone.
- IRGen lowers canonical SIL to LLVM IR.
- The LLVM backend (optionally) applies LLVM optimizations, runs the LLVM code generator and emits binary code.
The stages pertaining to SIL processing in particular are as follows:
SILGen produces raw SIL by walking a type-checked Swift AST. The form of SIL emitted by SILGen has the following properties:
- Variables are represented by loading and storing mutable memory locations instead of being in strict SSA form. This is similar to the initial alloca-heavy LLVM IR emitted by frontends such as Clang. However, Swift represents variables as reference-counted "boxes" in the most general case, which can be retained, released, and captured into closures.
- Dataflow requirements, such as definitive assignment, function returns, switch coverage (TBD), etc. have not yet been enforced.
- transparent function optimization has not yet been honored.
These properties are addressed by subsequent guaranteed optimization and diagnostic passes which are always run against the raw SIL.
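As a rough illustration of the "box" case, here is a minimal Swift sketch. The function name makeCounter is hypothetical and not part of the SIL documentation. Because count is captured by a closure that outlives the call, it is the kind of variable that raw SIL would be expected to model as a reference-counted box rather than a plain stack slot.

```swift
// Hypothetical example: `count` is captured by the returned closure and
// outlives the call to makeCounter, so it cannot live in a simple stack slot.
func makeCounter() -> () -> Int {
    var count = 0
    return {
        count += 1   // mutates the captured variable
        return count
    }
}

let counter = makeCounter()
let first = counter()
let second = counter()
print(first, second)
```

Each call to counter observes the mutation made by the previous call, which is only possible because the captured variable has shared, heap-like storage.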
After SILGen, a deterministic sequence of optimization passes is run over the raw SIL. We do not want the diagnostics produced by the compiler to change as the compiler evolves, so these passes are intended to be simple and predictable.
- Mandatory inlining inlines calls to "transparent" functions.
- Memory promotion is implemented as two optimization phases, the first of which performs capture analysis to promote alloc_box instructions to alloc_stack, and the second of which promotes non-address-exposed alloc_stack instructions to SSA registers.
- Constant propagation folds constant expressions and propagates the constant values. If an arithmetic overflow occurs during the constant expression computation, a diagnostic is issued.
- Return analysis verifies that each function returns a value on every code path and doesn't "fall off the end" of its definition, which is an error. It also issues an error when a noreturn function returns.
- Critical edge splitting splits all critical edges from terminators that don't support arbitrary basic block arguments (all non cond_branch terminators).
If all diagnostic passes succeed, the final result is the canonical SIL for the program.
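The following Swift sketch shows source-level code that these diagnostic passes are concerned with. The names sign, partial and overflowed are illustrative assumptions, and the diagnostics mentioned in the comments describe expected compiler behavior rather than exact message text.

```swift
// Every path through `sign` ends in a return, so return analysis accepts it.
// Deleting the final `return 0` would be expected to produce a
// missing-return error, because control could fall off the end.
func sign(_ x: Int) -> Int {
    if x < 0 { return -1 }
    if x > 0 { return 1 }
    return 0
}

// Constant propagation diagnoses overflow it can see at compile time; a
// declaration such as `let tooBig: Int8 = 127 + 1` is expected to be rejected.
// The checked API below reports the same overflow at run time instead.
let (partial, overflowed) = Int8(127).addingReportingOverflow(1)
print(sign(-5), partial, overflowed)
```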
TODO:
- Generic specialization
- Basic ARC optimization for acceptable performance at -Onone.
SIL captures language-specific type information, making it possible to perform high-level optimizations that are difficult to perform on LLVM IR.
- Generic Specialization analyzes specialized calls to generic functions and generates new specialized versions of those functions. It then rewrites all specialized uses of the generic function into direct calls to the appropriate specialized function.
- Witness and VTable Devirtualization for a given type looks up the associated method from a class's vtable or a type witness table and replaces the indirect virtual call with a call to the mapped function.
- Performance Inlining
- Reference Counting Optimizations
- Memory Promotion/Optimizations
- High-level domain specific optimizations: the Swift compiler implements high-level optimizations on basic Swift containers such as Array or String. Domain specific optimizations require a defined interface between the standard library and the optimizer. More details can be found here: HighLevelSILOptimizations
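To make generic specialization and devirtualization more concrete, here is a small Swift sketch of code that would be a candidate for both. The names largest and Square are hypothetical, and whether the optimizer actually transforms these calls depends on the optimization level and compiler version.

```swift
// A generic function: the call with T == Int below is a candidate for
// generic specialization when optimizations are enabled.
func largest<T: Comparable>(_ values: [T]) -> T? {
    var best: T? = nil
    for value in values {
        if let current = best {
            if value > current { best = value }
        } else {
            best = value
        }
    }
    return best
}

// A final class: the exact method implementation is known statically, so the
// call to area() is a candidate for devirtualization.
final class Square {
    let side: Double
    init(side: Double) { self.side = side }
    func area() -> Double { return side * side }
}

let biggest = largest([3, 9, 4])
let squareArea = Square(side: 3).area()
print(biggest as Any, squareArea)
```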
SIL is reliant on Swift's type system and declarations, so SIL syntax is an extension of Swift's. A .sil file is a Swift source file with added SIL definitions. The Swift source is parsed only for its declarations; Swift func bodies (except for nested declarations) and top-level code are ignored by the SIL parser. In a .sil file, there are no implicit imports; the Swift and/or Builtin standard modules must be imported explicitly if used.
Here is an example of a .sil file:
sil_stage canonical

import Swift

// Define types used by the SIL function.

struct Point {
  var x : Double
  var y : Double
}

class Button {
  func onClick()
  func onMouseDown()
  func onMouseUp()
}

// Declare a Swift function. The body is ignored by SIL.
func taxicabNorm(_ a:Point) -> Double {
  return a.x + a.y
}

// Define a SIL function.
// The name @_T5norms11taxicabNormfT1aV5norms5Point_Sd is the mangled name
// of the taxicabNorm Swift function.
sil @_T5norms11taxicabNormfT1aV5norms5Point_Sd : $(Point) -> Double {
bb0(%0 : $Point):
  // func Swift.+(Double, Double) -> Double
  %1 = function_ref @_Tsoi1pfTSdSd_Sd
  %2 = struct_extract %0 : $Point, #Point.x
  %3 = struct_extract %0 : $Point, #Point.y
  %4 = apply %1(%2, %3) : $(Double, Double) -> Double
  return %4 : $Double
}

// Define a SIL vtable. This matches dynamically-dispatched method
// identifiers to their implementations for a known static class type.
sil_vtable Button {
  #Button.onClick: @_TC5norms6Button7onClickfS0_FT_T_
  #Button.onMouseDown: @_TC5norms6Button11onMouseDownfS0_FT_T_
  #Button.onMouseUp: @_TC5norms6Button9onMouseUpfS0_FT_T_
}
decl ::= sil-stage-decl
sil-stage-decl ::= 'sil_stage' sil-stage
sil-stage ::= 'raw'
sil-stage ::= 'canonical'
There are different invariants on SIL depending on what stage of processing has been applied to it.
- Raw SIL is the form produced by SILGen that has not been run through guaranteed optimizations or diagnostic passes. Raw SIL may not have a fully-constructed SSA graph. It may contain dataflow errors. Some instructions may be represented in non-canonical forms, such as assign and destroy_addr for non-address-only values. Raw SIL should not be used for native code generation or distribution.
- Canonical SIL is SIL as it exists after guaranteed optimizations and diagnostics. Dataflow errors must be eliminated, and certain instructions must be canonicalized to simpler forms. Performance optimization and native code generation are derived from this form, and a module can be distributed containing SIL in this (or later) forms.
SIL files declare the processing stage of the included SIL with one of the declarations sil_stage raw or sil_stage canonical at top level. Only one such declaration may appear in a file.
sil-type ::= '$' '*'? generic-parameter-list? type
SIL types are introduced with the $ sigil. SIL's type system is closely related to Swift's, and so the type after the $ is parsed largely according to Swift's type grammar.
A formal type is the type of a value in Swift, such as an expression result. Swift's formal type system intentionally abstracts over a large number of representational issues like ownership transfer conventions and directness of arguments. However, SIL aims to represent most such implementation details, and so these differences deserve to be reflected in the SIL type system. Type lowering is the process of turning a formal type into its lowered type.
It is important to be aware that the lowered type of a declaration need not be the lowered type of the formal type of that declaration. For example, the lowered type of a declaration reference:
- will usually be thin,
- may have a non-Swift calling convention,
- may use bridged types in its interface, and
- may use ownership conventions that differ from Swift's default conventions.
Generic functions working with values of unconstrained type must generally work with them indirectly, e.g. by allocating sufficient memory for them and then passing around pointers to that memory. Consider a generic function like this:
func generateArray<T>(n : Int, generator : () -> T) -> [T]
The function generator will be expected to store its result indirectly into an address passed in an implicit parameter. There's really just no reasonable alternative when working with a value of arbitrary type:
- We don't want to generate a different copy of generateArray for every type T.
- We don't want to give every type in the language a common representation.
- We don't want to dynamically construct a call to generator depending on the type T.
But we also don't want the existence of the generic system to force inefficiencies on non-generic code. For example, we'd like a function of type () -> Int to be able to return its result directly; and yet, () -> Int is a valid substitution of () -> T, and a caller of generateArray<Int> should be able to pass an arbitrary () -> Int in as the generator.
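A possible body for generateArray, together with a caller that passes a closure of type () -> Int, might look like the sketch below. The body and the helper firstSquares are assumptions for illustration; the source only gives the signature.

```swift
// One plausible implementation of the signature shown above. Inside this
// function T is opaque, so each result of generator() is handled generically.
func generateArray<T>(n: Int, generator: () -> T) -> [T] {
    var result: [T] = []
    result.reserveCapacity(n)
    for _ in 0..<n {
        result.append(generator())
    }
    return result
}

// Hypothetical caller: the closure has the concrete type () -> Int, yet it is
// passed where a () -> T is expected.
func firstSquares(_ n: Int) -> [Int] {
    var next = 0
    return generateArray(n: n) { () -> Int in
        next += 1
        return next * next
    }
}

let squares = firstSquares(4)
print(squares)
```

The caller never has to know that the generic function receives results indirectly; reconciling the two representations is left to the compiler.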
Therefore, the representation of a formal type in a generic context may differ from the representation of a substitution of that formal type. We call such differences abstraction differences.
SIL's type system is designed to make abstraction differences always result in differences between SIL types. The goal is that a properly-abstracted value should be correctly usable at any level of substitution.
In order to achieve this, the formal type of a generic entity should always be lowered using the abstraction pattern of its unsubstituted formal type. For example, consider the following generic type:
struct Generator<T> {
  var fn : () -> T
}
var intGen : Generator<Int>
intGen.fn has the substituted formal type () -> Int, which would normally lower to the type @callee_owned () -> Int, i.e. returning its result directly. But if that type is properly lowered with the pattern of its unsubstituted type () -> T, it becomes @callee_owned () -> @out Int.
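At the source level the abstraction difference is invisible, as this minimal Swift sketch suggests. The function fortyTwo is a hypothetical helper; the point is only that a plain () -> Int function can be stored in the fn property even though the property is lowered with the more abstract () -> T pattern.

```swift
struct Generator<T> {
    var fn: () -> T
}

// Hypothetical helper with the concrete formal type () -> Int.
func fortyTwo() -> Int { return 42 }

// Storing fortyTwo into `fn` is where a conversion between abstraction
// patterns would be needed; Swift source does not show it.
let intGen = Generator<Int>(fn: fortyTwo)
let generated = intGen.fn()
print(generated)
```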
When a type is lowered using the abstraction pattern of an unrestricted type, it is lowered as if the pattern were replaced with a type sharing the same structure but replacing all materializable types with fresh type variables.
For example, if g has type Generator<(Int, Int) -> Float>, g.fn is lowered using the pattern () -> T, which eventually causes (Int, Int) -> Float to be lowered using the pattern T, which is the same as lowering it with the pattern U -> V; the result is that g.fn has the following lowered type:
@callee_owned () -> @owned @callee_owned (@in (Int, Int)) -> @out Float
As another example, suppose that h has type Generator<(Int, inout Int) -> Float>. Neither (Int, inout Int) nor inout Int are potential results of substitution because they aren't materializable, so h.fn has the following lowered type:
@callee_owned () -> @owned @callee_owned (@in Int, @inout Int) -> @out Float
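The first case, where the generic argument is itself a function type, can be written in Swift as in the sketch below. The helper average is an illustrative assumption. The value returned by g.fn() is a function that was produced through the generic pattern T, yet it is called like any ordinary (Int, Int) -> Float.

```swift
struct Generator<T> {
    var fn: () -> T
}

// Hypothetical helper of formal type (Int, Int) -> Float.
func average(_ a: Int, _ b: Int) -> Float {
    return Float(a + b) / 2
}

// T is substituted with a function type, so `fn` returns a function.
let g = Generator<(Int, Int) -> Float>(fn: { average })
let produced = g.fn()
let mean = produced(3, 4)
print(mean)
```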
This system has the property that abstraction patterns are preserved through repeated substitutions. That is, you can consider a lowered type to encode an abstraction pattern; lowering T by R is equivalent to lowering T by (S lowered by R).
SILGen has procedures for converting values between abstraction patterns.
At present, only function and tuple types are changed by abstraction differences.
The type of a value in SIL shall be:
- a loadable legal SIL type, $T, or
- the address of a legal SIL type, $*T.
A type T is a legal SIL type if:
- it is a function type which satisfies the constraints (below) on function types in SIL,
- it is a metatype type which describes its representation,
- it is a tuple type whose element types are legal SIL types,
- it is Optional<U>, where U is a legal SIL type,
- it is a legal Swift type that is not a function, tuple, optional, metatype, or l-value type, or
- it is a @box containing a legal SIL type.
Note that types in other recursive positions in the type grammar are still formal types. For example, the instance type of a metatype or the type arguments of a generic type are still formal Swift types, not lowered SIL types.
The address of T, $*T, is a pointer to memory containing a value of any reference or value type $T. This can be an internal pointer into a data structure. Addresses of loadable types can be loaded and stored to access values of those types.
Addresses of address-only types (see below) can only be used with instructions that manipulate their operands indirectly by address, such as copy_addr or destroy_addr, or as arguments to functions. It is illegal to have a value of type $T if T is address-only.
Addresses are not reference-counted pointers like class values are. They cannot be retained or released.
Address types are not first-class: they cannot appear in recursive positions in type expressions. For example, the type $**T is not a legal type.
The address of an address cannot be directly taken. $**T is not a representable type. Values of address type thus cannot be allocated, loaded, or stored (though addresses can of course be loaded from and stored to).
Addresses can be passed as arguments to functions if the corresponding parameter is indirect. They cannot be returned.
Captured local variables and the payloads of indirect value types are stored on the heap. The type @box T is a reference-counted type that references a box containing a mutable value of type T. Boxes always use Swift-native reference counting, so they can be queried for uniqueness and cast to the Builtin.NativeObject type.
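An indirect enum is a common source of heap-stored payloads. The sketch below uses a hypothetical IntList type; its node payloads are the kind of value the text describes as living in a box.

```swift
// `indirect` asks for the payload to be stored out of line, which is what
// allows the enum to refer to itself recursively.
indirect enum IntList {
    case empty
    case node(Int, IntList)
}

func sum(_ list: IntList) -> Int {
    switch list {
    case .empty:
        return 0
    case let .node(head, tail):
        return head + sum(tail)
    }
}

let list = IntList.node(1, .node(2, .node(3, .empty)))
let total = sum(list)
print(total)
```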
A concrete or existential metatype in SIL must describe its representation. This can be:
- @thin, meaning that it requires no storage and thus necessarily represents an exact type (only allowed for concrete metatypes);
- @thick, meaning that it stores a reference to a type or (if a concrete class) a subclass of that type; or
- @objc, meaning that it stores a reference to a class type (or a subclass thereof) using an Objective-C class object representation rather than the native Swift type-object representation.
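The difference between a metatype that names an exact type and one that may refer to a subclass is visible in ordinary Swift. In this sketch the classes Animal and Dog are hypothetical; which SIL representation the compiler picks for each metatype value is an assumption based on the descriptions above.

```swift
class Animal {
    required init() {}
    func name() -> String { return "animal" }
}

class Dog: Animal {
    override func name() -> String { return "dog" }
}

// A struct metatype always denotes exactly Int, so no storage is needed to
// tell which type it is.
let exact: Int.Type = Int.self

// A class metatype may hold a subclass, so it must carry a reference to the
// dynamic type.
let kind: Animal.Type = Dog.self
let pet = kind.init()
print(exact, kind, pet.name())
```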
Function types in SIL are different from function types in Swift in a number of ways:
A SIL function type may be generic. For example, accessing a generic function with function_ref will give a value of generic function type.
A SIL function type may be declared @noescape. This is required for any function type passed to a parameter not declared with the @escaping declaration modifier. @noescape function types may be either @convention(thin) or @callee_guaranteed. They have an unowned context: the context's lifetime must be independently guaranteed.
A SIL function type declares its conventional treatment of its context value:
- If it is @convention(thin), the function requires no context value. Such types may also be declared @noescape, which trivially has no effect passing the context value.
- If it is @callee_guaranteed, the context value is treated as a direct parameter. This implies @convention(thick). If the function type is also @noescape, then the context value is unowned, otherwise it is guaranteed.
- If it is @callee_owned, the context value is treated as an owned direct parameter. This implies @convention(thick) and is mutually exclusive with @noescape.
- If it is @convention(block), the context value is treated as an unowned direct parameter.
- Other function type conventions are described in Properties of Types and Calling Convention.
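The source-level counterpart of @noescape is the default, non-escaping closure parameter. The sketch below contrasts it with an @escaping parameter; CallbackStore and applyTwice are hypothetical names used only for illustration.

```swift
final class CallbackStore {
    private var callbacks: [() -> Int] = []

    // @escaping: the closure is stored and outlives the call, so its context
    // must be kept alive by the closure value itself.
    func register(_ callback: @escaping () -> Int) {
        callbacks.append(callback)
    }

    func runAll() -> [Int] {
        return callbacks.map { $0() }
    }
}

// Non-escaping (the default): the closure is only used during the call, so
// the caller can guarantee the lifetime of its context.
func applyTwice(_ body: () -> Int) -> Int {
    return body() + body()
}

let store = CallbackStore()
let base = 10
store.register { base + 1 }
let doubled = applyTwice { base }
let results = store.runAll()
print(doubled, results)
```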
A SIL function type declares the conventions for its parameters. The parameters are written as an unlabeled tuple; the elements of that tuple must be legal SIL types, optionally decorated with one of the following convention attributes.
The value of an indirect parameter has type *T; the value of a direct parameter has type T.
- An @in parameter is indirect. The address must be of an initialized object; the function is responsible for destroying the value held there.
- An @inout parameter is indirect. The address must be of an initialized object. The memory must remain initialized for the duration of the call until the function returns. The function may mutate the pointee, and furthermore may weakly assume that there are no aliasing reads from or writes to the argument, though must preserve a valid value at the argument so that well-ordered aliasing violations do not compromise memory safety. This allows for optimizations such as local load and store propagation, introduction or elimination of temporary copies, and promotion of the @inout parameter to an @owned direct parameter and result pair, but does not admit "take" optimization out of the parameter or other optimization that would leave memory in an uninitialized state.
- An @inout_aliasable parameter is indirect. The address must be of an initialized object. The memory must remain initialized for the duration of the call until the function returns. The function may mutate the pointee, and must assume that other aliases may mutate it as well. These aliases however can be assumed to be well-typed and well-ordered; ill-typed accesses and data races to the parameter are still undefined.
- An @owned parameter is an owned direct parameter.
- A @guaranteed parameter is a guaranteed direct parameter.
- An @in_guaranteed parameter is indirect. The address must be of an initialized object; both the caller and callee promise not to mutate the pointee, allowing the callee to read it.
- An @in_constant parameter is indirect. The address must be of an initialized object; the function will treat the value held there as read-only.
- Otherwise, the parameter is an unowned direct parameter.
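A Swift inout parameter is the usual source of an @inout parameter in SIL. In the minimal sketch below, the names bump, Account and finalBalance are hypothetical. The storage passed with & is initialized before the call and is still initialized, with a valid value, when the call returns.

```swift
// `value` is passed by address; the callee may mutate it but must leave a
// valid Int in place when it returns.
func bump(_ value: inout Int, by amount: Int) {
    value += amount
}

struct Account {
    var balance = 0
}

func finalBalance() -> Int {
    var account = Account()
    bump(&account.balance, by: 5)
    bump(&account.balance, by: 7)
    return account.balance
}

let balance = finalBalance()
print(balance)
```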
A SIL function type declares the conventions for its results. The results are written as an unlabeled tuple; the elements of that tuple must be legal SIL types, optionally decorated with one of the following convention attributes. Indirect and direct results may be interleaved.
Indirect results correspond to implicit arguments of type *T in function entry blocks and in the arguments to apply and try_apply instructions. These arguments appear in the order in which they appear in the result list, always before any parameters.
Direct results correspond to direct return values of type T. A SIL function type has a return type derived from its direct results in the following way: when there is a single direct result, the return type is the type of that result; otherwise, it is the tuple type of the types of all the direct results, in the order they appear in the results list. The return type is the type of the operand of return instructions, the type of apply instructions, and the type of the normal result of try_apply instructions.
- An @out result is indirect. The address must be of an uninitialized object. The function is required to leave an initialized value there unless it terminates with a throw instruction or it has a non-Swift calling convention.
- An @owned result is an owned direct result.
- An @autoreleased result is an autoreleased direct result. If there is an autoreleased result, it must be the only direct result.
- Otherwise, the result is an unowned direct result.
A direct parameter or result of trivial type must always be unowned.
An owned direct parameter or result is transferred to the recipient, which becomes responsible for destroying the value. This means that the value is passed at +1.
An unowned direct parameter or result is instantaneously valid at the point of transfer. The recipient does not need to worry about race conditions immediately destroying the value, but should copy it (e.g. by strong_retaining an object pointer) if the value will be needed sooner rather than later.
A guaranteed direct parameter is like an unowned direct parameter value, except that it is guaranteed by the caller to remain valid throughout the execution of the call. This means that any strong_retain, strong_release pairs in the callee on the argument can be eliminated.
An autoreleased direct result must have a type with a retainable pointer representation. Autoreleased results are nominally transferred at +0, but the runtime takes steps to ensure that a +1 can be safely transferred, and those steps require precise code-layout control. Accordingly, the SIL pattern for an autoreleased convention looks exactly like the SIL pattern for an owned convention, and the extra runtime instrumentation is inserted on both sides when the SIL is lowered into LLVM IR. An autoreleased apply of a function that is defined with an autoreleased result has the effect of a +1 transfer of the result. An autoreleased apply of a function that is not defined with an autoreleased result has the effect of performing a strong retain in the caller. A non-autoreleased apply of a function that is defined with an autoreleased result has the effect of performing an autorelease in the callee.
SIL function types may provide an optional error result, written by placing @error on a result. An error result is always implicitly @owned. Only functions with a native calling convention may have an error result.
A function with an error result cannot be called with apply. It must be called with try_apply. There is one exception to this rule: a function with an error result can be called with apply [nothrow] if the compiler can prove that the function does not actually throw.
return produces a normal result of the function. To return an error result, use throw.
Type lowering lowers the throws annotation on formal function types into more concrete error propagation:
- For native Swift functions, throws is turned into an error result.
- For non-native Swift functions, throws is turned into an explicit error-handling mechanism based on the imported API. The importer only imports non-native methods and types as throws when it is possible to do this automatically.
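For a native Swift function, the source-level shape of an error result looks like the sketch below. ParseError, parse and parseOrZero are hypothetical names. The throw statement is what returns through the error result, and the call marked with try is the kind of call site that needs both a normal and an error continuation.

```swift
enum ParseError: Error {
    case notANumber(String)
}

// A native throwing function: its lowered type carries an error result.
func parse(_ text: String) throws -> Int {
    guard let value = Int(text) else {
        throw ParseError.notANumber(text)   // error path
    }
    return value                            // normal path
}

// The caller handles both outcomes of the throwing call.
func parseOrZero(_ text: String) -> Int {
    do {
        return try parse(text)
    } catch {
        return 0
    }
}

let parsed = parseOrZero("41") + 1
let fallback = parseOrZero("forty")
print(parsed, fallback)
```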
SIL function types may provide a pattern signature and substitutions to express that values of the type use a particular generic abstraction pattern. Both must be provided together. If a pattern signature is present, the component types (parameters, yields, and results) must be expressed in terms of the generic parameters of that signature. The pattern substitutions should be expressed in terms of the generic parameters of the overall generic signature, if any, or else the enclosing generic context, if any.
A pattern signature follows the @substituted attribute, which must be the final attribute preceding the function type. Pattern substitutions follow the function type, preceded by the for keyword. For example:
@substituted <T: Collection> (@in T) -> @out T.Element for Array<Int>
The low-level representation of a value of this type may not match the representation of a value of the substituted-through version of it:
(@in Array<Int>) -> @out Int
Substitution differences at the outermost level of a function value may be adjusted using the convert_function instruction. Note that this only works at the outermost level and not in nested positions. For example, a function which takes a parameter of the first type above cannot be converted by convert_function to a function which takes a parameter of the second type; such a conversion must be done with a thunk.
Type substitution on a function type with a pattern signature and substitutions only substitutes into the substitutions; the component types are preserved with their exact original structure.
In the implementation, a SIL function type may also carry substitutions for its generic signature. This is a convenience for working with applied generic types and is not generally a formal part of the SIL language; in particular, values should not have such types. Such a type behaves like a non-generic type, as if the substitutions were actually applied to the underlying function type.
A coroutine is a function which can suspend itself and return control to its caller without terminating the function. That is, it does not need to obey a strict stack discipline.
SIL supports two kinds of coroutine: @yield_many and @yield_once. Either of these attributes may be written before a function type to indicate that it is a coroutine type.
A coroutine type may declare any number of yielded values, which is to say, values which are provided to the caller at a yield point. Yielded values are written in the result list of a function type, prefixed by the @yields attribute. A yielded value may have a convention attribute, taken from the set of parameter attributes and interpreted as if the yield site were calling back to the calling function.
Currently, a coroutine may not have normal results.
Coroutine functions may be used in many of the same ways as normal function values. However, they cannot be called with the standard apply or try_apply instructions. A non-throwing yield-once coroutine can be called with the begin_apply instruction. There is no support yet for calling a throwing yield-once coroutine or for calling a yield-many coroutine of any kind.
Coroutines may contain the special yield and unwind instructions.
A @yield_many coroutine may yield as many times as it desires. A @yield_once coroutine may yield exactly once before returning, although it may also throw before reaching that point.
This coroutine representation is well-suited to coroutines whose control flow is tightly integrated with their callers and which intend to pass information back and forth. This matches the needs of generalized accessor and generator features in Swift. It is not a particularly good match for async/await-style features; a simpler representation would probably do fine for that.
SIL classifies types into additional subgroups based on ABI stability and generic constraints:
Loadable types are types with a fully exposed concrete representation:
- Reference types
- Builtin value types
- Fragile struct types in which all element types are loadable
- Tuple types in which all element types are loadable
- Class protocol types
- Archetypes constrained by a class protocol
A loadable aggregate type is a tuple or struct type that is loadable.
A trivial type is a loadable type with trivial value semantics. Values of trivial type can be loaded and stored without any retain or release operations and do not need to be destroyed.
Runtime-sized types are restricted value types for which the compiler does not know the size of the type statically:
- Resilient value types
- Fragile struct or tuple types that contain resilient types as elements at any depth
- Archetypes not constrained by a class protocol
Address-only types are restricted value types which cannot be loaded or otherwise worked with as SSA values:
- Runtime-sized types
- Non-class protocol types
- @weak types
Values of address-only type ("address-only values") must reside in memory and can only be referenced in SIL by address. Addresses of address-only values cannot be loaded from or stored to. SIL provides special instructions for indirectly manipulating address-only values, such as copy_addr and destroy_addr.
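Some of these categories can be related to ordinary Swift declarations, as in the sketch below. The types Point, Node and Wrapper are hypothetical, and the mapping in the comments is an informal reading of the definitions above rather than something the compiler reports. MemoryLayout is used only to show that the concrete types have a statically known size.

```swift
// Only builtin-like stored values: fits the description of a trivial type.
struct Point {
    var x: Double
    var y: Double
}

// A class instance is referred to through a single reference-counted
// pointer: loadable, but not trivial.
final class Node {
    var value = 0
}

// Inside generic code the layout of `payload` is unknown, which is the
// situation the address-only rules exist for.
struct Wrapper<T> {
    var payload: T
}

let pointSize = MemoryLayout<Point>.size
let referenceSize = MemoryLayout<Node>.size
let wrappedSize = MemoryLayout<Wrapper<Point>>.size
print(pointSize, referenceSize, wrappedSize)
```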
Some additional meaningful categories of type:
- A heap object reference type is a type whose representation consists of a single strong-reference-counted pointer. This includes all class types, the Builtin.NativeObject and AnyObject types, and archetypes that conform to one or more class protocols.
- A reference type is more general in that its low-level representation may include additional global pointers alongside a strong-reference-counted pointer. This includes all heap object reference types and adds thick function types and protocol/protocol composition types that conform to one or more class protocols. All reference types can be retain-ed and release-d. Reference types also have ownership semantics for their referenced heap object; see Reference Counting below.
- A type with retainable pointer representation is guaranteed to be compatible (in the C sense) with the Objective-C id type. The value at runtime may be nil. This includes classes, class metatypes, block functions, and class-bounded existentials with only Objective-C-compatible protocol constraints, as well as one level of Optional or ImplicitlyUnwrappedOptional applied to any of the above. Types with retainable pointer representation can be returned via the @autoreleased return convention.
SILGen does not always map Swift function types one-to-one to SIL function types. Function types are transformed in order to encode additional attributes:
The convention of the function, indicated by the @convention(convention) attribute. This is similar to the language-level @convention attribute, though SIL extends the set of supported conventions with additional distinctions not exposed at the language level:
- @convention(thin) indicates a "thin" function reference, which uses the Swift calling convention with no special "self" or "context" parameters.
- @convention(thick) indicates a "thick" function reference, which uses the Swift calling convention and carries a reference-counted context object used to represent captures or other state required by the function. This attribute is implied by @callee_owned or @callee_guaranteed.
- @convention(block) indicates an Objective-C compatible block reference. The function value is represented as a reference to the block object, which is an id-compatible Objective-C object that embeds its invocation function within the object. The invocation function uses the C calling convention.
- @convention(c) indicates a C function reference. The function value carries no context and uses the C calling convention.
- @convention(objc_method) indicates an Objective-C method implementation. The function uses the C calling convention, with the SIL-level self parameter (by SIL convention mapped to the final formal parameter) mapped to the self and _cmd arguments of the implementation.
- @convention(method) indicates a Swift instance method implementation. The function uses the Swift calling convention, using the special self parameter.
- @convention(witness_method) indicates a Swift protocol method implementation. The function's polymorphic convention is emitted in such a way as to guarantee that it is polymorphic across all possible implementors of the protocol.
(This section applies only to Swift 1.0 and will hopefully be obviated in future releases.)
SIL tries to be ignorant of the details of type layout, and low-level
bit-banging operations such as pointer casts are generally undefined. However,
as a concession to implementation convenience, some types are allowed to be
considered layout compatible. Type T is layout compatible with type U iff:
- an address of type $*U can be cast by address_to_pointer/pointer_to_address to $*T and a valid value of type T can be loaded out (or indirectly used, if T is address-only),
- if T is a nontrivial type, then retain_value/release_value of the loaded T value is equivalent to retain_value/release_value of the original U value.
This is not always a commutative relationship; T can be layout-compatible with U whereas U is not layout-compatible with T. If the layout compatible relationship does extend both ways, T and U are commutatively layout compatible. It is however always transitive; if T is layout-compatible with U and U is layout-compatible with V, then T is layout-compatible with V. All types are layout-compatible with themselves.
The following types are considered layout-compatible:
- Builtin.RawPointer is commutatively layout compatible with all heap object reference types, and Optional of heap object reference types. (Note that RawPointer is a trivial type, so does not have ownership semantics.)
- Builtin.RawPointer is commutatively layout compatible with Builtin.Word.
- Structs containing a single stored property are commutatively layout compatible with the type of that property.
- A heap object reference is commutatively layout compatible with any type that can correctly reference the heap object. For instance, given a class B and a derived class D inheriting from B, a value of type B referencing an instance of type D is layout compatible with both B and D, as well as Builtin.NativeObject and AnyObject. It is not layout compatible with an unrelated class type E.
- For payloaded enums, the payload type of the first payloaded case is layout-compatible with the enum (not commutatively).
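The single-stored-property rule in the list above can be observed, roughly, from Swift source. The sketch below is only an illustration: the Meters type is hypothetical, and unsafeBitCast stands in for the SIL-level address cast so that the effect is visible at the language level.

```swift
// Hypothetical wrapper with exactly one stored property.
struct Meters {
    var raw: Double
}

// Size, stride, and alignment match those of the wrapped type.
let sameSize = MemoryLayout<Meters>.size == MemoryLayout<Double>.size
let sameStride = MemoryLayout<Meters>.stride == MemoryLayout<Double>.stride
let sameAlignment = MemoryLayout<Meters>.alignment == MemoryLayout<Double>.alignment

// Reinterpreting the bits in either direction preserves the value,
// which is what "commutatively layout compatible" suggests.
let asDouble = unsafeBitCast(Meters(raw: 2.5), to: Double.self)
let asMeters = unsafeBitCast(2.5, to: Meters.self)
print(sameSize, sameStride, sameAlignment, asDouble, asMeters.raw)
```

Because Double is a trivial type, the retain/release condition of the definition does not come into play in this example.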
sil-identifier ::= [A-Za-z_0-9]+
sil-value-name ::= '%' sil-identifier
sil-value ::= sil-value-name
sil-value ::= 'undef'
sil-operand ::= sil-value ':' sil-type
SIL values are introduced with the % sigil and named by an alphanumeric identifier, which references the instruction or basic block argument that produces the value. SIL values may also refer to the keyword 'undef', which is a value of undefined contents.
Unlike LLVM IR, SIL instructions that take value operands only accept value operands. References to literal constants, functions, global variables, or other entities require specialized instructions such as integer_literal, function_ref, global_addr, etc.
decl ::= sil-function
sil-function ::= 'sil' sil-linkage? sil-function-attribute+
sil-function-name ':' sil-type
'{' sil-basic-block+ '}'
sil-function-name ::= '@' [A-Za-z_0-9]+
SIL functions are defined with the sil keyword. SIL function names are introduced with the @ sigil and named by an alphanumeric identifier. This name will become the LLVM IR name for the function, and is usually the mangled name of the originating Swift declaration. The sil syntax declares the function's name and SIL type, and defines the body of the function inside braces. The declared type must be a function type, which may be generic.
sil-function-attribute ::= '[canonical]'
The function is in canonical SIL even if the module is still in raw SIL.
sil-function-attribute ::= '[ossa]'
The function is in OSSA (ownership SSA) form.
sil-function-attribute ::= '[transparent]'
Transparent functions are always inlined and don't keep their source information when inlined.
sil-function-attribute ::= '[' sil-function-thunk ']'
sil-function-thunk ::= 'thunk'
sil-function-thunk ::= 'signature_optimized_thunk'
sil-function-thunk ::= 'reabstraction_thunk'
The function is a compiler generated thunk.
sil-function-attribute ::= '[dynamically_replacable]'
The function can be replaced at runtime with a different implementation. Optimizations must not assume anything about such a function, even if the SIL of the function body is available.
sil-function-attribute ::= '[dynamic_replacement_for' identifier ']'
sil-function-attribute ::= '[objc_replacement_for' identifier ']'
Specifies for which function this function is a replacement.
sil-function-attribute ::= '[exact_self_class]'
The function is a designated initializer, where it is known that the static type being allocated is the type of the class that defines the designated initializer.
sil-function-attribute ::= '[without_actually_escaping]'
The function is a thunk for closures which are not actually escaping.
sil-function-attribute ::= '[' sil-function-purpose ']'
sil-function-purpose ::= 'global_init'
The implied semantics are:
- side-effects can occur any time before the first invocation.
- all calls to the same global_init function have the same side-effects.
- any operation that may observe the initializer's side-effects must be preceded by a call to the initializer.
This is currently true if the function is an addressor that was lazily generated from a global variable access. Note that the initialization function itself does not need this attribute. It is private and only called within the addressor.
sil-function-purpose ::= 'lazy_getter'
The function is a getter of a lazy property for which the backing storage is an Optional of the property's type. The getter contains a top-level switch_enum (or switch_enum_addr), which tests if the lazy property is already computed. In the None-case, the property is computed and stored to the backing storage of the property.
After the first call of a lazy property getter, it is guaranteed that the property is computed and consecutive calls always execute the Some-case of the top-level switch_enum.
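The shape of such a getter can be sketched by hand in Swift. This is an illustration under stated assumptions, not the code the compiler generates: the Report type, its summaryStorage field, and the computeCount counter are hypothetical names introduced only to make the two cases observable.

```swift
struct Report {
    // Backing storage: an Optional of the property's type.
    private var summaryStorage: String? = nil
    // Counts how often the computation runs (for illustration only).
    private(set) var computeCount = 0

    // Hand-written counterpart of a lazy getter.
    mutating func summary() -> String {
        switch summaryStorage {
        case .some(let value):
            // Already computed: every call after the first takes this path.
            return value
        case .none:
            // First call: compute, store to the backing storage, return.
            computeCount += 1
            let value = "computed summary"
            summaryStorage = value
            return value
        }
    }
}

var report = Report()
let first = report.summary()
let second = report.summary()
print(first == second, report.computeCount)
```

The top-level switch over the Optional mirrors the switch_enum described above; the guarantee that later calls take the Some-case is what an optimizer can exploit.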
sil-function-attribute ::= '[weak_imported]'
Cross-module references to this function should always use weak linking.
sil-function-attribute ::= '[available' sil-version-tuple ']'
sil-version-tuple ::= [0-9]+ ('.' [0-9]+)*
The minimal OS-version where the function is available.
sil-function-attribute ::= '[' sil-function-inlining ']'
sil-function-inlining ::= 'never'
The function is never inlined.
sil-function-inlining ::= 'always'
The function is always inlined, even in a Onone build.
sil-function-attribute ::= '[' sil-function-optimization ']'
sil-function-optimization ::= 'Onone'
sil-function-optimization ::= 'Ospeed'
sil-function-optimization ::= 'Osize'
The function is optimized according to this attribute, overriding the setting from the command line.
sil-function-attribute ::= '[' sil-function-effects ']'
sil-function-effects ::= 'readonly'
sil-function-effects ::= 'readnone'
sil-function-effects ::= 'readwrite'
sil-function-effects ::= 'releasenone'
The specified memory effects of the function.
sil-function-attribute ::= '[_semantics "' [A-Za-z._0-9]+ '"]'
The specified high-level semantics of the function. The optimizer can use this information to perform high-level optimizations before such functions are inlined. For example, Array operations are annotated with semantic attributes to let the optimizer perform redundant bounds check elimination and similar optimizations.
sil-function-attribute ::= '[_specialize "' [A-Za-z._0-9]+ '"]'
Specifies for which types specialized code should be generated.
sil-function-attribute ::= '[clang "' identifier '"]'
The clang node owner.
sil-basic-block ::= sil-label sil-instruction-def* sil-terminator
sil-label ::= sil-identifier ('(' sil-argument (',' sil-argument)* ')')? ':'
sil-argument ::= sil-value-name ':' sil-type
sil-instruction-result ::= sil-value-name
sil-instruction-result ::= '(' (sil-value-name (',' sil-value-name)*)? ')'
sil-instruction-source-info ::= (',' sil-scope-ref)? (',' sil-loc)?
sil-instruction-def ::=
(sil-instruction-result '=')? sil-instruction sil-instruction-source-info
A function body consists of one or more basic blocks that correspond to the nodes of the function's control flow graph. Each basic block contains one or more instructions and ends with a terminator instruction. The function's entry point is always the first basic block in its body.
In SIL, basic blocks take arguments, which are used as an alternative to LLVM's phi nodes. Basic block arguments are bound by the branch from the predecessor block:
sil @iif : $(Builtin.Int1, Builtin.Int64, Builtin.Int64) -> Builtin.Int64 {
bb0(%cond : $Builtin.Int1, %ifTrue : $Builtin.Int64, %ifFalse : $Builtin.Int64):
  cond_br %cond : $Builtin.Int1, then, else
then:
  br finish(%ifTrue : $Builtin.Int64)
else:
  br finish(%ifFalse : $Builtin.Int64)
finish(%result : $Builtin.Int64):
  return %result : $Builtin.Int64
}
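For orientation, a rough Swift-level counterpart of @iif is sketched below. It is an assumption-laden illustration: the SIL the compiler would actually emit for this function uses the Bool and Int64 struct types rather than the builtin types, and the comments only indicate which part of the SIL example each line loosely corresponds to.

```swift
func iif(_ cond: Bool, _ ifTrue: Int64, _ ifFalse: Int64) -> Int64 {
    let result: Int64          // plays the role of the block argument %result
    if cond {
        result = ifTrue        // corresponds to: br finish(%ifTrue)
    } else {
        result = ifFalse       // corresponds to: br finish(%ifFalse)
    }
    return result              // corresponds to: return %result
}

print(iif(true, 1, 2), iif(false, 1, 2))
```

Where LLVM would merge the two incoming values with a phi node in the join block, SIL passes the chosen value as an argument of the branch to finish.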
Arguments to the entry point basic block, which has no predecessor, are bound by the function's caller:
sil @foo : $(Int) -> Int {
bb0(%x : $Int):
  return %x : $Int
}

sil @bar : $(Int, Int) -> () {
bb0(%x : $Int, %y : $Int):
  %foo = function_ref @foo : $(Int) -> Int
  %1 = apply %foo(%x) : $(Int) -> Int
  %2 = apply %foo(%y) : $(Int) -> Int
  %3 = tuple ()
  return %3 : $()
}
sil-scope-ref ::= 'scope' [0-9]+
sil-scope ::= 'sil_scope' [0-9]+ '{'
sil-loc
'parent' scope-parent
('inlined_at' sil-scope-ref)?
'}'
scope-parent ::= sil-function-name ':' sil-type
scope-parent ::= sil-scope-ref
sil-loc ::= 'loc' string-literal ':' [0-9]+ ':' [0-9]+
Each instruction may have a debug location and a SIL scope reference at the end. Debug locations consist of a filename, a line number, and a column number. If the debug location is omitted, it defaults to the location in the SIL source file. A SIL scope describes the position, within the lexical scope structure, of the Swift expression from which a SIL instruction was generated. SIL scopes also hold inlining information.
sil-decl-ref ::= '#' sil-identifier ('.' sil-identifier)* sil-decl-subref?
sil-decl-subref ::= '!' sil-decl-subref-part ('.' sil-decl-lang)? ('.' sil-decl-autodiff)?
sil-decl-subref ::= '!' sil-decl-lang
sil-decl-subref ::= '!' sil-decl-autodiff
sil-decl-subref-part ::= 'getter'
sil-decl-subref-part ::= 'setter'
sil-decl-subref-part ::= 'allocator'
sil-decl-subref-part ::= 'initializer'
sil-decl-subref-part ::= 'enumelt'
sil-decl-subref-part ::= 'destroyer'
sil-decl-subref-part ::= 'deallocator'
sil-decl-subref-part ::= 'globalaccessor'
sil-decl-subref-part ::= 'ivardestroyer'
sil-decl-subref-part ::= 'ivarinitializer'
sil-decl-subref-part ::= 'defaultarg' '.' [0-9]+
sil-decl-lang ::= 'foreign'
sil-decl-autodiff ::= sil-decl-autodiff-kind '.' sil-decl-autodiff-indices
sil-decl-autodiff-kind ::= 'jvp'
sil-decl-autodiff-kind ::= 'vjp'
sil-decl-autodiff-indices ::= [SU]+
Some SIL instructions need to reference Swift declarations directly. These references are introduced with the # sigil followed by the fully qualified name of the Swift declaration. Some Swift declarations are decomposed into multiple entities at the SIL level. These are distinguished by following the qualified name with ! and one or more .-separated component entity discriminators:
- getter: the getter function for a var declaration
- setter: the setter function for a var declaration
- allocator: a struct or enum constructor, or a class's allocating constructor
- initializer: a class's initializing constructor
- enumelt: a member of an enum type
- destroyer: a class's destroying destructor
- deallocator: a class's deallocating destructor
- globalaccessor: the addressor function for a global variable
- ivardestroyer: a class's ivar destroyer
- ivarinitializer: a class's ivar initializer
- defaultarg.n: the default argument-generating function for the n-th argument of a Swift func
- foreign: a specific entry point for C/Objective-C interoperability
sil-linkage ::= 'public'
sil-linkage ::= 'hidden'
sil-linkage ::= 'shared'
sil-linkage ::= 'private'
sil-linkage ::= 'public_external'
sil-linkage ::= 'hidden_external'
sil-linkage ::= 'non_abi'
A linkage specifier controls the situations in which two objects in different SIL modules are linked, i.e. treated as the same object.
A linkage is external if it ends with the suffix external. An object must be a definition if its linkage is not external.
All functions, global variables, and witness tables have linkage. The default linkage of a definition is public. The default linkage of a declaration is public_external. (These may eventually change to hidden and hidden_external, respectively.)
On a global variable, an external linkage is what indicates that the variable is not a definition. A variable lacking an explicit linkage specifier is presumed a definition (and thus gets the default linkage for definitions, public.)
Two objects are linked if they have the same name and are mutually visible:
- An object with public or public_external linkage is always visible.
- An object with hidden, hidden_external, or shared linkage is visible only to objects in the same Swift module.
- An object with private linkage is visible only to objects in the same SIL module.
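The visibility rules above can be modelled in a few lines of Swift. This is a minimal sketch, not compiler code: the SILObject record, its field names, and the enum case names are hypothetical, and non_abi linkage is left out for brevity.

```swift
// Hypothetical model of the linkage kinds (non_abi omitted).
enum Linkage {
    case publicLinkage, hiddenLinkage, sharedLinkage, privateLinkage
    case publicExternal, hiddenExternal
}

// Hypothetical record of a SIL object (function, global, or witness table).
struct SILObject {
    let name: String
    let linkage: Linkage
    let swiftModule: String
    let silModule: String
}

// Is `object` visible to `observer` under the three rules above?
func isVisible(_ object: SILObject, to observer: SILObject) -> Bool {
    switch object.linkage {
    case .publicLinkage, .publicExternal:
        return true
    case .hiddenLinkage, .hiddenExternal, .sharedLinkage:
        return object.swiftModule == observer.swiftModule
    case .privateLinkage:
        return object.silModule == observer.silModule
    }
}

// Two objects are linked if they share a name and are mutually visible.
func areLinked(_ a: SILObject, _ b: SILObject) -> Bool {
    return a.name == b.name && isVisible(a, to: b) && isVisible(b, to: a)
}

let definition = SILObject(name: "f", linkage: .hiddenLinkage,
                           swiftModule: "App", silModule: "a.sil")
let sameModuleDecl = SILObject(name: "f", linkage: .hiddenExternal,
                               swiftModule: "App", silModule: "b.sil")
let otherModuleDecl = SILObject(name: "f", linkage: .hiddenExternal,
                                swiftModule: "Lib", silModule: "c.sil")
print(areLinked(definition, sameModuleDecl), areLinked(definition, otherModuleDecl))
```

In this model a hidden definition links with a hidden_external declaration from another SIL module of the same Swift module, but not with a like-named declaration in a different Swift module.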
Note that the linked relationship is an equivalence relation: it is reflexive, symmetric, and transitive.
If two objects are linked, they must have the same type.
If two objects are linked, they must have the same linkage, except:
- A public object may be linked to a public_external object.
- A hidden object may be linked to a hidden_external object.
If two objects are linked, at most one may be a definition, unless:
- both objects have shared linkage or
- at least one of the objects has an external linkage.
If two objects are linked, and both are definitions, then the definitions must be semantically equivalent. This equivalence may exist only on the level of user-visible semantics of well-defined code; it should not be taken to guarantee that the linked definitions are exactly operationally equivalent. For example, one definition of a function might copy a value out of an address parameter, while another may have had an analysis applied to prove that said value is not needed.
If an object has any uses, then it must be linked to a definition with non-external linkage.
The non_abi linkage is a special linkage used for definitions which only exist in serialized SIL, and do not define visible symbols in the object file.
A definition with non_abi linkage behaves like it has shared linkage, except that it must be serialized in the SIL module even if not referenced from anywhere else in the module. For example, this means it is considered a root for dead function elimination.
When a non_abi definition is deserialized, it will have shared_external linkage.
There is no non_abi_external linkage. Instead, when referencing a non_abi declaration that is defined in a different translation unit from the same Swift module, you must use hidden_external linkage.
- public definitions are unique and visible everywhere in the program. In LLVM IR, they will be emitted with external linkage and default visibility.
- hidden definitions are unique and visible only within the current Swift module. In LLVM IR, they will be emitted with external linkage and hidden visibility.
- private definitions are unique and visible only within the current SIL module. In LLVM IR, they will be emitted with private linkage.
- shared definitions are visible only within the current Swift module. They can be linked only with other shared definitions, which must be equivalent; therefore, they only need to be emitted if actually used. In LLVM IR, they will be emitted with linkonce_odr linkage and hidden visibility.
- public_external and hidden_external objects always have visible definitions somewhere else. If this object nonetheless has a definition, it's only for the benefit of optimization or analysis. In LLVM IR, declarations will have external linkage and definitions (if actually emitted as definitions) will have available_externally linkage.
decl ::= sil-vtable
sil-vtable ::= 'sil_vtable' identifier '{' sil-vtable-entry* '}'
sil-vtable-entry ::= sil-decl-ref ':' sil-linkage? sil-function-name
SIL represents dynamic dispatch for class methods using the class_method, super_method, objc_method, and objc_super_method instructions.
The potential destinations for class_method and super_method are tracked in sil_vtable declarations for every class type. The declaration contains a mapping from every method of the class (including those inherited from its base class) to the SIL function that implements the method for that class:
class A {
  func foo()
  func bar()
  func bas()
}

sil @A_foo : $@convention(thin) (@owned A) -> ()
sil @A_bar : $@convention(thin) (@owned A) -> ()
sil @A_bas : $@convention(thin) (@owned A) -> ()

sil_vtable A {
  #A.foo: @A_foo
  #A.bar: @A_bar
  #A.bas: @A_bas
}

class B : A {
  func bar()
}

sil @B_bar : $@convention(thin) (@owned B) -> ()

sil_vtable B {
  #A.foo: @A_foo
  #A.bar: @B_bar
  #A.bas: @A_bas
}

class C : B {
  func bas()
}

sil @C_bas : $@convention(thin) (@owned C) -> ()

sil_vtable C {
  #A.foo: @A_foo
  #A.bar: @B_bar
  #A.bas: @C_bas
}
Note that the declaration reference in the vtable is to the least-derived method visible through that class (in the example above, B's vtable references A.bar and not B.bar, and C's vtable references A.bas and not C.bas). The Swift AST maintains override relationships between declarations that can be used to look up overridden methods in the SIL vtable for a derived class (such as C.bas in C's vtable).
In case the SIL function is a thunk, the function name is preceded with the linkage of the original implementing function.
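The effect of the three vtables above can be observed from Swift source. The sketch below is illustrative only: the methods return strings (an assumption made so that the selected implementation is observable), and an optimizing build may devirtualize these calls rather than go through a vtable at run time.

```swift
class A {
    func foo() -> String { return "A.foo" }
    func bar() -> String { return "A.bar" }
    func bas() -> String { return "A.bas" }
}

class B: A {
    override func bar() -> String { return "B.bar" }
}

class C: B {
    override func bas() -> String { return "C.bas" }
}

// Each call goes through a reference of static type A; the implementation
// chosen is the one recorded for that slot in the dynamic class's vtable.
let objects: [A] = [A(), B(), C()]
let dispatched = objects.map { [$0.foo(), $0.bar(), $0.bas()] }
print(dispatched)
```

The row produced for the C instance matches the entries of sil_vtable C: foo from A, bar from B, and bas from C.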
decl ::= sil-witness-table
sil-witness-table ::= 'sil_witness_table' sil-linkage?
normal-protocol-conformance '{' sil-witness-entry* '}'
SIL encodes the information needed for dynamic dispatch of generic types into witness tables. This information is used to produce runtime dispatch tables when generating binary code. It can also be used by SIL optimizations to specialize generic functions. A witness table is emitted for every declared explicit conformance. Generic types share one generic witness table for all of their instances. Derived classes inherit the witness tables of their base class.
protocol-conformance ::= normal-protocol-conformance
protocol-conformance ::= 'inherit' '(' protocol-conformance ')'
protocol-conformance ::= 'specialize' '<' substitution* '>'
'(' protocol-conformance ')'
protocol-conformance ::= 'dependent'
normal-protocol-conformance ::= identifier ':' identifier 'module' identifier
Witness tables are keyed by protocol conformance, which is a unique identifier for a concrete type's conformance to a protocol.
- A normal protocol conformance names a (potentially unbound generic) type, the protocol it conforms to, and the module in which the type or extension declaration that provides the conformance appears. These correspond 1:1 to protocol conformance declarations in the source code.
- If a derived class conforms to a protocol through inheritance from its base class, this is represented by an inherited protocol conformance, which simply references the protocol conformance for the base class.
- If an instance of a generic type conforms to a protocol, it does so with a specialized conformance, which provides the generic parameter bindings to the normal conformance, which should be for a generic type.
Witness tables are only directly associated with normal conformances. Inherited and specialized conformances indirectly reference the witness table of the underlying normal conformance.
sil-witness-entry ::= 'base_protocol' identifier ':' protocol-conformance
sil-witness-entry ::= 'method' sil-decl-ref ':' sil-function-name
sil-witness-entry ::= 'associated_type' identifier
sil-witness-entry ::= 'associated_type_protocol'
'(' identifier ':' identifier ')' ':' protocol-conformance
Witness tables consist of the following entries:
- Base protocol entries provide references to the protocol conformances that satisfy the witnessed protocols' inherited protocols.
- Method entries map a method requirement of the protocol to a SIL function that implements that method for the witness type. One method entry must exist for every required method of the witnessed protocol.
- Associated type entries map an associated type requirement of the protocol to the type that satisfies that requirement for the witness type. Note that the witness type is a source-level Swift type and not a SIL type. One associated type entry must exist for every required associated type of the witnessed protocol.
- Associated type protocol entries map a protocol requirement on an associated type to the protocol conformance that satisfies that requirement for the associated type.
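To relate the four entry kinds above to source code, here is a small Swift sketch. The Named and Container protocols and the IntBox type are hypothetical, and the comments describe which kind of entry each requirement would be expected to produce according to the list above; the exact table the compiler emits is not shown.

```swift
protocol Named {
    var name: String { get }
}

// Container inherits from Named: expected to yield a base protocol entry.
protocol Container: Named {
    // Expected to yield an associated type entry for Element, plus an
    // associated type protocol entry for the requirement Element: Equatable.
    associatedtype Element: Equatable
    // Expected to yield a method entry.
    func first() -> Element?
}

// One normal conformance; the witness type for Element is the Swift type Int.
struct IntBox: Container {
    typealias Element = Int
    let name = "IntBox"
    let items: [Int]
    func first() -> Int? { return items.first }
}

// A generic function reaches the requirements through the conformance.
func firstElement<C: Container>(of container: C) -> C.Element? {
    return container.first()
}

let box = IntBox(items: [7, 8])
print(box.name, firstElement(of: box) ?? -1)
```

Calls such as firstElement(of:) are the kind of generic code that either dispatches through the witness table at run time or is specialized by SIL optimizations.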
decl ::= sil-default-witness-table
sil-default-witness-table ::= 'sil_default_witness_table'
identifier minimum-witness-table-size
'{' sil-default-witness-entry* '}'
minimum-witness-table-size ::= integer
SIL encodes requirements with resilient default implementations in a default witness table. We say a requirement has a resilient default implementation if the following conditions hold:
- The requirement has a default implementation
- The requirement is either the last requirement in the protocol, or all subsequent requirements also have resilient default implementations
The set of requirements with resilient default implementations is stored in protocol metadata.
The minimum witness table size is the size of the witness table, in words, not including any requirements with resilient default implementations.
Any conforming witness table must have a size between the minimum size, and the maximum size, which is equal to the minimum size plus the number of default requirements.
At load time, if the runtime encounters a witness table with fewer than the maximum number of witnesses, the witness table is copied, with default witnesses copied in. This ensures that callers can always expect to find the correct number of requirements in each witness table, and new requirements can be added by the framework author, without breaking client code, as long as the new requirements have resilient default implementations.
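The size bounds and the load-time fill-in described above can be sketched as a small Swift model. It is an assumption-heavy illustration, not the runtime's implementation: witness tables are modelled as arrays of names, and instantiateWitnessTable and its parameters are hypothetical.

```swift
// Hypothetical model: a witness table is an array of witness names, and
// `defaultWitnesses` holds the resilient defaults for the trailing requirements.
func instantiateWitnessTable(
    witnesses: [String],
    minimumSize: Int,
    defaultWitnesses: [String]
) -> [String]? {
    // Maximum size = minimum size + number of default requirements.
    let maximumSize = minimumSize + defaultWitnesses.count
    guard witnesses.count >= minimumSize, witnesses.count <= maximumSize else {
        return nil  // not a valid conforming witness table
    }
    // Number of trailing default requirements the conformance already supplies.
    let provided = witnesses.count - minimumSize
    // Copy the table and fill the missing tail with default witnesses.
    return witnesses + Array(defaultWitnesses[provided...])
}

let defaults = ["default_c", "default_d"]
let oldClient = instantiateWitnessTable(
    witnesses: ["a", "b"], minimumSize: 2, defaultWitnesses: defaults)
let newerClient = instantiateWitnessTable(
    witnesses: ["a", "b", "custom_c"], minimumSize: 2, defaultWitnesses: defaults)
let invalid = instantiateWitnessTable(
    witnesses: ["a"], minimumSize: 2, defaultWitnesses: defaults)
print(oldClient ?? [], newerClient ?? [], invalid == nil)
```

After instantiation every table in the model has the maximum size, which is the property callers rely on.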
Default witness tables are keyed by the protocol itself. Only protocols with public visibility need a default witness table; private and internal protocols are never seen outside the module, therefore there are no resilience issues with adding new requirements.
sil-default-witness-entry ::= 'method' sil-decl-ref ':' sil-function-name
Default witness tables currently contain only one type of entry:
- Method entries map a method requirement of the protocol to a SIL function that implements that method in a manner suitable for all witness types.
decl ::= sil-global-variable
static-initializer ::= '=' '{' sil-instruction-def* '}'
sil-global-variable ::= 'sil_global' sil-linkage identifier ':' sil-type
(static-initializer)?
SIL representation of a global variable.
Global variable access is performed by the alloc_global, global_addr and global_value instructions.
A global can have a static initializer if its initial value can be composed of literals. The static initializer is represented as a list of literal and aggregate instructions where the last instruction is the top-level value of the static initializer:
sil_global hidden @$S4test3varSiv : $Int = {
  %0 = integer_literal $Builtin.Int64, 27
  %initval = struct $Int (%0 : $Builtin.Int64)
}
If a global does not have a static initializer, the alloc_global instruction must be performed prior to an access to initialize the storage. Once a global's storage has been initialized, global_addr is used to project the value.
If the last instruction in the static initializer is an object instruction, the global variable is a statically initialized object. In this case the variable cannot be used as an l-value, i.e. the reference to the object cannot be modified. As a consequence, the variable cannot be accessed with global_addr but only with global_value.
decl ::= sil-differentiability-witness
sil-differentiability-witness ::=
'sil_differentiability_witness'
sil-linkage?
'[' 'parameters' sil-differentiability-witness-function-index-list ']'
'[' 'results' sil-differentiability-witness-function-index-list ']'
generic-parameter-clause?
sil-function-name ':' sil-type
sil-differentiability-witness-body?
sil-differentiability-witness-body ::=
'{' sil-differentiability-witness-entry?
sil-differentiability-witness-entry? '}'
sil-differentiability-witness-entry ::=
sil-differentiability-witness-entry-kind ':'
sil-entry-name ':' sil-type
sil-differentiability-witness-entry-kind ::= 'jvp' | 'vjp'
SIL encodes function differentiability via differentiability witnesses.
Differentiability witnesses map a "key" (including an "original" SIL function) to derivative SIL functions.
Differentiability witnesses are keyed by the following:
- An "original" SIL function name.
- Differentiability parameter indices.
- Differentiability result indices.
- A generic parameter clause, representing differentiability generic requirements.
Differentiability witnesses may have a body, specifying derivative functions for the key. Verification checks that derivative functions have the expected type based on the key.
sil_differentiability_witness hidden [parameters 0] [results 0] <T where T : Differentiable> @id : $@convention(thin) (T) -> T {
  jvp: @id_jvp : $@convention(thin) (T) -> (T, @owned @callee_guaranteed (T.TangentVector) -> T.TangentVector)
  vjp: @id_vjp : $@convention(thin) (T) -> (T, @owned @callee_guaranteed (T.TangentVector) -> T.TangentVector)
}
During SILGen, differentiability witnesses are emitted for the following:
- @differentiable declaration attributes.
- @derivative declaration attributes. Registered derivative functions become differentiability witness entries.
The SIL differentiation transform canonicalizes differentiability witnesses, filling in missing entries.
Differentiability witness entries are accessed via the differentiability_witness_function instruction.
Dataflow errors may exist in raw SIL. Swift's semantics defines these conditions as errors, so they must be diagnosed by diagnostic passes and must not exist in canonical SIL.
Swift requires that all local variables be initialized before use. In constructors, all instance variables of a struct, enum, or class type must be initialized before the object is used and before the constructor returns.
The unreachable terminator is emitted in raw SIL to mark incorrect control flow, such as a non-Void function failing to return a value, or a switch statement failing to cover all possible values of its subject. The guaranteed dead code elimination pass can eliminate truly unreachable basic blocks, or unreachable instructions may be dominated by applications of functions returning uninhabited types. An unreachable instruction that survives guaranteed DCE and is not immediately preceded by a no-return application is a dataflow error.
Some operations, such as failed unconditional checked conversions or the Builtin.trap compiler builtin, cause a runtime failure, which unconditionally terminates the current actor. If it can be proven that a runtime failure will occur or did occur, runtime failures may be reordered so long as they remain well-ordered relative to operations external to the actor or the program as a whole. For instance, with overflow checking on integer arithmetic enabled, a simple for loop that reads inputs in from one or more arrays and writes outputs to another array, all local to the current actor, may cause runtime failure in the update operations:
// Given unknown start and end values, this loop may overflow
for var i = unknownStartValue; i != unknownEndValue; ++i {
...
}
It is permitted to hoist the overflow check and associated runtime failure out of the loop itself and check the bounds of the loop prior to entering it, so long as the loop body has no observable effect outside of the current actor.
Incorrect use of some operations is undefined behavior, such as invalid unchecked casts involving Builtin.RawPointer types, or use of compiler builtins that lower to LLVM instructions with undefined behavior at the LLVM level. A SIL program with undefined behavior is meaningless, much like undefined behavior in C, and has no predictable semantics. Undefined behavior should not be triggered by valid SIL emitted by a correct Swift program using a correct standard library, but cannot in all cases be diagnosed or verified at the SIL level.
This section describes how Swift functions are emitted in SIL.
The Swift calling convention is the one used by default for native Swift functions.
Tuples in the input type of the function are recursively destructured into separate arguments, both in the entry point basic block of the callee, and in the apply instructions used by callers:
func foo(_ x:Int, y:Int)

sil @foo : $(x:Int, y:Int) -> () {
entry(%x : $Int, %y : $Int):
  ...
}

func bar(_ x:Int, y:(Int, Int))

sil @bar : $(x:Int, y:(Int, Int)) -> () {
entry(%x : $Int, %y0 : $Int, %y1 : $Int):
  ...
}

func call_foo_and_bar() {
  foo(1, 2)
  bar(4, (5, 6))
}

sil @call_foo_and_bar : $() -> () {
entry:
  ...
  %foo = function_ref @foo : $(x:Int, y:Int) -> ()
  %foo_result = apply %foo(%1, %2) : $(x:Int, y:Int) -> ()
  ...
  %bar = function_ref @bar : $(x:Int, y:(Int, Int)) -> ()
  %bar_result = apply %bar(%4, %5, %6) : $(x:Int, y:(Int, Int)) -> ()
}
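The recursive destructuring shown above can be modelled as a flattening of a nested type description. The Swift sketch below is a simplified illustration: ParamType and destructure are hypothetical names, and types are represented only by their names.

```swift
// Hypothetical, simplified model of a formal input type.
enum ParamType {
    case scalar(String)
    case tuple([ParamType])
}

// Recursively expand tuples into a flat list of scalar arguments.
func destructure(_ type: ParamType) -> [String] {
    switch type {
    case .scalar(let name):
        return [name]
    case .tuple(let elements):
        return elements.flatMap { destructure($0) }
    }
}

// Input type of bar: (x: Int, y: (Int, Int))
let barInput = ParamType.tuple([
    .scalar("Int"),
    .tuple([.scalar("Int"), .scalar("Int")]),
])
let barArguments = destructure(barInput)
print(barArguments)
```

The three resulting entries correspond to %x, %y0, and %y1 in the entry block of @bar, and to the three operands of the apply in @call_foo_and_bar.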
Calling a function with trivial value types as inputs and outputs simply passes the arguments by value. This Swift function:
func foo(_ x:Int, y:Float) -> UnicodeScalar
foo(x, y)
gets called in SIL as:
%foo = function_ref @foo : $(Int, Float) -> UnicodeScalar
%z = apply %foo(%x, %y) : $(Int, Float) -> UnicodeScalar
NOTE: This section speaks only in terms of rules of thumb. The actual behavior of arguments with respect to ownership is defined by the argument's convention attribute (e.g. @owned), not the calling convention itself.
Reference type arguments are passed in at +1 retain count and consumed by the callee. A reference type return value is returned at +1 and consumed by the caller. Value types with reference type components have their reference type components each retained and released the same way. This Swift function:
class A {}
func bar(_ x:A) -> (Int, A) { ... }
bar(x)
gets called in SIL as:
%bar = function_ref @bar : $(A) -> (Int, A)
strong_retain %x : $A
%z = apply %bar(%x) : $(A) -> (Int, A)
// ... use %z ...
%z_1 = tuple_extract %z : $(Int, A), 1
strong_release %z_1 : $A
When applying a thick function value as a callee, the function value is also consumed at +1 retain count.
For address-only arguments, the caller allocates a copy and passes the address of the copy to the callee. The callee takes ownership of the copy and is responsible for destroying or consuming the value, though the caller must still deallocate the memory. For address-only return values, the caller allocates an uninitialized buffer and passes its address as the first argument to the callee. The callee must initialize this buffer before returning. This Swift function:
@API struct A {}
func bas(_ x:A, y:Int) -> A { return x }
var z = bas(x, y)
// ... use z ...
gets called in SIL as:
%bas = function_ref @bas : $(A, Int) -> A
%z = alloc_stack $A
%x_arg = alloc_stack $A
copy_addr %x to [initialize] %x_arg : $*A
apply %bas(%z, %x_arg, %y) : $(A, Int) -> A
dealloc_stack %x_arg : $*A // callee consumes %x_arg, caller deallocs
// ... use %z ...
destroy_addr %z : $*A
dealloc_stack %z : $*A
The implementation of @bas is then responsible for consuming %x_arg and initializing %z.
Tuple arguments are destructured regardless of the address-only-ness of the tuple type. The destructured fields are passed individually according to the above convention. This Swift function:
@API struct A {}
func zim(_ x:Int, y:A, (z:Int, w:(A, Int)))
zim(x, y, (z, w))
gets called in SIL as:
%zim = function_ref @zim : $(x:Int, y:A, (z:Int, w:(A, Int))) -> ()
%y_arg = alloc_stack $A
copy_addr %y to [initialize] %y_arg : $*A
%w_0_addr = tuple_element_addr %w : $*(A, Int), 0
%w_0_arg = alloc_stack $A
copy_addr %w_0_addr to [initialize] %w_0_arg : $*A
%w_1_addr = tuple_element_addr %w : $*(A, Int), 1
%w_1 = load %w_1_addr : $*Int
apply %zim(%x, %y_arg, %z, %w_0_arg, %w_1) : $(x:Int, y:A, (z:Int, w:(A, Int))) -> ()
dealloc_stack %w_0_arg : $*A
dealloc_stack %y_arg : $*A
Variadic arguments and tuple elements are packaged into an array and passed as a single array argument. This Swift function:
func zang(_ x:Int, (y:Int, z:Int...), v:Int, w:Int...)
zang(x, (y, z0, z1), v, w0, w1, w2)
gets called in SIL as:
%zang = function_ref @zang : $(x:Int, (y:Int, z:Int...), v:Int, w:Int...) -> ()
%zs = <<make array from %z0, %z1>>
%ws = <<make array from %w0, %w1, %w2>>
apply %zang(%x, %y, %zs, %v, %ws) : $(x:Int, (y:Int, z:Int...), v:Int, w:Int...) -> ()
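The same packaging can be observed from the callee's side in Swift source: a variadic parameter is received as a single array value. The following is a minimal sketch; the function name sumAll is hypothetical and not part of the example above.

```swift
// Hypothetical example: `sumAll` is an illustrative name, not from the text above.
// The variadic parameter `values` is seen by the callee as one [Int] array.
func sumAll(_ first: Int, _ values: Int...) -> Int {
    let packaged: [Int] = values   // type-checks because `values` is an array
    return first + packaged.reduce(0, +)
}

// Four source-level arguments; the last three travel as one array.
let variadicTotal = sumAll(1, 2, 3, 4)
```

An empty variadic list is passed as an empty array, so sumAll(5) is also a valid call.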
@inout arguments are passed into the entry point by address. The callee does not take ownership of the referenced memory. The referenced memory must be initialized upon function entry and exit. If the @inout argument refers to a fragile physical variable, then the argument is the address of that variable. If the @inout argument refers to a logical property, then the argument is the address of a caller-owned writeback buffer. It is the caller's responsibility to initialize the buffer by storing the result of the property getter prior to calling the function and to write back to the property on return by loading from the buffer and invoking the setter with the final value. This Swift function:
func inout(_ x: inout Int) {
x = 1
}
gets lowered to SIL as:
sil @inout : $(@inout Int) -> () {
entry(%x : $*Int):
%1 = integer_literal $Int, 1
store %1 to %x : $*Int
%2 = tuple ()
return %2 : $()
}
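The writeback sequence for a logical property can be observed from Swift source by logging the accessors. This is a sketch under assumptions: the names Thermostat, bump, and writebackEvents are hypothetical, and the log only shows the order of getter, callee, and setter, not the buffer itself.

```swift
// Hypothetical names throughout; a sketch of the caller-side writeback order.
var writebackEvents: [String] = []

struct Thermostat {
    var storage: Int = 20
    // A logical (computed) property: it has no address of its own, so the
    // caller must go through a temporary buffer when passing it inout.
    var degrees: Int {
        get { writebackEvents.append("get"); return storage }
        set { writebackEvents.append("set \(newValue)"); storage = newValue }
    }
}

func bump(_ x: inout Int) {
    writebackEvents.append("callee")
    x += 1
}

var thermostat = Thermostat()
bump(&thermostat.degrees)   // getter runs, then bump, then setter
```

After the call, the log reads get, callee, set 21, matching the caller responsibilities described above.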
The method calling convention is currently identical to the freestanding function convention. Methods are considered to be curried functions, taking the "self" argument as their outer argument clause, and the method arguments as the inner argument clause(s). The "self" argument is thus passed last:
struct Foo {
func method(_ x:Int) -> Int {}
}
sil @Foo_method_1 : $((x : Int), @inout Foo) -> Int { ... }
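The curried view of methods is also visible in Swift source, where an unapplied method reference takes the self value as its outer argument. A minimal sketch, using the hypothetical type Point and method shifted(by:):

```swift
// Hypothetical type and method names, used only to illustrate currying.
struct Point {
    var x: Int
    func shifted(by dx: Int) -> Int { return x + dx }
}

// The unapplied method is a curried function: self first, then the arguments.
let curried: (Point) -> (Int) -> Int = Point.shifted(by:)
let shiftedX = curried(Point(x: 10))(5)
```

Note that this shows the source-level curried type only; the position of self as the last SIL argument is a property of the lowered function type.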
The witness method calling convention is used by protocol witness methods in witness tables. It is identical to the method calling convention except for its handling of generic type parameters. For non-witness methods, the machine-level convention for passing type parameter metadata may be arbitrarily dependent on static aspects of the function signature, but because witnesses must be polymorphically dispatchable on their Self type, the Self-related metadata for a witness must be passed in a maximally abstracted manner.
In Swift's C module importer, C types are always mapped to Swift types considered trivial by SIL. SIL does not concern itself with platform ABI requirements for indirect return, register vs. stack passing, etc.; C function arguments and returns in SIL are always by value regardless of the platform calling convention.
SIL (and therefore Swift) cannot currently invoke variadic C functions.
Objective-C methods use the same argument and return value ownership rules as ARC Objective-C. Selector families and the ns_consumed, ns_returns_retained, etc. attributes from imported Objective-C definitions are honored.
Applying a @convention(block) value does not consume the block.
In SIL, the "self" argument of an Objective-C method is uncurried to the last argument of the uncurried type, just like a native Swift method:
@objc class NSString {
func stringByPaddingToLength(Int) withString(NSString) startingAtIndex(Int)
}
sil @NSString_stringByPaddingToLength_withString_startingAtIndex \
: $((Int, NSString, Int), NSString)
That self is passed as the first argument at the IR level is abstracted away in SIL, as is the existence of the _cmd selector argument.
SIL supports two types of Type Based Alias Analysis (TBAA): Class TBAA and Typed Access TBAA.
Class instances and other heap object references are pointers at the implementation level, but unlike SIL addresses, they are first class values and can be captured and aliased. Swift, however, is memory-safe and statically typed, so aliasing of classes is constrained by the type system as follows:
- A Builtin.NativeObject may alias any native Swift heap object, including a Swift class instance, a box allocated by alloc_box, or a thick function's closure context. It may not alias natively Objective-C class instances.
- An AnyObject or Builtin.BridgeObject may alias any class instance, whether Swift or Objective-C, but may not alias non-class-instance heap objects.
- Two values of the same class type $C may alias. Two values of related class type $B and $D, where there is a subclass relationship between $B and $D, may alias. Two values of unrelated class types may not alias. This includes different instantiations of a generic class type, such as $C<Int> and $C<Float>, which currently may never alias.
- Without whole-program visibility, values of archetype or protocol type must be assumed to potentially alias any class instance. Even if it is locally apparent that a class does not conform to that protocol, another component may introduce a conformance by an extension. Similarly, a generic class instance, such as $C<T> for archetype T, must be assumed to potentially alias concrete instances of the generic type, such as $C<Int>, because Int is a potential substitution for T.
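The related-type and AnyObject rules above can be illustrated at the source level with identity comparisons. This is a sketch only; the class names Base, Derived, and Unrelated are hypothetical.

```swift
// Hypothetical classes used to illustrate which static types may alias.
class Base {}
final class Derived: Base {}
final class Unrelated {}

let derived = Derived()
let asBase: Base = derived        // related class types: may alias
let asAny: AnyObject = derived    // AnyObject may alias any class instance

// Both references denote the same heap object as `derived`.
let sameObject = (asBase === derived) && (asAny === derived)

// Values typed as Derived and as Unrelated can never be the same object.
let other = Unrelated()
let distinctObjects = (asAny !== other)
```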
A violation of the above aliasing rules only results in undefined behavior if the aliasing references are dereferenced within Swift code. For example, __SwiftNativeNS[Array|Dictionary|String] classes alias with NS[Array|Dictionary|String] classes even though they are not statically related. Since Swift never directly accesses stored properties on the Foundation classes, this aliasing does not pose a danger.
Define a typed access of an address or reference as one of the following:
- Any instruction that performs a typed read or write operation upon the memory at the given location (e.g. load, store).
- Any instruction that yields a typed offset of the pointer by performing a typed projection operation (e.g. ref_element_addr, tuple_element_addr).
With limited exceptions, it is undefined behavior to perform a typed access to an address or reference if the addressed memory is not bound to the relevant type.
This allows the optimizer to assume that two addresses cannot alias if there does not exist a substitution of archetypes that could cause one of the types to be the type of a subobject of the other. Additionally, this applies to the types of the values from which the addresses were derived via a typed projection.
Consider the following SIL:
struct Element {
var i: Int
}
struct S1 {
var elt: Element
}
struct S2 {
var elt: Element
}
%adr1 = struct_element_addr %ptr1 : $*S1, #S1.elt
%adr2 = struct_element_addr %ptr2 : $*S2, #S2.elt
The optimizer may assume that %adr1 does not alias with %adr2 because the values that the addresses are derived from (%ptr1 and %ptr2) have unrelated types. However, in the following example, the optimizer cannot assume that %adr1 does not alias with %adr2 because %adr2 is derived from a cast, and any subsequent typed operations on the address will refer to the common Element type:
%adr1 = struct_element_addr %ptr1 : $*S1, #S1.elt
%adr2 = pointer_to_address %ptr2 : $Builtin.RawPointer to $*Element
Exceptions to typed access TBAA rules are only allowed for blessed alias-introducing operations. This permits limited type-punning. The only current exception is the non-strict pointer_to_address variant. The optimizer must be able to defensively determine that none of the roots of an address are alias-introducing operations. An address root is the operation that produces the address prior to applying any typed projections, indexing, or casts. The following are valid address roots:
- Object allocation that generates an address, such as alloc_stack and alloc_box.
- Address-type function arguments. These are crucially not considered alias-introducing operations. It is illegal for the SIL optimizer to form a new function argument from an arbitrary address-type value. Doing so would require the optimizer to guarantee that the new argument both has a non-alias-introducing address root and can be properly represented by the calling convention (address types do not have a fixed representation).
- A strict cast from an untyped pointer, pointer_to_address [strict]. It is illegal for pointer_to_address [strict] to derive its address from an alias-introducing operation's value. A type punned address may only be produced from an opaque pointer via a non-strict pointer_to_address at the point of conversion.
Address-to-address casts, via unchecked_addr_cast, transparently forward their source's address root, just like typed projections.
Address-type basic block arguments can be conservatively considered alias-introducing operations; they are uncommon enough not to matter and may eventually be prohibited altogether.
Although some pointer producing intrinsics exist, they do not need to be considered alias-introducing exceptions to TBAA rules. Builtin.inttoptr produces a Builtin.RawPointer which is not interesting because by definition it may alias with everything. Similarly, the LLVM builtins Builtin.bitcast and Builtin.trunc|sext|zextBitCast cannot produce typed pointers. These pointer values must be converted to an address via pointer_to_address before typed access can occur. Whether the pointer_to_address is strict determines whether aliasing may occur.
Memory may be rebound to an unrelated type. Addresses to unrelated types may alias as long as typed access only occurs while memory is bound to the relevant type. Consequently, the optimizer cannot outright assume that addresses accessed as unrelated types are nonaliasing. For example, pointer comparison cannot be eliminated simply because the two addresses derived from those pointers are accessed as unrelated types at different program points.
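At the Swift source level, rebinding corresponds to the bindMemory(to:capacity:) API on raw pointers. The sketch below is illustrative only (all variable names are hypothetical): it binds the same allocation to two unrelated types at different program points, performing typed access only while the memory is bound to the matching type.

```swift
// Hypothetical sketch: one allocation, bound first to Int64 and then to UInt64.
let rawBuffer = UnsafeMutableRawPointer.allocate(
    byteCount: MemoryLayout<Int64>.size,
    alignment: MemoryLayout<Int64>.alignment)

// Phase 1: memory is bound to Int64; only Int64 typed access is performed.
let asSigned = rawBuffer.bindMemory(to: Int64.self, capacity: 1)
asSigned.initialize(to: -1)
let signedValue = asSigned.pointee
asSigned.deinitialize(count: 1)

// Phase 2: the same memory is rebound to the unrelated type UInt64.
let asUnsigned = rawBuffer.bindMemory(to: UInt64.self, capacity: 1)
asUnsigned.initialize(to: 7)
let unsignedValue = asUnsigned.pointee
asUnsigned.deinitialize(count: 1)

// The two typed pointers compare equal even though their types are unrelated,
// which is why such a comparison cannot be folded away on type grounds alone.
let sameAddress = UnsafeRawPointer(asSigned) == UnsafeRawPointer(asUnsigned)
rawBuffer.deallocate()
```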
In general, analyses can assume that independent values are independently assured of validity. For example, a class method may return a class reference:
bb0(%0 : $MyClass):
%1 = class_method %0 : $MyClass, #MyClass.foo
%2 = apply %1(%0) : $@convention(method) (@guaranteed MyClass) -> @owned MyOtherClass
// use of %2 goes here; no use of %0
strong_release %2 : $MyOtherClass
strong_release %0 : $MyClass
The optimizer is free to move the release of %0 to immediately after the call here, because %2 can be assumed to be an independently-managed value, and because Swift generally permits the reordering of destructors.
However, some instructions do create values that are intrinsically dependent on their operands. For example, the result of ref_element_addr will become a dangling pointer if the base is released too soon. This is captured by the concept of value dependence, and any transformation which can reorder the destruction of a value around another operation must remain conscious of it.
A value %1 is said to be value-dependent on a value %0 if:
- %1 is the result and %0 is the first operand of one of the following instructions:
  - ref_element_addr
  - struct_element_addr
  - tuple_element_addr
  - unchecked_take_enum_data_addr
  - pointer_to_address
  - address_to_pointer
  - index_addr
  - index_raw_pointer
  - possibly some other conversions
- %1 is the result of mark_dependence and %0 is either of the operands.
- %1 is the value address of a box allocation instruction of which %0 is the box reference.
- %1 is the result of a struct, tuple, or enum instruction and %0 is an operand.
- %1 is the result of projecting out a subobject of %0 with tuple_extract, struct_extract, unchecked_enum_data, select_enum, or select_enum_addr.
- %1 is the result of select_value and %0 is one of the cases.
- %1 is a basic block parameter and %0 is the corresponding argument from a branch to that block.
- %1 is the result of a load from %0. However, the value dependence is cut after the first attempt to manage the value of %1, e.g. by retaining it.
- Transitivity: there exists a value %2 which %1 depends on and which depends on %0. However, transitivity does not apply to different subobjects of a struct, tuple, or enum.
Note, however, that an analysis is not required to track dependence through memory. Nor is it required to consider the possibility of dependence being established "behind the scenes" by opaque code, such as by a method returning an unsafe pointer to a class property. The dependence is required to be locally obvious in a function's SIL instructions. Precautions must be taken against this either by SIL generators (by using mark_dependence appropriately) or by the user (by using the appropriate intrinsics and attributes with unsafe language or library features).
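On the user side, one such precaution in Swift source is withExtendedLifetime, which keeps an object alive across code that reaches it only through an unmanaged handle. The sketch below uses hypothetical names (Payload, readThroughUnmanaged) and is an illustration, not a statement about what SIL any particular compiler version emits for it.

```swift
// Hypothetical sketch: `handle` carries no ownership, so it depends on `owner`.
final class Payload { var value = 7 }

func readThroughUnmanaged() -> Int {
    let owner = Payload()
    let handle = Unmanaged.passUnretained(owner)   // no retain is performed
    // Without this, `owner` has no later uses and could be released early.
    return withExtendedLifetime(owner) {
        handle.takeUnretainedValue().value
    }
}

let payloadValue = readThroughUnmanaged()
```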
Only certain types of SIL value can carry value-dependence:
- SIL address types
- unmanaged pointer types:
  - @sil_unmanaged types
  - Builtin.RawPointer
  - aggregates containing such a type, such as UnsafePointer, possibly recursively
- non-trivial types (but they can be independently managed)
This rule means that casting a pointer to an integer type breaks value-dependence. This restriction is necessary so that reading an Int from a class doesn't force the class to be kept around! A class holding an unsafe reference to an object must use some sort of unmanaged pointer type to do so.
This rule does not include generic or resilient value types which might contain unmanaged pointer types. Analyses are free to assume that e.g. a copy_addr of a generic or resilient value type yields an independently-managed value. The extension of value dependence to types containing obvious unmanaged pointer types is an affordance to make the use of such types more convenient; it does not shift the ultimate responsibility for assuring the safety of unsafe language/library features away from the user.
These instructions allocate and deallocate memory.
sil-instruction ::= 'alloc_stack' '[dynamic_lifetime]'? sil-type (',' debug-var-attr)*
%1 = alloc_stack $T
// %1 has type $*T
Allocates uninitialized memory that is sufficiently aligned on the stack to contain a value of type T. The result of the instruction is the address of the allocated memory.
alloc_stack always allocates memory on the stack, even for runtime-sized types.
alloc_stack marks the start of the lifetime of the value; the allocation must be balanced with a dealloc_stack instruction to mark the end of its lifetime. All alloc_stack allocations must be deallocated prior to returning from a function. If a block has multiple predecessors, the stack height and order of allocations must be consistent coming from all predecessor blocks. alloc_stack allocations must be deallocated in last-in, first-out stack order.
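The balancing rule for straight-line code can be modeled with a small checker. This is a toy sketch, not the SIL verifier: the StackOp type and isBalancedLIFO function are hypothetical, and the model ignores control flow between blocks.

```swift
// Hypothetical model of alloc_stack / dealloc_stack events in one block.
enum StackOp {
    case alloc(String)     // models `%name = alloc_stack ...`
    case dealloc(String)   // models `dealloc_stack %name`
}

// Returns true if every allocation is deallocated in last-in, first-out
// order and nothing is left allocated at the end of the sequence.
func isBalancedLIFO(_ ops: [StackOp]) -> Bool {
    var live: [String] = []
    for op in ops {
        switch op {
        case .alloc(let name):
            live.append(name)
        case .dealloc(let name):
            // The most recent live allocation must be the one deallocated.
            guard live.popLast() == name else { return false }
        }
    }
    return live.isEmpty
}
```

For example, the sequence alloc a, alloc b, dealloc b, dealloc a is accepted, while deallocating a before b is rejected.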
The dynamic_lifetime attribute specifies that the initialization and destruction of the stored value cannot be verified at compile time. This is the case, e.g. for conditionally initialized objects.
The memory is not retainable. To allocate a retainable box for a value type, use alloc_box.
sil-instruction ::= 'alloc_ref'
('[' 'objc' ']')?
('[' 'stack' ']')?
('[' 'tail_elems' sil-type '*' sil-operand ']')*
sil-type
%1 = alloc_ref [stack] $T
%1 = alloc_ref [tail_elems $E * %2 : $Builtin.Word] $T
// $T must be a reference type
// %1 has type $T
// $E is the type of the tail-allocated elements
// %2 must be of a builtin integer type
Allocates an object of reference type T. The object will be initialized with retain count 1; its state will be otherwise uninitialized. The optional objc attribute indicates that the object should be allocated using Objective-C's allocation methods (+allocWithZone:).
The optional stack attribute indicates that the object can be allocated on the stack instead of on the heap. In this case the instruction must be balanced with a dealloc_ref [stack] instruction to mark the end of the object's lifetime.
Note that the stack attribute only specifies that stack allocation is possible. The final decision on stack allocation is made during LLVM IR generation. This is because the decision also depends on the object size, which is not necessarily known at the SIL level.
The optional tail_elems attributes specify the amount of space to be reserved for tail-allocated arrays of given element types and element counts. If there is more than one tail_elems attribute, the tail arrays are allocated in the specified order.
The count-operand must be of a builtin integer type.
The instructions ref_tail_addr and tail_addr can be used to project the tail elements.
The objc attribute cannot be used together with tail_elems.
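A source-level way to obtain an object with tail-allocated elements is the standard library's ManagedBuffer class. The sketch below assumes, without the text above confirming it, that ManagedBuffer storage is realized through tail allocation; the variable names are hypothetical.

```swift
// Sketch: an object with an Int header followed by at least 4 Int elements.
// Assumption: ManagedBuffer is implemented on top of tail allocation.
let tailBuffer = ManagedBuffer<Int, Int>.create(minimumCapacity: 4) { _ in 4 }

// The element storage starts out uninitialized, like the tail arrays above.
tailBuffer.withUnsafeMutablePointerToElements { elements in
    for i in 0..<4 { (elements + i).initialize(to: i * i) }
}

// Project the tail elements again and read one back.
let lastSquare = tailBuffer.withUnsafeMutablePointerToElements { $0[3] }
let storedCount = tailBuffer.header
```

Because Int is a trivial type, the elements need no deinitialization before the buffer is destroyed; non-trivial element types would.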
sil-instruction ::= 'alloc_ref_dynamic'
('[' 'objc' ']')?
('[' 'tail_elems' sil-type '*' sil-operand ']')*
sil-operand ',' sil-type
%1 = alloc_ref_dynamic %0 : $@thick T.Type, $T
%1 = alloc_ref_dynamic [objc] %0 : $@objc_metatype T.Type, $T
%1 = alloc_ref_dynamic [tail_elems $E * %2 : $Builtin.Word] %0 : $@thick T.Type, $T
// $T must be a class type
// %1 has type $T
// $E is the type of the tail-allocated elements
// %2 must be of a builtin integer type
Allocates an object of class type T or a subclass thereof. The dynamic type of the resulting object is specified via the metatype value %0. The object will be initialized with retain count 1; its state will be otherwise uninitialized.
The optional tail_elems and objc attributes have the same effect as for alloc_ref. See alloc_ref for details.
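In Swift source, the situation this instruction models arises when an object is constructed through a metatype value whose dynamic type is only known at run time. A minimal sketch with hypothetical classes Shape and Circle:

```swift
// Hypothetical classes; construction goes through a metatype value.
class Shape {
    required init() {}
    var name: String { return "shape" }
}

final class Circle: Shape {
    override var name: String { return "circle" }
}

// The static type is Shape, but the allocated object has the dynamic type
// carried by `type`, which may be Shape or any subclass.
func makeShape(of type: Shape.Type) -> Shape {
    return type.init()
}

let dynamicShape = makeShape(of: Circle.self)
```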
sil-instruction ::= 'alloc_box' sil-type (',' debug-var-attr)*
%1 = alloc_box $T
// %1 has type $@box T
Allocates a reference-counted @box on the heap large enough to hold a value of type T, along with a retain count and any other metadata required by the runtime. The result of the instruction is the reference-counted @box reference that owns the box. The project_box instruction is used to retrieve the address of the value inside the box.
The box will be initialized with a retain count of 1; the storage will be uninitialized. The box owns the contained value, and releasing it to a retain count of zero destroys the contained value as if by destroy_addr. Releasing a box is undefined behavior if the box's value is uninitialized. To deallocate a box whose value has not been initialized, dealloc_box should be used.
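A typical source-level reason for a value to need a retainable box is a mutable local variable captured by an escaping closure. The sketch below (hypothetical function makeCounter) shows the observable behavior; whether a given compiler actually emits alloc_box here, or promotes the box away, is an assumption that depends on optimization.

```swift
// Hypothetical sketch: `count` must outlive makeCounter's stack frame,
// so its storage is shared with the returned closure.
func makeCounter() -> () -> Int {
    var count = 0
    return {
        count += 1      // mutates the captured storage, not a copy
        return count
    }
}

let counter = makeCounter()
let firstTick = counter()
let secondTick = counter()
```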
sil-instruction ::= 'alloc_value_buffer' sil-type 'in' sil-operand
%1 = alloc_value_buffer $(Int, T) in %0 : $*Builtin.UnsafeValueBuffer
// The operand must have the exact type shown.
// The result has type $*(Int, T).
Given the address of an unallocated value buffer, allocate space in it for a value of the given type. This instruction has undefined behavior if the value buffer is currently allocated.
The type operand must be a lowered object type.