! soundness, type system
The type system <type-system>
of WebAssembly is sound, implying both type safety and memory safety with respect to the WebAssembly semantics. For example:
- All types declared and derived during validation are respected at run time; e.g., every
local <syntax-local>
orglobal <syntax-global>
variable will only contain type-correct values, everyinstruction <syntax-instr>
will only be applied to operands of the expected type, and everyfunction <syntax-func>
invocation <exec-invocation>
always evaluates to a result of the right type (if it does nottrap <trap>
or diverge). - No memory location will be read or written except those explicitly defined by the program, i.e., as a
local <syntax-local>
, aglobal <syntax-global>
, an element in atable <syntax-table>
, or a location within a linearmemory <syntax-mem>
. - There is no undefined behavior, i.e., the
execution rules <exec>
cover all possible cases that can occur in avalid <valid>
program, and the rules are mutually consistent.
Soundness also is instrumental in ensuring additional properties, most notably, encapsulation of function and module scopes: no locals <syntax-local>
can be accessed outside their own function and no module <syntax-module>
components can be accessed outside their own module unless they are explicitly exported <syntax-export>
or imported <syntax-import>
.
The typing rules defining WebAssembly validation <valid>
only cover the static components of a WebAssembly program. In order to state and prove soundness precisely, the typing rules must be extended to the dynamic components of the abstract runtime <syntax-runtime>
, that is, the store <syntax-store>
, configurations <syntax-config>
, and administrative instructions <syntax-instr-admin>
.1
value, value type, result, result type, trap
Results <syntax-result>
can be classified by result types <syntax-resulttype>
as follows.
- For each
value <syntax-val>
$\val_i$ in$\val^\ast$ :- The value
$\val_i$ isvalid <valid-val>
with somevalue type <syntax-valtype>
ti.
- The value
- Let t* be the concatenation of all ti.
- Then the result is valid with
result type <syntax-resulttype>
[t*].
- The result is valid with
result type <syntax-resulttype>
[t*], for any sequence t* ofvalue types <syntax-valtype>
.
The following typing rules specify when a runtime store <syntax-store>
S is valid. A valid store must consist of function <syntax-funcinst>
, table <syntax-tableinst>
, memory <syntax-meminst>
, global <syntax-globalinst>
, and module <syntax-moduleinst>
instances that are themselves valid, relative to S.
To that end, each kind of instance is classified by a respective function <syntax-functype>
, table <syntax-tabletype>
, memory <syntax-memtype>
, or global <syntax-globaltype>
type. Module instances are classified by module contexts, which are regular contexts <context>
repurposed as module types describing the index spaces <syntax-index>
defined by a module.
store, function instance, table instance, memory instance, global instance, function type, table type, memory type, global type
- Each
function instance <syntax-funcinst>
$\funcinst_i$ in$S.\SFUNCS$ must bevalid <valid-funcinst>
with somefunction type <syntax-functype>
$\functype_i$ . - Each
table instance <syntax-tableinst>
$\tableinst_i$ in$S.\STABLES$ must bevalid <valid-tableinst>
with sometable type <syntax-tabletype>
$\tabletype_i$ . - Each
memory instance <syntax-meminst>
$\meminst_i$ in$S.\SMEMS$ must bevalid <valid-meminst>
with somememory type <syntax-memtype>
$\memtype_i$ . - Each
global instance <syntax-globalinst>
$\globalinst_i$ in$S.\SGLOBALS$ must bevalid <valid-globalinst>
with someglobal type <syntax-globaltype>
$\globaltype_i$ . - Each
element instance <syntax-eleminst>
$\eleminst_i$ in$S.\SELEMS$ must bevalid <valid-eleminst>
. - Each
data instance <syntax-datainst>
$\datainst_i$ in$S.\SDATAS$ must bevalid <valid-datainst>
. - Then the store is valid.
function type, function instance
- The
function type <syntax-functype>
$\functype$ must bevalid <valid-functype>
. - The
module instance <syntax-moduleinst>
$\moduleinst$ must bevalid <valid-moduleinst>
with somecontext <context>
C. - Under
context <context>
C, thefunction <syntax-func>
$\func$ must bevalid <valid-func>
withfunction type <syntax-functype>
$\functype$ . - Then the function instance is valid with
function type <syntax-functype>
$\functype$ .
function type, function instance, host function
- The
function type <syntax-functype>
$\functype$ must bevalid <valid-functype>
. - Let [t1*] → [t2*] be the
function type <syntax-functype>
$\functype$ . - For every
valid <valid-store>
store <syntax-store>
S1extending <extend-store>
S and every sequence$\val^\ast$ ofvalues <syntax-val>
whosetypes <valid-val>
coincide with t1*:-
Executing <exec-invoke-host>
$\X{hf}$ in store S1 with arguments$\val^\ast$ has a non-empty set of possible outcomes. - For every element R of this set:
- Either R must be ⊥ (i.e., divergence).
- Or R consists of a
valid <valid-store>
store <syntax-store>
S2extending <extend-store>
S1 and aresult <syntax-result>
$\result$ whosetype <valid-result>
coincides with [t2*].
-
- Then the function instance is valid with
function type <syntax-functype>
$\functype$ .
Note
This rule states that, if appropriate pre-conditions about store and arguments are satisfied, then executing the host function must satisfy appropriate post-conditions about store and results. The post-conditions match the ones in the execution rule <exec-invoke-host>
for invoking host functions.
Any store under which the function is invoked is assumed to be an extension of the current store. That way, the function itself is able to make sufficient assumptions about future stores.
table type, table instance, limits, function address
- The
table type <syntax-tabletype>
$\limits~t$ must bevalid <valid-tabletype>
. - The length of
$\reff^\ast$ must equal$\limits.\LMIN$ . - For each
reference <syntax-ref>
$\reff_i$ in the table's elements$\reff^n$ :- The
reference <syntax-ref>
$\reff_i$ must bevalid <valid-ref>
withreference type <syntax-reftype>
t.
- The
- Then the table instance is valid with
table type <syntax-tabletype>
$\limits~t$ .
memory type, memory instance, limits, byte
- The
memory type <syntax-memtype>
$\{\LMIN~n, \LMAX~m^?\}$ must bevalid <valid-memtype>
. - The length of b* must equal
$\limits.\LMIN$ multiplied by thepage size <page-size>
$64\,\F{Ki}$ . - Then the memory instance is valid with
memory type <syntax-memtype>
$\limits$ .
global type, global instance, value, mutability
- The
global type <syntax-globaltype>
$\mut~t$ must bevalid <valid-globaltype>
. - The
value <syntax-val>
$\val$ must bevalid <valid-val>
withvalue type <syntax-valtype>
t. - Then the global instance is valid with
global type <syntax-globaltype>
$\mut~t$ .
element instance, reference
- For each
reference <syntax-ref>
$\reff_i$ in the elements$\reff^n$ :- The
reference <syntax-ref>
$\reff_i$ must bevalid <valid-ref>
withreference type <syntax-reftype>
t.
- The
- Then the table instance is valid.
data instance, byte
- The data instance is valid.
external type, export instance, name, external value
- The
external value <syntax-externval>
$\externval$ must bevalid <valid-externval>
with someexternal type <syntax-externtype>
$\externtype$ . - Then the export instance is valid.
module instance, context
- Each
function type <syntax-functype>
$\functype_i$ in$\moduleinst.\MITYPES$ must bevalid <valid-functype>
. - For each
function address <syntax-funcaddr>
$\funcaddr_i$ in$\moduleinst.\MIFUNCS$ , theexternal value <syntax-externval>
$\EVFUNC~\funcaddr_i$ must bevalid <valid-externval-func>
with someexternal type <syntax-externtype>
$\ETFUNC~\functype'_i$ . - For each
table address <syntax-tableaddr>
$\tableaddr_i$ in$\moduleinst.\MITABLES$ , theexternal value <syntax-externval>
$\EVTABLE~\tableaddr_i$ must bevalid <valid-externval-table>
with someexternal type <syntax-externtype>
$\ETTABLE~\tabletype_i$ . - For each
memory address <syntax-memaddr>
$\memaddr_i$ in$\moduleinst.\MIMEMS$ , theexternal value <syntax-externval>
$\EVMEM~\memaddr_i$ must bevalid <valid-externval-mem>
with someexternal type <syntax-externtype>
$\ETMEM~\memtype_i$ . - For each
global address <syntax-globaladdr>
$\globaladdr_i$ in$\moduleinst.\MIGLOBALS$ , theexternal value <syntax-externval>
$\EVGLOBAL~\globaladdr_i$ must bevalid <valid-externval-global>
with someexternal type <syntax-externtype>
$\ETGLOBAL~\globaltype_i$ . - For each
element address <syntax-elemaddr>
$\elemaddr_i$ in$\moduleinst.\MIELEMS$ , theelement instance <syntax-eleminst>
$S.\SELEMS[\elemaddr_i]$ must bevalid <valid-eleminst>
. - For each
data address <syntax-dataaddr>
$\dataaddr_i$ in$\moduleinst.\MIDATAS$ , thedata instance <syntax-datainst>
$S.\SDATAS[\dataaddr_i]$ must bevalid <valid-datainst>
. - Each
export instance <syntax-exportinst>
$\exportinst_i$ in$\moduleinst.\MIEXPORTS$ must bevalid <valid-exportinst>
. - For each
export instance <syntax-exportinst>
$\exportinst_i$ in$\moduleinst.\MIEXPORTS$ , thename <syntax-name>
$\exportinst_i.\EINAME$ must be different from any other name occurring in$\moduleinst.\MIEXPORTS$ . - Let
${\functype'}^\ast$ be the concatenation of all$\functype'_i$ in order. - Let
$\tabletype^\ast$ be the concatenation of all$\tabletype_i$ in order. - Let
$\memtype^\ast$ be the concatenation of all$\memtype_i$ in order. - Let
$\globaltype^\ast$ be the concatenation of all$\globaltype_i$ in order. - Then the module instance is valid with
context <context>
$\{\CTYPES~\functype^\ast, \CFUNCS~{\functype'}^\ast, \CTABLES~\tabletype^\ast, \CMEMS~\memtype^\ast, \CGLOBALS~\globaltype^\ast\}$ .
configuration, administrative instruction, store, frame
To relate the WebAssembly type system <valid>
to its execution semantics <exec>
, the typing rules for instructions <valid-instr>
must be extended to configurations <syntax-config>
S; T, which relates the store <syntax-store>
to execution threads <syntax-thread>
.
Configurations and threads are classified by their result type <syntax-resulttype>
. In addition to the store S, threads are typed under a return type
Finally, frames <syntax-frame>
are classified with frame contexts, which extend the module contexts <module-context>
of a frame's associated module instance <syntax-moduleinst>
with the locals <syntax-local>
that the frame contains.
result type, thread
- The
store <syntax-store>
S must bevalid <valid-store>
. - Under no allowed return type, the
thread <syntax-thread>
T must bevalid <valid-thread>
with someresult type <syntax-resulttype>
[t?]. - Then the configuration is valid with the
result type <syntax-resulttype>
[t?].
thread, frame, instruction, result type, context
- Let
$\resulttype^?$ be the current allowed return type. - The
frame <syntax-frame>
F must bevalid <valid-frame>
with acontext <context>
C. - Let C′ be the same
context <context>
as C, but with set to$\resulttype^?$ . - Under context C′, the instruction sequence
$\instr^\ast$ must bevalid <valid-instr-seq>
with some type [] → [t?]. - Then the thread is valid with the
result type <syntax-resulttype>
[t?].
frame, local, module instance, value, value type, context
- The
module instance <syntax-moduleinst>
$\moduleinst$ must bevalid <valid-moduleinst>
with somemodule context <module-context>
C. - Each
value <syntax-val>
$\val_i$ in$\val^\ast$ must bevalid <valid-val>
with somevalue type <syntax-valtype>
ti. - Let t* the concatenation of all ti in order.
- Let C′ be the same
context <context>
as C, but with thevalue types <syntax-valtype>
t* prepended to the vector. - Then the frame is valid with
frame context <frame-context>
C′.
administrative instruction, value type, context, store
Typing rules for administrative instructions <syntax-instr-admin>
are specified as follows. In addition to the context <context>
C, typing of these instructions is defined under a given store <syntax-store>
S. To that end, all previous typing judgements administrative instructions <valid-instr-admin>
given below.
trap
- The instruction is valid with type [t1*] → [t2*], for any sequences of
value types <syntax-valtype>
t1* and t2*.
extern address
- The instruction is valid with type
$[] \to [\EXTERNREF]$ .
function address, extern value, extern type, function type
- The
external function value <syntax-externval>
$\EVFUNC~\funcaddr$ must bevalid <valid-externval-func>
withexternal function type <syntax-externtype>
$\ETFUNC \functype$ . - Then the instruction is valid with type
$[] \to [\FUNCREF]$ .
function address, extern value, extern type, function type
- The
external function value <syntax-externval>
$\EVFUNC~\funcaddr$ must bevalid <valid-externval-func>
withexternal function type <syntax-externtype>
$\ETFUNC ([t_1^\ast] \to [t_2^\ast])$ . - Then the instruction is valid with type [t1*] → [t2*].
label, instruction, result type
- The instruction sequence
$\instr_0^\ast$ must bevalid <valid-instr-seq>
with some type [t1n] → [t2?]. - Let C′ be the same
context <context>
as C, but with theresult type <syntax-resulttype>
[t1n] prepended to the vector. - Under context C′, the instruction sequence
$\instr^\ast$ must bevalid <valid-instr-seq>
with type [] → [t2?]. - Then the compound instruction is valid with type [] → [t2?].
frame, instruction, result type
- Under the return type [tn], the
thread <syntax-frame>
$F; \instr^\ast$ must bevalid <valid-frame>
withresult type <syntax-resulttype>
[tn]. - Then the compound instruction is valid with type [] → [tn].
! store extension, store
Programs can mutate the store <syntax-store>
and its contained instances. Any such modification must respect certain invariants, such as not removing allocated instances or changing immutable definitions. While these invariants are inherent to the execution semantics of WebAssembly instructions <exec-instr>
and modules <exec-instantiation>
, host functions <syntax-hostfunc>
do not automatically adhere to them. Consequently, the required invariants must be stated as explicit constraints on the invocation <exec-invoke-host>
of host functions. Soundness only holds when the embedder <embedder>
ensures these constraints.
The necessary constraints are codified by the notion of store extension: a store state S′ extends state S, written
Note
Extension does not imply that the new store is valid, which is defined separately above <valid-store>
.
store, function instance, table instance, memory instance, global instance
- The length of
$S.\SFUNCS$ must not shrink. - The length of
$S.\STABLES$ must not shrink. - The length of
$S.\SMEMS$ must not shrink. - The length of
$S.\SGLOBALS$ must not shrink. - The length of
$S.\SELEMS$ must not shrink. - The length of
$S.\SDATAS$ must not shrink. - For each
function instance <syntax-funcinst>
$\funcinst_i$ in the original$S.\SFUNCS$ , the new function instance must be anextension <extend-funcinst>
of the old. - For each
table instance <syntax-tableinst>
$\tableinst_i$ in the original$S.\STABLES$ , the new table instance must be anextension <extend-tableinst>
of the old. - For each
memory instance <syntax-meminst>
$\meminst_i$ in the original$S.\SMEMS$ , the new memory instance must be anextension <extend-meminst>
of the old. - For each
global instance <syntax-globalinst>
$\globalinst_i$ in the original$S.\SGLOBALS$ , the new global instance must be anextension <extend-globalinst>
of the old. - For each
element instance <syntax-eleminst>
$\eleminst_i$ in the original$S.\SELEMS$ , the new global instance must be anextension <extend-eleminst>
of the old. - For each
data instance <syntax-datainst>
$\datainst_i$ in the original$S.\SDATAS$ , the new global instance must be anextension <extend-datainst>
of the old.
function instance
- A function instance must remain unchanged.
table instance
- The
table type <syntax-tabletype>
$\tableinst.\TITYPE$ must remain unchanged. - The length of
$\tableinst.\TIELEM$ must not shrink.
memory instance
- The
memory type <syntax-memtype>
$\meminst.\MITYPE$ must remain unchanged. - The length of
$\meminst.\MIDATA$ must not shrink.
global instance, value, mutability
- The
global type <syntax-globaltype>
$\globalinst.\GITYPE$ must remain unchanged. - Let
$\mut~t$ be the structure of$\globalinst.\GITYPE$ . - If
$\mut$ is , then thevalue <syntax-val>
$\globalinst.\GIVALUE$ must remain unchanged.
element instance
- The vector
$\eleminst.\EIELEM$ must either remain unchanged or shrink to length 0.
data instance
- The vector
$\datainst.\DIDATA$ must either remain unchanged or shrink to length 0.
! preservation, ! progress, soundness, configuration, thread, terminal configuration, instantiation, invocation, validity, module
Given the definition of valid configurations <valid-config>
, the standard soundness theorems hold.2
Theorem (Preservation). If a configuration <syntax-config>
S; T is valid <valid-config>
with result type <syntax-resulttype>
[t*] (i.e., extension <extend-store>
of S (i.e.,
A terminal thread <syntax-thread>
is one whose sequence of instructions <syntax-instr>
is a result <syntax-result>
. A terminal configuration is a configuration whose thread is terminal.
Theorem (Progress). If a configuration <syntax-config>
S; T is valid <valid-config>
(i.e., result type <syntax-resulttype>
[t*]), then either it is terminal, or it can step to some configuration S′; T′ (i.e.,
From Preservation and Progress the soundness of the WebAssembly type system follows directly.
Corollary (Soundness). If a configuration <syntax-config>
S; T is valid <valid-config>
(i.e., result type <syntax-resulttype>
[t*]), then it either diverges or takes a finite number of steps to reach a terminal configuration S′; T′ (i.e., extension <extend-store>
of S (i.e.,
In other words, every thread in a valid configuration either runs forever, traps, or terminates with a result that has the expected type. Consequently, given a valid store <valid-store>
, no computation defined by instantiation <exec-instantiation>
or invocation <exec-invocation>
of a valid module can "crash" or otherwise (mis)behave in ways not covered by the execution <exec>
semantics given in this specification.
The formalization and theorems are derived from the following article: Andreas Haas, Andreas Rossberg, Derek Schuff, Ben Titzer, Dan Gohman, Luke Wagner, Alon Zakai, JF Bastien, Michael Holman. _. Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM 2017.↩
A machine-verified version of the formalization and soundness proof is described in the following article: Conrad Watt. _. Proceedings of the 7th ACM SIGPLAN Conference on Certified Programs and Proofs (CPP 2018). ACM 2018.↩