SpiderMonkey's optimizing JIT, IonMonkey, uses a number of different optimization strategies to speed up various operations. The operations most relevant for fast program execution are property accesses and function calls.
Optimization information is currently collected for the following operations:
- BinaryArith (+ - * / %)
- GetProperty (obj.prop)
- SetProperty (obj.prop = val)
- GetElement (obj[elemName])
- SetElement (obj[elemName] = val)
- Call (func(...))
At each operation site, IonMonkey tries a battery of strategies, from the most optimized but most restrictive to the least optimized but least restrictive. For each strategy attempted, its outcome is tracked. An outcome is either success or a reason why the strategy failed to apply.
This page documents the various optimization strategies and their outcomes. It provides information on what they attempt to do, what general level of speed-up they provide, what kind of program characteristics can prevent them from being used, and common ways to enable the engine to utilize that optimization.
Attempts to optimize an addition into a string concatenation. The operand types are checked for a hint that the operation is a concatenation: both operands must be strings, or one must be a string and the other a type that can easily be converted to a string (such as a number).
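A couple of additions that carry such a concatenation hint (a minimal sketch, names hypothetical):

function greet(name) {
  return "Hello, " + name;   // string + string(-like) operand: a concatenation hint
}

function label(n) {
  return "item-" + n;        // string + number: the number is easily converted to a string
}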
Attempts to perform numeric arithmetic based on the operand types. One of the inputs needs to be a numeric type and the other a simple arithmetic type.
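For instance (a sketch, names hypothetical), arithmetic whose operands are known to be numbers can be specialized directly:

function scale(value, factor) {
  return value * factor;     // specializes when both operands are observed to be numeric
}

scale(3, 4);       // integer inputs
scale(2.5, 1.5);   // double inputs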
Just like BinaryArith_SpecializedTypes, this tries to perform numeric arithmetic, but based on the types observed by the Baseline compiler. If it succeeds, it provides roughly the same optimization as BinaryArith_SpecializedTypes, but the deduction is more fragile. Ideally, one should try to get this optimization based on the input types instead.
Attempts to optimize a binary arithmetic operation using an inline cache.
A last resort that cannot be specialized at all and takes the slow path to perform the arithmetic.
Attempts to optimize an arguments.length property access. This optimization only works if the arguments object is used in well-understood ways within the function. The function containing the arguments.length is allowed to use the arguments object in the following ways without disabling this optimization:
- Access arguments.length
- Access arguments.callee
- Access individual args using arguments[i]
- Save arguments into variables, as long as those variables cannot be accessed by any nested function, and as long as there exists no eval anywhere within the function or nested function definitions.
- Call a function using f.apply(obj, arguments)

If the function contains any use of the arguments object that falls outside the cases defined above, this optimization will be suppressed. In particular, arguments cannot be returned from the function, or passed as an argument into calls (except for the apply case above).
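As an illustrative sketch (hypothetical names), the first function below stays within the allowed uses, while the second escapes the arguments object and suppresses the optimization:

function allowedUses(f) {
  var args = arguments;               // saved in a local; no nested function reads it and there is no eval
  if (arguments.length > 1)
    return arguments[1];
  return f.apply(null, arguments);    // calling via f.apply(obj, arguments) is explicitly allowed
}

function escapesArguments() {
  return arguments;                   // returning arguments suppresses the optimization
}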
Attempts to optimize an arguments.callee property access. This optimization only works if the arguments object is used in well-understood ways within the function. The function containing the arguments.callee is allowed to use the arguments object in the following ways without disabling this optimization:
- Access arguments.length
- Access arguments.callee
- Access individual args using arguments[i]
- Save arguments into variables, as long as those variables cannot be accessed by any nested function, and as long as there exists no eval anywhere within the function or nested function definitions.
- Call a function using f.apply(obj, arguments)

If the function contains any use of the arguments object that falls outside the cases defined above, this optimization will be suppressed. In particular, arguments cannot be returned from the function, or passed as an argument into calls (except for the apply case above).
Attempts to optimize an access to a property that seems to be a constant. It applies to property accesses on objects which are global-like in that there is only one instance of them per program. This includes global objects, object literals defined at the top-level of a script, and top-level function objects.
This optimization assumes that a property which has not changed after it was first assigned is likely a constant property. It then directly inlines the value of the property into hot code that accesses it. For example, consider the following code:
var Constants = {};
Constants.N = 100;
function testArray(array) {
for (var i = 0; i < array.length; i++) {
if (array[i] > Constants.N)
return true;
}
return false;
}
The loop will be compiled into the following when testArray gets hot:
for (var i = 0; i < array.length; i++) {
if (array[i] > 100)
return true;
}
When this optimization is successful, property access is eliminated entirely and replaced with an inline constant.
Attempts to optimize reading a property that contains a uniquely-typed (or "singleton") object. With uniquely-typed objects, it is guaranteed that no other object has that same type. Unique (or "singleton") types are assigned to certain kinds of objects, like global objects, top-level functions, and object literals declared at the top-level of a script. If a property has always contained the same uniquely-typed object, then the engine can use the unique type to map back to a specific object, and eliminate the property access, replacing it with a constant reference to the object.
When this optimization is successful, the property access is eliminated entirely and replaced with an inline constant.
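A rough sketch of the kind of namespace pattern this targets (names hypothetical, assuming the engine assigns singleton types as described above):

var MyLib = {};                // top-level object literals and top-level functions get unique ("singleton") types
MyLib.utils = {
  log: function (msg) { console.log(msg); }
};

function hot(msg) {
  // MyLib.utils has only ever held the same singleton object, so the read
  // can be replaced with a constant reference to that object.
  MyLib.utils.log(msg);
}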
Attempts to optimize a property access on window which refers to a property on the global object.
Optimizes accesses to properties on TypedObjects.
Optimizes access to a well-known regular property on an object. For this optimization to succeed, the property needs to be well-defined on the object. For objects constructed by constructor functions, this means that the property needs to be defined in the constructor, before any complex logic occurs within the constructor.
This is the best case for a regular "field" type property that is not turned into a constant. It compiles down to a single CPU-level load instruction.
function SomeConstructor() {
this.x = 10; // x is a definite slot property
this.y = 10; // y is a definite slot property
someComplicatedFunctionCall();
this.z = 20; // z is not a definite slot property.
}
In the above example, the properties x and y can be determined to always exist on any instance of SomeConstructor at definite locations, allowing the engine to deterministically infer the position of x without a shape guard.
This optimization can fail for a number of reasons. If the types observed at the property access are polymorphic (more than one type), this optimization cannot succeed. Furthermore, even if the object type is monomorphic, the optimization will fail if the property being accessed is not a definite slot as described above.
Similar to GetProp_DefiniteSlot. Unboxed property reads are possible on properties which satisfy all the characteristics of a definite slot, and additionally have only ever been observed to store one kind of value.
Consider the following constructor:
function Point(x, y) {
this.x = x;
this.y = y;
}
If only integers are ever stored in the x and y properties, then the instances of Point will be represented in an "unboxed" mode, with the property values stored as raw 4-byte values within the object. Objects which have the unboxed optimization are more compact.
Optimizes access to properties which are implemented by a getter function, where the getter is shared between multiple types.
This optimization applies most often when the property access site is polymorphic, but all the object types are derived variants of a single base class, where the property access refers to a getter on the base class.
Consider the following example:

function Base() {}
Base.prototype = {
  get x() { return 3; }
};

function Derived1() {}
Derived1.prototype = Object.create(Base.prototype);

function Derived2() {}
Derived2.prototype = Object.create(Base.prototype);
If a property access for d.x sees only instances of both Derived1 and Derived2 for d, it can optimize the access to x to a call to the getter function defined on Base.
This optimization applies to shared getters on both pure JS objects and DOM objects.
Optimizes a polymorphic property access where there are only a few different types of objects seen, and the property on all of the different types is determinable through a shape-check.
If a property access is monomorphic and the property's location is determinable from the object's shape, but the property is not definite (see: GetProp_DefiniteSlot), then this optimization may be used.
Alternatively, if the property access is polymorphic, but only has a few different shapes observed at the access site, this optimization may be used.
This optimization compiles down to one or more shape-guarded direct loads from the object. The following pseudocode describes the kind of machine code generated by this optimization:
if obj.shape == Shape1 then
obj.slots[0]
elif obj.shape == Shape2 then
obj.slots[5]
elif obj.shape == Shape3 then
obj.slots[2]
...
end
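A rough JavaScript sketch of an access site that could take this path (names hypothetical): the property sits in a different slot depending on which shape flows in, so each shape gets its own guarded load.

function getName(o) {
  return o.name;                            // this access site sees a few different shapes
}

getName({ name: "a" });                     // shape 1: name stored in the first slot
getName({ id: 1, name: "b" });              // shape 2: name stored in the second slot
getName({ id: 2, tag: "x", name: "c" });    // shape 3: name stored in the third slot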
Attempts to optimize a situation where a property access of the form window.PROP can be directly translated into a property access on the inner global object.
This optimization will always fail on property accesses which are not on the window object.
It is useful because accessing global names via the 'window' object is a common idiom in web programming.
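For example (a minimal sketch; myWidget is a hypothetical name, and window exists only in browser contexts):

var myWidget = { width: 100 };   // a global variable, i.e. a property of the global object

function widgetWidth() {
  return window.myWidget.width;  // the access to window.myWidget can be redirected to the global itself
}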
This is the worst-case scenario for a property access optimization. This strategy is used when all the others fail. The engine simply inserts an inline cache at the property access site.
Inline caches start off as a jump to a separate piece of code called a "fallback". The fallback calls into the interpreter VM (which is very slow) to perform the operation, and then decides if the operation can be optimized in that particular case. If so, it generates a new "stub" (or freestanding piece of jitcode) and changes the inline cache to jump to the stub. The stub attempts to optimize further occurrences of that same kind of operation.
Inline caches are an order of magnitude slower than the other optimization strategies, and are an indication that the type inference engine has failed to collect enough information to guide the optimization process.
Optimizes access to properties which are implemented by a setter function, where the setter is shared between multiple types.
This optimization applies most often when the property access site is polymorphic, but all the object types are derived variants of a single base class, where the property access refers to a setter on the base class.
Consider the following example:

function Base() {}
Base.prototype = {
  set x(val) { ... }
};

function Derived1() {}
Derived1.prototype = Object.create(Base.prototype);

function Derived2() {}
Derived2.prototype = Object.create(Base.prototype);
If a property write for d.x = val sees only instances of both Derived1 and Derived2 for d, it can optimize the write to x to a call to the setter function defined on Base.
This optimization applies to shared setters on both pure JS objects and DOM objects.
Optimizes accesses to properties on TypedObjects.
Optimizes a write to a well-known regular property on an object. For this optimization to succeed, the property needs to be well-defined on the object. For objects constructed by constructor functions, this means that the property needs to be defined in the constructor, before any complex logic occurs within the constructor.
This is the best case for a regular "field" type property that is not turned into a constant. It compiles down to a single CPU-level store instruction.
function SomeConstructor() {
this.x = 10; // x is a definite slot property
this.y = 10; // y is a definite slot property
someComplicatedFunctionCall();
this.z = 20; // z is not a definite slot property.
}
In the above example, the properties x and y can be determined to always exist on any instance of SomeConstructor at definite locations, allowing the engine to deterministically infer the position of x without a shape guard.
This optimization can fail for a number of reasons. If the types observed at the property access are polymorphic (more than one type), this optimization cannot succeed. Furthermore, even if the object type is monomorphic, the optimization will fail if the property being written is not a definite slot as described above.
Similar to SetProp_DefiniteSlot. Unboxed property writes are possible on properties which satisfy all the characteristics of a definite slot, and additionally have only ever been observed to store one kind of value.
Consider the following constructor:
function Point(x, y) {
this.x = x;
this.y = y;
}
If only integers are ever stored in the x and y properties, then the instances of Point will be represented in an "unboxed" mode, with the property values stored as raw 4-byte values within the object. Objects which have the unboxed optimization are more compact.
Optimizes a polymorphic property write where there are only a few different types of objects seen, and the property on all of the different types is determinable through a shape-check.
If a property write is monomorphic and the property's location is determinable from the object's shape, but the property is not definite (see: SetProp_DefiniteSlot), then this optimization may be used.
Alternatively, if the property write is polymorphic, but only has a few different shapes observed at the access site, this optimization may be used.
This optimization compiles down to one or more shape-guarded direct stores to the object. The following pseudocode describes the kind of machine code generated by this optimization:
if obj.shape == Shape1 then
obj.slots[0] = val
elif obj.shape == Shape2 then
obj.slots[5] = val
elif obj.shape == Shape3 then
obj.slots[2] = val
...
end
This is the worst-case scenario for a property access optimization. This strategy is used when all the others fail. The engine simply inserts an inline cache at the property write site.
Inline caches start off as a jump to a separate piece of code called a "fallback". The fallback calls into the interpreter VM (which is very slow) to perform the operation, and then decides if the operation can be optimized in that particular case. If so, it generates a new "stub" (or freestanding piece of jitcode) and changes the inline cache to jump to the stub. The stub attempts to optimize further occurrences of that same kind of operation.
Inline caches are an order of magnitude slower than the other optimization strategies, and are an indication that the type inference engine has failed to collect enough information to guide the optimization process.
Attempts to optimize element accesses on array Typed Objects.
Attempts to optimize element accesses on densely packed array objects. Dense arrays are arrays which do not have any 'holes'. This means that the array has valid values for all indexes from 0 to length-1.
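A minimal sketch of the distinction (names hypothetical):

var dense = [1, 2, 3, 4];    // values at every index from 0 to length-1

var holey = [1, 2, 3, 4];
delete holey[2];             // creates a hole at index 2, so element reads cannot use the dense fast path

function first(arr) {
  return arr[0];             // optimizes well when arr is always densely packed
}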
Attempts to optimize element accesses on a typed array that can be determined to always refer to the same array object. If this optimization succeeds, the 'array' object is treated as a constant, and is not looked up or retrieved from a variable.
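For example (a sketch; lut is a hypothetical name):

var lut = new Uint8Array(256);   // a typed array that hot code always refers to via this one binding

function lookup(i) {
  // Because the access always refers to the same typed array object,
  // lut can be treated as a constant instead of being re-read from the variable.
  return lut[i];
}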
Attempts to optimize element accesses on a typed array.
Attempts to optimize element accesses on a string.
Attempts to optimize element accesses on the arguments special object available in functions. This optimization only works if the arguments object is used in well-understood ways within the function. The function is allowed to use the arguments object in the following ways without disabling this optimization:
- Access arguments.length
- Access arguments.callee
- Access individual args using arguments[i]
- Save arguments into variables, as long as those variables cannot be accessed by any nested function, and as long as there exists no eval anywhere within the function or nested function definitions.
- Call a function using f.apply(obj, arguments)

If the function contains any use of the arguments object that falls outside the cases defined above, this optimization will be suppressed. In particular, arguments cannot be returned from the function, or passed as an argument into calls (except for the apply case above).
Similar to GetElem_Arguments, but optimizes cases where the access on arguments happens within an inlined function. In these cases, an access of the form arguments[i] can be translated into a direct reference to the corresponding argument value in the inlined call.
Consider the following:

function foo(arg) {
  return bar(arg, 3);
}

function bar() {
  return arguments[0] + arguments[1];
}
In the above case, if foo is compiled with Ion, and the call to bar is inlined, then this optimization can transform the entire procedure into the following pseudo-code:
compiled foo(arg):
// inlined call to bar(arg, 3) {
return arg + 3;
// }
This is the worst-case scenario for an element access optimization. This strategy is used when all the others fail. The engine simply inserts an inline cache at the element access site.
Inline caches start off as a jump to a separate piece of code called a "fallback". The fallback calls into the interpreter VM (which is very slow) to perform the operation, and then decides if the operation can be optimized in that particular case. If so, it generates a new "stub" (or freestanding piece of jitcode) and changes the inline cache to jump to the stub. The stub attempts to optimize further occurrences of that same kind of operation.
Inline caches are an order of magnitude slower than the other optimization strategies, and are an indication that the type inference engine has failed to collect enough information to guide the optimization process.
Attempts to optimize element writes on array Typed Objects.
Attempts to optimize element writes on a typed array that can be determined to always refer to the same array object. If this optimization succeeds, the 'array' object is treated as a constant, and is not looked up or retrieved from a variable.
Attempts to optimize element writes on a typed array.
Attempts to optimize element writes on densely packed array objects. Dense arrays are arrays which do not have any 'holes'. This means that the array has valid values for all indexes from 0 to length-1.
Attempts to optimize element writes to the arguments special object available in functions. This optimization only works if the arguments object is used in well-understood ways within the function. The function is allowed to use the arguments object in the following ways without disabling this optimization:
- Access arguments.length
- Access arguments.callee
- Access individual args using arguments[i]
- Save arguments into variables, as long as those variables cannot be accessed by any nested function, and as long as there exists no eval anywhere within the function or nested function definitions.
- Call a function using f.apply(obj, arguments)

If the function contains any use of the arguments object that falls outside the cases defined above, this optimization will be suppressed. In particular, arguments cannot be returned from the function, or passed as an argument into calls (except for the apply case above).
This is the worst-case scenario for an element write optimization. This strategy is used when all the others fail. The engine simply inserts an inline cache at the element write site.
Inline caches start off as a jump to a separate piece of code called a "fallback". The fallback calls into the interpreter VM (which is very slow) to perform the operation, and then decides if the operation can be optimized in that particular case. If so, it generates a new "stub" (or freestanding piece of jitcode) and changes the inline cache to jump to the stub. The stub attempts to optimize further occurrences of that same kind of operation.
Inline caches are an order of magnitude slower than the other optimization strategies, and are an indication that the type inference engine has failed to collect enough information to guide the optimization process.
A function call f(x) usually pushes a frame onto the call stack. Inlining a call site conceptually copies the body of the callee function, pastes it in place of the call site, and avoids pushing a new execution frame. Hot functions usually benefit from being inlined. This is one of the most important optimizations the JIT performs.
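As a conceptual sketch (hypothetical names), inlining rewrites a hot call site roughly like this:

function square(x) {
  return x * x;
}

function sumOfSquares(a, b) {
  return square(a) + square(b);    // two hot call sites that are candidates for inlining
}

// Conceptually, after inlining, the compiled code behaves as if it were written as:
function sumOfSquaresInlined(a, b) {
  return (a * a) + (b * b);        // no call frames are pushed for the calls to square
}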
Ion inlines both interpreted functions (i.e., functions written in JavaScript) and native functions (i.e., built-ins such as Math.sin that are implemented in C++).
A successfully inlined call site has the outcome Inlined.
Failure to inline comes in two flavors: unable (e.g., unable to determine exact callee) and unwilling (e.g., heuristics concluded that the time-space tradeoff will not pay off).