Skip to content

Speculative specialization is the key optimization #26

@markshannon

Description

@markshannon

Not sure where to put this, so I'm putting it an issue for now.

Speculative specialization

Speculative specialization is absolutely the key optimization for any dynamic language. The evidence for this is overwhelming. Most optimizations for dynamic languages will do one of these:

  • Specialize
  • Exploit specialization
  • Enable specialization

Examples:

  • Type profiling: enable specializations
  • Compilation to machine code: Exploits specialization (we can't usefully compile until we have something like statically type code)
  • Inline caches: enable specialization and specialize (that's why inline caches are good, they both enable and perform specialization)

Definition

Speculative specialization is where the VM runs fast for some code when some predicate holds, expecting that that predicate will nearly always be true. It is also necessary that the VM runs correctly when that predicate does not hold, although it may be very slow in that case.

Analysis

Ttotal = Tfast * p + Tslow *(1-p) where p is the probability of the predicate holding.

An optimization is useful if Tfast * p + Tslow *(1-p) < Tgeneral where Tgeneral is the time for the unoptimized operation.

Provided p ≃ 1 then it doesn't matter how large Tslow is, because if p ≃ 1 then Ttotal ≃ Tfast
For smaller values of p, some care needs to be taken that Tslow isn't too much slower than Tgeneral, the unoptimized case.
If Tslow ≃ Tgeneral then it doesn't matter too much what p is, although if p is small, the performance gains will be small.

Practicalities

Although specialization is often thought to be by type, or something similar like a type, fundamentally specialization can be on any predicate. However we want things to be fast, so the predicates must be fast and thus simple. Typically they should be a few (ideally zero) simple operations on an object, followed by a test-and-branch.

if (predicate(obj) != EXPECTED_VALUE) {
     goto slow_path;
}

Often we will have to use some a proxy for the predicate we want, as the exact predicate is too slow.
If predicate p1 is expensive to compute, we can use a proxy predicate p2 provided p2 is fast to compute and p2(obj) ⇒ p1(obj). Note that we do not require that p1(obj) ⇒ p2(obj).

For example, suppose we want to specialize an operation depending on whether some attribute of the class of an object is as expected, that is type(obj).attr is EXPECTED.
However attribute lookup on a type is expensive, so we use the version of the type as a proxy.

The expensive test:

PyType_Lookup(obj->ob_type, "attr") == EXPECTED_OBJECT

can be replaced with the much cheaper test:

obj->ob_type->tp_version == EXPECTED_VERSION

We have to accept that the faster predicate may produce false negatives.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions