Skip to content

enhancements typeinference

scoder edited this page Aug 18, 2014 · 12 revisions

Enhancement proposal: Basic Type Inference

Author: Robert Bradshaw, Vitja Makarov

Status: Mostly implemented with some room for improvements.

Justification

Cython makes most of its speed gains, and derives it's ability to efficiently wrap external libraries, by explicitly declaring types. Many of these type declarations are redundant, and forgetting to declare a type can lead to drastic inefficiencies (e.g. converting back and forth between python objects and native C types). Reducing the volume of explicit type declarations will also bring Cython code closer to pure Python code.

Proposal

I propose that as a first pass undeclared variables be typed according to the union of the types of their assignments. Many objects (e.g. literal integers, lists, etc.) would already have types associated to them which would be propagated throughout the code. The most generic type would be a Python object. Thus for

def foo(self, int x):
    a = x
    a *= 1.5
    b = x*x
    c = x*x
    c = "hi"

The type of a would be a c double, the type of b a c int, and the type of c a Python object.

This would probably require a new phase (specifically, splitting analyze_types into two distinct phases, with a type resolution step between them). It would also probably require more unification of the Python and non-python types, which should be looked at in the context of the other proposals.

A more complete and powerful type inference engine should be developed, but would necessarily be more complicated as well.

Implementation

The current implementation is enabled for safe types by default and can be extended to unsafe (potentially overflowing etc.) types with the infer_types=True directive. Circular inference is only supported for a single variable.

Comments

  • locals() will not be an issue when doing type inference because it is read-only.

DagSverreSeljebotn:

tree transforms can be used for this (and that also advocates a split of analyze_types). An implementation strategy would then be first applying a more generic "find usages" transform that records all usages of a symbol and links them together, and then have a seperate type inference transform afterwards (the find usages transform has a lot more uses).

Clone this wiki locally