Type inference prototype based on static analysis #92

0x7CFE · 2016-06-18T12:35:26Z

This PR is my first rather independent (or naïve?) attempt to explore how type inference may be used to aid the JIT VM. The underlying concept is not based on any well-known theory such as the Hindley-Milner type system or Martin-Löf's intuitionistic type theory. It is the result of pure meditation on Smalltalk bytecodes.

Surprisingly enough, Smalltalk, although fully dynamic in its nature, is still very regular in terms of its memory access and control flow. This really helps when we try to perform static analysis.

The current implementation concentrates on method temporaries and gives up completely when it encounters object fields. However, I believe that in a local context it is still possible to infer an object's fields, which would unlock further analysis and optimizations such as stack allocation, GC root elimination, and of course TBAA (type-based alias analysis).

The current implementation is powerful enough to infer self-assigning temporaries within a loop, even in complex closure contexts.

For example, the following method is inferred completely:

testInference |sum|
    sum <- 0.
    1 to: 100 do:
        [ :x | sum <- sum + x ].
    ^sum

Here the analyzer proves that the variable sum has type SmallInt throughout its scope, which spans the following methods: Undefined>>testInference, Number>>to:do:, Block>>value:, and finally the block Undefined>>testInference@8. Integer overflow is currently undefined.

This allows the IR generator to encode operations on sum directly as i32 without any worries about GC or dynamic dispatch.
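The loop case above can be pictured as a small fixed-point iteration. The following is only a sketch in Python, not the analyzer's code: the tuple encoding of types and the names `widen`, `add_type` and `infer_loop_variable` are assumptions made for illustration.

```python
# Types are modelled as plain tuples (an assumption of this sketch):
#   ("lit", 0)            a literal with a known value
#   ("class", "SmallInt") a value of a known class
#   ANY                   the unknown type
ANY = ("any",)

def class_of(t):
    """Class carried by a literal or class type (only SmallInt literals here)."""
    return "SmallInt" if t[0] == "lit" else t[1]

def widen(a, b):
    """Merge the types seen on two loop iterations, keeping the class if it agrees."""
    if a == b:
        return a
    if ANY in (a, b):
        return ANY
    return ("class", class_of(a)) if class_of(a) == class_of(b) else ANY

def add_type(a, b):
    """Result type of SmallInt '+': literals fold, SmallInt operands stay SmallInt."""
    if a[0] == "lit" and b[0] == "lit":
        return ("lit", a[1] + b[1])
    if ANY not in (a, b) and class_of(a) == class_of(b) == "SmallInt":
        return ("class", "SmallInt")
    return ANY

def infer_loop_variable(initial, step, max_rounds=10):
    """Repeat 'var <- step(var)' on types until the type stops changing."""
    current = initial
    for _ in range(max_rounds):
        merged = widen(current, step(current))
        if merged == current:
            return current
        current = merged
    return ANY  # give up if no fixed point is reached

# sum <- 0, then sum <- sum + x on every iteration, where x is some SmallInt.
x_type = ("class", "SmallInt")
sum_type = infer_loop_variable(("lit", 0), lambda s: add_type(s, x_type))
print(sum_type)  # ('class', 'SmallInt')
```

Starting from the literal 0, the first round widens sum to SmallInt and the second round confirms that the type no longer changes, which is the property needed before sum can be treated as a raw integer.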

The current inference scheme relies heavily on method monomorphisation and specialization, which makes it possible to perform calculations at compile time.

For example, this code is reduced to a single literal value (the result) at compile time:

fibonacci: n
    n < 3 ifTrue: [ ^1 ].
    ^ (self fibonacci: n - 2) + (self fibonacci: n - 1)
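One way to picture this is a cache of specializations keyed by selector and literal arguments. The sketch below is in Python and is only an illustration: the `Specializer` class and the way the method body is modelled as a Python function are assumptions, not the PR's API.

```python
class Specializer:
    """Keeps one folded result per (selector, literal arguments) pair."""

    def __init__(self, methods):
        self.methods = methods  # selector -> function modelling the method body
        self.cache = {}         # (selector, args) -> literal result

    def send(self, selector, *args):
        key = (selector, args)
        if key not in self.cache:
            # All arguments are literals, so the body can be evaluated right away.
            self.cache[key] = self.methods[selector](self.send, *args)
        return self.cache[key]

def fibonacci(send, n):
    """Model of the Smalltalk method 'fibonacci: n' shown above."""
    if n < 3:
        return 1
    return send("fibonacci:", n - 2) + send("fibonacci:", n - 1)

specializer = Specializer({"fibonacci:": fibonacci})
result = specializer.send("fibonacci:", 10)
print(result)  # 55
```

Because every recursive send receives a literal argument, each specialization collapses to a literal, and the outermost call is left with a single value.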

Type inference works even in the case of recursive contexts and correctly solves the chicken-and-egg problem. Consider the listing of Collection>>sort:

sort: criteria | left right mediane |
    (self size < 2) ifTrue: [^self].

    mediane <- self popFirst.

    left  <- List new.
    right <- List new.
    self do: [ :x |
        (criteria value: x value: mediane)
            ifTrue:  [ left  add: x ]
            ifFalse: [ right add: x ] ].

    left  <- left  sort: criteria.
    right <- right sort: criteria.

    right add: mediane.
    ^ left appendList: right

This is a suboptimal implementation of the Quicksort algorithm, used here only for testing purposes. It performs two recursive calls to Collection>>sort: when sorting the left and right sublists. The main difficulty here is to infer the return value of Collection>>sort:. For example, analysis of left refers to the outer context, which at that point is not yet completely inferred, so no return type is available.

However, the current implementation correctly solves the problem using the fact that every recursion has a base case, which must be evaluated before the next recursive call. By propagating the base return type from the outer context to the inner one, the whole inference succeeds. See the log for an example of such inference. You may also see the resulting call graph as rendered SVG or Graphviz source.
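The idea of seeding a recursive call with the base-case return type can be sketched as a fixed-point iteration. This is a simplified Python illustration, not the PR's algorithm: `infer_return_type`, the set-of-class-names encoding, and the assumption that `appendList:` returns its receiver are all hypothetical.

```python
def infer_return_type(return_sites, max_rounds=10):
    """Iterate to a fixed point over the possible return classes of one method.

    Each return site is a callable that receives the current guess for the
    method's own return type (a frozenset of class names, empty while nothing
    is known) and returns the classes that this site may produce.
    """
    guess = frozenset()
    for _ in range(max_rounds):
        new_guess = frozenset().union(*(site(guess) for site in return_sites))
        if new_guess == guess:
            break
        guess = new_guess
    return guess

# Hypothetical model of Collection>>sort: called on a List receiver.
def base_case(_guess):
    # '^self' when the collection has fewer than two elements.
    return {"List"}

def recursive_case(guess):
    # '^ left appendList: right', where left is the result of a recursive
    # sort:. Assumption: appendList: returns its receiver.
    return guess

sort_type = infer_return_type([base_case, recursive_case])
print(sorted(sort_type))  # ['List']
```

On the first round only the base case contributes a type; on the second round the recursive return site picks it up, and the guess stops changing. Without a base case the guess stays empty, which matches the observation that the base is what makes the inference possible.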

Static analysis, together with runtime statistics and polymorphic method caching, provides enough information for effective dispatch of Smalltalk code.

See also: #56, #58.

The implementation is not complete yet; there are a lot of things to do.

The previous version of the optimizer merged taus incorrectly. The method that produced the error was Interval>>do:

    Producer1 \
                { Aggregator <- Consumer } x 3
              /
    Producer2
              \
                { Aggregator <- Consumer } x 3
    Producer3 /

The correct solution should be

    Producer1 \            / Consumer1
                Aggregator - Consumer2
              /            \ Consumer3
    Producer2
              \            / Consumer4
                Aggregator - Consumer5
    Producer3 /            \ Consumer6

but due to incorrect handling of the pending node lists, the algorithm yielded the following result:

    Producer1 \
    Producer2 - Aggregator <- Consumer x 6
    Producer3 /

So there was a single aggregator node that was referred to by all six consumers.

It allows comparing any two types, storing them in an STL container such as std::set, using them as keys in std::map, and applying composition operators during the type inference procedure.

Operator | is a disjunction-like operator that yields the sum of several possible types within a single type. For example:

    2 | 2 -> 2
    2 | 3 -> (2, 3)
    2 | * -> (2, *)
    (Object) | (SmallInt) -> ((Object), (SmallInt))

This operator may be used to aggregate possible types within a linear sequence where several type outcomes are possible:

    x <- y isNil ifTrue: [ nil ] ifFalse: [ 42 ].

In this case x will have the composite type (nil, 42).

On the other hand, when dealing with loops we need some kind of reduction operator that acts as a conjunction:

    2 & 2 -> 2
    2 & 3 -> (SmallInt)
    2 & (SmallInt) -> (SmallInt)
    <any type> & * -> *
    (SmallInt) & (Object) -> *

This operator is used during the induction run of the type analyzer to prove that a variable does not leave its local type domain, i.e. its type is not reduced to *.

Issue: #17
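The | and & rules listed in this commit message can be mirrored in a few lines of Python. This is only a sketch of the listed cases: the class names `Literal`, `Mono`, `Poly` and `Composite` and the functions `type_or` and `type_and` are hypothetical, and the treatment of composite operands under & is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """A type whose exact value is known, e.g. the literal 2."""
    value: object
    klass: str = "SmallInt"  # assumption: literals default to SmallInt here

    def __str__(self):
        return str(self.value)

@dataclass(frozen=True)
class Mono:
    """A type whose class is known but whose value is not, e.g. (SmallInt)."""
    klass: str

    def __str__(self):
        return f"({self.klass})"

@dataclass(frozen=True)
class Poly:
    """The unknown type, written * in the text."""

    def __str__(self):
        return "*"

@dataclass(frozen=True)
class Composite:
    """One of several possible types, e.g. (2, 3)."""
    parts: tuple

    def __str__(self):
        return "(" + ", ".join(str(p) for p in self.parts) + ")"

def type_or(a, b):
    """Sketch of '|': keep both alternatives unless they are identical."""
    return a if a == b else Composite((a, b))

def type_and(a, b):
    """Sketch of '&': keep only what both operands agree on."""
    if a == b:
        return a
    if isinstance(a, (Literal, Mono)) and isinstance(b, (Literal, Mono)):
        if a.klass == b.klass:
            return Mono(a.klass)
    # Covers '*' operands, class mismatches and (by assumption) composites.
    return Poly()

print(type_or(Literal(2), Literal(3)))            # (2, 3)
print(type_and(Literal(2), Literal(3)))           # (SmallInt)
print(type_and(Mono("SmallInt"), Mono("Object"))) # *
```

Frozen dataclasses are hashable and comparable for equality, so these values can be used in a Python set or as dict keys, which loosely parallels the std::set and std::map use mentioned above.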

This code needs to be refactored properly. If both operands are literals, the result may be defined as a literal too. Otherwise the primitive should "fail", allowing control flow to pass further. For literal calculation it is best to reuse the existing code of the software VM. Issue: #17
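The rule described here can be sketched as follows. This is a hypothetical Python illustration: the function `fold_binary_primitive`, the tuple encoding of types, and the `fallback` callback standing in for the code that follows the primitive are all assumptions, and overflow is not modelled.

```python
import operator

def fold_binary_primitive(op, lhs, rhs, fallback):
    """Fold a binary primitive when both operand types are literals.

    Types are ("lit", value) or ("class", name). When folding is not
    possible the primitive "fails" and inference continues with the
    fallback code, modelled here as a callable.
    """
    if lhs[0] == "lit" and rhs[0] == "lit":
        return ("lit", op(lhs[1], rhs[1]))
    return fallback(lhs, rhs)

def after_primitive(lhs, rhs):
    # Placeholder for inferring the method body that follows the primitive.
    return ("class", "Object")

folded = fold_binary_primitive(operator.add, ("lit", 2), ("lit", 3), after_primitive)
failed = fold_binary_primitive(operator.add, ("lit", 2), ("class", "SmallInt"), after_primitive)
print(folded, failed)
```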

coveralls · 2016-08-13T17:14:58Z

Coverage decreased (-12.7%) to 48.582% when pulling 34c1048 on feature/17/type_inference into ff4d76d on develop.

0x7CFE added 30 commits June 18, 2016 18:29

Adds BranchNode for easy operation on branches

cd3ffe7

A branch optimization technique is also proposed; however, it currently leads to a malformed graph. Issue: #32

Fixes branch node visualization

0e68d06

Issue: #32

Fixes type.txt

113bed0

Adds stub for TauLinker

1688a7d

Refactors GraphWalker to use node colors

90b42b2

This allows us to classify graph edges during graph walk.
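Node colors are the usual way to classify edges during a depth-first walk: an edge that reaches a node which is still being visited is a back edge. The sketch below shows that standard technique in Python; `classify_edges` and the dict-based graph are illustrative assumptions and do not reflect the GraphWalker API.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def classify_edges(graph, start):
    """Depth-first walk that labels each reachable edge by its target's color."""
    color = {node: WHITE for node in graph}
    kinds = {}

    def visit(node):
        color[node] = GRAY                 # node is on the current path
        for succ in graph[node]:
            if color[succ] == WHITE:
                kinds[(node, succ)] = "tree"
                visit(succ)
            elif color[succ] == GRAY:
                kinds[(node, succ)] = "back"   # closes a loop
            else:
                kinds[(node, succ)] = "forward-or-cross"
        color[node] = BLACK                # node is fully processed

    visit(start)
    return kinds

# A minimal loop: entry -> header -> body -> header, header -> exit.
graph = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}
edges = classify_edges(graph, "entry")
print(edges[("body", "header")])  # back
```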

Adds sample back edge classifier

1dba10d

Minor fixes in GraphLinker

b77dd54

Removes Forward- and Backward- Walkers as violating the LSP

c91b043

Adds tau node linkage and optimization logic

8dcb48f

Adds logic to render tau nodes

d4a7be7

Adds accumulated path to GraphWalker, BackEdgeDetector refactoring

6c9105b

draft: refactors TauLinker to track back edges

0b67873

Tau nodes now store back edge flag in incoming list

2fcf67e

Fixes graph linker in case of assign node first in the domain

7e70c0d

AssignX instructions leave their argument on the stack. That was causing problems during processing of argument requests.

Cleans up control graph visualizer

61a2543

Minor fixes in graph api

ec1644c

Adds basic logic of type analyzer and inference API

5b3a637

Issue: #17

Adds Type::toString()

44a1370

Issue: #17

Adds CallContext::operator[index]

3ac8c0a

Issue: #17

Adds const cast for BranchNode

97ea0cb

Issue: #17

Fixes analyzer and context, adds handling of conditional branches

d6db564

Issue: #17

Hides trace messages in ControlGraph under condition

1cc3e65

Adds meta information to control graph

12d5460

Meta info is very useful during type analysis. It helps to make decisions based on graph structure. In the future, more flags will be added. Issue: #17

Adds core inference logic to TypeAnalyzer

6c8a069

Issue: #17

Fixes asserts

3adc4dc

Adds more inference logic

4a921d9

Adds inference for instantiation and get class primitives

d53e50d

Issue: #17

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

3a54498

Issue: #92

Frees mem in TypeSystem

63f134e

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 3a54498 to 9cfb35d Compare August 13, 2016 04:22

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

9cfb35d

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 9cfb35d to 6c130b3 Compare August 13, 2016 04:49

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

6c130b3

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 6c130b3 to c8e71b9 Compare August 13, 2016 08:49

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

c8e71b9

Issue: #92

kpp force-pushed the feature/17/type_inference branch from c8e71b9 to 1c03337 Compare August 13, 2016 09:31

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

1c03337

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 1c03337 to 8dd1ffa Compare August 13, 2016 10:11

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

8dd1ffa

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 8dd1ffa to 9c654c5 Compare August 13, 2016 10:36

kpp added a commit that referenced this pull request Aug 13, 2016

Fixes automkdir for graphviz

a1c3cb7

Issue: #92

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

9c654c5

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 9c654c5 to a187a4b Compare August 13, 2016 10:48

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

a187a4b

Issue: #92

kpp force-pushed the feature/17/type_inference branch from a187a4b to cc8290a Compare August 13, 2016 10:56

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

cc8290a

Issue: #92

kpp force-pushed the feature/17/type_inference branch from cc8290a to 5dd2d12 Compare August 13, 2016 11:39

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

5dd2d12

Issue: #92

kpp added 2 commits August 13, 2016 17:19

Fixes case 2 & (SmallInt) -> (SmallInt)

62019e7

Issue: #92

Fixes automkdir for graphviz

033a3e8

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 5dd2d12 to 06d0901 Compare August 13, 2016 14:21

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

06d0901

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 06d0901 to 5d23406 Compare August 13, 2016 16:09

kpp added a commit that referenced this pull request Aug 13, 2016

Adds basic tests and patterns for type inference

5d23406

Issue: #92

Adds basic tests and patterns for type inference

34c1048

Issue: #92

kpp force-pushed the feature/17/type_inference branch from 5d23406 to 34c1048 Compare August 13, 2016 17:07
