Skip to content
This repository has been archived by the owner on Nov 3, 2021. It is now read-only.

Flow Sensitivity #42

Open
RossTate opened this issue Mar 19, 2020 · 7 comments
Open

Flow Sensitivity #42

RossTate opened this issue Mar 19, 2020 · 7 comments

Comments

@RossTate
Copy link

Here is some food for thought that might be relevant to discussions such as in #41. Low-level type systems generally need to at some point incorporate multi-value flow-sensitivity. That means that a branch taken on one value needs to be able to affect the operations you can do on other values on the context.

Here's an example for supporting Java/C# (that normally wouldn't be written this way but which would be produced by inlining):

int sum(Object arr) {
    Integer[] ints = (Integer[]) arr;
    int sum = 0;
    for (int i = 0; i < ints.length; i++) {
        Integer intref = ints[i];
        sum += intref.int_value;
    }
    return sum;
}

At the low level, this turns into:

int sum(Object arr) {
    ints = cast arr to Array;
    elem_type = ints.elem_rtt;
    assert elem_type == Integer.rtt;
    sum = 0;
    for (i = 0; i < ints.length; i++) {
        intref = ints[i];
        sum += intref.int_value;
    }
    return sum;
}

Notice that, in order for this to type check, after branching on elem_type == Integer.rtt we need to realize that this means the references contained in ints necessarily have an int_value field even though elem_type == Integer.rtt has no direct reference to ints.

Without multi-value flow-sensitivity, you can only get the following to type check:

int sum(Object arr) {
    ints = cast arr to Array;
    elem_type = ints.elem_rtt;
    assert elem_type == Integer.rtt;
    sum = 0;
    for (i = 0; i < ints.length; i++) {
        intref = cast ints[i] to Integer;
        sum += intref.int_value;
    }
    return sum;
}

That is, we have to cast every reference retrieved from ints to Integer even after asserting that its element type is the correct type. To make matters worse, if you don't have multi-value flow-sensitivity, there's no way to enforce the expectation that a surface-level Integer[] only contains Integer references, so you'll have to cast the result of int[i] even when ints has surface-level type Integer[].

(Note: all of this is independent of covariant arrays, so feel free to ignore that additional complexity when pondering this food for thought.)

So when considering design options, you might want to think of which options better support multi-value flow-sensitivity.

@fgmccabe
Copy link

How does nominal types solve this issue?

@tlively
Copy link
Member

tlively commented Mar 19, 2020

I'm not sure how this example relates to the multivalue proposal (or WebAssembly more generally), and it seems to assume quite a bit of typing machinery that WebAssembly does not have today. Could you perhaps sketch out what features would have to be included in WebAssembly for this example to become possible and also reframe the example to use WebAssembly rather than Java?

@RossTate
Copy link
Author

@tlively

(or WebAssembly more generally)

I just thought y'all might appreciate some foreshadowing on what kind of reasoning it will take for WebAssembly to ensure that the values from a String[] are actually Strings, like what the JVM and CLR can do. I'm not saying that we must add this reasoning—we might decide the complexity outweighs the performance gains—just letting you know what it is.

I'm not sure how this example relates to the multivalue proposal

Discussions like #41 talk about the tradeoffs between using local variables and using the stack, with the multivalue proposal making the stack much more usable. Other low-level type systems, like typed assembly languages, use reasoning more like the stack than like local variables. The reason is that the type of the stack is a snapshot in time of multiple values, whereas the type of a local variable stays fixed and only talks about one value.

To get a sense of what that means, consider the line assert elem_type == Integer.rtt;. Right before this line, the snapshot-in-time-type is "[ints : Array, elem_type : RTTI] where ints' element type equals the type represented by elem_type". Right after that line, we learn that elem_type is equal to the RTTI for Integer, so we get the snapshot-in-time-type "[ints : Array, elem_type : RTTI] where ints' element type equals the type represented by elem_type and the type represented by elem_type equals Integer". By combining these two equalities relating the two distinct variables ints and elem_type, we realize that ints is an Array whose element type equals Integer in this snapshot-in-time-type.

it seems to assume quite a bit of typing machinery that WebAssembly does not have today

Certainly. This is about looking ahead, specifically about where WebAssembly would need to go to be able to type-check the compilations of major type-safe languages without inserting many dynamic casts (if that's where it wants to go).

@fgmccabe

How does nominal types solve this issue?

Theoretically speaking, this is orthogonal to nominal vs. structural types. Instead, it's more related to existential types. Practically speaking, I don't believe there exists even a research compiler that has been able to put the relevant theory for structural types to practice on a real language. On the other hand, for nominal types there already exists an algorithm that can automatically type-check the above example without any additional type annotations within the body of sum—an algorithm that has been able to scale to automatically type-check the output of existing C# (1.0) to x86 compilers.

Regardless, I didn't bring this up as part of that debate. I just wanted to give y'all a heads up rather than have y'all later wish the issue had been mentioned before. I also wanted y'all to have a glimpse of what we meant by "fancy types", and why we opted to present y'all with an MVP that was forwards-compatible with fancy types but didn't itself require fancy types—a question that was raised during one of the meetings.

@tlively
Copy link
Member

tlively commented Mar 20, 2020

Thanks, this makes more sense to me now :)

@RossTate
Copy link
Author

Cool. Thanks for helping me realize how unclear I was being.

@fgmccabe
Copy link

fgmccabe commented Mar 20, 2020 via email

@RossTate
Copy link
Author

Yeah, there are a number of ways to express this formally. The most common is constrained existential types, but there are others that might be a better fit for WebAssembly (if we go down this path). Other VMs have tackled this by baking in language primitives, so by not taking that path y'all have embarked on a fairly unworn path.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants