Canonical abi: union-type #325

Open
oovm opened this issue Mar 20, 2024 · 4 comments

Comments

@oovm

oovm commented Mar 20, 2024

Motivation

Runtime Reinterpretation

Many languages have the ability to reinterpret a piece of data at runtime, and this ability can be constrained by the type system.

For example, in C:

#include <stdio.h>

union MyUnion {
    int i;
    float f;
};

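int main(void) {
    union MyUnion u;

    /* Store an int; the union's storage now holds an int. */
    u.i = 42;
    printf("Value of i: %d\n", u.i);

    /* Store a float in the same storage, overwriting i. */
    u.f = 3.14f;
    printf("Value of f: %f\n", u.f);

    return 0;
}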

Or in TypeScript:

type MyUnion = { i: number } | { f: number };

let u: MyUnion;
u = { i: 42 };
console.log("Value of i:", u.i);

u = { f: 3.14 };
console.log("Value of f:", u.f);

Interface merging (interface subtype)

In some type systems, identical types or subtypes can be merged:

type Option<T> = T | null;

Option<Option<T>>  ==>  Option<T>

This is completely different from variants (sum types) in algebraic data types.

ABI change

Add a new union option, which is disabled by default.

When enabled, this option requires variants to pass their payload directly, without an additional discriminant (enumeration) parameter.

Changes to reference-type

When reference-type (rt) and union-type (ut) are enabled at the same time, the following changes will occur:

| WIT type | Wasm without rt + ut | Wasm with rt + ut |
|---|---|---|
| option<bool> | (i32, i32) | (ref null i31) |
| option<option<bool>> | (i32, i32) | (ref null i31) |
| option<char> | (i32, i32) | (ref null i32) |
| option<i8> | (i32, i32) | (ref null i31) |
| option<i32> | (i32, i32) | (ref null i32) |
| option<i64> | (i32, i64) | (ref null i64) |
| option<T> (heap type) | (i32, SIZE_OF_T) | (ref null $t) |
| option<option<T>> (heap type) | (i32, SIZE_OF_T) | (ref null $t) |
| result<A, B> | (i32, MAX_SIZE_A_B) | anyref |
| variants | (i32, MAX_SIZE) | anyref |
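
To make the "without rt + ut" column concrete, here is a minimal sketch of how an option<i32> becomes two flat i32 values, a discriminant followed by a payload. The helper names `lowerOptionI32` and `liftOptionI32` are hypothetical and exist only for illustration; the sketch assumes discriminant 0 for none and 1 for some, and it is not an implementation of the proposal.

```typescript
// Tagged model of option<T>; the type and helper names are illustrative only.
type Option<T> = { tag: "none" } | { tag: "some"; value: T };

// Lower option<i32> to two flat i32 values: (discriminant, payload).
// Assumption: none is case 0 and some is case 1.
function lowerOptionI32(v: Option<number>): [number, number] {
  return v.tag === "some" ? [1, v.value | 0] : [0, 0];
}

// Lift the two flat values back into an option<i32>.
function liftOptionI32(disc: number, payload: number): Option<number> {
  return disc === 0 ? { tag: "none" } : { tag: "some", value: payload | 0 };
}

console.log(lowerOptionI32({ tag: "some", value: 42 }));
console.log(lowerOptionI32({ tag: "none" }));
```

The union option described above would drop the first element of this pair and rely on the reference itself (null or non-null) to carry the same information.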

Each variant item will have an independent type id, which is used for type conversion and distinguishing variant items with the same name.

variant a { // struct a
   aa(i32), // struct a-aa (field i32)
   ab(i32), // struct a-ab (field i32)
}
variant b { // struct b
   aa(i32), // struct b-aa (field i32)
   ab(i32), // struct b-ab (field i32)
}

This helps to implement features such as abstract classes, interface inheritance, ?. (optional chaining), ?? (null coalescing), etc.

This was referenced Mar 20, 2024
@lukewagner
Member

lukewagner commented Mar 21, 2024

I haven't digested the whole idea, but two initial thoughts:

  1. I don't think in general we can conflate option<option<T>> with option<T> since none may mean something different than some(none). (Whether it's good interface design to have an option<option<T>> and depend on this difference is a different story, but it's hard for me to feel comfortable declaring that there is never a good reason to do this.)
  2. Until wasm-gc has explicit rtts that are generative (i.e., not canonicalized), then if we have two cases in a variant with the same structural contents, then a receiver of the variant value will not be able to use casts to tell the two cases apart, which would lose potentially-necessary semantic information. Thus, I think we'll need an explicit discriminant until then (which will take a while).
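
The second point can be sketched by analogy in TypeScript, whose structural typing stands in here for canonicalized wasm-gc types. This is an illustration only, not wasm: the names `AA`, `AB`, `receiveUntagged`, and `receiveTagged` are hypothetical, and the two cases mirror `aa(i32)` and `ab(i32)` from the proposal.

```typescript
// Two cases with identical structure, mirroring aa(i32) and ab(i32).
type AA = { value: number };
type AB = { value: number };

// Without a discriminant, the receiver only sees the shape of the value.
function receiveUntagged(v: AA | AB): string {
  // Nothing at runtime separates AA from AB: both are { value: number }.
  return `aa-or-ab(${v.value})`;
}

// With an explicit discriminant, the case survives the boundary.
type Tagged = { case: 0; value: number } | { case: 1; value: number };

function receiveTagged(v: Tagged): string {
  return v.case === 0 ? `aa(${v.value})` : `ab(${v.value})`;
}

const fromAa: AA = { value: 7 };
const fromAb: AB = { value: 7 };
console.log(receiveUntagged(fromAa) === receiveUntagged(fromAb)); // true
console.log(receiveTagged({ case: 0, value: 7 })); // aa(7)
console.log(receiveTagged({ case: 1, value: 7 })); // ab(7)
```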

@rossberg
Member

For context, options of options usually come up when composing things. For example, you have some domain of values that is represented by an option (say, because it contains an "empty" element), and then you need to put those in some kind of map where the lookup function returns an optional result to indicate lookup failure. Conflating the two then becomes a fatal composability failure.
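
A minimal TypeScript sketch of that map scenario, assuming an untagged `T | null` encoding of option. The `lookup` and `lookupTagged` helpers and the `settings` map are hypothetical names used only for illustration.

```typescript
// Option modelled as an untagged union, so Option<Option<T>> collapses to Option<T>.
type Option<T> = T | null;

// Hypothetical lookup helper: returns null when the key is absent.
function lookup<V>(map: Map<string, V>, key: string): Option<V> {
  return map.has(key) ? (map.get(key) as V) : null;
}

// The value domain already contains an "empty" element (null).
const settings = new Map<string, Option<number>>([["timeout", null]]);

const present = lookup(settings, "timeout"); // key exists, value is empty
const missing = lookup(settings, "retries"); // key does not exist
console.log(present === missing); // true: the two situations are conflated

// Tagged alternative: none and some(none) stay distinct.
type Tagged<T> = { tag: "none" } | { tag: "some"; value: T };

function lookupTagged<V>(map: Map<string, V>, key: string): Tagged<V> {
  return map.has(key)
    ? { tag: "some", value: map.get(key) as V }
    : { tag: "none" };
}

console.log(lookupTagged(settings, "timeout").tag); // some
console.log(lookupTagged(settings, "retries").tag); // none
```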

@fitzgen
Collaborator

fitzgen commented Mar 21, 2024

> Until wasm-gc has explicit rtts that are generative (i.e., not canonicalized), then if we have two cases in a variant with the same structural contents, then a receiver of the variant value will not be able to use casts to tell the two cases apart, which would lose potentially-necessary semantic information. Thus, I think we'll need an explicit discriminant until then (which will take a while).

FWIW, you could do types inside a rec group to prevent a.aa and a.ab from canonicalizing to the same thing, but that seems less elegant than simply using a discriminant to me.

@lukewagner
Member

Oh right, good point! Thinking about which is better from a perf POV, I would guess that for low number of cases, the difference is negligible, but for high numbers of cases, the nice thing about a (dense) discriminant is that you can br_table on it, so probably that one is the winner.
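
The shape of the two dispatch strategies can be sketched in TypeScript. This is an analogy only: an indexed handler table stands in for br_table, `instanceof` checks stand in for wasm-gc casts, and all names are hypothetical. It makes no claim about actual performance.

```typescript
// Dense discriminant: a single indexed jump, similar in spirit to br_table.
type Variant = { case: number; payload: number };

const handlers: Array<(payload: number) => string> = [
  (p) => `aa(${p})`,
  (p) => `ab(${p})`,
];

function dispatchByDiscriminant(v: Variant): string {
  const handler = handlers[v.case];
  if (handler === undefined) throw new RangeError(`unknown case ${v.case}`);
  return handler(v.payload);
}

// Cast-based alternative: test each case type in turn,
// which is linear in the number of cases.
class Aa {
  payload: number;
  constructor(payload: number) { this.payload = payload; }
}
class Ab {
  payload: number;
  constructor(payload: number) { this.payload = payload; }
}

function dispatchByCast(v: unknown): string {
  if (v instanceof Aa) return `aa(${v.payload})`;
  if (v instanceof Ab) return `ab(${v.payload})`;
  throw new TypeError("unknown case");
}

console.log(dispatchByDiscriminant({ case: 1, payload: 5 })); // ab(5)
console.log(dispatchByCast(new Ab(5))); // ab(5)
```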
