Add a utility for principal types #7642

tlively · 2025-06-06T18:16:47Z

WebAssembly has the property that every valid sequence of instructions
has a principal type, i.e. a unique most-specific type. However, due to
various kinds of polymorphism, these principal types may contain type
variables. Specifically, the syntaxes for value types, heap types,
nullability, sharedness, and exactness are all augmented with type
variables.

Add a PrincipalType utility for representing and composing these
principal types. This can be seen as a more powerful version of the
existing StackSignature utility. Follow-up PRs will update ChildTyper to
use principal types to represent type constraints and use the principal
types of outlined instruction sequences to detect polymorphism and
determine the types of outlined functions. There are also further
potential applications in simplifying validation, simplifying expression
typing, and performing spec-compliant validation.

WebAssembly has the property that every valid sequence of instructions has a principal type, i.e. a unique most-specific type. However, due to various kinds of polymorphism, these principal types may contain type variables. Specifically, the syntaxes for value types, heap types, nullability, sharedness, and exactness are all augmented with type variables. Add a PrincipalType utility for representing and composing these principal types. This can be seen as a more powerful version of the existing StackSignature utility. Follow-up PRs will update ChildTyper to use principal types to represent type constraints and use the principal types of outlined instruction sequences to detect polymorphism and determine the types of outlined functions. There are also further potential applications in simplifying validation, simplifying expression typing, and performing spec-compliant validation.

src/ir/principal-type.h

src/ir/principal-type.cpp

kripken · 2025-06-17T17:09:26Z

src/ir/principal-type.cpp

+  const auto* null = std::get_if<Nullability>(&ref.null);
+  if (!null) {
+    return ref;
+  }


These helpers might benefit from comments, as I feel I'm missing the point. What sense of "canonicalization" is meant here? Is the canonical representation defined somewhere?

I've added comments for all the helpers. It might be helpful to start reading the implementation of PrincipalType::compose at the end of the file to get a sense for all the helpers are used, too.

kripken · 2025-06-17T22:01:18Z

src/ir/principal-type.cpp

+  return VarRef{null, VarDefHeapType{type.getExactness(), ht}};
+}
+
+// join: Update `value` to the least upper bound of `value` and `other`.


Suggested change

// join: Update `value` to the least upper bound of `value` and `other`.

// join: Update `value` to the least upper bound of `value` and `other`. Returns

// true if a change was made.

Or does the return value have some other meaning? Actually after reading a bit further down, I am starting to think it does.

src/ir/principal-type.cpp

kripken · 2025-06-18T16:56:14Z

src/ir/principal-type.cpp

+};
+
+void countVar(Index* countp, Index i) {
+  // Variables must be introduced sequentialy starting at 0 so their count is


Suggested change

// Variables must be introduced sequentialy starting at 0 so their count is

// Variables must be introduced sequentially starting at 0 so their count is

kripken · 2025-06-18T17:06:00Z

src/ir/principal-type.h

+// variable. If a reference type can be expressed in the normal syntax rather
+// than the extended syntax, it should be represented here as a `Type`, not a
+// `VarRef`.
+using VarType = std::variant<Type, VarRef, Index>;


I've read about half of the cpp and think I have some feel for what's going on here. But I have a question about the design. Rather than VarType, VarRef, VarHeapType, VarDefHeapType, VarAbsHeapType, could we have a single class:

struct VarStar { std::optional<VarNullability> null; std::optional<VarExactness> exact; std::optional<VarShare> share; std::optional<VarHeapType> heap; };

Not all those 4 fields would be needed in all cases, but there would be a natural/canonical way to represent things in each situation. That is, those 4 attributes are all we can track: nullability, exactness, shareability, and heap type, and each of them can be present or absent, and can also be a variable.

I feel like a single class could avoid much of the code in the cpp file? E.g. joining would just join each of the 4 attributes. Am I missing something?

I actually started out with a design that was similar to your suggestion. The problem I saw is that there are many nonsensical configurations to ignore and that the different members cannot be handled independently. For example, it is impossible to correctly join exactness without also considering heap types, and there is double-bookkeeping between the explicit sharedness and the sharedness that is part of a heap type. Having several types that more precisely model the allowed syntax lets us break what would have been one very large and messy function into several much smaller and simpler functions that only consider one case each. Having just one type holding everything would reduce the number of functions, but not the overall amount of code, and it would make the code much more complex.

I see, makes sense. Good to know you considered that option.

kripken · 2025-06-18T19:38:23Z

src/ir/principal-type.cpp

+      return true;
+    }
+    // Make sure we have a class for i.
+    classes.info.reserve(i + 1);


AFAIK no previous user of DisjointSets uses the internal .info in this manner. Is the API there not sufficient? (if not should we extend it?)

Hmm, I guess we could add a reserve method that just calls reserve on the underlying vector. I'll do that.

kripken · 2025-06-18T19:55:51Z

src/ir/principal-type.cpp

+  }
+}
+
+// matchBottom: join the bottom value into the assignemnt in `assignments` for


Suggested change

// matchBottom: join the bottom value into the assignemnt in `assignments` for

// matchBottom: join the bottom value into the assignment in `assignments` for

kripken · 2025-06-18T22:41:04Z

src/ir/principal-type.cpp

+
+// matchBottom: join the bottom value into the assignemnt in `assignments` for
+// any variables appearing in the second parameter. Returns `true` if this
+// succeeds. TODO: This currently always succeeded, but if we supported repeated


Suggested change

// succeeds. TODO: This currently always succeeded, but if we supported repeated

// succeeds. TODO: This currently always succeeds, but if we supported repeated

kripken · 2025-06-18T22:42:50Z

src/ir/principal-type.cpp

+  return matchBottom(assignments, ref.null) && matchBottom(assignments, ref.ht);
+}
+
+// match: Record the variable assignemnts necessary to make `a` match `b` (i.e.


Suggested change

// match: Record the variable assignemnts necessary to make `a` match `b` (i.e.

// match: Record the variable assignments necessary to make `a` match `b` (i.e.

kripken · 2025-06-18T22:43:53Z

src/ir/principal-type.cpp

+// variables in outputs properly, we would need to avoid mixing left assignment
+// and right assignemnt, so this would be able to fail.
+
+bool matchBottom(VarAssignments& assignments, const VarNullability& null) {


I'm confused as to where the word "match" comes from, here and below. Here, can this not be assignBottom? Looks like that is what it does, assign the bottom type if the input is an index.

matches is the word the spec uses for "is a subtype of," so here I'm using match for "make into a subtype of."

kripken · 2025-06-18T22:44:45Z

src/ir/principal-type.cpp

+}
+
+// match: Record the variable assignemnts necessary to make `a` match `b` (i.e.
+// to ensure `a` <: `b`).


Continuing the above comment about the word "match", can we say ensureSub, as that IIUC what the comment describes?

I think there's value in matching the spec wording, especially for something that is so directly taken from the spec, and also when the spec wording so conveniently pithy.

tlively requested review from kripken and ashleynh June 6, 2025 18:16

tlively added 4 commits June 6, 2025 18:23

maybe_unused

1af7aef

remove unused vars

2955bb0

unused var

3177490

ndebug

5f56b33

kripken reviewed Jun 9, 2025

View reviewed changes

src/ir/principal-type.h Show resolved Hide resolved

more comments in header

7845be5

kripken reviewed Jun 17, 2025

View reviewed changes

comments on invariants and helper functions

eddbbfa

kripken reviewed Jun 17, 2025

View reviewed changes

update comment about join

c42c83c

kripken reviewed Jun 18, 2025

View reviewed changes

	// join: Update `value` to the least upper bound of `value` and `other`.
	// join: Update `value` to the least upper bound of `value` and `other`. Returns
	// true if a change was made.

	// Variables must be introduced sequentialy starting at 0 so their count is
	// Variables must be introduced sequentially starting at 0 so their count is

	// matchBottom: join the bottom value into the assignemnt in `assignments` for
	// matchBottom: join the bottom value into the assignment in `assignments` for

	// succeeds. TODO: This currently always succeeded, but if we supported repeated
	// succeeds. TODO: This currently always succeeds, but if we supported repeated

	// match: Record the variable assignemnts necessary to make `a` match `b` (i.e.
	// match: Record the variable assignments necessary to make `a` match `b` (i.e.

Add a utility for principal types #7642

Are you sure you want to change the base?

Add a utility for principal types #7642

Conversation

tlively commented Jun 6, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!