-
Notifications
You must be signed in to change notification settings - Fork 787
Add a utility for principal types #7642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WebAssembly has the property that every valid sequence of instructions has a principal type, i.e. a unique most-specific type. However, due to various kinds of polymorphism, these principal types may contain type variables. Specifically, the syntaxes for value types, heap types, nullability, sharedness, and exactness are all augmented with type variables. Add a PrincipalType utility for representing and composing these principal types. This can be seen as a more powerful version of the existing StackSignature utility. Follow-up PRs will update ChildTyper to use principal types to represent type constraints and use the principal types of outlined instruction sequences to detect polymorphism and determine the types of outlined functions. There are also further potential applications in simplifying validation, simplifying expression typing, and performing spec-compliant validation.
const auto* null = std::get_if<Nullability>(&ref.null); | ||
if (!null) { | ||
return ref; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These helpers might benefit from comments, as I feel I'm missing the point. What sense of "canonicalization" is meant here? Is the canonical representation defined somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments for all the helpers. It might be helpful to start reading the implementation of PrincipalType::compose
at the end of the file to get a sense for all the helpers are used, too.
src/ir/principal-type.cpp
Outdated
return VarRef{null, VarDefHeapType{type.getExactness(), ht}}; | ||
} | ||
|
||
// join: Update `value` to the least upper bound of `value` and `other`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// join: Update `value` to the least upper bound of `value` and `other`. | |
// join: Update `value` to the least upper bound of `value` and `other`. Returns | |
// true if a change was made. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or does the return value have some other meaning? Actually after reading a bit further down, I am starting to think it does.
}; | ||
|
||
void countVar(Index* countp, Index i) { | ||
// Variables must be introduced sequentialy starting at 0 so their count is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Variables must be introduced sequentialy starting at 0 so their count is | |
// Variables must be introduced sequentially starting at 0 so their count is |
// variable. If a reference type can be expressed in the normal syntax rather | ||
// than the extended syntax, it should be represented here as a `Type`, not a | ||
// `VarRef`. | ||
using VarType = std::variant<Type, VarRef, Index>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've read about half of the cpp and think I have some feel for what's going on here. But I have a question about the design. Rather than VarType, VarRef, VarHeapType, VarDefHeapType, VarAbsHeapType, could we have a single class:
struct VarStar {
std::optional<VarNullability> null;
std::optional<VarExactness> exact;
std::optional<VarShare> share;
std::optional<VarHeapType> heap;
};
Not all those 4 fields would be needed in all cases, but there would be a natural/canonical way to represent things in each situation. That is, those 4 attributes are all we can track: nullability, exactness, shareability, and heap type, and each of them can be present or absent, and can also be a variable.
I feel like a single class could avoid much of the code in the cpp file? E.g. joining would just join each of the 4 attributes. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually started out with a design that was similar to your suggestion. The problem I saw is that there are many nonsensical configurations to ignore and that the different members cannot be handled independently. For example, it is impossible to correctly join exactness without also considering heap types, and there is double-bookkeeping between the explicit sharedness and the sharedness that is part of a heap type. Having several types that more precisely model the allowed syntax lets us break what would have been one very large and messy function into several much smaller and simpler functions that only consider one case each. Having just one type holding everything would reduce the number of functions, but not the overall amount of code, and it would make the code much more complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, makes sense. Good to know you considered that option.
return true; | ||
} | ||
// Make sure we have a class for i. | ||
classes.info.reserve(i + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK no previous user of DisjointSets uses the internal .info
in this manner. Is the API there not sufficient? (if not should we extend it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I guess we could add a reserve
method that just calls reserve
on the underlying vector. I'll do that.
} | ||
} | ||
|
||
// matchBottom: join the bottom value into the assignemnt in `assignments` for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// matchBottom: join the bottom value into the assignemnt in `assignments` for | |
// matchBottom: join the bottom value into the assignment in `assignments` for |
|
||
// matchBottom: join the bottom value into the assignemnt in `assignments` for | ||
// any variables appearing in the second parameter. Returns `true` if this | ||
// succeeds. TODO: This currently always succeeded, but if we supported repeated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// succeeds. TODO: This currently always succeeded, but if we supported repeated | |
// succeeds. TODO: This currently always succeeds, but if we supported repeated |
return matchBottom(assignments, ref.null) && matchBottom(assignments, ref.ht); | ||
} | ||
|
||
// match: Record the variable assignemnts necessary to make `a` match `b` (i.e. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// match: Record the variable assignemnts necessary to make `a` match `b` (i.e. | |
// match: Record the variable assignments necessary to make `a` match `b` (i.e. |
// variables in outputs properly, we would need to avoid mixing left assignment | ||
// and right assignemnt, so this would be able to fail. | ||
|
||
bool matchBottom(VarAssignments& assignments, const VarNullability& null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused as to where the word "match" comes from, here and below. Here, can this not be assignBottom
? Looks like that is what it does, assign the bottom type if the input is an index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matches
is the word the spec uses for "is a subtype of," so here I'm using match
for "make into a subtype of."
} | ||
|
||
// match: Record the variable assignemnts necessary to make `a` match `b` (i.e. | ||
// to ensure `a` <: `b`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing the above comment about the word "match", can we say ensureSub
, as that IIUC what the comment describes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's value in matching the spec wording, especially for something that is so directly taken from the spec, and also when the spec wording so conveniently pithy.
WebAssembly has the property that every valid sequence of instructions
has a principal type, i.e. a unique most-specific type. However, due to
various kinds of polymorphism, these principal types may contain type
variables. Specifically, the syntaxes for value types, heap types,
nullability, sharedness, and exactness are all augmented with type
variables.
Add a PrincipalType utility for representing and composing these
principal types. This can be seen as a more powerful version of the
existing StackSignature utility. Follow-up PRs will update ChildTyper to
use principal types to represent type constraints and use the principal
types of outlined instruction sequences to detect polymorphism and
determine the types of outlined functions. There are also further
potential applications in simplifying validation, simplifying expression
typing, and performing spec-compliant validation.