-
Notifications
You must be signed in to change notification settings - Fork 279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overhaul typechecks for Mapping/Listing #628
Conversation
39664d6
to
cedaf22
Compare
pkl-core/src/test/files/LanguageSnippetTests/output/errors/listingTypeCheckError1.err
Show resolved
Hide resolved
@@ -1,41 +1,367 @@ | |||
import "pkl:test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: these don't amend pkl:test
because facts
and examples
are declared as Mapping<String, Listing<Boolean>>
and Mapping<String, Listing<unknown>>
. These are typed properties that themselves use Mapping/Listing type checks, and problems with typechecking can quickly be conflated between the tests themselves and the test framework.
ce4f445
to
86d6d7b
Compare
*/ | ||
public abstract void execute(VirtualFrame frame, Object value); | ||
public abstract Object execute(VirtualFrame frame, Object value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the biggest changes. Executing a typecheck will return a type-casted value.
For most types, this simply returns value
if the value conforms to the type (e.g. Int
). For Listing
and Mapping
, this returns a surrogate listing/mapping object. Other parameterized types may also return a new object if the parameterized type is Listing
or Mapping
(e.g. List<Listing<Int>>
).
} | ||
} | ||
|
||
public abstract static class ListTypeNode extends ObjectSlotTypeNode { | ||
public static final class ListTypeNode extends ObjectSlotTypeNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truffle's DSL was overriding executeEagerly
when making this an abstract class and using specializations. I'm not sure if I'm doing this wrong?
But, regardless, having specializations here doesn't seem that useful anyway? Reading the same member value skips the typecheck the second time around because it gets cached. And reading a new value needs an instanceof
call and will de-specialize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be related to my lambda remark above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so; I'm referring to the generated code that Truffle produces when you have an abstract class with specializations, which is a different concern than partial evaluation?
this.insertedStackFrames = new HashMap<>(); | ||
} | ||
this.insertedStackFrames.put(callTarget, stackFrame); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this so we can allow a VmTypeMismatchException
to blame a source section when a typecheck fails (see ListingOrMappingTypeNode#doEagerCheck
)
var alreadyVisited = !visited.add(key); | ||
// important to record hidden member as visited before skipping it | ||
// because any overriding member won't carry a `hidden` identifier | ||
if (alreadyVisited || member.isLocalOrExternalOrHidden()) return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation is changed because iterateMemberValues
might cause a VmMapping
/VmListing
to throw, because VmListingOrMapping#getCachedValue()
can possibly perform an additional typecheck. Mapping#length
should not need to execute any members, so, it should not need to run typechecks.
aa9eba3
to
38369c2
Compare
38369c2
to
25e45a0
Compare
This changes how the language performs typechecks for mappings and listings. Currently, Pkl will shallow-force any Mapping and Listing to check it the type parameter (e.g. Listing<Person> means each element is checked to be an instance of Person). This changes the language to check each member's type when the member is accessed. Note: List/Map/Set are still eager checks, because these types are not VmObject and not lazy in the first place. Note 2: Unions of Listings or Mappings are still eager (we must force the listing to know if it satisfies Listing<A>|Listing<B>).
With the change to make mapping/listing typechecks lazy, we can now correctly handle thrown errors from within a single test case. This adjusts the test runner to consider any thrown errors as a failure for that specific test case.
25e45a0
to
670d67e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice solution! Less code than I expected. Just have some small things.
ctx.objectBody(), | ||
new GetParentForTypeNode( | ||
createSourceSection(ctx), | ||
visitType(typeCtx), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you use parentType
here as you already visited it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
@@ -45,10 +47,10 @@ public Object executeGeneric(VirtualFrame frame) { | |||
unresolvedTypeNode = null; | |||
} | |||
|
|||
// TODO: throw if typeNode is FunctionTypeNode (it's impossible to check) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea to create an issue for that (if it doesn't exist yet) and link it here for tracking.
var seenCollections = new MutableLong(0); | ||
this.acceptTypeNode( | ||
(typeNode) -> { | ||
if (typeNode instanceof ListingTypeNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if TypeNode
had a isParametric
property you could just check for that and store the Class
in a set and check if the class is not in the set or return false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
for (var elem : value) { | ||
elementTypeNode.execute(frame, elem); | ||
for (var elem : vmList) { | ||
elementTypeNode.executeEagerly(frame, elem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why you are not checking the return of executeEargerly
to check if the value didn't change, like in execute
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executeEagerly
is a type test rather than a type cast. It will never return a different value; instead, it will throw if the type test fails.
} | ||
) | ||
module.catch(() -> l[0][1]) | ||
module.catch(() -> l[0][2]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really want to check l[0][1]
and l[0][2]
here instead of [0][0]
and [0][1]
? I don't think the index out of bounds check is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, good catch!
@@ -0,0 +1,4 @@ | |||
// ensure that this member is only evaluated once (should result in only one trace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: all other tests are in the singular (listing4/5/6.pkl
), only this one is plural.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to unblock, but some significant questions; especially memory pressure!
@@ -107,6 +107,138 @@ class CliTestRunnerTest { | |||
assertThat(err.toString()).isEqualTo("") | |||
} | |||
|
|||
@Test | |||
fun `CliTestRunner thrown error in facts`(@TempDir tempDir: Path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fun `CliTestRunner thrown error in facts`(@TempDir tempDir: Path) { | |
fun `CliTestRunner throws error in facts`(@TempDir tempDir: Path) { |
?
[throughout]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's not CliTestRunner
that throws the error. I'll change this to CliTestRunner with thrown error in facts
if (isEntry()) { | ||
while (ch == ']' || ch == '=' || Character.isWhitespace(ch)) { | ||
ch = text.charAt(++skip); | ||
} | ||
} else { | ||
while (ch == '=' || Character.isWhitespace(ch)) { | ||
ch = text.charAt(++skip); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (isEntry()) { | |
while (ch == ']' || ch == '=' || Character.isWhitespace(ch)) { | |
ch = text.charAt(++skip); | |
} | |
} else { | |
while (ch == '=' || Character.isWhitespace(ch)) { | |
ch = text.charAt(++skip); | |
} | |
while ((ch == ']' && isEntry()) || ch == '=' || Character.isWhitespace(ch)) { | |
ch = text.charAt(++skip); | |
} |
@Override | ||
protected void acceptTypeNode(Consumer<TypeNode> visitor) { | ||
visitor.accept(this); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to put this default implementation in TypeNode
and reduce some of this duplication (only override when visiting is something other than visitor.accept()
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing it this way is less error prone--having a default implementation made it too easy to miss if something did need to change for any one TypeNode.
if (skipElementTypeChecks) return; | ||
protected void acceptTypeNode(Consumer<TypeNode> visitor) { | ||
visitor.accept(this); | ||
//noinspection ForLoopReplaceableByForEach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to use foreach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah; writing it this way allows Truffle to unroll this loop into a series of statements. See ExplodeLoop for details.
var seenPairs = new MutableLong(0); | ||
var seenCollections = new MutableLong(0); | ||
this.acceptTypeNode( | ||
(typeNode) -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually avoid lambdas in any FooNode
, because of the partial evaluation of node.execute()
. Did you inspect this with IdealGraphVisualizer to confirm this one's okay?
Besides; this visits exhaustively, even when any of these observations hits >= 2
. Is it worth pruning that (let visitors return a boolean
to mark they want to continue; instead of passing in a lambda, define a class with this logic, create and instance and pass a method reference on said instance as the visitor)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a truffle boundary here, so there's no impact on partial evaluation/inlining.
instead of passing in a lambda, define a class with this logic, create and instance and pass a method reference on said instance as the visitor)
Not sure what the benefit of this is? The current function syntax is sugar for new Consumer<TypeNode>()
, and passing that as a reference.
But, fair point on returning early, will implement!
if (typeNode instanceof ListingTypeNode) { | ||
seenListings.getAndIncrement(); | ||
} | ||
if (typeNode instanceof MappingTypeNode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be marginal and I see the point of easy translation to a switch, but that's still far off... else if
?
} | ||
} | ||
|
||
public abstract static class ListTypeNode extends ObjectSlotTypeNode { | ||
public static final class ListTypeNode extends ObjectSlotTypeNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be related to my lambda remark above.
throw typeMismatch(value, BaseModule.getListClass()); | ||
@Override | ||
public Object execute(VirtualFrame frame, Object value) { | ||
if (!(value instanceof VmList vmList)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a newly introduced scenario, or a previously missed one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is neither; I rewrote this class to remove specializations (see comment here: https://github.com/apple/pkl/pull/628/files#r1720334213)
/** The original that this listing/mapping is a surrogate of. It might have its own surrogatee. */ | ||
private final @Nullable SELF surrogatee; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the surrogate(e) naming; what about s/surrogate/typeCheckedThis and s/surrogatee/uncheckedThis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "delegate"? That seems like a more common term for this type of thing.
private final @Nullable SELF surrogatee; | ||
|
||
private final ListingOrMappingTypeCheckNode typeCheckNode; | ||
private final MaterializedFrame typeNodeFrame; | ||
private final EconomicMap<Object, ObjectMember> cachedMembers = EconomicMaps.create(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried about the memory load this imposes (2x this for all Listing
s and Mapping
s; especially the cachedMembers
). Have you measured the cost of this on a significant corpus?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only introduces new cached members on some mappings and listings.
For example, this still contains 2 cached members, because foo
is hidden and never executed:
hidden foo: Mapping = new {
["first"] = 1
["second"] = 2
}
bar: Mapping<String, Int> = foo
Same here:
hidden foo: Mapping<String, Int> = new {
["first"] = 1
}
However, this example means we get four cached members, because foo
is not hidden:
foo: Mapping = new {
["first"] = 1
["second"] = 2
}
bar: Mapping<String, Int> = foo
We can certainly profile this on some real-world projects to see what this change looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an additional optimization here: now, we will only store cached members on the delegated object if the typecast results in a new object.
This results in two cached members:
foo: Mapping = new {
["first"] = 1
["second"] = 2
}
bar: Mapping<String, Int> = foo
This results in four cached members, due to the Listing
to Listing<Int>
typecast:
foo: Mapping<Listing> = new {
["first"] = new { 1 }
["second"] = new { 2 }
}
bar: Mapping<String, Listing<Int>> = foo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also added another optimization: A typecast will only return a new delegating object if the types are different.
This results in no delegation:
foo: Listing<String>
bar = foo as Listing<String> // no delegation here because `foo` is already `Listing<String>`
* Don't create a delegated VmListingOrMapping if the element type check is the same (don't create a new object for: `new Listing<String> {} as Listing<String>`) * Don't store cached values on a delegated object unless a typecast returns a different value
This follows this SPICE: apple/pkl-evolution#11
This changes how the language performs typechecks for mappings and
listings.
Currently, Pkl will shallow-force any Mapping and Listing to check
the type parameter (e.g.
Listing<Person>
means each element is checkedto be an instance of
Person
).This changes the language to check each member's type when the member
is accessed.
This PR is designed to be reviewed as two separate commits.
The first one contains business logic that changes how mapping/listing
typechecks work.
The second one adjusts the test runner to handle errors thrown within
tests.
Closes #405
Closes #406