Do not materialize optional undefined properties inside actual types #28727

Conversation

@weswigham (Member) commented Nov 29, 2018

This is what I suggested when I had heard about the problem #28707 was opened to fix.

Fixes #28540 without making the input code into an error 😉

This is done by tracking the freshness state of unions that contain (or used to contain) fresh object literals, and generating the missing properties on the fly as needed in any given function. This greatly reduces our memory usage in pathological examples like the given test, to the point where we can check it pretty easily - the test actually takes slightly less time to execute than the largeControlFlowGraph.ts test, which is only a third of the length of the new test.
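A minimal sketch of the idea - every name here is hypothetical, not the checker's actual internals. Rather than eagerly materializing an optional `undefined` property on every union member that lacks it, we remember which unions contain (or used to contain) fresh object literals and synthesize the missing property only when it's asked for:

```ts
interface PropLike { name: string; optional?: boolean; }

// Illustrative: tracks which unions originated from fresh object literals.
const hadFreshLiterals = new WeakMap<object, boolean>();

function getPropertyOfMember(
    member: { props: Map<string, PropLike> },
    containingUnion: object,
    name: string
): PropLike | undefined {
    const existing = member.props.get(name);
    if (existing) return existing;
    // Lazily create the implied `name?: undefined` property instead of
    // storing one copy of it per member up front.
    if (hadFreshLiterals.get(containingUnion)) {
        return { name, optional: true };
    }
    return undefined;
}
```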

Also in this PR:

  • Disabling subtype reduction for huge unions. (>1000 members was my arbitrary cutoff - that's still 1,000,000 comparisons just on union construction, double the limit in #28707, so it might even be a good idea to lower it a lot more.)

Subtype reduction is nominally an optimization: the gamble is that the cost of the pairwise comparisons within the union is beaten by the potential savings of eliminating union members from future comparisons. #28707 simply disables subtype reduction and reports an error to prevent followup issues with larger types. This PR instead lets the non-reduced type through and handles larger types better. For truly large unions in practice, subtype reduction is rarely a benefit - these types typically get assigned once or twice and then disappear, rather than being compared and narrowed extensively (as smaller unions are). In effect, subtype reduction pays off only if the number of followup comparisons exceeds the square of the number of union members it failed to eliminate; if it regularly eliminates few or no members, then for large unions it represents a huge cost for little potential gain.
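Concretely, the bail-out amounts to something like this (the constant and shapes are illustrative, not the shipped code):

```ts
// Reducing a union of n members costs up to n * n subtype checks,
// so past some size we keep the union unreduced instead.
const SUBTYPE_REDUCTION_LIMIT = 1000;

function maybeReduceSubtypes<T>(members: T[], reduce: (ms: T[]) => T[]): T[] {
    if (members.length > SUBTYPE_REDUCTION_LIMIT) {
        return members; // let the non-reduced union through
    }
    return reduce(members);
}
```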

It's possible there's an efficient middle ground in subtype reduction here for large types where, rather than outright disabling subtype reduction, we divide the union into object and non-object type buckets, perform normal reduction on the non-object types (there are usually a limited number of them, unless someone has 10,000 different string literal variants in a single union for some reason), scan the object types of the union for the "least specific type", and then attempt to assign each union element to only that least specific type. That grows only linearly with union size, rather than quadratically, and would still handle cases where, e.g., there are a lot of very specific types and a single base type in the array to unify them. In the abstract, the idea is to spend one pass over the list grouping members into similarity classes, each of which is likely a subtype of a single easily identifiable candidate in the class, then do a single pass over each group to check if it can be reduced to its candidate, then combine the candidates (if a group succeeds in being entirely a subtype of its candidate) or the classes back together. Just spitballing here, though - mostly a sketch of using the sorta-transitivity of the subtype relationship to remodel subtype reduction into a divide-and-conquer style algorithm.
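As a very rough sketch of that divide-and-conquer shape (every name here is hypothetical):

```ts
// Split the union into non-object types (reduced normally, there are
// usually few) and object types, pick a "least specific" candidate among
// the objects, and try to collapse every object into it in one pass.
function bucketedReduce<T>(
    members: T[],
    isObject: (t: T) => boolean,
    leastSpecific: (objects: T[]) => T,
    isSubtypeOf: (a: T, b: T) => boolean,
    normalReduce: (ts: T[]) => T[]
): T[] {
    const objects = members.filter(isObject);
    const others = normalReduce(members.filter(t => !isObject(t)));
    if (objects.length === 0) return others;
    const candidate = leastSpecific(objects);
    // One linear pass: O(n) checks instead of O(n^2).
    const allCollapse = objects.every(o => isSubtypeOf(o, candidate));
    return allCollapse ? [...others, candidate] : [...others, ...objects];
}
```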

Anyways, back to smaller things in this PR:

  • On top of this, one of those assignments - the one involving the widened and unwidened forms of a large union - is now trivial: that relationship is tracked (for the purpose of fetching properties lazily rather than storing them in a ton of maps and consuming space), so the comparison can be effectively skipped. With this PR, these large unions can thus often have no actual costly comparisons performed on them at all.

  • A hard limit on the length of a type string we generate before applying truncation rules, even if noTruncation is set. This avoids an OOM crash for huge output types, since the materialized AST nodes for the types we can now create and reason about are, well, huge. I've capped this at 100,000 characters for now, which is incredibly generous compared to the normal truncation limit of 160 characters - if you hit the 100,000-character limit, you're probably not reading the type anyway, and we'd likely crash trying to parse the type back in regardless. This isn't strictly necessary for the PR to pass at this point, but I added it so we wouldn't OOM on the type baselines for the new test before I added skipTypeAndSymbol (see next bullet). I'd understand if we'd rather not see this in this PR.

  • The ability to skipTypeAndSymbol baselines for a given test (type and symbol baselines for a 30k-line file are never going to be reviewed anyway, and besides, they take something like 30 seconds to generate because of how big the types are, even with the 100,000-character limit).

  • Minor optimizations to union property generation - rather than using appendIfUnique, we push symbols into a map keyed by name, avoiding the cost of repeatedly checking uniqueness in unions with a huge number of members (see the sketch after this list).

  • undefined property symbols are now globally cached and don't include a declaration - we produce just one undefined symbol per property name, rather than one per name that may carry a declaration from the first occurrence of that name we saw. This has the nice side effect of removing duplicated symbol declaration entries from union symbols, with the minor downside that a property narrowed to just an implied undefined symbol no longer has an associated declaration at that site (also covered in the sketch below).
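A sketch of the last two bullets, with hypothetical shapes standing in for the checker's actual symbol machinery:

```ts
interface Sym { name: string; declaration?: unknown; }

// 1. Dedupe union properties by name in a Map instead of calling an
//    appendIfUnique-style helper once per symbol.
function collectUnionProps(memberProps: Sym[][]): Map<string, Sym> {
    const byName = new Map<string, Sym>();
    for (const props of memberProps) {
        for (const p of props) {
            if (!byName.has(p.name)) byName.set(p.name, p); // O(1) uniqueness check
        }
    }
    return byName;
}

// 2. Cache one declaration-less `undefined` symbol per property name,
//    shared globally instead of recreated per union.
const undefinedSymbolCache = new Map<string, Sym>();

function getUndefinedProperty(name: string): Sym {
    let sym = undefinedSymbolCache.get(name);
    if (!sym) {
        sym = { name }; // note: no declaration attached
        undefinedSymbolCache.set(name, sym);
    }
    return sym;
}
```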

And one smaller change that might be a regression/bug but also might not depending on how flexible we are:

  • An object type containing a spread of a non-fresh type is no longer itself a fresh type. This makes sense for declarations, property types, and the like, but it disables a somewhat desirable excess property check on the properties actually written in the literal after such a spread (the props within the spread were always excluded via a value-declaration comparison). It might be worth tagging the desire for a property to be excess-checked on the property itself, rather than checking the containing type for freshness - so if this turns out to be a big issue (I know we've swapped our behavior on excess property checks in spreads something like four times), it can be fixed if need be.
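A small illustration of the tradeoff (whether `b` gets flagged is exactly what changes here):

```ts
declare const base: { a: number }; // non-fresh: its type comes from a declaration
const target: { a: number } = { ...base, b: 1 };
// Previously the literal stayed fresh, so `b` was reported as an excess
// property. With this change, spreading the non-fresh `base` makes the
// whole literal non-fresh, and `b` is no longer excess-checked.
```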

@ahejlsberg (Member)

There's a lot happening in this PR and I haven't formed opinions about all of it yet. However, one thing I noticed is the 1000 type cutoff in union subtype reduction. I think this is way too blunt of a hammer. Consider an array literal with >1000 elements that have structurally identical types (e.g. object literals with a uniform set of properties). This is very common and doesn't present a problem for subtype reduction at all. In fact, subtype reduction scales linearly in this scenario because it immediately succeeds on the first subtype check for each element, quickly eliminating all but the first element in the type set.

This is why #28707 uses a heuristic of counting the number of failed subtype checks. Those provide a much more accurate measure of complexity and impose no hard limit on the number of structurally identical types you can union together.
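Roughly the scenario I mean:

```ts
// A large array literal whose elements all have the same shape. Each
// element's type is immediately a subtype of the one retained so far, so
// every subtype check succeeds on the first try and reduction runs in
// linear time overall.
const rows = [
    { id: 1, name: "a" },
    { id: 2, name: "b" },
    // ...a thousand more elements of the same shape
];
// rows is inferred as { id: number; name: string }[] rather than as an
// array of a 1000-member union.
```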

@weswigham (Member Author)

Hm. Yeah, I can understand changing it to reduction failures (I just added it as a quick heuristic) - although rather than a hard limit, how do you feel about a percentage of the input union length? I.e., for unions longer than some small minimum (10?), up to 30% of the possible comparisons are allowed to fail before we bail? The performance penalty of running/not running reduction scales with input size, so it seems to make sense to scale the heuristic with it.
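Something like this, as a sketch (the numbers are just the ones above, nothing settled):

```ts
function failureBudget(unionSize: number): number {
    const MIN_UNION_SIZE = 10;
    if (unionSize <= MIN_UNION_SIZE) return Infinity; // always reduce small unions
    // A union of n members can require up to n * n subtype checks;
    // allow 30% of them to fail before giving up on reduction.
    return Math.floor(unionSize * unionSize * 0.3);
}
```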

@weswigham force-pushed the remove-explicit-object-type-widening branch from e7a4b10 to 564da4f on Nov 29, 2018
@weswigham (Member Author)

Hmmm, on inspection, I don't think basing the heuristic on failures alone is good enough - pass or fail, we still perform a number of comparisons equal to the square of the number of distinct types provided, and they are all distinct, since they come from differing object literals and have differing identities, even if their underlying structure is identical. Because of this, the success case is actually the worst case: there are absolutely no early bailouts, and every object must be fully structurally compared with every other object. Especially since, barring subtype reduction, there's a real possibility we never need to perform any comparisons on the type besides these.

@weswigham (Member Author)

If anything, we should be bailing if we don't see some number of successful comparisons.

@sandersn (Member) commented Feb 3, 2020

@weswigham I think this was superseded by #29435. Feel free to re-open if it's not.

@sandersn sandersn closed this Feb 3, 2020
@sandersn sandersn removed this from Limbo in Pall Mall Feb 3, 2020
@weswigham (Member Author)

We added an error on subtype checks (rather than just skipping them), and the skipTypeAndSymbol test flag has been added by another PR. We could still do the whole "only lazily create optional undefined properties" thing, but we don't strictly have any pressing asks/crashes which require it (since we made the original scenario into an error...).
