Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set literal #37

Closed
lrhn opened this issue Oct 8, 2018 · 16 comments
Closed

Set literal #37

lrhn opened this issue Oct 8, 2018 · 16 comments
Assignees
Labels
feature Proposed language feature that solves one or more problems
Milestone

Comments

@lrhn
Copy link
Member

lrhn commented Oct 8, 2018

Solution for #36, feature specification.

This issue is for discussing the pros and cons of this particular proposal.
General discussion about the issue should go in #36 where everybody can see it.

I propose to add set literals to Dart by using <int>{1, 2, 3} syntax. This is distinguishable from a map literal in every case except the empty set with not type variable, {}. In that case, we should use the context type to pick the correct Set or Map type, defaulting to a map if the context type is not prohibiting it from being a map.

(Historical design proposal).

Related issues: dart-lang/sdk#3792

@lrhn lrhn added the feature Proposed language feature that solves one or more problems label Oct 8, 2018
@lrhn lrhn self-assigned this Oct 8, 2018
@lrhn lrhn changed the title Dart set literals strawman. Dart set literals proposal Oct 9, 2018
@lrhn
Copy link
Member Author

lrhn commented Oct 9, 2018

It has been brought up that this proposal intercepts with another proposed future feature: Collection spreads which allows you to write [1, 2, ...someListExpression, 5] as a literal. It works for both lists and maps, and should also work for sets.
Since the syntax for spreading an iterable in a list literal and a map in a map literal is the same, and it would be the same for set literals, it means that there are more ambiguous expressions than the empty map/set, like {... something}. That is, any map/set literal with no type arguments and containing only spreads needs to be disambiguated.

If there is a useful context type, then we can still use that to determine whether it's a set or map literal.
If the context type allows both sets and maps, then the fact that we have an exact type assumption on literals allows us to make that a compile-time error, so we can assume that either there is no context type, or it's a type that neither Iterable nor Map is assignable to.

If the literal is empty (zero spreads) it still needs to be a map. With no context type, that also cannot be an error.

If there is no useful context type and at least one spread, then we can either give up (make it a compile-time error) or use the static type of the spread expressions to determine whether it's a set or map literal.

Giving up is a valid and safe option - you want to write {... e}, you have to say whether it's a map or a set, since the syntax doesn't say anything.

Or we can look at the static type of e.

The complication here is that the static type of e depends on the context type for e, which again depends on whether the surrounding context is a set or a map, which is what we are trying to deduce.
In a set/list literal with static type Set<T>/List<T>, the context type of e in ...e is Iterable<T>.
We can even be in a position where we know that the context type is a set, but the type argument is still unknown (to-be-inferred), like

void foo<T>(Set<T> set) {}
main() {
   foo({... [1, 2]});  // context type of `[1, 2]` is Iterable<?>, it gets filled out with <int> by the list.
}

So, if we try to find the static type of the spread expressions, we do it with no context type, which is still possible, but may give a different result than what we would get if we had written the inferred type explicitly.
We then need to figure out how to deduce the set/map-ness of the container from those types.

I see three options:

  1. If all spread expressions have static types assignable to Iterable<Object>, and none have static types assignable to Map<Object, Object>, it's a set. In the opposite case (all maps, no iterables), it's a map. Otherwise it's a compile-time error. That's probably unnecessarily harsh.

  2. If all spread expressions have static types assignable to Iterable<Object>, and not all have static types assignable to Map<Object, Object>, it's a set. In the opposite case (all maps, not all iterables), it's a map. Otherwise it's a compile-time error (that is, either not all can be used in a set literal or not all can be used in a map literal, or all of them can be used in both - the first two cases are plain errors, and the last one is unavoidably ambiguous - and it can easily happen because dynamic is down-castable to both).

  3. If all spread expressions have static types assignable to Iterable<Object>, and not all have static types assignable to Map<Object, Object>, it's a set. Otherwise it's a map.

The last case generalizes the default handling of the completely empty {} to all ambiguous literals that are not obviously sets. That may still cause a compile-time error if the spread expressions are not actually maps, it's just a different error message ("spreading non-map inside map literal" rather than "cannot determine if map or set literal"). It will allow more programs to compile, but perhaps not with the author's intended meaning.

An example of a pitfall using any of the above cases (assume it's generated code, not just me being randomly silly):

var setUnion = {...{1, 2}, ...{}, ...{5}};

With no context type, the ...{} will deduce that {} is a map, and since {1, 2} and {5} are clearly sets, the union itself is inconsistent.
There are many ways to disambiguate the code, but the fact that it's necessary, even though it's "obvious" to a human what it means, suggests that we have a dangerous design.

I'm leaning towards the second option - use the static type of the spread expressions, and if that gives a unique answer which ensures the code will compile, use that. If it doesn't give an answer, or doesn't give a unique answer, then make it an error. We will not make a guess that will only be proven wrong at run-time.

Another example to consider:

var x = {...{...{...{}}}};

This is a map. There is no context type on the way down, so the inner {} must be a map, and then the rest follows by propagating the choices back up.

(Also, note to self: Never make Map implement Iterable!)

@eernstg
Copy link
Member

eernstg commented Oct 9, 2018

Here's an attempt to keep it simple.

The rule of thumb would be "When we have evidence for a set and no evidence for a map, it will be a set. Otherwise, it will be a map". This means that we have no ambiguity error, all ambiguous forms will be checked assuming that they specify a map, and we may then get errors because they are half-and-half, but the point is that we have a consistent and backward compatible default: "Otherwise, it will be a map".

In more detail, {} with no context type is a Map<dynamic, dynamic> (forced by existing code). With {...spread1 ...spread2} with no context, it is a Set<E> (with bottom-up inference of E, defaulting to dynamic) if one or more of the spread expressions has a static type which is a subtype of Iterable<dynamic>, and all of the spread expressions are not a subtype of Map<dynamic, dynamic>. Otherwise it is a Map<K, V> (where K and V are obtained from bottom-up inference, defaulting to dynamic).

This means that we require unambiguous evidence for sets (noting "Never make Map implement Iterable!"!), and otherwise insist that the analysis must be performed assuming a map.

Note that I'm using 'subtype' rather than 'assignable', because the latter will always apply for dynamic expressions (and similar stuff), and only the former will positively indicate that there is a request for making it a set or a map.

I think this is a rule which is easy to remember and understand, and this means that we don't need to cook up any complex ways to determine whether {...myList ...myMap ...myOtherList} is a map that wrongly got some lists added to it, or a list that wrongly got a map added to it.

@lrhn
Copy link
Member Author

lrhn commented Oct 9, 2018

I don't actually think that makes it simpler. It provides a simple system, but you have to remember and understand the system.

I still think that assuming Map without evidence is a bad design.

If you write:

var x = { ... dynamicExpression };

we would then accept and compile that as a map literal, and potentially fail at run-time if dynamicExpression is a list.

I think the user would be better helped by us flagging the ambiguity at compile-time and refusing to continue.
That problem does not happen for the empty {} because it cannot be disproven as a map at run-time.

The example:

var x = { ... listAndMapExpression };

where the static type implements both Iterable and Map is possible (but exceedingly rare) and also ambiguous. That one can also not be disproven at run-time (both will work), but I again think the users should be notified that their code isn't clear.

I prefer not to guess at compile-time. If there is only one valid interpretation, pick that. If there is more than one (or definitely if there is less than one), fail visibly and early.

I think this is actually easy for users to understand: When it works, it works in the way they expect it to (they gave enough hints about what they meant). When it doesn't work, they are told early and can give the necessary hint. At no time do we get to run-time with an interpretation other than what they expected (except perhaps the empty set, where our hands are tied).
There is no system to remember except "if it can be both, it's not allowed". That is simpler than any system for making a guess, which users do need to understand, because their code might run with that guess.

@eernstg
Copy link
Member

eernstg commented Oct 9, 2018

Right, the strict approach (where anything which is ambiguous is simply an error) is safer, in the sense that it will allow developers to catch unintended ambiguities.

The trade-off is that things like var x = { ...dynamicExpression }; will be an error, no exceptions, even though it could be a concise way (using my proposal) to get a Map<dynamic, dynamic>. I do recognize that the decision to make it a map rather than a set is based on the history rather than any inherent properties of maps and sets, but we do have the extreme case {} (with no context type) which will be a map.

Other than that, I can live with any of these proposals. ;-)

@Hixie
Copy link

Hixie commented Oct 9, 2018

I definitely agree with the earlier comment that we should avoid magic and be predictable.

I wish there was some syntactic way we could be 100% unambiguous. With types given, it's unambiguous: <Foo>{ } is a set. We could use some other syntax, but I can't find any that is sane and not already used by something... <>{} looks ugly and is weird and inconsistent, [. .] doesn't work because of the spreading operator, {{ }} is ambiguous with maps and blocks, ({ }) and {( )} are ambiguous in other ways, *{ } looks okish but looks terrible with types: <Foo>*{}...

@a14n
Copy link

a14n commented Oct 10, 2018

About syntax we could also use a prefix (like raw string) : s[1, 2]

@lrhn
Copy link
Member Author

lrhn commented Oct 10, 2018

The s[1, 2] syntax works badly for singleton sets: s[0] already means something else. We'd need a non-expression as the prefix, and that probably won't improve readability.

@munificent
Copy link
Member

I wish there was some syntactic way we could be 100% unambiguous.

Me too. One option is {,}, but that's kinda hokey too.

@mit-mit mit-mit added this to the Dart 2.2 milestone Oct 22, 2018
@mit-mit mit-mit changed the title Dart set literals proposal Set literal Oct 31, 2018
@lrhn
Copy link
Member Author

lrhn commented Nov 2, 2018

We have now added a more precise feature specification based on the original design proposal.

Please take a look at this specification, and give feedback if it is unclear or if you anticipate any implementation issues. If there are no objections, we will schedule the design for implementation in Q4.

@kmillikin @sigmundch @a-siva @mraleph @vsmenon @johnniwinther @munificent @danrubel @jwren @jcollins-g

@jcollins-g
Copy link

jcollins-g commented Nov 2, 2018 via email

@lrhn
Copy link
Member Author

lrhn commented Nov 5, 2018

I've fixed the formatting issue and added a "tl;dr" summary section.

All set literals would be compile-time errors prior to the feature.
All existing compiling code is unaffected by the feature.

@vsmenon
Copy link
Member

vsmenon commented Nov 5, 2018

@lhrn - what valid rewrite transformation we can do in kernel for initial implementation (i.e., without direct backend support)?

E.g., the summary suggests this:

<T>{x, y, z} => Set<T>()..add(x)..add(y)..add(z)

Would this also work?:

<T>{x, y, z} => Set<T>.from([x, y, z])

@lrhn
Copy link
Member Author

lrhn commented Nov 6, 2018

Both should work, or perhaps Set<T>()..addAll([x, y, z]).

I'd probably use Set<T>.of(<T>[x, y, z]) rather than .from just to be sure to catch type errors early, and avoid unnecessary type checks at run-time.

For constant maps, that won't work. There is currently no constant map implementation in the SDK.
A possible rewrite is:

const <T>{c1, c2, c3} => const _SetWrapper<T>(<T, T>{c1: c1, c2: c2, c3: c3})

(after removing equal objects from c1..c3), where _SetWrapper is an internal set implementation, based on UnmodifiableSetBase (if we have that, otherwise it should be possible to make one) which forwards to methods on the map (we need the value to implement Set.lookup).

@mit-mit mit-mit added this to Being implemented in Language funnel Nov 28, 2018
@mit-mit mit-mit moved this from Being implemented to Done in Language funnel Feb 22, 2019
@mit-mit
Copy link
Member

mit-mit commented Feb 22, 2019

Closing; expected to launch in Dart 2.2

@Solido
Copy link

Solido commented Apr 28, 2019

@mit-mit
Copy link
Member

mit-mit commented Apr 29, 2019

Yes, will do, tracking in dart-lang/site-www#1365

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems
Projects
Development

No branches or pull requests

9 participants