Skip to content
This repository has been archived by the owner on Feb 22, 2018. It is now read-only.

Infer list (and map?) literals based on the data they contain #108

Closed
vsmenon opened this issue Mar 26, 2015 · 48 comments
Closed

Infer list (and map?) literals based on the data they contain #108

vsmenon opened this issue Mar 26, 2015 · 48 comments

Comments

@vsmenon
Copy link
Contributor

vsmenon commented Mar 26, 2015

This came up in recent discussions (@nex3 @kevmoo @leafpetersen).

We currently infer l in:

var l = ["hello", "world"];

as List<dynamic>. This causes issues if we use l in a context that requires List<String>. As an alternative, we could infer l as a List with the narrowest static type that accommodates its contents. In this case,

var l = ["hello", "world"];
l.add(42);

would be an error. Thoughts?

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 26, 2015

If we do this, we may want to disallow empty list literals - e.g., [] - without a type.

@jmesserly
Copy link
Contributor

it's basically a choice whether to optimize for typed+infer or untyped programs, in the defaults?

// current behavior
<String>["hello", "world"]  // List<String>
["hello", "world"]..add(42) // List<Object>

vs

["hello", "world"]  // List<String>
<Object>["hello", "world"]..add(42) // List<Object>

my hunch would be that typed Lists are a (lot) more common than List<Object> ... could be wrong though.

Regarding [], are the choices are <bottom>[] vs <Object>[]? The former being an error and the latter treating it as a special case. Either way, do we infer List<String> list = [];? What about List list = [];

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 26, 2015

@jmesserly We will (with @leafpetersen's cl) infer List<String> list = [];.

Regarding [], here's an example from the path package:

https://codereview.chromium.org/1038063002/diff/1/lib/src/parsed_path.dart

In this case, parts is assigned an empty list, strings are later added, and it is used in a context requiring List<String>. In this case, inferring <Object>[] would require a cast that would fail at runtime.

@nex3
Copy link

nex3 commented Mar 26, 2015

👍 to this proposal. In practice, I very rarely want an actual List<dynamic>; the only place I can think of is collecting a list of exceptions.

I'd also love not to have to add type annotations to my empty list literals (or anywhere in my method bodies for that matter), but I'm given to understand that the inference to make that work would be more difficult. Given that, I'd prefer to have an unannotated, un-inferred empty literal produce an error or warning.

@eernstg
Copy link

eernstg commented Mar 26, 2015

It is a problem if, say, a List is inferred for [2, 4] and then later
a double or some other non-int is added. I think the safest approach is to
ask the programmer who introduces the construct. An IDE offering a
quickfix changing [2, 4] to [2, 4] or [2, 4] could help
programmers making their intentions explicit with a near-zero effort.

On Thu, Mar 26, 2015 at 9:52 PM, Natalie Weizenbaum <
notifications@github.com> wrote:

[image: 👍] to this proposal. In practice, I very rarely want an actual
List; the only place I can think of is collecting a list of
exceptions.

I'd also love not to have to add type annotations to my empty list
literals (or anywhere in my method bodies for that matter), but I'm given
to understand that the inference to make that work would be more difficult.
Given that, I'd prefer to have an unannotated, un-inferred empty literal
produce an error or warning.


Reply to this email directly or view it on GitHub
#108 (comment)
.

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@nex3
Copy link

nex3 commented Mar 26, 2015

Adding text to a file isn't just more effort for the author, it's more effort for the reader as well. The language shouldn't be more verbose just because that verbosity can be hidden behind an IDE.

I think in practice there are going to be very few cases where the user will initialize a list with values that are more narrowly typed than they eventually use.

@eernstg
Copy link

eernstg commented Mar 26, 2015

I don't think that's so unlikely: An empty list has no guidance on the
element types unless you do flow-sensitive analysis in order to see from
its entire scope what it will contain. When that code has not yet been
written (completely) it is not unreasonable to ask for a bit of help from
the user. The point is that it is dangerous to make an incorrect choice of
type argument implicitly, and for a mutable list you can't know which
choice is correct. It's just too "DWIM"-ish to use inference here.

On Thu, Mar 26, 2015 at 10:37 PM, Natalie Weizenbaum <
notifications@github.com> wrote:

Adding text to a file isn't just more effort for the author, it's more
effort for the reader as well. The language shouldn't be more verbose just
because that verbosity can be hidden behind an IDE.

I think in practice there are going to be very few cases where the user
will initialize a list with values that are more narrowly typed than they
eventually use.


Reply to this email directly or view it on GitHub
#108 (comment)
.

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@nex3
Copy link

nex3 commented Mar 26, 2015

I'm on board with not doing inference for empty lists, but I think 95%+ of non-empty list literals will be used in a way that's consistent with their initializing values. And if not, I don't think it's dangerous; the user will just get a static error.

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 26, 2015

Summarizing options:

  1. Infer as List of Object / top (current behavior):

    var l = [1, 2, 3];
    l.add("hello"); // Allowed
    List<int> l2 = l; // Statically rejected
    List<Object> l3 = l;
    l3.add("hello"); // Allowed
  2. Infer via element types:

    var l = [1, 2, 3];
    l.add("hello"); // Statically rejected
    List<int> l2 = l; // Allowed
    List<Object> l3 = l;
    l3.add("hello"); // Rejected at runtime
  3. Reject immediately:

    var l = [1, 2, 3];  // Statically reject - require explicit type annotation

Note, the empty list [] is effectively an error in both 2 and 3.

@kevmoo
Copy link
Contributor

kevmoo commented Mar 26, 2015

There is one thing I hate more than requiring verbosity – trying to guess user intent when it's vague and ending up with unexpected results.

I don't mind var l = <int>[1, 2, 3]; at all – it also ensures reified types on the VM which are great for catching errors in other contexts.

@nex3
Copy link

nex3 commented Mar 27, 2015

There is one thing I hate more than requiring verbosity – trying to guess user intent when it's vague and ending up with unexpected results.

I don't think a list initialized with values indicates a vague user intention at all. I think the intention is so clear that users will be upset when the compiler doesn't understand them.

I don't mind var l = <int>[1, 2, 3]; at all

I think you might change your mind after using a list-based DSL. Here's what a snippet of pub's test code would look like with annotations:

setUp(() {
  d.dir(appPath, <d.Descriptor>[
    d.appPubspec(),
    d.dir("web", <d.Descriptor>[
      d.dir("one", <d.Descriptor>[
        d.dir("inner", <d.Descriptor>[d.file("file.txt", "one")])
      ]),
      d.dir("two", <d.Descriptor>[
        d.dir("inner", <d.Descriptor>[d.file("file.txt", "two")])
      ]),
      d.dir("nope", <d.Descriptor>[
        d.dir("inner", <d.Descriptor>[d.file("file.txt", "nope")])
      ])
    ])
  ]).create();
});

– it also ensures reified types on the VM which are great for catching errors in other contexts.

It'll be reified either way in DDC or a under a DDC-compatibility flag in the VM. Also, right now very little is reified in the VM anyway due to the style guide and the lack of generic methods, and in practice it hasn't caused much of a debugging issue.

@nex3
Copy link

nex3 commented Mar 27, 2015

Another thing just occurred to me: if we consider var x = 1 to mean that x is always an int, even though in current Dart you could theoretically want x to be dynamic, it seems very consistent for var y = [1] to mean y's contents are always ints. There's precedent for the de facto type of a value to be considered de jure by the type system.

@eernstg
Copy link

eernstg commented Mar 27, 2015

On Fri, Mar 27, 2015 at 12:04 AM, vsmenon notifications@github.com wrote:

Summarizing options:

  1. Infer as List of Object / top (current behavior):

It isn't actually List, it is List, and the existence of
any such object (whose type contains 'dynamic') is a problem for heap
invariant maintenance: Even checked mode execution will now be able to
evaluate an expression of a type T and yield a result of a type S where S
\not<: T, i.e., "beware, anything can come out of your heap".

var l = [1, 2, 3];

l has type dynamic, but let's assume that the examples explicitly
express the proposed type for [1, 2, 3], i.e., List l.

l.add("hello"); // AllowedList l2 = l; // Statically rejected

No, List <: List because List<\bottom> << List because
\bottom << int.

List l3 = l;
l3.add("hello"); // Allowed

Yes, both statically and dynamically.

  1. Infer via element types:

var l = [1, 2, 3];
l.add("hello"); // Statically rejected

So we assume the declaration List<int> l: OK, static warning given,
plus runtime error.

List l2 = l; // AllowedList l3 = l;
l3.add("hello"); // Rejected at runtime

Yep.

  1. Reject immediately:

var l = [1, 2, 3]; // Statically reject - require explicit type annotation

That's very, very unDarty. ;-)

Note, the empty list [] is effectively an error in both 2 and 3.


Reply to this email directly or view it on GitHub
#108 (comment)
.

My main motivation for arguing that these type arguments should be made
explicit (all the help we can get is fine) is that the invisible inferred
types can break heap invariants (in checked mode => runtime error, in
production mode => unsound basis for understanding what the program does,
ever after): With List the assignment to a List breaks, with
List the addition of a String breaks. "Damned if you do, damned if
you don't.."

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@eernstg
Copy link

eernstg commented Mar 27, 2015

On Fri, Mar 27, 2015 at 1:22 AM, Natalie Weizenbaum <
notifications@github.com> wrote:

There is one thing I hate more than requiring verbosity – trying to guess
user intent when it's vague and ending up with unexpected results.

I don't think a list initialized with values indicates a vague user
intention at all. I think the intention is so clear that users will be
upset when the compiler doesn't understand them.

With final l = const [1, 2, 3]; there is no doubt: l is a list that
contains int objects. With any potential for mutation the programmer
could have plans for adding objects that would work with a proper supertype
of the least upper bound of the contents at initialization.

I don't mind var l = [1, 2, 3]; at all

I think you might change your mind after using a list-based DSL. Here's
what a snippet of pub's test code would look like with annotations:

setUp(() {
d.dir(appPath, <d.Descriptor>[
d.appPubspec(),
d.dir("web", <d.Descriptor>[
d.dir("one", <d.Descriptor>[
d.dir("inner", <d.Descriptor>[d.file("file.txt", "one")])
]),
d.dir("two", <d.Descriptor>[
d.dir("inner", <d.Descriptor>[d.file("file.txt", "two")])
]),
d.dir("nope", <d.Descriptor>[
d.dir("inner", <d.Descriptor>[d.file("file.txt", "nope")])
])
])
]).create();
});

Why not let tools help us here, too? The visual presentation does not
have to spell out all parts of the code at all times.

– it also ensures reified types on the VM which are great for catching
errors in other contexts.

It'll be reified either way in DDC or a under a DDC-compatibility flag in
the VM. Also, right now very little is reified in the VM anyway due to the
style guide and the lack of generic methods, and in practice it hasn't
caused much of a debugging issue.

If the program cannot possibly depend on the value of a reified type
argument then it is of course OK for an implementation to optimize it away,
just like methods can be removed from classes etc. by a tree shaking
process. Semantically, type arguments are always reified.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@eernstg
Copy link

eernstg commented Mar 27, 2015

On Fri, Mar 27, 2015 at 2:53 AM, Natalie Weizenbaum <
notifications@github.com> wrote:

Another thing just occurred to me: if we consider var x = 1 to mean that x
is always an int,

Same problem, it should not mean that x is always an int. But it's fine if
your tooling makes it very, very easy to edit the code such that it says int
x = 1, and also if the visual presentation compresses all type annotations
on fields/variables/parms to a tiny ellipsis such that the code looks
simpler until you insist on seeing the details.

even though in current Dart you could theoretically want x to be dynamic,
it seems very consistent for var y = [1] to mean y's contents are always
ints. There's precedent for the de facto type of a value to be
considered de jure by the type system.

I think inference should produce correct results. In the cases where this
is not possible (not just because the implementer of the inference engine
hasn't done a good job, but because it is an undecidable problem whether a
non-int will ever flow into a list-of-inferred-int) it's better to insist
that a human being takes the responsibility and the decision gets recorded
in code.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 27, 2015

On Fri, Mar 27, 2015 at 12:42 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 12:04 AM, vsmenon notifications@github.com
wrote:

Summarizing options:

  1. Infer as List of Object / top (current behavior):

It isn't actually List, it is List, and the existence of
any such object (whose type contains 'dynamic') is a problem for heap
invariant maintenance: Even checked mode execution will now be able to
evaluate an expression of a type T and yield a result of a type S where S
\not<: T, i.e., "beware, anything can come out of your heap".

I am describing DDC behavior. List is treated as List at
runtime in order to prevent the issue you describe.

var l = [1, 2, 3];

l has type dynamic, but let's assume that the examples explicitly
express the proposed type for [1, 2, 3], i.e., List l.

l.add("hello"); // AllowedList l2 = l; // Statically rejected

No, List <: List because List<\bottom> << List because
\bottom << int.

In DDC, this is statically rejected today. List <: List but
not vice versa.

List l3 = l;
l3.add("hello"); // Allowed

Yes, both statically and dynamically.

  1. Infer via element types:

var l = [1, 2, 3];
l.add("hello"); // Statically rejected

So we assume the declaration List<int> l: OK, static warning given,
plus runtime error.

List l2 = l; // AllowedList l3 = l;
l3.add("hello"); // Rejected at runtime

Yep.

  1. Reject immediately:

var l = [1, 2, 3]; // Statically reject - require explicit type
annotation

That's very, very unDarty. ;-)

Agreed, but that's effectively what we're saying if we require explicit
type arguments.

Note, the empty list [] is effectively an error in both 2 and 3.


Reply to this email directly or view it on GitHub
<
#108 (comment)

.

My main motivation for arguing that these type arguments should be made
explicit (all the help we can get is fine) is that the invisible inferred
types can break heap invariants (in checked mode => runtime error, in
production mode => unsound basis for understanding what the program does,
ever after): With List the assignment to a List breaks, with
List the addition of a String breaks. "Damned if you do, damned if
you don't.."

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984


Reply to this email directly or view it on GitHub
#108 (comment)
.

@eernstg
Copy link

eernstg commented Mar 27, 2015

On Fri, Mar 27, 2015 at 2:44 PM, vsmenon notifications@github.com wrote:

On Fri, Mar 27, 2015 at 12:42 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 12:04 AM, vsmenon notifications@github.com
wrote:

Summarizing options:

  1. Infer as List of Object / top (current behavior):

It isn't actually List, it is List, and the existence
of
any such object (whose type contains 'dynamic') is a problem for heap
invariant maintenance: Even checked mode execution will now be able to
evaluate an expression of a type T and yield a result of a type S where S
\not<: T, i.e., "beware, anything can come out of your heap".

I am describing DDC behavior. List is treated as List at
runtime in order to prevent the issue you describe.

Ah, sorry! I didn't realize that.

But if we are really talking about multiple languages with a known relation
between them (e.g., "the set of DDC-Dart programs is a subset of the set of
ECMA-Dart programs, with the same semantics") then I believe we have to be
careful that it is made explicit, and also that the properties are upheld.
I suspect that inferred types for literal lists and inferred types for
var declarations are very good examples of differing semantics, rather
than just program universe restriction, at least when running in checked
mode.

var l = [1, 2, 3];

l has type dynamic, but let's assume that the examples explicitly
express the proposed type for [1, 2, 3], i.e., List l.

l.add("hello"); // AllowedList l2 = l; // Statically rejected

No, List <: List because List<\bottom> << List
because
\bottom << int.

In DDC, this is statically rejected today. List <: List but
not vice versa.

OK, I think that's a pure subset thing! ;)

List l3 = l;

l3.add("hello"); // Allowed

Yes, both statically and dynamically.

  1. Infer via element types:

var l = [1, 2, 3];
l.add("hello"); // Statically rejected

So we assume the declaration List<int> l: OK, static warning given,
plus runtime error.

List l2 = l; // AllowedList l3 = l;
l3.add("hello"); // Rejected at runtime

Yep.

  1. Reject immediately:

var l = [1, 2, 3]; // Statically reject - require explicit type
annotation

That's very, very unDarty. ;-)

Agreed, but that's effectively what we're saying if we require explicit
type arguments.

Indeed.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@jmesserly
Copy link
Contributor

I think @eernstg already addressed this, but var x = 1 is somewhat different from var x = [1]. In the former case, if inference assumes x:int, that assumption only lasts for the method body and only affects the type on the stack. Locals are rarely mutated, and rarely to a different type, so the assumption is usually right. By contrast an inferred x:List<int> for var x = [1] affects the List<T> that is on the heap, and can have effects well beyond the current method. Also Lists are frequently mutated.

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 27, 2015

On Fri, Mar 27, 2015 at 7:31 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 2:44 PM, vsmenon notifications@github.com wrote:

On Fri, Mar 27, 2015 at 12:42 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 12:04 AM, vsmenon notifications@github.com
wrote:

Summarizing options:

  1. Infer as List of Object / top (current behavior):

It isn't actually List, it is List, and the
existence
of
any such object (whose type contains 'dynamic') is a problem for heap
invariant maintenance: Even checked mode execution will now be able to
evaluate an expression of a type T and yield a result of a type S
where S
\not<: T, i.e., "beware, anything can come out of your heap".

I am describing DDC behavior. List is treated as List at
runtime in order to prevent the issue you describe.

Ah, sorry! I didn't realize that.

But if we are really talking about multiple languages with a known relation
between them (e.g., "the set of DDC-Dart programs is a subset of the set of
ECMA-Dart programs, with the same semantics") then I believe we have to be
careful that it is made explicit, and also that the properties are upheld.
I suspect that inferred types for literal lists and inferred types for
var declarations are very good examples of differing semantics, rather
than just program universe restriction, at least when running in checked
mode.

If the semantics were to differ, DDC is obligated to throw an error. I.e.,

a non-exceptional DDC execution -> same checked mode execution -> same
production mode execution

(and ideally any exceptions are catastrophic / non-recoverable).

I agree that inferring - and reifying - more precise types for literals is
tricky in this regard, but I think it fits within the philosophy. Closure
literals (due to the lack of explicit return types) force us to consider
this.

var l = [1, 2, 3];

l has type dynamic, but let's assume that the examples explicitly
express the proposed type for [1, 2, 3], i.e., List l.

l.add("hello"); // AllowedList l2 = l; // Statically rejected

No, List <: List because List<\bottom> << List
because
\bottom << int.

In DDC, this is statically rejected today. List <: List but
not vice versa.

OK, I think that's a pure subset thing! ;)

List l3 = l;

l3.add("hello"); // Allowed

Yes, both statically and dynamically.

  1. Infer via element types:

var l = [1, 2, 3];
l.add("hello"); // Statically rejected

So we assume the declaration List<int> l: OK, static warning given,
plus runtime error.

List l2 = l; // AllowedList l3 = l;
l3.add("hello"); // Rejected at runtime

Yep.

  1. Reject immediately:

var l = [1, 2, 3]; // Statically reject - require explicit type
annotation

That's very, very unDarty. ;-)

Agreed, but that's effectively what we're saying if we require explicit
type arguments.

Indeed.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984


Reply to this email directly or view it on GitHub
#108 (comment)
.

@leafpetersen
Copy link
Contributor

With respect to this specific example:

setUp(() {
  d.dir(appPath, <d.Descriptor>[
    d.appPubspec(),
    d.dir("web", <d.Descriptor>[
      d.dir("one", <d.Descriptor>[
        d.dir("inner", <d.Descriptor>[d.file("file.txt", "one")])
      ]),
      d.dir("two", <d.Descriptor>[
        d.dir("inner", <d.Descriptor>[d.file("file.txt", "two")])
      ]),
      d.dir("nope", <d.Descriptor>[
        d.dir("inner", <d.Descriptor>[d.file("file.txt", "nope")])
      ])
    ])
  ]).create();
});

If d.dir is typed appropriately, we might get the right types via downwards type propagation. This doesn't fix everything, but it does help. For example:

// Unannotated, requires upwards inference to get the right type
var l = [[[3]]]
// Fully annotated
var l = <List<List<int>>>[<List<int>>[<int>[3]]]
// Equivalent to the previous if downwards propagation is done
var l = <List<List<int>>>[[[3]]]

@jmesserly
Copy link
Contributor

Here's dir https://github.com/dart-lang/scheduled_test/blob/master/lib/descriptor.dart#L120

DirectoryDescriptor dir(String name, [Iterable<Descriptor> contents]) =>
    new DirectoryDescriptor(name, contents == null ? <Descriptor>[] : contents);

@leafpetersen could we conclude List<Descriptor> based on Iterable<Descriptor>?

@leafpetersen
Copy link
Contributor

Yes, the CL I have out isn't fully general but should already handle this case.

@eernstg
Copy link

eernstg commented Mar 27, 2015

That's a very big door to open (a conforming implementation of DDC is now
allowed to translate all programs into 'main([args]) { throw new
UnimplementedError(); }'. Wouldn't it be possible to reject some programs
under DDC that dart2js et al. accept, and then maintain the same semantics
for the programs accepted by both? After all, the additional restrictions
are supposed to lead to safer programs, not throw-happier ones. ;-)

The only thing I'm advocating is to let programmers make the choice when
there is no single "most correct" answer. Choosing the element type for a
mutable List is one such case.

Inference could come up with some bounds though .. if flow analysis shows
that the list will be assigned to a List then the chosen type argument
must be an S such that S <: T. If an element of type U is added to the
list (or otherwise U occurs in a contravariant position relative to the
interface of the list) then we must ensure U <: S.

For function literals using '=>', the spec does mention that the static
type of the function literal has the statically known type of the return
expression as its return type. Why not 'lint' ourselves into using '=>' as
much as possible, and writing a local function with an explicit return type
otherwise? It's really not much more verbose than a function literal, and
it makes the expression containing the function literal much more readable..

On Fri, Mar 27, 2015 at 4:23 PM, vsmenon notifications@github.com wrote:

On Fri, Mar 27, 2015 at 7:31 AM, Erik Ernst notifications@github.com

wrote:

On Fri, Mar 27, 2015 at 2:44 PM, vsmenon notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 12:42 AM, Erik Ernst <notifications@github.com

wrote:

On Fri, Mar 27, 2015 at 12:04 AM, vsmenon notifications@github.com
wrote:

Summarizing options:

  1. Infer as List of Object / top (current behavior):

It isn't actually List, it is List, and the
existence
of
any such object (whose type contains 'dynamic') is a problem for heap
invariant maintenance: Even checked mode execution will now be able
to
evaluate an expression of a type T and yield a result of a type S
where S
\not<: T, i.e., "beware, anything can come out of your heap".

I am describing DDC behavior. List is treated as List
at
runtime in order to prevent the issue you describe.

Ah, sorry! I didn't realize that.

But if we are really talking about multiple languages with a known
relation
between them (e.g., "the set of DDC-Dart programs is a subset of the set
of
ECMA-Dart programs, with the same semantics") then I believe we have to
be
careful that it is made explicit, and also that the properties are
upheld.
I suspect that inferred types for literal lists and inferred types for
var declarations are very good examples of differing semantics, rather
than just program universe restriction, at least when running in checked
mode.

If the semantics were to differ, DDC is obligated to throw an error. I.e.,

a non-exceptional DDC execution -> same checked mode execution -> same
production mode execution

(and ideally any exceptions are catastrophic / non-recoverable).

I agree that inferring - and reifying - more precise types for literals is
tricky in this regard, but I think it fits within the philosophy. Closure
literals (due to the lack of explicit return types) force us to consider
this.

var l = [1, 2, 3];

l has type dynamic, but let's assume that the examples
explicitly
express the proposed type for [1, 2, 3], i.e., List l.

l.add("hello"); // AllowedList l2 = l; // Statically rejected

No, List <: List because List<\bottom> << List
because
\bottom << int.

In DDC, this is statically rejected today. List <: List
but
not vice versa.

OK, I think that's a pure subset thing! ;)

List l3 = l;

l3.add("hello"); // Allowed

Yes, both statically and dynamically.

  1. Infer via element types:

var l = [1, 2, 3];
l.add("hello"); // Statically rejected

So we assume the declaration List<int> l: OK, static warning
given,
plus runtime error.

List l2 = l; // AllowedList l3 = l;
l3.add("hello"); // Rejected at runtime

Yep.

  1. Reject immediately:

var l = [1, 2, 3]; // Statically reject - require explicit type
annotation

That's very, very unDarty. ;-)

Agreed, but that's effectively what we're saying if we require explicit
type arguments.

Indeed.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984


Reply to this email directly or view it on GitHub
<
#108 (comment)

.


Reply to this email directly or view it on GitHub
#108 (comment)
.

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@jmesserly
Copy link
Contributor

Why not 'lint' ourselves into using '=>' as
much as possible, and writing a local function with an explicit return type
otherwise?

Really good point. In practice, most of these are => functions already. The biggest exception that comes to mind is Future and Stream (.then, .listen), but async/await should take care of those?

@jmesserly
Copy link
Contributor

(also, couldn't we compute a static type from return statements if we were interested? For local functions.)

@eernstg
Copy link

eernstg commented Mar 27, 2015

On Fri, Mar 27, 2015 at 5:41 PM, Leaf Petersen notifications@github.com
wrote:

With respect to this specific example:

setUp(() {
d.dir(appPath, <d.Descriptor>[
d.appPubspec(),
d.dir("web", <d.Descriptor>[
d.dir("one", <d.Descriptor>[
d.dir("inner", <d.Descriptor>[d.file("file.txt", "one")])
]),
d.dir("two", <d.Descriptor>[
d.dir("inner", <d.Descriptor>[d.file("file.txt", "two")])
]),
d.dir("nope", <d.Descriptor>[
d.dir("inner", <d.Descriptor>[d.file("file.txt", "nope")])
])
])
]).create();
});

If d.dir is typed appropriately, we might get the right types via
downwards type propagation. This doesn't fix everything, but it does help.
For example:

// Unannotated, requires upwards inference to get the right typevar l = [[[3]]]// Fully annotatedvar l = <List<List>>[<List>[[3]]]// Equivalent to the previous if downwards propagation is donevar l = <List<List>>[[[3]]]

I think we did at one point discuss an extra device that could be helpful
here: Using a type annotation that indicates to the tools that the locally
inferred type is OK:

var l = <>[<>[<>[3]]]

The innermost list would get type List because we have stated that it
is OK to trust the contents, similarly we get List<List> for the next
level, and so on. With this, the programmer does promise that it is ok
to use local inference, so we get the buy-in that we need, plus a
reasonably concise notation.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@leafpetersen
Copy link
Contributor

I think we did at one point discuss an extra device that could be helpful
here: Using a type annotation that indicates to the tools that the locally
inferred type is OK:

Yes. I very much want to see something like this happen, covering both generic class arguments and generic method arguments (assuming some form of generic methods gets in). I think a key issue is making sure that the inference is specified in a way that admits a very efficient algorithm that can be implemented in the VM - I don't think this should be hard, but the details need to be nailed down.

@eernstg
Copy link

eernstg commented Mar 27, 2015

On Fri, Mar 27, 2015 at 6:03 PM, John Messerly notifications@github.com
wrote:

Why not 'lint' ourselves into using '=>' as
much as possible, and writing a local function with an explicit return type
otherwise?

Really good point. In practice, most of these are => functions already.
The biggest exception that comes to mind is Future and Stream (.then,
.listen), but async/await should take care of those?

Hope so! ;-D


Reply to this email directly or view it on GitHub
#108 (comment)
.

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@leafpetersen
Copy link
Contributor

That's a very big door to open (a conforming implementation of DDC is now
allowed to translate all programs into 'main([args]) { throw new
UnimplementedError(); }'.

We are still iterating on the type system (static + dynamic), but there is a type system behind what we are doing, so no, such an implementation would not be a conforming DDC implementation. Technically speaking it would satisfy the same correctness criteria as DDC does with respect to semantic coherence with the rest of the platforms though. In practice, the empirical question is whether we can make the set of programs that fall into the "throws under DDC, doesn't on other platforms" small enough, and more importantly, make that set consist primarily of programs that programmers want to have rejected.

Wouldn't it be possible to reject some programs
under DDC that dart2js et al. accept, and then maintain the same semantics
for the programs accepted by both? After all, the additional restrictions
are supposed to lead to safer programs, not throw-happier ones. ;-)

Maybe? I'm not averse to pushing on this to see if there's a different path to get where we need to get, but... we've looked pretty hard at the problem (including running lots of experiments on existing code bases) and this is the best path forward that we've found so far given the various constraints.

@nex3
Copy link

nex3 commented Mar 27, 2015

@jmesserly

Locals are rarely mutated, and rarely to a different type, so the assumption is usually right.

If we're talking about empirical likelihood, I'm confident that empirically a List literal initialized with values is almost never mutated to contain values of a different type.

By contrast an inferred x:List for var x = [1] affects the List that is on the heap, and can have effects well beyond the current method.

It definitely affects the reified type, but I doubt it would have wide-reaching effects. The type is statically verified everywhere it's passed to another method or returned from its originating method, so it seems very difficult for it to leak an unexpected reified type.

In general, I'm struggling to see a case where the proposed heuristic causes problems in practice that aren't immediately obvious due to static analysis and trivially fixable by adding <dynamic>. Given that none of the options is likely to produce difficult-to-understand or non-local behavior and all of them provide ample means of user customization, it seems like the option that requires the fewest extra annotations is the best.

@munificent
Copy link
Contributor

This discussion is awesome!

Here's my perspective, coming at this as a user. I'm using Dart instead of JS because I want the static checker to catch as many errors as possible. It's a given that as a Dart user I will occasionally have to write extra bits of stuff to shut up the type checker. If I didn't want to do that, I'd use JS.

Given that, I think it makes sense for DDC to default to assuming a narrow type and require a bit of annotation to widen. I am fine with an error here:

var apes = ["chimp", "bonobo"];
apes.add(123);

This does not mean that inferring a narrow type always leads to more type errors. If we assume the narrower type, it means I do not get an error here:

var apes = ["chimp", "bonobo"];
apes.first.toUpperCase(); // <--

And I would get an error there if apes was inferred to be List<Object>. I think code like the latter is a lot more common than the former. (Most lists are homogeneous, and you certainly do call methods on elements in them.) So I think it makes sense to optimize for that user experience.

@leafpetersen
Copy link
Contributor

@nex3

I think your general point is quite reasonable, but just to be sure we're clear:

It definitely affects the reified type, but I doubt it would have wide-reaching effects. The type is statically verified everywhere it's passed to another method or returned from its originating method, so it seems very difficult for it to leak an unexpected reified type.

We currently implement covariant subtyping for generics. This means you can pass off a List to a context expecting a List, with no static warnings. And that context can in turn (try to) add an int to the list. This will throw at runtime, but we won't catch it statically. I agree that code that does this is probably much rarer.

If we do this kind of inference (and really, even if we don't) - I think we need to have a good story about how we communicate back to the user exactly where the reified type that is causing the error comes from. I think this can be done - it's just a matter of being careful and helpful with our error messages. We should aim to be able to report an error that looks like:

"Hey, you just tried to add an int to a List that was allocated as a List of String on line XXX of file YYY. Do you really want to do that? If so, you should change line XXX of file YYY"

@nex3
Copy link

nex3 commented Mar 28, 2015

@leafpetersen Good point. I do think that a function that accepts List<Object> and tries to add an int to it is just bad code, but you're right that it could happen.

@alan-knight
Copy link
Contributor

OK, that one I don't follow at all. If I'm given a List<Object> I'm not supposed to add an int to it? What, if anything, am I allowed to add?

@nex3
Copy link

nex3 commented Mar 28, 2015

I'd say in general that a function that takes a list shouldn't modify it at all, and certainly not in a way that breaks under covariance. There's room for functions that are explicitly documented to say "adds an int to the list" but they're the exception rather than the rule.

Put another way, if a function declares a parameter as List<Object>, it should assume that it might get any argument that's a subtype of List<Object>, including a list with a more narrow reified type. If it has additional contraints on its parameter, such as that it should be able to have an int added, it should explicitly document them.

@eernstg
Copy link

eernstg commented Mar 31, 2015

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen notifications@github.com
wrote:

That's a very big door to open (a conforming implementation of DDC is now
allowed to translate all programs into 'main([args]) { throw new
UnimplementedError(); }'.

We are still iterating on the type system (static + dynamic), but there is
a type system behind what we are doing, so no, such an implementation would
not be a conforming DDC implementation. Technically speaking it would
satisfy the same correctness criteria as DDC does with respect to semantic
coherence with the rest of the platforms though. In practice, the empirical
question is whether we can make the set of programs that fall into the
"throws under DDC, doesn't on other platforms" small enough, and more
importantly, make that set consist primarily of programs that programmers
want to have rejected.

Wouldn't it be possible to reject some programs
under DDC that dart2js et al. accept, and then maintain the same semantics
for the programs accepted by both? After all, the additional restrictions
are supposed to lead to safer programs, not throw-happier ones. ;-)

Maybe? I'm not averse to pushing on this to see if there's a different
path to get where we need to get, but... we've looked pretty hard at the
problem (including running lots of experiments on existing code bases) and
this is the best path forward that we've found so far given the various
constraints.

Do you have any other source of additional exceptions than the following
one?:

- the heap contains an object O whose type contains `dynamic` (e.g., a

List)

- an expression with static type T yields a value V that does not

conform to T because it involves method/getter invocation on O

- an exception occurs which would not occur in standard Dart, because

you insert checks enforcing that V must conform to T (this is similar to
checked-mode checking, but occurs in additional positions in the code).

If so, maybe you could outlaw those nasty 'reified dynamic' objects? ;-)
Who would want them, and at the same time want a more rigorous typing
discipline?

Returning to the overall topic: I think that it is fair to claim that if
we have a language L and consider a subset M of the programs in L (say, the
ones that are accepted by a type checker which is stricter than the
standard type checker), then a program P in M as a program in L when
executed with the normal semantics (nobody can tell that P "comes from" M
rather than L, because it is indeed a member of L). This is a clear and
clean case. As soon as there is any difference in the runtime semantics
for the same source code (e.g., throwing more exceptions), it's much harder
to argue that M is the same language (and we cannot actually enforce that
nobody can say try/catch and change the semantics in arbitrary ways).

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@eernstg
Copy link

eernstg commented Mar 31, 2015

On Fri, Mar 27, 2015 at 10:27 PM, Natalie Weizenbaum <
notifications@github.com> wrote:

@jmesserly https://github.com/jmesserly

Locals are rarely mutated, and rarely to a different type, so the
assumption is usually right.

[Ah, couldn't resist: They should be final if possible. ;-]

If we're talking about empirical likelihood, I'm confident that
empirically a List literal initialized with values is almost never mutated
to contain values of a different type.

I don't think the relevant case is the situation where a wild stranger like
a String is added to a so-far-List-of-int. That's probably a bug. The
relevant case is much more likely to be when the list is intended to hold
elements from a hierarchy (say, numbers), and it starts out with something
more special (say, a few ints). Don't you ever have a mutable data
structure whose elements should be taken from the subclasses of a given
abstract class?

By contrast an inferred x:List for var x = [1] affects the List that is
on the heap, and can have effects well beyond the current method.

It definitely affects the reified type, but I doubt it would have
wide-reaching effects.

I think you're right that most of these objects will not escape. But if
they somehow escape, e.g., by being returned or stored in a field, then
they're potentially reaching every corner of the program.

The type is statically verified everywhere it's passed to another method
or returned from its originating method, so it seems very difficult for it
to leak an unexpected reified type.

Well, a List is acceptable (in checked-mode, too) wherever a List is
expected, as long as T <: S. So nobody will complain about the list
itself. But the list will complain when something is added that has a more
general type than the (inferred) type argument T. So it's a time bomb,
created by inference (i.e., invisibly) somewhere far away.

In general, I'm struggling to see a case where the proposed heuristic
causes problems in practice that aren't immediately obvious due to static
analysis

I don't think such a time bomb is immediately obvious.

and trivially fixable by adding .

.. if you can find the location where it was created ..

Given that none of the options is likely to produce
difficult-to-understand or non-local behavior

Granted, you could argue that the situation where the list does not escape
is special, because a local analysis of the syntactic scope of the name of
the list-valued variable would suffice. The escape analysis is not
entirely trivial, however, if that variable is given as an argument to some
method calls.

and all of them provide ample means of user customization, it seems like
the option that requires the fewest extra annotations is the best.

There's a general conflict here: I tend to think that mistakes should be
made by people, and algorithms should ask for input when there is no
known-correct choice. Even when the program grows by a couple of
characters.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 31, 2015

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen notifications@github.com
wrote:

That's a very big door to open (a conforming implementation of DDC is now
allowed to translate all programs into 'main([args]) { throw new
UnimplementedError(); }'.

We are still iterating on the type system (static + dynamic), but there
is
a type system behind what we are doing, so no, such an implementation
would
not be a conforming DDC implementation. Technically speaking it would
satisfy the same correctness criteria as DDC does with respect to
semantic
coherence with the rest of the platforms though. In practice, the
empirical
question is whether we can make the set of programs that fall into the
"throws under DDC, doesn't on other platforms" small enough, and more
importantly, make that set consist primarily of programs that programmers
want to have rejected.

Wouldn't it be possible to reject some programs
under DDC that dart2js et al. accept, and then maintain the same
semantics
for the programs accepted by both? After all, the additional restrictions
are supposed to lead to safer programs, not throw-happier ones. ;-)

Maybe? I'm not averse to pushing on this to see if there's a different
path to get where we need to get, but... we've looked pretty hard at the
problem (including running lots of experiments on existing code bases)
and
this is the best path forward that we've found so far given the various
constraints.

Do you have any other source of additional exceptions than the following
one?:

Yes. In general, we prefer static errors to runtime ones. E.g., we
restrict function subtyping variance to contra on args and co on return.
This simplifies the generated code and gives the user earlier error
reporting.

  • the heap contains an object O whose type contains dynamic (e.g., a
    List)
  • an expression with static type T yields a value V that does not
    conform to T because it involves method/getter invocation on O
  • an exception occurs which would not occur in standard Dart, because
    you insert checks enforcing that V must conform to T (this is similar to
    checked-mode checking, but occurs in additional positions in the code).

If so, maybe you could outlaw those nasty 'reified dynamic' objects? ;-)
Who would want them, and at the same time want a more rigorous typing
discipline?

Returning to the overall topic: I think that it is fair to claim that if
we have a language L and consider a subset M of the programs in L (say, the
ones that are accepted by a type checker which is stricter than the
standard type checker), then a program P in M as a program in L when
executed with the normal semantics (nobody can tell that P "comes from" M
rather than L, because it is indeed a member of L). This is a clear and
clean case. As soon as there is any difference in the runtime semantics
for the same source code (e.g., throwing more exceptions), it's much harder
to argue that M is the same language (and we cannot actually enforce that
nobody can say try/catch and change the semantics in arbitrary ways).

Let M = checked mode.
Let L = production mode. Or one of our variants - e.g., trust-types mode.

P is equivalent under M and L only for non-exceptional cases. This is
what we are aiming for as well.

And, of course, we can make some exceptions much more difficult to catch.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984


Reply to this email directly or view it on GitHub
#108 (comment)
.

@eernstg
Copy link

eernstg commented Mar 31, 2015

On Tue, Mar 31, 2015 at 4:27 PM, vsmenon notifications@github.com wrote:

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen <notifications@github.com

wrote:

That's a very big door to open (a conforming implementation of DDC is
now
allowed to translate all programs into 'main([args]) { throw new
UnimplementedError(); }'.

We are still iterating on the type system (static + dynamic), but there
is
a type system behind what we are doing, so no, such an implementation
would
not be a conforming DDC implementation. Technically speaking it would
satisfy the same correctness criteria as DDC does with respect to
semantic
coherence with the rest of the platforms though. In practice, the
empirical
question is whether we can make the set of programs that fall into the
"throws under DDC, doesn't on other platforms" small enough, and more
importantly, make that set consist primarily of programs that
programmers
want to have rejected.

Wouldn't it be possible to reject some programs
under DDC that dart2js et al. accept, and then maintain the same
semantics
for the programs accepted by both? After all, the additional
restrictions
are supposed to lead to safer programs, not throw-happier ones. ;-)

Maybe? I'm not averse to pushing on this to see if there's a different
path to get where we need to get, but... we've looked pretty hard at
the
problem (including running lots of experiments on existing code bases)
and
this is the best path forward that we've found so far given the various
constraints.

Do you have any other source of additional exceptions than the following
one?:

Yes. In general, we prefer static errors to runtime ones. E.g., we
restrict function subtyping variance to contra on args and co on return.
This simplifies the generated code and gives the user earlier error
reporting.

Aha, do you then restrict assignability to subtyping? This would bring you
closer to traditional (sound) type checking, but it would conflict with
covariance for generic types (and of course with the nasty
List-ish objects). Generic covariance may be the toughest case,
because it is a very constraining choice to mandate invariance (and
well-known solutions such as wildcards/existentials require additional
syntax).

[..]

Returning to the overall topic: I think that it is fair to claim that if
we have a language L and consider a subset M of the programs in L (say,
the
ones that are accepted by a type checker which is stricter than the
standard type checker), then a program P in M as a program in L when
executed with the normal semantics (nobody can tell that P "comes from" M
rather than L, because it is indeed a member of L). This is a clear and
clean case. As soon as there is any difference in the runtime semantics
for the same source code (e.g., throwing more exceptions), it's much
harder
to argue that M is the same language (and we cannot actually enforce that
nobody can say try/catch and change the semantics in arbitrary ways).

Let M = checked mode.
Let L = production mode. Or one of our variants - e.g., trust-types mode.

P is equivalent under M and L only for non-exceptional cases. This is
what we are aiming for as well.

And, of course, we can make some exceptions much more difficult to catch.

Here's a case where we use different terminology: To me, Dart-checked-mode
and Dart-production-mode are different languages, because they give rise to
different runtime semantics, even though the syntax is suspiciously
similar. ;)

The point is that programmers can use one kind of reasoning about their
programs when targeting production mode (for instance, they can rely on
structural aka duck typing) and checked mode (where the duck typing should
be avoided because it causes TypeError exceptions). With a longer list of
different runtime semantics, we also get a longer list of cases that
programmers must keep in mind all the time, in case they'll be using more
than one of these languages.

What I was referring to is a less powerful kind of variation: Simple
subsetting of the set of included programs: M includes just some of the
programs in L (namely the ones that some checking procedure has flagged as
"nice" programs), but there is only one runtime semantics for each program
in M, no matter whether we think of it as a member of M or as a member of
L. With that kind of relation, programmers don't have to reason about
their programs in terms of multiple languages.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@vsmenon
Copy link
Contributor Author

vsmenon commented Mar 31, 2015

On Tue, Mar 31, 2015 at 8:52 AM, Erik Ernst notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:27 PM, vsmenon notifications@github.com wrote:

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen <
notifications@github.com

wrote:

That's a very big door to open (a conforming implementation of DDC is
now
allowed to translate all programs into 'main([args]) { throw new
UnimplementedError(); }'.

We are still iterating on the type system (static + dynamic), but
there
is
a type system behind what we are doing, so no, such an implementation
would
not be a conforming DDC implementation. Technically speaking it would
satisfy the same correctness criteria as DDC does with respect to
semantic
coherence with the rest of the platforms though. In practice, the
empirical
question is whether we can make the set of programs that fall into
the
"throws under DDC, doesn't on other platforms" small enough, and more
importantly, make that set consist primarily of programs that
programmers
want to have rejected.

Wouldn't it be possible to reject some programs
under DDC that dart2js et al. accept, and then maintain the same
semantics
for the programs accepted by both? After all, the additional
restrictions
are supposed to lead to safer programs, not throw-happier ones. ;-)

Maybe? I'm not averse to pushing on this to see if there's a
different
path to get where we need to get, but... we've looked pretty hard at
the
problem (including running lots of experiments on existing code
bases)
and
this is the best path forward that we've found so far given the
various
constraints.

Do you have any other source of additional exceptions than the
following
one?:

Yes. In general, we prefer static errors to runtime ones. E.g., we
restrict function subtyping variance to contra on args and co on return.
This simplifies the generated code and gives the user earlier error
reporting.

Aha, do you then restrict assignability to subtyping?

We're still experimenting on this. When we don't restrict assignability,
we insert a runtime check when necessary (effectively the checked mode
check).

This would bring you
closer to traditional (sound) type checking, but it would conflict with
covariance for generic types (and of course with the nasty
List-ish objects). Generic covariance may be the toughest case,
because it is a very constraining choice to mandate invariance (and
well-known solutions such as wildcards/existentials require additional
syntax).

[..]

Returning to the overall topic: I think that it is fair to claim that
if
we have a language L and consider a subset M of the programs in L (say,
the
ones that are accepted by a type checker which is stricter than the
standard type checker), then a program P in M as a program in L when
executed with the normal semantics (nobody can tell that P "comes
from" M
rather than L, because it is indeed a member of L). This is a clear and
clean case. As soon as there is any difference in the runtime semantics
for the same source code (e.g., throwing more exceptions), it's much
harder
to argue that M is the same language (and we cannot actually enforce
that
nobody can say try/catch and change the semantics in arbitrary ways).

Let M = checked mode.
Let L = production mode. Or one of our variants - e.g., trust-types mode.

P is equivalent under M and L only for non-exceptional cases. This is
what we are aiming for as well.

And, of course, we can make some exceptions much more difficult to catch.

Here's a case where we use different terminology: To me, Dart-checked-mode
and Dart-production-mode are different languages, because they give rise to
different runtime semantics, even though the syntax is suspiciously
similar. ;)

I don't think our users see it this way. They develop in one and deploy in
(something like) the latter.

The point is that programmers can use one kind of reasoning about their

programs when targeting production mode (for instance, they can rely on
structural aka duck typing) and checked mode (where the duck typing should
be avoided because it causes TypeError exceptions). With a longer list of
different runtime semantics, we also get a longer list of cases that
programmers must keep in mind all the time, in case they'll be using more
than one of these languages.

What I was referring to is a less powerful kind of variation: Simple
subsetting of the set of included programs: M includes just some of the
programs in L (namely the ones that some checking procedure has flagged as
"nice" programs), but there is only one runtime semantics for each program
in M, no matter whether we think of it as a member of M or as a member of
L. With that kind of relation, programmers don't have to reason about
their programs in terms of multiple languages.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984


Reply to this email directly or view it on GitHub
#108 (comment)
.

@eernstg
Copy link

eernstg commented Apr 1, 2015

On Tue, Mar 31, 2015 at 6:01 PM, vsmenon notifications@github.com wrote:

On Tue, Mar 31, 2015 at 8:52 AM, Erik Ernst notifications@github.com

wrote:

On Tue, Mar 31, 2015 at 4:27 PM, vsmenon notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen <
notifications@github.com

wrote:

[..]

Yes. In general, we prefer static errors to runtime ones. E.g., we
restrict function subtyping variance to contra on args and co on
return.
This simplifies the generated code and gives the user earlier error
reporting.

Aha, do you then restrict assignability to subtyping?

We're still experimenting on this. When we don't restrict assignability,
we insert a runtime check when necessary (effectively the checked mode
check).

OK. I'm trying to wrap my head around the idea that we can have different
behavior at runtime and still somehow maintain that it is "the same
language".

I really think that semantic differences (between different ways to run the
same code) require programmers to be acutely aware of which "way" they
intend to run the code.

However, if it is possible to enforce that all the variants run in
lock-step for a while, and then only differ because some-but-not-all
variants terminate with a runtime error, then we can at least claim that
all the knowledge that we can establish about the program for one of the
variants (say, ECMA-408) will hold for them all as long as they are all
running. For that, we'd need errors that can't be caught. Maybe some kind
of "platform.exit(_)" would do?

[..]

Here's a case where we use different terminology: To me,
Dart-checked-mode
and Dart-production-mode are different languages, because they give rise
to
different runtime semantics, even though the syntax is suspiciously
similar. ;)

I don't think our users see it this way. They develop in one and deploy in
(something like) the latter.

If they do exactly that then they are bound to experience different
behavior in deployed code than that which was tested and debugged; I cannot
imagine that this would be good for business.

[..]

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@vsmenon
Copy link
Contributor Author

vsmenon commented Apr 1, 2015

On Wed, Apr 1, 2015 at 12:55 AM, Erik Ernst notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 6:01 PM, vsmenon notifications@github.com wrote:

On Tue, Mar 31, 2015 at 8:52 AM, Erik Ernst notifications@github.com

wrote:

On Tue, Mar 31, 2015 at 4:27 PM, vsmenon notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst <
notifications@github.com>
wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen <
notifications@github.com

wrote:

[..]

Yes. In general, we prefer static errors to runtime ones. E.g., we
restrict function subtyping variance to contra on args and co on
return.
This simplifies the generated code and gives the user earlier error
reporting.

Aha, do you then restrict assignability to subtyping?

We're still experimenting on this. When we don't restrict assignability,
we insert a runtime check when necessary (effectively the checked mode
check).

OK. I'm trying to wrap my head around the idea that we can have different
behavior at runtime and still somehow maintain that it is "the same
language".

I really think that semantic differences (between different ways to run the
same code) require programmers to be acutely aware of which "way" they
intend to run the code.

However, if it is possible to enforce that all the variants run in
lock-step for a while, and then only differ because some-but-not-all
variants terminate with a runtime error, then we can at least claim that
all the knowledge that we can establish about the program for one of the
variants (say, ECMA-408) will hold for them all as long as they are all
running. For that, we'd need errors that can't be caught. Maybe some kind
of "platform.exit(_)" would do?

Exactly. This is what we mean by "catastrophic" or "non-recoverable"
errors. In the browser, we can force the JS debugger open (the "debugger"
statement) and / or alter the UI of the running program. We can also make
these non-catchable in generated code.

[..]

Here's a case where we use different terminology: To me,
Dart-checked-mode
and Dart-production-mode are different languages, because they give
rise
to
different runtime semantics, even though the syntax is suspiciously
similar. ;)

I don't think our users see it this way. They develop in one and deploy
in
(something like) the latter.

If they do exactly that then they are bound to experience different
behavior in deployed code than that which was tested and debugged; I cannot
imagine that this would be good for business.

This is exactly how we've been telling our users to use Dart. The names
"checked" and "production" rather emphasize it. :-) "assert" statements
only run in checked mode.

I don't think this is as bad as you seem to think. There is certainly
plenty of precedent for debug mode and release mode.

That said, I think it's good motivation for favoring static
(mode-insensitive) enforcement whenever practical over runtime
(mode-sensitive) enforcement. E.g., to answer your earlier question, I'd
lean toward restricting assignability to subtyping if we can ... but it
does force more code changes (and bloat?) and we'll need to see how users
feel about that. Pragmatically, it might make more sense as a static
warning as opposed to an error.

[..]

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984


Reply to this email directly or view it on GitHub
#108 (comment)
.

@munificent
Copy link
Contributor

I'd lean toward restricting assignability to subtyping if we can ...

Yes yes yes.

but it does force more code changes (and bloat?) and we'll need to see how users
feel about that.

One option would to let APIs where we think downcasts are likely to be common annotate that fact and then allow them implicitly there. So assignment would in most cases use normal subtype tests. But, for example, querySelector() in "dart:html" could be defined like:

@implicitDowncast
Element querySelector(String selector) { ... }

and then calls to that have looser assignment compatibility rules.

@eernstg
Copy link

eernstg commented Apr 5, 2015

On Tue, Mar 31, 2015 at 6:01 PM, vsmenon notifications@github.com wrote:

On Tue, Mar 31, 2015 at 8:52 AM, Erik Ernst notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:27 PM, vsmenon notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst <notifications@github.com
wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen <
notifications@github.com> wrote:
[..]

[..do you..] restrict assignability to subtyping?

We're still experimenting on this. When we don't restrict assignability,
we insert a runtime check when necessary (effectively the checked mode
check).

I think it would be nice to be able to say "the behavior of your compiled
program is exactly checked mode" or possibly ".. is checked mode plus extra
checks enforcing ". But if you change static
assignability checks to subtype checks, couldn't you use that to insert
fewer checks than those inserted by a standard checked mode compilation
(because they are statically guaranteed to succeed)?

Here's a case where we use different terminology: To me, Dart-checked-mode

and Dart-production-mode are different languages, because they give rise
to
different runtime semantics, even though the syntax is suspiciously
similar. ;)

I don't think our users see it this way. They develop in one and deploy in
(something like) the latter.

OK, my choice of words was a bit polemic. But still, I think it's
dangerous to have different semantics and think of them as "the same
language". If we have the situation where one execution terminates with a
checked-mode violation, and the other one runs in production mode and
survives for a while after that, we might very well have, say, stored some
inconsistent objects on disk or otherwise created problems for the future
because of the undetected violation. Code that ought to deal with that
type of situation would not exist, or it could be totally untested, because
the program was developed in checked mode and deployed in production mode.
Do we get enough extra performance in production mode to make that a good
trade-off?

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@eernstg
Copy link

eernstg commented Apr 5, 2015

On Wed, Apr 1, 2015 at 3:53 PM, vsmenon notifications@github.com wrote:

On Wed, Apr 1, 2015 at 12:55 AM, Erik Ernst notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 6:01 PM, vsmenon notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 8:52 AM, Erik Ernst <notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:27 PM, vsmenon <notifications@github.com
wrote:

On Tue, Mar 31, 2015 at 4:58 AM, Erik Ernst <
notifications@github.com> wrote:

On Fri, Mar 27, 2015 at 6:34 PM, Leaf Petersen <
notifications@github.com> wrote:

[..]
However, if it is possible to enforce that all the variants run in
lock-step for a while, and then only differ because some-but-not-all
variants terminate with a runtime error, then we can at least claim that
all the knowledge that we can establish about the program for one of the
variants (say, ECMA-408) will hold for them all as long as they are all
running. For that, we'd need errors that can't be caught. Maybe some kind
of "platform.exit(_)" would do?

Exactly. This is what we mean by "catastrophic" or "non-recoverable"
errors. In the browser, we can force the JS debugger open (the "debugger"
statement) and / or alter the UI of the running program. We can also make
these non-catchable in generated code.

Very nice! To me, the crucial point is that we shouldn't have to consider
different program behaviors, and allowing for "one of them stops earlier,
but they're in lockstep until then" is a good way to at least keep those
differences under control.

[..]

Here's a case where we use different terminology: To me,
Dart-checked-mode
and Dart-production-mode are different languages, because they give
rise
to
different runtime semantics, even though the syntax is suspiciously
similar. ;)

I don't think our users see it this way. They develop in one and deploy
in
(something like) the latter.

If they do exactly that then they are bound to experience different
behavior in deployed code than that which was tested and debugged; I
cannot
imagine that this would be good for business.

This is exactly how we've been telling our users to use Dart. The names
"checked" and "production" rather emphasize it. :-) "assert" statements
only run in checked mode.

As long as there are no checked mode violations, the production mode
execution would proceed identically. But we cannot assume that execution
"in the wild" will always just behave in ways that we have already tested
completely. Any new twist might have included a checked mode
violation---except that it remains undetected in production mode, and any
inconsistencies created at this point may proliferate arbitrarily in the
runtime state as long as we don't get an outright message-not-understood
error.

I don't think this is as bad as you seem to think. There is certainly

plenty of precedent for debug mode and release mode.

Right, many languages and tools allow you to add a bunch of extra (e.g.,
#ifdef controlled) checks with a suitably configured DEBUG compilation.
This setup does probably rely on the assumption that "if all tests run
successfully then no execution can ever violate any of these checks" (even
though that may be false). It is probably crucial that the checks are
omitted from the "production compilation", too, because the checks are
noisy (they might fill up the disk with log files etc.), and they add
time/space costs.

Getting used to it. ;-)

I guess my worries are focused on the notion of a "time bomb": If
production execution allows more invariants to be invalidated than
development/debugging execution then production execution may be sitting on
a time bomb in the sense that it produces a highly inconsistent (that is,
irrepairable) state. With a suitable "save the state at each step" policy,
the strictly controlled execution would have bailed out immediately when
there is a problem, and the user would just have to redo a tiny amount of
work, compared to the massive effort needed to restore the state from an
old backup when all recent versions are inconsistent.

That said, I think it's good motivation for favoring static

(mode-insensitive) enforcement whenever practical over runtime
(mode-sensitive) enforcement.

Yes. A proof is a very good reason for omitting a check. ;-)

E.g., to answer your earlier question, I'd
lean toward restricting assignability to subtyping if we can ... but it
does force more code changes (and bloat?) and we'll need to see how users
feel about that. Pragmatically, it might make more sense as a static
warning as opposed to an error.

Alternatively, satisfying subtype constraints in a program that is strictly
typed except for having a number of implicit downcasts (due to
assignability) could be a task for an algorithm.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@jmesserly
Copy link
Contributor

I lost track of this discussion, did we reach a conclusion? @vsmenon @leafpetersen

@eernstg
Copy link

eernstg commented Jun 19, 2015

On Thu, Jun 18, 2015 at 11:42 PM, John Messerly notifications@github.com
wrote:

I lost track of this discussion, did we reach a conclusion? @vsmenon
https://github.com/vsmenon @leafpetersen
https://github.com/leafpetersen

Me too! ;-)

But I think the brief version of the point that I was trying to make was
that there shouldn't be multiple similar languages named 'Dart', and one
way to protect ourselves against that would be if the following property
could be guaranteed:

Consider a program P that is compiled by a strictly spec conforming
compiler with no errors and no warnings (case 1). If P compiles with no
errors and no warnings under dev_compiler (case 2) then the compiled
programs will have identical runtime behavior initially. The only ways this
phase can terminate are the following: (a) both programs terminate at a
given point, or (b) the case 1 execution terminates with a runtime error
(maybe we can say 'runtime type error'), and the case 2 execution continues
to run.

Surely I forgot something, and this property would need to be adjusted
accordingly. But something similar to this property seems to give users
enough to rely on, such that they don't need to worry about incompatible
behaviors that matter, depending on whether they're running a given Dart
program in one or the other context.

best regards,

Erik Ernst - Google Danmark ApS
Skt Petri Passage 5, 2 sal, 1165 København K, Denmark
CVR no. 28866984

@jmesserly
Copy link
Contributor

merging this in to: dart-lang/sdk#25220

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

8 participants