Ordered quantification (foreach loops and sequence comprehensions - take 2) #10

robin-aws · 2022-05-12T20:35:05Z

(Second revision of the) design of the first step of dafny-lang/dafny#1753, and follow-up to #4. Replaces the first revision at #9

The key difference between this version and the old one is explicitly defining ordered quantification up front, and restricting the semantics to the first clauses of quantifier domains instead of inferring it from the range.

cc all the folks that provided such awesome feedback on the first version (turns out you can copy-paste this from the "participants" section of a PR's sidebar): @samuelgruetter @atomb @MikaelMayer @cpitclaudel @keyboardDrummer @RustanLeino. And while I'm at it why not @backtracking, @mschlaipfer, @parno, @amaurremi, @emina, and @seebees.

Edit: Revised to make <- required in ordered contexts (and drop the concept of natural ordering of types), and allow per quantified variable ranges. Also added motivation of idiomatic compilation.

cpitclaudel

Very nicely written proposal. I'm still concerned about the notion of ordering. In particular, the way the proposal is designed makes it so that in some cases you can't write seq x <- s | … and instead must write seq x | x in s && …. But in other cases, writing seq x <- s | … is possible and means something different from seq s | x in s && …:

seq x <- {1,2,3}     // rejected
seq x | x in {1,2,3} // ⇒ [1,2,3]

seq x <- [1,1,1]     // ⇒ [1,1,1]
seq x | x in [1,1,1] // ⇒ [1]

I would much prefer to ban sequence comprehensions without a <- clause.

cpitclaudel · 2022-05-12T23:07:14Z

0005-ordered-quantification.md

+    the bindings will be ordered according to the *natural ordering* of the type `T`. Not all Dafny types will
+    have such a natural ordering defined, but at a minimum `int` and `real`, and any subset type or newtype based on them,
+    will use the natural mathematical ordering.
+    `x: int | x in {2, 5, -4}`, for example, would bind `x` to `-4`, `2`, and `5` in that order.


In which case is the natural ordering needed? I'm surprised that we still need an ordering at all: for the example above we can write seq x: real <- sorted(mySet) | P(x), right? And for this one the libraries can provide a conversion function from sets?

Also, how will this work with types that don't have comparisons? I'm assuming that types that don't support == also don't support comparisons, and yet this feature will let me compare arbitrary values: var aLeB := (seq x | x in {a, b}) == [a, b]

Also, how will this work with types that don't have comparisons? I'm assuming that types that don't support == also don't support comparisons, and yet this feature will let me compare arbitrary values: var aLeB := (seq x | x in {a, b}) == [a, b]

This is a very good point - I've been assuming in all my examples that the types were equality-supporting to focus on defining the semantics. That means in some cases you can't even create a compiled {a, b} value in the first place. I'll call that out more explicitly and watch out for under-specified examples.

But yes, not all types will have natural orderings (any reference type for example) so some types will require <-.

Your proposal makes sense, but it has caveats.
x: nat says that x is in [0, 1, 2, 3, 4, ....] and then the range says how to filter this infinite range, so it's natural to think that the result be sorted if the underlying type "int" is sorted. In practice, we would compile to something different: We would infer what to enumerate x from (the range), and sort it. But this is a hidden performance cost based on heuristics.
We might as well not accept this "implicit sorting" for sequences, and immediately suggest that the enumeration be part of the x: nat syntax, and enable non-deterministic set orderings.

cpitclaudel · 2022-05-12T23:11:02Z

0005-ordered-quantification.md

+    The ordering of bindings is non-deterministic unless `C` is a sequence.
+    If `C` is a `multiset`, multiple bindings with the same value of `x` will be created, but their ordering is still non-deterministic.
+
+    The `<-` symbol would be read as "from", so a statement like `foreach x <- C { ... }` would be read as "for each x from C, ...". Note that `<-` is not an independent operator and is intentionally distinct from the `in` boolean operator.


recycling in might be fine?

My concern is only that the difference between x in C as a quantified variable declaration and x in C as a boolean expression will be pretty subtle, when C is a multiset or seq: if C is [1, 1, 1], for example, we'd have:

assert (seq x in C) == [1, 1, 1] assert (seq x | x in C) == [1]

That may be less of a concern if I drop the natural ordering idea that allows the second form, so we'll see how this plays out in the next revision. :)

cpitclaudel · 2022-05-12T23:11:42Z

0005-ordered-quantification.md

+3. `<CasePatternLocal> <- C`
+
+    This is a generalization of the previous case that supports pattern matching, as in variable declarations and match statements.
+    It allows destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`, and filtering, as in `Some(x) <- mySetOfOptionals`.


We don't have filtering assignments anywhere else, maybe this is best left to a later stage.

Completely agree it doesn't have to be in the first set of changes - I've got delayed until phase 4 in the implementation plan. I strongly want it eventually though, if nothing else to support quantifying over the key-value pairs in a map succinctly and efficiently, which is a very common pattern.

Destructuring lets are already supported; I'm not worried about them. It's filtering ones that we don't have.
So (x, y) <- [(1, 2)] is good; Some(x) <- [None] is what I'm worried about.

I see it as a natural extension of the semantics of match statements, which is like filtering as well. These two loops are equivalent, for example:

foreach Some(x) <- [None, Some(42), None] { print x; } foreach o <- [None, Some(42), None] { match o { case Some(x) => print x; } }

Yes, I love it and I want that feature; but I want it in more cases (including if var Some (x) := opt then …), and I think it distracts from the main aspect of this proposal without adding expressivity.

Well awesome then! :) I mainly wanted to enforce that there was a good option for the map iteration case, and it was a slippery slope to allow filtering as well.

I could make it yet another Future Possibility if that would make the proposal tighter?

Yes please, that sounds great :)

cpitclaudel · 2022-05-12T23:13:09Z

0005-ordered-quantification.md

+    It allows destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`, and filtering, as in `Some(x) <- mySetOfOptionals`.
+
+A single quantifier domain may include multiple such declarations separated by commas, 
+in which case the orderings described for each clause take lexicographic precedence. 


Does that mean that x <- [2, 1], y <- [4, 3] is the same as x <- [1, 2], y <- [3, 4]?

Nope, seq x <- [2, 1], y <- [4, 3] :: (x, y) would be [(2, 4), (2, 3), (1, 4), (1, 3)]. This was a confusing use of "lexicographic" and I'll rewrite this.

0005-ordered-quantification.md

cpitclaudel · 2022-05-12T23:17:01Z

0005-ordered-quantification.md

+The quantifier domain is allowed to be potentially infinite if the loop is used in a compiled context and an explicit `decreases` clause is provided.
+`decreases *` is permitted, in which the `foreach` loop may never terminate. Any other `decreases` clause can be provided
+to ensure the loop terminates even if the domain is potentially infinite. For example, the following (very slow) example collects
+five arbitrary primes (assuming that Dafny can figure out how to enumerate the `allPrimes` infinite set):


It would also need to know that it's not empty, right?

Worse, it would have to know there are at least five elements. Fortunately the code is sound, but my statement about how it behaves is wrong - it should say that it collects AT MOST five arbitrary primes. I'll fix that.

cpitclaudel · 2022-05-12T23:19:27Z

0005-ordered-quantification.md

+
+Since sequence comprehensions are expressions rather than statements, they carry an additional restriction
+when used in functions: their ordering must be fully deterministic. If `s` is a `set<int>`, for example,
+`seq x <- s` would be rejected in specification contexts, whereas `seq x | x in s` would be allowed.


Why have this distinction?

cpitclaudel · 2022-05-12T23:20:26Z

0005-ordered-quantification.md

+As mentioned in the guide-level explanation, `foreach` loops and sequence comprehensions are both able to
+borrow concepts and implementation substantially from other features. Parsing, resolving, verifying, and compiling
+quantifier domains is already a mature aspect of the Dafny implementation. The most significant implementation burden
+is ensuring that enumeration ordering is threaded through, and deterministic in contexts where it needs to be. 


In which contexts does it not need to be deterministic?

cpitclaudel · 2022-05-12T23:27:21Z

0005-ordered-quantification.md

+The proposed sequence comprehension expression allows more
+logic that ultimately produces a sequence of values to be expressed as a value rather than a statement.
+Just producing a sequence of the values in the set above, sorted by the natural ordering of `int` values,
+is simple using a sequence comprehension expression: `seq i: int | i in s`.


I'm not sure that it's worth having this, since users can simply write SetToSeq(s). And, once you remove this (keeping only <-), I think you don't need the notion of quantification ordering at all? I expanded on this below.

cpitclaudel · 2022-05-12T23:28:14Z

0005-ordered-quantification.md

+and I will label the `x: nat` and `y: nat` sections as *quantifier variable declarations*.
+We can augment this existing syntax with a notion of ordering, and allow quantified variables to bind to duplicate values. The key points are:
+
+* Any quantifier domain defines a potentially-infinite, partially-ordered set of *quantifier bindings*.


I'm not sure you can order all types, see below

Co-authored-by: Clément Pit-Claudel <cpitclaudel@users.noreply.github.com>

robin-aws · 2022-05-13T17:00:37Z

Very nicely written proposal. I'm still concerned about the notion of ordering. In particular, the way the proposal is designed makes it so that in some cases you can't write seq x <- s | … and instead must write seq x | x in s && …. But in other cases, writing seq x <- s | … is possible and means something different from seq s | x in s && …:
seq x <- {1,2,3}     // rejected
seq x | x in {1,2,3} // ⇒ [1,2,3]

seq x <- [1,1,1]     // ⇒ [1,1,1]
seq x | x in [1,1,1] // ⇒ [1]

Shoot, I've incorrectly stated in a few places that the ordering has to be deterministic in specification contexts, where I should have said when used in a compiled expression. It's the same reason that let-such-that expressions can only be compiled if they have unique solutions, so that they behave deterministically at runtime. I'll fix that and also call this out more explicitly where relevant.

Independently of that, though, I don't think this kind of restriction is unreasonable, since there's already plenty of precedent in Dafny.

I would much prefer to ban sequence comprehensions without a <- clause.

The main reason I want to allow omitting <- clauses is to at least support ordered quantification over integers, so that you can say seq i | 0 <= i < 10 :: ..., without having to explicitly actually create the [0, 1, ..., 10] value, which will be a performance hit at runtime unless we're far more clever than I'd like :). Besides the fact that it is useful in several motivating examples, that allows us to eventually replace the existing less consistent seq(10, ...) sequence construction feature. It also then provides a nice, succinct way to refer to a sequence of the values in an unordered collection in specifications. I do agree that part could be provided as a standard library function by method though.

Would you be comfortable with the incremental implementation plan I'm proposing, where the first few iterations would not support plain quantifier variables without <- clauses anyway?

cpitclaudel · 2022-05-13T17:45:44Z

The main reason I want to allow omitting <- clauses is to at least support ordered quantification over integers, so that you can say seq i | 0 <= i < 10 :: ..., without having to explicitly actually create the [0, 1, ..., 10] value, which will be a performance hit at runtime unless we're far more clever than I'd like :)

I don't think we would have to be very clever. If we had syntax for that sequence (say 1..10, then we could recognize it easily at compile time).

Would you be comfortable with the incremental implementation plan I'm proposing, where the first few iterations would not support plain quantifier variables without <- clauses anyway?

But step 1 claims that there is a need for ordered quantification, and I don't understand that part.

…fcs into ordered-quantification

…ibilities

MikaelMayer

Great job figuring out that the core of everything you are suggesting is the "ordered" enumeration part. A few comments there and there.

MikaelMayer · 2022-05-16T14:05:49Z

0005-ordered-quantification.md

+// After:
+method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) {
+  result := [];
+  foreach left <- s, right <- s {


I'm going to look stubborn with pointing back at your original idea 😁 , but anyway, what about the syntax:

foreach left <- s && right <- s { }

? Other booleans operators like || or ! would not be permitted except possibly in ghost contexts.

With this generalization, you could easily add conditions, such as:
left <- s && right <- s && left < right
or
left <- s && left % 2 == 0 && right <- s && right % 2 == 0
or
left <- s && left !in other && right <- s
I know sometimes we use the word "heuristic" to describe how to "infer" some knowledge, but here it would be plainly deterministic, straightforward and well documented.
it would even work great for integers, as soon as their range is next to their declaration:
left: int && 0 <= left < max && ...
or you could reuse the current for syntax:
left <- 0 to max

Alternatively, if you really prefer the comma, you could enable both enumeration and variable declaration at the same time.

left <- s, right <- s, left < right
or
left <- s, left % 2 == 0, right <- s, right % 2 == 0
or
left <- s, left !in other, right <- s

As a quick TL;DR as I revise the proposal, your last version is closest to what I'm thinking now. Those examples would be as follows instead:

left <- s, right <- s | left < right
or
left <- s | left % 2 == 0, right <- s | right % 2 == 0
or
left <- s | left !in other, right <- s

i.e. I'm keeping the hard distinction of x <- c not being a boolean operator, but a syntax attached directly to declaring the quantified variable. But each quantified variable can have an optional range attached with |. I realized that's still backwards compatible, because existing x1, x2, ..., xN | <Range> quantifier domains will just be reinterpreted as having a range attached to the final xN variable.

Very interesting.

A note: I like the left <- 0 to max syntax, if we would ever have a use for it. In fact, I like it better than left <- 0 .. max. If we were to introduce this at some point, then left <- max downto 0 is also a natural possibility, since for for loop supports similar syntax.

Agreed, it's nice that it wouldn't clash with the existing s[i..j] syntax, and that there is already precedent for how to handle the reverse order.

If we do that, is it really worth continuing to support both for x := 0 to 10 and foreach x <- 0 to 10?

MikaelMayer · 2022-05-16T14:20:34Z

0005-ordered-quantification.md

+    the bindings will be ordered according to the *natural ordering* of the type `T`. Not all Dafny types will
+    have such a natural ordering defined, but at a minimum `int` and `real`, and any subset type or newtype based on them,
+    will use the natural mathematical ordering.
+    `x: int | x in {2, 5, -4}`, for example, would bind `x` to `-4`, `2`, and `5` in that order.


Your proposal makes sense, but it has caveats.
x: nat says that x is in [0, 1, 2, 3, 4, ....] and then the range says how to filter this infinite range, so it's natural to think that the result be sorted if the underlying type "int" is sorted. In practice, we would compile to something different: We would infer what to enumerate x from (the range), and sort it. But this is a hidden performance cost based on heuristics.
We might as well not accept this "implicit sorting" for sequences, and immediately suggest that the enumeration be part of the x: nat syntax, and enable non-deterministic set orderings.

MikaelMayer · 2022-05-16T14:22:10Z

0005-ordered-quantification.md

+
+```dafny
+var myDoubleDeckerSeq: seq<seq<int>> := ...;
+foreach x <- myDoubleDeckerSeq, y <- x | y != 0 {


What if you want to filter on x before you extract y from x? What would be the syntax like? Can you write down a proposal?

See latest commit

MikaelMayer · 2022-05-16T14:25:37Z

0005-ordered-quantification.md

+```dafny
+// Filtering to optional values
+var s: seq<Option<T>> := ...;
+var filtered: seq<T> := seq o <- s | o.Some? :: o.value;


This reminds me that I would love to be able to create a comprehension that would create an Option<T> by "enumerating other options", like Scala enables. But for now, we have the elephant operator, which It think is strictly better.
Yet, I love your proposal of collect at the end of this PR.

Yup, considering failure-compatible types to be collection values compatible with <- is another future possibility. But I realized that supporting pattern matching makes that unnecessary because you can just write Some(x) <- optionals instead.

MikaelMayer · 2022-05-16T14:26:36Z

0005-ordered-quantification.md

+Note that the existing `seq(size, i => i + 1)` syntax is inconsistently-named in the Dafny reference manual, 
+but we refer to it here as a "sequence construction" expression, to disambiguate it from the proposed sequence comprehension feature.
+Sequence comprehensions are strictly more expressive, as any construction `seq(size, f)` can be rewritten as 
+`seq i | 0 <= i < size :: f(i)`. They also avoid the common trap of forgetting to explicitly attach the `requires 0 <= i < size`


I love this remark about the requires.

MikaelMayer · 2022-05-16T14:34:44Z

0005-ordered-quantification.md

+programmers how their code will behave at runtime, and hence these new features come with
+the same downside. Dafny is free to implement an expression like `seq x <- a | x in b` by either enumerating the values
+of `a` and skipping those not in `b`, or by sorting the elements of `b` according to the ordering in `a`. This flexibility
+is excellent for future optimizations, but makes it harder to reason about runtime performance.


+1 about reasoning about runtime performance.
I really read x <- a | x in b as "enumerate values in a, and skip those not in b". Sorting the elements of b according to the ordering in a and skipping the elements of a should be something explicit, such as x <- sorted(b, a) | x in a.

Yeah, I agree.

MikaelMayer · 2022-05-16T14:38:14Z

0005-ordered-quantification.md

+no additional filtering, and with `C` directly providing the enumeration and not just the ordering of values.
+This is certainly a well-formed and sound construct, but far less powerful than the proposed version, especially
+when sequence comprehensions are excluded. Offering `foreach` loops along with sequence comprehensions means we can 
+recommend the latter where possible, with the former as a fallback that is inevitably more complicated to verify.


Could you please discuss my alternative at least here?
The one that says "add '<-' as a synonym of 'in' in ranges, but such that they would be used for specifying the enumeration (i.e. no more conflicting heuristics needed)

Definitely! I might as well summarize the best version of the first RFC here. :) I'm also adding Clement's version where <- is required everywhere.

ACTUALLY @cpitclaudel has given me the kernel of an idea that allows us to chain filtering as you are asking for, while preserving the purity of boolean expressions, and still turns out to be backwards compatible with existing quantifier domains! I am tweaking the proposal now and will push a revision to see if you like it as much as I do. :)

You got me very curious now.

MikaelMayer · 2022-05-16T14:43:09Z

0005-ordered-quantification.md

+
+Along similar lines, will users frequently want to customize the ordering of their quantifications without
+creating explicit collection values? To produce the sequence `[4, 3, 2, 1, 0]`, for example, 
+we could consider supporting something like `seq i by > | 0 <= i < 5`.


I think sequences are both increasing and decreasing in everyday's usage, so having a way to support both would be better.
Alternatively to your suggestion:

seq i: int | 0 <= i < 5 :: 4 - i would not require any change

seq i: int | 5 > i >= 4 :: i might seem a bit WTF.

seq i: int | i <- 4 downto 0 :: i to constrain the notion of ordering not in the sequence comprehension, but in the original iterable sequence.

seq i: int | 0 <= i < 5 :: 4 - i would not require any change

Yup, that's a nice application of the generic comprehension syntax!

seq i: int | 5 > i >= 4 :: i might seem a bit WTF.

More than just a bit WTF methinks :D That's another case where I don't think we should mess with deeply established semantics of boolean operators. Plus I know @RustanLeino prefers the idiom of always using < and <= so values are ordered naturally left-to-right in expressions, and I tend to like that too.

seq i: int | i <- 4 downto 0 :: i

THAT'S not a bad idea, since there's precedent in for i := 4 downto 0 { .. } loops...

MikaelMayer · 2022-05-16T14:46:27Z

0005-ordered-quantification.md

+
+Since sequence comprehensions are expressions rather than statements, they carry an additional restriction
+when compiled: their ordering must be fully deterministic. If `s` is a `set<int>`, for example,
+`seq x <- s` would be rejected in compiled code, whereas `seq x | x in s` would be allowed.


I'm not sure I would buy into that. Deterministic means that the result will always be the same if the same set is given. The compilers should allow traversing a set in a deterministic order if they are equal anyway, even if it just means comparing references of non-comparable types.
So seq x <- s could be accepted in compiled code.

I suspect that, most of the time, however, users will not want to convert sets to sequences, but always use sequences, and use set in ghost contexts.

I'm not sure I would buy into that. Deterministic means that the result will always be the same if the same set is given. The compilers should allow traversing a set in a deterministic order if they are equal anyway, even if it just means comparing references of non-comparable types. So seq x <- s could be accepted in compiled code.

This can be done in some cases but definitely makes the runtime more complicated: most HashSet implementations might give you a different iteration order for {1, 2, 3} vs. {3, 2, 1}. I tried to explain how that could be addressed in many cases with a more specialized implementation in this section in the original RFC, but didn't go back to clean that up since I've dropped the requirement in this version. I can take another stab at it here if it would help though.

I suspect that, most of the time, however, users will not want to convert sets to sequences, but always use sequences, and use set in ghost contexts.

Well now that it's public (🎉!), allow me to point to the natively-implemented ComputeSetToOrderedSequence method that the ESDK uses in several places to deterministically serialize the keys of a map!

MikaelMayer · 2022-05-16T14:47:11Z

0005-ordered-quantification.md

+collect(*) x: T | P(x) :: Q(x)          == (product)
+collect(<) x: T | P(x) :: Q(x)          == (minimum)
+collect(Averager) x: T | P(x) :: Q(x)   == (average)
+collect(First(n)) x: T | P(x) :: Q(x)   == Take(seq x: T | P(x) :: Q(x), n)


Wow this blows my mind I love it !

atomb · 2022-05-23T23:35:24Z

0005-ordered-quantification.md

+1. `x [: T]`
+
+    In the existing syntax for quantifier variables (where the type `T` may be explicitly provided or omitted and inferred),
+    the bindings will be ordered according to the *natural ordering* of the type `T`. Not all Dafny types will


My immediate reaction to the difference between in and <- is to think of in as more "mathematical" and <- as more "implementation-focused" (given how those symbols have been used in other contexts before). Because of that, it seems more natural to me to have <- impose more ordering, and in impose less. What if in produced elements in a non-deterministic order, with no duplicates, and <- always used a collection-defined ordering with the potential for duplicates? That would eliminate the need for "natural ordering" altogether.

Have a look at the latest commit, I went with almost exactly that option. :)

atomb · 2022-05-23T23:56:57Z

0005-ordered-quantification.md

+
+This mirrors the `Collector` concept from the Java streaming APIs, and ideally the shape of the
+parameter to `collect` expressions would define similar operations for merging intermediate results
+to leave the door open for parallel or even distributed computation.


Would collect require a "zero" or "identity" element? Most such formalisms do, I think (such as Monoid in Haskell). In Dafny, we could avoid it by requiring T to be non-empty, though.

I had that thought too. Formalizing the parameter to collect definitely needs a lot more thought and I didn't attempt to make this too concrete yet - it might work better to parameterize with a type parameter that implements a particular "type characteristic".

robin-aws · 2022-05-25T05:49:25Z

The main reason I want to allow omitting <- clauses is to at least support ordered quantification over integers, so that you can say seq i | 0 <= i < 10 :: ..., without having to explicitly actually create the [0, 1, ..., 10] value, which will be a performance hit at runtime unless we're far more clever than I'd like :)

I don't think we would have to be very clever. If we had syntax for that sequence (say 1..10, then we could recognize it easily at compile time).

This is a fair point (and I'd considered the option of some kind of interval sequence syntax in earlier versions too). I decided I'm fine with replacing sequence constructions being out of scope for now, and requiring <- in ordered quantification.

Would you be comfortable with the incremental implementation plan I'm proposing, where the first few iterations would not support plain quantifier variables without <- clauses anyway?

But step 1 claims that there is a need for ordered quantification, and I don't understand that part.

In the latest version I've dropped the idea of a natural ordering for types, but still kept the idea of ordered quantification. Let me know if it resolves your concerns!

MikaelMayer · 2022-05-26T20:16:11Z

0005-ordered-quantification.md


-2. `x [: T] <- C` 
+Assuming that we have `posNat == iset n: nat | 0 <= n`, the set comprehension above could also then be expressed as:


Aren't subset types just iset? :-)

True! In general the quantification x: T is equivalent to x <- (iset x: T). Since Dafny supports potentially infinite collections, types and collections are very similar. But collections are strictly more expressive since we don't have a fully dependent type system; you can't express "sequences of length n" as a subset type if n is a dynamic value, but you can declare a dynamic, local iset collection for that. Part of the reason I was fine with dropping the notion of default orderings of types is that we could still support infinite ordered collections in the future, and hence achieve the same thing with x <- (collection of all T's ordered by >).

This may not be the strongest possible example to motivate quantifier variable domains, since it could be done with a subset type instead, and in fact I'd prefer that style in this case. A more complicated example where the domain depended on dynamic values and couldn't be a subset type would be better.

MikaelMayer · 2022-05-26T20:27:44Z

0005-ordered-quantification.md

@@ -360,16 +383,10 @@ assert (seq x <- b | x in a) == [3, 1];

 Since sequence comprehensions are expressions rather than statements, they carry an additional restriction
 when compiled: their ordering must be fully deterministic. If `s` is a `set<int>`, for example,
-`seq x <- s` would be rejected in compiled code, whereas `seq x | x in s` would be allowed.
+`seq x <- s` would be rejected in compiled code.


Suggested change

`seq x <- s` would be rejected in compiled code.

`seq x <- s` would be rejected in compiled code if the compiler does not have a way to enumerate `s` in a deterministic way.

?

Very good point, I had a lot more detail on this in the first RFC that is still valid. Let me pull some of it in here.

MikaelMayer · 2022-05-26T20:31:07Z

0005-ordered-quantification.md


-1. Membership: `forall x :: P(x) <==> x in S`
+1. Membership: `forall x :: x in C && P(x) ==> Q(x) in S`
 2. Cardinality: `|S| <= |C|`
 3. Ordering: `forall i, j | 0 <= i < j < |S| :: 
      exists i', j' | 0 <= i' < j' < |C| :: 


Can you strengthen this by adding

Suggested change

exists i', j' | 0 <= i' < j' < |C| ::

exists i', j' | 0 <= i' < j' < |C| && i' <= i && j' <= j ::

MikaelMayer · 2022-05-26T20:39:01Z

0005-ordered-quantification.md

 2. Cardinality: `|S| <= |C|`
 3. Ordering: `forall i, j | 0 <= i < j < |S| :: 
      exists i', j' | 0 <= i' < j' < |C| :: 
-        C[i'] == S[i] && C[j'] == S[j]`
+        Q(C[i']) == S[i] && Q(C[j']) == S[j]`


Could you have a new construct that says:

countIf(x <- C, P(x))

that is always between |0| and |C|, that increases with C,
that counts the number of P(x) that hold for all the x in C ?
That way, you wouldn't need an existential inside the forall:

Cardinality: `|S| == countIf(x <- C, Q(x))` Ordering: forall i < Q(C[i']) == S[i] forall i | 0 <= i < |S| :: var i' := countIf(x <- C[0..i], P(x)); Q(C[i']) == S[i]

Your axioms could be deduced from that.

Possibly! I only wanted to propose an encoding that seemed feasible enough, I don't claim to have found the optimal encoding yet nor am I trying to nail that before starting on implementation.

MikaelMayer · 2022-05-26T20:43:10Z

0005-ordered-quantification.md

+## Range expressions
+
+This would interpret the existing `i..j` syntax, 
+currently only usable in sequence or array indexing expressions such as `s[i..j]`,


Do you envision it would be possible to say something like this:

var x := i..j; s[x]

and obtain a slice of s?
I personally was thinking that i..j would be available only in [i..j] or after an àrrow <- i..j.
I do love this syntax, but what about reverse iteration?

Yes I was thinking it would be possible, just so we're not introducing a new type of value or special-casing a lot. I also like the syntax but have no good suggestion for the reverse order, and that's one reason this idea is safely quarantined in Future Possibilities for now. :)

RustanLeino

I like it!

The one thing I'm not clear about is what happens with sequence comprehensions when the s in x <- s is a set. Is such a sequence comprehension allowed only in ghost contexts? Or is there something that, at run time, will figure out a deterministic ordering?

More generally, what types of e are allowed in x <- e?

I think the current proposal does not give a way to deterministically step through the elements of a set. Is that right? (I'm not saying this is necessary. Just checking my understanding.)

RustanLeino · 2022-05-26T23:19:47Z

0005-ordered-quantification.md

+```dafny
+// Before:
+method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) {
+  for i := 0 to |s| {


Another whoops: missing initialization of result.

To which I say "whoops" :)

I legitimately toyed with figuring out a way for this repo to test code snippets as the main dafny repo is.

RustanLeino · 2022-05-26T23:22:03Z

0005-ordered-quantification.md

+var ss := s;
+while ss != {}
+{
+  var i: int :| i in ss;


The : int can be omitted, since it will be inferred.

RustanLeino · 2022-05-26T23:27:04Z

0005-ordered-quantification.md

+// After:
+method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) {
+  result := [];
+  foreach left <- s, right <- s {


A note: I like the left <- 0 to max syntax, if we would ever have a use for it. In fact, I like it better than left <- 0 .. max. If we were to introduce this at some point, then left <- max downto 0 is also a natural possibility, since for for loop supports similar syntax.

RustanLeino · 2022-05-26T23:36:49Z

0005-ordered-quantification.md

+If `C` is a `map`, the bound values will be the keys of the `map`, in order to be consistent with the meaning of `x in C`;
+`map.Items` can be used instead to bind key-value pairs.


Good to keep the two consistent. But I have also thought that x in m ought to mean x in m.Items, rather than x in m.Keys. Maybe we'll change both in m and <- m in the future.

RustanLeino · 2022-05-26T23:39:34Z

0005-ordered-quantification.md

+If `C` is a `map`, the bound values will be the keys of the `map`, in order to be consistent with the meaning of `x in C`;
+`map.Items` can be used instead to bind key-value pairs.
+
+Assuming that we have `posNat == iset n: nat | 0 <= n`, the set comprehension above could also then be expressed as:


For the mathematics involved in this example, I think you intended 0 < n.

RustanLeino · 2022-05-27T01:20:51Z

0005-ordered-quantification.md

+This would allow destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`.
+It could also support filtering, as in `Some(x) <- mySetOfOptionals`.


I like this and I think we should do it (after the more basic functionality has been implemented and loved). Just a note: it will complicate the lookahead disambiguation that will need to be written for Coco (see my comments regarding the "," above).

I very much like this, too.

RustanLeino · 2022-05-27T01:21:58Z

0005-ordered-quantification.md

+the necessary interface. This is also likely the best way to provide a better user experience
+for looping or quantifying over the values produced by an `iterator` type.


RustanLeino · 2022-05-27T01:24:13Z

0005-ordered-quantification.md

+## Range expressions
+
+This would interpret the existing `i..j` syntax, 
+currently only usable in sequence or array indexing expressions such as `s[i..j]`,


Unless range expressions would be allowed everywhere and have their own type (which I resist at the moment), then ranges expressions would be allowed only to the right of <-. If so, I like the syntax @MikaelMayer suggested above, namely x <- i to j, which looks like in a for loop.

RustanLeino · 2022-05-27T01:26:51Z

0005-ordered-quantification.md

+There would have similar semantics to sequence comprehensions, 
+where the multiplicity of results is significant but not the ordering.
+This feature would be very cheap to implement once sequence comprehensions are implemented.


RustanLeino · 2022-05-27T01:40:25Z

0005-ordered-quantification.md

+The translation of `foreach` loops should be reducible to sequence comprehensions
+and other existing features for loops in general,


Sequence comprehensions are more difficult to formalize for the verifier than foreach loops, so I don't suggest formalizing foreach in terms of sequence comprehensions. I'm thinking that the verifier would handle foreach loops like this:

Check the loop invariant initially (as usual for a loop).

Inside the loop, play havoc with the loop assignment targets (as usual for a loop) and the bound variables (as is also done in the for loop). Then, assume the loop invariant and assume the condition that you get by replacing every <- with an in in the foreach loop header. Then, check the body and check that the body maintains the loop invariant.

After the loop, assume the loop invariant (as usual for a loop). Here's what I don't know what to do with: one also needs to assume some condition that's analogous to "the negation of the loop guard". I don't know what that condition is here.

Yes, I thought translating to sequence comprehensions was appealing because it at least solves the last point: the invariants reduce to those of for loops, so the loop guard is just an expression of completing the enumeration. I may still try this encoding first, especially since I plan to have sequence comprehensions committed and stable before moving to foreach loops.

keyboardDrummer

Excellent writing and exciting proposal!

keyboardDrummer · 2022-05-27T12:14:31Z

0005-ordered-quantification.md

+}
+```
+
+The feature supports much more sophisticated uses as well, however,


I'm wary of adding syntax for features whose use-cases are also covered by multiple foreach loops and higher-order method calls.

I think it's important that if we add support for the 'sophisticated uses' syntax, that it's exactly the same syntax as used in comprehensions so it doesn't require any additional documentation or implementation.

This is one of the big motivating styles: https://github.com/dafny-lang/rfcs/pull/10/files#diff-e3e01b715e6d0ac6dce19064a36b244c495633e9760a43233eaf22cdb18849f8R376 Supporting multiple quantified variables in one expression is necessary to support that flattening pattern, as you get a different value if you use nested comprehensions instead.

Totally agree that keeping the syntax and semantics consistent is very important. Hence the strategy of extending the existing quantification syntax slightly in a way that makes sense everywhere, but then also enables the new features.

I was commenting only on providing sophisticated syntax for the foreach statement. For comprehensions I think it opens up new use-cases so that syntax is warranted.

For foreach I think the stronger argument is consistency (given all other uses of quantification allow multiple quantified variables, and it won't be very hard to implement).

If it would be so inconsistent not to do it, that doing it reduces the documentation and implementation we need, then of course let's do it!!!

keyboardDrummer · 2022-05-27T12:29:40Z

0005-ordered-quantification.md

+```
+
+Since sequence comprehensions are expressions rather than statements, they carry an additional restriction
+when compiled: their ordering must be fully deterministic. If `s` is a `set<int>`, for example,


Are you sure the ordering must only be deterministic when compiled? Wouldn't it be difficult to ensure (seq x <- someSet) == (seq x <- someSet) at the verification level as well?

This is easy to ensure at the verification level. In fact, it difficult NOT to ensure it, because expressions in math are deterministic. This is the reason that the compiler also must make sure to do this consistently.

I do think this is SUCH a common point of confusion, it needs to be taught and documented more prominently. We already have this problem with :| and it surprises even advanced Dafny users.

Still, I wonder whether it wouldn't be better to disallow seq x <- someSet in all contexts. What would be the use-case of doing this in a ghost context?

You definitely need some efficient way of accessing the values in a set in sequence at runtime, but it might be enough to only support foreach. Supporting collections other than sequences as quantified variable domains is not until step 4 of the implementation plan, so I'm definitely happy to hold off until we feel there's a need for it.

keyboardDrummer · 2022-05-27T13:07:13Z

0005-ordered-quantification.md

+
+This RFC proposes adding two closely-related new features to Dafny for working with collections and iteration:
+
+1. `foreach` loops over any enumerable domain, initially only supporting the builtin collection types but leaving the door open for user-defined collections.


Have we considered introducing the concept of an iterator/enumerator type prior to introducing a foreach? The iterator/enumerator would be similar to that in Java/C#, and you could use it to do something similar to a foreach:

method ForEach<T>(hasEnumerator: HasEnumerator<T>) { var enum: BoundedEnumerator<T> := hasEnumerator.GetEnumerator(); while(enum.HasNext()) decreases enum.MaxLeft() { var iterationVariable := enum.MoveNext(); // do something useful } } trait HasEnumerator<T> { method GetEnumerator() returns (result: BoundedEnumerator<T>) } trait BoundedEnumerator<T> { function MaxLeft(): int function method HasNext(): bool method MoveNext() returns (value: T) requires HasNext() ensures MaxLeft() < old(MaxLeft()) }

Building the foreach concept on top of such an iterator would mean we immediately enable user-defined collections to work with foreach. More importantly, starting out with simple fundamentals and building higher-level language constructs on top of those might mean we avoid having to backtrack in our implementation in the future.

Can you provide a sketch for how you plan to support foreach for user-defined collections?

Similarly, for supporting user-defined collections to be used in the quantified variable domain of a sequence comprehension, I expect we would need to define a trait Ordered that can be used like this:

function SequenceComprehension<T>(domain: Ordered<T>): seq<T> { domain.Fold([], (value: T, result: seq<T>) => [value] + result) } trait Ordered<T> { ghost var values: seq<T> function method Fold<Result>(initial: Result, combine: (T, Result) -> Result): Result ensures Fold(initial, combine) == FoldSpec(values, initial, combine) function FoldSpec<Result>(values: seq<T>, initial: Result, combine: (T, Result) -> Result): Result { if |values| == 0 then initial else FoldSpec(values[1..], combine(values[0], initial), combine) } }

And for set comprehensions we'd need a trait Unordered:

function SetComprehension<T>(domain: Unordered<T>): set<T> reads * { var sets := domain.Select<set<T>>(x => {x}); var combine := (a,b) => a + b; assert forall x, y | x in sets.values && y in sets.values :: combine(x,y) == combine(y,x); sets.Aggregate({}, combine) } trait Unordered<T(==)> { ghost var values: multiset<T>; function method Select<U>(f: T -> U): Unordered<U> function method Aggregate(zero: T, aggregate: (T, T) -> T): T reads this requires forall x, y | x in values && y in values :: aggregate(x,y) == aggregate(y,x) ensures Aggregate(zero, aggregate) == AggregateSpec(values, zero, aggregate) function AggregateSpec(values: multiset<T>, zero: T, aggregate: (T, T) -> T): T requires forall x, y | x in values && y in values :: aggregate(x,y) == aggregate(y,x) { if |values| == 0 then zero else (var x: T :| x in values; AggregateSpec(values - multiset{x}, aggregate(x, zero), aggregate)) } } function FailedSeqComprehension<T>(domain: Unordered<T>): seq<T> reads * { var seqs := domain.Select<seq<T>>(x => [x]); var combine := (a,b) => a + b; //. Error: assertion might not hold assert forall x, y | x in seqs.values && y in seqs.values :: combine(x,y) == combine(y,x); seqs.Aggregate([], combine) }

Have we considered introducing the concept of an iterator/enumerator type prior to introducing a foreach?

Oh absolutely, and dafny-lang/libraries#37 has a snapshot of that work to date. It looks similar to what you sketched out with an Enumerator<T> trait, and I'd expect an Enumerable<T> trait with a GetEnumerator() method. That would be the bare minimum to allow more values to be used as quantified variable domains.

There are a number of factors that make that approach much harder to implement and less beneficial to users though, at least in the current state of Dafny:

Your // do something useful comment in ForEach skips over a big gap. :) Lambda expressions in Dafny can't have any side-effects, so you can't make a call like ForEach(myCollection, x => { print x; } work. That means you have to declare a general-purpose "Action" trait and an explicit class for every use case, e.g. https://github.com/aws/aws-encryption-sdk-dafny/blob/c8be143088490301d486958ad126d53c46ec1311/src/AwsCryptographicMaterialProviders/Keyrings/AwsKms/AwsKmsKeyring.dfy#L599-L600. It's a massive pain and an order of magnitude harder to use than a built in foreach construct.

I found that attaching accurate enough termination metrics and invariants to the various Enumerator<T> implementations required creating a combinatorial explosion of concrete classes. I'm not against doing that in the standard library if necessary (and possibly leaning on code generation to make it easier/more maintainable), but I'd like to explore solving the more fundamental limitation first.

Consuming such enumerator types using a particular idiom of while loops (e.g. https://github.com/dafny-lang/libraries/pull/37/files#diff-1927bc3853704ac5f502a37509d998e3ac29f7fbe8c4e8c479810b6a02ca4c3bR15-R28) still takes a lot of boilerplate, and requires you to then prove your enumerator's Repr is disjoint from any other objects in your code.

Dafny also doesn't support having "default" implementations of methods in a trait that can be overridden in classes. That means we can't decide the Enumerator<T> interface should really have a Contains(x) method (for example) in the future without breaking existing code. Supporting default implementations was necessary in Java 8 so they could add the streams API to existing interfaces like Collection without breaking existing implementations.

Modelling the enumeration constraints as a trait is also limiting, since only reference types can currently implement them. The alternative is introducing yet more type characteristics similar to "failure-compatible" (and more complicated ones at that), and I'd rather avoid that and wait until Non-reference traits dafny#1226 is addressed.

The appealing thing about the features in this RFC is that they fully abstract away from the details of enumerating at runtime. I absolutely plan for us to support user-defined collections in the future, but supporting ordered quantification over just the built-in collection types provides a ton of value while keeping the door open for the future. I've tried very hard to keep the semantics as general and flexible as possible, to reduce the risk you point out of finding that they interfere with how we'd like to define lower-level features that can implement them, but please let me know if you're concerned about anything specific.

Note that the aggregation/Fold functionality you also sketch out is out of scope for now. I suspect it will also work out better as a more built in concept as the "collect" future possibility sketches out, for a lot of the same reasons - aggregation is essentially the dual of enumeration!

Okay, but how do you plan to support foreach for user-defined collections?

If that means introducing Enumerable<T> and rebuilding foreach on top of it as syntax sugar for while based approach, wouldn't that mean we lose some of the effort that we spend now on a first-class foreach implementation?

The initial foreach implementation will translate pretty directly to the existing BoundedPool / SinglePassCompiler.Create[Guarded]ForeachLoop/CompileGuardedLoops implementations already used to compile features such as set comprehensions and assign-such-that statements, so the implementation cost beyond the language changes is actually quite cheap.

If we end up replacing that code with a more pure-Dafny implementation of builtin collections in the future, very little effort will be wasted. And it's likely sequence comprehensions will be useful in in the specification of such code!

Okay, sounds good :-)

robin-aws

I like it!

The one thing I'm not clear about is what happens with sequence comprehensions when the s in x <- s is a set. Is such a sequence comprehension allowed only in ghost contexts? Or is there something that, at run time, will figure out a deterministic ordering?

Only allowed in ghost contexts, since the current runtimes can't provide a deterministic ordering. I added a bit of content here about that point: https://github.com/dafny-lang/rfcs/pull/10/files#diff-e3e01b715e6d0ac6dce19064a36b244c495633e9760a43233eaf22cdb18849f8R391

More generally, what types of e are allowed in x <- e?

Any "collection" type, or alternatively any value for which the expression x in e is valid.

I think the current proposal does not give a way to deterministically step through the elements of a set. Is that right? (I'm not saying this is necessary. Just checking my understanding.)

That's correct. There are a number of options for providing that feature outside the scope of this RFC:

A standard library function by method can use seq x <- s in the method implementation and sort the result.
The runtime representation of sets could be changed to guarantee a deterministic ordering (not necessarily sorted, but purely determined by the contents).
As described in one of the future possibilities, we could add support for an infinite ordered collection type, so that an expression like seq x <- intsInIncreasingOrder | x in s would then express "the contents of s sorted", and would be implemented at runtime by building a sorted data structure with the contents of s and then enumerating that.

robin-aws · 2022-05-27T16:03:21Z

0005-ordered-quantification.md

+```dafny
+// Before:
+method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) {
+  for i := 0 to |s| {


To which I say "whoops" :)

I legitimately toyed with figuring out a way for this repo to test code snippets as the main dafny repo is.

robin-aws · 2022-05-27T16:11:53Z

0005-ordered-quantification.md

+// After:
+method AllPairs(s: seq<nat>) returns (result: seq<(nat, nat)>) {
+  result := [];
+  foreach left <- s, right <- s {


Agreed, it's nice that it wouldn't clash with the existing s[i..j] syntax, and that there is already precedent for how to handle the reverse order.

If we do that, is it really worth continuing to support both for x := 0 to 10 and foreach x <- 0 to 10?

robin-aws · 2022-05-27T16:28:45Z

0005-ordered-quantification.md


-2. `x [: T] <- C` 
+Assuming that we have `posNat == iset n: nat | 0 <= n`, the set comprehension above could also then be expressed as:


True! In general the quantification x: T is equivalent to x <- (iset x: T). Since Dafny supports potentially infinite collections, types and collections are very similar. But collections are strictly more expressive since we don't have a fully dependent type system; you can't express "sequences of length n" as a subset type if n is a dynamic value, but you can declare a dynamic, local iset collection for that. Part of the reason I was fine with dropping the notion of default orderings of types is that we could still support infinite ordered collections in the future, and hence achieve the same thing with x <- (collection of all T's ordered by >).

This may not be the strongest possible example to motivate quantifier variable domains, since it could be done with a subset type instead, and in fact I'd prefer that style in this case. A more complicated example where the domain depended on dynamic values and couldn't be a subset type would be better.

robin-aws · 2022-05-27T16:41:46Z

0005-ordered-quantification.md

+quantified variable domains also specify the ordering of quantification bindings, and allow variables to be bound to duplicate values.
+This ordering of bindings is non-deterministic unless `C` is a sequence.
+If `C` is a `multiset`, multiple bindings with the same value of `x` will be created, but their ordering is still non-deterministic.
+Note that ordering and multiplicity is irrelevant for all current uses:
+the semantics of `[i]set` and `[i]map` comprehensions, `forall` and `exists` expressions, 
+and `forall` statements all do not depend on the order of quantification and semantically ignore duplicates.


Thanks, I like that version better - I've massaged it slightly and used it.

robin-aws · 2022-05-27T16:49:02Z

0005-ordered-quantification.md

+This means that the parser will accept attributes in-between variables, 
+but these can be easily rejected by the resolver instead.


Three technical points:

If the resolver is what rejects the in-between attributes, then the AST still needs a place to hang them. Better seems to be for the parser to reject them. More precisely, the parser would first parse in-between attributes, but before when it has gathered all the variables and "from" expressions and is about to create the AST node, then the parser would generate errors for any in-between attributes it may have parsed along the way.

I came to the same conclusion. I may still add a new QuantifiedVar type for the AST in the first PR so that we can have better resolution errors without losing source tokens, but I'm going to try to avoid it.

Don't worry about the specific attribute {:heapQuantifier}, since I don't think it's supported any more. We should remove it.

Sure, I was only looking for an example of a quantification attribute. As long as we still support at least one I just want to make sure we can still parse them.

Note that the "," causes an ambiguity in the grammar. To resolve it, the lookahead mechanism needs to determine if what follows the "," is another QuantifierVarDecl or not.

Ah you spotted what I only figured out after making the parsing changes. :) I did figure out that it was helpful to look ahead for ", <Identifier>" at least. But note that this is still not fully backwards compatible! The statement:

print set x: int | 0 <= x < 10, y;

Is currently parsed as:

print (set x: int | 0 <= x < 10), (y);

But will then be parsed as:

print (set x: int | 0 <= x < 10, y);

And produce an error, since it is then a comprehension with more than one variable but no term. Our plan is to accept this minor breaking change with a helpful error message like "if y is not part of the comprehension, try adding parentheses or a term expression before it". We could add an option similar to /functionSyntax to avoid breaking people before the next major version.

robin-aws · 2022-05-27T16:54:10Z

0005-ordered-quantification.md

 2. Cardinality: `|S| <= |C|`
 3. Ordering: `forall i, j | 0 <= i < j < |S| :: 
      exists i', j' | 0 <= i' < j' < |C| :: 
-        C[i'] == S[i] && C[j'] == S[j]`
+        Q(C[i']) == S[i] && Q(C[j']) == S[j]`


Possibly! I only wanted to propose an encoding that seemed feasible enough, I don't claim to have found the optimal encoding yet nor am I trying to nail that before starting on implementation.

robin-aws · 2022-05-27T16:56:30Z

0005-ordered-quantification.md

+including Scala, Cryptol, and Haskell. Using something other than `in` is recommended to avoid confusion with the `in` boolean operator,
+but I don't have a strong opinion about which symbol or keyword we use.


from reads quite nicely too, but <- seems to fit better with the mathematical notation leanings of Dafny IMO. This is something we can bikeshed more effectively on in the first PR when looking at lots of code examples, I think.

robin-aws · 2022-05-27T16:57:38Z

0005-ordered-quantification.md

+  match o {
+    case Some(x) => print x;
+  }


Another whoops because my markdown renderer is not verifying my Dafny code. :P Thanks.

robin-aws · 2022-05-27T16:59:56Z

0005-ordered-quantification.md

+The translation of `foreach` loops should be reducible to sequence comprehensions
+and other existing features for loops in general,


Yes, I thought translating to sequence comprehensions was appealing because it at least solves the last point: the invariants reduce to those of for loops, so the loop guard is just an expression of completing the enumeration. I may still try this encoding first, especially since I plan to have sequence comprehensions committed and stable before moving to foreach loops.

robin-aws · 2022-05-27T17:07:26Z

0005-ordered-quantification.md

+
+```dafny
+ghost var xs := [];
+foreach x <- s {


See top level response

Resolves #2187. First phase of implementing dafny-lang/rfcs#10. This change adds two new features on several uses of quantified variables in the language: 1. Quantified variable *domains*, specified with the syntax `x <- C`, where C is an expression with a collection type (i.e. set, multiset, sequence, or map). Besides offering a more compact syntax for a common pattern (i.e. replacing `myLovelyButLongVariableName | myLovelyButLongVariableName in someSet` with `myLovelyButLongVariableName <- someSet`), this is forwards-compatible with the notion of *ordered quantification*, which the two features designed in the linked RFC depend on, and potentially others in the future. 2. Quantified variable *ranges*, specified with the syntax `x | E`, where E is a boolean expression. These replace the existing concept of a single `| E` range after all quantified variables (which is now bound to the last declared variable, with equivalent semantics). These are helpful for restricting the values of one quantified variable so that it can be used in the domain expression of another variable. These features apply to the quantifier domains used in `forall` and `exists` expressions, `set` and `map` comprehensions, and `forall` statements, and the parsing for quantifier domains is now shared between these features. As a small consequence, set comprehensions now no longer require range expressions, and will default to `| true`. This new syntax is not fully backwards-compatible: the statement `print set x | 0 <= x < 10, y`, for example, was previously parsed as `print (set x | 0 <= x < 10), (y)` but will now be parsed as `print (set x | 0 <= x < 10, y)` and rejected. The `/quantifierSyntax` option is used to help migrate this otherwise breaking change: * `/quantifierSyntax:3` keeps the old behaviour where a `| <Range>` always terminates the list of quantified variables. * `/quantifierSyntax:4` instead attempts to parse additional quantified variables. Like `/functionSyntax`, this option will default to `4` instead in Dafny 4.0. This is implemented with pure desugaring for now: the `QuantifiedDomain` production parses a list of `QuantifiedVar` objects, but then immediately rewrites a snippet such as `x <- C | P(x), y <- D(x) | Q(x, y)` into the equivalent (for now) `x | x in C && P(x) && y in D(x) && Q(x, y)`. Token wrappers are used to ensure error messages can still refer to the original syntax. This will not be equivalent once features that depend on the ordering of quantification are added, though, so a subsequent change will add the new fields to `BoundVar` instead and plumb these extra components through the whole Dafny pipeline. On testing: I duplicated several existing test files and modified them to use the new syntax features, to get good white-box test coverage even though the implementation is very simple for now. Also resolves #2038 (and the same issue for Java) since I hit it when converting test cases, and it was dead easy to fix.

davidcok · 2022-07-28T15:02:18Z

Coming to this late, and I understand the useful generalizations, but now the very intuitive simple case seq(n, F) is what? seq j: j < n :: F(j)? If so, to me, that is not a syntactical improvement. seqs are naturally indexed by nat up to a limit. The generalizations are great in that one need not make that indexing explicit (e.g., in making a seq from a set), but in the common case where there is a natural indexing, as when there is a function from nat -> T, that just needs a limit, the indexing ought to be implicit in the syntax.

robin-aws · 2022-07-28T16:28:57Z

Coming to this late, and I understand the useful generalizations, but now the very intuitive simple case seq(n, F) is what? seq j: j < n :: F(j)? If so, to me, that is not a syntactical improvement. seqs are naturally indexed by nat up to a limit. The generalizations are great in that one need not make that indexing explicit (e.g., in making a seq from a set), but in the common case where there is a natural indexing, as when there is a function from nat -> T, that just needs a limit, the indexing ought to be implicit in the syntax.

Since we decided to require a quantification domain for sequence comprehensions (i.e. seq x <- C :: f(x) is allowed but not seq x | P(x) :: f(x)), we still need seq(n, F) sequence constructions too for now.

We have a strong inclination to support seq i <- 0 to n :: F(i) after further discussion, but I'm not planning to include that in the initial version as we haven't fully designed that yet. I would argue that the natural indexing is very frequently indexing into a sequence anyway, so the initial version will cover a lot of cases.

MikaelMayer · 2022-07-28T17:04:08Z

Coming to this late, and I understand the useful generalizations, but now the very intuitive simple case seq(n, F) is what? seq j: j < n :: F(j)? If so, to me, that is not a syntactical improvement. seqs are naturally indexed by nat up to a limit. The generalizations are great in that one need not make that indexing explicit (e.g., in making a seq from a set), but in the common case where there is a natural indexing, as when there is a function from nat -> T, that just needs a limit, the indexing ought to be implicit in the syntax.

After @robin-aws 's improvement, we could always define the previously built-in constructor as a simple library helper:

function seq<T>(n: nat, f: nat ~> T): seq<T>
  requires forall i | 0 <= i < n :: f.requires(i)
{
  seq i | 0 <= i < n :: f(i)
}

robin-aws · 2022-07-28T17:46:28Z

After @robin-aws 's improvement, we could always define the previously built-in constructor as a simple library helper:
function seq<T>(n: nat, f: nat ~> T): seq<T>
  requires forall i | 0 <= i < n :: f.requires(i)
{
  seq i | 0 <= i < n :: f(i)
}

Unfortunately not, since we decided to require a domain to define the ordering in sequence comprehensions. It would have to be this (adding a necessary reads clause, and changing seq to Seq so that it actually parses :)

function Seq<T>(n: nat, f: nat ~> T): seq<T>
  requires forall i | 0 <= i < n :: f.requires(i)
  reads set o, i | 0 <= i < n && o in f.reads(i) :: o
{
  seq(n, f)
}

But if we allowed lo to hi to be a domain, either only in that syntax or as another form of sequence display (which I personally like, and could easily implement quickly and efficiently after dafny-lang/dafny#2390), then we could define it like this:

function Seq<T>(n: nat, f: nat ~> T): seq<T>
  requires forall i | 0 <= i < n :: f.requires(i)
  reads set o, i | 0 <= i < n && o in f.reads(i) :: o
{
  seq i <- 0 to n :: f(i)
}

davidcok · 2022-07-28T20:03:42Z

I would vote for keeping the seq(n,F) syntax, and defining it as a sugar of what you write in the previous comment. I expect it will be the commonest use.

And in this context seq i <- 0 to n :: f(i) is (almost) always going to start with 0 . Isn't seq i <- k to n :: f(i) just seq i <- 0 to n-k :: f(i+k) ? And I expect it will be confusing as some users will think it is the k .. n elements of the sequence that are being set. [So maybe even seq i <- to n :: f(i) :-)]

robin-aws added 8 commits May 11, 2022 11:28

Initial partial rewrite

e8125a3

Added section on ordered quantification

f8726d7

Mostly finished modulo a few TODOs

b80e931

Clean-up

b243b38

Filled in the rest

273173b

A few more edits

461e92d

Link typo

cc9c9f1

Added implementation plan

f6747ef

cpitclaudel reviewed May 12, 2022

View reviewed changes

Update 0005-ordered-quantification.md

3f00994

Co-authored-by: Clément Pit-Claudel <cpitclaudel@users.noreply.github.com>

robin-aws added 3 commits May 13, 2022 13:08

Address most feedback

963abea

Merge branch 'ordered-quantification' of github.com:robin-aws/dafny-r…

bc2edc0

…fcs into ordered-quantification

Move case patterns on quantifier variable declarations to future poss…

698bf03

…ibilities

MikaelMayer mentioned this pull request May 16, 2022

foreach loops and sequence comprehensions #9

Closed

MikaelMayer reviewed May 16, 2022

View reviewed changes

atomb reviewed May 23, 2022

View reviewed changes

Per quantified variable domains and ranges

5c5e290

MikaelMayer reviewed May 26, 2022

View reviewed changes

RustanLeino approved these changes May 27, 2022

View reviewed changes

keyboardDrummer reviewed May 27, 2022

View reviewed changes

PR feedback

af30270

robin-aws commented May 27, 2022

View reviewed changes

This was referenced May 31, 2022

Implement quantified variable domains and ranges dafny-lang/dafny#2187

Closed

feat: Domains and ranges for quantified variables dafny-lang/dafny#2195

Merged

robin-aws mentioned this pull request Jun 30, 2022

Implement sequence comprehensions dafny-lang/dafny#2335

Open

robin-aws mentioned this pull request Jun 30, 2022

Implement foreach loops dafny-lang/dafny#2336

Open

robin-aws mentioned this pull request Jul 28, 2022

sequence comprehensions are ugly dafny-lang/dafny#1332

Open

robin-aws mentioned this pull request Aug 12, 2022

Support <lo> to <hi> as a quantifier variable domain, or as a sequence display dafny-lang/dafny#2583

Open


		2. `x [: T] <- C`
		Assuming that we have `posNat == iset n: nat \| 0 <= n`, the set comprehension above could also then be expressed as:

	`seq x <- s` would be rejected in compiled code.
	`seq x <- s` would be rejected in compiled code if the compiler does not have a way to enumerate `s` in a deterministic way.

	exists i', j' \| 0 <= i' < j' < \|C\| ::
	exists i', j' \| 0 <= i' < j' < \|C\| && i' <= i && j' <= j ::

		If `C` is a `map`, the bound values will be the keys of the `map`, in order to be consistent with the meaning of `x in C`;
		`map.Items` can be used instead to bind key-value pairs.

		This would allow destructuring datatypes and tuples, as in `(k, v) <- myMap.Items`.
		It could also support filtering, as in `Some(x) <- mySetOfOptionals`.

		the necessary interface. This is also likely the best way to provide a better user experience
		for looping or quantifying over the values produced by an `iterator` type.

		The translation of `foreach` loops should be reducible to sequence comprehensions
		and other existing features for loops in general,


		This RFC proposes adding two closely-related new features to Dafny for working with collections and iteration:

		1. `foreach` loops over any enumerable domain, initially only supporting the builtin collection types but leaving the door open for user-defined collections.

		This means that the parser will accept attributes in-between variables,
		but these can be easily rejected by the resolver instead.

		including Scala, Cryptol, and Haskell. Using something other than `in` is recommended to avoid confusion with the `in` boolean operator,
		but I don't have a strong opinion about which symbol or keyword we use.

Ordered quantification (foreach loops and sequence comprehensions - take 2) #10

Are you sure you want to change the base?

Ordered quantification (foreach loops and sequence comprehensions - take 2) #10

Conversation

robin-aws commented May 12, 2022 • edited Loading

cpitclaudel left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robin-aws commented May 13, 2022

cpitclaudel commented May 13, 2022

MikaelMayer left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robin-aws May 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

atomb May 23, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robin-aws commented May 25, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RustanLeino left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robin-aws commented May 12, 2022 •

edited

Loading

cpitclaudel left a comment •

edited

Loading

robin-aws May 20, 2022 •

edited

Loading

atomb May 23, 2022 •

edited

Loading