-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding Merge Sort and Binary Search #49
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start! Here are some top-level suggestions (which will make some of the inline comments I made moot, but I added those anyway just for the sake of learning).
-
This code should be under
src
rather thanexamples
- the latter is for sample uses of the actual functions/methods. I would suggestsrc/Collections/Arrays/Arrays.dfy
to align with the established layout. The additional total ordering predicates oncompare
would fit nicely into the existingsrc/Functions.dfy
file. -
For sorting, I think proving HeapSort correct will take quite a bit of work. Why not steal the MergeSort implementation I did for the report generator instead? https://github.com/dafny-lang/dafny-reportgenerator/blob/main/src/StandardLibrary.dfy#L38 It definitely won't be as efficient given that it creates lots of intermediate subsequences (especially since most runtimes implement that as a copy instead of a reference), but it's already proven correct, and having that sorting algorithm here will still be an improvement over not having any sorting algorithm. :) You'll just have to replace calls to
f(x) < f(y)
withcompare(x, y)
just as you already have inBinarySearch
.
examples/BinarySearch.dfy
Outdated
|
||
|
||
|
||
method BinarySearch<T>(a: array<T>, key: T,compare: (T, T) -> bool) returns (r: int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to return an Option<nat>
instead, so we don't have to use -1
as a special flag value.
examples/sorting.dfy
Outdated
var temp:= a[0]; | ||
a[0]:= a[i]; | ||
a[i]:= temp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var temp:= a[0]; | |
a[0]:= a[i]; | |
a[i]:= temp; | |
a[0], a[i] := a[i], a[0]; |
examples/sorting.dfy
Outdated
// Implemetation of Heap sort | ||
include "../src/Collections/Sequences/Seq.dfy" | ||
|
||
method SortArray<T>(a: array<T>, compare: (T, T) -> bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd name this HeapSortArray
to be more precise, since there are other ways to sort (and may even be more in this repo in the future)
examples/BinarySearch.dfy
Outdated
|
||
|
||
method BinarySearch<T>(a: array<T>, key: T,compare: (T, T) -> bool) returns (r: int) | ||
requires forall i,j :: 0 <= i < j < a.Length ==> compare(a[i],a[j]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're missing the pre-conditions that ensure compare
is a total ordering, which is why some of the assertions about correctness aren't verifying. You should be able to copy them here with proper attribution (I know all of the authors on that file would be happy to have their code in this repo :) - https://github.com/aws/aws-encryption-sdk-dafny/blob/mainline/src/Util/Sorting.dfy#L126-L127
You'll need exactly the same requires
clauses on the sorting methods too.
examples/sorting.dfy
Outdated
include "../src/Collections/Sequences/Seq.dfy" | ||
|
||
method SortArray<T>(a: array<T>, compare: (T, T) -> bool) | ||
requires multiset(a[..]) == multiset(old(a[..])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be an ensures
instead - it's describing the state of the array AFTER executing the method.
{} | ||
|
||
function method MergeSortBy<T>(a: seq<T>, compare: (T, T) -> bool ): (result :seq<T>) | ||
ensures multiset(a) == multiset(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also probably need a postcondition that the result is sorted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to add SortedBy
in a postcondition here, you'll need to export it from the module, as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I think you'll also need these preconditions:
requires StandardLibrary.Transitive(compare)
requires Connected(compare)
requires StandardLibrary.Transitive(compare) | ||
requires Connected(compare) | ||
{ | ||
MergeSortBy(s,compare) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because Sort
is a method, you'll need something like result := MergeSortBy(s, compare);
.
} | ||
|
||
method BinarySearch<T>(a: array<T>, key: T,compare: (T, T) -> bool) returns (r : nat) | ||
requires forall i,j :: 0 <= i < j < a.Length ==> compare(a[i], a[j]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to do this to allow the array to contain duplicates.
requires forall i,j :: 0 <= i < j < a.Length ==> compare(a[i], a[j]) | |
requires forall i,j :: 0 <= i < j < a.Length ==> compare(a[i], a[j]) || a[i] == a[j] |
ensures 0 <= r ==> r < a.Length && a[r] == key | ||
ensures r < 0 ==> key !in a[..] | ||
requires StandardLibrary.Transitive(compare) | ||
requires Connected(compare) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method requires an even more obscure property of the compare function. :)
requires StandardLibrary.Trichotomous(compare)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it turns out that if you have that precondition you don't need Transitive
or Connected
.
{ | ||
} | ||
|
||
method BinarySearch<T>(a: array<T>, key: T,compare: (T, T) -> bool) returns (r : nat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way the spec is written right now, this method should return int
instead of nat
, though making it return Option<nat>
might be even nicer, as I mention below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice! I just suggested a few small changes for consistency and removal of duplication.
Eventually, I suspect we should split up StandardLibrary.dfy
to put some of its contents in the more specialized modules that already exist (and maybe some new ones). For now, though, this at least makes a bunch more definitions available.
|
||
|
||
|
||
module BinarySearch{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module BinarySearch{ | |
module BinarySearch { |
import opened Wrappers = StandardLibrary.Wrappers | ||
|
||
|
||
predicate Reflexive<T(!new)>(R: (T, T) -> bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predicate Reflexive<T(!new)>(R: (T, T) -> bool) { | |
predicate Reflexive<T(!new)>(R: (T, T) -> bool) { |
And I'd suggest similar changes throughout this file to ensure the indentation is consistent.
include "../../UInt.dfy" | ||
include "../Sequences/Seq.dfy" | ||
|
||
module sorting{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module sorting{ | |
module Sorting { |
import opened UInt = StandardLibrary.UInt | ||
import opened Seq | ||
|
||
predicate Reflexive<T(!new)>(R: (T, T) -> bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predicate and a bunch of those after it are duplicated between here and the BinarySearch
module. Maybe they should go in StandardLibrary.dfy
until we reorganize it to split it up a little more?
src/UInt.dfy
Outdated
|
||
module StandardLibrary.UInt { | ||
|
||
// TODO: Depend on types defined in dafny-lang/libraries instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment probably isn't needed anymore. :)
|
||
module BinarySearch { | ||
export | ||
provides StandardLibrary, UInt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll probably want provides BinarySearch
, too, so that people can use it.
import opened UInt = StandardLibrary.UInt | ||
import opened Wrappers = StandardLibrary.Wrappers | ||
|
||
method BinarySearch<T>(a: array<T>, key: T, compare: (T, T) -> bool) returns (r:Option<nat>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to have this indented two spaces to be consistent with the imports above it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done!
I believe your comments have been addressed and that you'd prefer to have this merged sooner than later. :)
# Conflicts: # src/Collections/Sequences/Seq.dfy
The latter doesn’t work well in CI here because the /verificationLogger options cause extra, unpredictable output lines
@prvshah51 I've pushed several commits to remove more code we don't need (yet!), and made a few other changes to get it ready to merge IMO:
Let me know if you have any concerns, and @atomb / @MikaelMayer I'll leave it to you two to approve since I feel a little biased now that I've thrown my own commits on this. :) |
Addressed own suggestions, recusing myself as biased :)
@MikaelMayer I just noticed you had some style requests that were lost in the shuffle. I'll respond to the ones I see that still seem relevant, but please just provide a quick comment at the top level if there are others we missed. Re: the naming convention for predicates, e.g. "IsTotalOrder" instead of "TotalOrder", that's potentially a good idea but the latter convention is already established here in predicates like Re: "lessThan" instead of "less", I don't have a strong opinion so I'd be willing to change it, personally. @prvshah51? |
Both options are fine with me. |
import opened Relations | ||
|
||
//Splits a sequence in two, sorts the two subsequences using itself, and merge the two sorted sequences using `MergeSortedWith` | ||
function method MergeSortBy<T>(a: seq<T>, compare: (T, T) -> bool): (result :seq<T>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function method MergeSortBy<T>(a: seq<T>, compare: (T, T) -> bool): (result :seq<T>) | |
function method MergeSortBy<T>(a: seq<T>, lessEqThan: (T, T) -> bool): (result :seq<T>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's perhaps worth noting that we could change this in the future without breaking client code. I'd maybe prefer lessEq
or lessThanOrEq
, to avoid the awkward reading of "equal than".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for suggestions , I am changing compare
to lessThanOrEq
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MikaelMayer I just noticed you had some style requests that were lost in the shuffle. I'll respond to the ones I see that still seem relevant, but please just provide a quick comment at the top level if there are others we missed.
Re: the naming convention for predicates, e.g. "IsTotalOrder" instead of "TotalOrder", that's potentially a good idea but the latter convention is already established here in predicates like
Functions.Injective
. I'd suggest if you still feel strongly about it that you cut a repo issue for discussion, similar to #47.Re: "lessThan" instead of "less", I don't have a strong opinion so I'd be willing to change it, personally. @prvshah51?
Could you use the attribute {:options "/functionSyntax:4"} and change all your predicate method to predicate ?
function method {:opaque} Sort(s: seq, compare: (T, T) -> bool)
instead of a method
Add the following comment to MergeSortBy
// Splits a sequence in two, sorts the two subsequences using itself, and merge the two sorted sequences using MergeSortedBy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use the attribute {:options "/functionSyntax:4"} and change all your predicate method to predicate ?
function method {:opaque} Sort(s: seq, compare: (T, T) -> bool)
instead of a method
I did that for MergeSort.dfy
, but for Relations.dfy
SortedBy
was the only predicate method, and I actually made that a non-compiled predicate
instead like the others. As per my comment on the original version in dafny-reportgenerator
:
// TODO: Make this a predicate-by-method so it has linear runtime instead of quadratic.
// For now it's only used at runtime in tests.
This change doesn't need it to be compilable yet so I'd rather address that later.
Add the following comment to MergeSortBy
// Splits a sequence in two, sorts the two subsequences using itself, and merge the two sorted sequences using
MergeSortedBy
It's already there? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to use {:options "/functionSyntax:4"}
ran into what looks like a bug in Dafny 3.5.0 (which this repo is using) that I can't reproduce with the latest. I've reverted it for now and we can come back to that too: #55
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice simplification of the original. :) Ordering continues to be a tricky topic, and I left a couple of comments related to it, though I'm not sure that any of them should hold up merging.
src/Relations.dfy
Outdated
* Note that being trichotomous implies being irreflexive. | ||
*/ | ||
predicate Trichotomous<T(!new)>(less: (T, T) -> bool) { | ||
(forall x, y :: less(x, y) || x == y || less(y, x)) && // at least one of the three |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I commented earlier about defining this for a comparison result data type, @cpitclaudel pointed out very rightly that my first attempt wouldn't work for comparison based on projection (e.g., looking only at the first element of a tuple), and I think this definition suffers from the same problem, using ==
here. That's not necessarily fatal, but it's something to be aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, but I notice this isn't actually used anywhere at this point. Should we remove it, given that issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'd prefer to remove it too now that it's unused.
import opened Relations | ||
|
||
method BinarySearch<T>(a: array<T>, key: T, less: (T, T) -> bool) returns (r: Option<nat>) | ||
requires SortedBy(a[..], (x, y) => less(x, y) || x == y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will have the same issue I commented on for Trichotomous
, making it impossible to sort by projections. Would it be possible to make the proof go through if we required TotalOrdering
instead of StrictTotalOrdering
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I attempted that at first, but I was discouraged not only by verification not immediately working, but also the fact that it makes the specification messier as well: instead of
ensures r.Some? ==> r.value < a.Length && a[r.value] == key
You would need
ensures r.Some? ==> r.value < a.Length && lessThanOrEqual(a[r.value], key) && lessThanOrEqual(key, a[r.value])
Which is just plain much harder to read and understand.
I think for that reason, it makes more sense to improve this at the same time that we add comparison functions that return CmpResult
instead as per #53.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve but I still don't feel comfortable having a comparison function called twice in a binary search.
I definitely don't love it either. Both of these implementations are far from ideal when it comes to efficiency at runtime. I'm personally much more dissatisfied with my own MergeSort here rather than the binary search: it's using This whole repository needs more work to ensure it compiles to efficient-enough code for each supported language (see #24). In fact we're not even guaranteeing it is compilable at all for each target - it doesn't work for Java and hence the existence of https://github.com/dafny-lang/libraries-without-variance. In general, I consider it good forwards progress to have a verified but inefficient definition of a concept here, since many users can live with the inefficiency in their context. But I'm looking forward to adding a lot of |
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.