Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upEquality on arrays is wonky #403
Comments
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
eeue56
Sep 10, 2015
Contributor
So, I have a fix for this now by adding length and unsafeGet to Utils.js and adding
if (l.ctor === "_Array") {
if (l.ctor === r.ctor){
if (length(l) != length(r)) return false;
for (var i = 0; i < length(l); i++){
if (!eq(unsafeGet(i, l), unsafeGet(i, r))){
return false;
}
}
return true;
} else {
return false;
}
}
to the start of eq. It also decreased the speed for Array comparisons by a factor of ~5 times for unnested arrays.
For nested arrays it cut eq down from ~11600 ms to ~20ms for the test below
main =
show
<| List.filter (\x -> not x)
<| List.map (\n -> (Array.fromList (List.repeat n <| List.repeat 50 1)) == (Array.repeat n <| List.repeat 50 1)) [50000..50002]
So, overall, I think I have a fix and it will generally increase performance slightly, and in other cases it will massive increase performance.
The one blocker for me to create a PR to fix this though is where should I put Array.unsafeGet and Array.length? I'm not really fond of having specialized functions within Utils, and they can't be accessed in Array.js from Utils.js due to the fact that Array depends on List which depends on Utils.
Any thoughts?
|
So, I have a fix for this now by adding
to the start of eq. It also decreased the speed for Array comparisons by a factor of ~5 times for unnested arrays. For nested arrays it cut eq down from ~11600 ms to ~20ms for the test below
So, overall, I think I have a fix and it will generally increase performance slightly, and in other cases it will massive increase performance. The one blocker for me to create a PR to fix this though is where should I put Any thoughts? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
evancz
Oct 30, 2015
Member
It is also wonky on Set and Dict. The root in all the cases is that the "meaning" of the data structure does not determine a 100% unique structure. It makes sense to me that all of these situations, where structural equality is not really what is desired, should be handled in a coherent way. There are solutions for that (like type classes) but I don't want to introduce a relatively intense language feature to resolve this particular issue.
So I guess I'm saying, if we go in this direction, it means we will special case Array and possibly Dict. But we will not special case any user defined thing. Hash maps, graphs, whatever else. So implicitly, we are saying "we have consistent behavior now, but we want to add some special cases, making things a bit better, but also making the behavior inconsistent." That's what I think.
I don't know if that's good or bad. I guess I'd be curious to know if these are things you considered in looking at this issue.
|
It is also wonky on So I guess I'm saying, if we go in this direction, it means we will special case I don't know if that's good or bad. I guess I'd be curious to know if these are things you considered in looking at this issue. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
evancz
Oct 30, 2015
Member
Also, of all the things related to making Array nicer, this is probably the one thing that branches out into broader language design questions, so it's not just a matter of writing code and putting it in the right place. There are at least 3 viable solutions that are mutually exclusive, so it's a question of which one and when.
|
Also, of all the things related to making |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
eeue56
Oct 30, 2015
Contributor
When I was looking at this, it seemed to make sense to me to consider adding a _eq method or something to each of the Elm objects, much like the current ctor. The algorithm for comparisons would then look like this -
if a.ctor === b.ctor:
if not a._eq:
False
a._eq(b) There are a few benefits from this:
It would now be possible to add custom eq operations to Native modules. This would be great for anything implementing a new data structure that would enable things like fast eqs and still using Elm's built in == operator. The current eq is slow in some places, such as for Array (as my posts note). This would get around that issue.
From a language perspective, it would also avoid people going down the route of defining a equals method which did what == should do, but faster. This helps keep the language clean and readable.
This is the route that Python takes, for example. It would also translate from the Eq typeclass in Haskell -
instance Eq Array where
...In an ideal world, I would love it if it were possible to define an eq that gets converted to the Native _eq at compile time, but I'm not sure how feasible that would be within the language. I would therefore only allow this _eq override to be done through Native modules.
I would prefer this to bringing in custom Array/Dict stuff into Utils.js.
|
When I was looking at this, it seemed to make sense to me to consider adding a if a.ctor === b.ctor:
if not a._eq:
False
a._eq(b) There are a few benefits from this: It would now be possible to add custom eq operations to Native modules. This would be great for anything implementing a new data structure that would enable things like fast From a language perspective, it would also avoid people going down the route of defining a This is the route that Python takes, for example. It would also translate from the instance Eq Array where
...In an ideal world, I would love it if it were possible to define an I would prefer this to bringing in custom Array/Dict stuff into Utils.js. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
eeue56
Oct 30, 2015
Contributor
Alternatively, Elm could just take the stance that comparables will be provided for in core through Utils.js, and if anyone else wants to define a eq, they have to implement it with a different name. So it would become common to see eq a b instead of a == b for non-core datastructures.
|
Alternatively, Elm could just take the stance that comparables will be provided for in core through Utils.js, and if anyone else wants to define a |
jvoigtlaender
referenced this issue
Nov 15, 2015
Closed
Set.fromList [1,2,3,4] /= Set.fromList [4,3,2,1] #1166
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
rtfeldman
Nov 15, 2015
Member
Conceptually I think of Elm's philosophy as "it's not a big deal that you don't have a ton of power to define custom things, because the core data structures are really nice and you can get a long way just mixing and matching them."
I think special-casing equality for the core data structures of Array, Set, and Dict is consistent with this philosophy in the same way that number gets its own typeclass so that things like (+) and (*) can be nice, even though users can't define their own typeclasses.
|
Conceptually I think of Elm's philosophy as "it's not a big deal that you don't have a ton of power to define custom things, because the core data structures are really nice and you can get a long way just mixing and matching them." I think special-casing equality for the core data structures of |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
rtfeldman
Nov 15, 2015
Member
Also, it seems a bit weird that == is a runtime exception for functionA == functionB but just doesn't work properly for setA == setB. Seems like either the latter should be considered a bug or else something explicitly unsupported, like function equality, which results in a runtime exception.
That said, appreciate the point that this is a broader language discussion.
|
Also, it seems a bit weird that That said, appreciate the point that this is a broader language discussion. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
eeue56
Nov 16, 2015
Contributor
I have this issue with Set and Dict having special casing in Utils.js. Technically, I don't think they should need it - the fact they do is the result of eq being broken from a language perspective. The problem with all these datastructures is that they can be represented in many different ways - eq from Utils.js compares the representation, rather than the data. This is probably right from a "true" eq standpoint, but from a usability standpoint users don't care if 9 was used as the base node instead of 5, they just care that a set has both [9, 5].
At the moment, both Dict and Set are pure Elm. Special casing Array is pretty easy to do, since the implementation is in JS already. Javascript is where runtime exceptions exist and happen. I would much prefer to have things in Elm rather than JS wherever possible. It's easy enough to write a (more or less) performant eq in Elm for Dicts and Sets - my AllDict implementation does just that, albeit in a hacky way.
As for function equality, I also have a solution for that which works in a hacky way, for two identical implementations of the same function (though, names aren't checked as they currently aren't kept from compile)
|
I have this issue with At the moment, both As for function equality, I also have a solution for that which works in a hacky way, for two identical implementations of the same function (though, names aren't checked as they currently aren't kept from compile) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
eeue56
Nov 16, 2015
Contributor
FWIW - I know how to fix eq for Set and Dict currently (using special casing), but I think it might be worth having this discussion before adding it
|
FWIW - I know how to fix eq for |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
mgold
Dec 27, 2015
Contributor
So I guess I'm saying, if we [fix the issue without adding a language feature], it means we will special case Array and possibly Dict. But we will not special case any user defined thing. Hash maps, graphs, whatever else. So implicitly, we are saying "we have consistent behavior now, but we want to add some special cases, making things a bit better, but also making the behavior inconsistent." That's what I think.
Inconsistent behavior is a matter of perspective. If you see Dict as a purely Elm data structure that nevertheless gets special treatment in the compiler that is unavailable to other authors, that's inconsistent. But if you see Int and List Int as supporting equality but suddenly Array Int and Set Int do not, then it's the current situation that's inconsistent.
The fact that Dict happens to be a pure-Elm data structure behind an opaque type, and Array is a bunch of JavaScript code, is an implementation detail. The fact that equality will silently fail, when silent failures are almost unheard of in Elm, speaks much louder than an implementation detail.
If the long-term goal is to add a language feature, than this is a short-term hack. Whenever that feature is added, data structure equality will be supported, so to the user, nothing in core changes.
Inconsistent behavior is a matter of perspective. If you see Dict as a purely Elm data structure that nevertheless gets special treatment in the compiler that is unavailable to other authors, that's inconsistent. But if you see The fact that If the long-term goal is to add a language feature, than this is a short-term hack. Whenever that feature is added, data structure equality will be supported, so to the user, nothing in core changes. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
rtfeldman
Apr 15, 2016
Member
Sort 'em afterwards, yeah?
On Fri, Apr 15, 2016, 4:16 AM Kris Jenkins notifications@github.com wrote:
This also raises the question, how can we reliably compare sets for
equality? The toList workaround doesn't work:» elm repl
---- elm repl 0.16.0 -----------------------------------------------------------:help for help, :exit to exit, more at https://github.com/elm-lang/elm-repl
import Set
Set.fromList [0,1] == Set.fromList [1,0]
False : Bool—
You are receiving this because you commented.
Reply to this email directly or view it on GitHub
https://github.com/elm-lang/core/issues/403#issuecomment-210420961
|
Sort 'em afterwards, yeah? On Fri, Apr 15, 2016, 4:16 AM Kris Jenkins notifications@github.com wrote:
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
krisajenkins
Apr 15, 2016
Contributor
@rtfeldman Yup. Sorry, I was being slow & clearly didn't delete that comment fast enough.
|
@rtfeldman Yup. Sorry, I was being slow & clearly didn't delete that comment fast enough. |
tgecho
referenced this issue
Apr 17, 2016
Closed
[word-count] Tests fail because expectation and result are… equal! #67
referenced
this issue
in NobbZ/xelm
Apr 17, 2016
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
NobbZ
Apr 18, 2016
Perhaps add an equals function to the affected types modules. That has to be used then instead of ==.
Another approach were to introduce overloading of functions. This way you can avoid typeclasses but introduce other ambiguities.
NobbZ
commented
Apr 18, 2016
|
Perhaps add an Another approach were to introduce overloading of functions. This way you can avoid typeclasses but introduce other ambiguities. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
mgold
Apr 18, 2016
Contributor
equals works well for third-party libraries but would be a blight on the core libraries.
Type classes are a form of overloading. If you want to continue this discussion please do so on the mailing list.
|
Type classes are a form of overloading. If you want to continue this discussion please do so on the mailing list. |
eeue56 commentedSep 9, 2015
The problem
Issue #176 highlighted the fact that array equality takes into account the internal structure when comparing two arrays. The problem was raised by two arrays, of identical elements and length, being generated differently -
will generate
whereas
will generate
Due to the way eq is implemented, the two arrays which are element-by-element equal are considered different as they have different heights and therefore nested differently.
Expected behvaiour
As datastructures in Elm are immutable, we don't really care about checking if the internal data structures are identical -
[1, 1]should always be equal to[1, 1], no matter if one uses two tables and the other uses a single table internally.The fix
eqshould check if an object'sctoris an Array. If they both are, then we should compareElm.Native.Array.length(right) == Elm.Native.Array.length(left). If they match, then we should iterate over both simultaneously and calledeqon each element-pair as we go.