Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should order support heterogenous value types? #1599

Open
krader1961 opened this issue Aug 11, 2022 · 12 comments
Open

Should order support heterogenous value types? #1599

krader1961 opened this issue Aug 11, 2022 · 12 comments

Comments

@krader1961
Copy link
Contributor

krader1961 commented Aug 11, 2022

The discussion about the keys function implicitly sorting its output resulted in a decision not to do so. However, since the keys of an Elvish map do not have to be homogenous that implies the order function should support heterogenous types if we want examples like this to be well defined without raising an exception:

> var m = [&a=b &(num 1)=2]
> order [(keys $m)]
Exception: bad value: inputs to "compare" or "order" must be comparable values, but is uncomparable values
@krader1961
Copy link
Contributor Author

krader1961 commented Aug 15, 2022

Obviously, supporting heterogenous value types requires defining an ordering among the types. Which is going to be arbitrary but possible to define. For example, we could define type bool < num < string < list < map. The main issue is the cognitive cost of such an arbitrary ordering. Still, if we're going to allow heterogenous keys for maps it seems to me the order (and compare) command has to allow heterogenous values in order to get well defined behavior from doing keys $map | order.

P.S., At the moment compare (and thus order) doesn't allow comparing maps. Should it? What does it mean to compare two maps vis-a-vis ordering? The reason I ask is that the Elvish map implementation allows using a maps as keys which begs the question what does keys $map | order mean if the keys for $map are themselves maps?

@krader1961
Copy link
Contributor Author

krader1961 commented Aug 15, 2022

Note that Go doesn't support comparing maps. See https://go.dev/ref/spec#Comparison_operators. You can't declare a map that uses a map as a key. This is invalid in Go: var m map[map]bool. While Elvish isn't Go, and Elvish maps are ostensibly immutable, I still think Elvish should disallow maps as keys in a map. I can't think of a single valid use case and it is easy to inadvertently use a map as a key given the Elvish grammar.

@dunsany
Copy link
Contributor

dunsany commented Jan 8, 2023

if we want examples like this to be well defined without raising an exception

Why would we want that? What is so special in the keys $map | order bit, that we should, in your opinion, make sure that it never fails?

@krader1961
Copy link
Contributor Author

krader1961 commented Jan 9, 2023

What is so special in the keys $map | order bit, that we should, in your opinion, make sure that it never fails?

Precisely because Elvish map keys are not currently required to be homogenous. I've argued elsewhere that they should be homogenous. In which case this proposal can be rejected. But if we're going to continue allowing heterogenous keys then keys $map | order should be well defined and not raise an exception.

@dunsany
Copy link
Contributor

dunsany commented Jan 9, 2023

I don't follow you here. Your argument, as I currently understand it, goes like this:

K: order should support heterogenous value types.
D: Why?
K: Because keys $map | order shouldn't ever fail.
D: Why?
K: Because maps can contain heterogenous value types. So order should support heterogenous value types.
D: Why?
...

You can see why I think it is an infinite loop.
Lists also support heterogenous value types, and I don't think you consider it making the case for supporting comparison of different types.
So let me ask again: what is so special about maps and their keys that keys $map | order shouldn't ever fail?

@krader1961
Copy link
Contributor Author

I get your point, @dunsany; however, I would argue that it may be desirable this example provides an ordered output rather than raise an exception:

> order [1 (num 1)]
Exception: bad value: inputs to "compare" or "order" must be comparable values, but is uncomparable values

I shouldn't have based my argument solely on the output of the keys command. The fact is that a dynamic language like Elvish is fundamentally different from a static language like C or Go. I used sorting the output of the keys command only because maps are not guaranteed to include homogenous types for its keys (as is the case in static languages like C and Go). I understand that equivalent scenarios (see above) are also affected by this issue. I simply don't think they are as likely to occur in practice. Having said that, if you believe that the order command should retain its current behavior I consider that a strong argument that Elvish maps should not allow heterogenous keys. That's because I believe that if ordering heterogenous types should be an error then var m = [&1=a &(num 1)=b] should be an error.

@dunsany
Copy link
Contributor

dunsany commented Jan 10, 2023

Again: how is ordering a list different than doing < x (num 1), for example? What is so special about lists that it makes you feel Elvish should support comparing apples to oranges? If you are particularly concerned about numbers, you should probably say it explicitly.

if ordering heterogenous types should be an error then var m = [&1=a &(num 1)=b] should be an error

So you changed your argument somewhat to this:

Map keys should always be comparable.

I'm sure you already know my next question. What is so special about map keys that you consider it to be important?

@krader1961
Copy link
Contributor Author

@dunsany: This came to my attention again while working on issue #1495 to sort the keys of maps printed by pprint and repr. There is a world of difference between ordering values and an explicit comparison such as < x (num 1). The latter is obviously nonsensical and should result in an exception being raised. The former can be, at least in theory, well defined. You seem to have missed the point of this issue. Either the order command (and sorting Elvish values in general) should have a well defined ordering of heterogenous values or it should not be possible to use heterogenous types as map keys since it should be possible to order the keys of a map. Note that this argument does not apply to arbitrary, heterogeneous, values in a list. On the other hand, if we're going to provide a well defined ordering of heterogeneous map keys, then ordering heterogenous values in a list is also well defined. Which gets to my point that either such ordering should be well defined for both cases or map keys should be required to be homogenous.

Note that some languages, such as Go, do allow heterogeneous map keys (e.g., var m map[any]any) but you will be hard pressed to find uses of that pattern and which also support ordering the key values since it isn't supported natively by the Go stdlib.

My solution for issue #1495 supports heterogenous keys but does not resolve this issue. It works by default since it tells the Go sort.Sort function that incomparable values are always "true" when comparing whether one value is less than another (incomparable) value. That solution happens to work today, but it does so accidentally rather than by definition. So I have misgivings regarding my implementation to have the repr and pprint commands implicitly sort map keys since it depends on an implementation detail of the Go sort package. It would be better if Elvish defined an explicit ordering of heterogenous types.

@krader1961
Copy link
Contributor Author

@xiaq, Having worked on a solution to issue #1495 I now have an even stronger belief that an explicit ordering of incomparable types should be defined (as in my earlier comment) or Elvish maps should require homogeneous (i.e., comparable) keys. My preference is the former since the behavior is generally useful. The potential risk is that people will attempt to order heterogenous values (e.g., a list like [1, (num 2)]) without realizing they have made a mistake. For example, by including strings that look like numbers and actual numbers in a list. But if that is a concern then it also argues against Elvish maps allowing heterogenous keys since keys $map | order can fail unexpectedly, and the use of order in that example shouldn't be necessary to expose the incomparable types. We also shouldn't be protecting users from foot/gun mistakes unless such mistakes are always an error.

@krader1961
Copy link
Contributor Author

I modified the cmp function to replace the "uncomparable" return value with an explicit ordering of uncomparable types as I proposed in an earlier comment. I was pleasantly surprised at how many unit tests failed because they explicitly verified an attempt to compare previously uncomparable types. I appreciate good test coverage. On the other hand, those test failures suggest this change in behavior may be controversial. I believe my change is not controversial. The existing tests, which my change breaks, simply reflect the behavior of the code prior to my change regarding a total ordering of Elvish types.

@krader1961
Copy link
Contributor Author

In light of my previous comment and thinking about the unit test failures I have concluded that the new behavior should not be the default. It should require an explicit option to the compare and order commands, but should be the default for implicit ordering by the pprint and repr commands. This improves backward compatibility and, more importantly it seems to me, is less likely to surprise uses of the compare and order commands if they do not explicitly ask that otherwise uncomparable types be handled.

@xiaq
Copy link
Member

xiaq commented May 9, 2023

Re a total ordering of Elvish values, see my comment in #1698 (comment). It's not necessary to define a total ordering of all Elvish values in order to print maps with their keys sorted, or for order to support heterogenous value types.

To expand that a bit, when order is given heterogenous values (probably gated by an option), all it should guarantee is that values of the same type are grouped together, and if sortable, sorted. For example, if the input is b (num 1) a (num 2) { foo } { bar }, the output should guarantee that

  • a b appear together, in that order
  • (num 1) (num 2) appear together, in that order
  • { foo } { bar } appear together

In practice order could use a deterministic ordering between types, but that shouldn't be guaranteed or relied upon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants