map:empty, map:exists ← array:empty, array:exists #827

Closed
ChristianGruen opened this issue Nov 9, 2023 · 12 comments
Labels
Feature A change that introduces a new feature XQFO An issue related to Functions and Operators

Comments

@ChristianGruen
Contributor

ChristianGruen commented Nov 9, 2023

We have array:empty and array:exists, but no equivalent functions for maps.

I think we have decided to live with the ambiguity (discussed in #229) that map:exists(map {}) will return false although the “map exists”. Same for arrays.
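
The ambiguity can be seen with functions that already exist in XQuery 3.1. The following is a minimal sketch and not part of the proposal: it uses only fn:exists, map:size and array:size, and it declares the map and array namespaces explicitly in case the processor does not predeclare them.

```xquery
xquery version "3.1";
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

let $m := map {}
let $a := []
return (
  exists($m),           (: true: the sequence holds one item, the map itself :)
  map:size($m) eq 0,    (: true: that map has no entries :)
  exists($a),           (: true: the sequence holds one item, the array itself :)
  array:size($a) eq 0   (: true: that array has no members :)
)
```

So fn:exists answers "is there an item?", whereas the proposed map:exists would answer "does the map have entries?", and the two disagree for an empty map.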

@ChristianGruen ChristianGruen added XQFO An issue related to Functions and Operators Feature A change that introduces a new feature labels Nov 9, 2023
@michaelhkay
Contributor

I think my preference would be to have array:empty() and map:empty(), but not array:exists() or map:exists(). The name is too misleading. The problem isn't as acute with fn:exists(), because in a typical call such as exists(child::thing) the function does what the reader would expect. It's too easy to read "array:exists(x)" as "array x exists". Much better to let people write not(array:empty(x)).

@dnovatchev
Contributor

What about:

array:any ?

@michaelhkay
Contributor

Which reminds me that we don't currently have array:some() and array:every().

Note that fn:exists($seq) is equivalent to fn:some($seq, true#0).
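
fn:some belongs to the 4.0 draft, so the sketch below checks the same equivalence with a 3.1 quantified expression instead. That substitution is an assumption made for illustration. The test sequences are wrapped in an array so that the empty sequence is preserved as a member.

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: Three test sequences: empty, three integers, a single false() :)
let $tests := [ (), (1, 2, 3), false() ]
for $i in 1 to array:size($tests)
let $seq := $tests($i)
(: Both sides only ask whether $seq contains at least one item :)
return exists($seq) eq (some $x in $seq satisfies true())
```

Each of the three comparisons yields true, including the last one, because the predicate ignores the value of the item.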

I think we're in danger of providing too many simple functions that do trivial things, and neglecting the more challenging problems.

@dnovatchev
Contributor

> Which reminds me that we don't currently have array:some() and array:every().
>
> Note that fn:exists($seq) is equivalent to fn:some($seq, true#0).
>
> I think we're in danger of providing too many simple functions that do trivial things, and neglecting the more challenging problems.

I think here we are going into another, very important topic. Here are a few thoughts on this:

This danger is due to the fact that we have sequences, arrays, maps, ..., and maybe we will have other types of collections in the future. And for each of these types of collections we have to define essentially the same functions, such as head, tail, sub-sequence / sub-whatever-collection, slice, items/members-at, etc ....

So, if we have N different collection types and M meaningful functions on each of them, then we end up with N * M functions.

As I have pointed out in other issues, we need to have a common base-type for any collection and define these M functions only once - just for this common base-type.

I am not satisfied with the explanation that this is not possible due to the fact that the sequence is the most-general type. As in other languages, we could introduce ordered type resolution (like MRO in Python).

For example, in:

let $collection as union(generator(*), item()*)  :=  Expr-giving-collection

If every type of collection is a generator, then the Ordered type resolution matches the types in the union in order and assigns the first-found matching type to $collection. Thus, $collection will be treated as a generator and not just as a sequence. In case all types of collections can be represented as generators, then we don't need different fn:tail, array:tail, ..., etc. functions - we will just use one single function tail, that operates on a generator and returns another generator.

It is hidden (as it must be!) from the user that generators that are sequences and generators that are arrays would have different internal implementations.

@michaelhkay
Contributor

Yes, indeed, it would be nice if the data model were different. However, a year or more of the development time for the 3.1 specs was taken up with trying to find a better way of modelling arrays that retained backwards compatibility. The fact that we failed to come up with a better solution then doesn't mean that no better solution is possible, but it does mean that it's a swamp I'm very reluctant to enter again.

@dnovatchev
Contributor

> Yes, indeed, it would be nice if the data model were different. However, a year or more of the development time for the 3.1 specs was taken up with trying to find a better way of modelling arrays that retained backwards compatibility. The fact that we failed to come up with a better solution then doesn't mean that no better solution is possible, but it does mean that it's a swamp I'm very reluctant to enter again.

This is the beauty of the Ordered Type Resolution that it does not affect the XDM in any way, it is used to dynamically specify the best type during the evaluation of a specific expression.

An array [1, 2, 3] is a sequence, but because its type has been specified (intentionally by the user) as union(array(*), item()*) and the first matching type, array(*), has been selected, evaluating this expression using Ordered Type Resolution:

let $ar as union(array(*), item()*) := [1, 2, 3]
  return
     head($ar)

will produce:

1

If we want the array to be treated as a sequence of one item, then this expression:

let $ar as item()* := [1, 2, 3]
  return
     head($ar)

will produce:

[1, 2, 3]

And if all collection types implement a given set of functions defined on generators, and thus can be considered generators, then it becomes unnecessary to have different definitions for functions such as fn:head, array:head, ..., etc., as both arrays and sequences implement the head, tail, ..., etc. functions that are defined for a generator.

To repeat again: No changes to the XDM are required in order to use OTR (Ordered Type Resolution) during expression evaluation.
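
For comparison, this is how the same value behaves in XQuery 3.1 as it stands today, without any Ordered Type Resolution. It is a minimal sketch using only fn:head and array:head, with the array namespace declared explicitly in case the processor does not predeclare it.

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

let $ar := [1, 2, 3]
return (
  head($ar),        (: [1, 2, 3]: fn:head sees a sequence of one item, the array :)
  array:head($ar)   (: 1: array:head returns the first member of the array :)
)
```

Today the choice between the two results is made by the function name, not by the declared type of $ar.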

@michaelhkay
Contributor

You're proposing a change to the type system. It's not clear whether you are proposing that selection of array:head() in preference to fn:head() should be based on the static type of the expression $ar, or the dynamic type of the value of $ar, but either way, it's a very substantial change to our specs:

  1. If you're using the static type of the expression, then we have to build static typing rules into the language
  2. If you're using the dynamic type of the value, then we have to associate the dynamic type union(array(*), item()*) with the value somehow.
  3. And either way, there's a substantial change to the rules for static function calls to make the selection of a function depend on the type of the first argument.
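
The second point can be illustrated with a hand-written dispatcher in XQuery 3.1. The function local:head below is a hypothetical helper written for this sketch and is not part of any specification. It chooses between array:head and fn:head from the dynamic type of the value alone.

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: Hypothetical helper: dispatch on the dynamic type of the argument :)
declare function local:head($c as item()*) as item()* {
  if ($c instance of array(*))
  then array:head($c)   (: exactly one item, and it is an array :)
  else head($c)         (: any other sequence :)
};

(
  local:head([1, 2, 3]),         (: 1 :)
  local:head((1, 2, 3)),         (: 1 :)
  local:head(([1, 2, 3], [4]))   (: [1, 2, 3] :)
)
```

A caller who wants the single array [1, 2, 3] treated as a one-item sequence cannot say so here, because the value carries no record of whether it was declared as item()* or as union(array(*), item()*).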

@dnovatchev
Contributor

> You're proposing a change to the type system. It's not clear whether you are proposing that selection of array:head() in preference to fn:head() should be based on the static type of the expression $ar, or the dynamic type of the value of $ar, but either way, it's a very substantial change to our specs

Yes, the change is significant and would be very useful. It solves the huge burden of having to define essentially the same functions once for sequences, once for arrays, once for maps, ..., etc, over and over again.

Maybe even something like using the arrow operator and the treat as operator here would be more instructive:

($ar treat as array(*)) => head()

Either the proposed Ordered Type Resolution or the treat as operator can be used to achieve this.

Everyone would be grateful if we have this in the language and thus reduce the number of functions 2-3 times.

@michaelhkay
Contributor

> thus reduce the number of functions 2-3 times

I think that technically, it's reducing the number of function names, not the number of functions. You're essentially arguing for a mechanism for overloading names so they can refer to different functions depending on the context (or the arguments).

And most of the suggestions for disambiguating the function name seem to involve more complexity than just adding a namespace prefix to discriminate them. The objective surely is to come up with something that involves less complexity than the current mechanism.

@ChristianGruen
Contributor Author

ChristianGruen commented Nov 11, 2023

> Everyone would be grateful if we have this in the language and thus reduce the number of functions 2-3 times.

I don't think it's challenging for users to decide between head() or array:head() (at least, I haven't observed this in practice). It’s much more challenging to understand the subtle inherent differences between the data types, resulting in different semantics. For example, array:for-each returns an array, whereas map:for-each and fn:for-each yield a sequence (one might expect to either get a map for map:for-each, or a sequence for array:for-each).
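
The difference in result types can be seen in a short XQuery 3.1 sketch. The input values and the doubling function are made up for illustration.

```xquery
xquery version "3.1";
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(
  (: Returns an array: [2, 4, 6] :)
  array:for-each([1, 2, 3], function($x) { $x * 2 }),

  (: Returns a sequence of the two doubled values, 2 and 4;
     the order of map entries is implementation-dependent :)
  map:for-each(map { "a": 1, "b": 2 }, function($k, $v) { $v * 2 }),

  (: Returns a sequence: 2, 4, 6 :)
  for-each((1, 2, 3), function($x) { $x * 2 })
)
```

Only the array variant preserves the container type of its input.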

Talking about array:members-at, I understood the expectation would be (@dnovatchev am I right?) that it returns an array, whereas array:members is currently defined to return value records (i.e., maps) – and array:get returns a sequence…

@dnovatchev
Contributor

> thus reduce the number of functions 2-3 times
>
> I think that technically, it's reducing the number of function names, not the number of functions. You're essentially arguing for a mechanism for overloading names so they can refer to different functions depending on the context (or the arguments).

If we no longer care about these "different functions", then they would be internal to the implementation and the user would work only with the functions that are explicitly provided. So indeed, this reduces the number of user-visible functions.

> And most of the suggestions for disambiguating the function name seem to involve more complexity than just adding a namespace prefix to discriminate them. The objective surely is to come up with something that involves less complexity than the current mechanism.

Complexity is inherent in everything and one can argue that "uniform complexity" is useful as it doesn't bring confusion, such as trying to remember which function in which namespace does exactly what.

@ChristianGruen
Contributor Author

Closed (see #969).


3 participants