Standard, array & map functions: Equivalencies #843

Closed
ChristianGruen opened this issue Nov 18, 2023 · 19 comments
Labels
Propose Closing with No Action The WG should consider closing this issue with no action Tests Needed Tests need to be written or merged XQFO An issue related to Functions and Operators

Comments

@ChristianGruen
Contributor

ChristianGruen commented Nov 18, 2023

In many threads (#135, others), we have discussed how to align the functions for sequences, arrays, and maps. This is an attempt to summarize the status quo, and I hope to keep it up-to-date in the coming weeks.

The 4.0 functions are the ones with the keyword new attached. If the function is followed by a question mark, there may be an existing issue for its addition, or it may be consistent to add it.

Please note that the data types have fundamental differences, so it’s not always possible to present or provide exact symmetries.

To be discussed

| Functions | Array Functions | Map Functions |
|---|---|---|
| fn:contains-subsequence new #94, #844 | array:contains-subarray ? | map:contains |
| fn:ends-with-subsequence new #96, #844 | array:ends-with-subarray ? | |
| fn:starts-with-subsequence new #96, #844 | array:starts-with-subarray ? | |
| fn:distinct-values | array:distinct-members ? | |
| fn:duplicate-values new #123 | array:duplicate-members ? | |
| fn:empty | array:empty new #229 | map:empty ? #827 |
| fn:exists | array:exists new #229 | map:exists ? #827 |
| fn:every new | array:every ? | map:every ? |
| fn:some new | array:some ? | map:some ? |
| fn:highest new | array:highest ? | |
| fn:lowest new | array:lowest ? | |
| fn:index-of | array:index-of ? #260 | |
| fn:index-where new | array:index-where new #114 | map:keys($m, $pred) new #467 |
| fn:items-at new #213 → fn:get ? #872 | array:members-at ? #825; array:get | map:get |
| fn:intersperse new #2 → fn:join ? #868 | array:join | |
| fn:subsequence-where new #878 | array:subarray-where ? | |
| fn:substitute ? #553, #583 | array:replace new; array:substitute ? #583 | map:replace new; map:substitute ? #583 |
| fn:slice new | array:slice new | |
| | array:split new | map:entries new |
| | array:values new | map:keys; map:values new |
| | array:entries ? #826 | map:entries new |
| | array:merge ? #826 | map:merge |
| | | map:entry |
| | array:members new → keep? #826 | map:pairs new → keep? #826 |
| | array:of-members new → keep? #826 | map:of-pairs new → keep? #826 |
| | | map:pair new #508 → keep? #826 |

Settled

| Functions | Array Functions | Map Functions |
|---|---|---|
| fn:count | array:size | map:size |
| fn:filter | array:filter | map:filter |
| fn:fold-left | array:fold-left | |
| fn:fold-right | array:fold-right | |
| fn:for-each-pair | array:for-each-pair | |
| fn:for-each | array:for-each | map:for-each |
| fn:head | array:head | |
| fn:insert-before | array:insert-before | |
| fn:remove | array:remove | map:remove |
| fn:reverse | array:reverse | |
| fn:sort | array:sort | |
| fn:subsequence | array:subarray | |
| fn:tail | array:tail | |
| | array:put; array:append | map:put |
| fn:foot new #250 | array:foot new #250 | |
| fn:trunk new #250 | array:trunk new #250 | |
@ChristianGruen ChristianGruen added XQFO An issue related to Functions and Operators Discussion A discussion on a general topic. labels Nov 18, 2023
@Arithmeticus
Contributor

@ChristianGruen in these two excellent tables, could you somehow add a visual device to distinguish between 4.0 and non-4.0 functions? Then it will be easier to identify those functions that cannot be tampered with.

@Arithmeticus
Contributor

One challenge in a search for consistency is that of domains. The map: and array: prefixes cover specialized areas that are mutually exclusive, but not so with fn:, which is general, and may handle maps and arrays as items in any sequence.

This touches on the issue of naming conventions. When map and array are the prefix, the user gets an important cue lacking in fn. Parity may be impossible for future functions. And may be undesirable. (TBD)

Note, when the fn namespace intersects with the other two, array is always present. There aren't cases of fn intersecting with map absent a corresponding array counterpart. fn and array are proximate to each other because they both deal with ordered, non-keyed objects.

@ChristianGruen
Contributor Author

> @ChristianGruen in these two excellent tables, could you somehow add a visual device to distinguish between 4.0 and non-4.0 functions? Then it will be easier to identify those functions that cannot be tampered with.

Thanks. The 4.0 functions are the ones with the keyword new attached; if the function is followed by a question mark, there may be an existing issue for its addition, or it may be consistent to add it.

I’ve just updated the table to reflect the latest changes.

> One challenge in a search for consistency is that of domains. The map: and array: prefixes cover specialized areas that are mutually exclusive, but not so with fn:, which is general, and may handle maps and arrays as items in any sequence.

Yes, that’s a valid point. Personally, I would be happy if we deliberately chose different names for sequence functions. We should just be aware of the consequences. The discussion on fn:items-at was partly triggered by the proposal to add array:members-at (#825), which would introduce some redundancy, especially if we restrict a possible fn:get or fn:item-at function to single positions.

@ChristianGruen
Contributor Author

@dnovatchev I like your push forward to find a more general solution to the naming problem. I just don’t know what it could look like (maybe others don’t either). Would you be willing to open a new issue and make some suggestions on how we could proceed?

@Arithmeticus
Contributor

From my perspective, we should align names when we can, but we should expect and embrace the asymmetry. We should not conclude something is wrong if we cannot, nor should we presume that any new function in one area will necessarily spawn O(M * N) functions. In sum, this comment argues for preserving the status quo, making small adjustments where possible.

The fn prefix is a general category. Users have few expectations on what a given function in this namespace will do.

On the other hand, the math, map, and array prefixes are specialized categories. Every user who has some basic understanding of any one of these three domains brings a set of expectations to a given function that they do not bring either to the general category or to another specialized category.

We do this in OOP all the time. If we are working with a project that has a date object, a car object, and a thingy object, we do not expect the date object to have a speed method, nor the car object to have a time zone property, and we do not know what kinds of properties or methods a thingy object would have. The nature of the object drives its properties and methods.

I think the underlying principles we have used to date are sound. fn governs the large swathe of functions that accept sequences of items of various types. Whenever a specialized category is introduced, the prefix + names reflect the peculiarities of that domain. The number of functions added to a specialized category should be limited, and not attempt to replicate the inventory of general category functions. Vice versa, no specialized function should necessarily spawn an fn counterpart.

In sum, I don't think there's a significant problem here. We can make slight alignments/adjustments where possible.

Anyone who doesn't like the standard function names and arity can write their own function library, refactoring, regrouping, and re-aliasing the functions as they wish, and share the results with developers of like mind. I've done a bit of this myself in the past. But it need not occupy WG time and energy.

@ChristianGruen
Contributor Author

ChristianGruen commented Dec 18, 2023

Thanks, Joel. I completely agree. My hope is that we can go through the list together in the meeting and decide which functions we want to keep as is, and which of the functions tagged with question marks we want to add or not. Hopefully, by simple majority vote, this won't take longer than 15 minutes. As a result, we may be able to close several other open issues.

@dnovatchev
Contributor

> Anyone who doesn't like the standard function names and arity can write their own function library, refactoring, regrouping, and re-aliasing the functions as they wish, and share the results with developers of like mind. I've done a bit of this myself in the past. But it need not occupy WG time and energy.

💯

@ChristianGruen
Contributor Author

One general question here is:

Do we need all the complementing functions for arrays that don’t exist yet (array:contains-subarray, array:index-of, etc.), or should we rather focus on sequences and expect users to write their own functions?

@michaelhkay
Contributor

I would be quite happy to drop those that appear to have little value, or that make little sense in the context of arrays. For example I don't believe array:index-of() makes much sense because there's a wide choice of possible ways of comparing array members, none of which is an obvious default; it makes more sense to rely on an array:index-where() function where specifying the comparison function is mandatory.

@ChristianGruen
Contributor Author

My minimized proposal would be to…

  • Add map:empty
  • Drop array:exists
  • Add array:index-of (see array:index-of #260 (comment))
  • Change map:keys($m, $pred) to map:keys-where($m, $pred) (analogous to index-where, subsequence-where, etc).

…and skip the addition of equivalent array functions, at least in the scope of this issue.

Suggestions are welcome.

@michaelhkay
Contributor

  • map:empty - fine.
  • drop array:exists - fine
  • add array:index-of - what equality operator are you proposing to use? Will it atomize? How compatible with fn:index-of will it actually be? Because there are many possible ways of testing equality of sequences, I feel that array:index-where is a lot more explicit and flexible.
  • rename map:keys#2 - fine.

@ChristianGruen
Contributor Author

Thanks.

>   • add array:index-of - what equality operator are you proposing to use? Will it atomize?

I would define it as suggested in the referenced comment (#260 (comment)).

@dnovatchev
Contributor

We need some powerful way to examine arrays: something that combines the proposed-for-removal array:exists and array:empty.

I would love to have:

array:where($input as array(*), $predicate as fn(array(*)) as xs:boolean) as array[*]*

And also:

array:any($input as array(*), $predicate as fn(array(*)) as xs:boolean) as xs:boolean

and:

array:count($input as array(*), $predicate as fn(array(*)) as xs:boolean) as xs:integer

It is not a coincidence that LINQ provides very similar functions to these.

@michaelhkay
Contributor

What do these functions actually do? We can't guess from their names.

How does array:where(A, P) differ from array:filter(A, P)? Why does the predicate take an array as its argument?

Is array:count(A, P) simply array:size((array:filter(A, P)))?

Is array:any(A, P) simply array:count(A, P) ne 0?
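
The equivalences asked about here can be sketched in plain XQuery 3.1. This is only an illustration under assumptions: `local:array-count` and `local:array-any` are hypothetical helper names, not functions from any specification, and the predicate is assumed to receive one array member (`item()*`) at a time, as `array:filter` does.

```xquery
xquery version "3.1";

(: Hypothetical helper, not a spec function:
   number of members for which the predicate holds. :)
declare function local:array-count(
  $input     as array(*),
  $predicate as function(item()*) as xs:boolean
) as xs:integer {
  array:size(array:filter($input, $predicate))
};

(: Hypothetical helper, not a spec function:
   true if at least one member satisfies the predicate. :)
declare function local:array-any(
  $input     as array(*),
  $predicate as function(item()*) as xs:boolean
) as xs:boolean {
  local:array-count($input, $predicate) ne 0
};

(: Four members: 1, the empty sequence, the sequence (2, 3), and 4. :)
let $a := [1, (), (2, 3), 4]
let $non-empty := function($member as item()*) as xs:boolean { exists($member) }
return (
  local:array-count($a, $non-empty),
  local:array-any($a, $non-empty)
)
```

With this input, three of the four members are non-empty, so the query returns `3` followed by `true`. Written this way, `local:array-any` always filters the whole array; a dedicated function could stop at the first match.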

@michaelhkay
Contributor

> I would define array:index-of as suggested in the referenced comment (#260 (comment)).

That comment proposes to use fn:deep-equal for comparisons. Since the primary motivation for introducing array:index-of is for symmetry with fn:index-of, I think it would be rather confusing if array:index-of(array{1 to 10}, @data) returns false, while fn:index-of(1 to 10, @data) returns true. If we can't produce a specification that has sufficiently similar semantics, I would prefer to stick with array:index-where().
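
A small XQuery 3.1 sketch of the mismatch, under assumptions: `local:index-of-deep` is a hypothetical stand-in for an `array:index-of` defined via `fn:deep-equal`, and an element node with string content is used in place of `@data` so that the query needs no context item.

```xquery
xquery version "3.1";

(: Hypothetical helper, not a spec function: positions of the members
   of $array that are deep-equal to $search, with no atomization. :)
declare function local:index-of-deep(
  $array  as array(*),
  $search as item()*
) as xs:integer* {
  for $pos in 1 to array:size($array)
  where deep-equal($array($pos), $search)
  return $pos
};

let $search := <data>b</data>
return (
  (: fn:index-of atomizes $search; the untyped value is compared as a string. :)
  index-of(("a", "b", "c"), $search),
  (: deep-equal compares the string "b" with an element node; they never match. :)
  local:index-of-deep(["a", "b", "c"], $search)
)
```

Here `index-of` finds the value at position 2, while the deep-equal variant returns the empty sequence for the same search node, so the whole query yields just `2`.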

@dnovatchev
Contributor

> What do these functions actually do? We can't guess from their names.
>
> How does array:where(A, P) differ from array:filter(A, P)? Why does the predicate take an array as its argument?

Yes, it seems to be equivalent to array:filter, and the predicate in all the 3 signatures needs to be:

function(item()*) as xs:boolean

As we have started using the word "where" in other function names, we could continue this systematic naming approach, and use the same word "where" here.

> Is array:count(A, P) simply array:size((array:filter(A, P)))?

Yes, but it is half as long, and half as error-prone and time-consuming. It is also considerably easier to understand, and thus significantly more readable.

> Is array:any(A, P) simply array:count(A, P) ne 0?

Yes, but it is half as long, and half as error-prone and time-consuming. It is also considerably easier to understand, and thus significantly more readable.

Also, array:count() seems to be O(n) while array:any is O(1).

@ChristianGruen
Contributor Author

> Since the primary motivation for introducing array:index-of is for symmetry with fn:index-of, I think it would be rather confusing if array:index-of(array{1 to 10}, @data) returns false, while fn:index-of(1 to 10, @data) returns true.

I can see what your concerns are. We could possibly atomize the search items before applying deep-equal. I believe we can live with the remaining edge cases (like comparing NaN), as the existing incompatibilities (between eq, atomic-equal, deep-equal) are much harder to grasp.

It's true that array:index-where is more flexible, but it's also more complicated and verbose. Calling array:index-of($array, 5) is much easier to understand and more readable than things like array:index-where($array, fn { . = 5 }), ...deep-equal(?, 5), or even ...op('=')(?, 5).

@michaelhkay
Contributor

Perhaps then array:index-of() should have signature array:index-of(array(xs:anyAtomicType), xs:anyAtomicType) and array:index-of($A, $V) should return fn:index-of($A?*, $V).
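
Read literally, that definition could be written as the following XQuery 3.1 sketch. The name `local:array-index-of` is a hypothetical stand-in for the proposed function, and the `xs:integer*` return type is an assumption carried over from `fn:index-of`.

```xquery
xquery version "3.1";

(: Hypothetical stand-in for the proposed array:index-of.
   Every member is a single atomic value, so a position in the
   flattened sequence $array?* is also the member's position. :)
declare function local:array-index-of(
  $array as array(xs:anyAtomicType),
  $value as xs:anyAtomicType
) as xs:integer* {
  index-of($array?*, $value)
};

local:array-index-of([10, 20, 10, 30], 10)
```

The call returns the positions `1` and `3`. Restricting members to single atomic values is what makes the delegation safe: with sequence-valued or empty members, positions in `$array?*` would no longer line up with member positions.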

ChristianGruen added a commit to ChristianGruen/qtspecs that referenced this issue Jan 24, 2024
@ChristianGruen ChristianGruen added PR Pending A PR has been raised to resolve this issue Tests Needed Tests need to be written or merged labels Jan 24, 2024
ChristianGruen added a commit to ChristianGruen/qtspecs that referenced this issue Feb 5, 2024
@ChristianGruen ChristianGruen added Propose Closing with No Action The WG should consider closing this issue with no action and removed Discussion A discussion on a general topic. PR Pending A PR has been raised to resolve this issue labels Feb 23, 2024
@ndw
Contributor

ndw commented Feb 27, 2024

The CG decided to close this issue without further action at meeting 067.

@ndw ndw closed this as completed Feb 27, 2024

5 participants