Proposal: Add the missing functions for arrays: array:exists() and array:empty() #229

dnovatchev · 2022-10-31T19:02:26Z

The functions over sequences: fn:exists and fn:empty are amongst the most useful and well-understood sequence functions.

In case we need to know whether or not an array has at least one member, or none, these function cannot help us:

To fill this gap, these two functions are proposed:

`array:exists`:

Signature

array:exists($arg as array(*)) as xs:boolean

Properties

This function is ·deterministic·, ·context-independent·, and ·focus-independent·.

Rules

If the value of $arg is a non-empty array, the function returns true(); otherwise, the function returns false().

Examples

The expression array:exists(array:remove(["hello"], 1) returns false().

The expression aray:exists(array:remove(["hello", "world"], 1) returns true().

The expression aray:exists([]) returns false().

The expression aray:exists([ () ]) returns true().

`array:empty`:

Signature

array:empty($arg as array(*)) as xs:boolean

Properties

This function is ·deterministic·, ·context-independent·, and ·focus-independent·.

Rules

If the value of $arg is the empty array, the function returns true(); otherwise, the function returns false().

Examples

The expression array:empty(array:remove(["hello"], 1) returns true().

The expression aray:empty(array:remove(["hello", "world"], 1) returns false().

The expression aray:empty([]) returns true().

The expression aray:empty([ () ]) returns false().

The text was updated successfully, but these errors were encountered:

michaelhkay · 2022-11-01T08:48:48Z

I worry a little about the name array:exists(), it feels too much like a test as to whether the array exists, not whether it has any members.

ChristianGruen · 2022-11-01T10:22:23Z

I always felt fn:exists to be similarly confusing. On the other hand, if people know what it means, they should also understand how array:exists works (and, possibly, map:exists).

ndw · 2022-11-01T11:15:17Z

It's unfortunate that we have an inconsistency between sequences, arrays, and maps: we use fn:count to find out how many elements are in a sequence but array:size and map:size for arrays and maps.

For sequences, there's no confusion about what "not existing" means, but I see Mike's point about "exists" functions on maps and arrays. I suppose as long as we don't allow the argument to array:size and map:size to be optional, they work analagously to fn:exists and people will either find it does exactly what they expect or they'll get used to it.

It does strike me as slightly disconcerting that if you wanted to know if an optional array (or map) was non-empty, you'd have to write:

exists($arr) and array:exists($arr)

But the alternative of allowing the argument to array:exists to be optional would be differently confusing. Presumably, if we allowed the argument to array:exists to be empty, we'd have to say that array:exists(()) returned () where array:exists(array{}) returns false().

On balance, I'm in favor. They work like fn:exists and trying to come up with new names would simply be another inconsistency between sequences and maps and arrays.

OTOH, I don't find if (array:size($arr) gt 0) ... that troubling to write.

michaelhkay · 2022-11-01T11:32:29Z

Compatibility means deliberately repeating other people's mistakes.

Yes, I think fn:exists() was a mistake. It reads OK in a context like fn:exists(child::h1). But "array:exists" reads as "does the array exist?" not as "does the array have any members?"

ndw · 2022-11-01T11:37:39Z

I suppose fn:has-items could be proposed which takes an optional argument:

If the argument is empty, return false()
If the argument is a sequence of length greater than one, return true()
If the argument is a single map, return map:size($arg) gt 0
If the argument is a single array, return array:size($arg gt 0
Return true()

Whether that's a good idea or not...

Arithmeticus · 2022-11-01T13:49:32Z

+1 to latest suggestion by @ndw with some suggested alternative names for discussion (because I think array:exists is misleading):

fn:is-populated
fn:is-empty (obviously the inverse of the function)
fn:has-members

dnovatchev · 2022-11-01T14:56:03Z

In LINQ (C#) any IEnumerable has the method

Any()

There is no equivalent for fn:empty. The code just checks for : if(!collection.Any()) , and this is perfectly understandable and readable.

Thus I would prefer something similar (in decreasing order of preference):

array:any, or array:has-any, or array:has-members

For implementation I would strongly recommend not to mention array:size at all, as knowing the size is not required for this functionality, and in fact computing the size could take O(N), while this function can and should be implemented in O(1).

benibela · 2022-11-01T17:07:24Z

array:exists
The expression array:exists([ () ]) returns true().

Perhaps array:exists([ () ]) false would be better, if people want to know if there is actual data. Even array:exists([ [] ]) could be reasonable

For implementation I would strongly recommend not to mention array:size at all, as knowing the size is not required for this functionality, and in fact computing the size could take O(N), while this function can and should be implemented in O(1).

The implementation does not need to calculate the size to test array:size($arg gt 0, some optimizer could change it

dnovatchev · 2022-11-01T18:02:41Z

array:exists
The expression array:exists([ () ]) returns true().

Perhaps array:exists([ () ]) false would be better, if people want to know if there is actual data. Even array:exists([ [] ]) could be reasonable

Definitely not. The array has one member. And this empty sequence is the actual data contained in that member. An empty sequence is in fact actual data. For example, the empty sequence can be the value passed as argument in a given function call.

For implementation I would strongly recommend not to mention array:size at all, as knowing the size is not required for this functionality, and in fact computing the size could take O(N), while this function can and should be implemented in O(1).

The implementation does not need to calculate the size to test array:size($arg gt 0, some optimizer could change it

Yes, and given that some of the comments above referred to array:size, we need to not have any mentions of array:size in the specification of these two functions.

joewiz · 2022-11-01T21:40:20Z

These would be useful functions!

I agree that array:exists() and map:exists() are potentially confusing. A single catch-all fn:has-items() also risks confusion, since the data types are so different - across sequences, arrays, and maps.

I prefer data-type specific terminology like array:has-members() since according to https://www.w3.org/TR/xquery-31/#dt-member:

Definition: The values of an array are called its members.

I'd also welcome a corresponding map:has-entries() (a convenience function for map:size($map) gt 0), since according to https://www.w3.org/TR/xquery-31/#dt-entry:

Definition: Each key / value pair in a map is called an entry.

And for the empty version, I think map:empty() and array:empty() wouldn't risk confusion (shorthands for map:size($map) eq 0 and array:size($array) eq 0).

dnovatchev · 2022-11-01T23:58:16Z

@joewiz ,

These would be useful functions!

I agree that array:exists() and map:exists() are potentially confusing. A single catch-all fn:has-items() also risks confusion, since the data types are so different - across sequences, arrays, and maps.

I prefer data-type specific terminology like array:has-members() since according to https://www.w3.org/TR/xquery-31/#dt-member:

Definition: The values of an array are called its members.

👍

I'd also welcome a corresponding map:has-entries() (a convenience function for map:size($map) gt 0), since according to https://www.w3.org/TR/xquery-31/#dt-entry:

Definition: Each key / value pair in a map is called an entry.

👍

And for the empty version, I think map:empty() and array:empty() wouldn't risk confusion (shorthands for map:size($map) eq 0 and array:size($array) eq 0).

Let us not mention array:size and map:size because a for naive Xpath processor implementing this it will take O(N) time, while the proposed functions are O(1).

I think that this problem is eliminated using the following equivalent expressions, instead:

not(map:has-entries($m))

not(array:has-members($a))

ChristianGruen · 2022-11-02T09:46:04Z

I like array:has-members() and map:has-entries().

Most existing built-in functions have names without auxiliary verbs, indicating possession (has), or a copula (is), but we already have fn:has-children and we’ll soon have fn:is-NaN, and the general tendency for version 4.0 seems to be to choose a more self-explanatory wording.

I think we don’t additionally need array:empty and map:empty. If we believe we do, we could think about array:is-empty and map:is-empty, as map:empty might indicate that it returns an empty map (it reminds me of map:entry, which returns a map with a single entry).

michaelhkay · 2022-11-02T15:31:40Z

Ignoring other questions, there's nothing wrong with the specification being defined in terms of a very inefficient implementation. We do that all the time. It's a specification of what result the function delivers, it's not guidance for the implementor on how to implement it efficiently.

dnovatchev · 2022-11-02T16:10:27Z

Ignoring other questions, there's nothing wrong with the specification being defined in terms of a very inefficient implementation. We do that all the time. It's a specification of what result the function delivers, it's not guidance for the implementor on how to implement it efficiently.

@michaelhkay , Even so, maybe it is always a good idea putting a warning that literally having such implementation could be ridiculously inefficient, right?

@dnovatchev In general, I don't think implementors expect the specification writers to give them advice on how to implement the specification efficiently. We do so occasionally, e.g. in the Note at the end of XSLT 3.0 section 5.5.3 (see https://www.w3.org/TR/xslt-30/#pattern-semantics) but if we did this too often then (a) it would get very tedious, and (b) we would often get the advice wrong, since implementors are sometimes much smarter than we imagine.

ChristianGruen · 2022-11-02T16:35:05Z

I agree with Michael: I see no need to add hints for implementors to the spec. There are numerous other language features that would be very inefficient if implemented without further thoughts (think of e.g. (1 to 100000000)[position() = 1]. Personally, I would never look up the specification to get aware of the obvious pitfalls. It's usually the complex stuff that's much harder to optimize.

dnovatchev · 2022-11-02T17:17:36Z

I agree with Michael: I see no need to add hints for implementors to the spec. There are numerous other language features that would be very inefficient if implemented without further thoughts (think of e.g. (1 to 100000000)[position() = 1]. Personally, I would never look up the specification to get aware of the obvious pitfalls. It's usually the complex stuff that's much harder to optimize.

Yes, this is true, but we can at least put a short reminder in parenthesis, something like:

(note, this can be implemented with O(1) time complexity)

@dnovatchev I can envisage implementations of arrays in which determining whether the array is empty involves looking to see how many members are marked as deleted/removed. That will not be O(1). We might think that's not a good implementation, but it's not our call. The implementor might want to reduce the cost of remove() operations at the expense of increasing the cost of is-empty() - that's entirely up to them, it's not for us to judge. It's perfectly adequate to say that is-empty() (or whatever we call it) returns the same result as size()=0, and if the implementor wants to make one faster than the other, that's entirely their choice.

dnovatchev · 2022-11-02T17:50:17Z

Yes, this is true, but we can at least put a short reminder in parenthesis, something like:

(note, this can be implemented with O(1) time complexity)

@dnovatchev I can envisage implementations of arrays in which determining whether the array is empty involves looking to see how many members are marked as deleted/removed. That will not be O(1). We might think that's not a good implementation, but it's not our call. The implementor might want to reduce the cost of remove() operations at the expense of increasing the cost of is-empty() - that's entirely up to them, it's not for us to judge. It's perfectly adequate to say that is-empty() (or whatever we call it) returns the same result as size()=0, and if the implementor wants to make one faster than the other, that's entirely their choice.

@michaelhkay,

If the specification says (as suggested in your comment:

is-empty() (or whatever we call it) returns the same result as size()=0,

It would be good also to add: "but this is not a suggestion to use size() in the actual implementation"

ChristianGruen · 2022-11-02T18:57:54Z

It would be good also to add: "but this is not a suggestion to use size() in the actual implementation"

@dnovatchev All these comments could certainly be added, but to be honest, I don't understand who would benefit from such notes. We should find at least one implementor out there who believes this would be of help.

dnovatchev · 2022-11-02T19:10:53Z

It would be good also to add: "but this is not a suggestion to use size() in the actual implementation"

@dnovatchev All these comments could certainly be added, but to be honest, I don't understand who would benefit from such notes. We should find at least one implementor out there who believes this would be of help.

@ChristianGruen We could have such statement just in a single, central section of the specification, just to be known that the provided "possible implementations code" is only for instructive, definitional purposes and should not be perceived as any recommendation to implement exactly this code/algorithm.

When distinguishing between "clear" and "useful", "clear" comes before "useful" (because if something is unclear, we cannot judge whether it is useful or not)

ChristianGruen · 2022-11-02T19:28:25Z

When distinguishing between "clear" and "useful", "clear" comes before "useful"

So we should possibly find at least one implementor who believes that the specification gets more clear if hints of that kind are added.

…cg#229)

ChristianGruen · 2022-12-07T12:36:53Z

The initially proposed functions have been added to the spec (#250).

…cg#229)

dnovatchev added the XQFO An issue related to Functions and Operators label Nov 1, 2022

ChristianGruen added the Feature A change that introduces a new feature label Nov 1, 2022

dnovatchev mentioned this issue Dec 3, 2022

array bound checking #278

Closed

michaelhkay added a commit to michaelhkay/qtspecs that referenced this issue Dec 6, 2022

Rename truncate as trunk; add array:exists and array:empty (issue qt4…

a97464c

…cg#229)

ChristianGruen closed this as completed Dec 7, 2022

ndw pushed a commit to ndw/qtspecs-xslt4 that referenced this issue Dec 14, 2022

Rename truncate as trunk; add array:exists and array:empty (issue qt4…

7191085

…cg#229)

ChristianGruen mentioned this issue Nov 9, 2023

map:empty, map:exists ← array:empty, array:exists #827

Closed

ChristianGruen mentioned this issue Nov 18, 2023

Standard, array & map functions: Equivalencies #843

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Add the missing functions for arrays: array:exists() and array:empty() #229

Proposal: Add the missing functions for arrays: array:exists() and array:empty() #229

dnovatchev commented Oct 31, 2022

michaelhkay commented Nov 1, 2022

ChristianGruen commented Nov 1, 2022

ndw commented Nov 1, 2022

michaelhkay commented Nov 1, 2022

ndw commented Nov 1, 2022

Arithmeticus commented Nov 1, 2022

dnovatchev commented Nov 1, 2022

benibela commented Nov 1, 2022

dnovatchev commented Nov 1, 2022

joewiz commented Nov 1, 2022 •

edited

Loading

dnovatchev commented Nov 1, 2022 •

edited

Loading

ChristianGruen commented Nov 2, 2022 •

edited

Loading

michaelhkay commented Nov 2, 2022

dnovatchev commented Nov 2, 2022 •

edited by michaelhkay

Loading

ChristianGruen commented Nov 2, 2022

dnovatchev commented Nov 2, 2022 •

edited by michaelhkay

Loading

dnovatchev commented Nov 2, 2022

ChristianGruen commented Nov 2, 2022

dnovatchev commented Nov 2, 2022

ChristianGruen commented Nov 2, 2022

ChristianGruen commented Dec 7, 2022

Proposal: Add the missing functions for arrays: array:exists() and array:empty() #229

Proposal: Add the missing functions for arrays: array:exists() and array:empty() #229

Comments

dnovatchev commented Oct 31, 2022

array:exists:

array:empty:

michaelhkay commented Nov 1, 2022

ChristianGruen commented Nov 1, 2022

ndw commented Nov 1, 2022

michaelhkay commented Nov 1, 2022

ndw commented Nov 1, 2022

Arithmeticus commented Nov 1, 2022

dnovatchev commented Nov 1, 2022

benibela commented Nov 1, 2022

dnovatchev commented Nov 1, 2022

joewiz commented Nov 1, 2022 • edited Loading

dnovatchev commented Nov 1, 2022 • edited Loading

ChristianGruen commented Nov 2, 2022 • edited Loading

michaelhkay commented Nov 2, 2022

dnovatchev commented Nov 2, 2022 • edited by michaelhkay Loading

ChristianGruen commented Nov 2, 2022

dnovatchev commented Nov 2, 2022 • edited by michaelhkay Loading

dnovatchev commented Nov 2, 2022

ChristianGruen commented Nov 2, 2022

dnovatchev commented Nov 2, 2022

ChristianGruen commented Nov 2, 2022

ChristianGruen commented Dec 7, 2022

`array:exists`:

`array:empty`:

joewiz commented Nov 1, 2022 •

edited

Loading

dnovatchev commented Nov 1, 2022 •

edited

Loading

ChristianGruen commented Nov 2, 2022 •

edited

Loading

dnovatchev commented Nov 2, 2022 •

edited by michaelhkay

Loading

dnovatchev commented Nov 2, 2022 •

edited by michaelhkay

Loading