fn:sort, and XSLT and XQuery sorting, should use transitive comparisons #866
Comments
It's so tempting to say we should be using a transitive order relation for everything including the "eq" and "lt" operators. But this causes problems: currently the decimal 1.2 and the double 1.2e0 compare equal, and they would then compare non-equal. Because the simple notation "1.2" represents a decimal, while atomization and arithmetic operators default to producing a double, unintentional comparisons between decimals and doubles are very common.
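To make the decimal-versus-double point concrete, here is a minimal sketch in Python rather than XPath. It only models the two behaviours being contrasted: `eq_promoting` and `eq_exact` are hypothetical helper names, with `float` standing in for xs:double and `Decimal` for xs:decimal.

```python
from decimal import Decimal
from fractions import Fraction


def eq_promoting(a, b):
    """Model of 'eq' on mixed numerics: promote both operands to double."""
    return float(a) == float(b)


def eq_exact(a, b):
    """Model of a transitive comparison: compare the exact mathematical values."""
    return Fraction(a) == Fraction(b)


dec = Decimal("1.2")  # stands in for the xs:decimal literal 1.2
dbl = 1.2             # stands in for the xs:double literal 1.2e0

print(eq_promoting(dec, dbl))  # True: both sides become the same double
print(eq_exact(dec, dbl))      # False: the double is not exactly 6/5
```

Under the exact comparison the two literals stop being equal, which is the compatibility problem described above.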
Specific proposal: I propose that in XSLT The behaviour of the
Note in passing: the 3.1 specifications (and the current 4.0 drafts) for fn:min and fn:max have a "properties" section that refers to the arity-0 and arity-1 forms of these functions, which is nonsense; the available arities are 1 and 2.
I definitely see the need to make the functions transitive. We’d then have I may be overlooking something, but couldn’t we simply drop
…and replaced with:
…with

declare function normalize(
  $item as xs:anyAtomicType,
  $options as map(*)
) as xs:anyAtomicType {
  (: Only strings are normalized; other atomic values pass through unchanged. :)
  if ($item instance of xs:string) then (
    $item
    ! (if ($options?whitespace = "normalize") then normalize-unicode(., $options?normalization-form) else .)
    ! (if ($options?normalize-space) then normalize-space(.) else .)
  ) else $item
}

The signature of

fn:compare(
  $value1 as xs:anyAtomicType?,
  $value2 as xs:anyAtomicType?,
  $collation as xs:string? := fn:default-collation()
) as xs:integer?

For types other than strings, we could adopt the rules from Another positive effect would be that
Firstly, atomic-equal was specifically designed to be context-free, which makes it messy to combine it into a function that uses collations. We could generalise atomic-equal to do an ordered comparison over all data types, in which case it could embrace numeric-compare, but that's providing functionality that I'm not sure we need: we're happy, I think, for sorting to be context-dependent.

Folding fn:numeric-compare into fn:compare is more feasible, but you've then got one function that does two different jobs; there's no type safety to ensure that the arguments have compatible types, and you need ad-hoc rules to say which combinations of arguments are valid and which aren't. The merit of two separate functions is that each is a total function over the domain implied by its signature.

Another option is to make numeric-compare private (i.e. put it in the "op" namespace). I put it in the fn namespace because I think there are valid use-cases for applications to use it directly; that will be especially true if we introduce fn:sort-with.

I did wonder while writing this whether we should combine the op:xxx-less-than and op:xxx-equal functions for other data types into op:xxx-compare functions, but I think that's an unrelated task, and is purely editorial.

As for deep-equal, I wish we could separate it into two functions, one which compares items and one which compares sequences taking an item-comparison function as an argument; but again that's a separate project, and one that is hampered by compatibility concerns.
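As a rough illustration of what "a total function over the domain implied by its signature" could look like for numerics, here is a Python sketch of a three-way comparison. It is a model, not the proposed fn:numeric-compare: the name `numeric_compare`, the choice to sort NaN below every other value, and the restriction to finite `Decimal` inputs are all assumptions made for the example.

```python
import math
from decimal import Decimal
from fractions import Fraction
from functools import cmp_to_key


def numeric_compare(a, b):
    """Hypothetical model: return -1, 0 or 1 for any int, float or finite Decimal."""
    def key(x):
        # Assumption: NaN equals itself and sorts below every other value.
        if isinstance(x, float) and math.isnan(x):
            return (0, 0)
        # Infinities sit outside the range of all finite values.
        if isinstance(x, float) and math.isinf(x):
            return (1, 0) if x < 0 else (3, 0)
        # Finite values are compared by their exact mathematical value.
        return (2, Fraction(x))

    ka, kb = key(a), key(b)
    return (ka > kb) - (ka < kb)


values = [1.2, Decimal("1.2"), 1, float("nan"), float("-inf")]
print(sorted(values, key=cmp_to_key(numeric_compare)))
```

Because every pair of inputs yields a result and the ordering is derived from a single key, the comparison is transitive, so it is safe to hand to a sort routine.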
I can’t agree here; those seem technical concerns to me that can be solved (similar to arithmetic expressions, for example, ad-hoc rules are only necessary for those cases in which static types are not available). From a user perspective, it’s not intuitive at all to have multiple compare functions, i.e., to have Apart from the already mentioned advantages for a generalized
I see. Perhaps
This reverts commit 6a97d00.
We have addressed the question of non-transitivity of equality matching in distinct-values(), and in XSLT and XQuery grouping, but the same issue exists for sorting. Currently fn:sort, as well as XSLT and XQuery sorting, rely on the "lt" operator for comparing values including mixed numerics such as doubles and decimals. Because this promotes to double, it is capable of losing precision, and is therefore non-transitive. Most sort algorithms rely on the supplied comparison function being transitive, and if it isn't, then undefined failures may occur including non-termination.

One particular quirk (which led me here) is that fn:highest and fn:lowest start by using fn:sort semantics to put the values in order, and then rely on fn:deep-equal semantics to find the values that are "equal highest" or "equal lowest". But fn:sort and fn:deep-equal have different ways of deciding whether two values are equal: decimal 1.2 and double 1.2 are equal for fn:sort, but not for fn:deep-equal.
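The non-transitivity can be reproduced with three values. The sketch below is a Python model of the promotion rule, not XPath itself: `lt_promoting` and `equivalent` are hypothetical helpers, with `Decimal` standing in for xs:decimal and `float` for xs:double. It assumes decimal-to-decimal comparisons are exact and mixed comparisons promote both operands to double.

```python
from decimal import Decimal
from fractions import Fraction


def lt_promoting(a, b):
    """Model of 'lt': decimal/decimal is exact, mixed operands promote to double."""
    if isinstance(a, Decimal) and isinstance(b, Decimal):
        return a < b
    return float(a) < float(b)


def equivalent(a, b):
    """Two values tie for sorting when neither is less than the other."""
    return not lt_promoting(a, b) and not lt_promoting(b, a)


a = Decimal("0.1")  # decimal one tenth
b = 0.1             # double nearest to one tenth (slightly above it)
c = Decimal(b)      # decimal holding the double's exact binary value

# a ties with b, and b ties with c, yet a is strictly less than c.
print(equivalent(a, b), equivalent(b, c), lt_promoting(a, c))  # True True True

# Comparing exact values gives one consistent order: a first, then b and c tied.
ordered = sorted([c, b, a], key=Fraction)
print(ordered[0])  # 0.1
```

A sort routine given `lt_promoting` could legitimately place these three values in more than one order, whereas the exact key leaves no such ambiguity.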