Function identity: documentation, nondeterminism #908
Comments
It means that the property is not exposed to users; it's merely there to underpin the semantics of comparing two functions with each other.
Indeed, that's exactly what we are trying to do.
Not sure I understand the question.
§4.6.2.4 (Named function references) states: "Otherwise [if the function is not context-dependent], a function identity that is the same as that produced by the evaluation of any other named function reference with the same function name and arity." So yes, true#0 and true#0 will always be identical.
It says they are not guaranteed to be identical. But there is a let-out clause (in §4.6.2.4): "Optimizers, however, are allowed to detect cases where the captured context happens to be the same, or where any variations are immaterial, and where it is therefore safe to return the same function item each time." XQFO 17.1.1 refers to this: "The function identity is determined in the same way as for a named function reference."
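To make the two rules for named function references concrete, here is a minimal Python model of a processor that performs no optimization. It is only a sketch: `FunctionItem`, `named_function_ref` and `same_identity` are hypothetical names, and the `context_dependent` flag is supplied by the caller rather than derived from the function signature.

```python
import itertools

_identity_counter = itertools.count(1)
_shared_items = {}  # (name, arity) -> the single shared function item


class FunctionItem:
    """Hypothetical function item: a name, an arity, and an opaque identity."""

    def __init__(self, name, arity, captured_context=None):
        self.name = name
        self.arity = arity
        self.captured_context = captured_context
        self.identity = next(_identity_counter)


def named_function_ref(name, arity, context_dependent=False, context=None):
    """Evaluate name#arity in a processor that performs no optimization."""
    if context_dependent:
        # The context is captured, so each evaluation gets a fresh identity.
        return FunctionItem(name, arity, captured_context=context)
    # Not context-dependent: every reference shares one identity.
    key = (name, arity)
    if key not in _shared_items:
        _shared_items[key] = FunctionItem(name, arity)
    return _shared_items[key]


def same_identity(f, g):
    """Two function items are identical only if their identities match."""
    return f.identity == g.identity
```

In this model, two evaluations of true#0 always share an identity, while two evaluations of name#0 never do. An optimizing processor would be free to return the same item for name#0 where it can prove the captured context is immaterial; the model simply never takes that option.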
You're referring, I think, to the note in XDM: "Currently, the concept of function identity is used for two purposes: firstly, when functions appear in the arguments supplied to the fn:deep-equal function; and secondly, in establishing whether the arguments and results of a function are "the same" when deciding whether the function is deterministic." We could add cross-references to the two places mentioned if you think it would help.
It means "must return functions having the same identity". (Talking about multiple things having the same identity is always tricky, because if two things have the same identity then they are really one thing, not two, and use of a plural noun is inappropriate. We have this problem everywhere with node identity, e.g. when we talk about removing duplicate nodes.)
A more realistic example is
Thanks. I managed to ignore what the XPath spec says about function identities. The reason was that XQFO, 14.2.8 fn:deep-equal points to the XDM spec…
…which doesn’t provide the promised explanation. Maybe the XPath spec could be referenced here instead?
I still haven’t understood what “deterministic” is supposed to say here. How can it be “decided” whether a function is deterministic, if determinism is a fixed function property? Maybe we should choose a different term?
I see. Maybe with a link to the definition: “must return functions with the same function identity”?
Two more observations: XPath, 4.6.2.4 Named Function References states that:
…whereas XPath, 4.6.2.7 Function Identity says:
XQFO, 1.8.4 Properties of Functions defines identical values (nodes) as follows:
…which doesn’t match for nondeterministic function calls like
My next question is about 4.6.2.4 Named Function References. It says:
Doesn’t
My first suggestion is probably nonsense; otherwise, a nondeterministic function would not be identical to itself anymore. In either case, I think we cannot maintain the definition for identical functions that “any function call with identical arguments will produce an identical result.” This can only be true for deterministic functions. The definition “results that are deep-equal to each other” would still not be applicable to external nondeterministic functions like
@michaelhkay I’m sorry, I’m still struggling to properly and comprehensively implement the comparison of function items. I wanted to contribute more test cases, but chances are high that they might be wrong. Some questions:
A) Is the following code required to return
<x/>
! name#0
! deep-equal(., .)
B) Is the comparison of two identical, but context-dependent, functions allowed to return
deep-equal(
  <xml/> ! id#1,
  <xml/> ! id#1
)
C) For partial function applications, the rule is “A new function identity distinct from the identity of any other function item.”, but deep-equal requires the functions to have the same identity. Is the following code still allowed to return
let $xml := <xml/>
return deep-equal(
  id(?, $xml),
  id(?, $xml)
)
D) If a nondeterministic function is repeatedly invoked, it may return different results w.r.t. the order, or node identity (if we consider external libraries, the result itself may be completely different). Maybe we should treat context-dependent and nondeterministic functions equally, and require the following code to return
deep-equal(
  analyze-string('a', ?),
  analyze-string('a', ?)
)
There is only one function item involved here. Therefore both operands are functions having the same identity, therefore the result must be true.
The spec for named function references says that: "Optimizers, however, are allowed to detect cases where the captured context happens to be the same, or where any variations are immaterial, and where it is therefore safe to return the same function item each time." This applies here: the two named function references are evaluated with different dynamic contexts, but a smart optimizer is allowed to determine that in both cases the result of the relevant function will always be an empty sequence (because neither of the two documents contains any ID values); and therefore both
If the expression were
then no such optimization would be possible, because the two function items are different: consider
which returns false.
The spec for partial function application refers to §4.6.2.7 which states:
This applies here. An optimizer that identifies common subexpressions is allowed to rewrite
as
which evaluates to true.
Again, the rules in §4.6.2.7 say that the partial function application can be treated as a common subexpression and evaluated once to produce the same function item each time.
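The effect of treating a partial function application as a common subexpression can be sketched in Python. This is an illustrative model only: `partial_apply`, `evaluate_naively` and `evaluate_with_cse` are hypothetical names, function items are plain dictionaries, and the string "xml" merely stands in for the bound node.

```python
import itertools

_identity_counter = itertools.count(1)


def partial_apply(name, bound_args):
    """Model a partial function application: each evaluation yields a fresh identity."""
    return {"name": name, "bound": bound_args, "identity": next(_identity_counter)}


def deep_equal_functions(f, g):
    """Function items compare equal only if they have the same identity."""
    return f["identity"] == g["identity"]


def evaluate_naively():
    # Two separate evaluations produce two distinct function items.
    return deep_equal_functions(
        partial_apply("id", ("xml",)),
        partial_apply("id", ("xml",)),
    )


def evaluate_with_cse():
    # The optimizer evaluates the common subexpression once and reuses the item.
    shared = partial_apply("id", ("xml",))
    return deep_equal_functions(shared, shared)
```

Under these assumptions `evaluate_naively()` yields false and `evaluate_with_cse()` yields true, which is why the same source expression may legitimately give either answer depending on how aggressively a processor optimizes.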
I see. So I assume that “any variations are immaterial” implies that the result of an invoked function item will be “the same” as the evaluation of another function item with the same name and arity.
So “the same” means that both I would then assume that
I believe this contrasts with the example further above:
Yes. I agree it's not a mathematically rigorous formulation, but that's the intended reading.
Yes, for the moment we're stuck with the fact that node constructors are guaranteed to return a different node on each evaluation. We could consider changing that, but it's a separate topic.
Yes.
Not so. The spec for
Thanks, things are getting clearer. This means that…
In a nutshell:
If it helps, the Saxon implementation is to treat two function items as deep-equal if and only if they are represented internally by the same Java object. For that to work we have to make sure that a function reference like count#1 always delivers the same Java object. But beyond that, we simply rely on the fact that (a) two expressions (or expression evaluations) will only return the same Java object if the compiler has already decided the two expressions are equivalent, in which case we are allowed to treat the functions as identical, and (b) the only case where the spec requires two expressions to return identical function items is the
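The object-identity strategy described in this comment can be approximated in a few lines of Python. This is a sketch of the idea, not Saxon's code: `FunctionItem`, `function_ref` and `deep_equal_items` are hypothetical names, and plain Python values stand in for atomic values.

```python
from functools import lru_cache


class FunctionItem:
    """Hypothetical function item; its identity is the host object's identity."""

    def __init__(self, name, arity):
        self.name = name
        self.arity = arity


@lru_cache(maxsize=None)
def function_ref(name, arity):
    """Return the one shared object for name#arity (for example count#1)."""
    return FunctionItem(name, arity)


def deep_equal_items(a, b):
    """Compare two items: functions by object identity, other values by value."""
    if isinstance(a, FunctionItem) or isinstance(b, FunctionItem):
        return a is b
    return a == b
```

The cache guarantees that repeated references to count#1 deliver the same object, so `deep_equal_items(function_ref("count", 1), function_ref("count", 1))` holds, whereas two independently constructed items with the same name and arity do not compare equal.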
Yes, that’s helpful. We proceed differently:
deep-equal(<e/> ! name#0, <e/> ! name#0)
…will be rewritten to…
deep-equal(fn() { name(<e/>) }, fn() { name(<e/>) })
→ deep-equal(fn() { 'e' }, fn() { 'e' })
…which means we have no information at runtime whether the original expression was a named function reference. It seems we’ll either need to register all inline functions as constants or compare the parameters and function bodies at runtime (which is a bit tricky, as the parameter names may change during compilation), so I’ll need to give the implications more thought.
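One way to compare parameters and function bodies at runtime, without being misled by parameter names that changed during compilation, is to rename parameters positionally before comparing. The Python sketch below is purely illustrative and does not describe any existing implementation: `InlineFunction`, `normalize` and `functions_equal` are hypothetical names, and the body is modelled as a flat token tuple, which ignores nested scopes and variable shadowing.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class InlineFunction:
    """Hypothetical compiled inline function."""
    params: tuple  # parameter names, e.g. ("$x",)
    body: tuple    # body as a flat token sequence (simplified representation)


def normalize(fn):
    """Rename parameters by position so compiler-chosen names do not matter."""
    renaming = {p: f"$p{i}" for i, p in enumerate(fn.params)}
    return tuple(renaming.get(token, token) for token in fn.body)


def functions_equal(f, g):
    """Equal if the same object, or same arity and same normalized body."""
    if f is g:
        return True
    return len(f.params) == len(g.params) and normalize(f) == normalize(g)
```

For example, `InlineFunction(("$x",), ("name", "(", "$x", ")"))` and the same function written with `$y` compare equal here, while functions with a different arity or body do not. A real implementation would have to compare expression trees rather than tokens.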
We’ve found a way to treat different kinds of functions as equal (in short, by applying more rewrites at compile time, and comparing the Object identity or arity/function body at runtime). My assessment is that we can’t expect all implementations to behave identically to Saxon (or it can be very difficult to simulate its behavior). In particular, …
…we should allow
I think that the result should be more intuitive.
While I’ve now understood the implications of the current definition of function identity, it could still be challenging for future implementors. As I cannot offer a particular suggestion on how to improve the status quo, I propose to close the issue.
The CG agreed to close this issue without further action at meeting 079.
In #520, the concept of function identities was introduced. This is what the current draft says:
XDM, 2.9.4 Function Items
XQFO, 1.8.4 Properties of functions
XQFO, 14.2.8 fn:deep-equal
XQFO, 17.1.1 fn:function-lookup
While I definitely believe in the concept, I believe the documentation is still cryptic, or even impossible, to understand, at least without reading #520 or consuming the existing QT4 test cases. Here are some questions that I’m trying to answer:
true#0 and true#0 always be identical? The test cases imply this, whereas Function identity #520 doesn’t.
name#0 and name#0 cannot be equal? Or can, or will, they be equal if the context is identical?
map:entries and fn:parse-xml can be “the same” if the parameters are the same.
fn:parse-xml)?
deep-equal(name#0, name#0) and deep-equal(name#1, name#1) returned true.
I’m sorry for not offering good answers in return. I could try to describe what we’ve implemented so far – mostly inspired by the test cases – but I’m not sure if it meets the requirements.
Related: #333