Function Identity: Treating function items with identical bodies #1235

ChristianGruen · 2024-05-22T08:56:41Z

(One) requirement resulting from #908 (in particular, #908 (comment)):

The following examples reflect the status quo:

#	Function	Result
1	`deep-equal(<a/>, <a/>)`	`true`
2	`let $f := fn { <a/> } return deep-equal($f, $f)`	`true`
3	`deep-equal(fn { 1 }, fn { 1 })`	`true` or `false`
4	`deep-equal(fn { <a/> }, fn { <a/> })`	`false`

Notes

The result is true (regardless of the identity of the compared nodes).
The result is true (regardless of the identity of the nodes that result from evaluating the functions)
The result is allowed to be true or false, depending on the optimization strategies of a processor.
Only in this case, the result must currently be false.

We should allow deep-equal to return true for function items that have the same arguments and bodies. An implementation should be allowed to use the same internal representation for multiple occurrences of such functions.

The text was updated successfully, but these errors were encountered:

MarkNicholls · 2024-05-22T14:37:31Z

I think that the result should be more intuitive.

yes

Is this worth it?
whats a sensible motivating use case?

(I'm cogniscant that if you ask an FPer about this, they shriek and say you're mad, and if you talk to an OPer, they say that methods are readonly, so you don't need to compare them, and 'functions' as objects, are comparable, but its not something they'd really expect anyone to do)

ChristianGruen · 2024-05-22T14:43:34Z

Is this worth it?

I’m not exactly sure what you are proposing. Would you like to preserve the status quo (as described in #908)?

MarkNicholls · 2024-05-22T14:48:51Z

I'm doing my usual irritating trick of trying to reverse the conversation 9 months, which is probably highly irritating, whilst you are trying incrementally improve the status quo.

The way to stop me doing that (apart from telling me you're not willing to do it), is to give me motivating use case that I can't escape.

I'm hoping this motivating use case though has an alternative approach that dissolves the problem of trying to (in an FPers view) solve the halting problem.

OO'ers dissolve it, by defining equality inside the domain of the type, where all 'functions' are fixed (as methods) so the issue never arises (well not much).

BUT...genuinely feel free to say..."its too late for that"

ChristianGruen · 2024-05-22T15:04:36Z

BUT...genuinely feel free to say..."its too late for that"

It may not be too late for that, but you'll probably need to spend a lot of time to digest what has happened so far.

With regard to your specific reply, I cannot really read out what you think may be worth or not worth doing. Could you be more specific?

MarkNicholls · 2024-05-22T15:13:03Z

I was part of the conversation at one point, and tried to express my concern, I gave up (or was overpowered) but your summary isn't good reading, it unintuitive, and you appear to be saying results depend on what optimisations are turned on.

the simple alternative (to allowing equals on functions) is "error"

Whats the cost of that? i.e. what motivating use case does it stifle? and is enabling that use case worth the consequence of your summary table? is it just going to be bigger headache.

the next simple alternative is "object identity" and all implementions have to respect it....its horrid...I hate functions having an identity, and if saw code comparing almost anything by object identity I'd feel concern.

the next simple alternative is elevating some functions to "methods" only allowing equality inside a type (so you dont need to compare methods as they are statically defined), and not allowing equality for functions, so yes, you can compare objects in a type if all the fields are non functions, else....error.

ChristianGruen · 2024-05-22T15:19:47Z

Thanks, I now understand you better: You would like to question the current concept of function identity in the specification general. I appreciate your concerns, but I think it is definitely out of scope here as I was focusing on the specific case that deep-equal(fn { <a/> }, fn { <a/> }) yields false, whereas deep-equal(fn { 1 }, fn { 1 }) may yield true.

I think it’s better if you create a separate issue for your general concerns. Would that be ok for you?

MarkNicholls · 2024-05-22T15:22:23Z

in the scope of this issue, my answer would be

error and error

:-)

I wont raise it as an issue. Thanks for the offer, I don't think it would get traction.

you don't need to reply, my response is just rhetorical.

ChristianGruen · 2024-05-22T15:26:33Z

you don't need to reply, my response is just rhetorical.

Thanks for your comments. For rhetorical feedback, though, I recommend you to join the discussions on Slack; my assumption is that it will be more productive for everyone involved.

michaelhkay · 2024-05-22T19:05:31Z

I'm sympathetic, but what exactly is your proposed fix?

ChristianGruen · 2024-05-22T19:52:53Z

I'm sympathetic, but what exactly is your proposed fix?

I would be happy if I could make a concrete suggestion. I have now understood how to interpret the current rules, but it would be presumptuous to say that I have grasped the interplay of all rules in full detail.

Maybe it’s the rule in 4.5.2.7 Function Identity that should be changed. It says that:

“An optimizer is permitted to rewrite expressions in such a way that repeated evaluation is avoided if it can be established that the result will be the same each time, […].”

“Same result” seems to imply that two evaluated instances of true#0 or fn { true() } are the same, but I assume that this also includes functions like fn { parse-xml('<x/>') } or fn { file:read-text('x') }, although the result may have a different node identity or may even be completely different for nondeterministic functions. At the same time, the results of two evaluated instances of fn { <a/> } seem to be regarded not to be the same because of their varying node identity. This is at least how I understood #908 (comment) and the following comments. Is this correct?

Maybe we need to clarify what “same” means:

I think it cannot mean that the results of the functions will be deep-equal to each other. Otherwise, a nondeterministic function may not even be equal to itself.
Node identity should not be relevant either. Otherwise, functions like fn:parse-xml#1 or fn { <a/> } cannot be rewritten to the same instance.

Without being able to provide a fix: Does my summary sound correct and reasonable?

michaelhkay · 2024-05-26T17:10:41Z

Perhaps rewrite the relevant paragraphs:

An optimizer is permitted to rewrite deterministic expressions in such a way that repeated evaluation is avoided, and this may be done without consideration of function identity. For example, if the expression contains(?, "e") appears within the body of a for expression, or if the same expression is written repeatedly in a query, then an optimizer may decide to evaluate it once only, and thus return the same function item each time. Similarly, if the expression fn($x){$x+1} appears more than once, or is evaluated repeatedly, then it may return the same function each time.

Similarly, optimizers are allowed to replace any expression with an equivalent expression. For example, count(?) may be rewritten as count#1, and fn($x){$x+1} may be rewritten as fn($y){$y+1}. This may lead to different expressions returning identical function items.

Link to the definition of deterministic in F&O (though it's not as rigorous as one might like).

ChristianGruen · 2024-05-26T17:51:11Z

Thanks. Sounds good to me.

Do you think there is a need to tweak the following rule further above in the text?

Functions with the same identity are indistinguishable in every way; in particular, any function call with identical arguments will produce an identical result.

It could be argued that two deep-equal nodes with a different node identity are not identical (let alone results of nondeterministic expressions). Maybe the sentence after the semicolon could simply be dropped, or we might need to concretize what “identical” means in this context.

ChristianGruen added Bug Something that doesn't work in the current specification XPath An issue related to XPath Editorial Minor typos, wording clarifications, example fixes, etc. labels May 22, 2024

ndw added PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-required Categorized as "required for 4.0" at the Prague f2f, 2024 labels Jun 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Function Identity: Treating function items with identical bodies #1235

Function Identity: Treating function items with identical bodies #1235

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 •

edited

Loading

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 •

edited

Loading

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 •

edited

Loading

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 •

edited

Loading

ChristianGruen commented May 22, 2024

michaelhkay commented May 22, 2024

ChristianGruen commented May 22, 2024

michaelhkay commented May 26, 2024

ChristianGruen commented May 26, 2024

Function Identity: Treating function items with identical bodies #1235

Function Identity: Treating function items with identical bodies #1235

Comments

ChristianGruen commented May 22, 2024

Notes

MarkNicholls commented May 22, 2024 • edited Loading

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 • edited Loading

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 • edited Loading

ChristianGruen commented May 22, 2024

MarkNicholls commented May 22, 2024 • edited Loading

ChristianGruen commented May 22, 2024

michaelhkay commented May 22, 2024

ChristianGruen commented May 22, 2024

michaelhkay commented May 26, 2024

ChristianGruen commented May 26, 2024

MarkNicholls commented May 22, 2024 •

edited

Loading

MarkNicholls commented May 22, 2024 •

edited

Loading

MarkNicholls commented May 22, 2024 •

edited

Loading

MarkNicholls commented May 22, 2024 •

edited

Loading