Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function Identity: Treating function items with identical bodies #1235

Open
ChristianGruen opened this issue May 22, 2024 · 12 comments
Open
Labels
Bug Something that doesn't work in the current specification Editorial Minor typos, wording clarifications, example fixes, etc. PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-required Categorized as "required for 4.0" at the Prague f2f, 2024 XPath An issue related to XPath

Comments

@ChristianGruen
Copy link
Contributor

(One) requirement resulting from #908 (in particular, #908 (comment)):

The following examples reflect the status quo:

# Function Result
1 deep-equal(<a/>, <a/>) true
2 let $f := fn { <a/> } return deep-equal($f, $f) true
3 deep-equal(fn { 1 }, fn { 1 }) true or false
4 deep-equal(fn { <a/> }, fn { <a/> }) false

Notes

  1. The result is true (regardless of the identity of the compared nodes).
  2. The result is true (regardless of the identity of the nodes that result from evaluating the functions)
  3. The result is allowed to be true or false, depending on the optimization strategies of a processor.
  4. Only in this case, the result must currently be false.

We should allow deep-equal to return true for function items that have the same arguments and bodies. An implementation should be allowed to use the same internal representation for multiple occurrences of such functions.

@ChristianGruen ChristianGruen added Bug Something that doesn't work in the current specification XPath An issue related to XPath Editorial Minor typos, wording clarifications, example fixes, etc. labels May 22, 2024
@MarkNicholls
Copy link

MarkNicholls commented May 22, 2024

I think that the result should be more intuitive.

yes

Is this worth it?
whats a sensible motivating use case?

(I'm cogniscant that if you ask an FPer about this, they shriek and say you're mad, and if you talk to an OPer, they say that methods are readonly, so you don't need to compare them, and 'functions' as objects, are comparable, but its not something they'd really expect anyone to do)

@ChristianGruen
Copy link
Contributor Author

Is this worth it?

I’m not exactly sure what you are proposing. Would you like to preserve the status quo (as described in #908)?

@MarkNicholls
Copy link

MarkNicholls commented May 22, 2024

I'm doing my usual irritating trick of trying to reverse the conversation 9 months, which is probably highly irritating, whilst you are trying incrementally improve the status quo.

The way to stop me doing that (apart from telling me you're not willing to do it), is to give me motivating use case that I can't escape.

I'm hoping this motivating use case though has an alternative approach that dissolves the problem of trying to (in an FPers view) solve the halting problem.

OO'ers dissolve it, by defining equality inside the domain of the type, where all 'functions' are fixed (as methods) so the issue never arises (well not much).

BUT...genuinely feel free to say..."its too late for that"

@ChristianGruen
Copy link
Contributor Author

BUT...genuinely feel free to say..."its too late for that"

It may not be too late for that, but you'll probably need to spend a lot of time to digest what has happened so far.

With regard to your specific reply, I cannot really read out what you think may be worth or not worth doing. Could you be more specific?

@MarkNicholls
Copy link

MarkNicholls commented May 22, 2024

I was part of the conversation at one point, and tried to express my concern, I gave up (or was overpowered) but your summary isn't good reading, it unintuitive, and you appear to be saying results depend on what optimisations are turned on.

the simple alternative (to allowing equals on functions) is "error"

Whats the cost of that? i.e. what motivating use case does it stifle? and is enabling that use case worth the consequence of your summary table? is it just going to be bigger headache.

the next simple alternative is "object identity" and all implementions have to respect it....its horrid...I hate functions having an identity, and if saw code comparing almost anything by object identity I'd feel concern.

the next simple alternative is elevating some functions to "methods" only allowing equality inside a type (so you dont need to compare methods as they are statically defined), and not allowing equality for functions, so yes, you can compare objects in a type if all the fields are non functions, else....error.

@ChristianGruen
Copy link
Contributor Author

Thanks, I now understand you better: You would like to question the current concept of function identity in the specification general. I appreciate your concerns, but I think it is definitely out of scope here as I was focusing on the specific case that deep-equal(fn { <a/> }, fn { <a/> }) yields false, whereas deep-equal(fn { 1 }, fn { 1 }) may yield true.

I think it’s better if you create a separate issue for your general concerns. Would that be ok for you?

@MarkNicholls
Copy link

MarkNicholls commented May 22, 2024

in the scope of this issue, my answer would be

error and error

:-)

I wont raise it as an issue. Thanks for the offer, I don't think it would get traction.

you don't need to reply, my response is just rhetorical.

@ChristianGruen
Copy link
Contributor Author

you don't need to reply, my response is just rhetorical.

Thanks for your comments. For rhetorical feedback, though, I recommend you to join the discussions on Slack; my assumption is that it will be more productive for everyone involved.

@michaelhkay
Copy link
Contributor

I'm sympathetic, but what exactly is your proposed fix?

@ChristianGruen
Copy link
Contributor Author

I'm sympathetic, but what exactly is your proposed fix?

I would be happy if I could make a concrete suggestion. I have now understood how to interpret the current rules, but it would be presumptuous to say that I have grasped the interplay of all rules in full detail.

Maybe it’s the rule in 4.5.2.7 Function Identity that should be changed. It says that:

“An optimizer is permitted to rewrite expressions in such a way that repeated evaluation is avoided if it can be established that the result will be the same each time, […].”

“Same result” seems to imply that two evaluated instances of true#0 or fn { true() } are the same, but I assume that this also includes functions like fn { parse-xml('<x/>') } or fn { file:read-text('x') }, although the result may have a different node identity or may even be completely different for nondeterministic functions. At the same time, the results of two evaluated instances of fn { <a/> } seem to be regarded not to be the same because of their varying node identity. This is at least how I understood #908 (comment) and the following comments. Is this correct?

Maybe we need to clarify what “same” means:

  • I think it cannot mean that the results of the functions will be deep-equal to each other. Otherwise, a nondeterministic function may not even be equal to itself.
  • Node identity should not be relevant either. Otherwise, functions like fn:parse-xml#1 or fn { <a/> } cannot be rewritten to the same instance.

Without being able to provide a fix: Does my summary sound correct and reasonable?

@michaelhkay
Copy link
Contributor

Perhaps rewrite the relevant paragraphs:

An optimizer is permitted to rewrite deterministic expressions in such a way that repeated evaluation is avoided, and this may be done without consideration of function identity. For example, if the expression contains(?, "e") appears within the body of a for expression, or if the same expression is written repeatedly in a query, then an optimizer may decide to evaluate it once only, and thus return the same function item each time. Similarly, if the expression fn($x){$x+1} appears more than once, or is evaluated repeatedly, then it may return the same function each time.

Similarly, optimizers are allowed to replace any expression with an equivalent expression. For example, count(?) may be rewritten as count#1, and fn($x){$x+1} may be rewritten as fn($y){$y+1}. This may lead to different expressions returning identical function items.

Link to the definition of deterministic in F&O (though it's not as rigorous as one might like).

@ChristianGruen
Copy link
Contributor Author

Thanks. Sounds good to me.

Do you think there is a need to tweak the following rule further above in the text?

Functions with the same identity are indistinguishable in every way; in particular, any function call with identical arguments will produce an identical result.

It could be argued that two deep-equal nodes with a different node identity are not identical (let alone results of nondeterministic expressions). Maybe the sentence after the semicolon could simply be dropped, or we might need to concretize what “identical” means in this context.

@ndw ndw added PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-required Categorized as "required for 4.0" at the Prague f2f, 2024 labels Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something that doesn't work in the current specification Editorial Minor typos, wording clarifications, example fixes, etc. PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-required Categorized as "required for 4.0" at the Prague f2f, 2024 XPath An issue related to XPath
Projects
None yet
Development

No branches or pull requests

4 participants