New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Describe the context item for value templates #547

Merged
merged 1 commit into from Oct 15, 2018

Conversation

Projects
None yet
4 participants
@ndw
Contributor

ndw commented Sep 29, 2018

Value templates get their context from the DRP. It is a dynamic error if a sequence of more than one items appears. Output ports may be read by value templates.
Close #482

@ndw ndw requested review from gimsieke, eriksiegel and xml-project Sep 29, 2018

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Sep 30, 2018

Contributor

I asked twice in comments (1, 2) without receiving a satisfactory answer: Didn’t we intend at some stage to only use the first of multiple documents that are in context for evaluating an expression? This would remove the need for D0065.

Contributor

gimsieke commented Sep 30, 2018

I asked twice in comments (1, 2) without receiving a satisfactory answer: Didn’t we intend at some stage to only use the first of multiple documents that are in context for evaluating an expression? This would remove the need for D0065.

See @gimsiekes comment

@xml-project

This comment has been minimized.

Show comment
Hide comment
@xml-project

xml-project Sep 30, 2018

Contributor

@gimsieke I was not able to find the consent you are citing. Can you give me a hint, when and/or why we decided this. Does it apply to p:with-option and option-shuts too?

Contributor

xml-project commented Sep 30, 2018

@gimsieke I was not able to find the consent you are citing. Can you give me a hint, when and/or why we decided this. Does it apply to p:with-option and option-shuts too?

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Sep 30, 2018

Contributor

We didn’t formally decide it but we discussed it at some point. I don’t remember where and when. Therefore the phrasing “I vaguely remember”.
It would apply to p:with-option, p:variable, p:when, value templates, etc.
The question is, in the absence of collection="true" and when multiple documents are in context, should we always only consider the first.
Motivation: Quite regularly when I use cx:message, I get an error that multiple docs are in the context of the message attribute. The message often only refers to variables, so in these cases I can evaluate p:with-option in a p:empty context. If there are multiple documents in context, I cannot use an attribute value template in @message, I need to revert to the lengthy p:with-option/p:empty.
We don’t have to discuss it in the comments of a pull request, I can also raise a new issue.

Contributor

gimsieke commented Sep 30, 2018

We didn’t formally decide it but we discussed it at some point. I don’t remember where and when. Therefore the phrasing “I vaguely remember”.
It would apply to p:with-option, p:variable, p:when, value templates, etc.
The question is, in the absence of collection="true" and when multiple documents are in context, should we always only consider the first.
Motivation: Quite regularly when I use cx:message, I get an error that multiple docs are in the context of the message attribute. The message often only refers to variables, so in these cases I can evaluate p:with-option in a p:empty context. If there are multiple documents in context, I cannot use an attribute value template in @message, I need to revert to the lengthy p:with-option/p:empty.
We don’t have to discuss it in the comments of a pull request, I can also raise a new issue.

@ndw

This comment has been minimized.

Show comment
Hide comment
@ndw

ndw Sep 30, 2018

Contributor

I'm pretty strongly opposed to "first value" semantics. That's what XSLT 1.0 used for sequences and we moved away from that in 2.0 because of the subtle bugs it introduced. (Mostly failure to detect errors, I think.)

However, I see your point wrt cx:message and other places where there are AVTs that don't refer to the context.

I wonder if we could say that the context item is defined if-and-only-if a single item appears. Pro: it would remove the error when you don't refer to the context and preserve the error if you do. Con: it's a weird rule.

Contributor

ndw commented Sep 30, 2018

I'm pretty strongly opposed to "first value" semantics. That's what XSLT 1.0 used for sequences and we moved away from that in 2.0 because of the subtle bugs it introduced. (Mostly failure to detect errors, I think.)

However, I see your point wrt cx:message and other places where there are AVTs that don't refer to the context.

I wonder if we could say that the context item is defined if-and-only-if a single item appears. Pro: it would remove the error when you don't refer to the context and preserve the error if you do. Con: it's a weird rule.

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Sep 30, 2018

Contributor

I like the proposed rule. It won't seem weird to pipeline authors most of the time. And if it does, we can still toss an RTFS (read the fscking spec) in their direction.

Contributor

gimsieke commented Sep 30, 2018

I like the proposed rule. It won't seem weird to pipeline authors most of the time. And if it does, we can still toss an RTFS (read the fscking spec) in their direction.

@xml-project

This comment has been minimized.

Show comment
Hide comment
@xml-project

xml-project Sep 30, 2018

Contributor

Con: it's a weird rule.

Yes, indeed! I am not sure whether the proposed rules makes adoption of XProc easier (is this still a goal?) for people coming from other XPath-host languages like XSLT. We should think about a better idea to solve the cx:message problem.

Contributor

xml-project commented Sep 30, 2018

Con: it's a weird rule.

Yes, indeed! I am not sure whether the proposed rules makes adoption of XProc easier (is this still a goal?) for people coming from other XPath-host languages like XSLT. We should think about a better idea to solve the cx:message problem.

@xml-project

This comment has been minimized.

Show comment
Hide comment
@xml-project

xml-project Oct 1, 2018

Contributor

I wonder if we could say that the context item is defined if-and-only-if a single item appears.

What about having the rule that the DRP (if present) supplies the context item for a value template if-and-only-if the XPath expression refers to a context item. If value template relies on the context item then it is an error if there is a sequence with more than one item.

Pro: Would solve the problem and sounds better to me. Con: Possibly difficult to implement.

Contributor

xml-project commented Oct 1, 2018

I wonder if we could say that the context item is defined if-and-only-if a single item appears.

What about having the rule that the DRP (if present) supplies the context item for a value template if-and-only-if the XPath expression refers to a context item. If value template relies on the context item then it is an error if there is a sequence with more than one item.

Pro: Would solve the problem and sounds better to me. Con: Possibly difficult to implement.

@eriksiegel

This comment has been minimized.

Show comment
Hide comment
@eriksiegel

eriksiegel Oct 1, 2018

Contributor

I'm sorry to come back to a point by @ndw made earlier, but I think that from a user's perspective it would make perfect sense to have a "first value" rule: When more than on document appears on a DRP, the evaluation of AVTs only takes the first one into account.
IMHO errors when more than item appears will be rather puzzling i'm afraid.

As an alternative, can't we state that the DRP "output" is always a collection of documents (zero, one or more) and you simply have to write your Xpath expressions correctly?

Contributor

eriksiegel commented Oct 1, 2018

I'm sorry to come back to a point by @ndw made earlier, but I think that from a user's perspective it would make perfect sense to have a "first value" rule: When more than on document appears on a DRP, the evaluation of AVTs only takes the first one into account.
IMHO errors when more than item appears will be rather puzzling i'm afraid.

As an alternative, can't we state that the DRP "output" is always a collection of documents (zero, one or more) and you simply have to write your Xpath expressions correctly?

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Oct 1, 2018

Contributor

Reply to Achim’s comment #547 (comment) above:

In an email exchange about #482 (between @ndw, @xml-project, and @gimsieke, while @eriksiegel was vacationing), Norm already expressed reservations against this level of analysis wrt XPath expressions:


Achim Berndzen achim.berndzen@xml-project.com writes:

Yes, I was thinking about that line too. But then, why do we say that
this p:inline consumes the DRP?

<p:inline expand-true="true">{$a+$b}</p:inline>

So the next step would be: A p:inline, an implicit inline or the href
on p:document only consume the DRP if the used XPath expression refers
to the context item, the position or the length. But then you could
image cases, where one branch of an "if" refers to the context and the
other one does not. Does "if (false) . else 4" depend on the context
item?

I think we want a simple rule. A p:inline or a p:document could
require the context item, therefore they always consume it. I think
it’s too complicated to say that it depends on the value of
@expand-text or the presence of an AVT or anything like that.


(Unquoted text above by Norm)

And to another message in the same thread, Norm replied:


"Imsieke, Gerrit, le-tex" gerrit.imsieke@le-tex.de writes:

What is unclear about 3.0 is whether
– an attribute option that may hold an AVT or
– one that actually contains an AVT
will also consume the DRP.

I think the rule should be “may hold”. I do not think we should make
static analysis depend on parsing attribute values in search of AVTs.


Therefore I’d assume that Norm again would consider a proposal that involves parsing the XPath expression and deciding whether it refers to a context item as too difficult to implement or too complicated.

Contributor

gimsieke commented Oct 1, 2018

Reply to Achim’s comment #547 (comment) above:

In an email exchange about #482 (between @ndw, @xml-project, and @gimsieke, while @eriksiegel was vacationing), Norm already expressed reservations against this level of analysis wrt XPath expressions:


Achim Berndzen achim.berndzen@xml-project.com writes:

Yes, I was thinking about that line too. But then, why do we say that
this p:inline consumes the DRP?

<p:inline expand-true="true">{$a+$b}</p:inline>

So the next step would be: A p:inline, an implicit inline or the href
on p:document only consume the DRP if the used XPath expression refers
to the context item, the position or the length. But then you could
image cases, where one branch of an "if" refers to the context and the
other one does not. Does "if (false) . else 4" depend on the context
item?

I think we want a simple rule. A p:inline or a p:document could
require the context item, therefore they always consume it. I think
it’s too complicated to say that it depends on the value of
@expand-text or the presence of an AVT or anything like that.


(Unquoted text above by Norm)

And to another message in the same thread, Norm replied:


"Imsieke, Gerrit, le-tex" gerrit.imsieke@le-tex.de writes:

What is unclear about 3.0 is whether
– an attribute option that may hold an AVT or
– one that actually contains an AVT
will also consume the DRP.

I think the rule should be “may hold”. I do not think we should make
static analysis depend on parsing attribute values in search of AVTs.


Therefore I’d assume that Norm again would consider a proposal that involves parsing the XPath expression and deciding whether it refers to a context item as too difficult to implement or too complicated.

@xml-project

This comment has been minimized.

Show comment
Hide comment
@xml-project

xml-project Oct 1, 2018

Contributor

@gimsieke Thanks for reminding me about this. I know it wouldn't be a popular approach, but it would be a solution.

Contributor

xml-project commented Oct 1, 2018

@gimsieke Thanks for reminding me about this. I know it wouldn't be a popular approach, but it would be a solution.

@ndw

This comment has been minimized.

Show comment
Hide comment
@ndw

ndw Oct 1, 2018

Contributor

With respect to @eriksiegel 's comment:

If we say AVTs have first-value semantics, then this is ok:

  <p:identity>
    <p:input port="source">
      <p:inline><doc1/></p:inline>
      <p:inline><doc2/></p:inline>
    </p:input>
  </p:identity>

  <ex:step optname="{count(.)}"/>

and this is an error:

  <p:identity>
    <p:input port="source">
      <p:inline><doc1/></p:inline>
      <p:inline><doc2/></p:inline>
    </p:input>
  </p:identity>

  <ex:step>
    <p:with-option name="optname" select="count(.)"/>
  </ex:step>

I think that's bad. We could say that first value semantics is the rule for p:variable and p:option as well, but I also think that would be bad. Sometimes, it will be what the user expects. When there's only one document, it'll be fine. And then one day there will be two documents and instead of getting an error about the fact that the pipeline wasn't designed for this case, you'll silently get a result that may be wrong.

We can't say that it's always a sequence because that's not how XPath works. The context item is either defined or it isn't, it can't be defined as a sequence.

Contributor

ndw commented Oct 1, 2018

With respect to @eriksiegel 's comment:

If we say AVTs have first-value semantics, then this is ok:

  <p:identity>
    <p:input port="source">
      <p:inline><doc1/></p:inline>
      <p:inline><doc2/></p:inline>
    </p:input>
  </p:identity>

  <ex:step optname="{count(.)}"/>

and this is an error:

  <p:identity>
    <p:input port="source">
      <p:inline><doc1/></p:inline>
      <p:inline><doc2/></p:inline>
    </p:input>
  </p:identity>

  <ex:step>
    <p:with-option name="optname" select="count(.)"/>
  </ex:step>

I think that's bad. We could say that first value semantics is the rule for p:variable and p:option as well, but I also think that would be bad. Sometimes, it will be what the user expects. When there's only one document, it'll be fine. And then one day there will be two documents and instead of getting an error about the fact that the pipeline wasn't designed for this case, you'll silently get a result that may be wrong.

We can't say that it's always a sequence because that's not how XPath works. The context item is either defined or it isn't, it can't be defined as a sequence.

@eriksiegel

This comment has been minimized.

Show comment
Hide comment
@eriksiegel

eriksiegel Oct 2, 2018

Contributor

@ndw Sorry, I'm getting a bit lost in this discussion. Why would your second example be an error?

And: Does all this means that the context item for a <p:with-option> is different than for an option-setting expressed as an AVT?

BTW, this seems something better postponed to the next call (which I unfortunately cant attend).

Contributor

eriksiegel commented Oct 2, 2018

@ndw Sorry, I'm getting a bit lost in this discussion. Why would your second example be an error?

And: Does all this means that the context item for a <p:with-option> is different than for an option-setting expressed as an AVT?

BTW, this seems something better postponed to the next call (which I unfortunately cant attend).

@ndw

This comment has been minimized.

Show comment
Hide comment
@ndw

ndw Oct 2, 2018

Contributor

The second example is an error because of err:XD0008 which has existed since XProc 1.0. In XProc 3.0, we're providing a collection attribute to allow a sequence, but it isn't the context in that case.
No, it isn't different (unless collection is specified) because a sequence is an error in either case.
We can postpone it until a call if we can't reach consensus here.

Contributor

ndw commented Oct 2, 2018

The second example is an error because of err:XD0008 which has existed since XProc 1.0. In XProc 3.0, we're providing a collection attribute to allow a sequence, but it isn't the context in that case.
No, it isn't different (unless collection is specified) because a sequence is an error in either case.
We can postpone it until a call if we can't reach consensus here.

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Oct 4, 2018

Contributor

In today’s call, @ndw, @xml-project, and @gimsieke agreed that we feel comfortable adopting Norm’s proposal (context defined iff there’s exactly one document present for evaluating the expression).
Given his authority and decade-long experience in XML processing language specification, we just believed Norm’s argument that “first-value” semantics is error-prone and should be avoided.

Regarding Achim’s proposal to analyze whether an XPath expression refers to a context item: There was consensus that this will put too high a burden on implementers.
A dynamic error will be thrown when the context is nullified and the expression refers to the context.
However, care <rfc2119>should</rfc2119> be taken by implementations to issue a meaningful error message in this situation, that is, if the context is undefined not just because there is no document, but because more than one document is present.

@eriksiegel, can you live with this solution?

Contributor

gimsieke commented Oct 4, 2018

In today’s call, @ndw, @xml-project, and @gimsieke agreed that we feel comfortable adopting Norm’s proposal (context defined iff there’s exactly one document present for evaluating the expression).
Given his authority and decade-long experience in XML processing language specification, we just believed Norm’s argument that “first-value” semantics is error-prone and should be avoided.

Regarding Achim’s proposal to analyze whether an XPath expression refers to a context item: There was consensus that this will put too high a burden on implementers.
A dynamic error will be thrown when the context is nullified and the expression refers to the context.
However, care <rfc2119>should</rfc2119> be taken by implementations to issue a meaningful error message in this situation, that is, if the context is undefined not just because there is no document, but because more than one document is present.

@eriksiegel, can you live with this solution?

@eriksiegel

This comment has been minimized.

Show comment
Hide comment
@eriksiegel

eriksiegel Oct 5, 2018

Contributor

I'm trying to understand the implications of this...

So we have a step for which the DRP is one of:

  1. Non-existent
  2. Empty (no doc appearing)
  3. Multiple docs appearing

In all these cases, if I understand things correctly, any AVT is invalid, even simple ones like {1 + 2} and { $somevar }?

And what about p:with-option/@select? It has no context item either.

If I'm right (and probably not) it would mean you can hardly do anything meaningful if one of the three conditions is true.

Contributor

eriksiegel commented Oct 5, 2018

I'm trying to understand the implications of this...

So we have a step for which the DRP is one of:

  1. Non-existent
  2. Empty (no doc appearing)
  3. Multiple docs appearing

In all these cases, if I understand things correctly, any AVT is invalid, even simple ones like {1 + 2} and { $somevar }?

And what about p:with-option/@select? It has no context item either.

If I'm right (and probably not) it would mean you can hardly do anything meaningful if one of the three conditions is true.

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Oct 5, 2018

Contributor

No, AVTs are valid in these cases as long as they don’t refer to a context item.

Contributor

gimsieke commented Oct 5, 2018

No, AVTs are valid in these cases as long as they don’t refer to a context item.

@eriksiegel

This comment has been minimized.

Show comment
Hide comment
@eriksiegel

eriksiegel Oct 5, 2018

Contributor

Ah, I think I mis-interpreted:

Regarding Achim’s proposal to analyze whether an XPath expression refers to a context item: There was consensus that this will put too high a burden on implementers.

That would refer to static analysis only, right?

Contributor

eriksiegel commented Oct 5, 2018

Ah, I think I mis-interpreted:

Regarding Achim’s proposal to analyze whether an XPath expression refers to a context item: There was consensus that this will put too high a burden on implementers.

That would refer to static analysis only, right?

@gimsieke

This comment has been minimized.

Show comment
Hide comment
@gimsieke

gimsieke Oct 5, 2018

Contributor

Yes. If an XPath expression refers to the context item when the context item is undefined, a dynamic error will be thrown. I think it will currently be XD0050 for value templates, but maybe there should be other dynamic errors that are thrown when an XPath expression refers to the context when the context item is undefined (maybe differentiated by the cause of the undefined context item).

Contributor

gimsieke commented Oct 5, 2018

Yes. If an XPath expression refers to the context item when the context item is undefined, a dynamic error will be thrown. I think it will currently be XD0050 for value templates, but maybe there should be other dynamic errors that are thrown when an XPath expression refers to the context when the context item is undefined (maybe differentiated by the cause of the undefined context item).

@ndw

This comment has been minimized.

Show comment
Hide comment
@ndw

ndw Oct 5, 2018

Contributor

It's not static analysis; it's a runtime constraint. (You could sometimes guarantee that it won't happen, but you can't detect if it will or not until you're running the pipeline.)

  1. If you have a value template or p:with-option or p:variable expression that doesn't refer to the context item (1+2, $foo, etc.) then it doesn't matter what the context is. You'll never have a problem.
  2. If you refer to the context item then you will get an error if the provided context does not consist of exactly one item. It's a quality of implementation issue to make that message meaningful.
  3. My proposal says that if you read from a port (implicitly in the case of value templates or explicitly), then the context item will be undefined if zero items are available on that port or if more than one item is available.

Having given it some more thought, I think point 3 is really just an implementation strategy. We only have to say that it's an error. How I detect that error isn't really relevant.

But the critical point is that you do not get first value semantics.

By the way, if you know you're getting a sequence, you can use collection=true on the long form of the with-option or variable, so there's no loss of functionality.

Contributor

ndw commented Oct 5, 2018

It's not static analysis; it's a runtime constraint. (You could sometimes guarantee that it won't happen, but you can't detect if it will or not until you're running the pipeline.)

  1. If you have a value template or p:with-option or p:variable expression that doesn't refer to the context item (1+2, $foo, etc.) then it doesn't matter what the context is. You'll never have a problem.
  2. If you refer to the context item then you will get an error if the provided context does not consist of exactly one item. It's a quality of implementation issue to make that message meaningful.
  3. My proposal says that if you read from a port (implicitly in the case of value templates or explicitly), then the context item will be undefined if zero items are available on that port or if more than one item is available.

Having given it some more thought, I think point 3 is really just an implementation strategy. We only have to say that it's an error. How I detect that error isn't really relevant.

But the critical point is that you do not get first value semantics.

By the way, if you know you're getting a sequence, you can use collection=true on the long form of the with-option or variable, so there's no loss of functionality.

@eriksiegel

This comment has been minimized.

Show comment
Hide comment
@eriksiegel

eriksiegel Oct 6, 2018

Contributor

Ok, I think I can explain this to the audience. Let's do it this way.

I think I do need some additional explanation about the meaning of @collection (that I completely mis-interpreted probably). But that's better done outside this discussion thread.

Contributor

eriksiegel commented Oct 6, 2018

Ok, I think I can explain this to the audience. Let's do it this way.

I think I do need some additional explanation about the meaning of @collection (that I completely mis-interpreted probably). But that's better done outside this discussion thread.

@ndw

This comment has been minimized.

Show comment
Hide comment
@ndw

ndw Oct 15, 2018

Contributor

So can I merge this now? @xml-project did I address your concerns (I can't work out from the GitHub UI what the text of your comment is/was!)

Contributor

ndw commented Oct 15, 2018

So can I merge this now? @xml-project did I address your concerns (I can't work out from the GitHub UI what the text of your comment is/was!)

@xml-project

This comment has been minimized.

Show comment
Hide comment
@xml-project

xml-project Oct 15, 2018

Contributor

@ndw Everything is fine with me. I originally approved the PR but then reverted it because of @gimsieke intervention. This is why there is no comment!

Contributor

xml-project commented Oct 15, 2018

@ndw Everything is fine with me. I originally approved the PR but then reverted it because of @gimsieke intervention. This is why there is no comment!

@xml-project xml-project self-requested a review Oct 15, 2018

@ndw ndw merged commit 147a51b into xproc:master Oct 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@ndw ndw deleted the ndw:iss-482 branch Oct 15, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment