Format error document #579

Closed
xml-project opened this Issue Oct 24, 2018 · 22 comments

3 participants
@xml-project
Contributor

xml-project commented Oct 24, 2018

Now that we have the "Q{namespace}local" syntax, I propose to use it in error documents (appearing on p:catch).
Instead of

<c:errors xmlns:c="http://www.w3.org/ns/xproc-step"
          xmlns:p="http://www.w3.org/ns/xproc"
          xmlns:my="http://www.example.org/error">
 <c:error name="bad-document" type="p:error"
          code="my:unk12"><message>The document element is unknown.</message>
</c:error>
</c:errors>

We could have

<c:errors xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:error name="bad-document" type="Q{http://www.w3.org/ns/xproc}error"
    code="Q{http://www.example.org/error}unk12"><message>The document element is unknown.</message>
</c:error>
</c:errors>
@gimsieke

Contributor

gimsieke commented Oct 24, 2018

The spec currently permits both (EQName). Allowing URIQualifiedNames is a bit cumbersome anyway since someone who processes (for example, groups by @code) the errors needs to treat an attribute value of Q{http://www.example.org/error}unk12 as equivalent to an attribute value of my:unk12 (given that the prefix my is bound to http://www.example.org/error).

Do you want to restrict @code to URIQualifiedNames only?
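
To make the equivalence concrete, here is a minimal Python sketch (not taken from the spec) of what a consumer has to do before grouping errors by @code: reduce both notations to a (namespace URI, local name) pair. The names `expand_eqname` and `group_by_code` and the `bindings` dictionary are hypothetical. A real consumer would take the bindings from the in-scope namespaces of the c:error element, and treating an unprefixed name as "no namespace" is an assumption.

```python
import re
from collections import defaultdict

# Q{uri}local form; the braces may enclose an empty URI (no namespace).
_URI_QUALIFIED = re.compile(r"^Q\{([^{}]*)\}([^:{}\s]+)$")


def expand_eqname(value, bindings):
    """Return (namespace_uri, local_name) for a QName or URIQualifiedName string.

    `bindings` maps prefixes to namespace URIs, as declared on the element
    that carries the attribute.
    """
    value = value.strip()
    match = _URI_QUALIFIED.match(value)
    if match:
        return (match.group(1).strip(), match.group(2))
    if ":" in value:
        prefix, local = value.split(":", 1)
        if prefix not in bindings:
            raise ValueError(f"prefix {prefix!r} is not bound")
        return (bindings[prefix], local)
    # Assumption: an unprefixed name is in no namespace.
    return ("", value)


def group_by_code(codes, bindings):
    """Group literal @code values by the expanded name they denote."""
    groups = defaultdict(list)
    for code in codes:
        groups[expand_eqname(code, bindings)].append(code)
    return dict(groups)
```

With `bindings = {"my": "http://www.example.org/error"}`, the values `my:unk12` and `Q{http://www.example.org/error}unk12` land in the same group, which is exactly the extra work that a plain string comparison skips.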

@xml-project

Contributor

xml-project commented Oct 24, 2018

> The spec currently permits both (EQName).

Missed that. Where is it? The only thing I found on the error vocabulary is the example I quoted above.

> Do you want to restrict @code to URIQualifiedNames only?

Yes, this was the idea. Having a self-contained QName seems to make processing easier than having to resolve "my:unk12" against the namespaces defined on the document.

Did I miss your point?

@gimsieke

Contributor

gimsieke commented Oct 24, 2018

http://spec.xproc.org/master/head/xproc/#d2297e0

<c:error
  name? = NCName
  type? = EQName
  code? = EQName
  href? = anyURI
  line? = integer
  column? = integer
  offset? = integer>
    anyNode*
</c:error>
@xml-project

Contributor

xml-project commented Oct 24, 2018

Thanks, missed this!

What about the second part: restricting @code to URIQualifiedName?

@gimsieke

Contributor

gimsieke commented Oct 24, 2018

Pro:

  • simplifies processing

Con:

  • There is no precedent for URIQualifiedName in the spec
  • Doesn’t play nice with existing XProc 1.0 pipelines that rely on QNames (makes migration harder because the string comparison might be buried in XSLT code that creates error reports)

QNames in attribute values are not ideal anyway. They can too easily become detached from their namespace URIs. EQNames are even worse because you need to invoke clumsy XPath functions to determine equivalence, and it can only be determined wrt the namespace bindings of a context node that needs to be supplied to resolve-QName().

Restricting the attribute value space to URIQualifiedNames can mitigate these problems. Plain strings as a potential alternative are not as good as URIQualifiedNames because they may accidentally clash, and the exact purpose of namespace URIs is to avoid name clashes.

I’m still a bit reluctant to go full URIQualifiedName here. Quite often, prefixed error codes are just treated as strings, and pipeline authors will know whether a step that they use will produce QNames or URIQualifiedNames as error codes.

@xml-project

Contributor

xml-project commented Oct 24, 2018

Ok, I see the point. But I think we have to make a decision anyway. Given the current wording, one processor might use colonised names, another might use Q{} names, and a third might use both on different occasions. All processors would conform to the specs, but handling error documents in an interoperable way, say in XSLT, would be very complicated.

I am fine with any decision, having a small preference for Q{} names, but as I said, we have to make a decision.

@gimsieke

Contributor

gimsieke commented Oct 24, 2018

I’d say the decision – EQNames – has already been made, and we might just watch how this will play out in practice, and if it proves to be a pain point, then make corrections in 4.0.

In 3.0, we could add a note stating that namespace prefixes in attribute values are problematic – a bit more so in the case of c:error/@code than in the case of c:error/@type because with type, it’s easier to resolve the namespace prefix of a declared step than to resolve the namespace prefix of an error code, which can be an (almost) arbitrary string value. (“Almost arbitrary” because, admittedly, it must be an EQName syntactically, but unlike attributes and elements, XML processors cannot prevent you from using undeclared prefixes in your error codes. Or if they can, in the context where the error message is thrown, a recipient of these error messages still won’t have an authoritative source to turn to for namespace prefix → namespace URI resolution.) Therefore implementations should treat the error code EQNames that step authors chose to use as literal strings and they should refrain from converting URIQualifiedNames into QNames and vice versa.

@xml-project

Contributor

xml-project commented Oct 25, 2018

Sorry, I do not get what you mean by "I’d say the decision – EQNames – has already been made". XPath 3.1 says:

  EQName ::= QName |  URIQualifiedName

I would argue to use "URIQualifiedName" for the attributes of c:error and not "prefixed name". According to the production rules, both are EQNames.

@xml-project

Contributor

xml-project commented Oct 25, 2018

@ndw Any comment?

@gimsieke

Contributor

gimsieke commented Oct 25, 2018

Both are EQNames, but if you rule out QNames, then the value space isn’t EQName any more. What was legal in 1.0, an error code tr:unsp01, would be illegal, although it is a QName as long as the prefix is bound “somewhere”.

@xml-project

Contributor

xml-project commented Oct 25, 2018

I know that both are EQNames, and this is precisely why I think we should do something about it. My problem is that if a pipeline author has a

<p:error code="Q{my-error-namespace}err1" />

the specs do not precisely define what error document to expect. One processor may produce

<c:errors xmlns:user-err="my-error-namespace">
  <c:error code="user-err:err1" />
</c:errors>

Another processor might use a URIQualifiedName. A processor might even choose to report dynamic XProc errors in one form and user errors in the other.

The problem is: both forms are legal according to the current specs. Pipeline authors do not know what error documents they will get, and so the XSLT stylesheets they write to process the error document get too complex.

As I said before: I am fine with any decision, but I am strongly opposed to just saying they are EQNames, because this will lead to trouble for pipeline authors.
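
As an illustration of the extra work this puts on consumers, the following Python sketch normalizes every c:error/@code in an error document to the Q{...} notation, whichever legal form the processor emitted. It is only a sketch under stated assumptions: the function names are hypothetical, unprefixed codes are assumed to be in no namespace, and the in-scope bindings are collected from the `start-ns` events of `xml.etree.ElementTree.iterparse`.

```python
import io
import xml.etree.ElementTree as ET

C_NS = "http://www.w3.org/ns/xproc-step"


def to_uri_qualified(code, scope):
    """Rewrite a prefixed or unprefixed name as Q{uri}local; keep Q{...} as is."""
    code = code.strip()
    if code.startswith("Q{"):
        return code
    prefix, sep, local = code.partition(":")
    if not sep:
        # Assumption: an unprefixed code is in no namespace.
        return "Q{}" + code
    return "Q{" + scope[prefix] + "}" + local


def normalized_codes(xml_text):
    """Return all c:error/@code values of a document in Q{...} notation."""
    scopes = [{}]   # stack of in-scope prefix bindings, one entry per open element
    pending = {}    # declarations reported before the next start tag
    codes = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            prefix, uri = payload
            pending[prefix] = uri
        elif event == "start":
            scope = {**scopes[-1], **pending}
            pending = {}
            scopes.append(scope)
            if payload.tag == f"{{{C_NS}}}error" and "code" in payload.attrib:
                codes.append(to_uri_qualified(payload.attrib["code"], scope))
        else:  # "end": the element's bindings go out of scope
            scopes.pop()
    return codes
```

Fed the prefixed error document shown above, or an equivalent one that uses `code="Q{my-error-namespace}err1"`, the function returns the same list. An XSLT stylesheet would need comparable logic built on resolve-QName() and the in-scope namespaces of each c:error element.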

@xml-project

Contributor

xml-project commented Oct 25, 2018

As we had no "EQNames" in XProc 1.0, all old pipelines/stylesheets would expect a QName production. I think this is a reason to use just QNames and no URIQualifiedNames (which are not defined in XPath 2.0).

Or: We extend the error vocabulary to use both:
<c:error code="QName" qualified-code="URIQualifiedName" />
and the same for step-type.
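
A minimal Python sketch of what emitting both forms could look like, purely to illustrate the idea. The `qualified-code` attribute is only the proposal made in this comment and is not part of the error vocabulary; the function name `error_element` is hypothetical.

```python
from xml.sax.saxutils import quoteattr

C_NS = "http://www.w3.org/ns/xproc-step"


def error_element(namespace_uri, local_name, prefix):
    """Serialize an empty c:error carrying the code in both notations."""
    attrs = {
        "xmlns:c": C_NS,
        # The prefixed form is only meaningful together with this declaration.
        f"xmlns:{prefix}": namespace_uri,
        "code": f"{prefix}:{local_name}",
        # Hypothetical attribute from the proposal above.
        "qualified-code": "Q{" + namespace_uri + "}" + local_name,
    }
    rendered = " ".join(f"{name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<c:error {rendered}/>"


print(error_element("http://www.example.org/error", "unk12", "my"))
```

Old stylesheets could keep reading @code, while new ones could compare @qualified-code as a plain string without touching namespace bindings.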

@gimsieke

Contributor

gimsieke commented Oct 25, 2018

Your presupposition is: A processor may change from QName to URIQualifiedName at will. I question this.

Is there nothing that a processor can do about it? I think a processor needs to perform some transformation in order to arrive from one representation to the other. It will, however, always be possible for a processor to leave the literal error code untouched.

Therefore my point wrt a possible solution while keeping EQName was: We can stipulate that a processor should not change (from QName to URIQualifiedName or vice versa) the literal @code that was returned by a step.

@ndw

Contributor

ndw commented Oct 25, 2018

I'm confused actually. I think of Q{}x as a serialization syntax: it's just a QName. And I've spent so much time lately writing Scala/Java code to process data models that I've lost track of what's easy to do in XPath.

Suppose I have an error document:

<error xmlns:t="http://example.com/" code="t:test"/>

then, given an appropriate in-scope namespace xmlns:t2="http://example.com/", I can ask

xs:QName('t2:test') = resolve-QName($err/@code, $err)

and get a reliable answer. I just tried some experiments in Saxon and I’m quite surprised that I can’t get

Q{http://example.com/}test = resolve-QName($err/@code, $err)

to give me the same answer.

Also, without a schema, it doesn’t seem possible to put a QName value into an attribute value, so I don’t see how the URIQualifiedName is useful in an attribute value. I worry that users will be confused if we say that the attribute values have the form of URIQualifiedNames but are really just strings so that the question you have to ask is:

"Q{http://example.com/}test" = $err/@code

Achim, can you show us some examples of what you have in mind?

@xml-project

Contributor

xml-project commented Oct 25, 2018

@ndw:
The problems come up when you try to implement the p:error step: here we have the options "@code", "@code-prefix" and "@code-namespace". The task of the step is to raise an error with this code (which is purely internal to the processor) AND to construct a c:error document, which has to appear on the "error" port of a related p:catch.

The error document is something like

<c:errors>
   <c:error name="stepName" code="???" />
</c:errors>

So the c:error element has a serialized form of the error code (a QName) in the attribute "@code". And the error vocabulary says that code is an EQName, which means that it may be in the colonized name notation or the Q{...} notation. So given the current specs, a processor is free to use either notation.

My point is that we have to say which of the notations a pipeline author can expect, e.g. in a stylesheet transforming the error document.

My initial proposal was to decide to use either the "n:x" or the "Q{...}x" notation. If I understand @gimsieke's proposal correctly (not sure), he wants the @code attribute in the c:error element to be serialized in the same notation that was used in option @code on the p:error step.
I could live with such a rule, which requires the processor to remember the way in which the QName was constructed.

However, it does not cover all cases:

  1. What notation do we use when the QName is constructed using @code, @code-prefix and @code-namespace?
  2. What format do we use for dynamic XProc errors raised by the processor? Here the "keep it as it is" rule does not work.
@xml-project

Contributor

xml-project commented Oct 25, 2018

@gimsieke:
Sorry, obviously we have a communication problem here and I am not sure where it comes from. Any hint from your side would be helpful, because the point I would like to make seems pretty obvious and uncontroversial to me. But as we are arguing about it, I am evidently failing to communicate my point correctly.

One difference between your understanding and mine is this quote:

> Your presupposition is: A processor may change from QName to URIQualifiedName at will. I question this.

My presupposition is that XProc 3.0 and XPath 3.1 define TWO equivalent notations to write (serialize) QNames: the (traditional) colonized name (with a namespace declaration somewhere) and the URIQualifiedName notation. In my reading, neither of the two formats IS a QName; they are notations to construct and serialize a QName.

So the problem is: I have an instance of xs:QName and I need to serialize it. The current specs say that for @code I can do it in any way the EQName production rule allows. And since the production rule defines two alternatives, I can use either of them.

You say you question this, but I cannot see on which grounds, because in my reading the production rule defines two alternatives. Since I do not expect you to question the latter, there must be some deeper point in your argument that I do not get.

@xml-project

Contributor

xml-project commented Oct 25, 2018

Here is another way to make my point: More code, less prose:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:my-err="http://my-error-namespace">
  <p:output port="result" />
  <p:try>
    <p:error>
      <p:with-option name="code" select="?" code-prefix="?" code-namespace="?" />
    </p:error>
    <p:catch name="catcher">
      <p:output port="result" />
      <p:identity>
        <p:with-input pipe="error@catcher" />
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>

I think the purpose of this pipeline is pretty obvious: I raise an error, catch the error and the result of the pipeline is the c:errors-document that appeared on port @error of step "catcher".

Now I want to write a Schematron test for /c:errors/c:error/@code.
Which values are right and which are wrong for different values of "@select", "@code-prefix" and "@code-namespace" (and where do we say this in the specs)?
Different cases:

(1) select="my-err:error" code-prefix="()" code-namespace="()"
(2) select="Q{http://my-error-namespace}" code-prefix="()" code-namespace="()"
(3) select="QName('http://my-error-namespace', 'error')", others ()
(4) select="QName('http://my-error-namespace', 'my-err:code')", others ()
(5) select="error" code-prefix="()" code-namespace="http://my-error-namespace"
(6) select="error" code-prefix="my-err" code-namespace="http://my-error-namespace"

According to the current specs, /c:errors/c:error/@code is an EQName, so correct values are either the colonised name notation (with the namespace declared properly) or the URIQualifiedName syntax.

Did I make my point? What did I miss?

@xml-project

Contributor

xml-project commented Oct 25, 2018

@ndw Sorry that my answer made you post the same text as before. I hope that was a mistake.

Never mind. Maybe we should close this issue, move on, and wait until someone discovers the problem who is able to explain it better than I can.

@ndw

Contributor

ndw commented Oct 25, 2018

Sorry. That previous comment was lying in textarea for a while.

With respect to the example above, @xml-project , there aren't any code-prefix or code-namespace attributes on p:with-option so I'm not sure how to interpret them.

I'd expect each of the following to result in the same thing:

<p:error xmlns:t="http://example.com/" code="t:foo"/>
<p:error code="foo" code-namespace="http://example.com/"/>
<p:error code="foo" code-prefix="t" code-namespace="http://example.com/"/>

Each raises an error with a code that has the QName Q{http://example.com/}foo

I'd expect that to be represented in the error vocabulary as:

<c:error xmlns:t="http://example.com/" code="t:foo"/>
<c:error xmlns:random="http://example.com/" code="random:foo"/>
<c:error xmlns:t="http://example.com/" code="t:foo"/>

Maybe that helps.
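
The claim that all three p:error forms denote the same QName can be sketched in Python as follows. This is an illustration under assumptions, not the spec's algorithm: the function name `error_code_name` is hypothetical, an unprefixed code without code-namespace is assumed to be in no namespace, and the dynamic error for combining a prefixed code with code-namespace is modelled as a plain `ValueError`.

```python
def error_code_name(code, in_scope, code_prefix=None, code_namespace=None):
    """Return (namespace_uri, local_name) for the options of a p:error step.

    `in_scope` maps prefixes to namespace URIs at the p:error element.
    `code_prefix` does not affect the name itself; it is at most a hint
    for how the name might be serialized later.
    """
    if code_namespace is not None:
        if ":" in code:
            raise ValueError("code must not be prefixed when code-namespace is given")
        return (code_namespace, code)
    if code.startswith("Q{"):
        uri, _, local = code[2:].partition("}")
        return (uri, local)
    prefix, sep, local = code.partition(":")
    if not sep:
        # Assumption: an unprefixed code is in no namespace.
        return ("", code)
    return (in_scope[prefix], local)


bindings = {"t": "http://example.com/"}
first = error_code_name("t:foo", bindings)
second = error_code_name("foo", {}, code_namespace="http://example.com/")
third = error_code_name("foo", {}, code_prefix="t", code_namespace="http://example.com/")
print(first == second == third)
```

The sketch deliberately stops at the name itself. Which prefix, if any, appears in the resulting c:error/@code is a separate serialization choice, which is the point under discussion in this thread.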

@xml-project

Contributor

xml-project commented Oct 25, 2018

> With respect to the example above, @xml-project , there aren't any code-prefix or code-namespace attributes on p:with-option so I'm not sure how to interpret them.

Yes, the syntax is wrong, but you get the point in your examples.

Concerning your three answers: I think your answers are right. But I think:

<c:error code="Q{http://example.com/}foo" />

would also be right in all three cases, because the type annotation for c:error/@code is EQName, and both forms are EQNames.
Why am I wrong?

@xml-project

Contributor

xml-project commented Oct 26, 2018

OK, I fucked it up totally. Sorry for stealing your time, gentlemen.

@ndw

Contributor

ndw commented Oct 26, 2018

No need to apologize. I figure there was at least a 50-50 chance I was the one fucking up! 🙂
