Consider a mechanism for defining enumerated values for options #613

Open
ndw opened this Issue Nov 1, 2018 · 20 comments

4 participants
@ndw
Contributor

ndw commented Nov 1, 2018

See if we can leverage this for the QNames as option values magic.

@ndw ndw self-assigned this Nov 1, 2018

@ndw

Contributor

ndw commented Nov 2, 2018

My first thought is that we add a values attribute anywhere that as can appear. The values attribute is a token or a list of values that may be allowed:

  <p:option name="mode" values="('lax','strict')"/>

Making it an XPath list makes it easy to parse and avoids questions about what happens if you want the separator character in the list of values. Note that it's not an AVT; let's try to keep this simple.

We can say that the token "QName" means the value should be a QName (and the "QNames as option values" rules apply) and the token "QNames" means it should be a list of QNames.

Thoughts?

@ndw

Contributor

ndw commented Nov 2, 2018

We can add tokens for XPathExpression, XSLTSelectionPattern, XPathSequenceType, ContentType and ContentTypes too and remove those silly comments from our syntax summaries!

@xml-project

Contributor

xml-project commented Nov 2, 2018

First thought: Cool!
Second thought: Defer to 4.0 ?

@gimsieke

Contributor

gimsieke commented Nov 2, 2018

So in p:insert, we’d have something like

<p:option name="position" required="true" as="xs:NMTOKEN" 
  values="('first-child', 'last-child', 'before', 'after')"/>

instead of

<p:option name="position" required="true" as="xs:token"/>   
  <!-- "first-child" | "last-child" | "before" | "after" -->

ContentType is used infrequently, for example on p:cast-content-type. I think that we can neither provide a finite list of tokens nor call it QNames, therefore I’d leave it as xs:string or maybe xs:token.

An XSLTSelectionPattern occurs, for example, in p:add-attribute/@match. I don’t see that a step author would like to restrict these values by giving a finite list of XSLT matching patterns.

Similar arguments apply to XPathExpression and XPathSequenceType; I don’t see value lists there.

Maybe this @values approach is only relevant for strings/tokens/QNames, and then users should be able to specify them, too, in their step declarations.

@ndw

Contributor

ndw commented Nov 2, 2018

What I was trying to say was that values is either a list of values or a single token that describes the values in more detail. The position attribute would be just as you say, and the match attribute would be

  <p:option name="match" select="/*" values="XSLTMatchPattern"/>

@gimsieke

Contributor

gimsieke commented Nov 2, 2018

Ah, I see

@ndw

Contributor

ndw commented Nov 2, 2018

With respect to "defer to 4.0", I'm very sympathetic. I'd like to. But I'm also very uncomfortable with the fact that we have no way of allowing the magic of "QNames as option values" to apply to user-defined steps. It's going to seem wholly unfair and confusing that I can say:

  <p:add-attribute attribute-name="ex:foo"/>

But I have to say:

<ex:my-step attribute-name="{xs:QName('ex:foo')}"/>

I note that we (attempted to) finesse this in the steps introduction document by saying

Types in the XML Schema namespace, identified as QNames with the xs: prefix, as per the XML Schema specification with one exception. Anywhere an xs:QName is specified, an EQName is allowed.

If we don't have a separate mechanism for allowing users to identify options that can have QName values, I'd like to go back to that approach. It's a hack, but one that will benefit our users, I think.

@xml-project

Contributor

xml-project commented Nov 6, 2018

@ndw I need some help to understand your point because I fail to see how the two use cases you mentioned are connected.
The "attribute" case is, in my understanding, an instance of what XDM (or XPath 3.1) calls a "union type": an attribute may be an instance of XDM type A or XDM type B. If it is A, the processor does action A (to produce a QName); if it is B, some other action (or none) is invoked.
The "string case" looks to me like an xs:restriction, where I use a base type (xs:string) but restrict the allowed value range to some token or pattern.
How are these two cases connected? I fail to see it.

@ndw

Contributor

ndw commented Nov 6, 2018

@xml-project yes, it would be possible to achieve some of the validation goals with union types and restrictions. But that would require making XSD types for the attributes and performing XSD validation on the attributes. I'm not proposing that we do either of those. From a schema-validation perspective, the type of the values attribute is simply xsd:string (or text, if you prefer).

What I'm saying is that the values that attribute can have in an XProc pipeline are either a list of strings or a single token. A list of strings is an open paren followed by a comma-separated list of quoted strings followed by a close paren. A token is a single keyword.

If the values attribute is a list of strings, then the values that the option on which it occurs can have are limited to the list of values specified.

If the values attribute is a token, then it implies the semantics of that token type, which we define in the XProc spec.

Any other value is an error.

<p:option name="metavariable" as="xs:string" values="('foo','bar','baz')"/>

The metavariable option must be one of the three specified string values.

<p:option name="match" as="xs:string" values="XSLTSelectionPattern"/>

The match option must satisfy the constraints of what XProc dictates for an XSLTSelectionPattern.

<p:option name="attribute-name" as="xs:string" values="QName"/>

The attribute-name option must satisfy the constraints of what XProc dictates for a QName. In this case, that's how all of the special rules for parsing QName option values are applied. And it's available equally to our steps and to user-defined steps.
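
As a rough illustration of the parsing rule described above (a parenthesised, comma-separated list of quoted strings, or a single keyword, anything else being an error), here is a minimal Python sketch. The function name `parse_values`, the `KNOWN_TOKENS` set, and the doubled-quote escaping rule (borrowed from XPath string literals) are assumptions made for the example; the thread does not fix a final token list or an escaping rule.

```python
import re

# Hypothetical keyword set: these names are mentioned in the thread,
# but no final list has been agreed.
KNOWN_TOKENS = {
    "QName", "QNames", "XPathExpression", "XSLTSelectionPattern",
    "XPathSequenceType", "ContentType", "ContentTypes",
}

# One quoted string literal. Assumption: a doubled quote escapes the
# delimiter, as in XPath string literals.
_STRING = r"'(?:[^']|'')*'" + "|" + r'"(?:[^"]|"")*"'

# "(" literal ("," literal)* ")" with optional whitespace.
_LIST = re.compile(rf"\(\s*(?:{_STRING})(?:\s*,\s*(?:{_STRING}))*\s*\)")


def parse_values(attr):
    """Return ('list', [strings]) or ('token', name); raise ValueError otherwise."""
    text = attr.strip()
    if text in KNOWN_TOKENS:
        return ("token", text)
    if _LIST.fullmatch(text):
        items = []
        for match in re.finditer(_STRING, text):
            literal = match.group(0)
            quote = literal[0]
            # Strip the delimiters and undo the doubled-quote escape.
            items.append(literal[1:-1].replace(quote * 2, quote))
        return ("list", items)
    raise ValueError(f"invalid values attribute: {attr!r}")


if __name__ == "__main__":
    print(parse_values("('foo','bar','baz')"))
    print(parse_values("QName"))
```

With a list result, checking a supplied option value is a simple membership test; with a token result, the processor would apply whatever semantics the specification attaches to that keyword.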

Caveat

While I'm in favor of the values attribute, I think we could solve the problem of QName value types by simply stating that values of type xs:QName have these special semantics. We aren't obligated to say that our use of the as attribute always implies that the lexical value of the option must satisfy the constraints of the declared sequence type.

@xml-project

Contributor

xml-project commented Nov 7, 2018

OK, now I get the idea. And I like it, but...
I am pretty sure that this will open a can of worms sooner or later. I am prepared to bet a round of beer for the four of us in Prague that soon someone will come along and demand value restriction by regex. I am also pretty sure I could bet on who this will be. ;-))

And then: why restrict only instances of xs:string? What about

<p:option name="att" as="xs:integer" values="(1, 3, 5, 7, 9)" />

or even better: let us use functions for value restriction:

<p:option name="att" as="xs:integer" values="f(x){x &lt; 10 and x mod 2 = 1}" />

And: What about user defined types, so I do not have to repeat myself over and over again?

<p:values name="type" values="('one', 'two', 'three')" />
<p:option name="att" as="xs:string" values="type" />
<p:option name="att1" as="xs:string" values="type" />

I think all these demands would be useful and would make pipelines much safer. (Did I mention yet that I am a big fan of strongly typed languages?) Moving the value check from the authors to the processor will save a lot of typing. But given our February deadline, I do not see myself delivering this.

Sorry for these (paranoid) remarks.

@ndw

Contributor

ndw commented Nov 7, 2018

I agree we're not delivering all of those things in February.

I think we could deliver the small subset I enumerated.

Or we could go with my "caveat" solution and simply claim that we treat the sequence type "xs:QName" in a special way.

I am strongly opposed to the current situation where we can define steps that have a special semantics for some options but users cannot specify that behavior for their pipelines.

@xml-project

Contributor

xml-project commented Nov 7, 2018

I am strongly opposed to the current situation where we can define steps that have a special semantics for some options but users cannot specify that behaviour for their pipelines.

I see your point, and from an aesthetic perspective I completely agree with you. But I am not as dissatisfied with the current state as you are, because there are workarounds for pipeline authors.

@gimsieke

Contributor

gimsieke commented Nov 7, 2018

We can make it implementation-defined, like:

  • Each implementation must not flag @values as an error
  • For the standard step library, @values is purely declarative since each implementation must adhere to the spec’s prose. This prose can be phrased succinctly by using @values in the step signatures.
  • For user-defined steps, an implementation can, for example, state that it supports sequences of atomic-type values that can be cast to the declared sequence type (the @as attribute), or that it only supports lists of QNames, or that it ignores @values entirely, or that it also supports XSLTSelectionPattern, etc.

@eriksiegel

Contributor

eriksiegel commented Nov 8, 2018

Whatever we decide (and I think it's mostly up to the implementors what they can do), I'm all in favor. Everything that advances the ability to engineer your software better should get a round of applause.

If we decide it cannot be done for the February release, that's OK as well. I can live with the current situation.

@xml-project

Contributor

xml-project commented Nov 8, 2018

What is currently unclear to me is the interplay between '@as' and '@values'.
Most examples use both '@as' and '@values'. This suggests that the processor is expected to do a type check (sequence type matching) and, if this does not fail, check whether the value provided is equal to one of the listed values or satisfies the implicit rules of the token. But then there is

<p:option name="attribute-name" as="xs:string" values="QName"/>

which only makes sense, IMHO, if we say that '@as' is ignored if '@values' is present, because

<p:with-option name="attribute-name" select="QName('uri','name')" />

would comply with the XProc production rules for QName, but is not an instance of xs:string. I do not think that it is a good idea to say that '@as' is ignored if '@values' is present, because this would allow me to write such things as

<p:option name="opt" as="xs:integer" values="('lax', 'strict')" />

I think we should make the semantics more precise in our call tomorrow, to see what work is actually connected to the idea.
Having done this, we could then decide whether @gimsieke's proposal to "make it implementation-defined" is a viable way. Having thought about it today, I currently would say that we should not take this path, because the idea changes the way we write pipelines dramatically and moves a great deal of checking from the authors to the processor. Making it implementation-defined means that I cannot really rely on it (and have to write the checks myself), because the implementor might choose not to support this feature in a later processor version, or the customer demands interoperable pipelines. But maybe I still got the whole idea wrong.
See you tomorrow.

@ndw

Contributor

ndw commented Nov 9, 2018

The devil is always in the details. I think I'd suggest that if values is a list, then it must be a list of values that are compatible with any specified as type. (I assume that we say that if as isn't specified then xs:string is used.) If values is a token, then as is forbidden. The whole point of the tokens is to do things that we can't express in sequence types.
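
To make the interplay concrete, here is a minimal Python sketch of the rules just suggested: a token forbids `as`, and a list must be compatible with the declared type, with xs:string assumed when `as` is absent. The function name `check_declaration` and the tiny `LEXICAL_CHECKS` table are hypothetical, and the lexical checks only approximate XSD casting for three types.

```python
import re

# Hypothetical, deliberately tiny table of lexical checks for a few atomic
# types. A real processor would use its XPath type system instead.
LEXICAL_CHECKS = {
    "xs:string": lambda s: True,
    "xs:integer": lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
    "xs:boolean": lambda s: s in ("true", "false", "1", "0"),
}


def check_declaration(kind, payload, as_type=None):
    """Validate one option declaration.

    kind    -- 'list' or 'token' (the two forms of the values attribute)
    payload -- list of lexical values for 'list', keyword name for 'token'
    as_type -- the declared 'as' type, or None if absent

    Returns the effective type for a list and None for a token.
    """
    if kind == "token":
        if as_type is not None:
            raise ValueError("'as' is forbidden when 'values' is a token")
        return None
    if kind != "list":
        raise ValueError(f"unknown kind: {kind!r}")
    # Assumption taken from the comment above: xs:string when 'as' is absent.
    effective = as_type if as_type is not None else "xs:string"
    check = LEXICAL_CHECKS.get(effective)
    if check is None:
        raise ValueError(f"type not covered by this sketch: {effective}")
    bad = [value for value in payload if not check(value)]
    if bad:
        raise ValueError(f"values {bad} are not valid for {effective}")
    return effective
```

Under these rules a declaration such as `as="xs:integer" values="('lax', 'strict')"` is rejected when the step is declared, not when it is invoked.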

@xml-project what is the workaround you have in mind in your preceding comment to that effect?

@xml-project

Contributor

xml-project commented Nov 9, 2018

Thanks, @ndw. Clearing up those details helps to see how expensive the idea is. I think your clarification is reasonable, although I would argue that in the list case we should stick with the XProc standard default if @as is not specified. So it should be "item()*".
Forbidding @as for the token case solves a lot of problems, but we lose the quantifier from sequence types. I would therefore suggest adding the quantifier to the token case: values="QName*".

Concerning the workaround, I was talking about the QName magic. It is IMHO a rather rare case that I have something which is either a QName or a (QName) string and want to supply this to a user-declared step. My approach would be:

<p:declare-step type="step">
  <p:option name="qName" as="xs:anyAtomicType" />
  <p:add-attribute attribute-name="{$qName}" ... />
</p:declare-step>

The p:add-attribute will fail if $qName is neither a string nor a QName, or if the string cannot be made into a QName.[1]

If I want a more user-friendly pipeline, I would wrap p:add-attribute in a p:choose and write out an error if the value of $qName does not fit.

What did I miss?

@ndw

Contributor

ndw commented Nov 9, 2018

The heart of my objection is that the qname-as-option-value solution you propose only works for the steps defined in the XProc namespace. It works for attribute-name on p:add-attribute, but there's no way for me to assert that the token option on my ndw:fancy-pipeline step is an equally magical QName.

At this point, if we don't want to spend time and energy devising and implementing a solution to this problem, I think the simplest answer is to remove all of the magic. We make the sequence type of the attribute-name option xs:QName and users have to make sure the value they pass is a QName. That means this won't work:

<p:add-attribute xmlns:ex="test" attribute-name="ex:foo"/>

but this will and it's not hugely more difficult to type:

<p:add-attribute xmlns:ex="test" attribute-name="{xs:QName('ex:foo')}"/>

If that’s viewed as too backwards incompatible, then I think the XProc 1.0 solution is a reasonable fallback: options declared with the sequence type xs:QName are magic.

@xml-project

Contributor

xml-project commented Nov 9, 2018

there's no way for me to assert that the token option on my ndw:fancy-pipeline step is an equally magical QName.

No, you can't, and I think this is a real problem. My workaround just relies on the fact that @token of ndw:my-fancy-pipeline will finally be supplied to an atomic step which does the magic.

But I think we can do better: Consider (as a sketch)

<p:option
  name = EQName
  as? = XPathSequenceType
  values? = xs:anyAtomicType+ />

If @values is specified, then:

  • if '@values' is a single item and is an instance of xs:token (followed by an optional quantifier) -> Use predefined XProc types ('XPathExpression', 'QName', 'XPathSequenceType' etc.)

  • if '@values' is a single item, xs:string, but not an xs:token -> Regex

  • if '@values' has more than 1 item, it is a list of allowed values.

This means I can do:

<p:option name="opt1" as="xs:string" values="('strict', 'lax')" />
<p:option name="opt2" values="QName?" />
<p:option name="opt3" as="xs:string*" values="fo?" />
<p:option name="opt4" values="(true(), false(), 'yes', 'no', 0, 1)" />

I suppose that will work, and the costs of implementation are not too high.
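
The three-way rule above could be sketched in Python as follows. This is only an illustration under stated assumptions: the name `classify`, the `PREDEFINED` set, and the idea of telling a type name from a regex by looking it up in a fixed list are not part of the proposal, and Python's `re` is used as a stand-in for the XPath regex flavor.

```python
import re

# Hypothetical list of predefined XProc type names (taken from the thread).
PREDEFINED = {
    "QName", "XPathExpression", "XSLTSelectionPattern",
    "XPathSequenceType", "ContentType",
}

# A name followed by an optional occurrence indicator, e.g. "QName?".
_TYPE = re.compile(r"([A-Za-z]+)([?*+]?)")


def classify(items):
    """Classify the items of a values attribute.

    items is the already-evaluated sequence, given here as a list of strings.
    Returns ('list', items), ('type', name, quantifier) or ('regex', pattern).
    """
    if not items:
        raise ValueError("values must contain at least one item")
    if len(items) > 1:
        return ("list", list(items))
    only = items[0]
    match = _TYPE.fullmatch(only)
    # Assumption: a single item is a type only if its name is predefined;
    # every other single string is treated as a regular expression.
    if match and match.group(1) in PREDEFINED:
        return ("type", match.group(1), match.group(2))
    return ("regex", re.compile(only))
```

One consequence of the rule as stated is that a one-item list cannot be expressed: a single string is always read as either a type name or a regex.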

@xml-project

Contributor

xml-project commented Nov 9, 2018

Sorry, forgot: As an alternative it might be useful to move the XProc defined types to @as, so one can write

<p:option name="opt2" as="QName?" />

This would avoid any rule forbidding @as when @values is used, and is IMHO a more harmonious approach. What we need to do is to modify the type annotation of @as:

  as? = (xs:token ('*','?','+')?) | XPathSequenceType

ndw added a commit to ndw/3.0-specification that referenced this issue Nov 12, 2018
