Desiderata for an ixml schema #28

Closed
cmsmcq opened this issue Jan 17, 2022 · 2 comments

Comments

@cmsmcq
Contributor

cmsmcq commented Jan 17, 2022

On 4 December, I raised an issue on the public community group discussion list:

In a recent meeting I think I mentioned my goal of writing a transform that will read an ixml grammar and produce a schema describing the XML documents that can be produced by parsing input against that grammar. One use of such a schema, for me, is to allow syntax-directed editing of data described by an ixml grammar — notably including ixml itself.

If we want to encourage or require conforming processors to accept grammars in XML, having an authoritative schema describing the set of grammars they should or must accept might be helpful.

In practice, however, I find that when I work with ixml grammars in XML, I frequently want to annotate them; often I build a pipeline of XSLT transforms which begin by doing some straightforward annotations (e.g. making lists of all possible ancestors or descendants of a nonterminal, or recording whether a nonterminal generates the empty string) and then create a related grammar based in part on those annotations. A schema that matches only documents that could be produced by parsing an ixml grammar is no good to me, because it doesn't handle my annotations.
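
As a rough illustration of the kind of annotation step described above, here is a minimal Python sketch (not the XSLT pipeline itself) that records whether each nonterminal can generate the empty string. It assumes the XML form of a grammar uses the element names `rule`, `alt`, `alts`, `nonterminal`, `option`, `repeat0` and `repeat1`, and that literals and character sets always consume input. The annotation namespace and the attribute name `nullable` are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation namespace, chosen only for this example.
ANN_NS = "http://example.org/ns/grammar-annotations"

# Element names assumed to stand for terms inside an alternative.
TERMS = {"nonterminal", "literal", "inclusion", "exclusion",
         "option", "repeat0", "repeat1", "alts"}


def nullable_nonterminals(grammar):
    """Return the names of rules that can generate the empty string."""
    nullable = set()

    def terms(node):
        # Skip comments, separators and anything else that is not a term.
        return [child for child in node if child.tag in TERMS]

    def empty_ok(node):
        tag = node.tag
        if tag in ("option", "repeat0"):
            return True
        if tag == "repeat1":
            # One occurrence suffices, so only the repeated term matters.
            return empty_ok(terms(node)[0])
        if tag == "nonterminal":
            return node.get("name") in nullable
        if tag == "alt":
            # An empty alternative has no terms, so all() is True.
            return all(empty_ok(t) for t in terms(node))
        if tag in ("rule", "alts"):
            return any(empty_ok(a) for a in node.findall("alt"))
        return False  # assumption: terminals always consume input

    # Iterate to a fixed point, since rules may refer to later rules.
    changed = True
    while changed:
        changed = False
        for rule in grammar.findall("rule"):
            name = rule.get("name")
            if name not in nullable and empty_ok(rule):
                nullable.add(name)
                changed = True
    return nullable


def annotate_nullable(grammar):
    """Record the result on each rule as a namespaced attribute."""
    names = nullable_nonterminals(grammar)
    for rule in grammar.findall("rule"):
        value = "true" if rule.get("name") in names else "false"
        rule.set("{%s}nullable" % ANN_NS, value)
    return grammar
```

The annotated grammar now carries an attribute that a schema matching only parser output would reject, which is the situation described in this paragraph.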

It's easy enough to extend a standard schema for ixml with rules saying that namespaced non-ixml attributes are valid on any element, and that namespaced non-ixml elements can appear anywhere. So it's not a requirement that a standard schema for ixml allow extension attributes and extension elements. But I suspect that the desire for a schema allowing extension attributes and extension elements may not be limited to me.

For purposes of discussion, then, I make the following proposal regarding a schema for ixml grammars in XML form:

  1. We should have a standard schema pointed to from the spec.

  2. In fact, we should have two:

    • one that describes as closely as possible the set of XML documents that can be generated by parsing an ixml grammar against the standard ixml.ixml, and

    • one that also allows extension attributes and extension elements.

I'll call these the narrow schema and the broad schema. (Some may prefer 'strict' and 'lax'.)

  3. Exactly what counts as an extension attribute or extension element is tbd.

We might require declaration of extension namespaces in the style of XSLT. For the moment, my proposal would be: any namespace-qualified attribute or element (whose namespace is not the ixml namespace) can occur at any position as a child or attribute of a standard ixml element. (Possible exception for comments?) That's easy to achieve with wildcards.
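
To make the provisional definition concrete, the following Python sketch lists the extension elements and attributes found in a grammar. It is only an illustration under stated assumptions: the ixml namespace URI used here is taken to be `http://invisiblexml.org/NS`, the function names are invented, and the walk deliberately does not look inside extension elements, since their contents are left to whoever defines them.

```python
import xml.etree.ElementTree as ET

# Assumption: this is the ixml namespace the proposal refers to.
IXML_NS = "http://invisiblexml.org/NS"


def is_extension(name):
    """True for a namespace-qualified name outside the ixml namespace."""
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    return (isinstance(name, str)
            and name.startswith("{")
            and not name.startswith("{" + IXML_NS + "}"))


def find_extensions(root):
    """Return (extension element names, extension attribute names)."""
    elements, attributes = [], []

    def visit(elem):
        attributes.extend(a for a in elem.attrib if is_extension(a))
        for child in elem:
            if is_extension(child.tag):
                # Record it, but do not descend into its contents.
                elements.append(child.tag)
            else:
                visit(child)

    visit(root)
    return elements, attributes
```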

What can occur inside an extension element is a matter for those who define it; in particular, ixml does not forbid extension elements from having children or attributes in the ixml namespace defined by the ixml spec.

Under the pragmas proposal Tom Hillman and I are working on, some non-ixml elements, attributes, and processing instructions in an XML grammar will count as pragmas, but not necessarily all. (Pragmas will, we hope, have the property that they can be written out in ixml form without loss of information; that is not guaranteed true of other extension elements.)

  4. The spec should say that:

    • Conforming processors MUST (or SHOULD -- open question, I guess) accept grammars in XML that conform to the narrow schema.

    • Conforming processors SHOULD (or MUST?) accept grammars that conform to the broad schema.

The standard interpretation of a broad-schema grammar is the same as the interpretation of the narrow-schema grammar that would result if we removed all extension elements (with all their contents) and all extension attributes.
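
The reduction from a broad-schema grammar to its narrow-schema counterpart can be sketched in a few lines of Python. This is a minimal sketch, not a normative procedure: it assumes an extension is any namespace-qualified name outside the ixml namespace (taken here to be `http://invisiblexml.org/NS`), and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: this is the ixml namespace the proposal refers to.
IXML_NS = "http://invisiblexml.org/NS"


def _is_extension(name):
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    return (isinstance(name, str)
            and name.startswith("{")
            and not name.startswith("{" + IXML_NS + "}"))


def strip_extensions(elem):
    """Remove extension attributes and elements in place; return elem."""
    for attr in [a for a in elem.attrib if _is_extension(a)]:
        del elem.attrib[attr]
    for child in list(elem):
        if _is_extension(child.tag):
            # Drops the element with all its contents. ElementTree's
            # remove() also drops any text that followed the element.
            elem.remove(child)
        else:
            strip_extensions(child)
    return elem
```

A processor that accepts broad-schema grammars could apply such a step before handing the grammar to code that expects only narrow-schema input.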

I wonder what other people think.

@spemberton
Member

spemberton commented Jan 18, 2022 via email

@cmsmcq
Contributor Author

cmsmcq commented Jan 18, 2022 via email
