Desiderata for an ixml schema #28
I believe that an ixml grammar is actually a schema in disguise (or an alternative representation of a schema), as I hinted in one of my papers [1].
Maybe it would be of value to try and define a mapping from ixml grammar to some schema format.
[1] https://www.cwi.nl/~steven/Talks/2016/02-12-prague/data.html#L313
Steven
Steven Pemberton writes:
I believe that an ixml grammar is actually a schema in disguise (or an
alternative representation of a schema), as I hinted in one of my
papers [1].
I think you are right (for some meaning of "in disguise").
Maybe it would be of value to try and define a mapping from ixml
grammar to some schema format.
Indeed it would.
When I did so, I ran into (among others) the problem described in the
rest of my mail, namely that if we derive a schema from the grammar for
ixml grammars in the most obvious way, accepting only XML documents
which can be generated by ixml processors for some input, then the
schema is more restrictive than I think in practice it ought to be.
Hence my suggestion that the XML form of ixml grammars should allow
extension attributes and extension elements, following the examples of
XSLT, Relax NG, XSD, and doubtless many many other XML vocabularies.
Michael
…
[1] https://www.cwi.nl/~steven/Talks/2016/02-12-prague/data.html#L313
Steven
On Monday 17 January 2022 19:30:12 (+01:00), C. M. Sperberg-McQueen wrote:
On 4 December, I raised an issue on the public community group discussion list:
In a recent meeting I think I mentioned my goal of writing a transform that will read an ixml grammar and produce a schema describing the XML documents that can be produced by parsing input against that grammar. One use of such a schema, for me, is to allow syntax-directed editing of data described by an ixml grammar — notably including ixml itself.
If we want to encourage or require conforming processors to accept grammars in XML, having an authoritative schema describing the set of grammars they should or must accept might be helpful.
In practice, however, I find that when I work with ixml grammars in XML, I frequently want to annotate them; often I build a pipeline of XSLT transforms which begin by doing some straightforward annotations (e.g. making lists of all possible ancestors or descendants of a nonterminal, or recording whether a nonterminal generates the empty string) and then create a related grammar based in part on those annotations. A schema that matches only documents that could be produced by parsing an ixml grammar is no good to me, because it doesn't handle my annotations.
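As an illustration of the kind of annotation pass described above, here is a minimal sketch in Python (rather than XSLT) that records whether each nonterminal can generate the empty string. Everything specific in it is an assumption: the annotation namespace URI and the `nullable` attribute name are hypothetical, standard ixml elements are assumed to be in no namespace, and only a subset of element names (`rule`, `alt`, `alts`, `nonterminal`, `option`, `repeat0`, `repeat1`) is handled; anything else is treated as non-nullable.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for annotations; not defined by the ixml spec.
EXT_NS = "http://example.org/annotations"


def term_nullable(term, nullable):
    """Return True if this term can match the empty string."""
    tag = term.tag
    if tag in ("option", "repeat0"):
        return True
    if tag == "nonterminal":
        return term.get("name") in nullable
    if tag == "alts":
        return any(alt_nullable(a, nullable) for a in term.findall("alt"))
    if tag == "repeat1":
        children = list(term)
        # One or more repetitions: nullable only if the repeated factor is.
        return bool(children) and term_nullable(children[0], nullable)
    # Literals, character sets, and unknown elements: assumed non-nullable.
    return False


def alt_nullable(alt, nullable):
    """An alternative is nullable if every term in it is (an empty alt is)."""
    return all(term_nullable(t, nullable) for t in alt)


def annotate_nullable(root):
    """Iterate to a fixed point, then write the result as an extension
    attribute on each rule. Returns the set of nullable nonterminal names."""
    rules = {r.get("name"): r for r in root.findall("rule")}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for name, rule in rules.items():
            if name in nullable:
                continue
            if any(alt_nullable(a, nullable) for a in rule.findall("alt")):
                nullable.add(name)
                changed = True
    for name, rule in rules.items():
        rule.set("{%s}nullable" % EXT_NS, "true" if name in nullable else "false")
    return nullable
```

Calling `annotate_nullable(ET.fromstring(grammar_xml))` leaves each `rule` carrying a namespaced attribute, which is exactly the sort of document that a schema describing only parser output would reject.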
It's easy enough to extend a standard schema for ixml with rules saying that namespaced non-ixml attributes are valid on any element, and that namespaced non-ixml elements can appear anywhere. So it's not a requirement that a standard schema for ixml allow extension attributes and extension elements. But I suspect that the desire for a schema allowing extension attributes and extension elements may not be limited to me.
For purposes of discussion, then, I make the following proposal regarding a schema for ixml grammars in XML form:
We should have a standard schema pointed to from the spec.
In fact, we should have two:
one that describes as closely as possible the set of XML documents that can be generated by parsing an ixml grammar against the standard ixml.ixml, and
one that also allows extension attributes and extension elements.
I'll call these the narrow schema and the broad schema. (Some may prefer 'strict' and 'lax'.)
Exactly what counts as an extension attribute or extension element is still to be decided.
We might require declaration of extension namespaces in the style of XSLT. For the moment, my proposal would be: any namespace-qualified attribute or element whose namespace is not the ixml namespace can occur at any position as an attribute or child of a standard ixml element. (Possible exception for comments?) That's easy to achieve with wildcards.
What can occur inside an extension element is a matter for those who define it; in particular, ixml does not forbid extension elements from having children or attributes in the ixml namespace defined by the ixml spec.
Under the pragmas proposal Tom Hillman and I are working on, some non-ixml elements, attributes, and processing instructions in an XML grammar will count as pragmas, but not necessarily all. (Pragmas will, we hope, have the property that they can be written out in ixml form without loss of information; that is not guaranteed to be true of other extension elements.)
The spec should say that:
Conforming processors MUST (or SHOULD -- open question, I guess) accept grammars in XML that conform to the narrow schema.
Conforming processors SHOULD (or MUST?) accept grammars that conform to the broad schema.
The standard interpretation of a broad-schema grammar is the same as the interpretation of the narrow-schema grammar that would result if we removed all extension elements (with all their contents) and all extension attributes.
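The reduction from a broad-schema grammar to its narrow-schema equivalent could be sketched as follows. This is only an illustration of the rule just stated, not a normative algorithm: the value of `IXML_NS` is an assumption, standard ixml elements are assumed to be either in no namespace or in that namespace, and `strip_extensions` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed URI for the ixml namespace; check the spec before relying on it.
IXML_NS = "http://invisiblexml.org/NS"


def is_extension(name):
    """True for a namespace-qualified name outside the ixml namespace.

    ElementTree writes qualified names as '{uri}local'; unqualified
    names carry no '{' prefix and are never treated as extensions.
    """
    return name.startswith("{") and not name.startswith("{" + IXML_NS + "}")


def strip_extensions(elem):
    """Remove extension attributes and extension elements (with all
    their contents) from elem and its descendants, in place."""
    for attr in [a for a in elem.attrib if is_extension(a)]:
        del elem.attrib[attr]
    for child in list(elem):
        if is_extension(child.tag):
            # The whole subtree goes, even if it contains ixml elements.
            elem.remove(child)
        else:
            strip_extensions(child)
    return elem
```

A processor following the proposal could apply `strip_extensions` to the parsed grammar and then interpret the result exactly as it would a narrow-schema grammar. Note that ixml-named elements nested inside an extension element are discarded along with it, in line with "with all their contents".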
I wonder what other people think.
--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com