xsl:* (or other non-sch:*) not allowed within constraintSpec #1607

Closed
ebeshero opened this issue Mar 12, 2017 · 29 comments

@ebeshero
Member

ebeshero commented Mar 12, 2017

Some Schematron rules require the use of <xsl:variable> when particular functionality is missing from a Schematron variable. One example involves testing whether Roman numerals in one element match a numerical value in a nearby attribute (for which one might wish to define an <xsl:variable> to convert the value of the number into a Roman numeral using @format="I", which is unavailable in the Schematron variable).

Schematron permits the embedding of xsl: variables in its syntax in a standalone file. However, if one attempts to write this in an ODD constraintSpec, it's flagged as an error because the descendant elements of <constraint> must all be defined in the sch: (Schematron) namespace. If one ignores the tei_odds schema error, one can output a Relax NG Schema in XML syntax that holds the Schematron rule with its variable, but it's apparently not fully functional in the Relax NG context: Schematron contexts and variables don't pass into the xsl:variable.

I've made this work in my project by compiling the ODD schema without the rule with the xsl:variable, and just storing it in a separate Schematron file that I associate with the files that require Roman numeral checking. But it would be nice to see full functionality of Schematron in ODD, including embedded XSLT that might reasonably be invoked in a Schematron file.
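
For illustration, a minimal sketch of the kind of standalone Schematron rule described above. It assumes a processor that passes embedded XSLT through to the generated stylesheet (as reported later in this thread for the skeleton implementation used by oXygen). The context `tei:num[@type='roman'][@value]` and the variable name `roman` are hypothetical, chosen only for the example; the conversion itself is done by `<xsl:number>` with `format="I"` inside the `<xsl:variable>`.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <!-- Hypothetical context: a <num> whose text is a Roman numeral
         and whose @value holds the corresponding integer -->
    <sch:rule context="tei:num[@type='roman'][@value]">
      <!-- Embedded XSLT: format the integer in @value as an
           upper-case Roman numeral -->
      <xsl:variable name="roman">
        <xsl:number value="@value" format="I"/>
      </xsl:variable>
      <sch:assert test="normalize-space(.) eq string($roman)">
        The numeral "<sch:value-of select="normalize-space(.)"/>" does not
        match @value (expected <sch:value-of select="$roman"/>).
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```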

@ebeshero ebeshero changed the title ODDs won't permit or process xsl:* (or other non-sch:*) within constraintSpec xsl:* (or other non-sch:*) not allowed within constraintSpec Mar 12, 2017
@martindholmes
Contributor

Can you not use <let> for this?

https://www.xml.com/pub/a/2003/11/12/schematron.html#Variables_using_let

@ebeshero
Member Author

ebeshero commented Mar 12, 2017

@martindholmes Alas, not that I can see. The Schematron <let> variables don't permit the same simple and concise reformatting of numerals (or dates or times, for that matter). For that I needed to embed an XSL variable with an @format attribute. The solution took me a while to research, but the code itself is pretty straightforward--just embedding an xsl:variable in the pattern to handle the conversion, and then calling it in the sch:assert or sch:report test. That transfer of variables from sch to xsl and back to sch is what's not happening in the ODD context.

@ebeshero
Member Author

ebeshero commented Mar 12, 2017

@martindholmes OK--for dates, there is a handy XPath function for conversions that would work in sch:let: http://www.sixtree.com.au/articles/2013/formatting-dates-and-times-using-xslt-2.0-and-xpath/

I see we also have a format-integer() function in XPath 3.0, which works similarly on picture strings and patterns. But the xsl:variable solution is a simpler (less verbose) way of converting numbers to Roman numerals. And it's tidy--a convenient functionality that works in standalone Schematron. So that's really my question here--why can't we use xsl:variable functionality in the ODD environment just as we can in native Schematron? Just because we could write a more verbose XPath function in a Schematron let statement doesn't make that solution more desirable.
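
For comparison, a sketch of the sch:let alternative mentioned here. It assumes a processor that supports XPath 3.0, since `format-integer()` is not part of XPath 2.0. The context and the variable name `roman` are hypothetical, and the fragment is meant to sit inside an `<sch:schema>`.

```xml
<sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
<sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>
<sch:pattern>
  <!-- Hypothetical context, for illustration only -->
  <sch:rule context="tei:num[@type='roman'][@value]">
    <!-- format-integer() with the picture 'I' yields upper-case
         Roman numerals; requires an XPath 3.0 processor -->
    <sch:let name="roman"
             value="format-integer(xs:integer(@value), 'I')"/>
    <sch:assert test="normalize-space(.) eq $roman">
      The numeral does not match @value
      (expected <sch:value-of select="$roman"/>).
    </sch:assert>
  </sch:rule>
</sch:pattern>
```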

We ought to have the benefit of the full range of things we can write in a Schematron file within the constraintSpec elements of an ODD.

@martindholmes
Contributor

I don't see any objection to allowing elements in the XSLT namespace into constraintSpec, and your use-case seems compelling.

@rvdb
Contributor

rvdb commented Mar 13, 2017

By coincidence, last Friday I had started drafting an issue on exactly this same problem (tei_odds being too strict about xsl descendants of <constraint>), while (in some respect) the stylesheets are more permissive and are copying some XSLT elements to the generated schema.

I was holding back on posting the issue here, since it involves multiple TEI repos (TEI and Stylesheets, to tackle some XSLT elements that are not being copied properly, while they seem perfectly valid in the generated schema). While experimenting a bit in order to get a good test case, I noticed that all XSLT elements I drop into Schematron tend to be accepted and executed by the validation engine in Oxygen. Yet I'm not at all sure which XSLT elements are really allowed, or whether the (validation framework in) Oxygen might be too permissive, so I'm trying to clarify this first. I've posted this on the Oxygen forum (https://www.oxygenxml.com/forum/topic14236.html), hoping to get feedback there, so I know what XSLT content is actually allowed before making any suggestions for ODD processing.

@rvdb
Contributor

rvdb commented Mar 13, 2017

Ok, it seems that a) while XSLT support inside Schematron is implementation-dependent, b) the Schematron “Skeleton” Implementation that is used by Oxygen allows for XSLT inside Schematron. In his answer to my question Octavian mentions this implementation as the reference implementation, which seems to me a further argument for allowing xsl:* elements inside <constraint>, and relaxing the Schematron check in the files that are being generated at http://jenkins.tei-c.org/job/oxygen-tei-bleeding/ws/oxygen-tei/frameworks/tei/xml/tei/custom/schema/relaxng/tei_odds.rnc and http://jenkins.tei-c.org/job/oxygen-tei-bleeding/ws/oxygen-tei/frameworks/tei/xml/tei/custom/schema/relaxng/tei_odds.rng (I haven't found a source file containing this check, though, so I don't know where it is coming from).

@martindholmes
Contributor

Annex H of the Schematron 2016 standard:

http://standards.iso.org/ittf/PubliclyAvailableStandards/c055982_ISO_IEC_19757-3_2016.zip

should tell us what aspects of XPath and XSLT2 are formally supported by Schematron. It says:

The xslt2 query language binding allows schemas implemented using XSLT2.

I'm not sure whether this means that anything and everything from XSLT2 is allowed in a Schematron rule if the xslt2 query language binding is invoked. It also says:

— A Schematron let expression is treated as an XSLT2 variable. The XSLT2 $ delimiter signifies the use of a variables [sic] in an [sic] context expression, assertion test, name query, value-of query or let expression. The character not followed by the name of an in-scope variable shall be treated as a literal character.

It doesn't look very thoroughly proofed, but this suggests that sch:let and xsl:variable are equivalent, and therefore that you should be able to use xsl:variable.

If the standard permits it, and the most common implementations support it, then TEI should support it too.
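
If that reading of Annex H is right, the two declarations below would bind the same value under the xslt2 query binding. This is only a sketch: the variable name `target` and the expression are hypothetical, and whether a given processor accepts the second form depends on the implementation.

```xml
<!-- Schematron variable: the expression goes in @value -->
<sch:let name="target" value="substring-after(@target, '#')"/>

<!-- XSLT variable: the same expression goes in @select -->
<xsl:variable name="target" select="substring-after(@target, '#')"/>
```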

@rvdb
Contributor

rvdb commented Mar 13, 2017

Indeed, the implementation seems to allow far more XSLT constructs than are specified by the standard, and Octavian's answer on the Oxygen support forum seems to confirm this. I've created a gist with some XSLT elements that are not mentioned in the standard, and this just seems to work.

So, the situation is indeed a bit hazy: the standard only lists a couple of XSLT elements (while not prohibiting others, though), while the implementation is much more permissive.

UPDATE: In his answer to my question on the Skeleton implementation tracker, Rick Jelliffe has confirmed that the implementation allows more XSLT elements than are defined by the Schematron standard, and has given assurance that this can be considered a stable feature.

@rvdb
Contributor

rvdb commented Mar 13, 2017

Also, I think it would make sense to allow nodes from the Schematron Quick Fix namespace (http://www.schematron-quickfix.com/validator/process) inside <constraint> as well.

@sydb
Member

sydb commented Mar 31, 2017

I have not read most of this ticket, but rather came across it while trying to do due diligence before creating a new one for a bug that Martin & I just discovered. In constraint.xml, the content model for <constraint> is declared as

    <alternate minOccurs="0" maxOccurs="unbounded">
      <textNode/>
      <anyElement require="http://purl.oclc.org/dsdl/schematron"/>
    </alternate>

which is, frankly, wrong. The content of <constraint> should just be <anyElement>. A user is free to express his or her constraint in any language she chooses, not just Schematron. (If the parent <constraintSpec> has a @scheme attribute of "schematron" (or, deprecated, "isoschematron"), then the content of <constraint> should be limited to Schematron, but that means just <anyElement> without the alternate <textNode>, as Schematron allows elements and attributes from any namespace.)

P.S. My finding this bug springs from the need to have elements in the Schematron Quick Fix namespace in a <constraint>. Even if TEI-C, for some reason, wants to say “no, you can’t have any ol’ namespace in there”, we have to allow the SQF and XSLT namespaces, no?

@rvdb
Contributor

rvdb commented Feb 14, 2018

This is probably related, though I don't know if it deserves its own issue (and what to report, then): during the TEIP5-dev build process on the TEI Jenkins server, an error was thrown during the validateodd step, because my ODD file contained a <sch:value-of/> element inside a <sqf:title> (see 650c56e#diff-0e83e27d862db009cc8246696c6b6b40R3248).

When building a RelaxNG schema locally, everything works without problems, and (apart from the many "constraint descendants must be in the namespaces 'http://purl.oclc.org/dsdl/schematron', http://www.tei-c.org/ns/1.0'" warnings reported in this issue) no errors are flagged for the occurrence of <sch:value-of/> inside <sqf:title>.

Yet, during the build process, some validation step is choking on this and causing the build to fail, with the following message:

validateodd:
    [echo] Validate tei_jtei.odd as ODD 
    [runjing] /var/lib/jenkins/jobs/TEIP5-dev/workspace/P5/Exemplars/tei_jtei.odd:3248:138: error: element "sch:value-of" not allowed here; expected the element end-tag or text

According to the SQF documentation at http://schematron-quickfix.github.io/sqf/publishing-snapshots/April2015Draft/spec/SQFSpec.html#d0e227, <sch:value-of/> is explicitly listed as a permitted child of <sqf:title>, so this error is definitely spurious. Yet I don't have a clue which validation file is being used, or what exactly should be corrected. But I have a feeling it's related to this issue.
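
To make the construct concrete, here is a sketch of a constraintSpec with an <sch:value-of/> inside <sqf:title>. It is not the actual rule from tei_jtei.odd: the @ident, the rule context, the test and the fix id `addHash` are all hypothetical, and the `tei` prefix is assumed to be bound by an <sch:ns> elsewhere in the ODD.

```xml
<constraintSpec xmlns="http://www.tei-c.org/ns/1.0"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                xmlns:sqf="http://www.schematron-quickfix.com/validator/process"
                ident="targetHash" scheme="schematron">
  <constraint>
    <!-- Hypothetical rule: bibliographic refs must point to a local ID -->
    <sch:rule context="tei:ref[@type='bibl']">
      <sch:assert test="starts-with(@target, '#')" sqf:fix="addHash">
        A bibliographic reference should point to a local ID.
      </sch:assert>
      <sqf:fix id="addHash">
        <sqf:description>
          <!-- sch:value-of inside sqf:title: the construct that the
               validateodd step rejected -->
          <sqf:title>Prefix "<sch:value-of select="@target"/>" with "#"</sqf:title>
        </sqf:description>
        <sqf:replace match="@target" node-type="attribute" target="target"
                     select="concat('#', .)"/>
      </sqf:fix>
    </sch:rule>
  </constraint>
</constraintSpec>
```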

rvdb pushed a commit to rvdb/TEI that referenced this issue Feb 14, 2018
…he TEI build seems to choke on this (see TEIC#1607 (comment))

-changed @scheme to "schematron" ("isoschematron" seems to have been deprecated)
@martindholmes
Contributor

I claim that if we add this line:

<ref name="anySchematron"/>

at line 133 of p5odds.odd, we should fix the problem. Any reason not to do this?

@sydb
Member

sydb commented Feb 15, 2018

Yes, I think that would solve this problem, and might be the appropriate thing to do for p5odds.odd. But overall I wonder …

a) Should we be defining the content model of Schematron elements, or importing the Schematron schema? Interesting question: is there a Schematron schema that includes SQF? A quick search says yes, there is a schemas/ directory on the SQF GitHub site. However, the schemas therein are XSD, and the comment in the XSD says it was converted with trang. Where’s the source, and is it RELAX NG?

b) Surely not any Schematron element is allowed inside any SQF element, right? But I am not sure how to read the error message I got from oXygen; it may imply exactly that:

One of '{"http://purl.oclc.org/dsdl/schematron":value-of, "http://purl.oclc.org/dsdl/schematron":name, WC[""], WC[##other:"http://purl.oclc.org/dsdl/schematron",""]}' is expected.

I understand the first two expected thingies, but can someone explain that WC construct?

Also, the content of element <content> should not include <ref name="anySchematron"/>, I don’t think, as we do not use Schematron 1.x (only ISO Schematron) in the Guidelines nowadays. I will consider that bit on me.

@sydb
Member

sydb commented Feb 15, 2018

OK. I’ve removed the extraneous Schematron 1.x bits from my local copy of p5odds.odd, but should I hold off on committing it until this problem is dealt with?

(BTW, I have not just systematically nuked all Schematron NSs and looked for errors, because I am guessing that some occurrences are legit, and there are lots to look through: over 330 occurrences in over 125 files.)

@martindholmes
Contributor

@sydb If you agree with my fix, that's two voices, so how about rolling it in with your changes and seeing what happens?

sydb added a commit that referenced this issue Feb 15, 2018
1) Explicitly permit <sqf:*> elements inside <sch:*> elements
2) Remove vestigial references to Schematron 1.x
@sydb
Member

sydb commented Feb 15, 2018

Done at 1925b03.

@rvdb
Contributor

rvdb commented Feb 19, 2018

Great, so far so good for allowing SQF descendants of <constraint>; could http://www.w3.org/1999/XSL/Transform be added as an allowed namespace for <constraint> descendants as well, or does this need more discussion?

@martindholmes
Contributor

I see no objection myself; it would be good to have example use-cases though.

@rvdb
Contributor

rvdb commented Feb 19, 2018

One use-case could be the use of <xsl:key> in order to make more performant checks for e.g. @xml:id values. For example: https://github.com/TEIC/TEI/blob/dev/P5/Exemplars/tei_jtei.odd#L2149-L2152 could be rewritten as:

<!-- of course, <xsl:key> should be declared with global Schematron variables for efficiency -->
<xsl:key name="idrefs" match="@target[starts-with(., '#')]" use="for $i in tokenize(., '\s+') return substring-after($i, '#')"/>
<sch:assert test="key('idrefs', @xml:id)/parent::*[self::tei:ref][@type='bibl']">
  This bibliographic entry is an orphan: no ref[@type="bibl"] references to it occur in the text.
</sch:assert>

The <xsl:key> element is copied without problems by the ODD2RelaxNG transformation, and the resulting RelaxNG schema works. In order to make this ODD pass validation, however, I guess:

I could just rework the jTEI ODD with some XSL elements (where it makes sense, of course), and see how it affects the build/validation process?

@martindholmes
Contributor

That's a perfect example.

@rvdb
Contributor

rvdb commented Feb 19, 2018

Ok, done in #1742. Shall I leave additional changes in the ODD validation machinery to the experts?

@martindholmes
Contributor

Someone from Council should OK the addition of the XSLT namespace elements, in case there's some issue I haven't thought of. We should hold off on committing the ODD changes till that's done, otherwise the ODD will be invalid.

@ebeshero
Member Author

ebeshero commented Feb 26, 2018

Council suggests we change the content of <constraint> thus (to remove @require from <anyElement>):

    <alternate minOccurs="0" maxOccurs="unbounded">
      <textNode/>
      <anyElement/>
    </alternate>

Let's try this as an experiment and see if XSL is now permitted. If it works, let's go ahead and close this ticket.
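
As a sketch of what the relaxed content model is meant to permit, the <xsl:key> example given earlier in this thread could sit in an ODD roughly as follows. The @ident and the rule context are hypothetical (they are not taken from tei_jtei.odd), and the `tei` prefix is assumed to be bound by an <sch:ns> elsewhere in the ODD.

```xml
<constraintSpec xmlns="http://www.tei-c.org/ns/1.0"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                ident="orphanBibl" scheme="schematron">
  <constraint>
    <!-- An element from the XSLT namespace, allowed once <anyElement>
         no longer carries @require -->
    <xsl:key name="idrefs" match="@target[starts-with(., '#')]"
             use="for $i in tokenize(., '\s+') return substring-after($i, '#')"/>
    <!-- Hypothetical context: any identified bibliographic entry -->
    <sch:rule context="tei:listBibl//tei:bibl[@xml:id]">
      <sch:assert test="key('idrefs', @xml:id)/parent::*[self::tei:ref][@type='bibl']">
        This bibliographic entry is an orphan: no ref[@type="bibl"]
        references to it occur in the text.
      </sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>
```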

sydb added a commit that referenced this issue Apr 20, 2018
Remove restriction that elements in <constraint> have to be in Schematron or SQF namespace.
@sydb
Member

sydb commented Apr 20, 2018

Committed at 6c508bd.

I did get one weird message on my local build:

isoschematron:
     [echo] XSLT generate ISO schematron tei_jtei.isosch from compiled ODD 
     [xslt] Processing /Users/syd/Documents/TEI-GitHub/P5/Exemplars/tei_jtei.odd.processed to /Users/syd/Documents/TEI-GitHub/P5/Exemplars/tei_jtei.isosch
     [xslt] Loading stylesheet /Users/syd/Documents/Stylesheets/odds/extract-isosch.xsl
     [xslt] Processing /Users/syd/Documents/TEI-GitHub/P5/Exemplars/tei_jtei.isosch to /Users/syd/Documents/TEI-GitHub/P5/Exemplars/tei_jtei.xsl
     [xslt] Loading stylesheet /Users/syd/Documents/TEI-GitHub/P5/Utilities/iso_schematron_message_xslt2.xsl
     [xslt] Error: unrecognized element in ISO Schematron namespace: check spelling and capitalizationvalue-of
     [xslt] Error: unrecognized element in ISO Schematron namespace: check spelling and capitalizationvalue-of

but was not able to figure out what it’s talking about.

@rvdb
Contributor

rvdb commented Apr 25, 2018

@sydb Great, I can confirm that this fixes validation of the jTEI ODD with XSLT elements inside <constraintSpec>. Ok to merge #1742, then?

@rvdb
Contributor

rvdb commented Aug 21, 2018

I've just realized my PR (#1742) is still pending, awaiting resolution of this issue. Did @sydb's commit (6c508bd) solve this issue sufficiently (I think it does, by successfully allowing elements from other namespaces inside <constraint> in ODD files), or are there any problems left?

@sydb
Member

sydb commented Aug 21, 2018

Sorry! Just merged. I did not work through the logic completely, but at first glance it seems perfectly fine. @rvdb may want to tweak some more per my comment on the pull request.

Does this mean we can close this ticket?

@martindholmes
Contributor

The presence of xsl:key in tei_jtei.odd is still causing the P5-dev build to fail. More work needed here I think.

@sydb
Member

sydb commented Aug 31, 2018

Guess my thoughts, above, about doing the same in p5odds never made it in; done at 84e541c.
