Commit

Address #2371 with a Schematron constraint
that flags as an error any two or more attDef elements that share an ancestor attList and
have the same ident= attribute (regardless of the mode= attribute value), unless they are
all children of a single attList that has an org= attribute value of "choice".
sydb committed Mar 14, 2024
1 parent 42da9bf commit 03c774a
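
The rule described in the commit message can be sketched in Python. This is only an illustration of the logic, not the Schematron that was committed: the function names collect_idents, duplicate_idents, and check are hypothetical, and the sketch assumes the attList is in the TEI namespace and is parsed with the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = "{http://www.tei-c.org/ns/1.0}"  # TEI namespace in ElementTree notation


def collect_idents(att_list):
    """Return the idents of all attDef descendants of att_list.

    Inside an attList with org="choice", a repeated ident among sibling
    attDefs is recorded only once, since those attDefs are alternatives.
    """
    idents = []
    seen_here = set()  # raw @ident values of attDef children visited so far
    is_choice = att_list.get("org") == "choice"
    for child in att_list:
        if child.tag == TEI + "attDef":
            raw = child.get("ident", "")
            if not (is_choice and raw in seen_here):
                idents.append(" ".join(raw.split()))  # like normalize-space()
            seen_here.add(raw)
        elif child.tag == TEI + "attList":
            idents.extend(collect_idents(child))  # nested lists count too
    return idents


def duplicate_idents(top_att_list):
    """Return the sorted idents defined more than once under a top-level attList."""
    counts = Counter(collect_idents(top_att_list))
    return sorted(name for name, n in counts.items() if n > 1 and name != "")


def check(fragment):
    """Hypothetical helper: wrap a fragment in a top-level TEI attList and check it."""
    root = ET.fromstring(
        f'<attList xmlns="http://www.tei-c.org/ns/1.0">{fragment}</attList>'
    )
    return duplicate_idents(root)


# Two sibling definitions of "bad" outside any choice list are flagged.
print(check('<attDef ident="bad"/><attDef ident="good"/><attDef ident="bad"/>'))
# → ['bad']
```

A repeated ident is tolerated only among siblings of one org="choice" list. The same ident appearing in two separate choice lists, or at different nesting depths, is still reported.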
Showing 4 changed files with 323 additions and 57 deletions.
102 changes: 74 additions & 28 deletions P5/Source/Specs/attList.xml
@@ -1,32 +1,79 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- © TEI Consortium. Dual-licensed under CC-by and BSD2 licenses; see the file COPYING.txt for details. -->
<?xml-model href="https://jenkins.tei-c.org/job/TEIP5-dev/lastSuccessfulBuild/artifact/P5/release/xml/tei/odd/p5.nvdl" type="application/xml" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" module="tagdocs" ident="attList">
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" module="tagdocs" ident="attList" xmlns:sch="http://purl.oclc.org/dsdl/schematron">
<gloss xml:lang="en" versionDate="2020-12-20">attribute list</gloss>
<gloss versionDate="2007-06-12" xml:lang="fr">liste d'attributs</gloss>
<desc versionDate="2005-01-14" xml:lang="en">contains documentation for all the attributes associated with this element, as a series of
<gi>attDef</gi> elements.</desc>
<desc versionDate="2005-01-14" xml:lang="en">contains documentation for all the attributes associated with this element, as a series of <gi>attDef</gi> elements.</desc>
<desc versionDate="2007-12-20" xml:lang="ko">일련의 <gi>attDef</gi> 요소로서, 이 요소와 연관된 모든 속성에 대한 기록을 포함한다.</desc>
<desc versionDate="2007-05-02" xml:lang="zh-TW">包含所有和此元素相關的屬性記錄,使用一連串的元素<gi>attDef</gi>。</desc>
<desc versionDate="2008-04-05" xml:lang="ja">当該要素に関する全ての属性に関する文書を、一連の要素<gi>attDef</gi> で示す。</desc>
<desc versionDate="2007-06-12" xml:lang="fr">contient la documentation pour tous les attributs
associés à cet élément comme une série d'éléments <gi>attDef</gi>.</desc>
<desc versionDate="2007-05-04" xml:lang="es">contiene documentación relativa a todos los atributos
asociados con este elemento bajo forma de series de elementos attDef.</desc>
<desc versionDate="2007-01-21" xml:lang="it">contiene la documentazione relativa agli attributi
associati all'elemento in questione sotto forma di una serie di elementi attDef</desc>
<desc versionDate="2007-06-12" xml:lang="fr">contient la documentation pour tous les attributs associés à cet élément comme une série d'éléments <gi>attDef</gi>.</desc>
<desc versionDate="2007-05-04" xml:lang="es">contiene documentación relativa a todos los atributos asociados con este elemento bajo forma de series de elementos attDef.</desc>
<desc versionDate="2007-01-21" xml:lang="it">contiene la documentazione relativa agli attributi associati all'elemento in questione sotto forma di una serie di elementi attDef</desc>
<classes>
<memberOf key="att.global"/>
</classes>
<content>

<alternate minOccurs="1" maxOccurs="unbounded">
<elementRef key="attRef"/>
<elementRef key="attDef"/>
<elementRef key="attList"/>
</alternate>

<alternate minOccurs="1" maxOccurs="unbounded">
<elementRef key="attRef"/>
<elementRef key="attDef"/>
<elementRef key="attList"/>
</alternate>
</content>
<constraintSpec scheme="schematron" ident="no_duplicate_attrs">
<desc>Because it is illegal in XML to have two attributes with the
same name on the same element instance, it is illegal in TEI to
have two <gi>attDef</gi> elements with the same value of
<att>ident</att> within a single <gi>attList</gi>, unless the
parent <gi>attList</gi> has an <att>org</att> of
<val>choice</val>. This applies regardless of the <att>mode</att> of
each <gi>attDef</gi> with a matching <att>ident</att>.</desc>
<constraint>
<sch:rule context="tei:attList[ not( ancestor::tei:attList ) ]">
<!-- generate a sequence of my <attDef> descendants -->
<sch:let name="defs" value="descendant::tei:attDef"/>
<!--
get a sequence of @idents of those <attDef>s, except
ignore those that are in a parent <attList> that is an
alternation, and for which we have already recorded this
@ident. Thus if we see
<attList org="choice">
<attDef ident="klaatu"/>
<attDef ident="bodsworth"/>
<attDef ident="rugglesby"/>
<attDef ident="klaatu"/>
</attList>
The sequence should be ('klaatu','bodsworth','rugglesby','').
-->
<sch:let name="idents"
value="for $ad in $defs return
if (
$ad
[
parent::tei:attList[ @org eq 'choice']
and
preceding-sibling::tei:attDef[ @ident eq $ad/@ident ]
]
)
then ''
else normalize-space( $ad/@ident )
"/>
<!-- get a sequence of any that occur 2+ times: -->
<sch:let name="dups" value="for $n in $idents return ( $idents[ . eq $n ][2] )"/>
<!-- remove any duplicates from the list of duplicates (-: -->
<sch:let name="distinct_dups" value="distinct-values( $dups )"/>
<!--
if there are any values in the list of distinct duplicates (other than the empty string),
warn user about them:
-->
<sch:assert test="count( $distinct_dups[ . ne ''] ) eq 0">
Within the attribute list defined in <sch:value-of select="ancestor::*[@ident][1]/@ident"/>,
the following attributes have been defined multiple times: <sch:value-of select="$distinct_dups"/>.
</sch:assert>
</sch:rule>
</constraint>
</constraintSpec>
<attList>
<attDef ident="org">
<gloss versionDate="2007-07-04" xml:lang="en">organization</gloss>
@@ -36,15 +83,11 @@
<gloss versionDate="2007-11-06" xml:lang="it">organizzazione</gloss>
<desc versionDate="2023-02-07" xml:lang="en">specifies whether only one (<val>choice</val>) or all (<val>group</val>) of the attributes in the list are available</desc>
<desc versionDate="2007-12-20" xml:lang="ko">목록의 모든 속성이 이용가능하거나(org="group") 그 중 하나만 이용가능한지를 명시한다.</desc>
<desc versionDate="2007-05-02" xml:lang="zh-TW">標明是否列表中的全部屬性皆可使用 (org="group") 、或是僅可使用其中一個
(org="choice")。</desc>
<desc versionDate="2008-04-05" xml:lang="ja">リスト中の属性が全て使用できるか(org="group")、またはその1つだけ
が使用できるか(org="choice")を示す。</desc>
<desc versionDate="2007-06-12" xml:lang="fr">précise si les attributs dans la liste sont tous
disponibles (org="group") ou seulement l'un d'entre eux (org="choice").</desc>
<desc versionDate="2007-05-02" xml:lang="zh-TW">標明是否列表中的全部屬性皆可使用 (org="group") 、或是僅可使用其中一個 (org="choice")。</desc>
<desc versionDate="2008-04-05" xml:lang="ja">リスト中の属性が全て使用できるか(org="group")、またはその1つだけ が使用できるか(org="choice")を示す。</desc>
<desc versionDate="2007-06-12" xml:lang="fr">précise si les attributs dans la liste sont tous disponibles (org="group") ou seulement l'un d'entre eux (org="choice").</desc>
<desc versionDate="2023-03-20" xml:lang="es">indica si solo uno (<val>choice</val>) o todos (<val>group</val>) los atributos de la lista están disponibles</desc>
<desc versionDate="2023-03-21" xml:lang="it">indica se gli attributi contenuti nella lista sono
tutti disponibili (<val>group</val>) o se ne è disponibile solo uno (<val>choice</val>)</desc>
<desc versionDate="2023-03-21" xml:lang="it">indica se gli attributi contenuti nella lista sono tutti disponibili (<val>group</val>) o se ne è disponibile solo uno (<val>choice</val>)</desc>
<datatype><dataRef key="teidata.enumerated"/></datatype>
<defaultVal>group</defaultVal>
<valList type="closed">
@@ -109,12 +152,15 @@
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<attList org="choice">
<attDef ident="active">
<desc versionDate="2005-07-24" xml:lang="en">identifies the <soCalled>active</soCalled> participants in a non-mutual relationship, or all the participants in a mutual
one.</desc>
<desc versionDate="2005-07-24" xml:lang="en">identifies the
<soCalled>active</soCalled> participants in a non-mutual
relationship, or all the participants in a mutual one.</desc>
<datatype maxOccurs="unbounded"><dataRef key="teidata.pointer"/></datatype>
</attDef>
<attDef ident="mutual" usage="opt">
<desc versionDate="2005-07-24" xml:lang="en">supplies a list of participants amongst all of whom the relationship holds equally.</desc>
<desc versionDate="2005-07-24" xml:lang="en">supplies a list
of participants amongst all of whom the relationship holds
equally.</desc>
<datatype maxOccurs="unbounded"><dataRef key="teidata.pointer"/></datatype>
</attDef>
</attList>
@@ -124,4 +170,4 @@
<ptr target="#TDTAG"/>
<ptr target="#TDCLA"/>
</listRef>
</elementSpec>
</elementSpec>
200 changes: 183 additions & 17 deletions P5/Test/detest.odd
@@ -8,6 +8,8 @@
<titleStmt>
<title>Testing errors</title>
<author>Lou Burnard</author>
<author>Sebastian Rahtz</author>
<author xml:id="sbauman.emt">Syd Bauman</author>
</titleStmt>
<publicationStmt>
<p>Published along with TEI P5 as part of its build process test suite</p>
@@ -17,6 +19,9 @@
</sourceDesc>
</fileDesc>
<revisionDesc>
<change when="2024-03-13" who="#sbauman.emt">

</change>
<change when="2023-08-06" who="#sbauman.emt">
Add the <name>add_missing_scheme</name> and
<name>replace_missing_scheme</name> tests for
@@ -77,15 +82,13 @@
<elementSpec ident="p" mode="change">
<constraintSpec mode="add" ident="c1" scheme="schematron">
<constraint>
<sch:report test="tei:list">
lists inside paragraphs not supported </sch:report>
<sch:report test="tei:list"> lists inside paragraphs not supported </sch:report>
</constraint>
</constraintSpec>
<constraintSpec mode="add" ident="c2" scheme="schematron">
<constraint>
<sch:rule context="tei:p|tei:q">
<sch:report test="contains(@rend,' ')">
multi-valued rend is not supported </sch:report>
<sch:report test="contains(@rend,' ')"> multi-valued rend is not supported </sch:report>
</sch:rule>
</constraint>
</constraintSpec>
@@ -161,18 +164,18 @@
<elementSpec ident="div" mode="change">
<constraintSpec mode="add" ident="canondiv" scheme="schematron">
<constraint>
<sch:report test="@type='canon' and parent::tei:div/@type='canon'">
divs of type 'canon' may not be nested
</sch:report>
<sch:report test="@type='canon' and parent::tei:div/@type='register'">
divs of type 'canon' may not be nested within 'register'
</sch:report>
<sch:report test="@type='canon' and count (tei:div[@type='canonText']) &gt;1">
divs of type 'canon' may contain only one 'canonText'
</sch:report>
<sch:report test="@type='canonText' and not(parent::tei:div[@type='canon'])">
divs of type 'canonText' can only occur inside 'canon'
</sch:report>
<sch:report test="@type='canon' and parent::tei:div/@type='canon'">
divs of type 'canon' may not be nested
</sch:report>
<sch:report test="@type='canon' and parent::tei:div/@type='register'">
divs of type 'canon' may not be nested within 'register'
</sch:report>
<sch:report test="@type='canon' and count( tei:div[ @type eq 'canonText'] ) gt 1">
divs of type 'canon' may contain only one 'canonText'
</sch:report>
<sch:report test="@type='canonText' and not( parent::tei:div[ @type eq 'canon'])">
divs of type 'canonText' can only occur inside 'canon'
</sch:report>
</constraint>
</constraintSpec>
</elementSpec>
@@ -357,7 +360,10 @@
</attList>
</elementSpec>
<elementSpec ident="blort2" mode="add">
<desc xml:lang="en" versionDate="2015-03-27">another completely spurious element made up for testing purposes only</desc>
<desc xml:lang="en" versionDate="2024-03-13">
Another completely spurious element made up for testing purposes only.
For information on what is being tested, see <gi>remarks</gi>.
</desc>
<classes>
<memberOf key="model.pPart.data"/>
</classes>
@@ -517,6 +523,166 @@
</listRef>
</elementSpec>

<elementSpec ident="no_duplicate_attrs_1_invalid" mode="add">
<desc xml:lang="en" versionDate="2024-03-13">
Another completely spurious element made up for testing purposes only.
For information on what is being tested, see <gi>remarks</gi>.
</desc>
<classes>
<memberOf key="att.global"/>
</classes>
<content><empty/></content>
<attList>
<attDef ident="bad"/>
<attDef ident="good"/>
<attDef ident="better"/>
<attDef ident="best"/>
<attDef ident="bad"/>
</attList>
<remarks xml:lang="en" versionDate="2024-03-13">
<p>Here we are testing the Schematron that tests for duplicate
attribute definitions, i.e. <name>no_duplicate_attrs</name>.</p>
<p>This <gi>attList</gi> should be invalid because there
are two <att>bad</att> attributes defined as siblings.</p>
<p>Note that because this element is not a member of any
model class nor in any other element’s content model, it
will probably not show up in an output schema, which is
a good thing, because defining an attribute twice like
that is an error in at least RELAX NG and DTD.</p>
</remarks>
</elementSpec>

<elementSpec ident="no_duplicate_attrs_2_valid" mode="add">
<desc xml:lang="en" versionDate="2024-03-13">
Another completely spurious element made up for testing purposes only.
For information on what is being tested, see <gi>remarks</gi>.
</desc>
<classes>
<memberOf key="att.global"/>
</classes>
<content><empty/></content>
<attList>
<attDef ident="have"/>
<attList org="choice">
<attDef ident="fun"/>
<attDef ident="fun"/>
<attDef ident="fun"/>
</attList>
<attDef ident="til"/>
<attDef ident="her"/>
<attDef ident="daddy"/>
</attList>
<remarks xml:lang="en" versionDate="2024-03-13">
<p>Here we are testing the Schematron that tests for duplicate
attribute definitions, i.e. <name>no_duplicate_attrs</name>.</p>
<p>This <gi>attList</gi> should be valid because,
although there are three <att>fun</att> attributes
defined as siblings, they are in alternation with one
another.</p>
<p>Note that because this element is not a member of any
model class nor in any other element’s content model, it
will probably not show up in an output schema.</p>
</remarks>
</elementSpec>

<elementSpec ident="no_duplicate_attrs_3_invalid" mode="add">
<desc xml:lang="en" versionDate="2024-03-13">
Another completely spurious element made up for testing purposes only.
For information on what is being tested, see <gi>remarks</gi>.
</desc>
<classes>
<memberOf key="att.global"/>
</classes>
<content><empty/></content>
<attList>
<attList org="choice">
<attDef ident="Your"/>
<attDef ident="My"/>
</attList>
<attDef ident="spirit"/>
<attDef ident="and"/>
<attList org="choice">
<attDef ident="my"/>
<attDef ident="your"/>
</attList>
<attDef ident="voice"/>
<attDef ident="in"/>
<attDef ident="one"/>
<attDef ident="combined"/>
<attDef ident="The"/>
<attDef ident="Phantom"/>
<attDef ident="of"/>
<attDef ident="the"/>
<attDef ident="Opera"/>
<attDef ident="is"/>
<attDef ident="there"/>
<attDef ident="Inside"/>
<attList org="choice">
<attDef ident="my"/>
<attDef ident="your"/>
</attList>
<attDef ident="mind"/>
</attList>
<remarks xml:lang="en" versionDate="2024-03-13">
<p>Here we are testing the Schematron that tests for duplicate
attribute definitions, i.e. <name>no_duplicate_attrs</name>.</p>
<p>This <gi>attList</gi> should be invalid because there
are both two <att>my</att> and two <att>your</att>
attributes defined, and although each is in an alternate
group, the two <att>my</att> attributes are not in
alternation with one another (nor are the two
<att>your</att>s).</p>
<p>Note that the <att>Your</att> vs <att>your</att> and
<att>The</att> vs <att>the</att> are not problems at all
— XML is completely case sensitive, those are different
attributes.</p>
<p>Note that because this element is not a member of any
model class nor in any other element’s content model, it
will probably not show up in an output schema, which is
a good thing, because defining an attribute twice like
that is an error in at least RELAX NG and DTD.</p>
</remarks>
</elementSpec>

<elementSpec ident="no_duplicate_attrs_4_invalid" mode="add">
<desc xml:lang="en" versionDate="2024-03-13">
Another completely spurious element made up for testing purposes only.
For information on what is being tested, see <gi>remarks</gi>.
</desc>
<classes>
<memberOf key="att.global"/>
</classes>
<content><empty/></content>
<attList>
<attDef ident="all"/>
<attDef ident="the"/>
<attDef ident="leaves"/>
<attDef ident="are"/>
<attDef ident="brown"/>
<attList>
<attDef ident="and"/>
<attList>
<attDef ident="the"/>
<attDef ident="sky"/>
</attList>
<attDef ident="is"/>
</attList>
<attDef ident="gray"/>
</attList>
<remarks xml:lang="en" versionDate="2024-03-13">
<p>Here we are testing the Schematron that tests for duplicate
attribute definitions, i.e. <name>no_duplicate_attrs</name>.</p>
<p>This <gi>attList</gi> should be invalid due to the
two definitions of the <att>the</att> attribute, which
are not in alternation.</p>
<p>Note that because this element is not a member of any
model class nor in any other element’s content model, it
will probably not show up in an output schema, which is
a good thing, because defining an attribute twice like
that is an error in at least RELAX NG and DTD.</p>
</remarks>
</elementSpec>

</schemaSpec>
</div>
</body>