sequence of top level elements within msDesc #2214

Closed
schassan opened this issue Dec 1, 2021 · 30 comments

@schassan

schassan commented Dec 1, 2021

The general structure of a manuscript description following the German cataloguing rules puts the description of the contents at the end, right after physDesc, history, and literature and before the msParts. The schema declaration (snippet) would look like this:

(
msIdentifier,
model.headLike*,
(
model.pLike+
| (
physDesc?,
history?,
additional?,
msContents?,
( msPart* | msFrag* )
)
)
)

A brief discussion on TEI-MS-SIG has shown that a different order of elements is also preferred in regions beyond the German-speaking countries.

In the past we dealt with this problem by just changing the sequence of contents in the transformations while preparing the output of the descriptions, on the web and/or for print. But in the future we will have to change the file structure itself as we are about to implement an editor which will save the file "as shown" (WYSIWYG, the hard way). :(

I guess, in order to allow for the current structure and the German one, we need to loosen the rules for the sequence or define the latter as another option?

It would be necessary to keep the limit of occurrences of physDesc, history, additional, and msContents at one, though.

@peterstadler
Member

I'm all in favor of loosening this sequential constraint 👍

@sydb
Member

sydb commented Dec 21, 2021

An Analysis Expressed in (mostly) Formal Syntax

Here is my take on the problem @schassan is expressing and the various solutions. If anyone knows how to get GitHub to syntax-color the code block of RELAX NG Compact Syntax below, please let me know (or just edit this post). [JC: there isn't one for rnc highlighting, but you could do something else like js. ;-) ]

start = element msDescs_for_2214 { msDesc_current, msDesc_schassan, msDesc_2214_alt, msDesc_2214_fac, msDesc_2214_dtd, msDesc_2214_mso }

# ---------------------------------------------

# Current content model in P5
msDesc_current =
  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       ( msContents?, physDesc?, history?, additional?, (msPart* | msFrag*) )
    )
  }

# What @schassan wants to use
msDesc_schassan =
  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       ( physDesc?, history?, additional?, msContents?, (msPart* | msFrag*) )
    )
  }

# Simple alternation of the two
msDesc_2214_alt =
  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       ( msContents?, physDesc?, history?, additional?, (msPart* | msFrag*) )
       |
       ( physDesc?, history?, additional?, msContents?, (msPart* | msFrag*) )
    )
  }

# Alternation of the two using intermediate patterns to make it easier
# to read & understand
msDesc_2214_fac =
  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       ( msContents?, msDesc_info, msDesc_rest )
       |
       ( msDesc_info, msContents?, msDesc_rest )
    )
  }
msDesc_info = ( physDesc?, history?, additional? )
msDesc_rest = ( msPart* | msFrag* )

# Allowing the relevant children of <msDesc> to occur in any order.
# The problem with this, of course, is that it is hard to express it
# in the DTD language, and `trang` will not convert it to a DTD
# automatically.
msDesc_2214_amp =
  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       (
          ( physDesc? & history? & additional? & msContents? ),
          ( msPart* | msFrag* )
       )
    )
  }

# Same as above re-expressed so it could be auto-converted to DTD
# easily. ADDENDUM  The original version posted here was incorrect;
# this post has been edited. Thanks to MSMcQ for helping me think through
# the RelaxNG content model that is deterministic, and thus will work
# in DTD-land.
msDesc_2214_dtd =
   element msDesc {
      msIdentifier,
      model.headLike*,
      (
         model.pLike+
         |
         (
           ( 
             ( additional,
               (
                   ( history,    ( ( msContents, physDesc? )   | ( physDesc, msContents?   ) )? )
                 | ( msContents, ( ( history, physDesc? )      | ( physDesc, history?      ) )? )
                 | ( physDesc,   ( ( msContents, history? )    | ( history,  msContents?   ) )? )
               )?
             )
             |
             ( history,
               (
                   ( additional, ( ( msContents, physDesc? )   | ( physDesc, msContents?   ) )? )
                 | ( msContents, ( ( additional, physDesc? )   | ( physDesc, additional?   ) )? )
                 | ( physDesc,   ( ( msContents, additional? ) | ( additional, msContents? ) )? )
               )?
             )
             |
             ( msContents,
               (
                   ( history,    ( ( additional, physDesc? )   | ( physDesc, additional?   ) )? )
                 | ( additional, ( ( history, physDesc? )      | ( physDesc, history?      ) )? )
                 | ( physDesc,   ( ( additional, history? )    | ( history, additional?    ) )? )
               )?
             )
             |
             ( physDesc,
               (
                   ( history,    ( ( msContents, additional? ) | ( additional, msContents? ) )? )
                 | ( msContents, ( ( history, additional? )    | ( additional, history?    ) )? )
                 | ( additional, ( ( msContents, history? )    | ( history, msContents?    ) )? )
               )?
             )
           ),
           ( msPart* | msFrag* )
         )
      )
   }

# A rough demonstration of how this would be done using references to
# the _sequenceOptional versions of model classes. The basic idea is
# to create 2 model classes, one for each desired sequence, and refer
# to that model class with the _sequenceOptional derived pattern,
# rather than directly. To my knowledge, we *never* refer to any of
# the derived variations of a model class in the Guidelines.
msDesc_2214_mso =
  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       ( model.msStuff1_sequenceOptional, (msPart* | msFrag*) )
       |
       ( model.msStuff2_sequenceOptional, (msPart* | msFrag*) )
    )
  }

model.msStuff1 = msContents, physDesc, history, additional
model.msStuff2 = physDesc, history, additional, msContents
# Given the “model classes” defined above, the TEI ODD processor
# automatically generates the patterns defined below and referred to
# in the content model of the msDesc_2214_mso pattern.
model.msStuff1_sequenceOptional = msContents?, physDesc?, history?, additional?
model.msStuff2_sequenceOptional = physDesc?, history?, additional?, msContents?

# ---------------------------------------------
# Dummy pattern declarations so this schema can actually be used (and
# bonus: you won’t get red lines in oXygen :-)
msIdentifier = element msIdentifier { text }
model.headLike = element head { text }
model.pLike = element p { text }
physDesc = element physDesc { text }
history = element history { text }
additional = element additional { text }
msContents = element msContents { text }
msPart = element msPart { text }
msFrag = element msFrag { text }
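
For anyone who wants to experiment outside a schema editor, the first two patterns above (msDesc_current and msDesc_schassan) can be approximated with regular expressions over the list of child element names. This is only a rough sketch under stated assumptions: the helper names (`encode`, `model`, `accepts`) are made up for illustration, `head` and `p` stand in for model.headLike and model.pLike just as in the dummy declarations, and a regex is no substitute for real RELAX NG validation.

```python
import re

def encode(children):
    """Turn a list of child element names into a space-terminated token string."""
    return "".join(name + " " for name in children)

def opt(name):
    """Regex for an optional element (the '?' quantifier in RNC)."""
    return f"(?:{name} )?"

def star(name):
    """Regex for a repeatable element (the '*' quantifier in RNC)."""
    return f"(?:{name} )*"

PREFIX = "msIdentifier " + star("head")           # msIdentifier, model.headLike*
PARAS = "(?:p )+"                                 # model.pLike+
TAIL = f"(?:{star('msPart')}|{star('msFrag')})"   # ( msPart* | msFrag* )

def model(*ordered):
    """Build a content model with the four optional elements in the given fixed order."""
    body = "".join(opt(n) for n in ordered) + TAIL
    return re.compile(f"{PREFIX}(?:{PARAS}|{body})")

CURRENT = model("msContents", "physDesc", "history", "additional")
SCHASSAN = model("physDesc", "history", "additional", "msContents")

def accepts(pattern, children):
    return pattern.fullmatch(encode(children)) is not None

german = ["msIdentifier", "head", "physDesc", "history",
          "additional", "msContents", "msPart"]
print(accepts(CURRENT, german))    # False: msContents comes too late
print(accepts(SCHASSAN, german))   # True
```

The msDesc_2214_alt pattern would simply accept anything that either of the two compiled patterns accepts.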

@schassan
Author

I realised that I should have contributed to the discussion before in order to have this issue dealt with in the current release. :(

I would like to stress the importance that this change has for us; thus I would ask for prioritisation in the next release.

Maybe it is a good idea to allow every order of top level elements (after msIdentifier and head) because there might be communities which need another order not foreseen in the solutions Syd proposed. Even in the scope of the Handschriftenportal project we think of the binding as an msPart which can and should be described separately. This would result in the following order:

msIdentifier
head
physDesc
msPart[@type='binding']
history
additional
msContents
msPart[@type='fragment']
msPart[@type='booklet']

@sydb
Member

sydb commented Jun 10, 2022

Some discussion of this is likely soon, @schassan, and perhaps resolution in your favor, maybe as early as the next release. But surely the Handschriftenportal project should be using a customization ODD that allows precisely the desired order of elements, whether TEI eventually comes ’round to thinking your way or not.

Note: It is even possible to get your customization to require, in the RELAX NG, that the 1st <msPart> have a @type of "binding", and the last a @type of "booklet", etc. But that is awfully hard and probably a bit fragile. Much easier to just use something like

            <sequence>
              <elementRef key="msIdentifier"/>
              <classRef key="model.headLike" minOccurs="0" maxOccurs="unbounded"/>
              <elementRef key="physDesc" minOccurs="0" maxOccurs="1"/>
              <elementRef key="msPart" minOccurs="0" maxOccurs="1"/>
              <elementRef key="history" minOccurs="0" maxOccurs="1"/>
              <elementRef key="additional" minOccurs="0" maxOccurs="1"/>
              <elementRef key="msContents" minOccurs="0" maxOccurs="1"/>
              <elementRef key="msPart" minOccurs="2" maxOccurs="2"/>
            </sequence>

as the <content> in the element specification of <msDesc> and then use Schematron inside <constraintSpec> to insist on the various @type values of <msPart>.

Note on the note: The content model above is ambiguous and thus cannot be converted to a DTD (at least not by trang or not to one that works), and although trang will convert it to XSD, the resulting schema “violate[s the] "Unique Particle Attribution". During validation against this schema, ambiguity would be created for those two particles.”. So either do not use the above without requiring one of <history>, <additional>, or <msContents>, or give up on ever using DTDs or XSDs. Which is what I would do. 😁

@sydb
Member

sydb commented Jun 10, 2022

Made green for

  • MS to ask the MSS-SIG whether loosening the content model to allow these four things in any order is important or not. (I.e., is @schassan an outlier, and allowing just the current order and his desired order would do, or should we allow any order, which might be a bit hard to do).
  • SB to work on the “hard to do” part — making sure the msDesc_2214_dtd pattern, above, can in fact be converted to DTD.

@sydb
Member

sydb commented Jun 11, 2022

Short version
No, the msDesc_2214_dtd pattern, above, is non-deterministic, and cannot be converted to a DTD. Don’t know what I was thinking. NOTE — above was edited 2022-10 and is now deterministic, so rest of this comment is here for historical reference only.

Medium version
Although trang happily converts the msDesc_2214_dtd pattern to a DTD, that DTD is non-deterministic, and thus invalid and cannot be used. It turns out oXygen will happily validate a document against that DTD, which I suspect may be a bug in oXygen, or may be because I have some option set incorrectly in my oXygen. Truth is, the resulting DTD is non-deterministic, and cannot be used. Sigh.

Error message from xmllint

validity error : Content model of msDesc is not determinist: [ … content model here … ]

Thoughts on Solutions
At the moment I am pessimistically feeling this cannot be represented in a DTD. I expect to work on it some more in a few hours.

@larkvi

larkvi commented Sep 19, 2022

Per the msDesc SIG meeting at TEI2022, there was general agreement that we have no strong reasons to enforce the order of elements in msDesc. Given that different cataloguing traditions use different orders in their practices and cataloguers in these traditions would like to enter msDesc information in their preferred order rather than the TEI-enforced order, we thought that the easiest solution is to remove the ordering constraint.

@sydb
Member

sydb commented Oct 5, 2022

With the help of the amazing @cmsmcq I have updated the msDesc_2214_dtd pattern in my comment of 21 Dec 21 so that it now works. (I.e., can be converted to a deterministic DTD.)

But I think it worth pointing out that removing the ordering constraint (i.e., moving from msDesc_current to msDesc_2214_dtd), while quite possibly the easiest solution from a user perspective, makes for a very complex content model.

@lb42
Member

lb42 commented Oct 5, 2022

Can we see the proposed nondeterministic version in pure odd plz?

@sydb
Member

sydb commented Oct 5, 2022

Not sure why anyone would want that, @lb42 (after all, there is a reason it is called the compact syntax — in PureODD it is 191 lines long, in Relax NG XML syntax it is over 260 lines long), but here it is as a plain text file because GitHub is scared of XML.

@sydb
Member

sydb commented Oct 5, 2022

Oooh … but I forgot an important note or caveat (thank you, @lb42) —

The unambiguous _dtd version above (which I just converted to PureODD for @lb42) is slightly different than the content model we currently have in that it requires one of <additional>, <history>, <msContents>, or <physDesc>. We might not want that.

(I believe it would be quite easy to change: just insert a ? immediately in front of the last comma.)
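
To get a feel for why the enumerated pattern is so long, and what the caveat above changes: the pattern has to spell out every ordered selection of one or more of the four elements. The following Python sketch (not part of any TEI tooling; the function name `orderings` is made up for illustration) counts those selections, with and without the "none of the four" case.

```python
from itertools import permutations

ELEMENTS = ("additional", "history", "msContents", "physDesc")

def orderings(min_len=1):
    """All ordered selections of distinct elements, shortest first."""
    result = []
    for k in range(min_len, len(ELEMENTS) + 1):
        result.extend(permutations(ELEMENTS, k))
    return result

print(len(orderings()))     # 64: at least one of the four is required
print(len(orderings(0)))    # 65: the empty selection is allowed as well
```

The one extra case in the second count is the empty sequence, i.e. an `<msDesc>` with none of the four elements, which is what the current content model permits.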

sydb added a commit that referenced this issue Apr 29, 2023
 * New macro, macro.msDescPart, which boils down to the DTD-friendly equivalent of '( msContents? & physDesc? & history? & additional? )'
 * Use that macro in place of '( msContents?, physDesc?, history?, additional? )' in the content models of msDesc, msFrag, msPart, and object
 * Note: changes to MS are just whitespace and namespace prefix changes; no changes to actual Guidelines
@sydb
Member

sydb commented Apr 29, 2023

I have just created a version of TEI that uses a macro to express the msDesc_2214_dtd constraint, above; said macro then gets used in <msDesc>, <msFrag>, <msPart>, and <object>. I think this is exactly what @schassan and the MS-SIG want.

I named the macro macro.msDescPart and put it in ST (not MS) because it is used by <object>, too, which is not part of MS (but of ND).

The only downside I see is that because it is a cumbersome content model it makes for a long, ugly tagdoc page. But (I claim) the schema will do exactly what the MSSers want.

This was all done in the sydb_2214 branch. Could the Jenkins maintainers set up a job so the MSSers can test this new schema, or should I put up a copy on my basement server (or both)? Pinging @raffazizzi, @peterstadler, and @martindholmes on that last question.

@martindholmes
Contributor

@sydb I've added a build job TEI-P5-sydb_2214 on my server and it's building now.

@sydb
Member

sydb commented Apr 30, 2023

Excellent, thank you @martindholmes. It built successfully (which makes sense, there were no problems when I built it in a Docker).

So @schassan and the MS-SIGgers (and anyone else interested, even if you are only interested in making sure this change does not cause problems for you) —
you can find the documentation for this new version of <msDesc> at https://jenkins2.tei-c.org/job/TEI-P5-sydb_2214/lastSuccessfulBuild/artifact/P5/release/doc/tei-p5-doc/en/html/ref-msDesc.html, and you can play with the schema by using the one at, say, https://jenkins2.tei-c.org/job/TEI-P5-sydb_2214/lastSuccessfulBuild/artifact/P5/release/xml/tei/custom/schema/relaxng/tei_ms.rnc.

If you can report back here whether these changes seem acceptable (or not) in the next week, TEI Technical Council can perhaps act on this at its upcoming meeting 1 week from today.

One issue remaining, BTW, is to read through the prose and make sure it does not explicitly say the order is required anywhere (and fix it if it does). I do not think we have to provide examples using a different order. (But say so if you disagree with me.)

@sydb sydb self-assigned this May 2, 2023
@schassan
Author

schassan commented May 2, 2023

Dear @sydb, thanks for your efforts. Unfortunately you didn't apply the wider change I proposed in the pull request: Differing from my initial request and proposal I wanted to allow for any order of child elements within msDesc, after msIdentifier and head, thus I defined an alternate order:

  <content>
    <sequence>
      <elementRef key="msIdentifier"/>
      <classRef key="model.headLike" minOccurs="0" maxOccurs="unbounded"/>
      <alternate>
        <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
        <alternate>
          <elementRef key="msContents" minOccurs="0"/>
          <elementRef key="physDesc" minOccurs="0"/>
          <elementRef key="history" minOccurs="0"/>
          <elementRef key="additional" minOccurs="0"/>
          <alternate>
            <elementRef key="msPart" minOccurs="0" maxOccurs="unbounded"/>
            <elementRef key="msFrag" minOccurs="0" maxOccurs="unbounded"/>
          </alternate>
        </alternate>
      </alternate>
    </sequence>
  </content>

(likewise for msDesc, msPart, and msFrag). Not having taken the 'object' element into account, I suppose its content model would be analogous:

  <content>
    <sequence>
      <elementRef key="objectIdentifier" minOccurs="1" maxOccurs="unbounded"/>
      <classRef key="model.headLike" minOccurs="0" maxOccurs="unbounded"/>
      <alternate>
        <classRef key="model.pLike" minOccurs="0" maxOccurs="unbounded"/>
        <alternate>
          <elementRef key="msContents" minOccurs="0"/>
          <elementRef key="physDesc" minOccurs="0"/>
          <elementRef key="history" minOccurs="0"/>
          <elementRef key="additional" minOccurs="0"/>
          <elementRef key="object" minOccurs="0" maxOccurs="unbounded"/>
        </alternate>
      </alternate>
      <alternate minOccurs="0" maxOccurs="unbounded">
        <classRef key="model.noteLike"/>
        <classRef key="model.biblLike"/>
        <elementRef key="linkGrp"/>
        <elementRef key="link"/>
        <elementRef key="object" minOccurs="0" maxOccurs="unbounded"/>
      </alternate>
    </sequence>
  </content>

The "free" order of child elements is of crucial importance for the Handschriftenportal, but was suggested by the MS-SIG as well, in order to be suitable for as many cataloguing traditions as possible (and allow for an easier implementation).

I will look at the descriptions soon.

@schassan
Author

schassan commented May 2, 2023

The descriptions text might change quite a bit, because the idea of the change is to broaden the meaning of <msPart>: not only codicological units in the "traditional" sense should be encoded using this element, but any part of the manuscript with its own physical appearance, history, etc. shall be described using it. The binding especially, but also in-situ fragments, would be covered by that wider usage. Descriptions of bindings, fragments, accompanying materials, and booklets shall be distinguished by the 'type' attribute.

@sydb
Member

sydb commented May 3, 2023

@schassan
While I am perfectly willing to accept that I did not get this right, I am pretty convinced that the content model you propose does not do what you want. It certainly does not do what I think (or even what I thought) you want. 😄

Current content model:

  element msDesc {
    msIdentifier,
    model.headLike*,
    (
       model.pLike+
       |
       (
          msContents?,
          physDesc?,
          history?,
          additional?,
          ( msPart* | msFrag* )
       )
    )
  }

Content model from schassan:issue-2214:

  element msDesc {
    ( 
       msIdentifier,
       model.headLike*,
       (
          model.pLike+
        |
          (
             msContents?
           | physDesc?
           | history?
           | additional?
           | ( msPart* | msFrag* )
          )
       )
    )
  }

The problem with that 2nd content model is that at most 1 of the 4 elements <msContents>, <physDesc>, <history>, or <additional> may appear — not 1 of each, at most 1, period. And although multiple occurrences of <msPart> or <msFrag> may be present, if there are any occurrences of either of them, there cannot also be an <msContents>, <physDesc>, <history>, or <additional>.
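
The behaviour described above can be checked with a small Python sketch that models the proposed content as a regular expression, turning every nested `<alternate>` into a regex choice. This is an illustration only: `encode`, `PROPOSED`, and `accepts` are made-up names, and `head` and `p` stand in for model.headLike and model.pLike.

```python
import re

def encode(children):
    """Turn a list of child element names into a space-terminated token string."""
    return "".join(name + " " for name in children)

# Each <alternate> in the proposed ODD becomes a choice ("|"), not a sequence.
CHOICE = (
    "(?:msContents )?"
    "|(?:physDesc )?"
    "|(?:history )?"
    "|(?:additional )?"
    "|(?:(?:msPart )*|(?:msFrag )*)"
)
PROPOSED = re.compile(f"msIdentifier (?:head )*(?:(?:p )+|(?:{CHOICE}))")

def accepts(children):
    return PROPOSED.fullmatch(encode(children)) is not None

print(accepts(["msIdentifier", "physDesc"]))             # True: one of the four
print(accepts(["msIdentifier", "physDesc", "history"]))  # False: two of the four
print(accepts(["msIdentifier", "msPart", "msPart"]))     # True: only msPart
print(accepts(["msIdentifier", "physDesc", "msPart"]))   # False: mixed
```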

The content model in sydb_2214 allows “free order” of the 4 elements <msContents>, <physDesc>, <history>, and <additional> (0 or 1 occurrence of each) followed by all of the <msPart>s or <msFrag>s.

I am guessing that the complaint you have with that is that the order of <msPart>s or <msFrag>s is constrained (they need appear at the end of the <msDesc>). If I have that bit right, I do not know what we should do. The methodology I employed (simply listing out all possible orders) is already very cumbersome at 4 elements. (The content model in macro.msDescPart contains 121 descendant elements; no other TEI content model has even half that many. Adding <msPart> and <msFrag> into the mix would roughly double or quadruple that number.)

The other possibilities that jump to mind are to

  • keep the requirement that the <msPart>s or <msFrag>s remain at the end; or
  • convince the TEI to drop support for XSD and DTD, and then use <rng:interleave> in the content model.

@sydb
Member

sydb commented May 8, 2023

When I presented this to Council yesterday I added another possibility: loosen the closed schema content model to be a standard zero-or-more of an alternation, and then use an added <constraintSpec> to enforce the “no more than 1 each of <msContents>, <physDesc>, <history>, and <additional>” constraint.

That one was the solution that, I think, everyone else in the room liked the most, by far. The basic idea is that the TEI abstract model (which calls for no more than 1 each of <msContents>, <physDesc>, <history>, and <additional>) would be expressed in prose, and enforced by a combination of both the closed schema (RELAX NG, DTD, or XSD) and the open schema (Schematron).

This solution, as expressed quite eloquently by @joeytakeda, was so popular that the general-purpose version thereof now has a name: the Takeda Strategy.

I believe (but I am not sure) that an important step to making use of this strategy is to update section #CFDL so that it is clear that by “TEI Schema” we mean both the closed schema (RELAX NG, although a user is welcome to use DTD or XSD if more convenient, with the understanding that not all constraints may be present in those versions) and the open schema (ISO Schematron).

We should have done that years ago, anyway.

BUT even with this strategy, we still need to know if you (@schassan and the MS-SIG) are asking for

( msContents? & physDesc? & history? & additional? ), ( msPart* | msFrag* )

or what you really want is

( msContents? & physDesc? & history? & additional? & msPart* & msFrag* )

which some of us consider at least messy, if not bad practice. (Reminder: using the ‘&’ means that each particle is required (although if it has a ‘?’ or a ‘*’ it may occur only 0 times), but that order is not important.)
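
To make the difference between the two candidate models concrete, here is a Python sketch of both as plain functions over the list of child element names that follow msIdentifier and any heads. The function names are invented for this illustration, and the code only mirrors the two expressions above; it is not how a RELAX NG validator implements interleave.

```python
from collections import Counter

ONCE = {"msContents", "physDesc", "history", "additional"}
REPEATABLE = {"msPart", "msFrag"}

def interleave_then_parts(children):
    """( msContents? & physDesc? & history? & additional? ), ( msPart* | msFrag* )"""
    i = 0
    seen = set()
    # Leading run: the four elements in any order, each at most once.
    while i < len(children) and children[i] in ONCE:
        if children[i] in seen:
            return False
        seen.add(children[i])
        i += 1
    # Remainder: only msPart elements, or only msFrag elements.
    tail = children[i:]
    return all(c == "msPart" for c in tail) or all(c == "msFrag" for c in tail)

def full_interleave(children):
    """( msContents? & physDesc? & history? & additional? & msPart* & msFrag* )"""
    counts = Counter(children)
    if any(name not in ONCE | REPEATABLE for name in counts):
        return False
    return all(counts[name] <= 1 for name in ONCE)

german = ["physDesc", "msPart", "msPart", "history",
          "additional", "msContents", "msPart"]
print(interleave_then_parts(german))   # False: msPart appears before history
print(full_interleave(german))         # True
```

The order requested for German cataloguing, with msPart elements between physDesc and history, is only accepted by the second, fully interleaved model.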

@schassan
Author

Sorry for my delayed answer, I seem to have missed any notification of github to me.

What we want and need is indeed the constraint that each of the four elements (msContents, physDesc, history, and additional) may appear only once, whereas either msPart or msFrag, but not both, may appear as often as needed (given the interpretation that msFrag is for virtual re-binding only and is not to be used for fragments in the current ms), and all of these in any order.

The background to that request (as an example only, as others raised it in the SIG meeting as well!) is that in German manuscript cataloguing the order of top level elements is (msIdentifier, head, physDesc, bindingDesc, fragments, history, additional, msContents, msPart). Right now physDesc contains bindingDesc, which we want to stop using in favour of an msPart[@type='binding'], as the binding might be an object in its own right, with its own history etc., to be described in more detail than bindingDesc allows for. Additionally, there is currently no place to store detailed information on fragments and accompanying materials.

Thus the needed order of elements according to German cataloguing rules would be:
msIdentifier, head, physDesc, msPart[@type='binding'], msPart[@type='fragment'], history, additional, msPart[@type='booklet'], msPart[@type='accMat']*

@sydb
Member

sydb commented May 16, 2023

OK. Will try to poke at this a bit more in the next few days …

@schassan
Author

schassan commented Jun 5, 2023

And while I seem to have used a quantifier on msPart[@type='accMat'], the "correct" expression as a whole would then be:

msIdentifier, head, physDesc, msPart[@type='binding'], msPart[@type='fragment'], history, additional, msPart[@type='booklet'], msPart[@type='accMat']

Thus, all msParts should be repeatable whereas the "old" singular top level elements stay the way they were defined, to appear only once at max.

Although this particular order of elements is fixed, the easiest way should be to allow any order, apart from msIdentifier and head coming first, especially as this order only follows the German cataloguing rules and there might be others.

@sydb
Member

sydb commented Jun 7, 2023

OK. I am currently of a mind that the best (current)¹ way to get the requested content model for <msDesc> is to follow the Takeda Strategy — to use

(
  msIdentifier, model.headLike*,
    (
       model.pLike+
       |
       ( msContents | physDesc | history | additional | msPart | msFrag )*
    )
)

as the RELAX NG content model, and then use a Schematron constraint that complains if there are > 1 occurrences of any one of <msContents>, <physDesc>, <history>, or <additional>.
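
As a rough illustration of how the two layers divide the work, here is a Python sketch that mimics both checks on a parsed `<msDesc>`. It is not the Schematron that would go into the Guidelines: the function and constant names are invented, `head` and `p` stand in for model.headLike and model.pLike, and only the standard library's `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"
ALLOWED = {"msContents", "physDesc", "history", "additional", "msPart", "msFrag"}
AT_MOST_ONCE = ("msContents", "physDesc", "history", "additional")

def local(el):
    """Strip the namespace from an element's tag."""
    return el.tag.split("}", 1)[-1]

def check_msdesc(ms_desc):
    """Return a list of complaints; an empty list means both checks passed."""
    names = [local(child) for child in ms_desc]
    if not names or names[0] != "msIdentifier":
        return ["msDesc must start with msIdentifier"]
    rest = names[1:]
    while rest and rest[0] == "head":
        rest = rest[1:]
    if rest and all(n == "p" for n in rest):
        return []  # the paragraph-only alternative
    # "Closed schema" part: zero or more of an alternation, in any order.
    problems = [f"unexpected element {n}" for n in rest if n not in ALLOWED]
    # "Open schema" part: the at-most-once rule the loose model cannot express.
    counts = Counter(rest)
    problems += [f"more than one {n}" for n in AT_MOST_ONCE if counts[n] > 1]
    return problems

SAMPLE = f"""<msDesc xmlns="{TEI_NS}">
  <msIdentifier/><head/>
  <physDesc/><msPart type="binding"/><history/><additional/>
  <msContents/><msPart type="booklet"/><history/>
</msDesc>"""

print(check_msdesc(ET.fromstring(SAMPLE)))   # ['more than one history']
```

The loose alternation accepts the sample's element order without complaint; only the counting step, which plays the role of the Schematron constraint here, flags the second `<history>`.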

BUT we still have to decide what, if anything, should change with the content models of <msFrag>, <msPart>, and <object>, which all have a similar ( msContents?, physDesc?, history?, additional? ) portion.

Note
¹ I say best “current” way, because I would suggest something different if we had interleave and dropped support for DTDs.

@laurentromary
Contributor

Sorry to come out of the blue on this, but could this be a mechanism attached to a class (here the class comprising msContents, physDesc, history, additional, msPart, msFrag), on which you would add an indication of maximal use (not more than once for each)? <= maybe completely stupid. Do not hesitate to trash.

@sydb
Member

sydb commented Jun 11, 2023

No, @laurentromary, I do not think it is a stupid idea at all. Whatever we end up doing may well involve either 1 class for all 6 elements, or 1 class for msContents, physDesc, history, & additional, and a separate class for msPart & msFrag. On the other hand, may not be able to use a class for the 1st four if we want to get optionality and lack of sequence from preserveOrder=false. I am hoping to test that out now.

@schassan and MS-SIG — two questions:

  1. Is there a need to be able to intermingle both <msPart> and <msFrag> in (various places within) the same <msDesc>, or should we continue to enforce the “within any particular <msDesc> you can use either as many <msPart>s or as many <msFrag>s as you want (and put them anywhere amongst the other elements), but not both”?
  2. Still hoping for an answer to my question of 07 Jun: What do you think, if anything, should change with the content models of <msFrag>, <msPart>, and <object>, which all have a similar ( msContents?, physDesc?, history?, additional? ) portion?

@schassan
Author

schassan commented Sep 6, 2023

@sydb
ad 1.:

First of all we should be aware that msFrag has changed its definition over time: In the beginning it was meant to contain (or rather: be) a copy of an existing msDesc or msPart, and should be used only for virtual reconstruction. Although (or maybe because?) I always complained about that distinct meaning and usage, right now an msFrag could really contain the description of a fragment. Sticking to the "old" definition, a fragment could also be described using msPart @type=fragment.

Having said that, two different answers might be possible:

  • If we still consider the main purpose of msFrag to be virtual reconstruction, I think that the usage of msPart and msFrag must be mutually exclusive.
  • If the semantics of msFrag is considered to be more loose, both elements should be allowed at any place and any number of occurrences.

Considering that we might break documents and that I (personally) prefer a quick solution, I would favour the second option, allowing both.

ad 2.: As msPart and msFrag always shared the content model of msDesc, I would think they should be changed accordingly.

I would be very glad, if this issue could be decided upon, implemented and closed in the upcoming release. ;-)

@sydb
Member

sydb commented Oct 12, 2023

OK. I have implemented a possible solution in branch 'sydb_2214_take_4'.

The change is that any time the foursome of <msContents>, <physDesc>, <history>, and <additional> occurs in a content model¹ the clause they are in is now an alternation. In the RELAX NG it is an alternation of 0–∞, but an additional constraint (in Schematron) complains if there is more than 1 of any of those four elements.

Additionally, when both <msFrag> and <msPart> can appear, they can now be intermingled.

I have put a copy of both the Guidelines and the exemplar schemas up on my basement server. The only real change to the Guidelines prose is the 3rd paragraph of 10.2 (starts with “The first of these components …”); the only tagdoc pages that are changed are for msDesc, msFrag, msPart, and object.

If anyone who understands manuscript description better than I could take a quick look and make sure I am not off the rails before I submit a pull request for this, it would be appreciated.

Notes
¹ Those four elements only occur in the content of <msDesc>, <msFrag>, <msPart>, and <object>, and in each case they all 4 occur together.

@holfordm

The revised prose at 10.2 should perhaps specify that the specialised elements can newly occur in any order. The reference to composite manuscripts doesn't cover what @schassan is proposing in using msPart to describe bindings

@sydb
Member

sydb commented Oct 13, 2023

Thank you @holfordm !

I am not sure I see a good reason to explicitly say that the fact that they can appear in any order is new, but probably does make sense to mention they can appear in any order, which I have now done.

But I am afraid I do not grok your 2nd sentence (I do not do real mss description, I just make stuff up to test TEI :-). But I think you are talking about the sentence “Finally, in the case of a composite manuscript (a manuscript composed of several codicological units) or a fragmented manuscript (a manuscript whose parts are now dispersed and kept at different places), a full description may also contain one or more <msPart> (…) elements and <msFrag> (…) elements, respectively.”, yes? Can someone propose what it should say?

@holfordm

Probably the existing text is fine since using msPart to describe bindings might be a practice that is restricted to the Handschriftenportal - @schassan is best placed to comment I think

@sydb
Member

sydb commented Oct 23, 2023

@schassan (and @holfordm and the rest of the MS-SIGers) — I am hoping this is ready to go into the Guidelines, and have created pull request #2495 to that effect. If you could please check this version and let us know if it is OK or not in < 10 days, it can probably make it into the upcoming release.

You can see the comparison of the actual source files on the PR. As previously, the most recent version of the Guidelines and Exemplars generated from this branch are now up on my basement server. Please test.

@sydb sydb closed this as completed Nov 10, 2023
@ebeshero ebeshero added this to the Guidelines 4.7.0 milestone Nov 17, 2023