clarify representation of milestones #4

Closed
kshawkin opened this issue Aug 31, 2015 · 11 comments

@kshawkin
Owner

For a line of asterisks between paragraphs, we currently say to use <ab type="typography">******</ab>. I think we chose this because it makes it easy to render a sequence of Unicode characters included as the content of the element. We should explain this rationale for choosing <ab> over <milestone> or <space>, which are more conventional choices.

Furthermore, we should add @style to the example:

<ab type="typography" style="text-align: center">******</ab>
@kshawkin
Owner Author

Check that we say what to do if you encounter this at a division (<div>) boundary.

@sydb
Collaborator

sydb commented Oct 29, 2015

But what we say about how to encode this at a <div> boundary does not affect this ticket. I'm inclined to say “yes”, we should add prose explaining.

@kshawkin
Owner Author

kshawkin commented Mar 7, 2016

I think Syd is right that what to do about a <div> boundary is a separate issue. In fact, I now remember that the Best Practices already say that when a <pb> occurs at a <div> boundary, we should always include it within a <div> for ease of processing. We should probably be consistent in what we say about these milestones.

But on the question originally raised in this issue, Martin Mueller suggested that <pc> would be more appropriate than <ab>. He agreed to post to TEI-L to explain that we are looking for a non-empty element for this purpose (ruling out <milestone> and <space>) and ask for people's thoughts on using <pc>. We'll return to this once we have some thoughts.

@PFSchaffner
Collaborator

<pc> is semantically appropriate but I suspect would be practically infeasible, since it is a word/chunk level element that can only appear where plain text can appear, whereas pause-lines or almost-divs, or whatever you want to call these indications of a vague break, occur often in places that are more likely to require an element sibling to the elements on either side (<p> or <lg> for example). Which I think leaves only <ab> and <metamark> as practical options, assuming that you want to capture the line of whatevers as literals, and capture them within the matrix within which the sibling elements float (typically a <div> but possibly also <q> <floatingText> etc.) I.e., whatever you choose has to be able to appear directly in <div> and preferably also in <q(uote)> and <sp> and <floatingText> and probably other places I haven't thought of. As for capturing them as literals, we should bear in mind that 'line of asterisks' is only one of many styles, others being dashes, horizontal rules, dots, alternating dots and equals, etc.
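
To make the sibling requirement concrete, here is a minimal sketch of <ab> sitting directly inside a <div> between a <p> and an <lg>. The @type values on <div> and the placeholder text are assumptions for illustration only; the second <ab> shows a dashed variant captured as literal characters.

```xml
<div type="chapter">
  <p>Text before the break.</p>
  <!-- separator captured as literal characters, sibling to <p> and <lg> -->
  <ab type="typography" style="text-align: center">******</ab>
  <lg>
    <l>A line of verse after the break.</l>
  </lg>
  <!-- the same pattern with a different style of separator -->
  <ab type="typography" style="text-align: center">- - - - -</ab>
  <p>Text after the second break.</p>
</div>
```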

@kshawkin
Owner Author

kshawkin commented May 2, 2016

Following discussion during monthly call on 2016-04-04, Martin Mueller posted to TEI-L to ask for input: https://listserv.brown.edu/archives/cgi-bin/wa?A2=TEI-L;96293b6.1604 .

@kshawkin
Owner Author

kshawkin commented May 2, 2016

As discussed during today's monthly meeting, BPTL will recommend putting all milestone-type elements in the lowest-level <div>. However, if a user diverges from this practice, they should document it in <editorialDecl>. So if we decide on an empty element as the solution for this, we should address placement with respect to <div> elements.
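
A rough sketch of what this recommendation could look like, using <pb> as the milestone-type element. Placing the <pb> at the start of the following <div> (rather than at the end of the preceding one) is an assumption here, as are the @n values, the @type values, and the wording of the <editorialDecl> note.

```xml
<!-- in the header: document any divergence from the recommended placement -->
<encodingDesc>
  <editorialDecl>
    <p>Milestone-type elements that fall at a division boundary are
       encoded inside the lowest-level div.</p>
  </editorialDecl>
</encodingDesc>

<!-- in the text: the page break sits inside the inner div, not between divs -->
<div type="chapter" n="1">
  <div type="section" n="1">
    <p>Last paragraph of the first section.</p>
  </div>
  <div type="section" n="2">
    <pb n="12"/>
    <p>First paragraph of the second section.</p>
  </div>
</div>
```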

@kshawkin
Owner Author

kshawkin commented May 2, 2016

Decided during today's monthly meeting that @emylonas will investigate what P5 says and summarize in a comment here. Then we'll reconsider.

@kshawkin
Owner Author

For the record, Martin Mueller wrote on teilib-l on 2016-05-17:

As for typographical milestones, it is the case that they are often ornamental. But they are first of all separators, though of a weak kind. If a div is like a colon, a typographical milestone is like a comma, but there is no clear hierarchy. On the other hand, you never find such phenomena as merely decorative. They always carry the message that what comes after them is in some fashion different from what came before. They may be more like mdashes, indeterminate and anti-hierarchical articulators of text, but articulators nonetheless. The question about what to do with them in an XML environment points to their nature: XML is very binary, but these dividers try to have it both ways. I'd like to keep alive the <pc> option, perhaps as <pc unit='discourse'>, because it may be the least abusive option. It puts them "in" the text, where they belong, but it doesn't turn them into the kind of content you'd expect to find inside <ab>. The <pc> element is a relatively recent addition to the tag set, and it may be worth having a discussion about whether acknowledging a weak discursive "punctuation" is a useful extension of a punctuation element or a form of tag abuse.
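
For comparison, a hedged sketch of how the <pc> option might look. Because <pc> is a word-level element, it has to sit inside a container such as <p>; attaching it to the end of the preceding paragraph is an assumption made for this illustration, not something decided in this thread.

```xml
<div type="chapter">
  <p>Text before the break.
    <!-- separator treated as weak discursive punctuation inside the paragraph -->
    <pc unit="discourse">******</pc>
  </p>
  <p>Text after the break.</p>
</div>
```

This shows the practical cost raised earlier in the thread: the separator is no longer a sibling of the paragraphs on either side.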

@emylonas
Collaborator

emylonas commented Nov 14, 2016

Two discussions in this ticket:

First Discussion
The P5 discussion on milestones at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#CORS5 doesn't provide any specifics on where to place <milestone>-like elements when they appear at division boundaries. Some of the examples imply that they may be outside the lowest level mixed content element, but there is no explicit information.

The reason suggested for putting the <milestone>-like element inside the mixed content element seems to be that it makes it easier to handle links to facsimile images. BPTL is currently in favor of putting <milestone>-like elements inside the lowest-level <div>, or documenting divergent practice in an <editorialDecl>. For the record, Syd and Elli disagree, and think it should be between the sibling sections and not within them.

Second discussion:
P5 doesn't seem to have any explicit advice on how to mark characters like a row of asterisks that indicate some kind of division, but not necessarily a well understood one like a paragraph or a chapter. Epidoc uses <milestone rend="xx">. The WWP usually declares it to be marking an identifiable section and uses a @rend on a <div> or other container element.
Neither of these is particularly helpful for us in this case.

It seems to make more sense to treat these characters as content and put them in an <ab>

<ab style="xx">*********</ab> 

The Epidoc approach is also viable if the separator has a name

<milestone rend="paragraphos" unit="undefined"/>

Finally, if the separator characters are not in Unicode and don't have a name (for example, some kind of curlicue), it may be possible to use a <figure> inside the <ab>.
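
A minimal sketch of that last case, assuming the schema in use allows <figure> within <ab>. The image file name and the description are hypothetical placeholders.

```xml
<ab type="typography" style="text-align: center">
  <figure>
    <!-- hypothetical image of the ornament; the url is a placeholder -->
    <graphic url="ornament-curlicue.png"/>
    <figDesc>Curlicue ornament separating two passages.</figDesc>
  </figure>
</ab>
```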

@emylonas
Collaborator

After discussion on 11/13
Explain why we recommend <ab> in the prose and add @style to example. This is Kevin's original suggestion.

emylonas added a commit to emylonas/Best-Practices-for-TEI-in-Libraries that referenced this issue Dec 12, 2016
@emylonas
Collaborator

Made changes as above. Note: revisited why not use <metamark>. For the record, it's because that element is part of msDesc and is often used for editorial marks. So <ab> is definitely better. Closed with commit fbc58b8
