Initialize branch TagsNTitles; replaces attributes @refnum,@frefnum,@…#966
…rrefnum with more flexible ltx:tags containing multiple ltx:tag, each with @role to indicate what each tag is used for; this allows formatting within the reference numbers, extensibility for new types of tags (e.g. hyperref's autoref), possibilities for consolidating duplicated code, and makes it easier to implement endnotes, glossaries, etc. These changes will be disruptive of some bindings; more to come!
| title => orNull($doc->findnode('ltx:bibtag[@role="title"]', $node)), | ||
| keytag => orNull($doc->findnode('ltx:bibtag[@role="key"]', $node)), | ||
| typetag => orNull($doc->findnode('ltx:bibtag[@role="bibtype"]', $node))); } | ||
| authors => orNull($doc->findnode('ltx:tags/ltx:tag[@role="authors"]', $node)), |
Since ltx:tags is used for more than bibliographies, isn't the new selector potentially more general than simply over bibliography tag elements? And is that intended?
Indeed; I replaced sequences of ltx:bibtag by an ltx:tags containing any number of ltx:tag. But ltx:tags is also used to collect the single ltx:tag that was previously allowed at the beginning of various sectional units, floats, theorems, etc. (there can now also be multiple ltx:tag). And they're also allowed in equations and such. So, yeah, the idea was that a collection of tags would become fully general.
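To make the point about generality concrete, here is a minimal Python sketch (not LaTeXML's own Perl code) of the same relative selector applied to two different context nodes. The sample document, the ids, and the helper name find_tag are assumptions made up for illustration; only the selector shape and the role names "authors" and "key" come from the diff above. Because the path is evaluated relative to a node, it only finds tags that belong to that node, so a role that is absent simply yields nothing.

```python
import xml.etree.ElementTree as ET

# The LaTeXML namespace, bound to the conventional ltx: prefix.
NS = {"ltx": "http://dlmf.nist.gov/LaTeXML"}

# Hypothetical sample: a section and a bibitem that both carry ltx:tags.
SAMPLE = """
<document xmlns="http://dlmf.nist.gov/LaTeXML">
  <section xml:id="S1">
    <tags>
      <tag>1</tag>
      <tag role="refnum">1</tag>
    </tags>
  </section>
  <bibitem xml:id="bib.bib1">
    <tags>
      <tag role="authors">Knuth</tag>
      <tag role="key">knuth84</tag>
    </tags>
  </bibitem>
</document>
"""


def find_tag(node, role):
    """Return the first ltx:tags/ltx:tag of `node` with the given role, or None."""
    return node.find(f'ltx:tags/ltx:tag[@role="{role}"]', NS)


root = ET.fromstring(SAMPLE)
section = root.find("ltx:section", NS)
bibitem = root.find("ltx:bibitem", NS)

# Same selector, different context nodes.
authors_in_bibitem = find_tag(bibitem, "authors")
authors_in_section = find_tag(section, "authors")

print(authors_in_bibitem.text)
print(authors_in_section)
```

In this sketch the selector is general in form but still scoped by the node it is applied to and by the role it asks for.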
| <XMApp> | ||
| <XMTok fontsize="80%" meaning="not-equals" name="not-=" role="RELOP">≠</XMTok> | ||
| <XMTok font="italic" fontsize="80%" role="UNKNOWN">r</XMTok> | ||
| <XMTok fontsize="80%" meaning="0" role="NUMBER">0</XMTok> |
Unsure if this is minor, but the fontsize attribute disappeared from the individual math tokens? It seems unrelated to this change.
I think this was due to a bugfix that came about as a side effect of treating tags more uniformly: sometimes the formatter for a tag was something like \bf\thesomething. Hopefully the current code is more consistent about wrapping a group around the formatter so that its side effects don't leak out, as the small font did here.
| <tags> | ||
| <tag>1</tag> | ||
| <tag role="refnum">1</tag> | ||
| <tag role="typerefnum">§1</tag> |
Why not trefnum for a role, given there is frefnum? I am unsure whether it's better to argue for a single-letter convention, or for longer attribute names such as refnum_formatted and refnum_type or some such.
Also curious to observe that they really have the same refnum role, but hold distinct variants for that role. Though I suspect adding another attribute is overhead that's not worth its time right now.
I spent a lot of time, and back-and-forth, with the names, trying to improve over "frefnum" :> There are 3 base cases roughly equivalent to the former frefnum, refnum and rrefnum, hopefully with more explanatory names, though perhaps those names can be further clarified.
- No role is a "formatted refnum"; it is used for the display in or alongside the object being tagged (e.g. items, equations). The exception is that for objects with a title or caption, the tag is essentially duplicated within the title/caption, so the one in the ltx:tags is likely never used.
- @role=refnum is the plain refnum, essentially \thecounter, which LaTeX would use to fill in \ref. Note that the author typically gives some context like "see section \ref{foo}", and that not all objects get a refnum, in which case LaTeX uses the refnum of the parent.
- @role=typerefnum is a "typed refnum" for use when generating references to an object, typically automated, such as back-references and such. There's no context there, so the type name or symbol is helpful.
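The three cases can be sketched as a small lookup, here in Python rather than in LaTeXML's own Perl or XSLT. The helper name tag_text and the sample section are hypothetical; the tag values mirror the test output shown earlier in this review. The fallback to the parent's refnum mentioned in the second bullet is not modeled.

```python
import xml.etree.ElementTree as ET

NS = {"ltx": "http://dlmf.nist.gov/LaTeXML"}

# Hypothetical section carrying the three tag variants discussed above.
SECTION = """
<section xmlns="http://dlmf.nist.gov/LaTeXML" xml:id="S1">
  <tags>
    <tag>1</tag>
    <tag role="refnum">1</tag>
    <tag role="typerefnum">§1</tag>
  </tags>
</section>
"""


def tag_text(element, role=None):
    """Return the text of the element's tag with the given role.

    role=None selects the tag that has no role attribute at all,
    i.e. the "formatted refnum" used for display.
    """
    for tag in element.findall("ltx:tags/ltx:tag", NS):
        if tag.get("role") == role:
            return tag.text
    return None


section = ET.fromstring(SECTION)

display = tag_text(section)               # shown in or alongside the object
plain = tag_text(section, "refnum")       # what \ref would fill in
typed = tag_text(section, "typerefnum")   # for generated references

# An author-written reference supplies its own context; a generated one does not.
authored = f"see section {plain}"
generated = f"see {typed}"
print(authored)
print(generated)
```

The design point is that the consumer chooses a role according to how much context it supplies itself, instead of each element carrying a fixed set of attributes.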
Hope that helps; if you can think of better names...
Thanks! Got it.
Well, refnum_formatted and refnum_typed do sound better to me, but they are also long, so probably not ideal. I still slightly lean towards recommending them...
The other, more informal-sounding, bunch of names that come to mind are things like fmtnum and typnum, but I don't really find the appeal in such dashing shorthands nowadays.
| <table frefnum="Table 1" refnum="1" xml:id="S0.T1"> | ||
| <table xml:id="S0.T1"> | ||
| <tags> | ||
| <tag>Table 1</tag> |
Just as a readability remark, I'm unsure what the tag with no attribute signifies. Looking at the diff, I assume it is the frefnum that was originally on the table element? If so, maybe it warrants a role attribute specifying that?
See the reply to the previous comment about what the lack of @role means. It could be either default or explicit, if there's a meaningful name for it.
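For readers with XML produced before this change, the table diff above suggests a mechanical correspondence between the old attributes and the new tags. The following Python sketch is only an illustration of that correspondence, not how LaTeXML itself produces the tags (that happens in the bindings). The function name attributes_to_tags is hypothetical, and the mapping of rrefnum to typerefnum is an assumption based on the "roughly equivalent" remark in the reply above.

```python
import xml.etree.ElementTree as ET

LTX = "http://dlmf.nist.gov/LaTeXML"
ET.register_namespace("", LTX)  # serialize without an ns0: prefix

# Old attribute -> role of the new ltx:tag (None means "no role attribute").
# The rrefnum -> typerefnum pairing is an assumption, not confirmed by the PR.
OLD_TO_ROLE = [
    ("frefnum", None),
    ("refnum", "refnum"),
    ("rrefnum", "typerefnum"),
]


def attributes_to_tags(element):
    """Move old refnum-style attributes of `element` into a leading ltx:tags child."""
    found = [(attr, role) for attr, role in OLD_TO_ROLE if attr in element.attrib]
    if not found:
        return None
    tags = ET.Element(f"{{{LTX}}}tags")
    for attr, role in found:
        tag = ET.SubElement(tags, f"{{{LTX}}}tag")
        if role is not None:
            tag.set("role", role)
        # Remove the old attribute and reuse its value as the tag content.
        tag.text = element.attrib.pop(attr)
    element.insert(0, tags)
    return tags


old = ET.fromstring(
    '<table xmlns="http://dlmf.nist.gov/LaTeXML" '
    'frefnum="Table 1" refnum="1" xml:id="S0.T1"/>'
)
new_tags = attributes_to_tags(old)
converted = [(t.get("role"), t.text) for t in new_tags]
print(converted)
print(ET.tostring(old, encoding="unicode"))
```

Attribute values are plain strings, so a conversion like this cannot recover any formatting inside the reference number, which is one of the things the new ltx:tag elements are meant to allow.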
| <div> | ||
| <!-- The element tags | ||
| is currently not supported for the main body. | ||
| --> |
Extremely minor, but seeing this same comment on every div feels a bit verbose.
Yeah, the original author of the TEI & JATS stylesheets had a different approach than I'm used to. Not sure whether those messages were intended only for development or not.
For what it's worth, I skimmed the code and left minor comments. The change set is very impressive! I cannot claim any reasonable quality control, but at least I understand the diffs better. Depositing so many refnum-like tags does feel redundant while skimming, but it's probably worth the trouble given how much machinery requires the different variants. I can definitely believe this makes writing the XSLTs much easier. So all 👍 from me; let me know if you'd like me to take a closer look at some specific bit.
… make ltx:TOC use @lists, rather than @role, for the corresponding purpose; implement \addcontentsline; otherwise support the various toc and list commands more in tune with the way LaTeX does it, and reduce the dependence on explicit lists of element names
OK, let's cross our fingers and do it!
Oh wow, it shipped! 🚀 Time to do some testing...
Also, it just occurred to me that some of the production users of latexml are not on the mailing list - so it may be a great idea to update the The best way to do that may be adding a header line on the top: and bookkeep the changes as PRs get merged, especially major ones. Bonus outcome - releasing the next version becomes as simple as bumping the date in the file, and it's less likely to forget some change set. Just a thought, since this specific PR changes a lot, and it is quite possible production deployments may hit underwater discrepancies.