add <taxonomy> and <category> to att.datcat #2419
Comments
Just to add a small detail to what @bansp was saying. We are working on an edition of a Portuguese monolingual dictionary from the 18th century, and we are using At the moment, we are using |
Sounds quite reasonable to this linguistic ignoramus.
The PR needs a bit more love: where I mention "Morais dictionary", we probably want a project reference or at least a few more words and a link. The resulting att.datcat is here: https://jenkins-paderborn.tei-c.org/view/LingSIG/job/TEIP5-LingSIG-tests/lastSuccessfulBuild/artifact/P5/release/doc/tei-p5-doc/en/html/ref-att.datcat.html |
@anacastrosalgado is going to help and provide the missing reference, she said. |
@bansp Morais Silva, A. M. (1789). Diccionario da lingua portugueza composto pelo padre D. Rafael Bluteau, reformado, e accrescentado por Antonio de Moraes Silva, natural do Rio de Janeiro (vols. 1–2). Officina 730 de Simão Thaddeo Ferreira. MORDigital project (https://mordigital.fcsh.unl.pt/en/homepage/). The digital edition will be available via TEI Lex-0 Publisher at the end of the project. |
Thanks, Ana. I've added the reference. Are you OK with the example used there? I think it came from you directly, but that was something like a year ago, so maybe you'd rather see some changes there while the Council is still in the process. |
@bansp Please see whether it can be like this (@ttasovac, @laurentromary, please take a look as well). It is the example that we used yesterday in our presentation at the TEI conference.

The att.datcat attributes can be used for any sort of taxonomy. The example below illustrates their usefulness for describing usage domain labels in dictionaries, showing a lexicographic article from a Portuguese legacy dictionary, the Morais dictionary [Morais Silva, A. M. (1789). Diccionario da lingua portugueza composto pelo padre D. Rafael Bluteau, reformado, e accrescentado por Antonio de Moraes Silva, natural do Rio de Janeiro, vols. 1–2. Officina 730 de Simão Thaddeo Ferreira]. The digital edition will be available via TEI Lex-0 Publisher at the end of the MORDigital project (https://mordigital.fcsh.unl.pt/en/homepage/).

<!-- in the dictionary header -->
<encodingDesc>
  <classDecl>
    <taxonomy xml:id="domains">
      <!--...-->
      <category xml:id="domain.mathematical_sciences"
        valueDatcat="http://www.semanticweb.org/OntoDomLab-Math#MathematicalSciences http://vocabs.rossio.fcsh.unl.pt/morais_domains/0036">
        <catDesc xml:lang="en">
          <term>Mathematical Sciences</term>
          <gloss>Group of areas of study that includes, in addition to mathematics, those
            academic disciplines that are primarily mathematical in nature but may not
            be universally considered subfields of mathematics proper.</gloss>
        </catDesc>
        <catDesc xml:lang="pt">
          <term>Ciências Matemáticas</term>
          <gloss>
            <!--...-->
          </gloss>
        </catDesc>
        <category xml:id="domain.mathematics"
          valueDatcat="http://www.semanticweb.org/OntoDomLab-Math#Mathematics http://vocabs.rossio.fcsh.unl.pt/morais_domains/0024">
          <catDesc xml:lang="en">
            <term>Mathematics</term>
            <gloss>
              <!--...-->
            </gloss>
          </catDesc>
          <catDesc xml:lang="pt">
            <term>Matemática</term>
            <gloss>
              <!--...-->
            </gloss>
          </catDesc>
          <category xml:id="domain.arithmetic"
            valueDatcat="http://www.semanticweb.org/OntoDomLab-Math#Arithmetic http://vocabs.rossio.fcsh.unl.pt/morais_domains/0003">
            <catDesc xml:lang="en">
              <term>Arithmetic</term>
              <gloss>
                <!--...-->
              </gloss>
            </catDesc>
            <catDesc xml:lang="pt">
              <term>Aritmética</term>
              <gloss>
                <!--...-->
              </gloss>
            </catDesc>
          </category>
          <category xml:id="domain.geometry"
            valueDatcat="http://www.semanticweb.org/OntoDomLab-Math#Geometry http://vocabs.rossio.fcsh.unl.pt/morais_domains/0018">
            <catDesc xml:lang="en">
              <term>Geometry</term>
              <gloss>
                <!--...-->
              </gloss>
            </catDesc>
            <catDesc xml:lang="pt">
              <term>Geometria</term>
              <gloss>
                <!--...-->
              </gloss>
            </catDesc>
          </category>
        </category>
      </category>
      <!--...-->
    </taxonomy>
  </classDecl>
</encodingDesc>

<!-- inside an <entry> element: -->
<usg type="domain" valueDatcat="#domain.mathematics">Mathem.</usg>

<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="MORAIS.DLP.1.ORDENADA" type="mainEntry" xml:lang="pt">
  <form type="lemma">
    <orth>ORDENADA</orth>
  </form>
  <metamark function="lemmaDelimiter">,</metamark>
  <gramGrp>
    <gram type="pos" norm="NOUN">ſ.</gram>
    <gram type="gen">f.</gram>
  </gramGrp>
  <sense xml:id="MORAIS.DLP.1.ORDENADA.s.1">
    <usg type="domain" valueDatcat="#domain.mathematics">Mathem.</usg>
    <def>linha recta tirada perpendicularmente do ponto da curva a ſeu eixo</def>
  </sense>
  <metamark function="senseDelimiter">.</metamark>
</entry>

<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="MORAIS.DLP.1.TRIGONOMETRIA" type="mainEntry" xml:lang="pt">
  <form type="lemma">
    <orth>TRIGONOMETRIA</orth>
  </form>
  <metamark function="lemmaDelimiter">,</metamark>
  <gramGrp>
    <gram type="pos" norm="NOUN">ſ.</gram>
    <gram type="gen">f.</gram>
  </gramGrp>
  <sense xml:id="MORAIS.DLP.1.TRIGONOMETRIA.s.1">
    <!-- invisible domain -->
    <usg type="domain" valueDatcat="#domain.mathematics" resp="#Salgado"/>
    <def>parte da Mathematica , que enſina a reſolver os triangulos planos , e esfericos</def>
  </sense>
  <metamark function="senseDelimiter">.</metamark>
</entry>

In the Morais dictionary, the relevant domain labels are organised in the header and referenced from <usg> elements inside the dictionary entries. The vocabulary used for dictionary-internal labelling is in turn anchored in the MORDigital controlled vocabulary service of the NOVA University of Lisbon – School of Social Sciences and Humanities (NOVA FCSH).
I should have phrased my last comment differently :-) Like "is there something in the example right now that makes it utterly wrong (rather than not beautiful enough)" ;-) The last Jenkins build shows some errors, but I'm not at all sure that the errors come from the newly added reference. I have now pushed a new commit and hope that it lights green, and the ticket/PR gets accepted for merging. |
Update: the Jenkins build keeps failing, but it looks a bit like an incompatibility between some configuration item (path?) and some backwards-incompatible modification in a new release of the Guidelines. I wonder if lines such as
can be taken as indicative of what's wrong (see the console output). Will pester @peterstadler about this at some point, but only after he's had a bit of a breather after the conference... |
Thanks, Peter Stadler! Fixing a compatibility failure to allow a build to pass for issue TEIC#2419
... and, as usual, Peter has not failed. Thanks! See issue #2472 for progress on the Council side. |
That hasn't worked as planned. Compare the two console outputs:
old output, 6 errors: https://jenkins-paderborn.tei-c.org/job/TEIP5-LingSIG-tests/13/parsed_console/
new output, 8 errors: https://jenkins-paderborn.tei-c.org/job/TEIP5-LingSIG-tests/14/parsed_console/
So let me just wait for a fix by the Council.
@bansp I took a very quick look at the bug report and saw this issue right away: "ERROR: Guidelines.epub: OPS/XHTML file OPS/ref-att.calendarSystem.html is missing" It looks like the build is missing a crucial file for ref-att.calendarSystem ? |
@bansp That's not your doing, of course--just a recognition that the build problem is likely to do with activity last week and this on a different PR (uh oh...): #2435 @raffazizzi and @sydb should be able to help here! I'll look in later--I'm headed back to the university trenches for the next several hours. |
@ebeshero Thanks for giving it a check :-) |
@bansp I think the proverbial dust has settled from yesterday's activities on the other PR! It's probably safe to update your branch now. But I also think it may be safe to ask our Council reviewers to check things out too. |
Thanks, Elisa. No hurry on this end. I'll do my best to react to potential comments by the reviewers. |
OTOH, there's no movement yet, in the dev branch of either TEI or Stylesheets. I'll just check back in a day or two :-) |
Updated the pull request with the content coming from issue #2480 but still no go, at least not in the Paderborn Jenkins. It's the first time since I can remember that the build tree has been broken for so long. Feels weird. I understand it's because we're waiting for some upstream sanity but I'm not sure that that is sensible. They must know they've broken stuff and since they haven't bothered to fix it, shouldn't we go around them? As Martin suggests in #2472 . |
Thanks, Elisa. By upstream sanity, I was referring to what seems a happy go lucky move by the Debian team, if I understood Martin correctly. And leaving the matters unchanged despite a hiccup. I realise that, for the Council, waiting is a reasonable option, up to a limit, and the costs are arguably low in this case. Maybe a different make flow is what can be done in our case. Will ask Peter about that. |
* add category and taxonomy to att.datcat; add to the spec (references #2419)
* minor corrections
* correction of attribute names
* add the Morais dictionary reference
  With thanks to Ana Salgado
* cosmetic changes to restart Jenkins
* Update att.datcat.xml
  use title rather than hi
* "Junicode" to "Junicode Two Beta"
  Thanks, Peter Stadler! Fixing a compatibility failure to allow a build to pass for issue #2419
* revert
This is a request coming from the Lexical Resources Summit 2023 (DARIAH, Berlin), convened by @ttasovac and @laurentromary. The immediate context is the TEI Lex-0 customisation, but @JessedeDoes suggests that the request could also be helpful in the ParlaMint project.
Essence: it would be most useful to be able to use the datcat attributes for taxonomies of any sort (the datcat attributes are no longer only a grammatical device). For that purpose, the <taxonomy> and <category> elements should be members of att.datcat.

When discussing the initial bundle of elements for the re-written datcat, a few months ago, we decided to start small and expand when there is a need. Now, the need comes from two well-established projects, and there is a good chance that the addition is going to be useful elsewhere as well.
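For illustration only, here is how a project could add the requested class membership locally in its own ODD, until the change is made in the Guidelines themselves. This is a minimal sketch, not the actual Lex-0 or ParlaMint customisation: the schema name `datcat_sketch` and the selection of modules are assumptions made up for the example.

```xml
<!-- Hypothetical ODD fragment; the ident "datcat_sketch" and the module list are illustrative assumptions -->
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="datcat_sketch" start="TEI">
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="dictionaries"/>

  <!-- make <taxonomy> a member of att.datcat -->
  <elementSpec ident="taxonomy" module="header" mode="change">
    <classes mode="change">
      <memberOf key="att.datcat" mode="add"/>
    </classes>
  </elementSpec>

  <!-- make <category> a member of att.datcat -->
  <elementSpec ident="category" module="header" mode="change">
    <classes mode="change">
      <memberOf key="att.datcat" mode="add"/>
    </classes>
  </elementSpec>
</schemaSpec>
```

With the membership added to the Guidelines directly, as requested here, no such local change would be needed.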
Also, we said that, since taxonomies may use <equiv>, we'd see if a genuine need for the attribute class exists. Neither the ODD for Lex-0 nor the ODD for ParlaMint uses the tagdocs module. I think that this may qualify as a genuine need, because requiring these carefully crafted ODDs to use the tagdocs module only for the sake of equiv/@uri as the mechanism for aligning with external taxonomies is definitely far-fetched.

We're going to add a PR to this ticket, with some examples added to (at least) the att.datcat spec.