Dictionary: an entry element in a sense element #1791
Comments
In theory, this would be a perfect use case for |
This is really a very lucky coincidence. As Laurent mentioned, the TEILex0 team (partly DARIAH WG Lexical Resources, partly ELEXIS: European Lexicographic Infrastructure) has spent a great deal of time discussing this issue. We've identified multiword expressions and collocations (but sometimes also idioms and other types of phraseology) that would greatly benefit from being grouped as entries. @laurentromary is right that we also have
So, I would also like to urge the Council to consider making |
Just to mention it: in a dictionary one may write separate entries for multiword headwords and then create a link from the actual sense to this separate entry. This is the solution we have used in the editing system for the Norwegian Dictionary (http://no2014.uib.no). This is the typical database/linked data solution. However, in a TEI realization of the dictionary, the solution with nested/recursive entries is very clean and useful. It will help me in making a TEI version of this very complex dictionary. |
@martinascholger just resolved a different, perhaps related ticket on the dictionary module: #1702, so it might make sense to continue this work... |
It is indeed important to have the flexibility of treating multiword headwords as part of the semantic description of the simplex they are related to, while at the same time having the opportunity to treat them as entries. All the more so since dictionaries treat MWUs either as separate entries or as part of the description of an entry they are related to. |
I can see problems (as well as advantages) of having entry recursive. So, I have to ask: what is wrong with dictScrap for containing such multiword expressions? It seems to be able to contain most of the stuff entry does. |
The ticket #1702 mentioned by @ebeshero is related but different. It addresses a problem one encounters frequently when encoding retrodigitized dictionaries: how to deal with all the characters (and spaces) used in the original as separators and decor while at the same time encoding the logical structure of an entry. This would require either a mixed content model (elements and cdata intermixed) or some neutral element that can appear almost anywhere to encapsulate the separators, punctuation and decor elements. |
To further answer @TomazErjavec 's remark: I think one of the issues behind @chr-emil 's request is to be able to have the same object encoded in the same way wherever it appears: i.e. as an autonomous entry or a sub-entry somewhere (in his case within a |
I agree with @ttasovac. There is nothing unstructured in such a |
But an entry does have somewhat different semantics from a subentry or a nested entry, surely? If asked "how many entries are there?", it's plausible to exclude those which are nested within a main entry, surely? Why did the dictionary writer organise the material in this way? So I am not convinced it really is "the same object". |
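For what it is worth, a nested encoding would not by itself force one answer to the counting question: both figures stay recoverable from the markup. The sketch below is only an illustration under stated assumptions; the sample fragment, its `n` values, and the helper name `count_entries` are hypothetical, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: entry "a" nests two entries inside a sense,
# entry "b" stands alone. Namespaces are left out to keep the sketch short.
SAMPLE = """
<body>
  <entry n="a">
    <sense>
      <entry n="a1"/>
      <entry n="a2"/>
    </sense>
  </entry>
  <entry n="b"/>
</body>
"""

def count_entries(root):
    """Return (top_level, nested) counts of <entry> elements under `root`."""
    total = sum(1 for _ in root.iter("entry"))   # every <entry>, at any depth
    top_level = len(root.findall("entry"))       # direct children of root only
    return top_level, total - top_level

root = ET.fromstring(SAMPLE)
top_level, nested = count_entries(root)
print(f"top-level entries: {top_level}, nested entries: {nested}")
```

Whether the nested ones count as "entries" then remains an editorial decision made at query time rather than something fixed by the element name.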
Whether the semantics differ depends on the actual editorial stance associated with the dictionary. In many cases, the fact that an entry appears as a sub- (or super-) entry is accidental, i.e. it results from practicalities. The point is that there are use cases where we need a more homogeneous representation framework, and we do not ask for the deprecation of |
To back up the claim of @laurentromary with two examples: There are many dictionaries that use nests of entries purely for reasons of text compression in print (pretty common e. g. for German dictionaries). Those nests consist of some kind of common header and then a list of (typographically) subordinated entries. They are typically typeset just like one big entry, i. e. as one big paragraph on the surface – but all those entries still exhibit exactly the same types of lexicographic sub-elements like the »normal« entries do. Put differently (and exactly along the lines of what @ttasovac said): they perfectly fit the content model of
A slightly different case can be seen in etymological dictionaries that may organize entries around word families. One form (typically a simplex form) comes first but often derivatives of this first headword may be discussed further on in what really is a separate (but embedded) entry. Derivatives may also have their own extensively described etymologies possibly deviating from the first headword and thus can be considered entries in their own right. It would be really elegant in my view to mark-up such a cluster of clearly typographically grouped entries as an
In any case, when using
If you need to pin down conceptual differences between types of entries it would be much more in line with common TEI practice to use the |
Council at F2F agrees to add |
In many dictionaries, multiword expressions (collocations) are written and explained under a sense node of the definition tree, in the entry for one of the central words of the expression. The location under a sense node is OK. However, the multiword expression itself has to be encapsulated in a cit element, and the definition of the expression in a sense element at the same level.
A cleaner and clearer model is to consider the multiword expression as a headword and describe its meaning and use in the standard way of an entry. That is, in a TEI-encoded dictionary an entry element must be allowed inside a sense element.
In the current TEI a sense element may contain:
dictionaries: def dictScrap etym form gramGrp lang oRef pRef re sense usg xr
My suggestion is to extend this to:
dictionaries: def dictScrap entry etym form gramGrp lang oRef pRef re sense usg xr
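To make the effect of the suggestion concrete, here is a minimal Python sketch that checks the direct children of every sense against the two lists above. The sample fragment ("horse" / "dark horse"), the helper name `disallowed_children`, and the omission of namespaces are assumptions for illustration. The check only covers the dictionary-module elements listed here (elements from other modules, such as cit, are not modelled), so it is no substitute for validating against the actual TEI schema.

```python
import xml.etree.ElementTree as ET

# Dictionary-module children of <sense>, as listed above.
CURRENT = set("def dictScrap etym form gramGrp lang oRef pRef re sense usg xr".split())
# The suggested extension: the same list plus <entry>.
PROPOSED = CURRENT | {"entry"}

# Hypothetical sample: a multiword expression encoded as an <entry> inside a <sense>.
SAMPLE = """
<entry>
  <form type="lemma"><orth>horse</orth></form>
  <sense n="1">
    <def>a large four-legged animal</def>
    <entry>
      <form type="lemma"><orth>dark horse</orth></form>
      <sense><def>a little-known competitor who unexpectedly succeeds</def></sense>
    </entry>
  </sense>
</entry>
"""

def disallowed_children(root, allowed):
    """Return sorted tag names found directly under any <sense> that are not in `allowed`."""
    found = set()
    for sense in root.iter("sense"):
        for child in sense:
            if child.tag not in allowed:
                found.add(child.tag)
    return sorted(found)

root = ET.fromstring(SAMPLE)
print("under the current list: ", disallowed_children(root, CURRENT))
print("under the proposed list:", disallowed_children(root, PROPOSED))
```

With the current list the nested entry is reported as not allowed; with the extended list the same fragment passes, and the multiword expression keeps the ordinary form/sense structure of an entry.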