New element annotatedU #539
We should probably see how we could also deal with such cases by means of the stand-off element. I see the two options as complementary flavors (for many pieces of speech annotation software an interleaved representation à la annotatedU is easier, whereas for some other use cases it is better to leave the primary transcription "untouched"). Original comment by: @laurentromary |
After going back and forth between the ISO proposal and the stdf proposal, I see the possibility of creating an element slightly more generic than annotatedU, which we could call annotationGrp. This element could be used to group together a series of annotations associated with the same primary object (e.g. the same u element), either by having this object as a child (i.e. what we wanted with annotatedU: a u with a series of spanGrp, for instance) or in stand-off mode within the annotations sub-element of stdf. The specification of this element could be as follows:
with the idea that model.annotationPart would be the hook where one could add any kind of internal or external annotation object. For instance, in my tests I make model.global.meta a member of this class to get spanGrp and the like in it. Original comment by: @laurentromary |
Generalizing is always nice. But what is "stdf" please? Original comment by: @lb42 |
There is also a GitHub project (https://github.com/laurentromary/stdfSpec), where I maintain updates on the stdf proposal and some samples, which show how annotatedU can be used inline or stand-off in relation to speech transcription. Original comment by: @laurentromary |
Original comment by: @lb42 |
Referring to the document at https://docs.google.com/document/d/1BTjYHSiPjD6GhKMNFmZrrvCkLQAa1RK7aGbG5K50uN4 Section 6.5.2 ("Representation as unclear or gap") says that when a string of words is unclear, and alternatives are proposed, the strings should each be wrapped in a separate span element (within choice, within unclear). I think this is meant to say "a separate seg element"; and indeed the examples given two sections later (6.5.4) use seg, not span. Probably just the usual code-switching problem between HTML span and TEI seg. Section 5.7 (6.7 as listed in the TOC) on "Global divisions" proposes that divisions of the transcription at levels superordinate to the utterance should be accomplished by the use of non-tessellating divs. Unless utterance and annotated utterance themselves are regarded as syntactic sugar for div type="utterance", this is surely a very un-TEI way of doing things. Do we really mean to slip floating divs into the scheme by this means? Original comment by: @PFSchaffner |
I have suggested a revision to the document precluding non-tessellating divs. In the meantime, do we have agreement on introducing a new <annotatedU> element, a spec for which would look something like this
Original comment by: @lb42 |
Original comment by: @laurentromary |
So you want to replace "annotatedU" with "annotationGrp" ? Original comment by: @lb42 |
Yes. See Thomas' last document. Original comment by: @laurentromary |
For the benefit of others trying to follow this ticket, "Thomas' last document" is an entirely new docx version of the googledoc, the existence of which I learned of about 20 minutes ago when he sent me a copy ! Original comment by: @lb42 |
The current version of this latest draft is now available from Original comment by: @lb42 |
Could we put this in a password-protected place? We may have a problem with ISO-copyrighted documents. (I am +not+ opening a debate, just mentioning.) Original comment by: @laurentromary |
Well, we have the wiki, but that is hardly secure. If you want to restrict access to this document, then clearly it is not yet ready for discussion by the TEI, so I will remove it. Original comment by: @lb42 |
This issue was originally assigned to SF user: louburnard |
The latest version of the ISO proposal has apparently renamed this element as "annotationGrp". Unfortunately, TEI naming conventions require that an element named xxxGrp contains only xxx elements, which is not the case here. Perhaps a better name might be "annotationUnit" or "annotationBlock" ? |
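The naming objection above can be sketched as a small check. This is only an illustration, taking the convention exactly as stated in the comment (an element named xxxGrp contains only xxx elements); the function name and the child lists are hypothetical and are not taken from any TEI specification.

```python
def follows_grp_convention(name, children):
    """Return True if `name` is not an xxxGrp, or if every child is an xxx element.

    Simplified reading of the convention as stated in this thread;
    real content models may allow more than this check does.
    """
    if not name.endswith("Grp"):
        return True
    base = name[: -len("Grp")]  # "spanGrp" -> "span"
    return all(child == base for child in children)


# A spanGrp holding only span elements fits the stated convention.
print(follows_grp_convention("spanGrp", ["span", "span"]))

# The proposed annotationGrp would hold a u and a spanGrp, so it does not.
print(follows_grp_convention("annotationGrp", ["u", "spanGrp"]))
```

Under this reading, a name without the Grp suffix, such as annotationBlock, is not subject to the constraint at all.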
I must say I like both (annotationUnit or annotationBlock). If a decision could be taken quickly by the Council, we would make sure that the final ISO publication refers to it. We actually presented the case in ISO as pending the naming decision by the TEI Council. |
So are we agreed on the following: If so, I'd appreciate some help confecting the latter. Laurent? Thomas? |
I am sending to Hugh the ISO document which is under balloting and from which the council can take up examples. Come back to me and Thomas (now subscribed to both tickets) for any additional information. |
I've now seen the PDF of the draft: it still says "annotationGrp" rather than "annotationBlock", but on |
Of course, since it is under ballot. We have already filed a comment requesting the change to annotationBlock. So please go ahead with the implementation. Please notify me and Thomas if anything is wrong. |
In which TEI module should <annotationBlock> be defined? In spoken or in analysis ? |
Clearly analysis. It is potentially a tool for grouping annotations related to quite a range of objects and, of course, an essential piece for standOff. |
I concur -- it would be ideal for it to sit in a standoff module, but since there is no such module (yet?), analysis is definitely the way to go. |
Some simple usage examples would be very helpful, if anyone has them. |
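A minimal sketch of what such a usage example might look like, assuming the inline case discussed in this ticket, where annotationBlock wraps one u together with one spanGrp tier. The xml:id values, the speaker pointer, the "pos" tier type and the tag labels are all hypothetical; the fragment is embedded in Python only so that its well-formedness and its pointers can be checked.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment: one utterance plus one annotation tier (part of speech).
FRAGMENT = f"""
<annotationBlock xmlns="{TEI_NS}" xml:id="ab1" who="#SPK0">
  <u xml:id="u1">
    <w xml:id="w1">Good</w>
    <w xml:id="w2">morning</w>
  </u>
  <spanGrp type="pos">
    <span from="#w1" to="#w1">ADJ</span>
    <span from="#w2" to="#w2">NN</span>
  </spanGrp>
</annotationBlock>
"""


def read_tiers(xml_text):
    """Return {tier type: [(token text, annotation), ...]} for one block."""
    block = ET.fromstring(xml_text)
    ns = {"tei": TEI_NS}
    # Index the tokens of the primary transcription by their xml:id.
    tokens = {w.get(XML_ID): w.text for w in block.iterfind("tei:u/tei:w", ns)}
    tiers = {}
    for grp in block.iterfind("tei:spanGrp", ns):
        pairs = []
        for span in grp.iterfind("tei:span", ns):
            # This sketch only handles spans that cover a single token.
            target = span.get("from").lstrip("#")
            pairs.append((tokens[target], span.text))
        tiers[grp.get("type")] = pairs
    return tiers


print(read_tiers(FRAGMENT))
```

The transcription in u stays readable on its own, while each further tier would be added as another spanGrp sibling that points back at the same tokens.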
Following a more in-depth discussion with @lb42 we suggest to make the content model of
The content model of These classes could be bootstrapped with typical TEI elements that would have the appropriate semantic for the corresponding function in annotationBlock:
In the case of a stand-off use of annotationBlock, we may consider either to make the annotableSegment optional or use |
@sydb wonders aloud (for @laurentromary to answer) if requiring the model.annotableSegment bit would get rid of the ambiguity that occurs when you want to annotate (with |
The issue of ambiguity is one for which I do not have an answer. In theory (if XML schemas were no headache), I would like to have the two model classes above. But in practice, we may just resolve to have one and provide written guidelines as to proper usage: for instance mapping this to the Open Annotation model as already alluded to in https://hal.inria.fr/hal-01254365 |
@hcayless : getting tired? The comment is not related to the ticket, is it? |
@laurentromary Wrong ticket. Deleted. |
Council to prod LB to prod LR. |
@laurentromary The element <annotationBlock> is now in the Guidelines. Can we close this issue? |
Yes. There will be a specific ticket for updating the content model of annotationBlock |
OK, thanks. Closing this one. |
[This is the second of a few tickets related to the TEI/ISO standard for transcriptions of spoken language: see http://bit.ly/1jyZC37 ]
It is usual to segment transcribed speech into smaller chunks for which the existing <u> element is appropriate. This proposal suggests a way of grouping each such chunk with one or more tiers of annotation, as is common practice.
Original comment by: @lb42