This is the phonetic counterpart to text content (<t>
); it allows associating a phonetic transcription with any structural element and is most often used in a speech context. Note that for actual segmentation into phonemes, FoLiA has another related type: Phonological Annotation
- Annotation Category
content_annotation_category
- Declaration
<phon-annotation set="...">
(note: set is optional for this annotation type; if you declare this annotation type to be setless you can not assign classes)
- Version History
Since v0.12
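As an illustration, a declaration of this annotation type might look as follows in the <annotations> block of the FoLiA header (the set URL shown is a hypothetical placeholder; as noted above, the set may also be omitted entirely):

```xml
<annotations>
  <!-- hypothetical set URL; omit the set attribute for a setless declaration -->
  <phon-annotation set="https://example.org/sets/phonetics.foliaset.ttl"/>
</annotations>
```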
- Element
<ph>
- API Class
PhonContent
(FoLiApy API Reference)
- Required Attributes
- Optional Attributes
set
-- The set of the element, ideally a URI linking to a set definition (see set_definitions) or otherwise a uniquely identifying string. The set must also be referred to in the annotation_declarations for this annotation type.
class
-- The class of the annotation, i.e. the annotation tag in the vocabulary defined by set.
processor
-- This refers to the ID of a processor in the provenance_data. The processor in turn defines exactly who or what was the annotator of the annotation.
annotator
-- This is an older alternative to the processor attribute, without support for full provenance. The annotator attribute simply refers to the name or ID of the system or human annotator that made the annotation.
annotatortype
-- This is an older alternative to the processor attribute, without support for full provenance. It is used together with annotator and specifies the type of the annotator: either manual for human annotators or auto for automated systems.
confidence
-- A floating point value between zero and one; it expresses the confidence the annotator places in the annotation.
datetime
-- The date and time when this annotation was recorded; the format is YYYY-MM-DDThh:mm:ss (note the literal T in the middle separating date from time), as per the XSD datetime data type.
tag
-- Contains a space-separated list of processing tags associated with the element. A processing tag carries arbitrary user-defined information that may aid in processing a document. It may carry cues on how a specific tool should treat a specific element. The tag vocabulary is specific to the tool that processes the document. Tags carry no intrinsic meaning for the data representation and should not be used except to inform/aid processors in their task. Processors are encouraged to clean up the tags they use. Ideally, published FoLiA documents at the end of a processing pipeline carry no further tags. For encoding actual data, use class
and optionally features instead.
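To illustrate, several of these optional attributes can be combined on a single <ph> element; in this sketch the processor ID, confidence value, and datetime are hypothetical:

```xml
<w xml:id="example.w.1">
  <!-- "p1" is a hypothetical processor ID declared in the provenance data -->
  <ph class="current" processor="p1" confidence="0.85"
      datetime="2021-06-01T10:15:00">wɝːld</ph>
  <t>world</t>
</w>
```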
- Accepted Data
<comment> (comment_annotation), <desc> (description_annotation)
- Valid Context
<current> (correction_annotation), <def> (definition_annotation), <div> (division_annotation), <event> (event_annotation), <ex> (example_annotation), <head> (head_annotation), <hiddenw> (hiddentoken_annotation), <list> (list_annotation), <morpheme> (morphological_annotation), <new> (correction_annotation), <note> (note_annotation), <original> (correction_annotation), <p> (paragraph_annotation), <part> (part_annotation), <phoneme> (phonological_annotation), <ref> (reference_annotation), <s> (sentence_annotation), <str> (string_annotation), <suggestion> (correction_annotation), <term> (term_annotation), <utt> (utterance_annotation), <w> (token_annotation)
Written text is always contained in the text content element (<t>
, see text_annotation
); for phonology there is a similar counterpart that behaves almost identically: <ph>
. This element holds a phonetic or phonological transcription and is used in a very similar fashion:
<utt src="helloworld.mp3" begintime="..." endtime="...">
<ph>helˈoʊ wɝːld</ph>
<w xml:id="example.utt.1.w.1" begintime="..." endtime="...">
<ph>helˈoʊ</ph>
</w>
<w xml:id="example.utt.1.w.2" begintime="..." endtime="...">
<ph>wɝːld</ph>
</w>
</utt>
Like text content (text_annotation
), the <ph>
element supports the offset
attribute, which refers to the offset within the phonetic transcription of the parent structure, the first index being zero. It also supports multiple classes (analogous to text classes), the implicit default and predefined class being current
. You could imagine using this for different notation systems (IPA, SAMPA, pinyin, etc.).
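For instance, one could hypothetically carry two notation systems side by side by using distinct classes; the class names ipa and sampa below are illustrative and would have to be defined in the respective set:

```xml
<w xml:id="example.utt.1.w.2">
  <!-- two transcriptions of the same word, distinguished by class -->
  <ph class="ipa">wɝːld</ph>
  <ph class="sampa">w3:ld</ph>
  <t>world</t>
</w>
```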
Phonetic transcription and text content can also go together without problem:
<utt>
<ph>helˈoʊ wɝːld</ph>
<t>hello world</t>
<w xml:id="example.utt.1.w.1">
<ph offset="0">helˈoʊ</ph>
<t offset="0">hello</t>
</w>
<w xml:id="example.utt.1.w.2">
<ph offset="8">wɝːld</ph>
<t offset="6">world</t>
</w>
</utt>
Note
You should still use regular text content (text_annotation
) for the textual transcription of the speech; this annotation type is reserved for phonetic/phonological transcriptions.
If you want to actually do segmentation into phonemes, see phonological_annotation
.
A simple example document:
../../examples/speech.2.0.0.folia.xml