Structure annotation introducing vertical whitespace
- Annotation Category
structure_annotation_category
- Declaration
<whitespace-annotation set="...">
(note: ``set`` is optional for this annotation type)
- Version History
Since the beginning
- Element
<whitespace>
- API Class
Whitespace
- Required Attributes
- Optional Attributes
xml:id
-- The ID of the element; this has to be unique in the entire document or collection of documents (corpus). All identifiers in FoLiA are of the XML NCName datatype, which roughly means that the ID is a string that has to start with a letter (not a number or symbol), may contain numbers, but may never contain colons or spaces. FoLiA does not define any naming convention for IDs.
set
-- The set of the element, ideally a URI linking to a set definition (see set_definitions) or otherwise a uniquely identifying string. The ``set`` must also be referred to in the annotation_declarations for this annotation type.
class
-- The class of the annotation, i.e. the annotation tag in the vocabulary defined by ``set``.
processor
-- This refers to the ID of a processor in the provenance_data. The processor in turn defines exactly who or what was the annotator of the annotation.
annotator
-- This is an older alternative to the ``processor`` attribute, without support for full provenance. The annotator attribute simply refers to the name or ID of the system or human annotator that made the annotation.
annotatortype
-- This is an older alternative to the ``processor`` attribute, without support for full provenance. It is used together with ``annotator`` and specifies the type of the annotator, either ``manual`` for human annotators or ``auto`` for automated systems.
confidence
-- A floating point value between zero and one; expresses the confidence the annotator places in the annotation.
datetime
-- The date and time when this annotation was recorded. The format is ``YYYY-MM-DDThh:mm:ss`` (note the literal T in the middle to separate date from time), as per the XSD dateTime data type.
n
-- A number in a sequence, corresponding to a number in the original document, for example chapter numbers, section numbers, list item numbers. This does not have to be an actual number; other sequence identifiers are also possible (think alphanumeric characters or roman numerals).
space
-- This attribute indicates whether spacing should be inserted after this element. Its default value is always ``yes``, so it does not need to be specified in that case, but if tokens or other structural elements are glued together then the value should be set to ``no``. This allows for reconstruction of the detokenised original text.
src
-- Points to a file or full URL of a sound or video file. This attribute is inheritable.
begintime
-- A timestamp in ``HH:MM:SS.MMM`` format, indicating the begin time of the speech. If a sound clip is specified (``src``), the timestamp refers to a location in the sound clip.
endtime
-- A timestamp in ``HH:MM:SS.MMM`` format, indicating the end time of the speech. If a sound clip is specified (``src``), the timestamp refers to a location in the sound clip.
speaker
-- A string identifying the speaker. This attribute is inheritable. Multiple speakers are not allowed; simply do not specify a speaker on a certain level if you are unable to link the speech to a specific (single) speaker.
- Accepted Data
<alt> (alternative_annotation), <altlayers> (alternative_annotation), <comment> (comment_annotation), <correction> (correction_annotation), <desc> (description_annotation), <metric> (metric_annotation), <part> (part_annotation), <relation> (relation_annotation)
- Valid Context
<def> (definition_annotation), <div> (division_annotation), <event> (event_annotation), <ex> (example_annotation), <head> (head_annotation), <note> (note_annotation), <p> (paragraph_annotation), <ref> (reference_annotation), <s> (sentence_annotation), <term> (term_annotation)
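The containment rules listed under Accepted Data and Valid Context can be checked mechanically. The following is a minimal sketch using only the Python standard library, not the official FoLiA tooling; the function names, the sample fragment, and its IDs are hypothetical, and the namespace URI is assumed to be the one your FoLiA documents declare. It also checks the ``confidence`` range described under Optional Attributes.

```python
import xml.etree.ElementTree as ET

# Assumption: the FoLiA namespace URI; verify it against your own documents.
FOLIA_NS = "http://ilk.uvt.nl/folia"

# Tag names copied from the "Valid Context" and "Accepted Data" lists.
VALID_CONTEXT = {"def", "div", "event", "ex", "head", "note", "p", "ref", "s", "term"}
ACCEPTED_DATA = {"alt", "altlayers", "comment", "correction", "desc", "metric",
                 "part", "relation"}


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def check_whitespace_elements(root):
    """Return a list of problems found for <whitespace> elements (hypothetical helper)."""
    problems = []
    for parent in root.iter():
        for child in parent:
            if child.tag != f"{{{FOLIA_NS}}}whitespace":
                continue
            # Rule 1: the parent must be one of the valid context elements.
            if local_name(parent.tag) not in VALID_CONTEXT:
                problems.append(
                    f"<whitespace> not allowed inside <{local_name(parent.tag)}>")
            # Rule 2: only the accepted data elements may appear as children.
            for grandchild in child:
                if local_name(grandchild.tag) not in ACCEPTED_DATA:
                    problems.append(
                        f"<{local_name(grandchild.tag)}> not allowed inside <whitespace>")
            # Rule 3: confidence, if present, must be a float between zero and one.
            confidence = child.get("confidence")
            if confidence is not None:
                try:
                    in_range = 0.0 <= float(confidence) <= 1.0
                except ValueError:
                    in_range = False
                if not in_range:
                    problems.append(f"confidence {confidence!r} is not between 0 and 1")
    return problems


# Hypothetical fragment: the first <whitespace> is fine, the second is not.
SAMPLE = f"""
<div xmlns="{FOLIA_NS}" xml:id="example.div.1">
  <p xml:id="example.p.1"><whitespace/></p>
  <w xml:id="example.w.1"><whitespace confidence="1.5"/></w>
</div>
"""

problems = check_whitespace_elements(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```

This sketch covers only the three rules named in its comments; it does not validate IDs, sets, or declarations, for which a full FoLiA validator is the appropriate tool.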
Sometimes you may want to explicitly specify vertical whitespace rather than repeat multiple linebreaks (linebreak_annotation); the <whitespace> element accomplishes this. Note that using <p> to denote paragraphs is always strongly preferred over using <whitespace> to mark their boundaries; this element should be used sparingly!
The difference between <br> and <whitespace> is that the former specifies only that a linebreak was present, without forcing any vertical whitespace between the lines, whilst the latter actually generates an empty space, which would be comparable to two successive <br> elements. Both elements can be used inside various structural elements, such as divisions, paragraphs, headers, and sentences.
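To make the distinction concrete, here is a minimal sketch of a plain-text renderer that emits one newline for each <br> and two for each <whitespace>, following the comparison above. It uses only the Python standard library rather than the FoLiA library; the ``render`` function, the sample paragraph, and its IDs are hypothetical, and the namespace URI is assumed. It deliberately ignores the ``space`` attribute, text classes, and elements that carry their own text alongside that of their children.

```python
import xml.etree.ElementTree as ET

# Assumption: the FoLiA namespace URI; verify it against your own documents.
FOLIA_NS = "http://ilk.uvt.nl/folia"


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def render(element):
    """Render text content, turning <br> into one newline and <whitespace> into two."""
    parts = []
    for child in element:
        name = local_name(child.tag)
        if name == "t":
            parts.append(child.text or "")   # text content element
        elif name == "br":
            parts.append("\n")               # linebreak only, no blank line
        elif name == "whitespace":
            parts.append("\n\n")             # comparable to two successive <br>
        else:
            parts.append(render(child))      # descend into other structure
    return "".join(parts)


# Hypothetical paragraph: a linebreak after the first sentence,
# vertical whitespace after the second.
SAMPLE = f"""
<p xmlns="{FOLIA_NS}" xml:id="example.p.1">
  <s xml:id="example.p.1.s.1"><t>First line.</t></s>
  <br/>
  <s xml:id="example.p.1.s.2"><t>Second line.</t></s>
  <whitespace/>
  <s xml:id="example.p.1.s.3"><t>After a gap.</t></s>
</p>
"""

text = render(ET.fromstring(SAMPLE))
print(text)
```

In the printed result the second sentence starts directly on the next line, while the third is separated from it by an empty line.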