Sentiment Annotation

Sentiment analysis marks subjective information such as sentiments or attitudes expressed in text. The available sentiments/attitudes are specified in a user-defined set definition.

Note

This annotation type is deprecated because it overlaps with modality annotation (_modality_annotation). Modality annotation is now preferred over sentiment annotation, as it is more generic.

Specification

Annotation Category

span_annotation_category

Declaration

<sentiment-annotation set="..."> (note: set is optional for this annotation type; if you declare this annotation type to be setless, you cannot assign classes)

Version History

since v1.3

Element

<sentiment>

API Class

Sentiment (FoLiApy API Reference)

Layer Element

<sentiments>

Span Role Elements

<hd> (Headspan), <source> (Source), <target> (Target)

Required Attributes
Optional Attributes
  • xml:id -- The ID of the element; this has to be unique in the entire document or collection of documents (corpus). All identifiers in FoLiA are of the XML NCName datatype, which roughly means that an ID is a string that has to start with a letter (not a number or symbol), may contain numbers, but may never contain colons or spaces. FoLiA does not define any naming convention for IDs.
  • set -- The set of the element, ideally a URI linking to a set definition (see set_definitions) or otherwise a uniquely identifying string. The set must be referred to also in the annotation_declarations for this annotation type.
  • class -- The class of the annotation, i.e. the annotation tag in the vocabulary defined by set.
  • processor -- This refers to the ID of a processor in the provenance_data. The processor in turn defines exactly who or what was the annotator of the annotation.
  • annotator -- This is an older alternative to the processor attribute, without support for full provenance. The annotator attribute simply refers to the name or ID of the system or human annotator that made the annotation.
  • annotatortype -- This is an older alternative to the processor attribute, without support for full provenance. It is used together with annotator and specifies the type of the annotator, either manual for human annotators or auto for automated systems.
  • confidence -- A floating point value between zero and one; expresses the confidence the annotator places in the annotation.
  • datetime -- The date and time when this annotation was recorded, the format is YYYY-MM-DDThh:mm:ss (note the literal T in the middle to separate date from time), as per the XSD Datetime data type.
  • n -- A number in a sequence, corresponding to a number in the original document, for example chapter numbers, section numbers, list item numbers. This does not have to be an actual number; other sequence identifiers are also possible (think alphanumeric characters or Roman numerals).
  • textclass -- Refers to the text class this annotation is based on. This is an advanced attribute; if not specified, it defaults to current. See textclass_attribute.
  • src -- Points to a file or full URL of a sound or video file. This attribute is inheritable.
  • begintime -- A timestamp in HH:MM:SS.MMM format, indicating the begin time of the speech. If a sound clip is specified (src), the timestamp refers to a location in the sound clip.
  • endtime -- A timestamp in HH:MM:SS.MMM format, indicating the end time of the speech. If a sound clip is specified (src), the timestamp refers to a location in the sound clip.
  • speaker -- A string identifying the speaker. This attribute is inheritable. Multiple speakers are not allowed; simply do not specify a speaker on a certain level if you are unable to link the speech to a specific (single) speaker.
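
The value formats described in the attribute list above can be checked mechanically. The following is a minimal sketch in plain Python, not part of FoLiA or FoLiApy: the helper name ``check_attributes`` is hypothetical, and the identifier pattern is a simplified, ASCII-only approximation of the XML NCName rule rather than the full definition.

```python
import re
from datetime import datetime

# Simplified, ASCII-only approximation of an NCName (assumption: the real
# XML rule permits many more characters than this pattern does).
NCNAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*$")
# HH:MM:SS.MMM, as used by begintime and endtime.
TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}$")


def check_attributes(attrs):
    """Return a list of problems found in a dict of attribute values.

    Hypothetical helper for illustration; keys are attribute names as
    written in the XML, values are the raw strings.
    """
    problems = []
    if "xml:id" in attrs and not NCNAME_RE.match(attrs["xml:id"]):
        problems.append("xml:id is not a valid identifier")
    if "confidence" in attrs:
        try:
            value = float(attrs["confidence"])
        except ValueError:
            value = None
        if value is None or not 0.0 <= value <= 1.0:
            problems.append("confidence must be a number between 0 and 1")
    if "datetime" in attrs:
        try:
            # Note the literal T separating date and time.
            datetime.strptime(attrs["datetime"], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            problems.append("datetime must look like YYYY-MM-DDThh:mm:ss")
    if "annotatortype" in attrs and attrs["annotatortype"] not in ("manual", "auto"):
        problems.append("annotatortype must be 'manual' or 'auto'")
    for key in ("begintime", "endtime"):
        if key in attrs and not TIME_RE.match(attrs[key]):
            problems.append(key + " must look like HH:MM:SS.MMM")
    return problems
```

An empty list means none of these checks failed; it does not imply that the element is valid FoLiA, since set definitions and declarations are not consulted here.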
Accepted Data

<comment> (comment_annotation), <desc> (description_annotation), <metric> (metric_annotation), <relation> (relation_annotation)

Valid Context

<sentiments> (sentiment_annotation)

Feature subsets (extra attributes)
  • polarity
  • strength

Explanation

Note

Please first ensure you are familiar with the general principles of span_annotation_category to make sense of this annotation type.

Sentiment analysis marks subjective information such as sentiments or attitudes expressed in text. The <sentiment> span annotation element is used to this end. It is embedded in a <sentiments> layer.

The <sentiment> element takes the following span roles:

  • <hd> -- (required) -- The head of the sentiment; it expresses the actual sentiment and covers word spans such as "happy", "very satisfied", or "highly disappointed".
  • <source> -- (optional) -- The source/holder of the sentiment, assuming it is explicitly expressed in the text.
  • <target> -- (optional) -- The target/recipient of the sentiment, assuming it is explicitly expressed in the text.
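
The span roles above can be illustrated by assembling a ``<sentiment>`` element programmatically. This is a rough sketch using only Python's standard ``xml.etree.ElementTree``, not the FoLiApy API; the helper ``build_sentiment`` and the word IDs are hypothetical, and the sketch assumes that each role refers to its words through ``<wref id="..."/>`` children, as is usual for FoLiA span annotation.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Serialise FoLiA elements without a prefix (default namespace).
ET.register_namespace("", FOLIA_NS)


def q(tag):
    """Qualify a tag name with the FoLiA namespace."""
    return "{%s}%s" % (FOLIA_NS, tag)


def build_sentiment(sentiment_id, cls, head_ids, source_ids=(), target_ids=()):
    """Build a <sentiments> layer holding one <sentiment> (hypothetical helper).

    head_ids, source_ids and target_ids are sequences of word IDs; only the
    head is required, mirroring the role descriptions above.
    """
    if not head_ids:
        raise ValueError("the <hd> role is required")
    layer = ET.Element(q("sentiments"))
    sentiment = ET.SubElement(
        layer, q("sentiment"),
        {"{%s}id" % XML_NS: sentiment_id, "class": cls},
    )
    roles = (("source", source_ids), ("hd", head_ids), ("target", target_ids))
    for role, word_ids in roles:
        if not word_ids:
            continue  # optional roles are simply omitted
        span = ET.SubElement(sentiment, q(role))
        for word_id in word_ids:
            ET.SubElement(span, q("wref"), {"id": word_id})
    return layer


if __name__ == "__main__":
    layer = build_sentiment(
        "example.s.1.sentiment.1", "disappointment",
        head_ids=["example.s.1.w.3"], source_ids=["example.s.1.w.1"],
    )
    print(ET.tostring(layer, encoding="unicode"))
```

The resulting layer would still need to be placed in a structural element of a document that declares sentiment annotation; that part is left out here.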

The following feature subsets are predefined (see features). Whether they are actually used depends on the set, and their values (classes) are set-dependent as well:

  • polarity -- Expresses whether the sentiment is positive, neutral or negative.
  • strength -- Expresses the strength or intensity of the sentiment.

Besides these predefined features, FoLiA's feature mechanism can be used to associate other custom properties with any sentiment.
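
To make the distinction between predefined and custom features concrete, here is a small sketch that collects both from a parsed ``<sentiment>`` element using Python's standard library. It assumes that the predefined subsets are written as XML attributes and that custom features appear as ``<feat subset="..." class="..."/>`` children; the ``irony`` subset, all class values, and the helper ``sentiment_features`` are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"f": "http://ilk.uvt.nl/folia"}

# Illustrative fragment only; the classes and the "irony" subset are
# hypothetical and would have to be defined by the set in use.
SAMPLE = """<sentiments xmlns="http://ilk.uvt.nl/folia">
  <sentiment class="disappointment" polarity="negative" strength="strong">
    <hd><wref id="example.s.1.w.3"/></hd>
    <feat subset="irony" class="no"/>
  </sentiment>
</sentiments>"""


def sentiment_features(sentiment):
    """Return a dict mapping feature subset to class for one <sentiment>."""
    features = {}
    # Predefined subsets, assumed to be expressed as plain attributes.
    for subset in ("polarity", "strength"):
        value = sentiment.get(subset)
        if value is not None:
            features[subset] = value
    # Custom features, assumed to be expressed as <feat> children.
    for feat in sentiment.findall("f:feat", NS):
        features[feat.get("subset")] = feat.get("class")
    return features


if __name__ == "__main__":
    layer = ET.fromstring(SAMPLE)
    for sentiment in layer.findall("f:sentiment", NS):
        print(sentiment.get("class"), sentiment_features(sentiment))
```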

Example

../../examples/sentiments.2.0.0.folia.xml