Skip to content

Latest commit

 

History

History
72 lines (54 loc) · 2.46 KB

File metadata and controls

72 lines (54 loc) · 2.46 KB

Encoding language information in the metadata of content resources

Concreteness: concrete
Strength: mandatory
Category: WG1,WG2
Status: draft

See also

Language is one of the most important features for discovering content resources and for ensuring the proper operation of processing components. Therefore, it is important not only to include it in the metadata descriptions of resources but also to normalise it.

For normalistaion purposes, OpenMinTeD recommends the use of the IETF’s BCP 47 (https://tools.ietf.org/html/bcp47), which specifies syntax for language tags, so as to include language, region, script and variant tags in one code.

Language information is mandatory and can be repeated for multilingual resources.

The metadata elements to be used are:

  • 'language' for the language of the contents of the resource: mandatory language tag, which is concatenated by mandatory language identifier (from the ISO 639 codes) and optional script, region and variant tags

  • 'metalanguage': can be optionally used for lexical/conceptual resources and language descriptions, for the language that is used to describe a resource (e.g. a lexicon of Greek terms with definitions in English is considered as having 'language' Greek and 'metalanguage' English).

Table 1. Compliance assessment
Product Version Compliant Justification Status

CORE

Full

Language is provided where available through the metadata of the aggregated resource or inferred through the full text

Draft

OpenAIRE

Full

Language is provided where available through the metadata of the aggregated resource.

Draft

Frontiers

No

English only supported

Draft

TheSoz

June-16

Full

via language attribute for each entity in the thesaurus

Final

Agrovoc

21/01/2016

Full

Labels are associated with language attributes, see models linked from http://aims.fao.org/vest-registry/vocabularies/agrovoc-multilingual-agricultural-thesaurus

Final

OLiA

June-16

No

Language only indirectly deducible from the description of the resource in the data type property 'comment' value.

Final

LAPPS

June-16

Partial

The LAPPS Web Service Exchange Vocabulary defines a language property at the document level, also recommending BCP 47 (also known as RFC 3066).

Final