Linked Places TSV v0.2 (LP-TSV)

15 Sep 2019

The following TSV place data format will be supported for contributions to the World-Historical Gazetteer (WHG) system as a simplified alternative to the more expressive default format, Linked Places (LP). LP-TSV is suitable for relatively simple place records, e.g. those without a) temporal scoping for names, geometries, types, or relations; and/or b) citations for name variants.

LP-TSV files are Unicode text (UTF-8). Fields (columns) must be delimited with a tab character. Where multiple values are allowed within a field (indicated below), they are unquoted and delimited with semicolons. The following fields will be parsed and converted to Linked Places format automatically upon upload to WHG.
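
The delimiter rules above can be sketched with Python's standard csv module. This is only an illustration, not WHG's actual parser: it assumes the file begins with a header row of field names, the function name `read_lptsv` and the sample record are hypothetical, and the set of fields split on semicolons is limited to names mentioned in this document. Positional fields such as aat_types are deliberately left unsplit here, because their empty positions are meaningful.

```python
import csv
import io

# Multi-value fields handled by this sketch (an assumption, not an official list).
MULTI_VALUE_FIELDS = {"ccodes", "variants", "types"}

def read_lptsv(text):
    """Parse LP-TSV text into a list of dicts, splitting multi-value fields."""
    # Values are unquoted, so quote characters are treated as ordinary text.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        for field in MULTI_VALUE_FIELDS:
            if field in row:
                value = row[field] or ""
                # Split on semicolons and drop empty entries.
                row[field] = [v.strip() for v in value.split(";") if v.strip()]
        rows.append(row)
    return rows

# Hypothetical sample: a header row plus one record.
sample = (
    "id\ttitle\ttitle_source\tstart\tccodes\tvariants\n"
    "1001\tAbingdon\tExample Gazetteer\t1086\tGB\tAbbandune;Abingdon-on-Thames\n"
)
records = read_lptsv(sample)
print(records[0]["variants"])
```

To read a real file, the same function could be given the result of `open(path, encoding="utf-8").read()`.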

NB Records consisting of only the required id, title and title_source are exceedingly difficult to reconcile (i.e. match) with records in existing gazetteer data resources such as Getty TGN, Wikidata, DBpedia, and GeoNames. Additional context, e.g. ccodes, variants, types and geometry, helps considerably.


## required ##


id: Contributor's internal identifier. This must stay consistent throughout the accessioning workflow, including subsequent updates.


title: A single "preferred" place name, which acts as the title for the record


title_source: Label or short citation for the source of the title toponym


start: Earliest relevant date in ISO 8601 form (YYYY-MM-DD); omit month and/or day as required. BCE years must be expressed as negative integers, e.g. -0320 for 320 BCE. To express a range, use a pair of dates separated by a slash, e.g. -0299/-0200 would indicate "starting in 3rd century BCE."


  • A start date or date range is required; an end date or date range is optional. Both refer to the entire place record. Use start & end values in combination to indicate a valid period. A start value alone typically indicates the publication date of the title_source. The start and end values correspond to a timespan within a "when" object at the record level in the full Linked Places JSON-LD format.
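
As an illustration of the date rules for start and end, the following sketch splits a value into one or two dates and reads each as a year with optional month and day. The function names are hypothetical, and the pattern is an assumption about what a validator might accept (years of one to four digits with an optional leading minus sign); it is not WHG's own validation.

```python
import re

# Year (optionally negative), then optional -MM and -DD parts.
DATE_RE = re.compile(r"^(-?\d{1,4})(?:-(\d{2}))?(?:-(\d{2}))?$")

def parse_date(value):
    """Return (year, month, day); month and day are None when omitted."""
    m = DATE_RE.match(value.strip())
    if not m:
        raise ValueError(f"not a recognised date: {value!r}")
    year, month, day = m.groups()
    return (int(year), int(month) if month else None, int(day) if day else None)

def parse_date_field(value):
    """Return a list of one date, or two dates when a range is given."""
    parts = value.strip().split("/")
    if len(parts) > 2 or not all(parts):
        raise ValueError(f"expected a date or a pair of dates: {value!r}")
    return [parse_date(p) for p in parts]

print(parse_date_field("-0299/-0200"))
print(parse_date_field("1086-05-12"))
```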

## encouraged ##


URI for the source of the title toponym


ccodes: One or more ISO Alpha-2 two-letter codes for modern countries that overlap or cover the place in question. N.B. These are used to generate a constraining geometry for searches, and are not interpreted as assertions of a relation between the entities. Semicolon-delimited.


One or more URIs for matching record(s) in place name authority resources; interpreted as SKOS:closeMatch, which is "used to link two concepts that are sufficiently similar that they can be used interchangeably in some information retrieval applications" and is inclusive of its sub-property SKOS:exactMatch. Semicolon-delimited.


variants: One or more name and/or language variants; each can be suffixed with language-script codes if available, per IETF best practice BCP 47, using ISO 639-1 2-letter codes for language and ISO 15924 4-letter codes for script, e.g. {name}@lang-script. NB Omit the script code if it is the default for a language. Semicolon-delimited.
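
A minimal sketch of how a {name}@lang-script value might be taken apart. The function names and the example toponyms are hypothetical, and the code does no validation of the language or script codes themselves.

```python
def parse_variant(entry):
    """Split one variant into its name, language code and script code."""
    # Split on the last '@' so a name that itself contains '@' stays intact.
    name, sep, tag = entry.strip().rpartition("@")
    if not sep:
        # No '@' present: the whole entry is the name.
        return {"name": tag, "lang": None, "script": None}
    lang, _, script = tag.partition("-")
    return {"name": name, "lang": lang or None, "script": script or None}

def parse_variants(value):
    """Parse a semicolon-delimited list of variants, skipping empty entries."""
    return [parse_variant(v) for v in value.split(";") if v.strip()]

for variant in parse_variants("Athen@de;Atina@sr-Latn;Athens"):
    print(variant)
```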


types: One or more terms for place type (contributor's term, usually verbatim from the source, e.g. pueblo). Semicolon-delimited.


aat_types: One or more AAT integer IDs from WHG's subset list of 160 place type concepts (available as tsv, and as xlsx showing hierarchy). While not required, this mapping will make records more discoverable in WHG interfaces. NOTE: aat_types should correspond to types, 1-to-1. If there is no corresponding aat_type, leave its position empty. E.g. if there are 4 types for a record and only those in positions 2 and 3 have a corresponding aat_type, this field's value would be something like ;1234567;2345678; (the first and last positions are empty). Semicolon-delimited.
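
The positional correspondence between types and aat_types can be sketched as follows. The function name `pair_types` and the four type terms are hypothetical; the integer IDs are the placeholder values used in the text, not real AAT identifiers. The sketch assumes that a short aat_types value should be padded with empty positions and that a longer one is an error.

```python
def pair_types(types_value, aat_value):
    """Pair each type term with its AAT ID, or None where the position is empty."""
    types = [t.strip() for t in types_value.split(";")]
    # Keep empty strings here: an empty position means "no AAT mapping".
    aat = [a.strip() for a in aat_value.split(";")]
    if len(aat) > len(types):
        raise ValueError("more aat_types positions than types")
    # Pad missing trailing positions with empties.
    aat += [""] * (len(types) - len(aat))
    return [(t, int(a) if a else None) for t, a in zip(types, aat)]

pairs = pair_types("pueblo;villa;ciudad;presidio", ";1234567;2345678;")
print(pairs)
```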

## optional ##


parent_name: A single toponym for a containing place


Either 1) a URI for a web-published record of the parent_name above, or 2) a pointer to another record in the same datafile, consisting of a '#' character followed by the id of the parent record; e.g. "#1234"
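
The two forms of parent reference described above could be told apart along these lines. This is a sketch under assumptions: the function name, the `records_by_id` lookup and the example URI are hypothetical, and treating an unknown local id as an error is a design choice of this example rather than documented WHG behaviour.

```python
def resolve_parent(pointer, records_by_id):
    """Classify a parent reference as a local record pointer or an external URI."""
    pointer = pointer.strip()
    if pointer.startswith("#"):
        # '#' followed by the id of another record in the same datafile.
        local_id = pointer[1:]
        if local_id not in records_by_id:
            raise KeyError(f"no record with id {local_id!r} in this file")
        return ("local", local_id)
    return ("uri", pointer)

records_by_id = {"1234": {"id": "1234", "title": "Example parent"}}
print(resolve_parent("#1234", records_by_id))
print(resolve_parent("https://example.org/places/42", records_by_id))
```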


lon: Longitude, in decimal degrees


lat: Latitude, in decimal degrees


geowkt: Any geometry in OGC WKT format.


  • polygons should ideally be simplified to aid rendering performance, using e.g. a GIS function or MapShaper
  • geowkt will supersede the lon/lat pair if both are supplied; it is typically used for non-point geometry
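
The precedence rule for geometry can be illustrated with a small sketch: use geowkt when it is present, otherwise build a WKT point from the lon/lat pair. The function name and sample rows are hypothetical, rows are assumed to be dicts of strings keyed by field name, and the WKT itself is not validated.

```python
def record_geometry_wkt(row):
    """Return the WKT geometry for a row, preferring geowkt over lon/lat."""
    wkt = (row.get("geowkt") or "").strip()
    if wkt:
        return wkt
    lon = (row.get("lon") or "").strip()
    lat = (row.get("lat") or "").strip()
    if lon and lat:
        # WKT coordinate order is x y, i.e. longitude then latitude.
        return f"POINT ({float(lon)} {float(lat)})"
    return None

print(record_geometry_wkt({"lon": "-1.2833", "lat": "51.6667"}))
print(record_geometry_wkt({
    "lon": "-1.2833",
    "lat": "51.6667",
    "geowkt": "LINESTRING (0 0, 1 1)",
}))
```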


Label for source of the geometry, e.g. GeoNames


URI identifier for the source of the geometry, e.g.


end: Latest relevant date, in the same form as start; a pair of dates indicates an ending range


A short text description of the place

@kgeographer; 20190819