woeplanet-data/woeplanet-data
 woeplanet-data

“Everything is related to everything else, but near things are more related than distant things.”

Tobler’s First Law of Geography

The Short Version

WoePlanet is WhereOnEarth (AKA WOE, AKA GeoPlanet) data, based on Yahoo's GeoPlanet Data download. Some, but not all, of the original hierarchy linkages have been reinstated, and some, but not all, centroid and polygon coordinates have been added in from other data sources. The data is formatted as GeoJSON, one file per WOEID, and organised by GeoPlanet place type.

Per-country and per-placetype repositories exist as part of the GitHub woeplanet organisation, with the following naming convention ...

woeplanet-[placetype]-[iso 3166-1 alpha-2 code]

So data for placetype 8 (state) for the United Kingdom would live in the following repo ...

woeplanet-state-gb

In some cases, even after splitting repos by placetype and country, a repo gets close to GitHub's theoretical maximum repo size. In these cases the data is further sub-partitioned by the first character of the place's name, so data for placetype 11 (zip) for the United Kingdom starting with A lives in the following repo ...

woeplanet-zip-gb-a
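
The naming convention above can be sketched as a small helper. This is only an illustration: the function name `repo_name` and its parameters are hypothetical and not part of any WoePlanet tooling, and lower-casing every component is an assumption drawn from the two example names.

```python
from typing import Optional


def repo_name(placetype: str, country_code: str, first_char: Optional[str] = None) -> str:
    """Build a repo name of the form woeplanet-[placetype]-[country code].

    first_char is the optional sub-partition key (the first character of
    the place's name) used when a repo would otherwise grow too large.
    """
    # Assumption: every component is lower-cased, as in the README examples.
    parts = ["woeplanet", placetype.lower(), country_code.lower()]
    if first_char:
        # Only the first character is used for sub-partitioning.
        parts.append(first_char[0].lower())
    return "-".join(parts)


print(repo_name("state", "GB"))     # woeplanet-state-gb
print(repo_name("zip", "GB", "A"))  # woeplanet-zip-gb-a
```

How names that start with a digit or a non-Latin character are partitioned is not covered here, so the helper simply passes such a character through unchanged.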

At the time of writing there are 2317 repos in the whosonfirst-data organisation, so a manifest.json file that details each repo, and is in turn based on each repo's meta/meta.json, might be useful.

Country Codes

The country code in the WoePlanet repo naming convention follows the ISO 3166-1 alpha-2 standard, but it's worth drawing attention to ...

  • XS is used for disputed territories or regions
  • ZZ is used for unknown or invalid territories or regions, which often means (shrug emoji) the world is a complicated place at times

These two codes differ from the convention used by Who's On First and should probably be aligned at some point. But not today.
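
As a rough sketch, the two special codes could be handled like this when working with repo names. The names `SPECIAL_COUNTRY_CODES` and `describe_country_code` are hypothetical, and treating every other two-letter code as a regular country code is an assumption; nothing here validates a code against the actual ISO list.

```python
# The two special codes described above; the descriptions paraphrase the README.
SPECIAL_COUNTRY_CODES = {
    "XS": "disputed territory or region",
    "ZZ": "unknown or invalid territory or region",
}


def describe_country_code(code: str) -> str:
    """Return a short description of a two-letter WoePlanet country code."""
    code = code.strip().upper()
    # Only the shape of the code is checked: two ASCII letters.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a two-letter country code, got {code!r}")
    # Assumption: anything that is not XS or ZZ is a regular country code.
    return SPECIAL_COUNTRY_CODES.get(code, "ISO 3166-1 alpha-2 country code")


print(describe_country_code("xs"))
print(describe_country_code("GB"))
```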

Who's On First?

Any similarity between Woeplanet and Who's On First is entirely coincidental. Inspiration for Who's On First came from Geoplanet, Yahoo's (now defunct) global gazetteer, and inspiration for Woeplanet came from Who's On First, especially its repository naming conventions and GeoJSON properties.

The Long(er) Version

Forking GeoPlanet one place at a time ...

This is an active but still experimental community-driven project to maintain, update and make public, in a usable form, the Creative Commons licensed GeoPlanet data set.

Right at the very start was a set of closely coupled software and data called GeoPlanet, developed by UK-based GDC (Graphical Data Capture). The heart of GeoPlanet was a hierarchically linked set of geographic place definitions: for any place you could determine all the language and colloquial name variants, the parent(s), the children and the adjacent or neighbouring places. Each entry in GeoPlanet was assigned a unique 32-bit identifier, known as a Whereonearth ID, or WOEID. GDC was subsequently acquired by Whereonearth in 1998, with the software component rebranded as InternetLocality and the data retaining the GeoPlanet name.

In 2005, Whereonearth was itself acquired by Yahoo!, initially to form part of the internal search marketing program, internally named Panama. Four years later, as part of the Yahoo! Developer Network, a set of APIs based on InternetLocality was publicly released as GeoPlanet and Placemaker. To avoid confusion over the use of the GeoPlanet name, the data set was internally rebranded as WhereHaüs (pun fully intentional) and GeoPlanet was used solely to refer to the API. At the same time, a version of the data set was released publicly under a CC-BY license as GeoPlanet Data.

GeoPlanet Data contained some of the hierarchical richness of the internal data, including parent and adjacent relationships, but omitted child, sibling, "belongsto" and ancestor relationships. All geospatial data, such as centroids, polygons and bounding boxes, was also removed due to licensing restrictions from third party, proprietary data sources.

The GeoPlanet Data set continued to be maintained and updated until June 2012, when Yahoo! removed access to the downloads with a message saying "We are currently making the data non-downloadable while we determine a better way to surface the data as a part of the service." This remained the case until mid 2015, when Yahoo! announced that the YDN GeoPlanet and Placemaker APIs would be shutting down, although equivalent functionality would be available via the YQL API and the BOSS Geo APIs. The GeoPlanet API finally shut down late in 2016 and YQL was shut down in early 2019. The BOSS Geo APIs are currently still available behind the BOSS paywall.

The publicly released GeoPlanet Data dumps remain available via the Internet Archive, and several projects were released to make use of the data, such as GPLplanet by ex-Yahoo! Geo Director of Product Tyler Bell and woedb by ex-Flickr Lead Engineer Aaron Straup Cope, who went on to be one of the founding team members behind Who's on First. woedb went offline in 2018 and that was the genesis of WoePlanet.

See also ...

About

woeplanet is a global gazetteer of places - https://woeplanet.org
