THIS REPO IS NOT UP TO DATE WITH THE RECENT DDF STANDARD. THE ETL SHOULD PROBABLY BE REIMPLEMENTED USING THE NEW RECIPE METHOD IN PYTHON.
This dataset contains geographic boundaries of subdivisions of countries, mainly from GADM, a spatial database of the location of the world's administrative areas (or administrative boundaries) for use in GIS and similar software. See GADM for more information about the underlying data and its coverage. Currently nothing is included, but the intention is to include the majority of the content that GADM has.
The file ddf--entity_sets.csv enumerates all geographic subdivisions (like US states) found in GADM. Each subdivision is a row with the following properties as columns:
- id: A unique identifier for the subdivision, generated by concatenating the country id and the name of the subdivision, for example usa_state.
- name: A singular name that contains the full path, including the country, for example "USA State".
- drilldown_name: A name excluding the country, for example "State".
- country: The id of the country, for example "usa".
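The column rules above can be illustrated with a minimal sketch. It is not the code in partial_etl.py: the helper names `make_subdivision_row` and `rows_to_csv` are hypothetical, and the exact normalisation (lowercasing, replacing non-alphanumeric runs with an underscore, upper-casing the country id in the name) is an assumption inferred from the usa_state and "USA State" examples.

```python
import csv
import io
import re

# Column order as listed in this README.
COLUMNS = ["id", "name", "drilldown_name", "country"]


def make_subdivision_row(country_id, drilldown_name):
    """Build one row of ddf--entity_sets.csv (hypothetical helper)."""
    # Assumption: ids are lowercase, with runs of other characters collapsed to "_".
    slug = re.sub(r"[^a-z0-9]+", "_", drilldown_name.lower()).strip("_")
    return {
        "id": f"{country_id}_{slug}",
        # Assumption: the full name is the upper-cased country id plus the drilldown name.
        "name": f"{country_id.upper()} {drilldown_name}",
        "drilldown_name": drilldown_name,
        "country": country_id,
    }


def rows_to_csv(rows):
    """Serialise rows to CSV text with a header line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = make_subdivision_row("usa", "State")
print(rows_to_csv([row]))
```

Deriving the id from the country id and the subdivision name keeps it readable and stable across runs, at the cost of needing a consistent normalisation rule for names that contain spaces or punctuation.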
Install pandas for Python (as described in ETL Script Requirements), then run:
$ cd ../process/etl/
$ python partial_etl.py
As the script name indicates, it performs only part of the ETL; the remaining steps are done manually.
Gapminder created this dataset and provides it under the Creative Commons Attribution 4.0 International license.