# Chapter 4: Sources of Geospatial Data

When creating a geospatial application, the data you use will be just as important as the code you write.  
High-quality geospatial data, and in particular base maps and imagery, will be the cornerstone of your application.  
If your maps don't look good, then your application will be treated as the work of an amateur, no matter how well you write the rest of your program.  
  
Traditionally, geospatial data has been treated as a valuable and scarce resource, being sold commercially for many thousands of dollars and with strict licensing constraints.  
Fortunately, as with the trend towards "democratizing" geospatial tools, geospatial data is now becoming increasingly available for free and with little or no restriction on its use.  
There are still situations where you may have to pay for data, for example, to guarantee the quality of the data, or if you need something that isn't available elsewhere, but it is now usually just a case of downloading the data you need, for free, from a suitable server.  
  
This chapter provides an overview of some of these major sources of freely-available geospatial data.  
This is not intended to be an exhaustive list, but rather to provide information on the sources which are likely to be most useful to the Python geospatial developer.  
  
In this chapter, we will cover:  
* Some of the major freely-available sources of vector-format geospatial data  
* Some of the main freely-available sources of raster geospatial data  
* Sources of other types of freely-available geospatial data, concentrating on databases of city and other place names  


## 4.1 Sources of geospatial data in vector format  

### 4.1.1 OpenStreetMap  

OpenStreetMap (http://openstreetmap.org) is a website where people can collaborate to create and edit geospatial data.  
It describes itself as a "free editable map of the whole world made by people like you."  


* __Data Format__  

OpenStreetMap does not use a standard format such as shapefiles to store its data.  
Instead, it has developed its own XML-based format for representing geospatial data in the form of nodes (single points), ways (sequences of points that define a line), areas (closed ways that represent polygons), and relations (collections of other elements).  
Any element (node, way, or relation) can have a number of tags associated with it that provide additional information about the element.  
The following is an example of what OpenStreetMap XML data looks like:  


<osm>
 <node id="603279517" lat="-38.1456457"
 lon="176.2441646".../>
 <node id="603279518" lat="-38.1456583"
 lon="176.2406726".../>
 <node id="603279519" lat="-38.1456540"
 lon="176.2380553".../>
 ...
 <way id="47390936"...>
     <nd ref="603279517"/>
     <nd ref="603279518"/>
     <nd ref="603279519"/>
     <tag k="highway" v="residential"/>
     <tag k="name" v="York Street"/>
 </way>
 ...
 <relation id="126207"...>
     <member type="way" ref="22930719" role=""/>
     <member type="way" ref="23963573" role=""/>
     <member type="way" ref="28562757" role=""/>
     <member type="way" ref="23963609" role=""/>
     <member type="way" ref="47475844" role=""/>
     <tag k="name" v="State Highway 30A"/>
     <tag k="ref" v="30A"/>
     <tag k="route" v="road"/>
     <tag k="type" v="route"/>
 </relation>
</osm>
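
As a minimal sketch of how this format might be read in Python, the following uses only the standard library's `xml.etree.ElementTree` module. The sample string is a cut-down, well-formed version of the listing above (the attributes elided with `...` are left out), and `parse_osm` is a hypothetical helper name chosen for illustration, not part of any OpenStreetMap library.

```python
import xml.etree.ElementTree as ET

# Cut-down, well-formed sample in the OpenStreetMap XML layout.
SAMPLE_OSM = """
<osm>
  <node id="603279517" lat="-38.1456457" lon="176.2441646"/>
  <node id="603279518" lat="-38.1456583" lon="176.2406726"/>
  <node id="603279519" lat="-38.1456540" lon="176.2380553"/>
  <way id="47390936">
    <nd ref="603279517"/>
    <nd ref="603279518"/>
    <nd ref="603279519"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="York Street"/>
  </way>
</osm>
"""


def parse_osm(xml_text):
    """Hypothetical helper: return (nodes, ways) from an OSM XML string."""
    root = ET.fromstring(xml_text.strip())

    # Map each node ID to a (longitude, latitude) pair.
    nodes = {}
    for node in root.findall("node"):
        nodes[node.get("id")] = (float(node.get("lon")),
                                 float(node.get("lat")))

    # Resolve each way's node references into coordinates and collect
    # its tags into a dictionary.
    ways = []
    for way in root.findall("way"):
        tags = {tag.get("k"): tag.get("v") for tag in way.findall("tag")}
        coords = [nodes[nd.get("ref")]
                  for nd in way.findall("nd")
                  if nd.get("ref") in nodes]
        ways.append({"id": way.get("id"), "tags": tags, "coords": coords})

    return nodes, ways


nodes, ways = parse_osm(SAMPLE_OSM)
for way in ways:
    print(way["tags"].get("name"), len(way["coords"]))
```

Because a way only refers to its nodes by ID, the nodes have to be indexed first before a way can be turned into a line of coordinates.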

* __Obtaining and using OpenStreetMap data__

You can obtain geospatial data from OpenStreetMap in one of the following three ways:  
* You can use the OpenStreetMap API to download a subset of the data you are interested in.  
* You can download the entire OpenStreetMap database, called Planet.osm, and process it locally. Note that this is a multi-gigabyte download.  
* You can make use of one of the mirror sites that provide OpenStreetMap data nicely packaged into smaller chunks and converted into other data formats. For example, you can download the data for North America on a state-by-state basis, in one of several available formats, including shapefiles.  
Let's take a closer look at each of these three options.  


__The OpenStreetMap API__  
Using the OpenStreetMap API (http://wiki.openstreetmap.org/wiki/API), you can download selected data from the OpenStreetMap database in one of the following three ways:  
* You can specify a bounding box defining the minimum and maximum longitude and latitude values. The API will return all of the elements (nodes, ways, and relations) that are completely or partially inside the specified bounding box.  
* You can ask for a set of changesets that have been applied to the map. This returns all the changes made over a given time period, either for the entire map or just for the elements within a given bounding box.  
* You can download a specific element by ID, or all the elements that are associated with a specified element (for example, all elements belonging to a given relation).  

OpenStreetMap provides a Python module called OsmApi, which makes it easy to access the OpenStreetMap API. More information about this module can be found at http://wiki.openstreetmap.org/wiki/PythonOsmApi
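
The bounding-box request can also be sketched with nothing but the standard library. The example below assumes the public API v0.6 `map` call, which takes a `bbox` parameter in the order minimum longitude, minimum latitude, maximum longitude, maximum latitude. The base URL and the helper names `build_map_url` and `fetch_map` are assumptions made for illustration, so check the API wiki page for the current endpoint and usage limits before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL of the public OpenStreetMap API, version 0.6.
API_BASE = "https://api.openstreetmap.org/api/0.6"


def build_map_url(min_lon, min_lat, max_lon, max_lat, base=API_BASE):
    """Hypothetical helper: build the URL for a bounding-box download."""
    if not (-180 <= min_lon < max_lon <= 180):
        raise ValueError("longitudes must satisfy -180 <= min < max <= 180")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min < max <= 90")

    # The bbox value is a comma-separated list: left, bottom, right, top.
    bbox = ",".join(str(v) for v in (min_lon, min_lat, max_lon, max_lat))
    return base + "/map?" + urlencode({"bbox": bbox}, safe=",")


def fetch_map(min_lon, min_lat, max_lon, max_lat):
    """Hypothetical helper: download the OSM XML for a bounding box.

    This makes a network request, so it is defined here but not called.
    """
    with urlopen(build_map_url(min_lon, min_lat, max_lon, max_lat)) as resp:
        return resp.read()


# A small area around the sample coordinates used earlier in this section.
url = build_map_url(176.23, -38.15, 176.25, -38.14)
print(url)
```

Keeping the URL construction separate from the download makes it possible to check the request before sending it. The API is intended for small areas, so large bounding boxes are better served by one of the other two options.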


__Planet.osm__  
If you choose to process the entire OpenStreetMap database on your local computer, you will first need to download the Planet.osm database.  
This database is available in two formats: a compressed XML-format file containing all the nodes, ways, and relations in the OpenStreetMap database, or a special binary format called PBF that contains the same information but is smaller and faster to read.  
  
> PBF is replacing XML as the preferred data format; libraries for reading and writing PBF files are available for various languages, including Python.  
  
The Planet.osm database is currently 23 GB in size if you download it in XML format, or 18 GB if you download it in PBF format.  
Both formats can be downloaded from http://planet.openstreetmap.org.  

The entire dump of the Planet.osm database is updated weekly, but regular "diffs" are produced which you can use to update your local copy of the Planet.osm database without having to download the entire database each time.  
The daily diffs are approximately 40 MB when they have been compressed.  
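
A file of this size cannot sensibly be loaded into memory in one go, so the XML version is usually processed as a stream. The following is a minimal sketch of that idea using the standard library's `xml.etree.ElementTree.iterparse`. It assumes the XML dump is bzip2-compressed, and it uses a tiny in-memory stand-in rather than the real download. `count_elements` is a hypothetical helper name.

```python
import bz2
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(stream):
    """Hypothetical helper: count nodes, ways, and relations in a stream."""
    counts = Counter()
    # iterparse yields each element as its closing tag is read, so the
    # whole document never has to be held in memory at once.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag in ("node", "way", "relation"):
            counts[elem.tag] += 1
            # Drop the element's children and attributes once counted.
            elem.clear()
    return counts


# Tiny stand-in for a compressed planet file (assumption: bzip2).
sample = (b'<osm>'
          b'<node id="1" lat="0" lon="0"/>'
          b'<node id="2" lat="1" lon="1"/>'
          b'<way id="3"><nd ref="1"/><nd ref="2"/></way>'
          b'</osm>')
compressed = io.BytesIO(bz2.compress(sample))

# bz2.open accepts a filename or an existing binary file object.
with bz2.open(compressed, "rb") as stream:
    counts = count_elements(stream)

print(dict(counts))
```

For a real download, the `io.BytesIO` object would be replaced by the path to the compressed file. The PBF format needs a third-party library and is not covered by this sketch.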


  
### 4.1.2 TIGER  

* Data format  
* Obtaining and using TIGER data  

### 4.1.3 Natural Earth  

* Data format  
* Obtaining and using Natural Earth vector data  

### 4.1.4 Global self-consistent high-resolution shoreline database (GSHHS)  
* Data format  
* Obtaining the GSHHS database  

### 4.1.5 World Borders Dataset  
* Data format  
* Obtaining the World Borders Dataset  

## 4.2 Sources of geospatial data in raster format  
### 4.2.1 Landsat  
* Data format  
* Obtaining Landsat imagery  

### 4.2.2 Natural Earth  
* Data format  
* Obtaining and using Natural Earth raster data  

### 4.2.3 Global Land One-kilometer Base Elevation (GLOBE)  
* Data format  
* Obtaining and using GLOBE data  

### 4.2.4 National Elevation Dataset (NED)  
* Data format  
* Obtaining and using NED data  

## 4.3 Sources of other types of geospatial data  
### 4.3.1 GEOnet Names Server  
* Data format  
* Obtaining and using GEOnet Names Server data  

### 4.3.2 Geographic Names Information System (GNIS)  
* Data format  
* Obtaining and using GNIS data  

## 4.4 Choosing your geospatial data source  
## 4.5 Summary   