The Steppe, or How to fit arbitrary projections to data
In the current Encyclopedia Britannica's online article on "Steppe, the", there is a fabulous map that I would love to see broken out of its static fetters and dance with different overlays (e.g., temperature, rainfall, etc.). Choosing to ignore copyright concerns for the sake of narrow educational pursuits, this project is specifically focused on determining the projection used by this map, in order to get as exact georeferencing as possible, i.e., to accurately convert the green pixels to longitude/latitudes.
Hence the subtitle of the project: how can one go from georeferenced control points (GCPs) to a projection's parameters?
Currently, the project leverages Pyproj and allows me to specify
- a projection (e.g., "aea" for the Albers equal-area),
- a list of its parameters (e.g., lon/lat of false origin, and two standard parallels), and
- an initial numeric guess for these parameters
and after running a nonlinear least squares (provided by Scipy), can plot the image with the original and best-fit GCPs, as well as a coastline via Natural Earth.
Install Python3 (or consider using pyenv to easily manage different Python versions) and, in case you don't already have it, Git. Then, in your command line, run the following (the
$ symbol indicates your command prompt and isn't meant to be typed):
$ pip install virtualenv $ git clone https://github.com/fasiha/steppe-map.git $ cd steppe-map $ virtualenv . $ source bin/activate $ pip install -r requirements.txt --upgrade
This will ask pip, the Python package manager, to install virtualenv (
pip install virtualenv), which lets us manage per-project dependencies without polluting our global Python install. Then, git makes a copy of this repository (
git …) which you then enter (
cd …) and set up a virtualenv to manage dependencies (
virtualenv …). You then activate the virtualenv (
source …), and install all dependencies in the local directory (
This repository includes a copy of the Natural Earth Coastline database. Feel free to update it if you need to.
If you want to load the image and see the estimated projection's graticules on top of it, you'll need to download the 1600 by 1058 image from Britannica's article on the Steppe and save it as
TheSteppe.jpg. The MD5 checksum of the image I used to create the geo-control points is:
After creating a virtualenv and installing requirements in it, and gotten a copy of
TheSteppe.jpg, make sure you're still in a virtualenv and run the code:
$ source bin/activate $ python steppe.py
It takes less than a minute to run on my mid-2014 MacBook Pro. This should spit out some informative text and several plots.
The parameter fitting aspect of the project is reasonably flexible in fitting any Pyproj-supported projection to be fit with as few or as many unknown parameters. The system also estimates a polynomial 1 (affine) or 2 (quadratic) map between the projection's native output and pixels.
After trying several dozen projections, I found that the Winkel Tripel projection gives the best accuracy in terms of error between control and fitted points.
With 29 GCPs (included in this repo as
gcp29.points, in EPSG:3857, with units in meters), a two-parameter Winkel Tripel with a quadratic fit shows a good match between GCPs, coastlines, and the graticule ticks on the edges of the image.
Using simply an affine transform (poly1, instead of the above quadratic), we get a worse fit with the coastlines:
This map's SRS:
+units=m +proj=wintri +lon_0=46.39467751339456 +lat_1=36.580495615074135
It's unlikely that Britannica's cartographers intended to apply some distortion that needs a quadratic polynomial. My guess is that either the Winkel Tripel isn't the actual projection, or that the map acquired some distortion after it was made—maybe it was hand-traced at some point. (Note how the Norwegian coast in the image is quite bizarre.)
It should be relatively straightforward to adapt this repo to other cases:
- use QGIS to create a set of geo-reference control points,
- export them to disk (ideally in EPSG:3857, but lat/lon degrees are fine too, just, if degrees, just unset the
- edit the
steppe.pyscript to point to the new GCP file and
- the new image, and you should be good to go.
Feel free to contact me via email, GitHub, etc., and I can help, either with this map or with another.
§1. Note that most of these projections (see http://www.remotesensing.org/geotiff/proj_list) accept false easting and northing parameters, scalars which are added to all Cartesian locations. While the projection fitting can accommodate these readily, this is unnecessary as we remove any affine (
a*x + b, informally called "poly1") or quadratic (
a * x**2 + b * x + c, "poly2") transform between the projection's output (in Cartesian space) and the GCPs' pixel locations using Späth's algorithm (pdf). In simpler terms—for "poly1", we find and remove any rotation/scale/shift between the projected pixel locations and the GCP pixel locations. For "poly2", we find and remove a larger class of image distortions. I think it's neat that the optimization (i.e., the function minimization) does not estimate these—it only deals with the projection's parameters—and affine/quadratic distortions are dealt with separately.
§2. Much of the time, we want to reproject the image to an equirectangular (latlon, or Plate Carrée) projection. I'm trying to figure out how to use GDAL tools to do this, but in the meantime, I just added my own interpolation scheme that works well.
The community at GIS.stackexchange has been very helpful --- see http://gis.stackexchange.com/questions/43682/ --- thank you.