designing with spatial data - workshop materials

Designing with spatial data for the web

Designing with spatial data


This workshop is designed to be very hands-on, but slightly less hands-on than the other part of Stamen's workshop series: Working with spatial data for the web. We will use some of the same tools that are covered in the other class (like Leaflet and D3.js), and we'll look at a few tools specific to today's class (like CARTO).

We will look at many examples that can be extended as exercises, many more that will give you a taste of the possibilities out there, and some leads you can follow if you want to learn more.

It would be impossible to touch everything that we could find interesting in web mapping, so the hope is that after going through these three acts you will feel empowered to swap in your own data and leverage hundreds of examples in your own data visualization projects!

Before we get started:

We will be using two free online platforms for this class: GitHub (a website for sharing open source software and code), and CARTO (a user-friendly online mapping platform). If you don't have accounts on these two sites, please create accounts now:

Prelude: the landscape

An overview of the state of the art

Some maps we've made at Stamen, and maps we love made by others. What tools did these maps use? Where did the data come from? How and why were they designed this way?

Servers and clients, rasters and vectors

See: Anatomy of a web map

anatomy slide

To understand the fundamentals of tiled web maps, we'll look at some very basic Leaflet maps (you don't need to fully understand the code quite yet)

For contrast, here are a few basic D3.js maps:

Act I: data is part of design

Data preparation

Choose your data wisely. Clean it thoroughly. Remove what you don't need! In general, pre-process your data with the end product in mind.

All the tools we will look at can be used for data exploration, but your data needs to change when you transition to data presentation.

You might need to use QGIS, but I'd recommend CartoDB (now "CARTO") for most things!

Let's load some data into CARTO and try the web editor to draw features, edit them, and so on. Note that you can get lots of interesting datasets to start with from CARTO's Data Library.

1) Getting started with CARTO

Look at the SQL viewer, and if you're familiar with SQL you can do a lot to transform your data really easily! Let's try extracting just a few features.
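
If you'd rather see the idea of "extracting just a few features" outside of CARTO, here is a minimal JavaScript sketch that filters a GeoJSON FeatureCollection by a property. The `filterFeatures` helper, the `continent` property, and the sample points are all made up for illustration; in the SQL viewer the same idea would be a `WHERE` clause on your table.

```javascript
// Keep only the features whose properties pass a test function.
// Returns a new FeatureCollection; the input is left untouched.
function filterFeatures(collection, predicate) {
  return {
    type: "FeatureCollection",
    features: collection.features.filter((f) => predicate(f.properties || {})),
  };
}

// Hypothetical sample data: three points with a made-up "continent" property.
const sampleCities = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { name: "Paris", continent: "Europe" },
      geometry: { type: "Point", coordinates: [2.35, 48.85] } },
    { type: "Feature", properties: { name: "Tokyo", continent: "Asia" },
      geometry: { type: "Point", coordinates: [139.69, 35.69] } },
    { type: "Feature", properties: { name: "Berlin", continent: "Europe" },
      geometry: { type: "Point", coordinates: [13.4, 52.52] } },
  ],
};

const europeOnly = filterFeatures(sampleCities, (p) => p.continent === "Europe");
console.log(europeOnly.features.map((f) => f.properties.name));
```

Dropping features (and properties) you don't plan to show is usually the cheapest way to shrink a file before it ever reaches the browser.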

Note, if you love the command line, you can do most of this with GDAL's program ogr2ogr, especially with @dwtkns's fantastic GDAL cheat sheet. Or you can install PostGIS locally if you like the database interface of CARTO.


  • Mapshaper and TopoJSON
    • Review, what is TopoJSON?
    • generalization is not just for file size, it's also important for the look of the map
    • [Use the Brexit data as example]
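
As part of the "what is TopoJSON?" review, it can help to see why the coordinates in a TopoJSON file look like small integers. When a topology has a `transform`, each arc is stored as quantized, delta-encoded positions. The sketch below decodes one arc by hand; the `decodeArc` helper and the sample numbers are made up for illustration, and in practice the topojson library does this for you.

```javascript
// Decode one delta-encoded TopoJSON arc into real coordinates.
// `transform` has the shape { scale: [kx, ky], translate: [dx, dy] }.
function decodeArc(arc, transform) {
  const [kx, ky] = transform.scale;
  const [dx, dy] = transform.translate;
  let x = 0;
  let y = 0;
  return arc.map(([deltaX, deltaY]) => {
    // Each stored position is an offset from the previous one.
    x += deltaX;
    y += deltaY;
    // Undo the quantization: scale, then translate.
    return [x * kx + dx, y * ky + dy];
  });
}

// Hypothetical arc tracing a small closed square.
const sampleTransform = { scale: [0.5, 0.5], translate: [-10, 20] };
const sampleArc = [[0, 0], [2, 0], [0, 2], [-2, 0], [0, -2]];

console.log(decodeArc(sampleArc, sampleTransform));
// → [[-10, 20], [-9, 20], [-9, 21], [-10, 21], [-10, 20]]
```

Because neighboring shapes share arcs, a border between two regions is stored once, which is also why simplifying a topology doesn't tear gaps between neighbors.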

2) Generalizing with Mapshaper

Natural Earth

Let's do a tour through the wonderful world of Natural Earth datasets. Not just what are they, but why are they?

3) Comparing levels of generalization in Natural Earth

Also note the boundary lines files. Why would we need those when we have the outlines of the countries already?

For one thing, you can apply a thick line to divide countries from each other without adding a thick stroke along their coastlines, for example.

(note, you can also do the same thing with TopoJSON)

Data precision

Act II: numbers to images


Why do projections matter?

Have you learned to hate Mercator yet?

"Why are we changing maps?" (note, don't use the Peters projection either!)

Unless you're using tiles, you probably shouldn't use Mercator.

But wait, why do tiled maps use Mercator? Let's talk about that.

Let's start with the zoom 0 tile:
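
To make the tile pyramid concrete, here is a small sketch of the usual "slippy map" arithmetic: at zoom 0 the whole (Web Mercator) world is one tile, and each zoom level splits every tile into four. It assumes the common XYZ scheme where y counts down from the top; the `lonLatToTile` name and the sample cities are just for illustration.

```javascript
// Which XYZ tile contains a given longitude/latitude at a given zoom?
// Assumes the standard Web Mercator tiling with y = 0 at the top.
function lonLatToTile(lon, lat, zoom) {
  const n = 2 ** zoom; // number of tiles along each axis
  const latRad = (lat * Math.PI) / 180;
  const xFraction = (lon + 180) / 360;
  const yFraction =
    (1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2;
  // Clamp so that edge cases (lon = 180, extreme latitudes) stay in range.
  const clamp = (v) => Math.min(Math.max(v, 0), n - 1);
  return {
    z: zoom,
    x: clamp(Math.floor(xFraction * n)),
    y: clamp(Math.floor(yFraction * n)),
  };
}

console.log(lonLatToTile(-122.42, 37.77, 0)); // San Francisco → { z: 0, x: 0, y: 0 }
console.log(lonLatToTile(151.21, -33.87, 0)); // Sydney        → { z: 0, x: 0, y: 0 }
console.log(lonLatToTile(-122.42, 37.77, 1)); // → { z: 1, x: 0, y: 0 }
console.log(lonLatToTile(151.21, -33.87, 1)); // → { z: 1, x: 1, y: 1 }
```

The world only fits a square tile because Web Mercator is cut off at roughly 85.05 degrees north and south.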

Projections as a design decision:

If you're making a choropleth (we'll talk about these later), use an equal area projection.

Map of the US, use Albers USA which has built-in sub-projections for Alaska and Hawaii!

Say, California, use EPSG:3310 California Albers

Map of the world, consider a Winkel Tripel (not equal area)

For a city, maybe it doesn't matter if you use Mercator.

BUT! Don't try to compare two city maps side-by-side using Mercator... the scale will be off!
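
Here is a rough sketch of how far off the scale gets. On a Mercator map, lengths are stretched by 1 / cos(latitude), so the same zoom level covers less ground the further you are from the equator. The function names are made up for illustration, and the 256-pixel tile size and the two sample latitudes are assumptions.

```javascript
const EQUATOR_M = 2 * Math.PI * 6378137; // Web Mercator's equatorial circumference, in meters

// How much Mercator stretches lengths at a given latitude (1 at the equator).
function mercatorScaleFactor(latDeg) {
  return 1 / Math.cos((latDeg * Math.PI) / 180);
}

// Ground distance covered by one pixel, assuming 256-pixel tiles.
function metersPerPixel(latDeg, zoom) {
  return (EQUATOR_M * Math.cos((latDeg * Math.PI) / 180)) / (256 * 2 ** zoom);
}

// Two hypothetical city latitudes, both at zoom 12.
console.log(metersPerPixel(1.35, 12).toFixed(1));  // near the equator
console.log(metersPerPixel(59.91, 12).toFixed(1)); // far north: roughly half as many meters per pixel
```

This is the number a scale bar has to recompute whenever the map's center latitude changes.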

4) Leaflet with scale bar and URL location

Use Leaflet Control.Scale

Projected maps vs projected data

Also, projections are not just about how you present your data. Sometimes your coordinates are already projected on disk! GeoJSON is almost always stored in latitude/longitude degrees, but other data sources might be stored in meters, or feet, or pixels (like TopoJSON).
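
To get a feel for the difference between degrees and projected units, here is a minimal sketch that converts a longitude/latitude pair into Web Mercator meters (EPSG:3857). Web Mercator is used here only because its formula is short; the function name is made up for illustration.

```javascript
const EARTH_RADIUS_M = 6378137; // sphere radius used by Web Mercator

// Convert longitude/latitude in degrees to Web Mercator x/y in meters.
function lonLatToWebMercator(lon, lat) {
  const x = EARTH_RADIUS_M * ((lon * Math.PI) / 180);
  const y = EARTH_RADIUS_M * Math.log(Math.tan(Math.PI / 4 + (lat * Math.PI) / 360));
  return [x, y];
}

// The same place, stored two ways: small numbers in degrees, big numbers in meters.
console.log([-122.42, 37.77]);
console.log(lonLatToWebMercator(-122.42, 37.77));
```

If a file's coordinates are in the millions rather than within plus or minus 180, that's a strong hint it is already projected.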

For example, here's a map that uses a pre-projected TopoJSON file and a null projection.

What do these coordinates look like? See EPSG:2163 Coordinates.

Okay, now what's "2163" mean? Let's talk about that...

Can I get your projection's number? :phone emoji:

D3 is convenient and it gives all these projections names, but if you're using anything other than D3, you might need to know your projection's EPSG code.

A few you might memorize after a while:

And sometimes, just sometimes, all those off-the-shelf projections just aren't enough, and you have to make your own projection. Since a projection is just a mathematical formula, it's not that hard to modify existing ones. See

Or, see what I like to call the "Hill-Akella Manhattan Projection"

How to lie to your mapping tools

Jump over to CartoDB again: how do projections work here?

What happens if we make it think the data is in one projection, but it's really in another?

With the right PostGIS query we are able to use raster tiles in non-Mercator projections! Just don't try to overlay them with any other raster maps!

Act III: communicating information on the map

Classification and choropleths

AKA how to lie with maps:

Equal Interval, Quantiles, Natural Breaks: example by @mkfreeman (Note that names from GIS are not the same as in D3)
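
To see how much the classification method changes the story, here is a minimal sketch that computes class breaks for the same values two ways. The helper names and the sample values are made up; natural breaks is left out because it needs a longer algorithm. In D3 terms, equal interval corresponds roughly to a quantize scale and quantiles to a quantile scale.

```javascript
// Equal interval: split the range from min to max into k equally wide classes.
function equalIntervalBreaks(values, k) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const step = (max - min) / k;
  const breaks = [];
  for (let i = 1; i < k; i++) breaks.push(min + i * step);
  return breaks;
}

// Quantiles: put (roughly) the same number of values into each class.
function quantileBreaks(values, k) {
  const sorted = [...values].sort((a, b) => a - b);
  const breaks = [];
  for (let i = 1; i < k; i++) {
    // Linear interpolation between the two nearest ranks.
    const pos = ((sorted.length - 1) * i) / k;
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    breaks.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  return breaks;
}

// Hypothetical skewed data: one outlier dominates the range.
const skewed = [1, 2, 3, 4, 100];
console.log(equalIntervalBreaks(skewed, 4)); // → [25.75, 50.5, 75.25]
console.log(quantileBreaks(skewed, 4));      // → [2, 3, 4]
```

With equal intervals, four of the five values land in the lowest class and the map looks almost uniform; with quantiles, the outlier shares a class with a value of 4. Neither is wrong, but they tell different stories.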

5) USA Choropleth

Also, normalize your data by something (usually population). In our case, our data already comes with a "pct" column, so it's already normalized.

But sometimes you want to show the relative raw values too. Here's a way to do that, called a "value by alpha" choropleth:

6) Value-by-alpha Choropleth
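
The idea behind value-by-alpha can be sketched in a few lines: the normalized value picks the color, and the raw value picks the opacity, so places with small totals fade into the background. The function name, the two-color ramp, and the parameter names below are all made up for illustration.

```javascript
// Build a CSS color where `pct` (0-100) chooses the hue and the raw value chooses the opacity.
function valueByAlpha(pct, rawValue, maxRawValue) {
  const clamp01 = (v) => Math.min(Math.max(v, 0), 1);
  const t = clamp01(pct / 100);
  // Hypothetical two-color ramp: orange for low percentages, purple for high ones.
  const from = [230, 97, 1];
  const to = [94, 60, 153];
  const rgb = from.map((c, i) => Math.round(c + (to[i] - c) * t));
  // Opacity follows the raw value relative to the largest raw value.
  const alpha = maxRawValue > 0 ? clamp01(rawValue / maxRawValue) : 0;
  return `rgba(${rgb[0]}, ${rgb[1]}, ${rgb[2]}, ${alpha.toFixed(2)})`;
}

console.log(valueByAlpha(0, 50, 100));    // → "rgba(230, 97, 1, 0.50)"
console.log(valueByAlpha(100, 200, 100)); // → "rgba(94, 60, 153, 1.00)"
```

In a real map you would probably swap the linear opacity for a classed or square-root ramp so mid-sized places don't disappear.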


Sorry, for election maps you kinda have to choose the awful blue-purple-red colors. It's the law. But for any other kind of map, please choose something better.

Who doesn't love ColorBrewer? But have you really spent some quality time at the source?

Rainbow color ramps considered harmful, please don't use them!

And of course you want to be color-blind friendly, so can we recommend Color Oracle?

Proportional symbols

First, let's switch our choropleth map to a point map of centroids. (dipping back into the data section)

Let's start by using d3.geo.centroid: 7) Proportional symbols with d3.geo.centroid

Why is it so slow? Let's pre-generate those centroids: create centroids

Now it's much snappier! 8) Proportional symbols with pre-generated centroids
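
The pre-generation step can be sketched as a one-time script that turns polygons into points before the map ever loads. Note the assumptions: this uses a simple planar, area-weighted centroid of each polygon's outer ring, which is not the same calculation as d3.geo.centroid (that one works on the sphere), and the helper names and sample square are made up for illustration.

```javascript
// Planar, area-weighted centroid of a closed ring (first point repeated at the end).
// Returns NaN coordinates for a degenerate ring with zero area.
function ringCentroid(ring) {
  let twiceArea = 0;
  let cx = 0;
  let cy = 0;
  for (let i = 0; i < ring.length - 1; i++) {
    const [x0, y0] = ring[i];
    const [x1, y1] = ring[i + 1];
    const cross = x0 * y1 - x1 * y0;
    twiceArea += cross;
    cx += (x0 + x1) * cross;
    cy += (y0 + y1) * cross;
  }
  return [cx / (3 * twiceArea), cy / (3 * twiceArea)];
}

// Replace each Polygon feature with a Point at its centroid, keeping the properties.
function toCentroidPoints(collection) {
  return {
    type: "FeatureCollection",
    features: collection.features
      .filter((f) => f.geometry && f.geometry.type === "Polygon")
      .map((f) => ({
        type: "Feature",
        properties: f.properties,
        geometry: { type: "Point", coordinates: ringCentroid(f.geometry.coordinates[0]) },
      })),
  };
}

// Hypothetical 2 x 2 square.
const square = [[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]];
console.log(ringCentroid(square)); // → [1, 1]
```

Saving the output once means the browser only has to load and draw points, instead of loading full polygons and computing centroids on every page view.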

Okay, back to proportional symbols! Choose your symbols wisely, and scale your symbols appropriately:

9) Proportional bars with pre-generated centroids
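
"Scale your symbols appropriately" mostly comes down to which dimension carries the value. For circles, readers compare areas, so the radius should grow with the square root of the value; for bars, height can grow linearly. A minimal sketch, with made-up helper names and numbers:

```javascript
// Circle radius such that the circle's AREA is proportional to the value.
function circleRadius(value, maxValue, maxRadius) {
  if (maxValue <= 0 || value <= 0) return 0;
  return maxRadius * Math.sqrt(value / maxValue);
}

// Bar height proportional to the value itself.
function barHeight(value, maxValue, maxHeight) {
  if (maxValue <= 0 || value <= 0) return 0;
  return maxHeight * (value / maxValue);
}

// A value that is a quarter of the maximum:
console.log(circleRadius(25, 100, 20)); // → 10 (half the radius, a quarter of the area)
console.log(barHeight(25, 100, 20));    // → 5  (a quarter of the height)
```

Scaling the radius linearly instead would make the largest values look far bigger than they are. D3 ships a square-root scale for exactly this case.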


  • Make your legend a chart!

City Energy Project


Labels are one of the most difficult things to do programmatically...

You want to avoid text overlaps at all costs. Not just for maps, for charts too!

One approach: collision-detection for labels
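
One simple form of collision detection is a greedy pass: sort labels by importance, then keep each label only if its bounding box doesn't overlap one you've already placed. The sketch below assumes you already know each label's box in pixels; the helper names, the `priority` field, and the sample labels are made up for illustration.

```javascript
// Do two axis-aligned boxes ({ x, y, width, height }) overlap?
function boxesOverlap(a, b) {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  );
}

// Greedy placement: most important labels first, drop anything that collides.
function placeLabels(labels) {
  const placed = [];
  const byPriority = [...labels].sort((a, b) => b.priority - a.priority);
  for (const label of byPriority) {
    if (!placed.some((other) => boxesOverlap(other, label))) placed.push(label);
  }
  return placed;
}

// Hypothetical labels: B overlaps A, C is far away.
const sampleLabels = [
  { text: "A", x: 0, y: 0, width: 50, height: 10, priority: 3 },
  { text: "B", x: 40, y: 5, width: 50, height: 10, priority: 2 },
  { text: "C", x: 200, y: 0, width: 50, height: 10, priority: 1 },
];
console.log(placeLabels(sampleLabels).map((l) => l.text)); // → ["A", "C"]
```

Dropping labels is the bluntest option; nudging a colliding label to an alternate position before giving up on it is a common refinement.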

[Best school day example abbreviations on the dorling maps]


For tiled maps, you should try to get your labels on top of your data.

CartoDB (now CARTO) does this by default: Let your labels shine! ...we'll talk a bit more about stacking tile layers after a break.

Act IV: bonus material! Mixing and matching to bring it all together

Leaflet layer stacking

Designing with big(-ish) data

Linked views: combining maps with other charts

Some closing thoughts, pet peeves, and our map checklist