The Census + a Bit about Mapping

First, a couple maps from WNYC for inspiration.

If you have Census questions, someone else has probably already asked them! Try the Census Bureau FAQs for some answers.

Also: You can see the IPython Notebook I was writing in during class at http://nbviewer.ipython.org/github/ledeprogram/courses/blob/master/foundations/week_4/Census%20Stuff.ipynb

Different Surveys

It isn't just one big CENSUS!!! It's one big Census and a million other censuses. Let's review a few of them.

The Census

This is the big one, done every ten years.

American Community Survey (ACS)

ACS is done every year in a limited capacity, and data is released on a 1-, 3-, and 5-year basis. That way you can pick and choose between recency and completeness. According to the Census, it asks about...

  • age
  • sex
  • race
  • family and relationships
  • income and benefits
  • health insurance
  • education
  • veteran status
  • disabilities
  • where you work and how you get there
  • where you live and how much you pay for some essentials

Which covers pretty much anything you need for a story about a community.

A few others

  • Survey of Income and Program Participation (SIPP)
  • Current Population Survey (CPS)
  • Consumer Expenditure Survey (CE)
  • National Health Interview Survey (NHIS)

...and a ton of others. I generally stick to ACS, but you might find fun data sets elsewhere, too!

The Data

The Census Bureau does a decent job of making information available on its site.

Online Tools

...and many more

Data sets

They have a developer section and a data discovery tool.

Geography

Census geography is broken down in a million and one (sometimes confusing) ways.

Available Slices

Lots of them

States, counties, MSAs, tracts, block groups, blocks

Metropolitan Statistical Areas are high-density areas that are closely related. There are 388 in the USA, as you can see on this map. The one NYC is a part of is right here.

Census Tracts are among the smallest divisions. They can be divided into census block groups and then again into census blocks. You can see a map of NYC census tracts at http://maps.nyc.gov/census/.

  • 1 block group contains, on average, 39 blocks
  • Block groups have between 600 and 3,000 people
  • Block groups have a 12-digit FIPS code (blocks have a 15-digit one)
  • The USA contains ~8 million census blocks, ~200k block groups, and ~70k census tracts

ZIP Code Tabulation Areas are almost, kind of ZIP codes. There are roughly 32,000 ZCTAs.

FIPS

  • State codes
  • County codes
  • Census tract numbers

You can find FIPS codes by address or through GNIS.
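FIPS codes nest inside each other: 2 digits for the state, 3 for the county, 6 for the tract, and 4 for the block, where the first digit of the block number is the block group. Here is a minimal sketch of pulling a code apart. The function name `split_geoid` is made up for this example, and the tract and block digits in the sample are illustrative, not a real lookup.

```python
from typing import Dict

# Field widths, largest geography first.
# A full block code is 2 + 3 + 6 + 4 = 15 digits.
GEOID_FIELDS = [("state", 2), ("county", 3), ("tract", 6), ("block", 4)]


def split_geoid(geoid: str) -> Dict[str, str]:
    """Split a state, county, tract, block group, or block code into parts."""
    if not geoid.isdigit() or len(geoid) not in (2, 5, 11, 12, 15):
        raise ValueError(f"unexpected code: {geoid!r}")
    parts = {}
    pos = 0
    for name, width in GEOID_FIELDS:
        if pos >= len(geoid):
            break
        chunk = geoid[pos:pos + width]
        pos += width
        if name == "block":
            # The block group is the first digit of the block number.
            parts["block_group"] = chunk[0]
            if len(chunk) == 4:
                parts["block"] = chunk
        else:
            parts[name] = chunk
    return parts


# 36 is New York State and 061 is New York County (Manhattan).
print(split_geoid("360610001001"))
```

Keep these codes as strings, not numbers. Plenty of them start with a zero (California is "06"), and the zero disappears if Python or a spreadsheet treats the code as an integer.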

GIS files

GIS stands for Geographic Information Systems, and basically means "computer map stuff." You join these maps to your data programmatically, and voila! You've got a nice map.
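"Joining programmatically" usually means matching on a shared key, most often a FIPS code. A rough sketch with plain dictionaries, assuming GeoJSON-style features: the population numbers are made up, the geometries are left out, `join_data` is a hypothetical helper, and `STATEFP` is assumed to be the name of the field holding the state code in your file.

```python
# Made-up numbers keyed by state FIPS code.
population = {"36": 100, "34": 50}

# A GeoJSON-style FeatureCollection. Geometry is omitted (None) to keep this short.
shapes = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"STATEFP": "36", "NAME": "New York"}},
        {"type": "Feature", "geometry": None,
         "properties": {"STATEFP": "34", "NAME": "New Jersey"}},
        {"type": "Feature", "geometry": None,
         "properties": {"STATEFP": "09", "NAME": "Connecticut"}},
    ],
}


def join_data(collection, data, key_field, new_field):
    """Copy data[key] onto every feature whose key_field matches.

    Features with no match get None. Returns the unmatched keys so
    you can see what fell through the cracks.
    """
    unmatched = []
    for feature in collection["features"]:
        key = feature["properties"].get(key_field)
        if key in data:
            feature["properties"][new_field] = data[key]
        else:
            feature["properties"][new_field] = None
            unmatched.append(key)
    return unmatched


missing = join_data(shapes, population, "STATEFP", "population")
print(missing)
```

Always look at the unmatched list. A join that silently drops half your counties makes for a very misleading map.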

Geography

The Census releases a set of data called TIGER (Topologically Integrated Geographic Encoding and Referencing) that illustrates all of the levels of geography that they cover.

The TIGER/Line files and related geographic data sets are available on census.gov, but there are some caveats! They often include parts of the ocean, are only available on a state-level basis, or are only available in Geodatabase format (an Esri format that is awkward to open outside ArcGIS). If you're planning on mapping, you might want to check out NHGIS down below for a more user-friendly format.

In theory you can browse TIGER divisions at TigerWeb, but I've found it nigh unusable.

And oh my god, the TIGER logo is incredible http://en.wikipedia.org/wiki/Topologically_Integrated_Geographic_Encoding_and_Referencing#mediaviewer/File:US-Census-TIGERLogo.svg

File formats

In the world of GIS, there are a few different file formats. Sometimes you need to convert between them, and it can be a pain.

Shapefiles

File format: .shp + .shx/.dbf, plus an optional .prj

Shapefiles are the standard format for passing around geographic information. A shapefile comes in a few pieces...

  • The shp file is the actual geographic information
  • The prj file explains what kind of projection is being used (hope it isn't a state plane system)
  • The dbf file has the information associated with each element of the shapefile - city name, or population count, or whatever other data you're storing
  • The shx is an index file that speeds up a program working on the shapefile

Geodatabases

File format: .gdb

Geodatabases are a headache, and aren't well-supported on OS X. I don't know much about them as a result.

GeoJSON/TopoJSON

File formats: .json, .geojson, .topojson

JSON that supports geographic stuff! Originally written by a group of developers instead of a standards group (it was later standardized as RFC 7946), it's especially popular among JavaScript applications.

If you need to write or edit smaller GeoJSON files, check out geojson.io.
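Since GeoJSON is just JSON, you can build it out of ordinary Python dictionaries and lists. A small sketch with a point and a polygon follows. The coordinates are approximate and for illustration only, and `ring_is_closed` is a made-up helper.

```python
import json

# GeoJSON coordinate order is [longitude, latitude], not the other way around.
point = {
    "type": "Feature",
    "properties": {"name": "A point in New York City"},
    "geometry": {"type": "Point", "coordinates": [-74.0, 40.7]},
}

# A polygon is a list of rings. Each ring repeats its first point at the end.
box = {
    "type": "Feature",
    "properties": {"name": "A made-up box"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-74.1, 40.6], [-73.9, 40.6], [-73.9, 40.8],
            [-74.1, 40.8], [-74.1, 40.6],
        ]],
    },
}

collection = {"type": "FeatureCollection", "features": [point, box]}


def ring_is_closed(ring):
    """True if the ring ends on the same point it started on."""
    return ring[0] == ring[-1]


text = json.dumps(collection, indent=2)
print(text)
```

If you paste the printed text into geojson.io and the shapes land in the wrong part of the world, swapped longitude and latitude is the usual suspect.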

TopoJSON is an extension of GeoJSON that allows for smaller file sizes. Instead of California having an eastern border and Nevada having a western border, TopoJSON collapses them into a single line and calls it a day.
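Here is a simplified sketch of the shared-border idea. Each line segment (an "arc") is stored once, and shapes refer to arcs by index. A negative index means "walk that arc backwards", with -1 standing for arc 0 reversed. The two rectangles are made up, and real TopoJSON files often also quantize and delta-encode the coordinates, which this sketch skips.

```python
# Each arc is a list of [x, y] points. Arc 0 is the border both shapes share.
arcs = [
    [[1, 0], [1, 1], [1, 2]],           # arc 0: shared border, south to north
    [[1, 2], [0, 2], [0, 0], [1, 0]],   # arc 1: the rest of the western shape
    [[1, 0], [2, 0], [2, 2], [1, 2]],   # arc 2: the rest of the eastern shape
]

# Shapes are lists of arc indexes. -1 is ~0, meaning arc 0 reversed.
shapes = {"west": [0, 1], "east": [2, -1]}


def ring_from_arcs(arc_indexes, arcs):
    """Stitch arcs into one ring of points."""
    ring = []
    for i in arc_indexes:
        points = arcs[i] if i >= 0 else arcs[~i][::-1]
        # After the first arc, skip each arc's first point,
        # because it repeats the last point of the arc before it.
        ring.extend(points if not ring else points[1:])
    return ring


west = ring_from_arcs(shapes["west"], arcs)
east = ring_from_arcs(shapes["east"], arcs)
print(west)
print(east)
```

Both rings contain the border points, but the file only stores them once. That also means simplifying the border can't open a gap between the two shapes.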

Keyhole Markup Language (KML)

File format: .kml

You see this a lot with Google. It isn't generally used for more serious GIS work, I don't think.

NHGIS: National Historical Geographic Information System

A project of the University of Minnesota, the National Historical Geographic Information System (NHGIS) organizes and archives Census data. Easy to browse, easy to download - they're frankly leagues ahead of the Census itself.

They coordinate data samples across years, fill in gaps, and make a lot of notes about gotchas from different versions of the Census. They're also a great resource for GIS files to map with your data!

Their datasets go back to 1790, and they aim to release data sets on NHGIS within 6 weeks of their release by the Census.

IPUMS

https://usa.ipums.org/usa/

Accessing data

NHGIS

You browse it, you download datasets, there you go.

https://www.nhgis.org

The Census API

If you visit https://www.census.gov/data/developers/data-sets/acs-survey-5-year-data.html, you can see how to make an API call directly. Let's look at that.

http://api.census.gov/data/2012/acs5/profile?get=NAME,DP02_0001PE&for=state:*&key=PUT_KEY_HERE
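That URL is a base path plus three query parameters: `get` (the columns you want), `for` (the geography), and `key` (your API key). What comes back is a JSON list of rows, and the first row is the header. A rough sketch of both halves is below. `build_url` and `rows_to_dicts` are made-up helper names, and the sample rows are invented to show the shape of a response, not real numbers.

```python
import json
from urllib.parse import urlencode

BASE = "http://api.census.gov/data/2012/acs5/profile"


def build_url(variables, geography, key):
    """Assemble the query string for a Census API request."""
    query = {"get": ",".join(variables), "for": geography, "key": key}
    # Leave commas, colons, and asterisks unescaped so the URL stays readable.
    return BASE + "?" + urlencode(query, safe=",:*")


def rows_to_dicts(rows):
    """Turn [header, row, row, ...] into a list of dictionaries."""
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


url = build_url(["NAME", "DP02_0001PE"], "state:*", "PUT_KEY_HERE")
print(url)

# Invented sample in the same shape as a response: header row first.
sample = json.loads(
    '[["NAME","DP02_0001PE","state"],'
    '["Alabama","100","01"],'
    '["Alaska","100","02"]]'
)
states = rows_to_dicts(sample)
print(states[0])
```

Note that everything arrives as a string, including the numbers, so you'll need to convert the columns you want to do math on.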

It looks like dogs.csv, kind of, with that header row and everything. Do we... cut and paste it into Python? Instead, let's find a module that does the hard work for us.

https://github.com/sunlightlabs/census