First, a couple maps from WNYC for inspiration.
If you have Census questions, someone else has probably already asked them! Try out the Census Bureau FAQs for some answers.
Also: You can see the IPython Notebook I was writing in during class at http://nbviewer.ipython.org/github/ledeprogram/courses/blob/master/foundations/week_4/Census%20Stuff.ipynb
It isn't just one big CENSUS!!! It's one big Census and a million other censuses. Let's review a few of them.
The decennial Census is the big one, done every ten years.
The American Community Survey (ACS) is done every year in a limited capacity, and data is released on a 1-, 3-, and 5-year basis. That way you can pick and choose between recency and completeness. According to the Census, it asks about...
- age
- sex
- race
- family and relationships
- income and benefits
- health insurance
- education
- veteran status
- disabilities
- where you work and how you get there
- where you live and how much you pay for some essentials
Which covers pretty much anything you need for a story about a community.
- Survey of Income and Program Participation (SIPP)
- Current Population Survey (CPS)
- Consumer Expenditure Survey (CE)
- National Health Interview Survey (NHIS)
...and a ton of others. I generally stick to ACS, but you might find fun data sets elsewhere, too!
They do a decent job of making information available on the site.
...and many more
They have a developer section and a data discovery tool
Census geography is broken down in a million and one (sometimes-confusing) ways.
States, counties, MSA, block groups, blocks, tracts
Metropolitan Statistical Areas (MSAs) are high-density areas that are closely related to one another. There are 388 in the USA, as you can see on this map. The one NYC is a part of is right here.
Census tracts are among the smaller divisions. They can be divided into census block groups and then again into census blocks. You can see a map of NYC census tracts at http://maps.nyc.gov/census/.
- 1 block group contains, on average, 39 blocks
- Block groups have between 600 and 3,000 people
- Block groups have a 12-digit FIPS code (blocks have a 15-digit one)
- The USA contains ~8 million census blocks, ~200k block groups, and ~74k census tracts
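Those long codes are really just shorter codes glued together. Here's a minimal sketch of pulling one apart, assuming the usual layout of 2 digits for the state, 3 for the county, 6 for the tract, and 1 for the block group (12 digits in all). The function name and the sample code are made up for illustration: 36 is New York and 061 is New York County, but the tract and block group digits are just placeholders.

```python
def split_block_group_code(code):
    """Split a 12-digit block group code into its nested pieces.

    Everything stays a string so leading zeros survive.
    """
    if len(code) != 12 or not code.isdigit():
        raise ValueError("expected a 12-digit string, got %r" % (code,))
    return {
        "state": code[0:2],         # 2 digits
        "county": code[2:5],        # 3 digits
        "tract": code[5:11],        # 6 digits
        "block_group": code[11],    # 1 digit
    }


# Placeholder code: real state and county, invented tract and block group
parts = split_block_group_code("360610001001")
print(parts["state"], parts["county"], parts["tract"], parts["block_group"])
```

Keeping the pieces as strings matters: turn "061" into a number and you lose the leading zero, and then nothing matches up later.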
ZIP Code Tabulation Areas (ZCTAs) are almost, kind of ZIP codes. There are about 32,000 ZCTAs.
Census tract numbers
You can find FIPS codes by address, or look them up through GNIS (the Geographic Names Information System).
GIS stands for Geographic Information Systems, and basically means "computer map stuff." You join these maps to your data programmatically, and voila! You've got a nice map.
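"Joining" here just means matching each shape to a row of your data using a shared ID. Here's a rough sketch of that idea in plain Python, assuming your shapes are GeoJSON-style features that carry a FIPS code. The `join_data` helper, the `fips` property name, and the rent numbers are all invented for illustration, and the geometries are left empty to keep things short.

```python
# Trimmed-down GeoJSON-style features (geometry omitted for brevity)
features = [
    {"type": "Feature", "geometry": None,
     "properties": {"fips": "36061", "name": "New York County"}},
    {"type": "Feature", "geometry": None,
     "properties": {"fips": "36047", "name": "Kings County"}},
]

# Made-up data keyed by the same FIPS codes
data = {
    "36061": {"median_rent": 1500},
    "36047": {"median_rent": 1200},
}


def join_data(features, data, key="fips"):
    """Return new features with the matching data row merged into properties."""
    joined = []
    for feature in features:
        props = dict(feature["properties"])       # copy, don't mutate the input
        props.update(data.get(props[key], {}))    # no match means nothing added
        joined.append({**feature, "properties": props})
    return joined


joined = join_data(features, data)
for feature in joined:
    print(feature["properties"])
```

Mapping libraries do essentially the same thing for you, just with real geometry attached.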
The Census releases a set of data called TIGER (Topologically Integrated Geographic Encoding and Referencing) that illustrates all of the levels of geography that they cover.
The TIGER/Line files and related geographic data sets are available on census.gov, but there are some caveats! They often include parts of the ocean, are only available on a state-level basis, or are only available in Geodatabase format (which only ArcGIS can use). If you're planning on mapping, you might want to check out NHGIS down below for a more user-friendly format.
In theory you can browse TIGER divisions at TigerWeb, but I've found it nigh unusable.
And oh my god, the TIGER logo is incredible http://en.wikipedia.org/wiki/Topologically_Integrated_Geographic_Encoding_and_Referencing#mediaviewer/File:US-Census-TIGERLogo.svg
In the world of GIS, there are a few different file formats. Sometimes you need to convert between them, and it can be a pain.
File format: .shp + optional .dbf/.shx/.prj
Shapefiles are the standard format for passing around geographic information. A shapefile comes with a few pieces...
- The shp file is the actual geographic information
- The prj file explains what kind of projection is being used (hope it isn't a state plane system)
- The dbf file has the information associated with each element of the shapefile: city name, or population count, or whatever other data you're storing
- The shx is an index file that speeds up a program working on the shapefile
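If you're curious what's actually inside the shp file, it starts with a fixed 100-byte header that you can peek at with nothing but the standard library. This is only a sketch for poking around, not a replacement for a real shapefile reader; `read_shp_header` is a made-up helper name and `tracts.shp` is a hypothetical file.

```python
import struct

# Shape type codes for the common 2D types in the shapefile format
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def read_shp_header(raw):
    """Parse the first 100 bytes of a .shp file into a small dict."""
    if len(raw) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    # File code and file length are stored big-endian
    (file_code,) = struct.unpack(">i", raw[0:4])
    if file_code != 9994:
        raise ValueError("this doesn't look like a .shp file")
    (length_in_words,) = struct.unpack(">i", raw[24:28])
    # Version, shape type and bounding box are stored little-endian
    version, shape_code = struct.unpack("<ii", raw[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", raw[36:68])
    return {
        "file_bytes": length_in_words * 2,   # length is counted in 16-bit words
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_code, "Other (%d)" % shape_code),
        "bbox": (xmin, ymin, xmax, ymax),
    }


# Usage with a hypothetical file:
# with open("tracts.shp", "rb") as f:
#     print(read_shp_header(f.read(100)))
```

The bounding box is handy for a quick sanity check: if the numbers look like longitude and latitude you're probably fine, and if they're in the hundreds of thousands you may be looking at one of those state plane projections.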
File format: .gdb
Geodatabases are a headache, and aren't well-supported on OS X. I don't know much about them as a result.
File formats: .json, .geojson, .topojson
JSON that supports geographic stuff! Written by developers instead of a standards group, it's especially popular among JavaScript applications.
If you need to write or edit smaller GeoJSON files, check out geojson.io.
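For a sense of what's inside one of these files, here's a minimal sketch that builds a one-point GeoJSON file with Python's `json` module. The coordinates are a rough stand-in for somewhere in NYC and the `name` property is invented. Note that GeoJSON puts longitude first, then latitude.

```python
import json

feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            # Longitude first, then latitude (approximate, for illustration)
            "geometry": {"type": "Point", "coordinates": [-74.0, 40.7]},
            # Properties can hold whatever data you want attached to the shape
            "properties": {"name": "Somewhere in NYC"},
        }
    ],
}

text = json.dumps(feature_collection, indent=2)
print(text)
```

Paste the printed output into geojson.io and you should see a single marker on the map.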
TopoJSON is an extension of GeoJSON that allows for smaller file sizes. Instead of California having a western border and Nevada having an eastern border, TopoJSON collapses them into a single line and calls it a day.
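Here's a toy version of that shared-border trick, with two squares standing in for California and Nevada. It is a simplified sketch of the idea, not the real TopoJSON file format (which also does things like compressing the coordinates). The one detail borrowed from TopoJSON is using `~i` to mean "arc `i`, walked backwards." The names `arcs`, `shapes`, and `ring` are made up for this example.

```python
# Every stretch of line is stored exactly once
arcs = [
    [(0, 0), (0, 1)],                      # arc 0: the border both squares share
    [(0, 1), (-1, 1), (-1, 0), (0, 0)],    # arc 1: the rest of the left square
    [(0, 0), (1, 0), (1, 1), (0, 1)],      # arc 2: the rest of the right square
]

# Shapes are just lists of arc numbers; ~0 means "arc 0, reversed"
shapes = {
    "left": [0, 1],
    "right": [2, ~0],
}


def ring(arc_numbers):
    """Stitch arcs back into one closed ring of points."""
    points = []
    for i in arc_numbers:
        arc = arcs[i] if i >= 0 else arcs[~i][::-1]
        # Each arc starts where the last one ended, so skip the repeated point
        points.extend(arc if not points else arc[1:])
    return points


for name, arc_numbers in shapes.items():
    print(name, ring(arc_numbers))
```

The shared border's points live in one place, so the file gets smaller and the two shapes can never disagree about where the line is.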
File format: KML
You see this a lot with Google. It isn't generally used for more serious GIS work, I don't think.
A project of the University of Minnesota, the National Historical Geographic Information System (NHGIS) organizes and archives Census data. Easy to browse, easy to download - they're frankly leagues ahead of the Census itself.
They coordinate data samples across years, fill in gaps, and make a lot of notes about gotchas from different versions of the Census. They're also a great resource for GIS files to map with your data!
Their datasets go back to 1790, and they aim to release data sets on NHGIS within 6 weeks of their release by the Census.
You browse it, you download datasets, there you go.
Visiting https://www.census.gov/data/developers/data-sets/acs-survey-5-year-data.html you see how you can make an API call directly. Let's look at that.
http://api.census.gov/data/2012/acs5/profile?get=NAME,DP02_0001PE&for=state:*&key=PUT_KEY_HERE
It looks like dogs.csv, kind of, with that header row and everything. Do we... cut and paste it into Python? Instead, let's find a module that does the hard work for us.
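For comparison, here's roughly what the do-it-by-hand version involves: build the URL, then turn the response (a JSON list of lists, header row first) into a list of dictionaries. This is a sketch under a few assumptions: `build_url` and `rows_to_dicts` are made-up helper names, and the sample response uses placeholder values rather than real Census numbers, so nothing here touches the network.

```python
import json
from urllib.parse import urlencode

BASE = "http://api.census.gov/data/2012/acs5/profile"


def build_url(key):
    """Rebuild the API call from above out of its pieces."""
    params = {"get": "NAME,DP02_0001PE", "for": "state:*", "key": key}
    # Leave commas, colons and asterisks alone so the URL stays readable
    return BASE + "?" + urlencode(params, safe=",:*")


def rows_to_dicts(rows):
    """Pair the header row with every row after it, like csv.DictReader does."""
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


# Placeholder response in the same shape the API sends back (values invented)
sample_text = """
[["NAME", "DP02_0001PE", "state"],
 ["Alabama", "100", "01"],
 ["Alaska", "100", "02"]]
"""

states = rows_to_dicts(json.loads(sample_text))
print(build_url("PUT_KEY_HERE"))
print(states[0]["NAME"], states[0]["state"])

# To try it for real, you could fetch the URL with urllib.request.urlopen
# and pass the decoded body to json.loads instead of sample_text.
```

It works, but you'd still need to remember table codes like `DP02_0001PE` and convert the strings to numbers yourself, which is exactly the kind of chore a module can take off your hands.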