Minneapolis Health Birth Statistics: CSV-a-thon
parser/births
pdfs
.gitignore
2006-2008.csv
2007-2009.csv
2008-2010.csv
2009-2011.csv
Makefile
README.md
README.png


Minneapolis Birth Statistics

This is liberated data from the Minneapolis Health Department reports, previously available only in almost-machine-readable PDF format.

To access the source data:

  1. Go to http://www.minneapolismn.gov/health/reports/index.htm
  2. Scroll down to the dropdown and select a year. The download will begin.

Sample map

See a sample map highlighting adequacy of care during pregnancy, by neighborhood. Be sure to click the neighborhoods for an infobox containing more data.

Cleanup

This is sort of documented by way of the Makefile.

  1. pdftotext - to extract the text

    pdftotext -enc UTF-8 -table path/to/ugly.pdf

  2. I wrote a parser for the text data, in parser/births/.

  3. I imported the data into a Google Drive spreadsheet and joined it with a city neighborhood dataset to clean up the neighborhood names. I added a few columns to help: the original ordering in the data source, city_gis_neighborhood_id, and city_gis_corrected_name.
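The name cleanup in step 3 was done by hand in a spreadsheet, but the same join can be sketched in code. This is only an illustration under stated assumptions: the input keys `neighborhood`, `id`, and `name`, the `source_order` column name, and the normalization rule are all hypothetical; only city_gis_neighborhood_id and city_gis_corrected_name come from the actual CSVs.

```python
import re


def normalize(name):
    """Lowercase and collapse punctuation/whitespace so near-identical names match."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def attach_city_names(parsed_rows, city_neighborhoods):
    """Add the city's id and corrected name to each parsed row.

    parsed_rows: dicts with a hypothetical "neighborhood" key (name as parsed from the PDF).
    city_neighborhoods: dicts with hypothetical "id" and "name" keys (the city dataset).
    """
    lookup = {normalize(n["name"]): n for n in city_neighborhoods}
    cleaned = []
    for order, row in enumerate(parsed_rows, start=1):
        match = lookup.get(normalize(row["neighborhood"]))
        cleaned.append({
            **row,
            # Hypothetical column name for "the original ordering in the data source".
            "source_order": order,
            # Left blank when no city neighborhood matches, so misses are easy to spot.
            "city_gis_neighborhood_id": match["id"] if match else "",
            "city_gis_corrected_name": match["name"] if match else "",
        })
    return cleaned
```

Rows that find no match keep blank city columns rather than a guessed name, which leaves the judgment call to a human, as in the spreadsheet workflow.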

I've done some spot-checking, but if you notice inconsistencies, please tell me.
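If you want to spot-check a file yourself, one quick test is to list the rows that have no city neighborhood id. A minimal sketch, assuming the CSV has a header row and that a blank city_gis_neighborhood_id means the row was not matched (an assumption; rows such as "Unknown" may be blank on purpose):

```python
import csv


def unmatched_rows(csv_lines):
    """Return the rows whose city_gis_neighborhood_id is missing or blank.

    csv_lines: an open file or any iterable of CSV lines, header first.
    """
    reader = csv.DictReader(csv_lines)
    return [
        row for row in reader
        if not (row.get("city_gis_neighborhood_id") or "").strip()
    ]


# Example usage (assumes you run it from the repository root):
# with open("2008-2010.csv", newline="") as f:
#     for row in unmatched_rows(f):
#         print(row)
```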

Notes

The field names are insanely long because I didn't want to have to write documentation. If something is unclear, refer to the original PDFs.

The "Unknown" neighborhood is not a surprise conspiracy neighborhood that no one knows about; it stands for neighborhoods listed as unknown in the survey.

Specific file notes

  • 2006-2008.csv - Does not contain Unknown data, and the Mid-City Industrial neighborhood is not present
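Differences like this can be checked by comparing the neighborhood names in two files. A rough sketch, assuming each CSV has a header row and that city_gis_corrected_name is filled in for the rows you care about (whether "Unknown" shows up depends on how that column is populated for it):

```python
import csv


def neighborhood_names(csv_lines, column="city_gis_corrected_name"):
    """Collect the non-empty values of one column from an iterable of CSV lines."""
    return {row[column] for row in csv.DictReader(csv_lines) if row.get(column)}


def missing_from(reference_lines, other_lines, column="city_gis_corrected_name"):
    """Names present in the reference file but absent from the other file, sorted."""
    reference = neighborhood_names(reference_lines, column)
    other = neighborhood_names(other_lines, column)
    return sorted(reference - other)


# Example usage (assumes you run it from the repository root):
# with open("2007-2009.csv", newline="") as newer, open("2006-2008.csv", newline="") as older:
#     print(missing_from(newer, older))
```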

TODOs

There is still more data to release from the PDFs:

  • Community data
  • City data
  • More years