What fancy schools do U.S. legislators go to?

Congressmembers who went to cool colleges

Which of our elected representatives earned their academic stripes (or worked, or at least attended classes) at America's best colleges, as ranked by U.S. News and World Report?

Its temporary home is here: http://beta-congress-colleges-fun.s3-website-us-east-1.amazonaws.com/

This is an example of a (very simple) final project for the Stanford Computational Journalism class.

It's a work in progress. It needs more slicing-and-dicing (e.g. how many Senators vs. Representatives for a given school?) and more views (e.g. show all current Congressmembers, including those who don't have any affiliation with these fancy schools). The original goal was to list all colleges...but if you look at the raw unstructured data, you'll see why that's not so straightforward...

The data

This repo contains some code and data to mash up several data sources:

The wrangling

The file scripts/bootstrap/filter_top_schools.py is a messy script that joins the unstructured bioguide text with proper school names.

The produced data files are in: data/wrangled/
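To give a rough idea of what joining free-form biography text to canonical school names can look like, here is a minimal sketch. It is not the logic in filter_top_schools.py; the alias table, the function names, and the sample biography are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical alias table: canonical school name -> spellings that might
# appear in a biography. The real list of schools and variants will differ.
SCHOOL_ALIASES = {
    "Harvard University": ["Harvard University", "Harvard College", "Harvard Law School"],
    "Stanford University": ["Stanford University", "Leland Stanford Junior University"],
    "Yale University": ["Yale University", "Yale College", "Yale Law School"],
}


def compile_patterns(aliases):
    """Build one case-insensitive regex per canonical school name."""
    patterns = {}
    for school, names in aliases.items():
        # Try longer aliases first so the most specific spelling wins.
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        patterns[school] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


def find_schools(bio_text, patterns):
    """Return the canonical schools mentioned anywhere in a biography."""
    # Collapse runs of whitespace so aliases split across lines still match.
    normalized = " ".join(bio_text.split())
    return [school for school, pattern in patterns.items() if pattern.search(normalized)]


if __name__ == "__main__":
    patterns = compile_patterns(SCHOOL_ALIASES)
    sample_bio = "graduated from Yale College, 1950; attended Harvard Law\nSchool"
    print(find_schools(sample_bio, patterns))
```

A lookup like this only finds that a school is mentioned. It cannot tell whether the person graduated, taught, or merely attended classes there, which is part of why the matching gets messy.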

The app

This app is a fairly simple Flask app, not much different from the one described in the First News App Tutorial.

It contains an example of frozen deployment using the Frozen-Flask library. Just run freeze.py to generate a build/ directory that can be uploaded to any static file server, including S3.

Further work

I ended up classifying just the USN&WR schools because there are just too many variations in how the BioGuide authors chose to describe life events. But there are plenty of other things to categorize, such as how many legislators were doctors/lawyers/small-town-mayors, or served in the military, or passed the bar, etc.

While there are 12,000+ total Congressmembers, there are enough incongruities in the biographical record (never mind such things as colleges renaming themselves over the centuries) that it may be difficult to devise a purely machine-learning approach that isn't much more hacky and time-intensive than a focused crowd-sourced effort.