veltman/congressional-acronyms

Congressional Acronym Data

This is the data behind my analysis of congressional acronym usage. It comes from GovTrack by way of their Bulk Data API. The files for all bills since 1973 were downloaded in XML format, and the bill titles and sponsor information were extracted and analyzed. For details on some of the data complexities to watch out for, see my notes.

The data consists of four tab-separated files:

  • bills.tsv. Basic information about all bills, including bill type, bill number, introduction date, sponsor ID, session of congress, and bill status. The key columns are IS_ACRONYM (boolean, 1 or 0) and ACRONYM_WORD (text, contains the acronym word or words). You can reconstruct a GovTrack URL from the pieces in the format: http://www.govtrack.us/congress/bills/{session of congress}/{bill type}{bill number}

  • titles.tsv. Titles for each bill, listed by bill ID. Includes the title type ("short", "official", or "popular").

  • legislators.tsv. Basic information about bill sponsors: name, sortable name, party, GovTrack URL.

  • wordmap.tsv. A utility table to deal with the fact that a small number of bills have multiple acronym words. Some of them are phrases, like "SAFE DOSES"; others have multiple independent acronym titles, like the PACT Act that is also the PRECAUTION Act; a few have a phrase with a non-acronym word in the middle, like "HIRE at HOME".
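As a rough illustration of working with bills.tsv, the Python sketch below filters rows on IS_ACRONYM and rebuilds a GovTrack URL from the pattern described above. Only IS_ACRONYM and ACRONYM_WORD are column names confirmed here; SESSION, BILL_TYPE, and BILL_NUMBER are hypothetical placeholders, and the sample rows are made up, so check the real header row before using it. It also assumes the files contain no quoted fields and that the bill type is stored in the form the URL expects.

```python
import csv
import io


def govtrack_url(session, bill_type, bill_number):
    """Build a bill URL following the pattern given for bills.tsv."""
    return f"http://www.govtrack.us/congress/bills/{session}/{bill_type}{bill_number}"


def acronym_rows(tsv_file):
    """Yield only the rows whose IS_ACRONYM column is 1."""
    # Assumption: fields are plain tab-separated text with no quoting.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["IS_ACRONYM"] == "1":
            yield row


# Made-up sample standing in for bills.tsv. SESSION, BILL_TYPE, and
# BILL_NUMBER are hypothetical column names, not taken from the real file.
sample = io.StringIO(
    "SESSION\tBILL_TYPE\tBILL_NUMBER\tIS_ACRONYM\tACRONYM_WORD\n"
    "113\thr\t1\t0\t\n"
    "113\ts\t2\t1\tEXAMPLE\n"
)

for row in acronym_rows(sample):
    url = govtrack_url(row["SESSION"], row["BILL_TYPE"], row["BILL_NUMBER"])
    print(row["ACRONYM_WORD"], url)
```

To run this against the real data, replace `sample` with `open("bills.tsv", newline="", encoding="utf-8")` and swap in the actual column names.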

About

Raw data on congressional acronyms, 1972-2013
