Update IL scraper for older sessions #105

Merged 4 commits into sunlightlabs:master on Nov 7, 2011



I see that the ILGA has changed its URL structure since I last worked on this code. I poked around and worked out how to get URLs for the four previous sessions of the assembly. I haven't tested things exhaustively, but I did run the entire bill scraper; several days later, it had hit no fatal errors, and the JSON I've looked at seems sound.

There do seem to be six kinds of documents that are not currently being passed to bill.add_document. I'll see if I can find time to check those out and process them, but I figured I'd offer what works now rather than defer it indefinitely...
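
For anyone picking up the unhandled documents, here is a minimal sketch of one way to tally the kinds that never reach bill.add_document. Everything in it is an assumption made for illustration: the `Bill` class is a stand-in that models only an `add_document(name, url)` call, and `HANDLED_KINDS`, `attach_documents`, and the kind labels are hypothetical names, not the scraper's actual code.

```python
from collections import Counter


class Bill:
    """Hypothetical stand-in for the scraper's bill object.

    Only add_document is modeled, with an assumed (name, url) signature.
    """

    def __init__(self, bill_id):
        self.bill_id = bill_id
        self.documents = []

    def add_document(self, name, url):
        self.documents.append({"name": name, "url": url})


# Placeholder label for the document kinds the scraper already handles.
HANDLED_KINDS = {"handled-kind"}


def attach_documents(bill, links, skipped):
    """Attach handled links to the bill and count the kinds that were skipped.

    `links` is an iterable of (kind, name, url) tuples; `skipped` is a Counter
    that accumulates across bills.
    """
    for kind, name, url in links:
        if kind in HANDLED_KINDS:
            bill.add_document(name, url)
        else:
            skipped[kind] += 1


skipped = Counter()
bill = Bill("HB 1")
attach_documents(
    bill,
    [
        ("handled-kind", "Example document", "http://example.com/a"),
        ("unhandled-kind", "Another document", "http://example.com/b"),
    ],
    skipped,
)
```

Printing `skipped.most_common()` after a full run would show which kinds are worth handling first.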

@jamesturk merged commit 40728ce into sunlightlabs:master on Nov 7, 2011
@cweber added a commit that referenced this pull request on Jul 10, 2012: Added missingInfo link style. [Issue #105] (a6e68d7)