NY

James Turk edited this page Mar 28, 2013 · 5 revisions

What we're scraping and why

===================

We're currently using the Senate API for:

  * the master list of active bill_id values
  * the short bill title

We're currently scraping the senate website for:

  * bill "subjects", which on the senate site means the law compilation the bill primarily affects
  * senate committee and floor votes
  * senate sponsors' memoranda
  * committee meetings

We're currently scraping the assembly for:

  * Bill sponsors, actions, and summaries
  * assembly votes
  * assembly sponsors' memoranda
  * version urls
  * assembly events

Senate API issues:

  * Bill actions are sometimes mangled or truncated in "uni bills"
  * Bill action ordering is mangled, apparently by an errant date sort somewhere in the API
  * Sponsors are getting mangled on some bills
  * Bills have no session attribute
  * Assembly same-as ids aren't consistently displayed on the senate site
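
One possible workaround for the missing session attribute is to derive it from a year, since NY legislative sessions run for two years starting in an odd-numbered year. This is a minimal sketch, assuming a year is available for each bill (for example from an action date). The name `session_for_year` and the "2013-2014" label format are illustrative and not part of the Senate API or the scraper.

```python
def session_for_year(year):
    """Return a two-year session label such as "2013-2014" for a calendar year."""
    # Sessions begin in odd years, so an even year belongs to the session
    # that started the year before.
    start = year if year % 2 == 1 else year - 1
    return f"{start}-{start + 1}"
```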

Weirdness Of NY Companion Bills

=======================

NY has "same-as" bills, which are companion bills. It also has "uni bills", which are same-as bills that can only be amended if the amendment passes both houses. In other respects, though, they're two separate bills.

Other Vagaries of NY Legislative Info

==============================

The assembly site doesn't publish senate votes. The senate site doesn't publish assembly votes.

So it seems the only way to get all votes for a bill that has a companion is to scrape both, then merge each bill's votes into the other, but only if the other hasn't been substituted and killed off first. AAAAAAAAARGGH!!
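
The merge described above might look roughly like this. It is only a sketch: the `Bill` record, its `substituted` flag, and `merge_companion_votes` are hypothetical names made up for illustration, not the scraper's actual data model, and votes are treated as plain comparable values.

```python
from dataclasses import dataclass, field


@dataclass
class Bill:
    # Hypothetical record; field names are illustrative only.
    bill_id: str
    votes: list = field(default_factory=list)
    substituted: bool = False  # True once the bill was replaced by its companion


def merge_companion_votes(a, b):
    """Copy each bill's own votes onto its companion, skipping substituted bills."""
    # Snapshot first so votes copied in one direction are not copied back.
    a_votes, b_votes = list(a.votes), list(b.votes)
    if not b.substituted:
        b.votes.extend(v for v in a_votes if v not in b_votes)
    if not a.substituted:
        a.votes.extend(v for v in b_votes if v not in a_votes)
```

With a senate bill and a substituted assembly companion, only the surviving senate bill picks up the other chamber's votes. The substituted bill keeps just its own.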