NM: Update votes scraper (was: Fix individual members' votes in House PDFs) #65
Here's a clean-ish version of the scraped
The scraper warns:
Yeah, that's certainly nudgable. I'm curious about the PDF source; I wonder if it's OCR, or they used some WYSIWYG form generator and were sloppy.
@mileswwatkins is this solved?
@In-vincible, no, my PR just changed the code to skip votes that were troublesome, which is why I spun off this ticket. https://github.com/openstates/openstates/pull/2103/files#diff-d57e2d82487395e5ff5349aea8c56550R273
Spacing issues still exist within PDFs. Sometimes a yes vote is categorized as a no vote. Example: https://www.nmlegis.gov/Sessions/19%20Regular/votes/HB0256HVOTE.PDF
Adding context to this old issue:
New Mexico serves its votes in PDFs (directory), and we try to parse their tables using the `x` and `y` coordinates of the `X` checkmarks.

Unfortunately, at least for one of the 2018 session's House vote PDFs, the rows in the LXMLized vote PDF don't line up; that is, one of the vote checkmarks has a `y` coordinate that differs from its member, so the vote can't be attributed.

When this is detected, I'm setting the scraper to throw out individual-member counts, and keep vote totals. But the individual-member scraping is so close, some additional logic may be able to salvage these cases.
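One possible shape for that additional logic is a rough sketch that matches each checkmark to the nearest member row within a tolerance, rather than requiring identical `y` values. This is not the scraper's actual code: `attribute_votes`, its input tuples, and the `tolerance` default are all hypothetical, and it assumes a single column of members (the real PDFs would also need the `x` coordinate to pick the right column).

```python
from typing import Dict, List, Optional, Tuple


def attribute_votes(
    members: List[Tuple[str, float]],
    checkmarks: List[Tuple[str, float]],
    tolerance: float = 3.0,
) -> Optional[Dict[str, str]]:
    """Match checkmarks to members by nearest y coordinate.

    members: (name, y) pairs; checkmarks: (vote_type, y) pairs.
    Returns a name -> vote_type mapping, or None when any checkmark
    cannot be attributed unambiguously (the caller can then fall back
    to keeping only the vote totals).
    """
    if not members:
        return None

    votes: Dict[str, str] = {}
    for vote_type, mark_y in checkmarks:
        # Pick the member row that is vertically closest to this checkmark.
        name, member_y = min(members, key=lambda m: abs(m[1] - mark_y))
        # Give up if the closest row is still too far away, or if that
        # member has already been assigned a vote by another checkmark.
        if abs(member_y - mark_y) > tolerance or name in votes:
            return None
        votes[name] = vote_type
    return votes


# Hypothetical rows: the second checkmark is 1.5 units off its member's row.
rows = [("Member A", 100.0), ("Member B", 112.0), ("Member C", 124.0)]
marks = [("yes", 100.0), ("no", 113.5), ("yes", 124.0)]
result = attribute_votes(rows, marks)
```

Returning `None` on any doubtful match keeps the current fallback intact: individual-member votes are only recorded when every checkmark lands on exactly one member.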
cc @cliftonmcintosh