Updates for 2019 #38
Comments
worth noting: DC is all incumbents, all good
thanks a ton for all the help so far @csnardi - if you're working on any and want to comment here to claim them, that'd be great so we don't duplicate effort 😄
Hi there - if I wanted to get started helping out by scraping some states, which state should I start with so I don't step on anyone else's toes?
I don't have any state in progress right now, so you're probably good to start wherever!
Cool, I will take a look at Nevada this afternoon.
NV is throwing 500 errors on the endpoints that were previously used. The website seems to display the latest reps, though, so the scraper might need an update. I'm going to try to find a first state that works before I dive into editing a scraper. WA seems like it runs using
Going through the scraper contributing guide, it says after this stuff "And then the actual data scraping begins", with logged GET requests and whatnot. It doesn't seem like it's doing that at all. Any idea where to start to debug this?
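(Not from the thread, just a sketch of the kind of check that can help here: hitting the endpoint directly shows whether the API itself is returning 500s or whether the scraper is simply parsing a page that changed. The URL below is a placeholder, not the real NV endpoint.)

```python
# Minimal diagnostic sketch (placeholder URL, not the real NV endpoint):
# a 500 here means the API itself is down; a 200 with different-looking
# HTML/JSON usually means the scraper's parsing needs updating instead.
import requests

ENDPOINT = "https://www.leg.state.nv.us/example/endpoint"  # placeholder

resp = requests.get(ENDPOINT, timeout=30)
print(resp.status_code)
print(resp.headers.get("Content-Type"))
print(resp.text[:500])  # peek at what actually comes back
```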
@nickoneill WA might need scraper updates too; it might be a small change to the page that is causing us to miss whatever data is there. I can take a look at one of those if you want to move on and try to find one that runs without issue for now.
also, possibly of interest to you guys (& anyone else looking to take a crack at this): I added a few experimental command line flags to the merge script
these are all fairly rough right now, but Colorado was a lot faster when I ran with
and then it got me about 85% of the way there without a ton of manual file moving/etc.
@jamesturk Thanks, I mostly wanted to check that I wasn't missing something really obvious. I will dig around for something that works on its own for starters.
np, working on SD now (just a heads up)
Ah, I didn't realize I had to manually request
Actually, more questions here first. I ran the merge helper and have some assumptions about what to do next:
- For people who are the same (like this, with a 0.70 score), update the file in the data directory.
- For people who are not the same (like this, with a 0.10 score), move the unused file to
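(Aside, not from the thread: the 0.70 / 0.10 values are similarity scores between an existing record and a newly scraped one. A rough sketch of how such a score can be computed for names is below; merge.py may weigh things differently, so this is only illustrative.)

```python
# Illustrative only -- merge.py's actual scoring may differ.
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two legislator names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

print(name_similarity("Jonathan Smith", "Jon Smith"))    # high -> probably the same person
print(name_similarity("Jonathan Smith", "Maria Lopez"))  # low  -> probably a new legislator
```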
The new IDs are because they're just GUIDs; there's no central database of legislator IDs like OCD has for Divisions/etc. So when you run the scraper, it creates a new GUID since it doesn't know any better. The old ID should be kept, since it'll match up with old data. For people who are not the same, you're likely correct: you'd just want to retire the old legislator and bring in the new one. I've created a simple script that can automate a lot of the process. It's not as complex and smart as merge.py, but it can largely perform all of the tasks as long as you check it over at the end and update the end/start dates: https://gist.github.com/csnardi/518cf39c0d0e909132f8ddec6f3817e9
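(A rough sketch, not csnardi's gist: this is roughly what "keep the old ID" and "retire the old legislator" mean in practice on the YAML files. The file paths and key names here are assumptions, so check the gist and the actual data layout before relying on them.)

```python
# Illustrative sketch only -- paths and key names are assumptions, not the
# actual openstates/people schema or the gist linked above.
import yaml

def keep_old_id(old_path: str, new_path: str) -> None:
    """Copy the stable GUID from the existing file into the freshly scraped one."""
    with open(old_path) as f:
        old = yaml.safe_load(f)
    with open(new_path) as f:
        new = yaml.safe_load(f)
    new["id"] = old["id"]  # preserve the GUID so old data still links up
    with open(new_path, "w") as f:
        yaml.safe_dump(new, f)

def retire(path: str, end_date: str) -> None:
    """Mark a legislator's current role as ended (e.g. they lost or left their seat)."""
    with open(path) as f:
        person = yaml.safe_load(f)
    person["roles"][-1]["end_date"] = end_date  # key layout is an assumption
    with open(path, "w") as f:
        yaml.safe_dump(person, f)
```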
Washington has not been updated; they convene on January 14th, 2019.
taking another swing at NH next |
AK finishes the batch. There are still tweaks to make, but closing this in favor of individual issues as we go.
Akin to openstates/openstates-scrapers#2681, this is the master ticket for 2019 updates.
Right now the process is a bit rough, but it's described here: https://github.com/openstates/people#updating-an-entire-state-via-a-scrape
Please reference this issue in any PRs 🙂
Remaining States as of 1/16:
Completed: