
Updates for 2019 #38

Closed
52 tasks done
jamesturk opened this issue Dec 18, 2018 · 21 comments

@jamesturk (Member) commented Dec 18, 2018:

Akin to openstates/openstates-scrapers#2681, this is the master ticket for 2019 updates.
Right now the process is a bit rough, but it's described here: https://github.com/openstates/people#updating-an-entire-state-via-a-scrape

Please reference this issue in any PRs 🙂

Remaining States as of 1/16:

  • Alaska
  • Connecticut
  • Florida
  • New Mexico
  • North Carolina - legislator pages 500ing as of 1/3
  • Oregon
  • Washington

Completed:

  • Alabama
  • Arizona
  • Arkansas
  • California
  • Colorado
  • Delaware
  • District of Columbia
  • Georgia
  • Hawaii
  • Idaho
  • Illinois
  • Indiana
  • Iowa
  • Kansas
  • Kentucky
  • Louisiana
  • Maine
  • Maryland
  • Massachusetts
  • Michigan
  • Minnesota
  • Mississippi
  • Missouri
  • Montana
  • Nebraska
  • Nevada
  • New Hampshire
  • New Jersey
  • New York
  • North Dakota
  • Ohio
  • Oklahoma
  • Pennsylvania
  • Puerto Rico
  • Rhode Island
  • South Carolina
  • South Dakota
  • Tennessee
  • Texas
  • Utah
  • Vermont
  • Virginia
  • West Virginia
  • Wisconsin - 2017 data as of 1/3
  • Wyoming
jamesturk mentioned this issue Dec 18, 2018
@jamesturk (Member Author) commented Dec 18, 2018:

Worth noting:
2018 House-only elections: KS, MN, NM, SC
2018 no elections: LA, MS, NJ, VA
(We should still check them, as there are likely a few specials.)

@jamesturk (Member Author) commented:

DC is all incumbents, all good

@jamesturk (Member Author) commented:

Thanks a ton for all the help so far @csnardi - if you're working on any states and want to comment here to claim them, that'd be great so we don't duplicate effort 😄

@nickoneill (Contributor) commented:

Hi there - if I wanted to get started helping out by scraping some states, which state should I start with so as not to step on anyone else's toes?

@csnardi (Contributor) commented Jan 3, 2019:

I don't have any state in progress right now, so you're probably good to start wherever!

@nickoneill (Contributor) commented:

Cool, I will take a look at Nevada this afternoon.

@nickoneill (Contributor) commented:

NV is throwing 500 errors on the endpoints that were previously used. The website seems to display the latest reps, though, so the scraper might need an update. I'm going to try to find a first state that works before I dive into editing a scraper.

WA seems like it runs using docker-compose run --rm scrape wa --fastmode --scrape, but only produces empty json files:

Starting openstates_database_1 ... done
loaded Open States pupa settings...
wa (scrape)
19:32:35 INFO pupa: save jurisdiction Washington as jurisdiction_ocd-jurisdiction-country:us-state:wa-government.json
19:32:35 INFO pupa: save organization Washington State Legislature as organization_52aa27f8-0f8e-11e9-8c2b-0242ac140003.json
19:32:35 INFO pupa: save organization Senate as organization_52aa2e2e-0f8e-11e9-8c2b-0242ac140003.json
19:32:35 INFO pupa: save organization House as organization_52aa3162-0f8e-11e9-8c2b-0242ac140003.json
wa (scrape)
jurisdiction scrape:
  duration:  0:00:00.022996
  objects:
    jurisdiction: 1
    organization: 3

Going through the scraper contributing guide, it says that after this output "And then the actual data scraping begins", with logged GET requests and so on. It doesn't seem to be doing that at all. Any idea where to start debugging this?

jamesturk mentioned this issue Jan 3, 2019
@jamesturk (Member Author) commented:

@nickoneill WA might need scraper updates too; it might be a small change to the page that is causing us to miss whatever data is there. I can take a look at one of those if you want to move on and find one that runs without issue for now.

@jamesturk (Member Author) commented Jan 3, 2019:

Also, possibly of interest to you two (& anyone else looking to take a crack at this): I added a few experimental command-line flags to the merge script:

  --remove-identical / --no-remove-identical
                                  In incoming mode, remove identical files.
  --copy-new / --no-copy-new      In incoming mode, copy brand new files over.
  --interactive / --no-interactive
                                  Do interactive merges.

These are all fairly rough right now, but Colorado was a lot faster when I ran with --copy-new --remove-identical the first time, and then --interactive after that to take care of the rest (you should hit a to abort when the merge candidates get really bad). That got me about 85% of the way there without a ton of manual file moving/etc.

@nickoneill (Contributor) commented:

@jamesturk Thanks, I mostly wanted to check that I wasn't missing something really obvious. I will dig around for something that works on its own for starters.

@jamesturk (Member Author) commented:

np, working on SD now (just a heads up)

@nickoneill (Contributor) commented:

Ah, I didn't realize I had to manually request people to get it to fetch them. I'm moving forward on AZ, will open a PR with more questions about how the person merge process works.
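
For anyone else hitting the empty WA output above: pupa only runs the scrapers named on the command line, which is why the earlier run saved only jurisdiction/organization objects. A hedged example of requesting the people scraper explicitly, assuming the docker-compose scrape service forwards its arguments to pupa update and that the jurisdiction registers a scraper named people:

```shell
# Hypothetical invocation: name the "people" scraper explicitly so pupa
# actually runs it (with no scraper named, only jurisdiction/org
# objects get saved, as in the WA output above)
docker-compose run --rm scrape wa people --fastmode --scrape
```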

@nickoneill (Contributor) commented:

Actually, more questions here first: I ran the merge helper and have some assumptions about what to do next.

For people who are the same (like this with a 0.70 score), update the file in the data directory.
0.70 data/az/people/Ben-Toma-6d71c8d3-c677-4f4b-96eb-411fbc464c2c.yml incoming/az/people/Ben-Toma-74b10855-2da0-4a63-bf6a-620c8ab4a6e8.yml

  • Why do they have new IDs? Which one should I keep?

For people who are not the same (like this pair with a 0.10 score), should I move the unused file to retired and add the new file?
data/az/people/Ken-Clark-a4d3ecd4-8bfc-4e6c-8290-a557bc347680.yml incoming/az/people/Amish-Shah-0147aaf4-bd4a-4a55-8003-684e3e60f522.yml

@csnardi (Contributor) commented Jan 3, 2019:

The new IDs are because they're just GUIDs; there's no central database of legislator IDs like OCD has for Divisions, etc. So when you run the scraper, it creates a new GUID, since it doesn't know any better. The old ID should be kept, since it'll match up with old data.

For people who are not the same, you're likely correct: you'd just want to retire the old legislator and bring in the new one.

I've created a simple script that can automate a lot of the process. It's not as complex and smart as merge.py, but it can largely perform all of the tasks, as long as you check over the results at the end and update the end/start dates: https://gist.github.com/csnardi/518cf39c0d0e909132f8ddec6f3817e9
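
To make the ID question above concrete, a minimal Python sketch of the keep-the-old-GUID rule; name_similarity is a hypothetical stand-in for merge.py's match score (the real scoring weighs more than the name), and the illustrative field values are made up:

```python
import difflib
import uuid

def name_similarity(a: str, b: str) -> float:
    # Hypothetical stand-in for merge.py's match score:
    # plain string similarity between the two names.
    return difflib.SequenceMatcher(None, a, b).ratio()

def merge_person(existing: dict, incoming: dict) -> dict:
    # Take the freshly scraped fields, but keep the existing GUID so the
    # record still lines up with historical data -- the scraper minted a
    # random new one because it has no way to know the old ID.
    merged = dict(incoming)
    merged["id"] = existing["id"]
    return merged

# Each scrape run generates fresh GUIDs, hence the differing filenames/IDs
old = {"id": "ocd-person/" + str(uuid.uuid4()), "name": "Ben Toma"}
new = {"id": "ocd-person/" + str(uuid.uuid4()), "name": "Ben Toma", "district": "22"}
merged = merge_person(old, new)
assert merged["id"] == old["id"]  # old ID wins; scraped fields come along
```

A high-scoring pair like the 0.70 Ben Toma match gets merged this way; a low-scoring pair like Ken Clark / Amish Shah gets the retire-and-add treatment instead.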

jamesturk mentioned this issue Jan 3, 2019
@nickoneill (Contributor) commented:

Washington has not been updated yet; they convene on January 14th, 2019.

@jamesturk (Member Author) commented:

taking another swing at NH next

@jamesturk (Member Author) commented:

AK finishes the batch. There are still tweaks to make, but I'm closing this in favor of individual issues as we go.

jamesturk unpinned this issue Feb 7, 2019