Wisconsin 2014 general elections data missing an Assembly district? #10

Closed
epaulson opened this issue Sep 25, 2016 · 5 comments
@epaulson

It looks like Assembly district 99 got dropped in the data as checked in:

In [1]: import pandas as pd

In [2]: df = pd.read_csv("20141104__wi__general_ward.csv")

In [3]: df.columns
Out[3]: 
Index(['county', 'ward', 'office', 'district', 'total votes', 'party',
       'candidate', 'votes'],
      dtype='object')

In [4]: df.loc[df['office'] == 'Assembly']['district'].unique()
Out[4]: 
array([  1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,  11.,
        12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,  22.,
        23.,  24.,  25.,  26.,  27.,  28.,  29.,  30.,  31.,  32.,  33.,
        34.,  35.,  36.,  37.,  38.,  39.,  40.,  41.,  42.,  43.,  44.,
        45.,  46.,  47.,  48.,  49.,  50.,  51.,  52.,  53.,  54.,  55.,
        56.,  57.,  58.,  59.,  60.,  61.,  62.,  63.,  64.,  65.,  66.,
        67.,  68.,  69.,  70.,  71.,  72.,  73.,  74.,  75.,  76.,  77.,
        78.,  79.,  80.,  81.,  82.,  83.,  84.,  85.,  86.,  87.,  88.,
        89.,  90.,  91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.])

In [5]: len(df.loc[df['office'] == 'Assembly']['district'].unique())
Out[5]: 98
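The check above only shows the count. A minimal sketch of listing which district numbers are absent, assuming Assembly districts are numbered 1 through 99 (implied by the AD 99 mention in this thread) and using the standard-library `csv` module rather than pandas so it runs without dependencies. The helper name `missing_districts` and the sample rows are made up for illustration:

```python
import csv
import io

def missing_districts(rows, office="Assembly", expected=range(1, 100)):
    """Return expected district numbers that never appear for the given office."""
    seen = set()
    for row in rows:
        # Skip other offices and rows with no district (e.g. statewide races).
        if row["office"] != office or not row["district"]:
            continue
        # Districts may be stored as floats such as "98.0"; normalize to int.
        seen.add(int(float(row["district"])))
    return sorted(set(expected) - seen)

# Tiny in-memory stand-in for the ward-level CSV (hypothetical rows).
sample = io.StringIO(
    "county,ward,office,district,total votes,party,candidate,votes\n"
    "Adams,Ward 1,Assembly,1.0,10,DEM,A,6\n"
    "Adams,Ward 1,Assembly,2.0,10,REP,B,4\n"
    "Adams,Ward 1,Governor,,10,DEM,C,5\n"
)
print(missing_districts(csv.DictReader(sample), expected=range(1, 4)))
# → [3]
```

Against the real file, passing `csv.DictReader(open("20141104__wi__general_ward.csv"))` with the default `expected` range should report any dropped district directly.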

Grabbing the API metadata from http://openelections.net/api/v1/election/?format=json&limit=0&state__postal=WI and checking the .xlsx file that the GAB supplies, I do see AD 99:

"direct_links": [
"http://www.gab.wi.gov/sites/default/files/11.4.2014%20Election%20Results%20-%20all%20offices%20w%20x%20w%20report.xlsx"
],
"end_date": "2014-11-04",
"gov": true,
"house": true,
"id": 1574,

I'm not sure which other elections might be missing data, in case this is an off-by-one error somewhere in the parser...

@epaulson
Author

epaulson commented Oct 1, 2016

This does seem to be a problem with the processed CSV data as checked into GitHub. When I clone the repo and run parse.py, which uses the Excel files in the local_data_cache, I get different results from the data checked in to the yearly directories:

(openelex27)epaulson:~/development/openelex/openelections-data-wi $ git status
# On branch 10-missing-election
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   2010/20100914__wi__primary_ward.csv
#   modified:   2010/20101102__wi__general_ward.csv
#   modified:   2011/20110503__wi__special_general_ward.csv
#   modified:   2012/20120403__wi__primary_ward.csv
#   modified:   2012/20120605__wi__general-recall_ward.csv
#   modified:   2014/20140812__wi__primary_ward.csv
#   modified:   2014/20141104__wi__general_ward.csv
#   modified:   2015/20150217__wi__special_primary_ward.csv

Not sure how you want to keep those up to date with what parser.py produces.
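One way to see what the regenerated files add, beyond `git status`, is to compare the districts present per office in the checked-in and regenerated CSVs. This is only a sketch: `districts_by_office` and `added_districts` are hypothetical helpers, the sample rows are invented, and district labels are compared as raw strings (so `98` and `98.0` would count as different):

```python
import csv
import io
from collections import defaultdict

def districts_by_office(rows):
    """Map each office to the set of district labels seen in its rows."""
    found = defaultdict(set)
    for row in rows:
        if row["district"]:  # statewide offices have no district
            found[row["office"]].add(row["district"])
    return found

def added_districts(old_rows, new_rows):
    """Return {office: [districts]} present in new_rows but absent from old_rows."""
    old = districts_by_office(old_rows)
    new = districts_by_office(new_rows)
    diff = {}
    for office, districts in new.items():
        extra = districts - old.get(office, set())
        if extra:
            diff[office] = sorted(extra)
    return diff

# Invented stand-ins for a checked-in file and a freshly parsed one.
checked_in = io.StringIO("office,district\nAssembly,97\nAssembly,98\n")
regenerated = io.StringIO("office,district\nAssembly,97\nAssembly,98\nAssembly,99\n")
print(added_districts(csv.DictReader(checked_in), csv.DictReader(regenerated)))
# → {'Assembly': ['99']}
```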

@dwillis
Contributor

dwillis commented Oct 1, 2016

Thanks for this catch.

@nbdavies
Contributor

@davipo has some changes in his fork that fix this. (There were some spreadsheets where the last sheet wasn't being read.)
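To illustrate how a skipped last sheet could drop exactly one district: the sketch below is not the project's parser code, and the sheet names and both functions are hypothetical. It only shows the general off-by-one pattern and a simple sanity check that would catch it.

```python
def parse_sheets_buggy(sheets):
    """Off-by-one: the range stops one short, so the final sheet is never read."""
    return [sheets[i][0] for i in range(len(sheets) - 1)]

def parse_sheets(sheets):
    """Iterate over the sheets directly so none can be skipped."""
    return [name for name, _rows in sheets]

# Hypothetical workbook: one (sheet name, rows) pair per Assembly district.
workbook = [("AD 97", []), ("AD 98", []), ("AD 99", [])]

print(parse_sheets_buggy(workbook))  # → ['AD 97', 'AD 98']
print(parse_sheets(workbook))        # → ['AD 97', 'AD 98', 'AD 99']

# Sanity check worth keeping in a parser: every sheet was processed.
assert len(parse_sheets(workbook)) == len(workbook)
```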

@nbdavies
Contributor

@epaulson Can we close this one?

@epaulson
Author

epaulson commented Jan 1, 2017

Yeah, I think so - I just spot-checked the same CSV from a fresh checkout and it had 99 races instead of 98, so I think it's better now.

epaulson closed this as completed Jan 1, 2017