Wabstic import #158

gh-PonyM · 2019-08-08T15:43:44Z

Improved errors when parsing majorz and proporz elections.
Switched the expected header in the wabstic import scripts from gewahlt to gewaehlt. Fixed the failing import of the majorz election test dataset that raised "Missing column: gewahlt".
Import of the new majorz and proporz test data was successful.

- Passing tests
- Removing archive search specific modifications

- Move into the else block, that gets executed if try does not raise any error

- Test catches the first of the exceptions, the rest works the same way
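
To illustrate the "Missing column: gewahlt" failure mentioned in the description, here is a minimal sketch of a header check. It is not the project's load_csv: the helper name missing_columns, the sample data, and the lowercasing of header names are assumptions made for illustration only.

```python
import csv
import io


def missing_columns(csv_text, expected_headers):
    """Return the expected headers that are absent from the CSV header row.

    Hypothetical helper, not part of onegov: header names are lowercased
    before comparison, which is an assumption about how the files look.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header_row = next(reader, [])
    present = {name.strip().lower() for name in header_row}
    return [name for name in expected_headers if name not in present]


# Made-up sample: the file spells the column "Gewaehlt".
sample = "SortGeschaeft,Gewaehlt\n1,1\n"

print(missing_columns(sample, ["sortgeschaeft", "gewahlt"]))
print(missing_columns(sample, ["sortgeschaeft", "gewaehlt"]))
```

With the old expected header the check reports gewahlt as missing; with the renamed header nothing is missing.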

coveralls · 2019-08-08T15:53:51Z

Coverage decreased (-0.04%) to 94.844% when pulling e5a4fcd on wabstic_import into d21c358 on master.

href

Looks good to me, aside from a few small things. You certainly want to avoid consuming all the CSV lines however - that's a blocker for me.

I think coercing None to 0 makes sense, as CandidateResult.votes wants it that way. When in doubt we probably want to stick with the existing model. A lot more thinking goes into building up the model than adding features later, so I guess we trust that it makes sense as is 😉

href · 2019-08-09T11:26:00Z

onegov/election_day/formats/common.py

+def line_is_relevant(line, number, district=None):
+    if district:
+        return line.sortwahlkreis == district and line.sortgeschaeft == number
+    else:


You can remove the else, since the if above always returns:

    if district:
        return line.sortwahlkreis == district and line.sortgeschaeft == number
    return line.sortgeschaeft == number
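
A runnable sketch of the simplified function, using a namedtuple as a stand-in for a parsed CSV line. The Line type and the sample values are assumptions; the real line objects come from the CSV loader.

```python
from collections import namedtuple

# Stand-in for a parsed CSV line; the real objects come from the importer.
Line = namedtuple("Line", ["sortwahlkreis", "sortgeschaeft"])


def line_is_relevant(line, number, district=None):
    # The first branch always returns, so no else is needed.
    if district:
        return line.sortwahlkreis == district and line.sortgeschaeft == number
    return line.sortgeschaeft == number


line = Line(sortwahlkreis="3", sortgeschaeft="1")
print(line_is_relevant(line, "1"))
print(line_is_relevant(line, "1", district="3"))
print(line_is_relevant(line, "1", district="4"))
```

The behavior is the same as with the explicit else branch; only the nesting is reduced.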

href · 2019-08-09T11:28:29Z

onegov/election_day/formats/common.py

+    :param none_be_zero: raises ValueError if line.col is None
+    :return: integer value of line.col
+    """
+    assert hasattr(line, col), 'Check done in load_csv'


What does 'Check done in load_csv' mean? Could you elaborate here?

If you need longer texts, you can also write assertions like this:

    assert hasattr(line, col), f"""
        {col} was not found on {line} - this should have failed in load_csv
    """

I forgot to drop this line: load_csv in common.py already checks that the expected columns are present. The comment was a remnant of that.

href · 2019-08-09T11:29:49Z

onegov/election_day/formats/common.py

+        return line.sortgeschaeft == number
+
+
+def validate_integer(line, col, none_be_zero=True):


I'd say none_be_zero should be named treat_none_as_zero. What do you think?

It's better like this. I also added a default kwarg whose value is returned as the fallback, and changed the lines where the votes are read for the candidate.
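
A minimal sketch of what a validator with the renamed flag and a default keyword argument could look like. The signature, the handling of empty strings, and the error message are assumptions; the merged code may differ.

```python
from types import SimpleNamespace


def validate_integer(line, col, treat_none_as_zero=True, default=None):
    """Return the integer value of line.<col>.

    Hypothetical sketch: an empty or missing value becomes 0 when
    treat_none_as_zero is set, otherwise `default` is returned.
    """
    value = getattr(line, col)
    if value is None or value == "":
        return 0 if treat_none_as_zero else default
    try:
        return int(value)
    except (TypeError, ValueError):
        raise ValueError(f"Invalid integer: {col}") from None


# Made-up line object for illustration.
line = SimpleNamespace(stimmen="42", bfsnrgemeinde="", gewaehlt="x")
print(validate_integer(line, "stimmen"))
print(validate_integer(line, "bfsnrgemeinde"))
print(validate_integer(line, "bfsnrgemeinde", treat_none_as_zero=False))
```

Passing treat_none_as_zero=False without a default returns None for an empty cell, which is one way to keep an older "None for empty" behavior available to callers.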

href · 2019-08-09T11:32:01Z

onegov/election_day/formats/election/wabstic_majorz.py

-    entity_id = int(line.bfsnrgemeinde or 0)
+def get_entity_id(line):
+    col = 'bfsnrgemeinde'
+    # try:


Commented out code should not be committed, unless it is meant to be activated in the near future (for example, if a Puppet deployment has been prepared and tested, but not yet rolled out).
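
If error handling around the integer conversion is wanted, a try/except/else block (the pattern named in the commit "Fixes use of entity_id before possible assignment (in try block)") keeps the variable safely bound. The following is only a sketch: the errors list, the messages, and the known_entities parameter are hypothetical and not taken from the diff.

```python
from types import SimpleNamespace


def get_entity_id(line, known_entities, errors):
    """Parse line.bfsnrgemeinde and return it if it is a known entity."""
    col = "bfsnrgemeinde"
    try:
        entity_id = int(getattr(line, col) or 0)
    except ValueError:
        errors.append(f"Invalid integer: {col}")
    else:
        # Runs only if int() succeeded, so entity_id is always bound here.
        if entity_id in known_entities:
            return entity_id
        errors.append(f"Unknown entity: {entity_id}")
    return None


errors = []
print(get_entity_id(SimpleNamespace(bfsnrgemeinde="3203"), {3203}, errors))
print(get_entity_id(SimpleNamespace(bfsnrgemeinde="abc"), {3203}, errors))
print(errors)
```

Keeping the follow-up logic in the else block means it can never run with an unassigned entity_id, and exceptions raised by that logic are not swallowed by the except clause.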

href · 2019-08-09T11:36:19Z

onegov/election_day/formats/election/wabstic_majorz.py

@@ -79,6 +81,16 @@ def import_election_wabstic_majorz(
    entities = principal.entities[election.date.year]
    election_id = election.id

+    def has_no_lines(lines, filename):
+        if not list(lines):


You consume the whole list of lines to check whether there's at least one line. That seems wasteful. If lines is not a generator, you can just check for that by running if lines.

If lines is a generator, you don't have to consume it to check. You can do either this:

    for line in lines:
        return False  # has lines
    return True  # has no lines

Or by getting the next value of the generator with a fallback:

    if next(lines, None):
        return True  # has lines
    return False  # has no lines

Yes, it's definitely a blocker...
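
Note that both suggested variants still take the first item off a generator, and the next(lines, None) check would also treat a falsy first item as "no lines". A minimal sketch of a check that avoids both, assuming lines may be any iterable; the helper name peek_has_lines is hypothetical:

```python
from itertools import chain

_MISSING = object()  # sentinel, so falsy first items still count as lines


def peek_has_lines(lines):
    """Return (has_lines, lines) without losing the first item.

    Hypothetical helper: the returned iterable yields every original
    item, including the one that was peeked at.
    """
    iterator = iter(lines)
    first = next(iterator, _MISSING)
    if first is _MISSING:
        return False, iterator
    return True, chain([first], iterator)


has_lines, lines = peek_has_lines(line for line in ["a", "b"])
print(has_lines, list(lines))

has_lines, lines = peek_has_lines(iter([]))
print(has_lines, list(lines))
```

Only one item is read for the check, and the caller continues with the returned iterable instead of the original one.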

href · 2019-08-09T11:37:52Z

onegov/election_day/formats/election/wabstic_proporz.py

@@ -179,6 +185,16 @@ def import_election_wabstic_proporz(
    entities = principal.entities[election.date.year]
    election_id = election.id

+    def has_no_lines(lines, filename):
+        if not list(lines):


Same as above, this should not traverse all lines.

- unclear which files could be empty and must be importable
- does not fit script flow causing errors anyway and aborting import

Lukas added 18 commits August 8, 2019 12:47

Adds modifications from archive search branch

015227d

- Passing tests
- Removing archive search specific modifications

Changes expected column title according to csv files (gewaehlt)

bf88641

Refactoring validation functions

46d27b2

Fixes use of entity_id before possible assignment (in try block)

f939002

- Move into the else block, that gets executed if try does not raise any error

Adds helper function

145df90

Adds annotations and debug prints

d8cf71f

Refactoring

2b3e07c

Refactoring

d438ee5

Sorts printed errors to compare to test

634a33f

Refactoring

1ca25d4

- Test catches the first of the exceptions, the rest works the same way

Renames gewahlt to gewaehlt, refactoring, improved errors

a79eb0f

Fixes '' in csv to be 0 (integer) according to CandidateResult.votes

6cfa318

Refactoring

e41a98e

Improves error msgs for kandidatengde, fixes tests

5267001

Refactoring, line_is_relevant can be shared for proporz and majorz

622bfb2

Linting

6458629

Refactoring

03f733f

Reduces redundancy of checking for missing columns

c946a3f

gh-PonyM requested a review from href August 8, 2019 15:43

href suggested changes Aug 9, 2019

Lukas added 5 commits August 9, 2019 18:16

Removes validation since it's done in csv import

ee492ac

removes validation of empty lines

6e632f5

- unclear which files could be empty and must be importable
- does not fit script flow causing errors anyway and aborting import

Refactor

a5dcc21

Maintains old return value 'None' if stimmen is '' in csv file

4d88322

Linting

e5a4fcd

gh-PonyM merged commit 7991878 into master Aug 9, 2019
