Add command line option to limit dataset years #60
LGTM… waiting for okfn-brasil/serenata-toolbox#97 then.
rosie.py
Outdated
  klass = getattr(rosie, target_module)
- klass.main(target_directory)
+ klass.main(target_directory, years=years)
This is failing in tests:
$ python rosie.py run federal_senate
Traceback (most recent call last):
File "rosie.py", line 62, in <module>
command()
File "rosie.py", line 36, in run
klass.main(target_directory, years=years)
UnboundLocalError: local variable 'years' referenced before assignment
If no --years is passed, the years variable is never set.
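One way to avoid the UnboundLocalError above is to bind the name before looking for the flag, so it always has a value. This is only a minimal sketch: `parse_years` is a hypothetical helper, not code from this PR, and the comma-separated format for multiple years is an assumption.

```python
def parse_years(argv):
    """Return the years given after --years, or None when the flag is absent.

    Hypothetical helper for illustration; the comma-separated format
    (e.g. "--years 2016,2017") is an assumption, not the PR's behavior.
    """
    years = None  # bound up front, so the name exists even without --years
    if '--years' in argv:
        index = argv.index('--years') + 1
        if index < len(argv):
            years = [int(year) for year in argv[index].split(',')]
    return years


# The failing invocation from the traceback no longer raises:
years = parse_years(['rosie.py', 'run', 'federal_senate'])
```

With `years` defaulting to `None`, the callee can treat a missing flag as "no limit" instead of the caller crashing.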
Hello everyone, I'm getting back to this PR :) I tested the command with:
(serenata_rosie) ➜ rosie git:(irio-limit-years) ✗ python rosie.py run chamber_of_deputies --years 2017 /tmp/
2017-12-08 13:58:34,684 - root - INFO - Merging all datasets…
2017-12-08 13:58:34,684 - root - INFO - Loading reimbursements-2017.xz…
2017-12-08 13:58:37,153 - root - INFO - Dropping rows without document_value or reimbursement_number…
2017-12-08 13:58:37,845 - root - INFO - Grouping dataset by applicant_id, document_id and year…
2017-12-08 13:58:37,846 - root - INFO - Gathering all reimbursement numbers together…
2017-12-08 13:58:40,804 - root - INFO - Summing all net values together…
2017-12-08 13:58:40,826 - root - INFO - Summing all reimbursement values together…
2017-12-08 13:58:40,852 - root - INFO - Generating the new dataset…
2017-12-08 13:58:41,999 - root - INFO - Casting changes to a new DataFrame…
2017-12-08 13:58:41,999 - root - INFO - Writing it to file…
2017-12-08 13:59:00,764 - root - INFO - Done.
Downloading 2016-09-03-companies.xz: 100%|██████████████████████████████████████████████████████████████| 4.84M/4.84M [00:03<00:00, 1.30Mb/s]
@@ -31,7 +30,12 @@ def run():
exit(1)
target_directory = argv[3] if len(argv) >= 4 else '/tmp/serenata-data/'
Just found the problem here: if we update the value, we also need to update the check here: if len(argv) >= 4 else '/tmp/serenata-data/'
What I plan to do: like @Irio's --years flag, I'll add a --path flag for the path, so if there is a path, it will be the next argument after the flag.
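For illustration, named --years and --path options could be sketched with the standard library's argparse instead of positional argv indexes. This is not the code from this PR: `build_parser`, the subcommand layout, and accepting several space-separated years are assumptions. The default path is the one from the diff above.

```python
import argparse


def build_parser():
    """Hypothetical parser sketch; names and layout are assumptions."""
    parser = argparse.ArgumentParser(prog='rosie.py')
    subparsers = parser.add_subparsers(dest='command')
    run = subparsers.add_parser('run')
    run.add_argument('module', choices=('chamber_of_deputies', 'federal_senate'))
    # Both options are optional; absent flags fall back to the defaults,
    # so neither name is ever left unset.
    run.add_argument('--years', type=int, nargs='+', default=None)
    run.add_argument('--path', default='/tmp/serenata-data/')
    return parser


args = build_parser().parse_args(
    ['run', 'chamber_of_deputies', '--years', '2017', '--path', '/tmp/']
)
```

Because each value is tied to its flag rather than to a position, adding or omitting one option does not shift how the others are read.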
Depends on datasciencebr/serenata-toolbox#97. Tests are still missing and will be added once the other pull request gets merged.