Fixed reimbursement with net value with comma #144

viniciusartur · 2017-09-12T21:21:51Z

Fixes the bug in Rosie reported in issue #77.

Bug:
Given a CSV with a reimbursement whose net_value contains a comma (e.g. 77,99)
When I translate this CSV using Serenata-Toolbox
Then I have a CSV translated with net_value containing comma

Expected:
Then I have a CSV translated with net_value containing dot

Bug analysis:
The converter responsible for replacing the comma with a dot was removed in this commit.

Solution:
Use the decimal parameter of pandas.read_csv(), as suggested by @cuducos here.
A test was added to cover this specific scenario.
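
For readers unfamiliar with the parameter, here is a minimal sketch of how `decimal` behaves in `pandas.read_csv()`. The semicolon delimiter, the two column names, and the sample values are assumptions made for illustration; they are not taken from the toolbox code or its fixtures.

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-delimited, with a decimal comma in net_value.
raw = 'document_id;net_value\n1615640;77,99\n'

# decimal=',' tells pandas to treat the comma as the decimal separator,
# so net_value is parsed as a float instead of being left as a string.
df = pd.read_csv(io.StringIO(raw), sep=';', decimal=',')

# Writing the frame back out uses a dot as the decimal separator by default.
translated = df.to_csv(index=False)
```

With this in place no custom converter is needed: the value is read as a float and written back with a dot.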

rennerocha · 2017-09-12T21:28:08Z

tests/fixtures/chamber_of_deputies/reimbursements-with-comma

@@ -0,0 +1,2 @@
+congressperson_name,congressperson_id,congressperson_document,term,state,party,term_id,subquota_number,subquota_description,subquota_group_id,subquota_group_description,supplier,cnpj_cpf,document_number,document_type,issue_date,document_value,remark_value,net_value,month,year,installment,passenger,leg_of_the_trip,batch_number,reimbursement_number,reimbursement_value,applicant_id,document_id
+ABELARDO CAMARINHA,141463,329,2011,SP,PSB,54,1,Maintenance of office supporting parliamentary activity,0,,TIM CELULAR S/A,04206050005140,20272-AB,0,2009-05-19 00:00:00,411.51,0,411.51,5,2009,0,,,406387,2950,0,1772,1615640


Is this file correct? The name says that the reimbursements have a comma, but the content has a value with a point.

I renamed it to reimbursements-with-decimal-point. Thank you for pointing this out.

cuducos

Many many thanks for that @viniciusartur! I added minor comments, what do you think about them? Overall it's really good, but I would like you to consider some minor tweaks to make the code even clearer ; )

cuducos · 2017-09-13T09:50:59Z

tests/fixtures/chamber_of_deputies/reimbursements-with-decimal-point

@@ -0,0 +1,2 @@
+congressperson_name,congressperson_id,congressperson_document,term,state,party,term_id,subquota_number,subquota_description,subquota_group_id,subquota_group_description,supplier,cnpj_cpf,document_number,document_type,issue_date,document_value,remark_value,net_value,month,year,installment,passenger,leg_of_the_trip,batch_number,reimbursement_number,reimbursement_value,applicant_id,document_id
+ABELARDO CAMARINHA,141463,329,2011,SP,PSB,54,1,Maintenance of office supporting parliamentary activity,0,,TIM CELULAR S/A,04206050005140,20272-AB,0,2009-05-19 00:00:00,411.51,0,411.51,5,2009,0,,,406387,2950,0,1772,1615640


I guess we should have .csv as an extension in this filename, right? Also, can we have equivalent file names?

For example, one is …with-comma and the other …with-decimal-point. What about:

Ano-with-comma.csv

reimbursements-with-point.csv

Or

Ano-with-comma-as-decimal-separator.csv

reimbursements-with-point-as-decimal-separator.csv

I agree with the first option. I'm working on it.

cuducos · 2017-09-13T09:51:03Z

tests/unit/chambers_of_deputies/test_chamber_of_deputies_dataset.py

@@ -67,6 +68,15 @@ def test_clean_2017_reimbursements(self):
            with self.subTest():
                assert(subquota in all_subquotas)

+    def test_translate_csv_with_reimbursement_with_net_value_with_comma(self):
+        csv_with_comma = os.path.join(self.fixtures_path, 'Ano-with-comma.csv')
+        with open(os.path.join(self.fixtures_path, 'reimbursements-with-decimal-point'), 'r') as csv_expected:


Can we define the path in another line so we keep under the PEP8 suggestion of 80 cols?

I agree with that. I'm working on it.

cuducos · 2017-09-13T09:52:23Z

tests/unit/chambers_of_deputies/test_chamber_of_deputies_dataset.py

+            expected = csv_expected.read()
+
+        xz_output = Dataset('')._translate_file(csv_with_comma)
+        output = lzma.open(xz_output).read().decode('utf-8')


We need a context manager here too (with lzma.open(…))… or at least to close the lzma file handler after reading it.

I agree with that. I'm working on it.
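
A minimal sketch of the context-manager pattern requested here, using only the standard library. The helper name `read_xz_text` and the temporary sample file are hypothetical and exist only to make the example self-contained.

```python
import lzma
import os
import tempfile


def read_xz_text(path):
    """Return the decoded contents of an .xz file.

    The with block closes the file handle even if reading raises.
    """
    with lzma.open(path) as xz_file:
        return xz_file.read().decode('utf-8')


# Hypothetical sample file, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, 'sample.xz')
    with lzma.open(sample_path, 'wb') as xz_file:
        xz_file.write('net_value\n77.99\n'.encode('utf-8'))
    output = read_xz_text(sample_path)
```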

viniciusartur · 2017-09-13T15:04:45Z

To ensure ubiquitous language, we changed from comma to decimal-comma.

cuducos · 2017-09-13T15:05:57Z

tests/unit/chambers_of_deputies/test_chamber_of_deputies_dataset.py

+        xz_path = Dataset('')._translate_file(csv_with_decimal_comma)
+        with lzma.open(xz_path) as xz_file:
+            output = xz_file.read().decode('utf-8')
+        assert(output == expected)


Let's use unittest to get better messages when a test fails? That would be self.assertEqual(output, expected)
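
To illustrate the difference in failure messages, here is a small sketch. The two strings are made-up values, and the standalone `unittest.TestCase()` instance is used only so the snippet runs outside a test class.

```python
import unittest

# A bare TestCase instance, used here only to call its assertion helpers.
case = unittest.TestCase()

# A plain assert on a failed comparison raises AssertionError with no message.
try:
    assert '77,99' == '77.99'
except AssertionError as error:
    plain_message = str(error)

# assertEqual reports both values (and, for strings, a diff) in the message.
try:
    case.assertEqual('77,99', '77.99')
except AssertionError as error:
    unittest_message = str(error)
```

Here `plain_message` is empty, while `unittest_message` names both values, which makes a failing comparison easier to diagnose.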

coveralls · 2017-09-13T15:53:02Z

Coverage remained the same at 85.667% when pulling 626a151 on viniciusartur:master into f30d0a8 on datasciencebr:master.

okfn-brasil#77 - Fixed reimbursement with net value with comma

a8ad367

rennerocha reviewed Sep 12, 2017

Enlightened name of reimbursements file

88c45d2

cuducos requested changes Sep 13, 2017

viniciusartur added 2 commits September 13, 2017 11:01

Minor improvements to make code more clear

94ebe2e

Changed from comma to decimal-comma to ensure ubiquitous language

e58cbd8

cuducos reviewed Sep 13, 2017

viniciusartur added 2 commits September 13, 2017 15:09

Changed assert to unittest.assertEqual to get better messages when tests fail

c2e0dbf

Bump version to include patch

626a151

okfn-brasil deleted a comment from coveralls Sep 13, 2017

cuducos approved these changes Sep 13, 2017

okfn-brasil deleted a comment from coveralls Sep 13, 2017

cuducos mentioned this pull request Sep 13, 2017

Fix float convertion #141

Closed

cuducos merged commit 7790572 into okfn-brasil:master Sep 13, 2017

cuducos mentioned this pull request Sep 14, 2017

Rosie breaks while trying to convert comma value to float okfn-brasil/rosie#77

Closed
