Calculation of net_value #85

Closed
c4rl0sm3nd3s opened this issue Oct 18, 2016 · 7 comments

Comments

@c4rl0sm3nd3s

c4rl0sm3nd3s commented Oct 18, 2016

As discussed on Telegram, there are cases where net_value is zero in the dataset, but the Chamber's website shows a non-zero value for the same document.

For example, this case: http://www.camara.gov.br/cota-parlamentar/documento?nuDeputadoId=1074&numMes=6&numAno=2016&despesa=5&cnpjFornecedor=14415344000176&idDocumento=0000000068

@c4rl0sm3nd3s
Author

c4rl0sm3nd3s commented Oct 18, 2016

One way to fix that is to calculate net_value ourselves. First check reimbursement_value: if it is not zero and net_value is equal to zero, calculate net_value as the difference between reimbursement_value and document_value.
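The rule proposed above can be sketched in a few lines. This is only an illustration in Python (not the Julia used later in the thread), and the function name `calculated_net_value` is hypothetical. The direction of the subtraction is an assumption: it follows the `document_value - reimbursement_value` order that appears in a later comment, which the thread never confirms as correct.

```python
from decimal import Decimal

ZERO = Decimal("0")


def calculated_net_value(document_value, reimbursement_value, net_value):
    """Return net_value, recomputing it only when it looks missing.

    Hypothetical helper: the rule and the sign of the difference are
    assumptions taken from the discussion, not a confirmed definition.
    """
    # Recompute only when a reimbursement exists but net_value is zero.
    if reimbursement_value != ZERO and net_value == ZERO:
        # Assumed direction: document_value minus reimbursement_value.
        return document_value - reimbursement_value
    # Otherwise trust the value already present in the dataset.
    return net_value
```

Decimal is used instead of float so that currency amounts are compared and subtracted exactly.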

@cuducos
Collaborator

cuducos commented Oct 18, 2016

Great topic, thanks, @c4rl0sm3nd3s.

@Irio, any advice on how to proceed in terms of best practices for data science projects? Just updating the dataset seems quite aggressive in terms of reproducibility — does adding calculated_net_value to the big CSV make sense?

@cuducos changed the title from calculation of 'net_value' to Calculation of net_value on Oct 18, 2016
@c4rl0sm3nd3s
Author

Maybe use a temporary variable during processing; there is no need to save it in the CSV.
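A minimal sketch of that idea in Python, assuming the CSV has columns named document_id, document_value, reimbursement_value and net_value. The sample rows are made up for illustration, the generator name is hypothetical, and the direction of the subtraction is an assumption. The derived value lives only in memory and the source rows are left untouched.

```python
import csv
import io
from decimal import Decimal

# Made-up sample rows, for illustration only (not real data).
SAMPLE_CSV = """document_id,document_value,reimbursement_value,net_value
1,100.00,30.00,0
2,50.00,0,50.00
"""


def rows_with_temporary_net_value(csv_file):
    """Yield (row, net_value) pairs; the CSV itself is never rewritten."""
    for row in csv.DictReader(csv_file):
        net_value = Decimal(row["net_value"])
        reimbursement = Decimal(row["reimbursement_value"])
        if reimbursement != 0 and net_value == 0:
            # Assumed direction: document_value minus reimbursement_value.
            net_value = Decimal(row["document_value"]) - reimbursement
        # The computed value travels next to the row, not inside it.
        yield row, net_value


pairs = list(rows_with_temporary_net_value(io.StringIO(SAMPLE_CSV)))
results = [(row["document_id"], value) for row, value in pairs]
```

This keeps the published dataset reproducible, at the cost of every analysis having to repeat the calculation.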

@pmargreff

pmargreff commented Nov 3, 2016

Hi, I hit the same problem: the column values aren't all consistent. You can create a new column and fill it in one line using Julia's map function.

The result is something like:

table[:calculated_net_value] = map((x,y) -> x - y, table[:document_value], table[:reimbursement_value])

@cuducos
Collaborator

cuducos commented Nov 3, 2016

Just as an option, I would suggest asking the Lower House for clarification. They have a contact form that generates a protocol number (we can use it to put some pressure on them if they don't reply) and, in my experience, they are very thoughtful in responding to the public.

@Irio
Collaborator

Irio commented Nov 14, 2016

Today I am pairing with @cuducos to solve this issue.

There are a few things we already discovered as sources of these inconsistencies:

  1. Flight tickets issued for multiple passengers (where a document corresponds to the expense for a single person) and ticket compensation (when the airline gives a voucher in specific cases).
  2. A single document being reimbursed in more than one payment (see http://jarbas.datasciencebr.com/#/document_id/5914504, which has two payments for the same expense).
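The second source suggests that payments sharing a document should be combined before any values are compared. A rough Python sketch of that grouping step, assuming each payment is a (document_id, reimbursed value) pair. The function name and the sample values are hypothetical, and this is not necessarily how the final dataset was built.

```python
from collections import defaultdict
from decimal import Decimal


def total_reimbursed_by_document(payments):
    """Map each document_id to (number of payments, total reimbursed)."""
    totals = defaultdict(lambda: Decimal("0"))
    counts = defaultdict(int)
    for document_id, value in payments:
        totals[document_id] += value
        counts[document_id] += 1
    return {doc: (counts[doc], totals[doc]) for doc in totals}


# Made-up payments, for illustration only (not real data).
payments = [
    ("A", Decimal("40.00")),
    ("A", Decimal("60.00")),
    ("B", Decimal("25.00")),
]
summary = total_reimbursed_by_document(payments)

# Documents reimbursed in more than one payment.
multiple = [doc for doc, (count, _) in summary.items() if count > 1]
```

Listing the documents with more than one payment first makes it easier to check a few of them by hand against the website before trusting the totals.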

@cuducos
Collaborator

cuducos commented Dec 8, 2016

Fixed in the YYYY-MM-DD-reimbursements.xz dataset ;)

@cuducos closed this as completed on Dec 8, 2016