This repository has been archived by the owner on Dec 22, 2022. It is now read-only.

Using zeros for missing data. #59

Closed
carranco-sga opened this issue Apr 9, 2020 · 7 comments
Labels
Contributors: Contributors to fix; For documentation: Improvements or additions to documentation; General question: Further information is requested; Help wanted: Extra attention is needed; Requirement: Requirement in the project

Comments

@carranco-sga
Collaborator

Describe the bug
Is there a consensus on how missing data should be handled?

Commit a1a60bd by @dfuribez changed some of the missing data I maintain to zeros. However, since March 23, Mexico's data reports have not included the number of recovered people, which is why I report it as missing.

Expected behavior
Missing data should be handled consistently, both within each country and across the database as a whole, so that the database is more useful.
The changes made in the commit are only a fraction of those required to make Mexico's data consistent.
If there is a consensus on how missing data should be handled, I will gladly change the data under my responsibility to match the agreed reporting style.

However, I think zero has a very precise meaning: nothing. Treating data that is not available (missing, but not necessarily equal to zero) as zero confuses me.
And, putting personal biases aside, it's much easier to "complete" the data by replacing missing values with zeros than to go the other way. A simple line of regex magic does the trick in these cases.
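To illustrate the asymmetry, here is a minimal sketch of what that regex could look like. The sample rows and column layout are made up for illustration and are not taken from the repository's files. Filling blanks with zeros is a one-liner, while going back from zeros to blanks also wipes any zeros that were genuinely reported.

```python
import re

# An empty CSV field: the position right after a comma that is followed by
# another comma or the end of the line, or the start of a line that begins
# with a comma.
EMPTY_FIELD = re.compile(r"(?<=,)(?=,|$)|^(?=,)")

# A field that is exactly "0" (not the first field of the line).
ZERO_FIELD = re.compile(r"(?<=,)0(?=,|$)")


def fill_blanks_with_zero(line: str) -> str:
    """Replace every empty field in a CSV line with "0"."""
    return EMPTY_FIELD.sub("0", line)


def blank_out_zeros(line: str) -> str:
    """Replace every "0" field with an empty field (lossy!)."""
    return ZERO_FIELD.sub("", line)


# Hypothetical rows: date, country, confirmed, recovered, deaths.
unreported = "2020-03-24,Mexico,405,,"       # recovered and deaths not reported
reported_zero = "2020-02-28,Mexico,3,0,0"    # zeros that were actually reported

filled = fill_blanks_with_zero(unreported)
wiped = blank_out_zeros(reported_zero)
```

After the fill, the unreported row can no longer be told apart from a row with reported zeros, and `blank_out_zeros` cannot undo it without also erasing the reported zeros in the second row.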

Files with the bug
Refer to a1a60bd.

Pinging contributors: @ZurMaD @rafnixg @RcrdPhysics @pablora19 @martingra @mamanipatricia @leytzher @josetup123 @ivanMSC @diegocl02 @dfuribez @ariasbordahugo @pablorea @rendergraf @Caospierre

@carranco-sga added the For documentation, Help wanted, General question, Requirement, and Contributors labels on Apr 9, 2020
@ivanMSC
Collaborator

ivanMSC commented Apr 9, 2020

I vote for blanks.

@RcrdPhysics

RcrdPhysics commented Apr 9, 2020 via email

@dfuribez
Collaborator

dfuribez commented Apr 9, 2020

Yeah, I totally agree that we need a way to differentiate between nothing (zero) and unreported data. I mentioned that problem on Slack a few days ago and was told to just change all "missing" values to zeros.

And I'm sorry if I'm messing with your work.

@carranco-sga
Collaborator Author

No harm done, @dfuribez. I hope this issue allows us to have a more useful database.

@pablodz
Collaborator

pablodz commented Apr 10, 2020

The scripts need changes to allow blank fields, and we need to review the whole dataset to make sure the blank fields are not cleared.
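As a rough idea of the kind of script change this implies, the sketch below reads a CSV and maps blank fields to `None` instead of failing on `int("")` or silently turning them into zeros. The column names, the sample data, and the helper names `parse_count` and `load_rows` are assumptions made for this example, not the project's actual scripts.

```python
import csv
import io
from typing import Optional


def parse_count(field: str) -> Optional[int]:
    """Return None for a blank (unreported) field, otherwise the integer count."""
    text = field.strip()
    return int(text) if text else None


def load_rows(text: str) -> list:
    """Parse CSV text, keeping blank counts as None rather than 0."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "date": row["date"],
            "confirmed": parse_count(row["confirmed"]),
            "recovered": parse_count(row["recovered"]),
        }
        for row in reader
    ]


# Hypothetical sample: the second row has no recovered figure.
SAMPLE = "date,confirmed,recovered\n2020-03-22,316,4\n2020-03-23,367,\n"
rows = load_rows(SAMPLE)
```

Downstream code can then decide explicitly what to do with `None`, for example skipping it in sums, rather than inheriting a zero it cannot distinguish from a real one.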

@carranco-sga
Collaborator Author

All missing strings in Mexico's data should've been changed to empty strings by 8cff184.

pablodz added a commit that referenced this issue Apr 27, 2020
@pablodz
Collaborator

pablodz commented May 1, 2020

We're now using blank fields (``) for missing data.
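For anyone writing data with this convention, a small sketch of how it can be produced with Python's standard `csv` module, which writes `None` as an empty string. The column names and sample values are hypothetical.

```python
import csv
import io


def to_csv(rows) -> str:
    """Serialize (date, recovered) pairs; None becomes a blank field."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "recovered"])
    for date, recovered in rows:
        # csv.writer emits None as an empty string, so unreported values
        # stay blank while a reported 0 is written as "0".
        writer.writerow([date, recovered])
    return buf.getvalue()


# Hypothetical data: one reported value, one unreported, one reported zero.
output = to_csv([("2020-03-22", 4), ("2020-03-23", None), ("2020-03-24", 0)])
```

Here the unreported day ends up as `2020-03-23,` and the reported zero as `2020-03-24,0`, so the two cases stay distinguishable in the file.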

@pablodz closed this as completed on May 1, 2020
pablodz pushed a commit that referenced this issue Jun 14, 2020
pablodz added a commit that referenced this issue Jun 14, 2020
pablodz pushed a commit that referenced this issue Jun 20, 2020
pablodz added a commit that referenced this issue Jun 20, 2020
pablodz pushed a commit that referenced this issue Dec 12, 2020
pablodz added a commit that referenced this issue Dec 12, 2020
pablodz added a commit that referenced this issue Dec 12, 2020
Former-commit-id: c175344
Former-commit-id: 4a8a1730cbb796cead5f481d4fef508c9b17dabb
pablodz pushed a commit that referenced this issue Dec 12, 2020
Former-commit-id: c26f404
Former-commit-id: 8e7174e6ba95390d32a6af69f877bc728771a602
pablodz added a commit that referenced this issue Dec 12, 2020
Former-commit-id: 3ae6faa
Former-commit-id: 5c01b9aa9017d0b0ba032825b8f370790f072ff4

5 participants