This repository has been archived by the owner on Dec 5, 2022. It is now read-only.

parse.number('') should return null, not 0 #524

Open
lazd opened this issue Apr 14, 2020 · 4 comments

Comments

@lazd
Contributor

lazd commented Apr 14, 2020

Description

/title

Steps to reproduce

  1. parse.number('')
  2. It's 0!

Expected behavior

It's null?

Additional context

This is a tough one. It seems like it should return null, which will cause a validation error, and if the scraper author wants to return zero for empty string, they can explicitly do: parse.number(parse.string(whatever) || 0)
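A minimal sketch of the behaviour proposed above, assuming a standalone stand-in named `parseNumber` (hypothetical; this is not the project's actual `parse.number` implementation, and the comma-stripping step is an assumption for illustration). Empty input yields null, and a caller who really wants zero opts in at the call site:

```javascript
// Hypothetical stand-in for parse.number, written only to illustrate the proposal.
function parseNumber(value) {
  // Numbers pass through untouched.
  if (typeof value === 'number') return value;

  const text = String(value ?? '').trim();

  // Empty input means "no data", so return null instead of 0.
  if (text === '') return null;

  // Assumption: thousands separators are commas and can be dropped.
  const parsed = Number(text.replace(/,/g, ''));
  return Number.isNaN(parsed) ? null : parsed;
}

const raw = '';

// Default: an empty string is reported as missing.
const missing = parseNumber(raw);

// Explicit opt-in to zero, mirroring the `|| 0` pattern from the comment above.
const zeroed = parseNumber(raw || 0);

console.log(missing, zeroed);
```

With this shape, the null from an empty field can surface as a validation error, while `parseNumber('0')` still returns a real zero.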

@qgolsteyn

Blank fields in CSVs are "" by default. I personally think returning 0 here would be a bug in almost all cases. I think having the ability to specify this behaviour through a parameter flag would be best, something like: parse.number(str, emptyAsZero=false). The default behaviour, however, should be to treat an empty string as undefined.
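One way the flag idea could look, as a sketch only. The function name `parseNumberWithOption` and the options-object shape are assumptions; the thread suggests only the `emptyAsZero` name and its `false` default:

```javascript
// Hypothetical sketch of an opt-in flag; not the project's actual API.
function parseNumberWithOption(value, { emptyAsZero = false } = {}) {
  const text = String(value ?? '').trim();

  // Empty input: undefined by default, 0 only when the caller asks for it.
  if (text === '') return emptyAsZero ? 0 : undefined;

  const parsed = Number(text);
  return Number.isNaN(parsed) ? undefined : parsed;
}

const byDefault = parseNumberWithOption('');
const optedIn = parseNumberWithOption('', { emptyAsZero: true });

console.log(byDefault, optedIn);
```

An options object is used here instead of a positional boolean so the call site reads as `{ emptyAsZero: true }`, which makes the opt-in visible to a reviewer.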

@lazd
Contributor Author

lazd commented Apr 14, 2020

An option passed to parse.number would definitely be clean.

@shaperilio

From a data point of view, this is a tough question.

In an HTML table, it's already hard to convince yourself that an empty column really means zero. If they know there are no deaths, they would type zero, right? If they don't know how many deaths they have, then why would they add a death column?

In a CSV, you can bet your bottom dollar that an empty column could very well mean they don't have data. ArcGIS CSVs are a perfect example of that: they're often littered with fields in some archaic database that no one has any clue about. The same will happen with any API endpoint that is accessing such a database.

And sources that give us a field = 0... do we know what standard they use? Does zero mean zero or not tracking?

I think at the end-user consumption level, we should at the very least say "zero may mean there is no data reported".

@camjc
Contributor

camjc commented Apr 16, 2020

And at a data collection level we probably want to distinguish between having a data value of zero and having no data. 👍
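To make that distinction concrete, here is a small sketch at the collection step. The rows, the `county` and `deaths` field names, and the `toRecord` helper are all made up for illustration; the idea is only that a blank field leaves the key out entirely, while a reported "0" is stored as a real zero:

```javascript
// Made-up rows, shaped like what a CSV reader might hand back.
const rows = [
  { county: 'A', deaths: '0' }, // the source reported zero
  { county: 'B', deaths: '' },  // the source reported nothing
];

// Hypothetical helper: only set `deaths` when the source actually had a value.
function toRecord(row) {
  const record = { county: row.county };
  const text = row.deaths.trim();
  if (text !== '') record.deaths = Number(text);
  return record;
}

const records = rows.map(toRecord);

for (const record of records) {
  const status = 'deaths' in record ? `deaths = ${record.deaths}` : 'no deaths data';
  console.log(`${record.county}: ${status}`);
}
```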

@jzohrab jzohrab closed this as completed Aug 9, 2020
@jzohrab jzohrab reopened this Aug 9, 2020
@jzohrab jzohrab transferred this issue from covidatlas/coronadatascraper Aug 9, 2020