"Missing" Values #1107

Open
m-knopp opened this issue Aug 11, 2023 · 2 comments

m-knopp commented Aug 11, 2023

Hi,

I have a hard time understanding why empty strings are interpreted as missing by default. missing represents a value that exists but that we don't have access to. Why would we assume that when we have no semantic information about the data we are parsing?
"" is not a missing value; it is just an empty String and should be treated as such.

This is really awkward when further operations fail because the type of a column is now Union{Missing, String}.
I also found that
CSV.read("myfancytable.csv", DataFrame, missingstring="")
does not replace the "missing" values with empty Strings, they are still missing values.
CSV.read("myfancytable.csv", DataFrame, missingstring="abc")
does not replace "missing" values with String("abc"), but with nothing values.

My suggestion is to use String("") or nothing as the default value for an empty table cell.

hhaensel commented Sep 6, 2023

You have two options:

  • provide the option missingstring = nothing
    CSV.read("myfancytable.csv", DataFrame, missingstring=nothing)
  • provide an explicit type for the columns in question
    CSV.read("myfancytable.csv", DataFrame, types = Dict(:mystringcolumn => String))
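For illustration, here is a minimal sketch of the first option next to the default behaviour. The in-memory table and the column names a and b are made up for this example and stand in for myfancytable.csv; the element types printed may differ between CSV.jl versions.

```julia
using CSV, DataFrames

# Hypothetical two-column table standing in for myfancytable.csv;
# the second row has an empty cell in column b.
data = "a,b\n1,x\n2,\n"

# Default settings: the empty cell is read as `missing`.
df_default = CSV.read(IOBuffer(data), DataFrame)

# Missing detection disabled: the empty cell should stay an empty string.
df_plain = CSV.read(IOBuffer(data), DataFrame; missingstring=nothing)

# Compare the element types of column b in both cases.
println(eltype(df_default.b))
println(eltype(df_plain.b))
```

If a column still ends up containing missing values, coalesce.(df.b, "") is one way to turn them into empty strings after reading.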

hhaensel commented Sep 6, 2023

I agree that it's a bit strange that

CSV.read("test.csv", DataFrame, missingstring = String[])

doesn't provide the same result as

CSV.read("test.csv", DataFrame, missingstring = nothing)
