Add ability to detect NULL values #120

Closed
lubomirmatus opened this issue May 12, 2021 · 3 comments

@lubomirmatus

The library currently has no way to detect that a value in a row is NULL.

col1, col2, col3
0,0,0
0,0,
0,,0
,0,0

When reading numeric rows

CSVReader<3, trim_chars<>, double_quote_escape<',', '\"'>> csv(file);
int col1, col2, col3;
csv.read_row(col1, col2, col3);

the columns with missing values are returned as 0, so we cannot determine whether a value is a real 0 or NULL.

One possible approach is to use some kind of "nullable" wrapper type.

CSVReader<3, trim_chars<>, double_quote_escape<',', '\"'>> csv(file);
Nullable<int> col1, col2, col3;
csv.read_row(col1, col2, col3);
if (col1.is_null()) { ... }
@ben-strasser
Owner

ben-strasser commented May 12, 2021 via email

@lubomirmatus
Author

Thank you for your fast response.
Yes, I have been thinking about this solution, but it is a do-it-yourself approach.
It would be nice to have a built-in solution for NULLs directly in the library.

@ben-strasser
Owner

There are two problems with NULLs:

  • There are many different ways to represent them in CSV files: nil, null, Null, invalid, n/a, ... No exhaustive list exists, so any built-in list will always be incomplete. Further, there are candidates such as nan, where it is unclear whether the token is a valid error-state value or an invalid NULL value.
  • Many types do not have a null state. What do you put into an int if null is encountered? We could introduce something like io::Nullable<int>, or maybe use std::optional<int>. The first gets complicated very quickly and makes the interface far more complex. The second would probably work, but I have not thought it through. It is not there because the library is older than std::optional.

All of these problems are circumvented in a do-it-yourself model. The simple cases, which cover 99% of uses, get a pretty interface. Everything else needs to go via char*. The library is carefully written so that you do not lose speed by parsing the char* yourself. No copy is involved: the pointer refers directly into the internal storage.
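
As a rough illustration of the do-it-yourself model, the sketch below converts a raw field into a std::optional<int>. It is not part of the library: kNullTokens, is_null_token and parse_nullable_int are hypothetical names, the list of NULL spellings is an assumption that has to be adapted to the data at hand, and C++17 is assumed for std::optional and std::from_chars.

```cpp
#include <charconv>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical list of spellings treated as NULL. This is an assumption for
// illustration only; no such list is exhaustive.
constexpr std::string_view kNullTokens[] = {"", "null", "NULL", "Null", "nil", "n/a"};

// Hypothetical helper (not part of the library): true if the raw field text
// matches one of the NULL spellings above.
inline bool is_null_token(std::string_view field) {
    for (std::string_view token : kNullTokens) {
        if (field == token) return true;
    }
    return false;
}

// Hypothetical helper (not part of the library): converts a raw,
// NUL-terminated field into std::optional<int>. Returns std::nullopt for NULL
// spellings and throws on any other text that is not a complete integer.
inline std::optional<int> parse_nullable_int(const char* field) {
    const std::string_view text(field);
    if (is_null_token(text)) return std::nullopt;

    int value = 0;
    const char* const end = text.data() + text.size();
    const auto result = std::from_chars(text.data(), end, value);
    if (result.ec != std::errc() || result.ptr != end) {
        throw std::invalid_argument("not an integer: " + std::string(text));
    }
    return value;
}
```

With the reader from the issue, the columns would be read as char* instead of int and each pointer passed to parse_nullable_int. The helper does no trimming, so surrounding whitespace would have to be removed beforehand. Treating nan or invalid as NULL is deliberately left out, since that choice depends on the data set.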
