Skip to content

Conversation

@davidanthoff
Copy link
Member

Fixes JuliaData/JuliaDB.jl#267.

This changes the column type detection logic for empty quoted strings. With this change, any "" is now always detected as a quoted string, end of story. Right now I think that makes the most sense, but I'm not super sure?

It seems generally weird to detect "" as a missing value, in general. I think it makes sense to detected say ,, as missing (i.e. the column between the two commas), but I think as soon as there are quotes there, we should interpret it as "there is a value, namely an empty string".

@shashi and @joshday can you think about this as well a bit? I feel this is the kind of change where it helps to poll a bunch of folks to make sure I don't overlooked some important corner case :)

There is probably another fix for the that is linked above, but this seemed the most straightforward one to me...

@davidanthoff
Copy link
Member Author

I think I'm going to merge this without review :) We have some real bugs with the currently tagged version and master that users are hitting in the wild, and so I think we need to get a new version out ASAP to deal with those problems. If this PR here is the wrong strategy, it strikes me as wrong for corner cases, whereas the issues on the currently shipping version are not corner cases.

So, I would still value a review, and if you guys disagree, we can revert in a follow up PR!

@davidanthoff davidanthoff merged commit 1b9878e into queryverse:master Mar 26, 2019
@davidanthoff davidanthoff deleted the change-guess-empty-quotes branch March 26, 2019 15:27
@joshday
Copy link

joshday commented Mar 26, 2019

Catching up on notifications now! Your reasoning on "" sounds solid to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CSV Parsing Error in JuliaDB Tutorial

2 participants