regression: read_csv() uses quote of \0 if no quote char seen in sample_size rows
#11838
Closed
What happens?
`read_csv()` claims to use a default quote character of `"`. But starting in 0.10.1, if no quote character is found within the first `sample_size` rows, it actually defaults to `\0` (i.e. no quoting). This is still present in the nightly build.

I found this when trying to read the CSV from https://doi.org/10.7910/DVN/JXPREB: the first 482808 lines have no quote characters, but the 482809th one does. In 0.10.0, the quote char would correctly get sniffed as `"`. In 0.10.1+, the quote char incorrectly gets sniffed as `\0`. I can work around this by explicitly supplying the arg `quote='"'`, but I shouldn't have to do this.

To Reproduce
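A minimal sketch of one way to exercise this behavior without downloading the full dataset. It is an illustration under assumptions, not the original reproduction: the helper names `write_late_quote_csv` and `read_last_row` are hypothetical, the row count of 30,000 assumes the default `sample_size` is smaller than that, and whether the sniffer actually picks `\0` depends on the DuckDB version installed.

```python
import os
import tempfile


def write_late_quote_csv(path, plain_rows=30_000):
    """Write a CSV whose only quoted field comes after `plain_rows` unquoted rows."""
    with open(path, "w", newline="") as f:
        f.write("id,name\n")
        for i in range(plain_rows):
            f.write(f"{i},plain{i}\n")
        # The last field contains the delimiter, so it only parses with quoting on.
        f.write(f'{plain_rows},"last, quoted"\n')
    return path


def read_last_row(path, explicit_quote):
    """Return the last row read by DuckDB, the error raised, or None without DuckDB."""
    try:
        import duckdb  # imported lazily so the file helper works on its own
    except ImportError:
        return None
    # Workaround from the report: pass quote='"' instead of relying on the sniffer.
    options = ", quote='\"'" if explicit_quote else ""
    query = f"SELECT * FROM read_csv('{path}'{options})"
    try:
        return duckdb.sql(query).fetchall()[-1]
    except duckdb.Error as exc:
        return exc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = write_late_quote_csv(os.path.join(tmp, "late_quote.csv"))
        print("sniffed quote: ", read_last_row(csv_path, explicit_quote=False))
        print("explicit quote:", read_last_row(csv_path, explicit_quote=True))
```

If the two printed results differ, or the first one is an error, the sniffer did not settle on `"` for this file. Per the report, 0.10.0 would be expected to treat both reads the same way.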
OS: macOS (M1)
DuckDB Version: nightly
DuckDB Client: Python
Full Name: Nick Crews
Affiliation: Ship Creek Group
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a nightly build
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?