Add a setting to parse crlf with Tab separated CSV (TSV?!) files #56257
Comments
@Avogar I'll work on this one.
@Avogar two questions
As far as I can see from debugging, the carriage return gets read in as part of the field. See ClickHouse/src/IO/ReadHelpers.cpp, lines 499 to 533 in 2cb2bcf
For context, the function above is originally called from here: ClickHouse/src/Processors/Formats/Impl/TabSeparatedRowInputFormat.cpp, lines 140 to 152 in ca124df
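To illustrate the observation above, here is a minimal Python sketch of a reader that treats only '\n' as the row terminator. It is not ClickHouse code: the function name `parse_tsv_lf_only` is hypothetical, and TSV escape sequences are ignored for brevity. It only shows why the '\r' ends up attached to the last field of each row.

```python
# Illustrative only: a naive TSV reader that treats '\n' as the sole
# row terminator. Not ClickHouse's implementation; escapes are ignored.
def parse_tsv_lf_only(data: str) -> list[list[str]]:
    rows = []
    for line in data.split("\n"):
        if line == "":
            continue  # skip the empty piece after the final newline
        rows.append(line.split("\t"))
    return rows


# A file with Windows (CRLF) line endings.
crlf_data = "a\tb\r\nc\td\r\n"
print(parse_tsv_lf_only(crlf_data))  # [['a', 'b\r'], ['c', 'd\r']]
```

The trailing '\r' survives because the split happens on '\n' alone, so it is treated as ordinary field content.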
Yes. When the new setting is true, we should be able to parse files with the line ending '\r\n' correctly.
We can work with ClickHouse/src/Processors/Formats/Impl/TabSeparatedRowInputFormat.cpp, lines 168 to 182 in ca124df
So, we should go through all …
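The behaviour this reply asks for could look roughly like the Python sketch below. It is an illustration under assumptions, not the actual change: the function `parse_tsv` and the flag name `allow_crlf` are hypothetical stand-ins for the proposed setting, and TSV escape handling is left out.

```python
# Hypothetical sketch of the requested behaviour, not ClickHouse code.
# `allow_crlf` stands in for the proposed setting; its real name is not
# given in this thread.
def parse_tsv(data: str, allow_crlf: bool = False) -> list[list[str]]:
    rows = []
    for line in data.split("\n"):
        # With the flag on, a '\r' directly before '\n' is treated as
        # part of the line ending rather than as field content.
        if allow_crlf and line.endswith("\r"):
            line = line[:-1]
        if line == "":
            continue  # simplification: blank lines are skipped
        rows.append(line.split("\t"))
    return rows


crlf_data = "a\tb\r\nc\td\r\n"
print(parse_tsv(crlf_data, allow_crlf=True))   # [['a', 'b'], ['c', 'd']]
print(parse_tsv(crlf_data, allow_crlf=False))  # [['a', 'b\r'], ['c', 'd\r']]
```

Keeping the flag off by default preserves the existing behaviour, so a '\r' that is genuinely part of the data is not dropped unless the user opts in.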
Not sure if this is a feature or usability, so feel free to re-label.
I have a tab-separated file with Windows line endings (CRLF), which means it doesn't work correctly with any of the TSV formats. It can be processed with the CSV formats:
But then every field of a line lands in a single column, and I have to split on tab to separate them.
@alexey-milovidov said maybe we could add a setting: