-
Notifications
You must be signed in to change notification settings - Fork 6.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support an arbitrary CSV delimiter #2263
Support an arbitrary CSV delimiter #2263
Conversation
23c6017
to
8cb4539
Compare
private: | ||
void checkStringIsACharacter(const String & x) const { | ||
if (x.size() != 1) | ||
throw Exception(std::string("A setting's value string has to be an exactly one character long")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary to construct std::string
explicitly.
private: | ||
void checkStringIsACharacter(const String & x) const { | ||
if (x.size() != 1) | ||
throw Exception(std::string("A setting's value string has to be an exactly one character long")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing ErrorCodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style. Braces {}
should be in a separate new line.
|
||
void set(const Field & x) | ||
{ | ||
String s = safeGet<const String &>(x); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use reference here to avoid copying.
|
||
When parsing, all values can be parsed either with or without quotes. Both double and single quotes are supported. Rows can also be arranged without quotes. In this case, they are parsed up to a comma or line feed (CR or LF). In violation of the RFC, when parsing rows without quotes, the leading and trailing spaces and tabs are ignored. For the line feed, Unix (LF), Windows (CR LF) and Mac OS Classic (CR LF) are all supported. | ||
*By default — `,`. See a [format_csv_delimiter](/docs/en/operations/settings/settings/#format_csv_delimiter) setting for additional info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that Markdown supports HTML entities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems logical since markdown is compiled into HTML. Also, mkdocs
in the docs/
directory compiles it correctly.
Almost everything is Ok, but tests are missing. |
Hmm, can we still call that a |
Yes, it is controversial. In fact, many "CSV readers" are configurable in this way.
Do they use multibyte delimiter by default? |
Nope. The default is |
does |
1.1.54390 version has the same error. =((( |
Added a
format_csv_delimiter
setting for specifying an arbitrary CSV delimiter.How it can be used:
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en