CustomSeparated/Template formats don't work properly with CSV escaping rule and arbitrary field delimiter #39620
Comments
Actually, I have tested many versions, but none of them work correctly.
Shorter example:
$ echo "aa._bb"|clickhouse-local --structure "a String, b String" --input-format CustomSeparated --format_custom_escaping_rule='CSV' --format_custom_field_delimiter='._' -q 'select * from table'
aa bb
$ echo "a_a._bb"|clickhouse-local --structure "a String, b String" --input-format CustomSeparated --format_custom_escaping_rule='CSV' --format_custom_field_delimiter='._' -q 'select * from table'
a_a bb
$ echo '"a.a"._"bb"'|clickhouse-local --structure "a String, b String" --input-format CustomSeparated --format_custom_escaping_rule='CSV' --format_custom_field_delimiter='._' -q 'select * from table'
a.a bb
$ echo "a.a._bb"|clickhouse-local --structure "a String, b String" --input-format CustomSeparated --format_custom_escaping_rule='CSV' --format_custom_field_delimiter='._' -q 'select * from table'
Code: 27. DB::ParsingException: Cannot parse input: expected '._' before: 'a._bb\n':
Row 1:
Column 0, name: a, type: String, parsed text: "a"
ERROR: There is no delimiter between fields: expected "._", got "a._bb<LINE FEED>"
: While executing CustomSeparatedRowInputFormat: While executing File: (at row 1)
. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)
It fails if the first character . of format_custom_field_delimiter ._ occurs in the text a.a and the text is not quoted.
"It fails if the first character . of format_custom_field_delimiter ._ is met in the text a.a and text without quotes." @den-crane, do you know how to resolve this problem?
Yes, there is a problem with parsing unquoted strings with the CSV escaping rule in the CustomSeparated/Template formats when the delimiter is longer than one character. It happens because the internal implementation of parsing data types in CSV format works only with a single-char delimiter. I will try to fix this problem. For now, you can use the Regexp format to read such files.
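To make the failure mode concrete, here is a toy Python model of it. This is only a sketch and not ClickHouse's actual parser: the function names are hypothetical, and the assumption is simply that the unquoted-field reader stops at the first character of the delimiter instead of at the whole delimiter.

```python
def read_unquoted_char_delim(text, pos, delimiter):
    """Toy reader: stops at the FIRST character of the delimiter (assumed buggy behaviour)."""
    end = pos
    while end < len(text) and text[end] not in (delimiter[0], "\n"):
        end += 1
    return text[pos:end], end


def read_unquoted_full_delim(text, pos, delimiter):
    """Toy reader: stops only where the WHOLE delimiter starts."""
    end = pos
    while end < len(text) and text[end] != "\n" and not text.startswith(delimiter, end):
        end += 1
    return text[pos:end], end


def parse_row(text, delimiter, read_field):
    """Split one row into fields, requiring the full delimiter between fields."""
    fields = []
    pos = 0
    while True:
        value, pos = read_field(text, pos, delimiter)
        fields.append(value)
        if text.startswith(delimiter, pos):
            pos += len(delimiter)  # consume the delimiter, read the next field
            continue
        if pos >= len(text) or text[pos] == "\n":
            return fields  # end of row
        raise ValueError(f"expected {delimiter!r} before: {text[pos:]!r}")


if __name__ == "__main__":
    # Works with both readers: no '.' inside the field itself.
    print(parse_row("a_a._bb\n", "._", read_unquoted_char_delim))
    # Works only with the full-delimiter reader.
    print(parse_row("a.a._bb\n", "._", read_unquoted_full_delim))
    try:
        parse_row("a.a._bb\n", "._", read_unquoted_char_delim)
    except ValueError as exc:
        print("error:", exc)
```

In this model the first reader returns "a" for the input a.a._bb and the row parser then fails because the next characters are not the full delimiter, which mirrors the shape of the error in the example above. The second reader shows what matching the whole delimiter would look like.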
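Version: 22.6.3.35
Problem: when importing text with a multi-character field delimiter using the CustomSeparated format, the text content cannot contain the first character of the delimiter.
Test procedure:
The file's delimiter is "|!".
When importing the data from tableA.txt, an error is raised at the position of name|52. However, the "|" character is part of the content I need.
As the error message indicates, if a field contains the first character of the (multi-character) delimiter, the parser treats it as the start of the delimiter, but this is not the result I expect, because
name|52
is one complete string.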
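For the "|!" case, the idea behind the Regexp-format workaround suggested in this thread can be illustrated outside ClickHouse with a small Python sketch. The pattern, the two-column layout, and the second field value below are assumptions made for illustration; only name|52 comes from the report.

```python
import re

# Hypothetical pattern for two columns separated by the literal "|!".
# The lazy first group stops at the first full "|!", so a lone "|" stays in the field.
ROW_PATTERN = re.compile(r"^(.*?)\|!(.*)$")


def split_row(line):
    """Return the two column values of one line, or raise if the line does not match."""
    match = ROW_PATTERN.match(line)
    if match is None:
        raise ValueError(f"line does not match the expected layout: {line!r}")
    return list(match.groups())


if __name__ == "__main__":
    # "some value" is a made-up second column; "name|52" is the field from the report.
    print(split_row("name|52|!some value"))
```

Because each capture group is matched against the whole-line pattern, a single "|" inside the first column is not mistaken for the start of the delimiter.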