Reading data packed in a 7z archive does not work, either from S3 or from the local filesystem.
select *, _file as file
FROM file('tr_cyp_kosmos2023-12-25.7z', 'CSV'
)
limit 100
SETTINGS
format_csv_delimiter=' ',
input_format_csv_detect_header=false,
input_format_csv_allow_whitespace_or_tab_as_delimiter=true
;
Query id: ac28d786-85c0-4128-b67b-62c50d09a318
Elapsed: 0.163 sec.
Received exception from server (version 24.9.1):
Code: 27. DB::Exception: Received from host:9000. DB::Exception: Cannot parse input: expected ' ' before: 'q#e.\0\0\0\0\0�\0\0\0\0\0\0\0�\'j�����\r]\0\f�i�t[��]�Ƥ=�ko�)k�/T۟��G��L�N\fo�\baʿBn�w����{�����\0�mkr��Ps����?�ĸ���kZ:!���G�Ն�E,~ݽ�����v��\'�mzx[': (at row 1)
:
Row 1:
Column 0, name: ticker_quote_time, type: DateTime64(3), parsed text: "7z��<SINGLE QUOTE><0x1C><ASCII NUL><0x04>w�"
ERROR: garbage after DateTime64(3): "q#e.<0x01><ASCII NUL><ASCII NUL><ASCII NUL><ASCII NUL><ASCII NUL>"
: (in file/uri /var/lib/clickhouse/user_files/ticks/tr_cyp_kosmos2023-12-25.7z): While executing ParallelParsingBlockInputFormat: While executing File. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)
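For what it's worth, the parsed text reported for Row 1 ("7z", two unprintable bytes, a single quote, then 0x1C) matches the 7z file signature 37 7A BC AF 27 1C, so the CSV parser appears to be receiving the raw archive bytes rather than decompressed data. A minimal sketch of checking that signature from the shell follows. Note that is_7z is a hypothetical helper name, and the demo runs on a synthetic 6-byte file rather than the real archive:

```shell
# Return success if the file starts with the 7z signature bytes 37 7A BC AF 27 1C.
# is_7z is a hypothetical helper name, not a ClickHouse or p7zip command.
is_7z() {
    # Dump the first 6 bytes as hex and strip spaces and newlines.
    sig=$(head -c 6 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "377abcaf271c" ]
}

# Demo on a synthetic file so the sketch runs without the real archive.
sample=$(mktemp)
printf '7z\274\257\047\034' > "$sample"   # octal escapes for BC AF 27 1C
if is_7z "$sample"; then echo "7z signature found"; fi
rm -f "$sample"
```

Running is_7z against tr_cyp_kosmos2023-12-25.7z in user_files should succeed in the same way if the file on disk is the original archive.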
How to reproduce
cd /var/lib/clickhouse/user_files
wget https://github.com/Delphin1/clickhouse-7z-issue/raw/main/tr_cyp_kosmos2023-12-25.7z
Run the SELECT query above.
ClickHouse server version: 24.9.1.3278
Expected behavior
The query returns the parsed data as a table.
It looks like some codecs are not supported, although the archive itself passes the 7z integrity test:
7z t tr_cyp_kosmos2023-12-25.7z
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C.UTF-8,Utf16=on,HugeFiles=on,64 bits,8 CPUs LE)
Scanning the drive for archives:
1 file, 77575 bytes (76 KiB)
Testing archive: tr_cyp_kosmos2023-12-25.7z
--
Path = tr_cyp_kosmos2023-12-25.7z
Type = 7z
Physical Size = 77575
Headers Size = 162
Method = LZMA2:1536k
Solid = -
Blocks = 1
Everything is Ok
Size: 1284855
Compressed: 77575