I made a deliberately bad CSV file:
$ cat foo.csv
name,favColour
Mark,Blue
David
Giles,Red
I try to process it:
SELECT *
FROM file('foo.csv')
Query id: 42729c66-9642-4c79-8c77-a68228aa64a4
Elapsed: 0.031 sec.
Received exception:
Code: 636. DB::Exception: The table structure cannot be extracted from a CSV format file. Error:
Code: 117. DB::Exception: Rows have different amount of values. (INCORRECT_DATA) (version 24.3.1.469 (official build)).
You can specify the structure manually: (in file/uri /Users/m
That makes sense: the structure is bad. So I set input_format_allow_errors_num, which I expected to skip the bad row, and specified the CSVWithNames format as well. But it still throws the same error:
SELECT *
FROM file('foo.csv', CSVWithNames)
SETTINGS input_format_allow_errors_num = 5
Query id: 167b8033-9fa4-496f-b3e5-dd05cd3f8e04
Elapsed: 0.001 sec.
Received exception:
Code: 636. DB::Exception: The table structure cannot be extracted from a CSVWithNames format file. Error:
Code: 117. DB::Exception: Rows have different amount of values. (INCORRECT_DATA) (version 24.3.1.469 (official build)).
You can specify the structure manually: (in file/uri /Users/markhneedham/projects/videos/20240305-WindowFunctions/foo.csv). (CANNOT_EXTRACT_TABLE_STRUCTURE)
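The error message itself suggests specifying the structure manually. A minimal sketch of what that might look like is below, passing the structure as the third argument to file(). The column types (String for both columns) are my assumption and are not part of the original report, and I have not verified whether the short row is then skipped or rejected.

```sql
-- Sketch: bypass schema inference by declaring the columns explicitly.
-- The types below are assumed; adjust them to the real data.
SELECT *
FROM file('foo.csv', CSVWithNames, 'name String, favColour String')
SETTINGS input_format_allow_errors_num = 5
```

Because no inference pass runs here, the inconsistent row can no longer trigger CANNOT_EXTRACT_TABLE_STRUCTURE; any remaining error would come from parsing the data itself.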
We can work around this by setting input_format_max_rows_to_read_for_schema_inference=1, which makes schema inference read only 1 row. But it would be simpler if input_format_allow_errors_num and input_format_allow_errors_ratio were also respected during schema inference.
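For reference, the workaround query might look like the sketch below. Combining it with input_format_allow_errors_num = 5 is an assumption carried over from the earlier attempt; the output is not shown because it is not part of this report.

```sql
-- Sketch: limit schema inference to a single row so the short row
-- ("David") is never seen while the structure is being inferred.
SELECT *
FROM file('foo.csv', CSVWithNames)
SETTINGS
    input_format_max_rows_to_read_for_schema_inference = 1,
    input_format_allow_errors_num = 5
```

The drawback is that types inferred from a single row can be wrong for the rest of the file, which is why honouring the error-tolerance settings during inference would be the simpler option.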
cc @Avogar