Cannot query CSV file with non-standard extension from DataFusion CLI / Python bindings #6147

@andygrove

Describe the bug

I have CSV files with a .tbl extension, and it is not possible to query them through SQL: the table is created without error, but queries against it return no rows. This is a significant barrier to running TPC-H benchmarks with the generated files.

To Reproduce

Querying a CSV file with .csv extension works

$ echo bob > customer.csv
$ datafusion-cli

DataFusion CLI v23.0.0
❯ CREATE EXTERNAL TABLE customer STORED AS CSV LOCATION 'customer.csv';
0 rows in set. Query took 0.008 seconds.
❯ SELECT * FROM customer;
+----------+
| column_1 |
+----------+
| bob      |
+----------+
1 row in set. Query took 0.001 seconds.

😄

Cannot query CSV file with .tbl extension

$ echo bob > customer.tbl
$ datafusion-cli

DataFusion CLI v23.0.0
❯ CREATE EXTERNAL TABLE customer STORED AS CSV LOCATION 'customer.tbl';
0 rows in set. Query took 0.001 seconds.
❯ SELECT * FROM customer;
0 rows in set. Query took 0.001 seconds.

😢
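The issue does not identify the cause. One plausible explanation, which is an assumption and not confirmed here, is that the file listing behind the table only keeps paths ending in the format's default extension (`.csv`), so `customer.tbl` is silently skipped and the scan sees no files. The sketch below models that behaviour in plain Python; `list_matching_files` and `scan_rows` are hypothetical names for illustration and are not DataFusion APIs.

```python
import tempfile
from pathlib import Path


def list_matching_files(location, extension=".csv"):
    """Hypothetical listing step: keep only paths that end with `extension`."""
    path = Path(location)
    if path.is_file():
        candidates = [path]
    else:
        candidates = sorted(p for p in path.rglob("*") if p.is_file())
    return [p for p in candidates if p.name.endswith(extension)]


def scan_rows(location, extension=".csv"):
    """Hypothetical scan step: read every line of every listed file."""
    rows = []
    for file_path in list_matching_files(location, extension):
        rows.extend(file_path.read_text().splitlines())
    return rows


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "customer.csv"
    tbl_path = Path(tmp) / "customer.tbl"
    csv_path.write_text("bob\n")
    tbl_path.write_text("bob\n")

    # Default extension filter: the .csv file is scanned, the .tbl file is dropped.
    csv_rows = scan_rows(str(csv_path))
    tbl_rows = scan_rows(str(tbl_path))

    # Assumed fix: let the caller override the extension used by the filter.
    tbl_rows_with_override = scan_rows(str(tbl_path), extension=".tbl")

print(csv_rows, tbl_rows, tbl_rows_with_override)
```

If this is what happens, the table would need a way to accept the file's actual extension, or to skip the extension filter when the location points at a single file, for the `.tbl` case to behave like the `.csv` case.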

Expected behavior

No response

Additional context

No response


Labels: bug, good first issue
