Closed as duplicate of #13456
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem or challenge?
There are more and more blog posts like this that show examples of running queries against data in remote object stores.
I would like to compare the performance of DataFusion to these other systems, but I find it really hard to run the examples.
For example, the above blog post runs:
INSERT INTO tripdata
SELECT * FROM s3('s3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/*.parquet', NOSIGN)
SETTINGS max_threads=32, max_insert_threads=32, input_format_parallel_parsing=0;
When I try to follow the example in https://datafusion.apache.org/user-guide/cli/datasources.html#remote-files-directories to look at this same data in datafusion-cli, it doesn't work (and it gives me a confusing message):
$ datafusion-cli
DataFusion CLI v47.0.0
> select count(*) from 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
Error during planning: table 'datafusion.public.s3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet' not found
>
I also tried setting AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as suggested, and it still fails:
$ AWS_ACCESS_KEY_ID=foo AWS_SECRET_ACCESS_KEY=bar datafusion-cli
DataFusion CLI v47.0.0
> select count(*) from 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
Error during planning: table 'datafusion.public.s3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet' not found
CREATE EXTERNAL TABLE does appear to work:
> CREATE EXTERNAL TABLE hits
STORED AS PARQUET LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
Object Store error: The operation lacked the necessary privileges to complete for path nyc_taxi_rides/data/tripdata_parquet: Error performing HEAD https://s3.us-east-1.amazonaws.com/altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet in 136.439542ms - Server returned non-2xx status code: 403 Forbidden:
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response