Search before asking
- I had searched in the issues and found no similar issues.
Version
1.2.4.1
What's Wrong?
When I use S3 load or the S3 TVF to load parquet files stored in GCS, I get the following error:
```
2023-06-08 07:14:51,734 ERROR (mysql-nio-pool-3539|101239) [S3Storage.list():372] errors while get file status
org.apache.doris.common.UserException: errCode = 2, detailMessage = Failed to get S3 FileSystem for bucket is null/empty
at org.apache.doris.backup.S3Storage.getFileSystem(S3Storage.java:142) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.backup.S3Storage.list(S3Storage.java:356) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.common.util.BrokerUtil.parseFile(BrokerUtil.java:90) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.tablefunction.ExternalFileTableValuedFunction.parseFile(ExternalFileTableValuedFunction.java:142) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.tablefunction.S3TableValuedFunction.<init>(S3TableValuedFunction.java:127) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.tablefunction.TableValuedFunctionIf.getTableFunction(TableValuedFunctionIf.java:49) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.analysis.TableValuedFunctionRef.<init>(TableValuedFunctionRef.java:40) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.analysis.CUP$SqlParser$actions.case817(SqlParser.java:28107) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.analysis.CUP$SqlParser$actions.CUP$SqlParser$do_action(SqlParser.java:8620) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.analysis.SqlParser.do_action(SqlParser.java:2291) ~[doris-fe.jar:1.2-SNAPSHOT]
at java_cup.runtime.lr_parser.parse(lr_parser.java:584) ~[jflex-1.4.3.jar:?]
at org.apache.doris.common.util.SqlParserUtils.getMultiStmts(SqlParserUtils.java:60) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.parse(ConnectProcessor.java:388) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:285) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.dispatch(ConnectProcessor.java:473) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:700) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_372]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_372]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_372]
Caused by: java.lang.IllegalArgumentException: bucket is null/empty
at org.apache.hadoop.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:144) ~[hadoop-shaded-guava-1.1.1.jar:1.1.1]
at org.apache.hadoop.fs.s3a.S3AUtils.propagateBucketOptions(S3AUtils.java:1162) ~[hadoop-aws-3.3.3.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:426) ~[hadoop-aws-3.3.3.jar:?]
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469) ~[hadoop-common-3.3.3.jar:?]
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537) ~[hadoop-common-3.3.3.jar:?]
at org.apache.doris.backup.S3Storage.getFileSystem(S3Storage.java:140) ~[doris-fe.jar:1.2-SNAPSHOT]
... 19 more
```
Load SQL
```sql
LOAD LABEL label_gcs_test
(
    DATA INFILE("s3://bucket/tables_prod/parquet_log_report/part*")
    INTO TABLE load_test
    FORMAT AS "parquet"
)
WITH S3
(
    "AWS_ENDPOINT" = "https://storage.googleapis.com",
    "AWS_ACCESS_KEY" = "ak",
    "AWS_SECRET_KEY" = "sk",
    "AWS_REGION" = "us"
)
PROPERTIES ( "timeout" = "3600" );
```
```sql
LOAD LABEL label_gcs_test
(
    DATA INFILE("https://storage.googleapis.com/bucket/tables_prod/parquet_log_report/part*")
    INTO TABLE load_test
    FORMAT AS "parquet"
)
WITH S3
(
    "AWS_ENDPOINT" = "https://storage.googleapis.com",
    "AWS_ACCESS_KEY" = "ak",
    "AWS_SECRET_KEY" = "sk",
    "AWS_REGION" = "us",
    "use_path_style" = "true"
)
PROPERTIES ( "timeout" = "3600" );
```

Whichever method I used, I always got the same error. Then I tried it in code:
```java
URI uri = new URI("s3://bucket/tables_prod/parquet_log_report/part*");
```

The bucket (`uri.getHost()`) is null.
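For context, here is a minimal, self-contained sketch of that check (the bucket name above is anonymized; `my_bucket` below is a hypothetical name). One known case where `java.net.URI.getHost()` returns null is a bucket name that is not a valid RFC 2396 hostname, for example one containing an underscore, while `getAuthority()` still returns the raw bucket string:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BucketHostCheck {
    public static void main(String[] args) throws URISyntaxException {
        // "bucket" mirrors the anonymized name in this report;
        // "my_bucket" is a hypothetical name containing an underscore.
        String[] locations = {
            "s3://bucket/tables_prod/parquet_log_report/part*",
            "s3://my_bucket/tables_prod/parquet_log_report/part*"
        };
        for (String location : locations) {
            URI uri = new URI(location);
            // getHost() is null when the authority is not a valid hostname
            // (e.g. contains "_"); getAuthority() still holds the raw text.
            System.out.println(location);
            System.out.println("  host      = " + uri.getHost());
            System.out.println("  authority = " + uri.getAuthority());
        }
    }
}
```

If that is what happens here, deriving the bucket from `getAuthority()` instead of `getHost()` would avoid the null value.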
What You Expected?
S3 load should load data stored in GCS successfully.
How to Reproduce?
No response
Anything Else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct