[refactor](oss) unify FE OSS filesystem with Jindo #61269
CalvinKirs merged 2 commits into apache:master
run buildall
TPC-H: Total hot run time: 27835 ms
TPC-DS: Total hot run time: 153036 ms
FE UT Coverage Report: Increment line coverage
run buildall
TPC-H: Total hot run time: 27041 ms
TPC-DS: Total hot run time: 168679 ms
/review
    hadoopStorageConfig.set("fs.oss.accessKeyId", accessKey);
    hadoopStorageConfig.set("fs.oss.accessKeySecret", secretKey);
    hadoopStorageConfig.set("fs.oss.endpoint", endpoint);
    hadoopStorageConfig.set("fs.oss.region", region);
Were we missing this before?
Code Review Summary
PR: refactor unify FE OSS filesystem with Jindo
This PR unifies the FE-side OSS Hadoop filesystem implementation to Jindo FS, replacing the legacy
Critical Checkpoint Conclusions
Minor Pre-existing Note (not blocking)
Verdict: No issues found. The PR is clean and ready.
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
This PR unifies the FE-side OSS Hadoop filesystem implementation to Jindo FS and removes legacy OSS filesystem dependencies that are no longer needed.

## Why

We currently have multiple OSS filesystem implementations on the FE classpath, including:

- `org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem`
- `paimon-oss`

This makes OSS behavior inconsistent and increases the chance of classpath conflicts. Since Doris already packages and uses Jindo FS, FE should consistently use Jindo instead of mixing multiple OSS filesystem implementations.

## Changes

- Switch `OSSProperties` to use Jindo FS:
  - `fs.oss.impl = com.aliyun.jindodata.oss.JindoOssFileSystem`
  - `fs.AbstractFileSystem.oss.impl = com.aliyun.jindodata.oss.JindoOSS`
- Keep `OSSHdfsProperties` aligned with the same Jindo FS constants.
- Add FE unit test coverage to verify OSS Hadoop config is initialized with Jindo FS.
- Remove legacy OSS filesystem dependencies from FE modules:
  - remove `paimon-oss` from `fe-core`
  - remove `paimon-oss` from `preload-extensions`
  - remove `hadoop-aliyun` from FE dependency management and `hadoop-deps`

## Scope

This PR only updates FE-side OSS filesystem wiring and FE-related dependency cleanup. Non-FE modules are intentionally left unchanged.

## Verification

- `run-fe-ut.sh --run org.apache.doris.datasource.property.storage.OSSPropertiesTest,org.apache.doris.datasource.property.storage.OSSHdfsPropertiesTest`
- Full FE reactor build passed

## Notes

`aliyun-sdk-oss` is still kept because it is still used by FE cloud storage code (`OssRemote`) and is not part of the Hadoop OSS filesystem implementation cleanup in this PR.
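
For readers who want to see the resulting wiring in one place, here is a minimal sketch of the key/value pairs described under Changes, together with the credential keys from the reviewed snippet. It is not the Doris implementation: the class `OssJindoConfigSketch` and its methods `buildOssConfig` and `usesJindo` are hypothetical names, and a plain `Map` stands in for the Hadoop `Configuration` object so that the example runs without any Hadoop or Jindo jars. Only the property keys and the two Jindo class names are taken from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the Doris OSSProperties class.
final class OssJindoConfigSketch {
    // Class names as quoted in the PR description.
    static final String JINDO_FS_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem";
    static final String JINDO_ABSTRACT_FS_IMPL = "com.aliyun.jindodata.oss.JindoOSS";

    // Builds the entries that would be copied into a Hadoop Configuration.
    // A Map is used here as a stand-in (assumption) to keep the sketch dependency-free.
    static Map<String, String> buildOssConfig(String accessKey, String secretKey,
                                              String endpoint, String region) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("fs.oss.impl", JINDO_FS_IMPL);
        conf.put("fs.AbstractFileSystem.oss.impl", JINDO_ABSTRACT_FS_IMPL);
        conf.put("fs.oss.accessKeyId", accessKey);
        conf.put("fs.oss.accessKeySecret", secretKey);
        conf.put("fs.oss.endpoint", endpoint);
        conf.put("fs.oss.region", region);
        return conf;
    }

    // Mirrors the kind of check a unit test could make: both impl keys point at Jindo.
    static boolean usesJindo(Map<String, String> conf) {
        return JINDO_FS_IMPL.equals(conf.get("fs.oss.impl"))
                && JINDO_ABSTRACT_FS_IMPL.equals(conf.get("fs.AbstractFileSystem.oss.impl"));
    }

    public static void main(String[] args) {
        // Placeholder credentials and endpoint, chosen only for the example.
        Map<String, String> conf = buildOssConfig(
                "example-ak", "example-sk", "oss-cn-hangzhou.aliyuncs.com", "cn-hangzhou");
        System.out.println("uses Jindo: " + usesJindo(conf));
    }
}
```

In a real FE build the same keys would presumably be applied with `Configuration.set(...)`, as in the reviewed lines above; the `usesJindo` check is only a rough analogue of what `OSSPropertiesTest` might assert.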