[draft]Branch fs spi phase5 clean master#62022
Closed
morningman wants to merge 43 commits into apache:master from
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Before splitting filesystem implementations into independent Maven modules (Phase 3), several compile-time couplings must be eliminated. This commit completes all Phase 0 prerequisite decoupling tasks:
- P0.1: Introduce FsStorageType enum in fe-foundation (zero-dep module) to replace StorageBackend.StorageType (Thrift-generated) in PersistentFileSystem. Add FsStorageTypeAdapter for bidirectional Thrift conversion. Update all subclasses and callers (Repository, BackupJob, RestoreJob, CloudRestoreJob).
- P0.2: Add IOException-based default bridge methods to ObjStorage interface (checkObjectExists, getObjectChecked, putObjectChecked, deleteObjectChecked, deleteObjectsChecked, copyObjectChecked, listObjectsChecked). Add ObjStorageStatusAdapter for Status→IOException conversion. Zero changes to existing implementations.
- P0.3: Decouple SwitchingFileSystem from ExternalMetaCacheMgr via new FileSystemLookup functional interface. FileSystemProviderImpl passes a lambda.
- P0.4: Extract MultipartUploadCapable interface from ObjFileSystem, removing the forced abstract method. S3FileSystem and AzureFileSystem implement it. HMSTransaction now uses instanceof check instead of ObjFileSystem cast.
- P0.5: Introduce FileSystemDescriptor POJO for Repository metadata serialization, replacing direct PersistentFileSystem subclass serialization. Migrate GsonUtils to string-based Class.forName() reflection for legacy format backward compat, removing 7 compile-time imports of concrete filesystem classes.
- P0.6: Add FileSystemSpiProvider interface skeleton in fs/spi/ as the future ServiceLoader contract for Phase 3 module split.
### Release note
None
### Check List (For Author)
- Test: No need to test (pure refactor; all changes are backward compatible; three successful FE builds verified during development)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
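The P0.2 bridge idea (a `default` method that turns a Status-style result into an `IOException`) can be sketched roughly as follows. This is a minimal illustration, not the Doris code: `Status`, `ObjStore`, and `BridgeDemo` are hypothetical stand-ins, and only the method name `deleteObjectChecked` is taken from the commit message.

```java
import java.io.IOException;

// Hypothetical stand-in for the Status-style result used by the legacy API.
final class Status {
    static final Status OK = new Status(true, "");
    final boolean ok;
    final String message;

    Status(boolean ok, String message) {
        this.ok = ok;
        this.message = message;
    }
}

// Hypothetical storage interface: one legacy method plus a default bridge.
interface ObjStore {
    Status deleteObject(String path);

    // Default bridge: existing implementations inherit it without changes.
    default void deleteObjectChecked(String path) throws IOException {
        Status st = deleteObject(path);
        if (!st.ok) {
            throw new IOException("delete failed for " + path + ": " + st.message);
        }
    }
}

final class BridgeDemo {
    // A toy implementation that only knows the legacy, Status-returning method.
    static ObjStore failingOn(String badPath) {
        return path -> path.equals(badPath) ? new Status(false, "not found") : Status.OK;
    }
}
```

Because the bridge is a `default` method, an implementation written against the legacy method alone still gets the exception-based variant, which is the property the commit describes as "zero changes to existing implementations".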
Maven build cache was caching checkstyle:check results and emitting 'Skipping plugin execution (cached)' even when sources had changed. Two fixes:
1. Add check/checkstyle/ to global cache input so that changes to checkstyle rules (checkstyle.xml, suppressions.xml, etc.) correctly invalidate all module caches.
2. Mark the checkstyle:check execution (id: validate) as runAlways in executionControl so it is never skipped regardless of cache state. Checkstyle is a quality gate and must always execute.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ObjFileSystem: remove unused Map import (was used by the now-removed abstract completeMultipartUpload method)
- GsonUtils: fix CustomImportOrder violations. LogManager/Logger imports were inserted before com.google.* imports; move them after all com.* imports in correct lexicographical order
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…pache#61862)
## Summary
This PR completes **Phase 0** of the [FE filesystem SPI refactoring](apache#61860): removing compile-time couplings that would otherwise prevent splitting filesystem implementations into independent Maven modules in later phases.
## Changes
### P0.1 — FsStorageType enum migration
- Introduce `FsStorageType` enum in `fe-foundation` (zero-dependency module) to replace Thrift-generated `StorageBackend.StorageType` in `PersistentFileSystem`
- Add `FsStorageTypeAdapter` in `fe-core` for bidirectional Thrift↔FsStorageType conversion
- Update all subclasses and callers: `Repository`, `BackupJob`, `RestoreJob`, `CloudRestoreJob`
### P0.2 — ObjStorage IOException bridge
- Add `IOException`-based `default` bridge methods to `ObjStorage` interface
- Add `ObjStorageStatusAdapter` for `Status→IOException` conversion
### P0.3 — SwitchingFileSystem decoupling
- Introduce `FileSystemLookup` functional interface
- Decouple `SwitchingFileSystem` from `ExternalMetaCacheMgr`
### P0.4 — MultipartUploadCapable interface
- Extract `MultipartUploadCapable` interface from `ObjFileSystem`
- `S3FileSystem` and `AzureFileSystem` implement it; `HMSTransaction` uses `instanceof` check
### P0.5 — GsonUtils compile-time decoupling
- Introduce `FileSystemDescriptor` POJO for `Repository` metadata serialization
- `GsonUtils` removes 7 compile-time concrete class imports, uses `Class.forName()` reflection
### P0.6 — FileSystemSpiProvider skeleton
- Add `FileSystemSpiProvider` interface in `fs/spi/`
### Build
- Fix Maven build cache incorrectly skipping `checkstyle:check`
- Fix checkstyle violations (unused import, import order)
## Testing
- FE build: ✅
- Checkstyle: 0 violations ✅

Closes part of apache#61860

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…stem API and value objects (apache#61908)
### What problem does this PR solve?
Issue Number: apache#61860
Problem Summary: The existing FileSystem interface uses Status-based return values, bare String paths, and Hadoop-dependent RemoteFile objects throughout the FE codebase, making it hard to test and impossible to isolate from Hadoop at the module boundary. Phase 1 introduces the new clean IOException-based FileSystem API with typed Location value objects, while preserving full backward compatibility via LegacyFileSystemApi.
### Release note
None
### Check List (For Author)
- Test: Manual build verification (./build.sh --fe) passes with zero errors
- Behavior changed: No (all existing code paths preserved via LegacyFileSystemApi)
- Does this need documentation: No

New files:
- Location.java: immutable URI value object replacing bare String paths
- FileEntry.java: immutable file/dir descriptor replacing Hadoop-dependent RemoteFile
- FileIterator.java: lazy Closeable iterator interface for directory listing
- LegacyFileSystemApi.java: @deprecated copy of old FileSystem interface (Status-based)
- LegacyFileSystemAdapter.java: abstract bridge implementing new FileSystem via legacy* methods
- LegacyToNewFsAdapter.java: wraps any LegacyFileSystemApi as new FileSystem
- MemoryFileSystem.java: in-memory FileSystem for unit testing

Modified files:
- FileSystem.java: replaced with new clean IOException-based interface
- PersistentFileSystem, LocalDfsFileSystem, SwitchingFileSystem: implements LegacyFileSystemApi
- FileSystemProvider, FileSystemLookup: return LegacyFileSystemApi
- DorisInputFile, DorisOutputFile: add location() method; deprecate path()
- HdfsInputFile, HdfsOutputFile, HdfsInputStream: use Location instead of ParsedPath
- ParsedPath: @deprecated + toLocation() conversion method
- RemoteFile: @deprecated + toFileEntry()/fromFileEntry() conversion methods
- RemoteFiles, RemoteFileRemoteIterator: @deprecated
- All callers updated to use LegacyFileSystemApi type

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
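To illustrate what "immutable URI value object replacing bare String paths" might look like, here is a small sketch. It is an assumption-laden illustration only: the class name `Loc` and its methods (`of`, `scheme`, `path`, `child`) are hypothetical and are not the actual `Location` API from this PR.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical immutable location value object (not the real Doris Location class).
final class Loc {
    private final URI uri;

    private Loc(URI uri) {
        this.uri = uri;
    }

    // Parse and normalize once, so equal locations compare equal afterwards.
    static Loc of(String raw) {
        return new Loc(URI.create(raw).normalize());
    }

    String scheme() {
        return uri.getScheme();
    }

    String path() {
        return uri.getPath();
    }

    // Resolve a child name under this location, treating it as a directory.
    Loc child(String name) {
        String base = uri.toString();
        return of(base.endsWith("/") ? base + name : base + "/" + name);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Loc && uri.equals(((Loc) o).uri);
    }

    @Override
    public int hashCode() {
        return Objects.hash(uri);
    }

    @Override
    public String toString() {
        return uri.toString();
    }
}
```

A typed value like this lets callers pass `Loc` instead of `String`, so scheme parsing and normalization happen in one place rather than at every call site.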
…#61909) followup apache#61908 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…module (apache#61911)
### What problem does this PR solve?
Issue Number: apache#61860
Problem Summary: Eliminates duplicate object storage abstraction in the Cloud mode codebase. Previously, cloud.storage.RemoteBase and its subclasses (S3Remote, OssRemote, CosRemote, ObsRemote, BosRemote, AzureRemote, etc.) provided object storage operations independently from the fs.obj module. This change migrates all functionality into fs.obj and removes the entire Remote* hierarchy, so all Cloud Stage/Copy operations now go through ObjFileSystem.

Key changes:
- Add OssObjStorage, CosObjStorage, ObsObjStorage, BosObjStorage as new ObjStorage implementations with STS token, presigned URL, list, and head support
- Extend AzureObjStorage and S3ObjStorage with getStsToken, getPresignedUrl, listObjectsWithPrefix, headObjectWithMeta
- Add ObjFileSystem passthrough methods: getStsToken, getPresignedUrl, listObjectsWithPrefix, headObjectWithMeta, deleteObjectsByKeys
- Extract RemoteBase.ObjectInfo inner class to standalone cloud.storage.ObjectInfo
- Rewrite ObjectInfoAdapter: no RemoteBase dependency; uses ObjFileSystem for STS in ARN flow
- Migrate all callers (StageUtil, CopyLoadPendingTask, CleanCopyJobTask, CopyIntoAction, CreateStageCommand, CopyIntoInfo) to ObjFileSystem
- Delete RemoteBase, S3Remote, OssRemote, CosRemote, ObsRemote, BosRemote, TosRemote, AzureRemote, DefaultRemote, MockRemote
- Rewrite CopyLoadPendingTaskTest and StageUtilTest to use MockUp<ObjFileSystem>
- Fix Maven build cache causing test execution to be skipped (run-fe-ut.sh)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ent SPI modules (apache#61919)
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Refactor the Apache Doris FE filesystem layer so that each storage backend (S3, OSS, COS, OBS, Azure, HDFS, Local) lives in its own independent Maven module and is discovered at runtime via Java ServiceLoader (SPI pattern). Adding a new storage backend now requires zero changes to fe-core.

Changes by step:

**P3.0b** – Rename `FileSystemProvider` → `LegacyFileSystemProviderFactory` to free up the name `FileSystemProvider` for the new SPI interface.

**Step 1** – Create `fe/fe-filesystem/` aggregator POM and `fe-filesystem-spi` module. Defines the stable SPI contracts:
- `FileSystem`, `ObjStorage`, `FileSystemProvider` interfaces
- `HadoopAuthenticator`, `IOCallable` for HDFS/Kerberos
- Value objects: `FileEntry`, `FileIterator`, `InputFile`, `OutputFile`, `RequestBody`, `ObjectMetadata`, `MultipartPart`, `UploadedPart`

**Step 2** – `fe-filesystem-s3` module:
- `S3Uri`, `S3ObjStorage`, `S3FileSystem`, `S3OutputStream`
- `S3FileSystemProvider` registered via META-INF/services

**Step 3** – `fe-filesystem-oss`, `fe-filesystem-cos`, `fe-filesystem-obs`:
- Thin providers that reuse S3FileSystem via S3-compatible APIs
- Each translates vendor-specific property keys to AWS_* keys

**Step 3b** – `fe-filesystem-azure` module:
- `AzureUri` (wasb/wasbs/abfs/abfss/https), `AzureObjStorage`
- Supports shared-key, service-principal, and default-credential auth
- Multipart upload via Azure block-blob stageBlock/commitBlockList

**Step 4** – `fe-filesystem-hdfs` module:
- `HdfsConfigBuilder`, `SimpleHadoopAuthenticator`, `KerberosHadoopAuthenticator`
- `DFSFileSystem` implements `spi.FileSystem` directly
- Supports hdfs/viewfs/ofs/jfs/oss schemes

**Step 6** – `fe-filesystem-local` (test-only):
- `LocalFileSystem` using `java.nio.file` for unit-test isolation

**Step 7** – `fe-core` integration:
- `fe-filesystem-spi` added as compile dep; all impl modules as runtime deps
- New `StoragePropertiesConverter` translates `StorageProperties` → Map
- `FileSystemFactory` extended with `getFileSystem(Map)` / `getFileSystem(StorageProperties)` using ServiceLoader (double-checked locking); old `get()` methods kept `@Deprecated` for Phase-4 migration
- Fix checkstyle violations (import order, line length) in HMS classes introduced by P3.0b rename
### Release note
None (internal refactoring; no user-visible behavior change)
### Check List (For Author)
- Test: `mvn install -pl fe-core -am -DskipTests` passes (BUILD SUCCESS)
- Behavior changed: No (legacy API fully preserved; new SPI API added alongside)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
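The "ServiceLoader (double-checked locking)" discovery mentioned in Step 7 could look roughly like the sketch below. `FsProvider`, `ProviderRegistry`, `supports`, and `select` are hypothetical names chosen for illustration; the real `FileSystemProvider` and `FileSystemFactory` signatures are not shown in this PR text, so treat this as one plausible shape rather than the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical SPI contract; real providers would be listed in META-INF/services.
interface FsProvider {
    boolean supports(Map<String, String> props);

    String name();
}

final class ProviderRegistry {
    // volatile is required for double-checked locking to be safe.
    private static volatile List<FsProvider> providers;

    // Discover providers once, on first use, and reuse the list afterwards.
    static List<FsProvider> providers() {
        List<FsProvider> local = providers;
        if (local == null) {
            synchronized (ProviderRegistry.class) {
                local = providers;
                if (local == null) {
                    local = new ArrayList<>();
                    for (FsProvider p : ServiceLoader.load(FsProvider.class)) {
                        local.add(p);
                    }
                    providers = local;
                }
            }
        }
        return local;
    }

    // Pick the first provider that claims the given properties.
    static FsProvider select(List<FsProvider> candidates, Map<String, String> props) {
        for (FsProvider p : candidates) {
            if (p.supports(props)) {
                return p;
            }
        }
        throw new IllegalArgumentException("no provider supports " + props);
    }
}
```

With this shape, adding a backend means shipping a JAR with a provider class and a services entry; the registry code itself does not change.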
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Phase 4 step P4.0: add FileSystemTransferUtil, a utility class that provides higher-level transfer operations (download, upload, directUpload, globList) built on top of the spi.FileSystem primitives. This utility is a prerequisite for migrating all 21 FileSystemFactory.get() call sites to the new SPI API.
Also adds fe-filesystem-local as a test-scope dependency in fe-core/pom.xml to enable unit tests using LocalFileSystemProvider.
### Release note
None
### Check List (For Author)
- Test: Unit Test (FileSystemTransferUtilTest: 12 tests, all passing)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ading infrastructure
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Phase 4 P4.1 introduces the plugin-directory loading mechanism so fe-core no longer bundles cloud filesystem providers as runtime JARs. Providers are now discovered at startup by FileSystemPluginManager via DirectoryPluginRuntimeManager (production) or ServiceLoader (tests / classpath). This decouples fe-core from transitive cloud-SDK dependencies.

Changes:
- fe-filesystem-spi/pom.xml: add fe-extension-spi compile dep (enables PluginFactory)
- FileSystemProvider: extends PluginFactory; add default Plugin create() bridge
- FileSystemPluginManager: new class, dual-path provider loading (ServiceLoader + DirectoryPluginRuntimeManager), createFileSystem() delegation
- FileSystemFactory: add initPluginManager() static init; getFileSystem() delegates to manager when set, falls back to ServiceLoader for tests
- Config.java + fe.conf: add filesystem_plugin_root config key
- Env.java: call initFileSystemPluginManager() during FE startup
- fe-core/pom.xml: remove 6 fe-filesystem-* runtime deps; keep local as test-scope
- build.sh: deploy each fe-filesystem-<name>.jar + deps into output/fe/plugins/filesystem/<name>/ at build time
### Release note
None
### Check List (For Author)
- Test: No need to test (P4.1 is infrastructure; integration covered by P4.8 tests)
- Behavior changed: Yes: filesystem providers loaded from plugin directory at runtime instead of bundled JARs; behavior identical when plugin dir is configured correctly
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
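For readers unfamiliar with plugin-directory loading, the general technique (one class loader per `<root>/<name>/` directory, then `ServiceLoader` against that loader) can be sketched as below. This does not reproduce `DirectoryPluginRuntimeManager`, whose internals are not shown here; `PluginDirLoader` and `DemoFsProvider` are hypothetical names, and the layout only mirrors the `plugins/filesystem/<name>/` convention described in the commit.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical SPI type used only for this illustration.
interface DemoFsProvider {
    String name();
}

final class PluginDirLoader {
    // Loads implementations of `spi` from <root>/<plugin>/*.jar, one class loader per plugin.
    static <T> List<T> load(Path root, Class<T> spi) throws IOException {
        List<T> found = new ArrayList<>();
        try (DirectoryStream<Path> plugins = Files.newDirectoryStream(root, Files::isDirectory)) {
            for (Path plugin : plugins) {
                List<URL> jars = new ArrayList<>();
                try (DirectoryStream<Path> files = Files.newDirectoryStream(plugin, "*.jar")) {
                    for (Path jar : files) {
                        jars.add(jar.toUri().toURL());
                    }
                }
                // The loader is intentionally not closed: loaded providers keep using it.
                // Note: providers visible to the parent loader are also returned here.
                ClassLoader cl = new URLClassLoader(jars.toArray(new URL[0]), spi.getClassLoader());
                for (T impl : ServiceLoader.load(spi, cl)) {
                    found.add(impl);
                }
            }
        }
        return found;
    }
}
```

Giving each plugin its own class loader keeps one backend's SDK dependencies from leaking into another's, which is the decoupling benefit the commit message points at.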
…i.FileSystem
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Phase 4 P4.2 migrates FileSystemCache from returning legacy RemoteFileSystem to spi.FileSystem, and updates all downstream callers to use the new SPI API instead of LegacyFileSystemApi.

Changes:
- FileSystemCache: change cache type to spi.FileSystem; use FileSystemFactory.getFileSystem() in loadFileSystem(); rename getRemoteFileSystem() to getFileSystem(); add getProperties() to FileSystemCacheKey
- DirectoryLister: change listFiles() parameter from LegacyFileSystemApi to spi.FileSystem
- FileSystemDirectoryLister: reimplement using FileSystemTransferUtil.globList() + FileEntry→RemoteFile conversion (transitional, RemoteFile not yet replaced)
- TransactionScopeCachingDirectoryLister: update parameter types
- AcidUtil: change parameter types to spi.FileSystem; replace Status-based exists/globList/listFiles calls with spi.FileSystem equivalents
- HiveExternalMetaCache: change to use fsCache.getFileSystem()
- HiveUtil.isSplittable(): change parameter from RemoteFileSystem to spi.FileSystem (BrokerFileSystem instanceof check retained as no-op pending broker SPI module)
- FileSystemProviderImpl: use legacy FileSystemFactory.get() directly since SwitchingFileSystem is not yet migrated to spi.FileSystem
- HiveAcidTest: use spi.LocalFileSystem instead of LocalDfsFileSystem for AcidUtil calls
- TransactionScopeCachingDirectoryListerTest: update mock to match new interface
### Release note
None
### Check List (For Author)
- Test: Regression test (HiveAcidTest + TransactionScopeCachingDirectoryListerTest compile and pass after migration; build succeeds with -DskipTests)
- Behavior changed: Internal: FileSystem objects are now loaded from SPI providers rather than legacy implementations in fe-core
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…System
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Replace FileSystemFactory.get() + RemoteFileSystem.deleteDirectory() with FileSystemFactory.getFileSystem() + spi.FileSystem.delete(Location, true) in InsertIntoTVFCommand.deleteExistingFiles().
### Release note
None
### Check List (For Author)
- Test: Build succeeds with -DskipTests
- Behavior changed: No (same delete-directory semantics via SPI)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Type in RepositoryMgr
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Remove direct dependency on concrete S3FileSystem/AzureFileSystem implementation classes in RepositoryMgr. Use FsStorageType enum from FileSystemDescriptor instead of instanceof checks, so RepositoryMgr no longer needs compile-time references to legacy filesystem implementation classes.

Changes:
- Repository: add getFileSystemDescriptor() public accessor
- RepositoryMgr: replace instanceof S3FileSystem/AzureFileSystem check with getFileSystemDescriptor().getStorageType() == FsStorageType.S3/AZURE; remove imports of S3FileSystem and AzureFileSystem
### Release note
None
### Check List (For Author)
- Test: Build succeeds with -DskipTests
- Behavior changed: No (S3FileSystem and AzureFileSystem always had storageType S3 and AZURE respectively)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…4.3)
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Cloud callers (StageUtil, CopyLoadPendingTask, CleanCopyJobTask,
ObjectInfoAdapter, CopyIntoAction, CreateStageCommand) were using the legacy
ObjFileSystem/RemoteFileSystem API. This commit migrates them to the new
org.apache.doris.filesystem.spi.FileSystem SPI, completing Phase 4.3 of the
filesystem SPI migration.
### Changes
**fe-filesystem-spi:**
- New StsCredentials class replacing Triple<String,String,String> for STS token results
- ObjStorage: added 5 default cloud-specific methods (getStsToken, listObjectsWithPrefix,
headObjectWithMeta, getPresignedUrl, deleteObjectsByKeys) with UnsupportedOperationException
defaults
- ObjFileSystem: added 5 delegate methods forwarding to the underlying ObjStorage
**fe-filesystem-s3:**
- S3ObjStorage: added PROP_BUCKET/PROP_ROLE_ARN/PROP_EXTERNAL_ID constants, bucket field,
and full implementations of all 5 cloud-specific methods ported from legacy S3ObjStorage
**fe-core:**
- StoragePropertiesConverter: added AWS_BUCKET and STS key (AWS_ROLE_ARN/AWS_EXTERNAL_ID)
pass-through for AbstractS3CompatibleProperties
- 6 cloud callers migrated from FileSystemFactory.get() to FileSystemFactory.getFileSystem():
CleanCopyJobTask, CopyLoadPendingTask, StageUtil, ObjectInfoAdapter, CopyIntoAction,
CreateStageCommand
- CloudInternalCatalog.filterCopyFiles: updated parameter from ObjectFile to RemoteObject
- CopyLoadPendingTaskTest: updated mocks and types to use new SPI types
### Release note
None
### Check List (For Author)
- Test: Build passes; CopyLoadPendingTaskTest updated with new SPI types
- Build verified with ./build.sh --fe
- Behavior changed: No (same runtime behavior, different abstraction layer)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
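Two ideas from the P4.3 commit above lend themselves to a short sketch: a named credentials type in place of `Triple<String,String,String>`, and optional capabilities expressed as `default` methods that throw `UnsupportedOperationException`. The types below (`StsCreds`, `CloudObjStore`, `PlainStore`, `StsStore`) are hypothetical; only the method name `getStsToken` comes from the commit text, and its real signature is an assumption.

```java
// Hypothetical named replacement for a Triple<String, String, String>.
final class StsCreds {
    final String accessKey;
    final String secretKey;
    final String sessionToken;

    StsCreds(String accessKey, String secretKey, String sessionToken) {
        this.accessKey = accessKey;
        this.secretKey = secretKey;
        this.sessionToken = sessionToken;
    }
}

// Hypothetical storage interface with an optional, cloud-specific capability.
interface CloudObjStore {
    // Backends that cannot issue STS tokens simply inherit this default.
    default StsCreds getStsToken() {
        throw new UnsupportedOperationException(
                "getStsToken is not supported by " + getClass().getSimpleName());
    }
}

// A backend without STS support: no override needed.
final class PlainStore implements CloudObjStore {
}

// A backend with STS support overrides the default.
final class StsStore implements CloudObjStore {
    @Override
    public StsCreds getStsToken() {
        // Placeholder values for illustration only.
        return new StsCreds("ak", "sk", "token");
    }
}
```

Named fields make call sites self-describing (`creds.sessionToken` rather than `triple.getRight()`), and the throwing defaults let the interface grow without forcing every backend to implement every method.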
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
Iceberg's DelegateFileIO and its 3 companion classes (DelegateInputFile,
DelegateOutputFile, DelegateSeekableInputStream) were using the legacy
LegacyFileSystemApi / fs.io.* APIs. This commit migrates them to the new
org.apache.doris.filesystem.spi.FileSystem SPI, completing phase P4.4 of
the filesystem SPI migration.
Changes:
- SPI layer: add DorisInputStream (seekable InputStream abstraction),
extend DorisInputFile with exists()/lastModifiedTime() defaults and
change newStream() return type to DorisInputStream; add
newInputFile(Location, long) default to FileSystem
- HDFS: add HdfsSeekableInputStream wrapping FSDataInputStream; update
HdfsInputFile to return it
- S3: add openInputStreamAt()/headObjectLastModified() to S3ObjStorage;
update S3InputFile and add S3SeekableInputStream (lazy range-based seek)
- Azure: add openInputStreamAt()/headObjectLastModified() to
AzureObjStorage using BlobInputStreamOptions+BlobRange; update
AzureInputFile and add AzureSeekableInputStream
- Local: add LocalSeekableInputStream using RandomAccessFile; update
anonymous DorisInputFile in LocalFileSystem
- MemoryFileSystem: fix Phase 2 placeholder in newStream(); add
MemorySeekableInputStream extending fs.io.DorisInputStream
- DelegateSeekableInputStream: migrate to spi.DorisInputStream, use
getPos() instead of getPosition()
- DelegateInputFile: migrate to spi.DorisInputFile, use location()
instead of deprecated path()
- DelegateOutputFile: migrate constructor from
(LegacyFileSystemApi, ParsedPath) to (spi.FileSystem, spi.Location)
- DelegateFileIO: migrate field/initialize() to use spi.FileSystem via
FileSystemFactory.getFileSystem(); update all methods to use
spi.Location; simplify deleteFiles() to iterate individually
### Release note
None
### Check List (For Author)
- Test: Build verification (./build.sh --fe) passes
- Manual test: N/A (DelegateFileIO is not yet enabled in production)
- Behavior changed: No (DelegateFileIO enabling line remains commented out)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
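The "lazy range-based seek" mentioned for S3SeekableInputStream can be illustrated with a small sketch: `seek()` only records the new position, and the underlying ranged stream is opened on the next read. Everything here is hypothetical (`RangeOpener`, `LazySeekStream`, `overBytes`); the real class presumably calls `openInputStreamAt()` on the object storage, which is replaced by an in-memory opener so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical hook: open a stream that starts at the given byte offset.
interface RangeOpener {
    InputStream openAt(long offset) throws IOException;
}

final class LazySeekStream extends InputStream {
    private final RangeOpener opener;
    private InputStream current; // null until the first read after open or seek
    private long pos;

    LazySeekStream(RangeOpener opener) {
        this.opener = opener;
    }

    long getPos() {
        return pos;
    }

    // Seeking is cheap: drop the current stream and remember the target offset.
    void seek(long newPos) throws IOException {
        if (newPos < 0) {
            throw new IOException("negative seek: " + newPos);
        }
        if (newPos != pos) {
            closeCurrent();
            pos = newPos;
        }
    }

    @Override
    public int read() throws IOException {
        if (current == null) {
            current = opener.openAt(pos); // the ranged request happens here, not in seek()
        }
        int b = current.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    @Override
    public void close() throws IOException {
        closeCurrent();
    }

    private void closeCurrent() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }

    // In-memory opener so the sketch is runnable without any object store.
    static LazySeekStream overBytes(byte[] data) {
        return new LazySeekStream(off -> {
            int start = (int) Math.min(off, data.length);
            return new ByteArrayInputStream(data, start, data.length - start);
        });
    }
}
```

Deferring the open means several consecutive seeks cost nothing, which matters when each open is a remote ranged GET.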
…criptor
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
BackupJob, RestoreJob and BackupHandler were calling
repo.getRemoteFileSystem().getStorageProperties().getBackendConfigProperties()
and repo.getRemoteFileSystem().getThriftStorageType() to populate BE
snapshot/upload/download task RPCs. These calls required a live
PersistentFileSystem object even though no actual I/O was performed —
only metadata extraction.
This commit adds getThriftStorageType() and getBackendConfigProperties()
to FileSystemDescriptor (the lightweight POJO already used for
serialization), then replaces all metadata-only getRemoteFileSystem()
calls with getFileSystemDescriptor() equivalents:
- FileSystemDescriptor: add getThriftStorageType() via FsStorageTypeAdapter,
add getBackendConfigProperties() via StorageProperties.createPrimary()
- BackupJob (2 sites): updateBrokerProperties + UploadTask construction
- RestoreJob (2 sites): updateBrokerProperties + createDownloadTask
- BackupHandler (1 site): mergeProperties() property map extraction
Repository.getRemoteFileSystem() and its I/O usages (globList, upload,
download, etc.) are intentionally left unchanged — they require a live
connection and are a separate concern.
### Release note
None
### Check List (For Author)
- Test: Build verification (./build.sh --fe) passes
- No behavior change: same values produced via different code path
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ithLimit)
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: S3SourceOffsetProvider used legacy RemoteFileSystem.globListWithLimit which depends on fe-core's S3URI and GlobListResult. Migrates to the new SPI layer to eliminate the dependency on legacy filesystem APIs.

Changes:
- Add spi.GlobListing: SPI replacement for GlobListResult carrying List<FileEntry>, bucket, prefix, and maxFile fields
- Add FileSystem.globListWithLimit() default method (throws UnsupportedOperationException) with full javadoc; S3FileSystem overrides it
- Implement S3FileSystem.globListWithLimit(): parses S3 URI, uses PathMatcher for glob filtering, paginates via ListObjectsV2 with optional startAfter and size/count limits
- S3SourceOffsetProvider: use FileSystemFactory.getFileSystem() returning spi.FileSystem, replace GlobListResult/RemoteFile with GlobListing/FileEntry, replace Status-based error handling with IOException; debug point now throws IOException directly
### Release note
None
### Check List (For Author)
- Test: Build succeeded; logic is structurally identical to the legacy globListInternal
- Behavior changed: No (same S3 listing logic, same offset semantics)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
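The filtering part of a glob-with-limit listing (PathMatcher for the glob, an optional `startAfter` key, and a count limit) can be sketched over an in-memory key list. This is an illustration under assumptions: `GlobLimit.globWithLimit` is a hypothetical helper, pagination via ListObjectsV2 and the size limit are omitted, and the keys are assumed to arrive in lexicographic order as S3 listings do.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class GlobLimit {
    // Returns up to maxFiles keys that match the glob and sort strictly after startAfter.
    static List<String> globWithLimit(List<String> sortedKeys, String glob,
                                      String startAfter, int maxFiles) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> out = new ArrayList<>();
        for (String key : sortedKeys) {
            if (startAfter != null && key.compareTo(startAfter) <= 0) {
                continue; // already consumed in an earlier batch
            }
            if (!matcher.matches(Paths.get(key))) {
                continue; // does not match the glob pattern
            }
            out.add(key);
            if (out.size() >= maxFiles) {
                break; // count limit reached; caller resumes from the last key
            }
        }
        return out;
    }
}
```

A caller that tracks offsets would pass the last returned key as `startAfter` on the next call, so each batch continues where the previous one stopped.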
….FileSystem (P4.7)
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: HMSTransaction and its surrounding infrastructure (HMSExternalCatalog, HiveTransactionManager, TransactionManagerFactory, FileSystemUtil) were using the legacy LegacyFileSystemApi / SwitchingFileSystem / MultipartUploadCapable chain. This commit migrates the entire Hive transaction write path to the new spi.FileSystem abstraction.

Key changes:
- Add SpiSwitchingFileSystem: new spi.FileSystem implementation that routes per-path to the appropriate FileSystem via LocationPath + FileSystemFactory; caches resolved filesystems; supports a test constructor for injecting a delegate
- spi.FileSystem: add listFiles(), listFilesRecursive(), listDirectories(), renameDirectory() default methods
- spi.ObjFileSystem: add completeMultipartUpload(path, uploadId, Map) convenience overload
- HMSTransaction: remove LegacyFileSystemApi/SwitchingFileSystem/MultipartUploadCapable; field fs is now spi.FileSystem (SpiSwitchingFileSystem). MPU abort uses ObjFileSystem.getObjStorage().abortMultipartUpload(). MPU commit uses ObjFileSystem.completeMultipartUpload(). Wrapper methods now throw IOException / RuntimeException instead of returning Status.
- FileSystemUtil: asyncRenameFiles/asyncRenameDir now accept spi.FileSystem
- HMSExternalCatalog: replaces FileSystemProviderImpl with SpiSwitchingFileSystem
- HiveTransactionManager / TransactionManagerFactory: parameter type changed from LegacyFileSystemProviderFactory to SpiSwitchingFileSystem
- Tests: HmsCommitTest and HMSTransactionPathTest migrated to spi.FileSystem; FakeFileSystem now implements spi.FileSystem; LocalDfsFileSystem replaced with LocalFileSystem (SPI)
### Release note
None
### Check List (For Author)
- Test: Unit Test (HmsCommitTest and HMSTransactionPathTest updated and compile)
- Behavior changed: No (same semantics, different API layer)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
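The "routes per-path and caches resolved filesystems" behavior described for SpiSwitchingFileSystem can be approximated by the sketch below. It is a simplification under assumptions: `SwitchingResolver` is a hypothetical class, the cache key (scheme plus authority) is a guess at a reasonable granularity, and the real implementation resolves through `LocationPath` and `FileSystemFactory` rather than a plain `Function`.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-path resolver: F stands in for the filesystem type.
final class SwitchingResolver<F> {
    private final Map<String, F> cache = new ConcurrentHashMap<>();
    private final Function<String, F> factory;

    // The factory receives the cache key (e.g. "s3://bucket") and builds a filesystem.
    SwitchingResolver(Function<String, F> factory) {
        this.factory = factory;
    }

    // Resolve the filesystem responsible for a path, creating it at most once per key.
    F forPath(String path) {
        URI uri = URI.create(path);
        String key = uri.getScheme() + "://" + uri.getAuthority();
        return cache.computeIfAbsent(key, factory);
    }
}
```

Using `computeIfAbsent` on a `ConcurrentHashMap` keeps the lookup thread-safe while ensuring the factory runs only once per key, which avoids rebuilding a client for every file touched in a transaction.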
…rokerUtil
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: As part of the filesystem SPI migration (Phase 5), this commit implements the `fe-filesystem-broker` Maven module, a zero-fe-core-dependency plugin that provides broker-based filesystem access via the unified `FileSystem` SPI. Two legacy `FileSystemFactory.get()` + `RemoteFileSystem` calls in `BrokerUtil` are migrated to use the new SPI.
### Release note
None
### Check List (For Author)
- Test: Manual build verification (FE build passes cleanly)
- Behavior changed: No (same broker Thrift RPC calls, new abstraction layer)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Migrates all I/O operations in Repository.java (initRepository, ping, listSnapshots, upload, download, getSnapshotInfo) from the legacy Status-based PersistentFileSystem API to the new IOException-based SPI FileSystem API. This is Phase 4.6-IO of the filesystem SPI migration.

Key design decisions:
- Non-broker repos: spiFs initialized once in constructor/gsonPostProcess via FileSystemFactory.getFileSystem(StorageProperties)
- Broker repos: spiFs=null; acquireSpiFs() resolves a live broker endpoint per I/O call via BrokerMgr (matching the lazy-resolution behavior of legacy BrokerFileSystem)
- Legacy PersistentFileSystem field is retained for metadata-only methods (getLocation, getInfo, getCreateStatement, getBrokerAddress); to be removed in a follow-up cleanup pass
### Release note
None
### Check List (For Author)
- Test: Build verification (FE build passes cleanly)
- Behavior changed: No (same operations, new abstraction layer)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ath (P4.6-meta)
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Repository.java previously held a transient PersistentFileSystem field used for both I/O and metadata operations. The I/O path was migrated to the SPI FileSystem in the previous commit. This commit completes the metadata migration, eliminating the last dependency on PersistentFileSystem from the live (non-persistence) path.

Changes:
- FileSystemDescriptor: add fromStorageProperties() factory method that maps StorageProperties → FileSystemDescriptor without going through a legacy PersistentFileSystem instance.
- Repository: remove transient fileSystem/getFileSystem()/getRemoteFileSystem() fields/methods; constructor now accepts StorageProperties directly; getLocation(), getBrokerAddress(), getInfo(), getCreateStatement() all use fileSystemDescriptor; gsonPostProcess() only initializes spiFs (no legacy fileSystem).
- BackupHandler: createRepository() and alterRepository() pass StorageProperties directly to Repository constructor; remove FileSystemFactory.get() (legacy API) and RemoteFileSystem imports.
- CloudRestoreJob: replace repo.getRemoteFileSystem().* calls with repo.getFileSystemDescriptor().* equivalents.
### Release note
None
### Check List (For Author)
- Test: Manual build verification (sh build.sh --fe)
- Behavior changed: No (same semantics, different internal path)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… FileSystem API
Issue Number: N/A
Problem Summary: After the Repository class was refactored to accept StorageProperties instead of RemoteFileSystem, and getRemoteFileSystem() was removed, 6 test files failed to compile (18 errors total). This commit fixes all compilation errors by updating the tests to use the new SPI-based API.

Changes:
- Repository.java: Fix listSnapshots() logic bug: old code called fs.listFiles() (files only) then filtered for isDirectory(), always yielding empty results. Rewritten to use fs.list() + FileIterator.
- RepositoryTest.java: Full rewrite of mocks from @Mocked RemoteFileSystem to @Mocked spi.FileSystem + MockUp<FileSystemFactory>; use BrokerProperties as the Repository constructor arg so getLocation() returns raw strings.
- BackupJobTest.java, RestoreJobTest.java, CloudRestoreJobTest.java: Replace FileSystemFactory.get(BrokerProperties.of(...)) with BrokerProperties.of(...) directly (constructor now takes StorageProperties).
- CreateRepositoryCommandTest.java: Replace getRemoteFileSystem().getProperties() with getFileSystemDescriptor().getProperties().
- S3FileSystemTest.java: Pass StorageProperties.createPrimary(properties) instead of an S3FileSystem instance to the Repository constructor.

None
- Test: Regression test / Unit Test / Manual test / No need to test (with reason)
- Build verified: mvn test-compile passes with no errors
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
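The listSnapshots() bug described above has a simple shape: filtering a files-only listing for directories can never match. The sketch below reproduces that shape with hypothetical types (`Entry`, `SnapshotLister`); it is not the Repository code, just an illustration of why the old logic always returned an empty result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listing entry: a name plus a directory flag.
final class Entry {
    final String name;
    final boolean directory;

    Entry(String name, boolean directory) {
        this.name = name;
        this.directory = directory;
    }
}

final class SnapshotLister {
    // Stand-in for a "files only" listing call: directories are dropped here.
    static List<Entry> filesOnly(List<Entry> all) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : all) {
            if (!e.directory) {
                out.add(e);
            }
        }
        return out;
    }

    // Buggy shape: the input already excludes directories, so the filter never matches.
    static List<String> dirsFromFilesOnly(List<Entry> all) {
        List<String> out = new ArrayList<>();
        for (Entry e : filesOnly(all)) {
            if (e.directory) {
                out.add(e.name);
            }
        }
        return out;
    }

    // Fixed shape: filter the full listing (files and directories) instead.
    static List<String> dirsFromFullListing(List<Entry> all) {
        List<String> out = new ArrayList<>();
        for (Entry e : all) {
            if (e.directory) {
                out.add(e.name);
            }
        }
        return out;
    }
}
```

This kind of bug compiles cleanly and fails silently, which is why it surfaced only once the tests were rewritten against the new API.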
Issue Number: N/A
Problem Summary: The META-INF/services/org.apache.doris.filesystem.spi.FileSystemProvider files in all 8 filesystem modules were missing the required Apache Software Foundation license header, causing license check failures.

None
- Test: No need to test (license header addition only)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…build.sh
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Six issues in the fe-filesystem plugin modules:
1. S3ObjStorage.getProperties() body was missing (accidental deletion), causing a compilation error.
2. S3FileSystem.java had a misindented block comment causing checkstyle failure.
3. fe-filesystem-broker files had import ordering violations (SAME_PACKAGE / THIRD_PARTY groups not separated by blank lines, and THIRD_PARTY imports appearing before SAME_PACKAGE imports in BrokerClientFactory/Pool).
4. BrokerClientFactory used a 4-arg TSocket constructor not present in thrift 0.16.0; replaced with the 3-arg (host, port, timeout) form.
5. BrokerClientPool used generic GenericKeyedObjectPoolConfig<T> not available in commons-pool2 2.2; replaced with raw type.
6. build.sh --fe did not compile the fe-filesystem plugin modules at all; added them to FE_MODULES so they are built alongside fe-core.
### Release note
None
### Check List (For Author)
- Test: No need to test (build fix only; all filesystem modules compile cleanly)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…plugins
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: The dependency:copy-dependencies step for filesystem plugins was running from DORIS_HOME with 'fe/fe-filesystem/...' paths, but there is no root-level pom.xml that includes these modules, causing 'Could not find the selected project in the reactor' errors. Fixed by running the command from DORIS_HOME/fe with 'fe-filesystem/...' paths. Also added the broker module to the copy loop (it was missing) and removed the now-unused FS_SPI_MODULE variable.
### Release note
None
### Check List (For Author)
- Test: sh build.sh --fe passes with BUILD SUCCESS and no errors
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tput copy
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: When Maven Build Cache restores a cache hit, it only restores the packaged JAR (doris-fe.jar) and skips re-running the maven-dependency-plugin copy-dependencies goal that populates target/lib/. This caused 'cp: cannot stat fe-core/target/lib/*: No such file or directory' when build.sh tried to copy dependency JARs into output/fe/lib/.
Fix: detect missing target/lib/ after the Maven build phase and explicitly run 'mvn dependency:copy-dependencies -pl fe-core' before the cp step.
### Release note
None
### Check List (For Author)
- Test: Manual test: ran 'sh build.sh --fe' successfully with BUILD SUCCESS
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…SystemAdapter
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Phase A of the P4.8 legacy filesystem class deletion. Both LegacyToNewFsAdapter and LegacyFileSystemAdapter have no external callers (confirmed by grep). LegacyToNewFsAdapter was a concrete adapter wrapping LegacyFileSystemApi; LegacyFileSystemAdapter was the abstract Status→IOException bridge. Both are now dead code after the P4.1–P4.7 caller migrations. Also removes the stale @see LegacyFileSystemAdapter javadoc tag from FileSystem.java.
### Release note
None
### Check List (For Author)
- Test: No need to test (deleting dead code with no callers)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… SPI FileEntry
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
The DirectoryLister interface and its implementations returned RemoteFile
(a Hadoop-dependent legacy class), forcing callers like HiveExternalMetaCache
and AcidUtil to depend on Hadoop Path objects. This phase eliminates that
dependency by:
1. Adding modificationTime to SPI FileEntry and RemoteObject so no metadata
is lost during migration.
2. All SPI providers (HDFS, S3, Azure, Broker, Local) now populate
modificationTime from their underlying SDKs.
3. DirectoryLister interface now returns RemoteIterator<FileEntry> (SPI).
4. FileSystemDirectoryLister, SimpleRemoteIterator, and
TransactionScopeCachingDirectoryLister all updated accordingly.
5. RemoteFileRemoteIterator and RemoteFiles deleted (no remaining callers).
6. HiveExternalMetaCache.addFile() now accepts FileEntry; converts
List<BlockInfo> to BlockLocation[] for HiveFileStatus.
7. AcidUtil removes toRemoteFiles() bridge; works with FileEntry directly.
8. PathVisibleTest updated to use String-based isFileVisible() signature.
### Release note
None
### Check List (For Author)
- Test: Regression test / Unit Test / Manual test / No need to test (with reason)
- Build verification (FE build succeeds)
- PathVisibleTest, TransactionScopeCachingDirectoryListerTest,
RepositoryTest, CopyLoadPendingTaskTest updated and compile clean
- Behavior changed: No (modificationTime was 0L before; now populated from SDK)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: The MultipartUploadCapable interface (legacy fe-core) was the only multipart upload contract for S3FileSystem and AzureFileSystem. All callers have already migrated to the SPI ObjFileSystem (which has its own completeMultipartUpload() method and ObjStorage.initiateMultipartUpload/uploadPart/abortMultipartUpload). The interface is now dead code:
- HMSTransaction already uses SPI ObjFileSystem directly
- No code casts to MultipartUploadCapable
Delete the interface and remove its implements clause from S3FileSystem and AzureFileSystem. The legacy completeMultipartUpload(bucket, key, uploadId, parts) methods remain in those classes pending Phase G deletion.
### Release note
None
### Check List (For Author)
- Test: No need to test (dead interface removal, FE build verified)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…gacy HDFS IO wrappers
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
- HdfsStorageVault.checkConnectivity() was instantiating legacy DFSFileSystem directly for makeDir/exists/delete operations using Status-based API
- Five legacy HDFS IO wrapper classes (HdfsInputFile, HdfsOutputFile, HdfsInputStream, HdfsOutputStream, HdfsInput) only had callers in the legacy DFSFileSystem.newInputFile()/newOutputFile() overrides
Changes:
- HdfsStorageVault: replace new DFSFileSystem(...) with FileSystemFactory.getFileSystem(StorageProperties) and use SPI mkdirs/exists/delete(Location) methods; IOException wrapped as DdlException
- DFSFileSystem: remove newInputFile()/newOutputFile() overrides (falls back to LegacyFileSystemApi default UnsupportedOperationException); remove unused imports (Location, HdfsInputFile, HdfsOutputFile, DorisInputFile, DorisOutputFile, ParsedPath)
- Delete: HdfsInputFile, HdfsOutputFile, HdfsInputStream, HdfsOutputStream, HdfsInput (all dead code after removing DFSFileSystem overrides)
Note: ExternalCatalog and HMSExternalCatalog still reference DFSFileSystem.PROP_ALLOW_FALLBACK_TO_SIMPLE_AUTH and DFSFileSystem.getHdfsConf() as static-only usage; full migration deferred to Phase G when legacy DFSFileSystem is deleted (blocked by OSSHdfsFileSystem, JFSFileSystem, OFSFileSystem subclasses).
### Release note
None
### Check List (For Author)
- Test: No need to test (HdfsStorageVault.checkConnectivity is integration-only, no existing unit test; behavior identical: the SPI HDFS provider uses the same Hadoop FileSystem under the hood)
- Behavior changed: No (same HDFS operations, same error handling)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
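The mkdirs/exists/delete connectivity probe with IOException wrapping, as described for HdfsStorageVault, might look roughly like the sketch below. Everything here is an assumption for illustration: `MiniFs` is a hypothetical three-method interface rather than the real SPI FileSystem, `ProbeException` stands in for DdlException, and the in-memory and failing implementations exist only to exercise the flow.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

final class ConnectivityProbeSketch {
    /** Hypothetical minimal filesystem contract that reports failures via IOException. */
    interface MiniFs {
        void mkdirs(String path) throws IOException;

        boolean exists(String path) throws IOException;

        void delete(String path) throws IOException;
    }

    /** Hypothetical stand-in for the DDL-level exception type. */
    static final class ProbeException extends Exception {
        ProbeException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Creates, verifies, and removes a probe directory; any IOException is wrapped. */
    static void checkConnectivity(MiniFs fs, String probeDir) throws ProbeException {
        try {
            fs.mkdirs(probeDir);
            if (!fs.exists(probeDir)) {
                throw new IOException("probe directory not visible after mkdirs: " + probeDir);
            }
            fs.delete(probeDir);
        } catch (IOException e) {
            throw new ProbeException("connectivity check failed: " + e.getMessage(), e);
        }
    }

    /** In-memory implementation used only to exercise the happy path. */
    static MiniFs inMemory() {
        final Set<String> dirs = new HashSet<>();
        return new MiniFs() {
            @Override
            public void mkdirs(String path) {
                dirs.add(path);
            }

            @Override
            public boolean exists(String path) {
                return dirs.contains(path);
            }

            @Override
            public void delete(String path) {
                dirs.remove(path);
            }
        };
    }

    /** Implementation whose first operation fails, modelling an unreachable cluster. */
    static MiniFs unreachable() {
        return new MiniFs() {
            @Override
            public void mkdirs(String path) throws IOException {
                throw new IOException("connection refused");
            }

            @Override
            public boolean exists(String path) {
                return false;
            }

            @Override
            public void delete(String path) {
            }
        };
    }
}
```

Keeping the probe against an interface rather than a concrete class is what lets the caller stay unchanged when the backing implementation is swapped.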
… provider infrastructure
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: FileSystemProviderImpl, LegacyFileSystemProviderFactory, SwitchingFileSystem, and FileSystemLookup had zero callers outside their own files. The entire legacy provider chain was dead code: no production or test code referenced FileSystemProviderImpl or LegacyFileSystemProviderFactory after previous phases migrated all callers to the SPI FileSystemFactory.
Changes:
- Delete FileSystemProviderImpl (no callers; only created SwitchingFileSystem)
- Delete LegacyFileSystemProviderFactory (interface with no callers)
- Delete SwitchingFileSystem (only instantiated by deleted FileSystemProviderImpl)
- Delete FileSystemLookup (FunctionalInterface only used by deleted SwitchingFileSystem)
- FileSystem.java: replace @see LegacyFileSystemApi javadoc with reference to SPI
Note: LegacyFileSystemApi is NOT deleted here; it is still implemented by PersistentFileSystem and LocalDfsFileSystem, which are Phase G deletion scope.
### Release note
None
### Check List (For Author)
- Test: No need to test (deleting unreachable dead code; build verifies no callers)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…em, AzureFileSystem and StorageTypeMapper
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
- StorageTypeMapper was only called from FileSystemFactory.get() (deprecated), which had zero production callers, only dead @Disabled/@Ignore tests
- BrokerFileSystem, S3FileSystem, AzureFileSystem (legacy fe-core versions) were only instantiated through StorageTypeMapper; all production code already routes through FileSystemFactory.getFileSystem() → SPI providers
- HiveUtil.isSplittable() had a dead instanceof BrokerFileSystem branch: the fs parameter is always an SPI FileSystem from FileSystemCache, never the legacy BrokerFileSystem, so the broker-specific Thrift RPC path was unreachable
Changes:
- Delete StorageTypeMapper.java (legacy enum factory, no callers)
- Delete BrokerFileSystem.java, S3FileSystem.java, AzureFileSystem.java (legacy)
- Delete FileSystemFactory.get(StorageProperties) and get(FileSystemType, Map) deprecated methods (no production callers; StorageTypeMapper is gone)
- HiveUtil.isSplittable(): remove dead instanceof BrokerFileSystem branch
- IcebergHadoopCatalogTest: instantiate DFSFileSystem directly (was using deleted FileSystemFactory.get(); test is @Ignore)
- PaimonDlfRestCatalogTest: remove readByDorisS3FileSystem() helper (used deleted S3FileSystem; test is @Disabled)
- Delete BrokerStorageTest, S3FileSystemTest (tested deleted legacy classes)
Note: GsonUtils already handles missing legacy classes via reflection + try/catch ClassNotFoundException; the BrokerFileSystem/S3FileSystem/AzureFileSystem entries in the RuntimeTypeAdapterFactory will gracefully skip at startup.
### Release note
None
### Check List (For Author)
- Test: No need to test (deleting dead code with zero live callers; build verifies no remaining references)
- Behavior changed: No (SPI path already handled all cases; broker RPC branch in isSplittable was unreachable)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
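The note about GsonUtils skipping missing legacy classes relies on a common pattern: resolve class names by string and ignore the ones that no longer exist. A minimal sketch of that pattern follows; it is not the actual GsonUtils code, and `resolveAvailable` and the sample class name `org.example.DeletedLegacyFileSystem` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionalClassLookup {
    /**
     * Resolves each class name by reflection and returns only the classes that
     * are present on the classpath. Missing classes are skipped, not reported.
     */
    static List<Class<?>> resolveAvailable(String... classNames) {
        List<Class<?>> found = new ArrayList<>();
        for (String name : classNames) {
            try {
                found.add(Class.forName(name));
            } catch (ClassNotFoundException e) {
                // The class was deleted or never shipped; skip its registration.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // One class that always exists and one hypothetical deleted class.
        List<Class<?>> classes = resolveAvailable(
                "java.lang.String",
                "org.example.DeletedLegacyFileSystem");
        System.out.println(classes);
    }
}
```

Because the names are plain strings, the caller has no compile-time dependency on the listed classes, which is what allows them to be deleted without touching the lookup code.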
…d legacy ObjFileSystem
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: As part of the P4.8 legacy class deletion series, removes dead
code that has no production callers:
- OSSHdfsFileSystem, JFSFileSystem, OFSFileSystem: subclasses of DFSFileSystem
with zero production instantiation (StorageTypeMapper which mapped them was
deleted in P4.8-F)
- org.apache.doris.fs.remote.ObjFileSystem: legacy abstract class whose only
subclasses (S3FileSystem, AzureFileSystem) were deleted in P4.8-F
Also removes their entries from GsonUtils reflection array (ClassNotFoundException
was already handled gracefully, but the entries serve no purpose).
Fixes StageUtilTest to import org.apache.doris.filesystem.spi.ObjFileSystem
(the SPI interface used by production StageUtil) instead of the now-deleted
legacy org.apache.doris.fs.remote.ObjFileSystem.
### Release note
None
### Check List (For Author)
- Test: Regression test / Unit Test / Manual test / No need to test (with reason)
- FE build passes
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rence infrastructure
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Delete DFSFileSystem (the legacy HDFS wrapper class) and its supporting classes now that all callers have been migrated:
- DFSFileSystem.java: deleted after migrating ExternalCatalog and HMSExternalCatalog away from its static members (PROP_ALLOW_FALLBACK_TO_SIMPLE_AUTH constant and getHdfsConf() method). These are now inlined directly in the callers using HdfsConfiguration and the literal string 'ipc.client.fallback-to-simple-auth-allowed'.
- DFSFileSystemPhantomReference.java: helper class for phantom reference tracking, only used within the dfs package
- RemoteFSPhantomManager.java: background cleanup thread for Hadoop FileSystem objects, only called from DFSFileSystem.nativeFileSystem()
- IcebergHadoopCatalogTest.java: @Ignore test with no assertions, purely manual exploration code using DFSFileSystem.nativeFileSystem()
Also removes DFSFileSystem from the GsonUtils reflection array.
### Release note
None
### Check List (For Author)
- Test: No need to test (deleted classes have no production callers; ExternalCatalog and HMSExternalCatalog behavior is unchanged, with the same Hadoop config semantics)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Completes the P4.8 legacy class deletion series by removing the entire legacy filesystem class hierarchy now that DFSFileSystem is gone:
G.4 - RemoteFileSystem: abstract class extending PersistentFileSystem; no remaining concrete subclasses. Deletes RemoteFileSystemTest (only tested this abstract class).
G.5 - PersistentFileSystem: base abstract class of the legacy hierarchy; callers (Repository.legacyFileSystem) migrated first.
LegacyFileSystemApi: interface implemented only by PersistentFileSystem and LocalDfsFileSystem.
LocalDfsFileSystem: only used in HiveAcidTest to create test fixtures; replaced with java.nio.file.Files helper method.
GsonUtils: removes buildLegacyFileSystemAdapterFactory() and its RuntimeTypeAdapterFactory registration.
G.6 - GlobListResult: was used only internally within S3ObjStorage (and previously by LegacyFileSystemApi default method); moved to a private static inner class of S3ObjStorage.
H.1 - Repository.legacyFileSystem: removes the @Deprecated backward-compat field and the deserialization branch that used it. Old serialized metadata with a 'rfs' field will now silently skip the legacy path (fileSystemDescriptor will be null, and the method returns early). New metadata uses 'fs_descriptor' exclusively.
H.2 - FileSystemDescriptor.fromPersistentFileSystem(): removes the migration helper method (no longer called by anyone after H.1). Also cleans up the javadoc that referenced deleted classes.
### Release note
None
### Check List (For Author)
- Test: No need to test (all deleted classes are dead code with zero production callers; HiveAcidTest behavior unchanged, with the same test fixture files created via NIO)
- Behavior changed: No (Repository deserialization: clusters running this code have already written 'fs_descriptor' format; the 'rfs' legacy field path was only needed for migration from very old metadata)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
1. The fallback dependency:copy-dependencies for fe-core ran without
-DoutputDirectory, so Maven used the default target/dependency/ instead
of target/lib/. This caused 'cp: cannot stat target/lib/*' when a
Maven Build Cache hit skipped the copy-dependencies lifecycle phase.
Fix: add -DoutputDirectory and -DincludeScope=runtime to match the
pom.xml execution configuration.
2. All fe-filesystem impl modules lacked <finalName>, producing
fe-filesystem-s3-1.2-SNAPSHOT.jar while build.sh expected
doris-fe-filesystem-s3*.jar. Fix: add <finalName>doris-fe-filesystem-{name}</finalName>
to each module's pom.xml, consistent with fe-filesystem-spi.
3. The filesystem plugin deployment loop used bare mvn instead of
${MVN_CMD}. Fix: use ${MVN_CMD} for consistency.
4. The main JAR glob pattern in the deployment loop had a spurious '-'
before '*' (doris-fe-filesystem-s3-*.jar) that couldn't match the
versionless finalName output. Fix: remove the '-'.
### Release note
None
### Check List (For Author)
- Test: Manual test - ran sh build.sh --fe, verified all 8 filesystem
plugins (s3/azure/oss/cos/obs/hdfs/local/broker) deploy correctly to
output/fe/plugins/filesystem/ with main JAR + transitive deps
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
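The glob mismatch in item 4 can be checked in isolation. The sketch below uses Java's `PathMatcher` glob syntax as a stand-in for the shell glob in build.sh (an assumption; for these simple `*` patterns the two behave the same way), and the helper name `matches` is hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

final class JarGlobCheck {
    /** Returns true if the single file name matches the given glob pattern. */
    static boolean matches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String versionless = "doris-fe-filesystem-s3.jar";

        // Old pattern: the literal '-' before '*' requires a version-style suffix.
        System.out.println(matches("doris-fe-filesystem-s3-*.jar", versionless));

        // Fixed pattern: '*' may match the empty string, so the versionless name matches.
        System.out.println(matches("doris-fe-filesystem-s3*.jar", versionless));
    }
}
```

The fixed pattern still matches a versioned name such as fe-filesystem-s3-1.2-SNAPSHOT.jar prefixed with doris-, so it is the more permissive of the two.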
Deploy all JARs (main + transitive deps) flat into
output/fe/plugins/filesystem/{module}/ instead of using a lib/
subdirectory, matching the expected plugin loading convention.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ParsedPath
### What problem does this PR solve?
Problem Summary: Remove completely dead code from fe-core fs/ package.
- fs/operations/ (6 files): Broker/HDFS file operations with zero external callers.
- fs/spi/FileSystemSpiProvider.java: Superseded by fe-filesystem-spi module, zero callers.
- fs/io/ParsedPath.java: @Deprecated with zero callers; also remove the @Deprecated path() default methods from DorisInputFile and DorisOutputFile that referenced it.
### Release note
None
### Check List (For Author)
- Test: FE unit tests (fs.FileSystemTransferUtilTest, fs.MemoryFileSystemTest, fs.TransactionScopeCachingDirectoryListerTest, fs.SchemaTypeMapperTest)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…s interface layer
### What problem does this PR solve?
Problem Summary: Continue fe-core fs/ cleanup (Phase 5.2).
- Add isFile()/name() convenience methods to spi.FileEntry
- Add deleteFiles() default method to spi.FileSystem
- Rewrite MemoryFileSystem to implement filesystem.spi.FileSystem instead of the
old fs.FileSystem; override listFilesRecursive() and listDirectories() to handle
the implicit directory model correctly
- Update MemoryFileSystemTest to use SPI types and renamed API methods
- Delete 6 dead files: fs/FileSystem.java, fs/FileIterator.java,
fs/io/DorisInputFile.java, fs/io/DorisInput.java, fs/io/DorisInputStream.java,
fs/io/DorisOutputFile.java
### Release note
None
### Check List (For Author)
- Test: FE unit tests (MemoryFileSystemTest, FileSystemTransferUtilTest,
TransactionScopeCachingDirectoryListerTest, SchemaTypeMapperTest) — all pass
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
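The "implicit directory model" mentioned for MemoryFileSystem means that only file paths are stored and directories are inferred from path prefixes. A minimal sketch of how a listDirectories() override could derive them is shown below; it is an illustration under that assumption, not the actual MemoryFileSystem code, and the class name and sample paths are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class ImplicitDirListing {
    /**
     * Returns the immediate child directories of dir. Only file keys are stored,
     * so a directory exists exactly when some key has it as a path prefix.
     */
    static SortedSet<String> listDirectories(Collection<String> keys, String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        SortedSet<String> dirs = new TreeSet<>();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            String rest = key.substring(prefix.length());
            int slash = rest.indexOf('/');
            // A '/' after a non-empty first segment means that segment is a directory.
            if (slash > 0) {
                dirs.add(prefix + rest.substring(0, slash));
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "/a/b/1.txt", "/a/c/d/2.txt", "/a/3.txt", "/x/4.txt");
        System.out.println(listDirectories(keys, "/a"));
    }
}
```

With the sample keys, "/a/b" and "/a/c" are reported even though neither was ever created explicitly, while "/a/3.txt" contributes nothing because it is a direct file child.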
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Phase 5.3 of the fs-core cleanup. Removes the legacy
fe-core object-storage layer (fs/obj/, fs/remote/RemoteFile, fs/FileEntry,
fs/Location) and migrates its three remaining production callers to the
modern SPI interfaces already used by the rest of the codebase.
### Release note
None
### Check List (For Author)
- Test: Regression test / Unit Test / Manual test / No need to test
- ./build.sh --fe BUILD SUCCESS
- MemoryFileSystemTest 27 tests passed
- Zero remaining imports from fs.obj.*, fs.remote.*, fs.FileEntry, fs.Location
- Behavior changed: No (ping connectivity test logic preserved; multipartUpload
now uses 3-step SPI API instead of single-method wrapper)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dule
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: FileSystemTransferUtil had zero fe-core dependencies (used only SPI types), but lived in org.apache.doris.fs inside fe-core. Moving it to fe-filesystem-spi keeps all pure-SPI utilities in the SPI module and reduces fe-core's responsibility.
Changes:
- Move FileSystemTransferUtil.java from org.apache.doris.fs (fe-core) to org.apache.doris.filesystem.spi (fe-filesystem-spi)
- Remove redundant same-package imports; make globToRegex() public so callers outside the package (e.g. test) can access it
- Update AcidUtil.java and FileSystemDirectoryLister.java import paths
- Add explicit import in FileSystemTransferUtilTest.java (same-package access no longer applies)
### Release note
None
### Check List (For Author)
- Test: Unit test (FileSystemTransferUtilTest: all 20 methods pass)
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e-filesystem-spi
### What problem does this PR solve?
Issue Number: N/A
Problem Summary: Several classes in org.apache.doris.fs had zero fe-core dependencies and belonged logically in fe-filesystem-spi. This commit moves them to keep the SPI module self-contained and reduce fs/ package scope.
Changes:
- Move RemoteIterator, FileSystemIOException, SimpleRemoteIterator to org.apache.doris.filesystem.spi; drop unused ErrCode field from FileSystemIOException (no caller ever set it)
- Move FileSystemType to org.apache.doris.filesystem.spi
- Move FileSystemUtil to org.apache.doris.filesystem.spi; replace org.apache.hadoop.fs.Path (path concat only) with plain string concatenation to eliminate Hadoop dependency from SPI module
- Move MemoryFileSystem from fe-core src/main to src/test (test-only class should not ship in production JAR)
- Update all callers: DirectoryLister, FileSystemDirectoryLister, SchemaTypeMapper, TransactionScopeCachingDirectoryLister, HiveExternalMetaCache, HMSTransaction, LocationPath and their tests
- Remove dead ErrCode.NOT_FOUND check in HiveExternalMetaCache (getErrorCode() always returned empty; else-branch always executed)
### Release note
None
### Check List (For Author)
- Test: Unit tests (MemoryFileSystemTest, SchemaTypeMapperTest: all pass)
- Behavior changed: No (ErrCode branch was dead code; FileSystemUtil path concatenation is semantically identical to Hadoop Path)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
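Replacing a Hadoop Path used only for concatenation with plain strings comes down to joining a parent and a child with exactly one separator. The sketch below shows one way to do that; it is not the actual FileSystemUtil code, the `join` helper is hypothetical, and it deliberately ignores edge cases such as a parent that is only a scheme (for example "s3://").

```java
final class PathJoinSketch {
    /**
     * Joins parent and child with a single '/', dropping trailing slashes on
     * the parent and leading slashes on the child. No other normalization.
     */
    static String join(String parent, String child) {
        int end = parent.length();
        while (end > 0 && parent.charAt(end - 1) == '/') {
            end--;
        }
        int start = 0;
        while (start < child.length() && child.charAt(start) == '/') {
            start++;
        }
        return parent.substring(0, end) + "/" + child.substring(start);
    }

    public static void main(String[] args) {
        // Sample locations are illustrative only.
        System.out.println(join("s3://bucket/warehouse/", "/tbl/part=1"));
        System.out.println(join("hdfs://ns/user/hive", "tbl"));
    }
}
```

A helper like this needs nothing beyond java.lang, which is the property that lets a utility live in a module without a Hadoop dependency.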