fix(io): return correct list_status paths and reuse storage operators#101
JingsongLi merged 3 commits into apache:main
@Aitozi @JingsongLi PTAL 👀
cc @luoyuxia to take a review
# Conflicts:
#	crates/paimon/src/io/file_io.rs
luoyuxia left a comment:
@QuakeWang Thanks for the PR. Left minor comments. PTAL
  size: meta.content_length(),
  is_dir: meta.is_dir(),
- path: entry.path().to_string(),
+ path: format!("{base_path}{}", entry.path()),
IIUC, format!("{base_path}{}", entry.path()) just adds a leading / to entry.path()?
Considering OpenDAL will always call normalize_path to remove the leading /, do we really need to add a leading / to entry.path()? Could you just keep entry.path()?
@luoyuxia Thanks for the comment. entry.path() in OpenDAL is relative to operator root, while FileStatus.path here is expected to preserve the original FileIO path prefix (e.g., memory:/ or
file:/) for consistency with get_status and follow-up FileIO operations. normalize_path applies to Operator input paths, not to Entry::path() output.
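To make the distinction above concrete, here is a minimal sketch of the path join, not the code from this PR. It assumes base_path is the caller-facing prefix (for example memory:/) and that the entry path arrives relative to the operator root. The FileStatus struct and the helpers full_entry_path and to_file_status are hypothetical stand-ins written for illustration, and no OpenDAL types are used.

```rust
/// Hypothetical stand-in for the three `FileStatus` fields shown in the diff.
#[derive(Debug, Clone, PartialEq)]
pub struct FileStatus {
    pub size: u64,
    pub is_dir: bool,
    pub path: String,
}

/// Hypothetical helper: prepend the caller-facing prefix (assumed to be
/// something like "memory:/" or "file:/") to an entry path that is
/// relative to the operator root.
pub fn full_entry_path(base_path: &str, entry_path: &str) -> String {
    format!("{base_path}{entry_path}")
}

/// Build a status whose `path` keeps the original prefix, so it can be fed
/// back into later calls that expect the same prefixed form.
pub fn to_file_status(base_path: &str, entry_path: &str, size: u64, is_dir: bool) -> FileStatus {
    FileStatus {
        size,
        is_dir,
        path: full_entry_path(base_path, entry_path),
    }
}
```

Under these assumptions, to_file_status("memory:/", "dir/a.txt", 3, false) carries the path memory:/dir/a.txt, whereas keeping the relative entry path alone would drop the prefix.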
Thanks. It sounds reasonable to me to return the full path with the scheme prefix.
But that doesn't seem to require reusing the storage operator right now. Maybe you could create a separate PR for the storage operator reuse, where we can revisit it further.
@luoyuxia Thanks for the suggestions. I narrowed this PR to the list_status path fix only and reverted the storage-operator reuse changes. I also simplified the new test by removing non-essential assertions and reducing duplicated write code. The FS path regression test is kept.
Purpose

Fix FileIO::list_status path semantics: each returned FileStatus.path now points to the actual listed entry instead of the input directory path.
Also, keep reusable operators in Storage to preserve memory backend state across calls.

Brief change log

- list_status: return real entry paths (base_path + entry.path()), and fetch required metadata via list_with(...).metakey(...).
- Storage: reuse Operator instances across calls (including memory backend).
- Tests for list_status on both fs and memory backends.

Tests

cargo fmt --all -- --check
cargo clippy --all-targets --workspace -- -D warnings
cargo test -p paimon io::file_io::file_action_test
cargo test --all-targets --workspace

All tests passed locally.

API and Format

list_status now returns correct entry paths.

Documentation

No documentation update required.
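
The Purpose section mentions keeping operators in Storage so that memory backend state survives across calls. The sketch below illustrates that idea using only the standard library. MemoryBackend, FreshPerCall, and Reused are hypothetical types invented for this illustration; they are not OpenDAL's Operator and not the code from this PR.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory backend handle. Clones share the same map.
#[derive(Clone, Default)]
pub struct MemoryBackend {
    files: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl MemoryBackend {
    pub fn write(&self, path: &str, data: &[u8]) {
        self.files
            .lock()
            .unwrap()
            .insert(path.to_string(), data.to_vec());
    }

    pub fn exists(&self, path: &str) -> bool {
        self.files.lock().unwrap().contains_key(path)
    }
}

/// Builds a brand-new backend on every call, so earlier writes are lost.
pub struct FreshPerCall;

impl FreshPerCall {
    pub fn backend(&self) -> MemoryBackend {
        MemoryBackend::default()
    }
}

/// Builds the backend once and hands out clones that share its state.
pub struct Reused {
    backend: MemoryBackend,
}

impl Reused {
    pub fn new() -> Self {
        Reused {
            backend: MemoryBackend::default(),
        }
    }

    pub fn backend(&self) -> MemoryBackend {
        self.backend.clone()
    }
}
```

With FreshPerCall, a write through one handle is not visible through the next handle, because each call starts from an empty map. With Reused, every handle points at the same map, so a later existence check sees the earlier write.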