ObjectStore.list_dir corrupts directory names due to lstrip vs removeprefix #3753
Closed
Description
_transform_list_dir in zarr/storage/_obstore.py uses str.lstrip(prefix) on line 263 to strip path prefixes from common_prefixes. But lstrip treats its argument as a set of characters and strips every leading character that belongs to that set; it does not remove a prefix string. This corrupts directory names whenever a child directory name begins with characters that also appear in the prefix.
Line 264 already correctly uses removeprefix for objects; the fix is to do the same for prefixes on line 263.
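The difference can be seen in isolation with a minimal sketch. The prefix and path values below are assumptions chosen to mirror the reproducer, not values copied from the zarr source, and the trailing slash on the prefix is also an assumption:

```python
# Hypothetical values mirroring the reproducer (requires Python >= 3.9 for removeprefix).
prefix = "subdir/data.zarr/"
path = "subdir/data.zarr/temp"

# lstrip treats `prefix` as a set of characters: {s, u, b, d, i, r, /, a, t, ., z}.
# It keeps stripping while the leading character is in that set, so the
# leading "t" of "temp" is removed as well.
stripped = path.lstrip(prefix)

# removeprefix removes the exact leading string once, and nothing more.
removed = path.removeprefix(prefix)

print(stripped)  # emp
print(removed)   # temp
```

Any child name that starts with a character outside the prefix's character set survives lstrip unchanged, which is probably why the bug only shows up for some directory names.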
Reproducer:
# /// script
# requires-python = ">=3.12"
# dependencies = ["zarr>=3", "obstore", "xarray", "numpy"]
# ///
import asyncio, tempfile
import numpy as np, xarray as xr
from obstore.store import LocalStore
from zarr.storage import ObjectStore
tmpdir = tempfile.mkdtemp()
parent_dir = f"{tmpdir}/bucket_root"
filepath = f"{parent_dir}/subdir/data.zarr"
xr.Dataset({"temp": (("x", "y"), np.arange(12, dtype="float32").reshape(3, 4))}).to_zarr(
    filepath, consolidated=False, zarr_format=2
)
store = LocalStore(prefix=parent_dir)
zarr_store = ObjectStore(store=store)
async def main():
    results = [item async for item in zarr_store.list_dir("subdir/data.zarr")]
    print(results)  # ['emp', '.zattrs', '.zgroup']: 'temp' corrupted to 'emp'
asyncio.run(main())
Fix:
- prefixes = [obj.lstrip(prefix).lstrip("/") for obj in list_result["common_prefixes"]]
+ prefixes = [obj.removeprefix(prefix).lstrip("/") for obj in list_result["common_prefixes"]]