This issue was written by Claude. It was discovered while working on #3961.
Bug: delete_dir in stateful test uses raw startswith for bookkeeping, causing flaky KeyError in delete_group_using_del
The Hypothesis state machine in src/zarr/testing/stateful.py tracks created nodes in two sets, self.all_arrays and self.all_groups. When delete_dir(path) runs, it prunes those sets using a raw string-prefix match:
# src/zarr/testing/stateful.py:307-312
matches = set()
for node in self.all_groups | self.all_arrays:
    if node.startswith(path):
        matches.add(node)
self.all_groups = self.all_groups - matches
self.all_arrays = self.all_arrays - matches
node.startswith(path) matches any node whose path string begins with path, not just nodes that are descendants of the directory path. So delete_dir('6/f') matches a sibling node at 6/faNT7p7jvJsO3_C._HYi and incorrectly removes it from all_arrays.
The real store-level delete_dir('6/f') only removes objects under 6/f/, so 6/faNT... survives in the store. The bookkeeping and the model now disagree. When delete_group_using_del later walks members(...) of an ancestor group and tries self.all_arrays.remove(obj.path), the entry has already been pruned by the broken delete_dir, and the call raises KeyError.
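The drift described above can be reproduced with plain sets. This is a minimal sketch, not zarr code: `store_arrays` stands in for the store's contents, `all_arrays` for the state machine's bookkeeping, and `6/f/child` is a hypothetical descendant added for illustration.

```python
# Stand-ins for the store's contents and the test's bookkeeping (assumptions,
# not zarr objects). "6/f/child" is a hypothetical descendant of "6/f".
all_arrays = {"6/f/child", "6/faNT7p7jvJsO3_C._HYi"}
store_arrays = set(all_arrays)

path = "6/f"

# Store-level delete: only objects under "6/f/" are removed.
store_arrays = {n for n in store_arrays if not n.startswith(path + "/")}

# Bookkeeping prune as currently written: raw string prefix, which also
# drops the sibling "6/faNT7p7jvJsO3_C._HYi".
all_arrays = {n for n in all_arrays if not n.startswith(path)}

# Later, deleting group "6" walks the members that still exist in the store
# and removes each one from the bookkeeping.
missing = []
for member in sorted(store_arrays):
    try:
        all_arrays.remove(member)
    except KeyError:
        missing.append(member)

print(missing)
```

The sibling survives in the simulated store but is already gone from the bookkeeping, so `remove` raises for it, matching the second traceback in the CI run.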
Reproduction
Slow Hypothesis CI run https://github.com/zarr-developers/zarr-python/actions/runs/25939320276 found this in two distinct falsifying examples in the same job:
  File "src/zarr/testing/stateful.py", line 372, in delete_group_using_del
    self.all_arrays.remove(obj.path)
KeyError: '6/j3pnC'

  File "src/zarr/testing/stateful.py", line 372, in delete_group_using_del
    self.all_arrays.remove(obj.path)
KeyError: '6/faNT7p7jvJsO3_C._HYi'
The shrunk trace shows the pattern clearly: an array is created at 6/faNT7p7jvJsO3_C._HYi, delete_dir('6/f') is invoked, and the next delete_group_using_del targeting '6' blows up because the bookkeeping for 6/faNT... is gone but the store still has it.
The bug is non-deterministic in CI because .github/workflows/hypothesis.yaml does not pin a Hypothesis seed. Most runs pass; the failure only surfaces when node_names generates a name that is a string prefix of a sibling's name and the action ordering exposes the bookkeeping drift.
Root cause
delete_dir strips entries by string prefix instead of by path-segment prefix. The check needs to require that any match is either equal to path or has path followed by the / segment separator.
Suggested fix
Replace the delete_dir cleanup loop (the matches = set() initialization and the for loop that fills it) with a segment-aware check:
matches = {
    node for node in self.all_groups | self.all_arrays
    if node == path or node.startswith(path + "/")
}
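As a standalone sanity check of the segment-aware predicate, the sketch below applies it to both bookkeeping sets. The helper name `prune_deleted` and the sample paths other than `6/faNT7p7jvJsO3_C._HYi` are hypothetical and do not exist in zarr; only the predicate itself comes from the suggested fix.

```python
def prune_deleted(nodes: set[str], path: str) -> set[str]:
    """Return nodes without `path` itself and without anything under `path/`.

    Hypothetical helper for illustration; not part of zarr.
    """
    return {n for n in nodes if not (n == path or n.startswith(path + "/"))}


# Sample bookkeeping: "6/f" is a group with descendants, and
# "6/faNT7p7jvJsO3_C._HYi" is a sibling array sharing the string prefix.
all_groups = {"6", "6/f", "6/f/sub"}
all_arrays = {"6/f/sub/a", "6/faNT7p7jvJsO3_C._HYi"}

all_groups = prune_deleted(all_groups, "6/f")
all_arrays = prune_deleted(all_arrays, "6/f")

print(sorted(all_groups))
print(sorted(all_arrays))
```

With this predicate the deleted directory and its descendants are dropped, while the prefix-colliding sibling stays in `all_arrays`, so a later `remove` on it would succeed.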
Origin
Introduced in #3130 (commit c972f7f) when the additional stateful actions were ported from icechunk. Unrelated to the current zarr-metadata refactor; surfaced there only because Hypothesis randomization happened to find it.