Skip to content

RF: Derive LocalZarrEntry.filepath from zarr_basepath and parts#1843

Merged
yarikoptic merged 1 commit intomasterfrom
refactor-localzarrentry-filepath-property
Apr 29, 2026
Merged

RF: Derive LocalZarrEntry.filepath from zarr_basepath and parts#1843
yarikoptic merged 1 commit intomasterfrom
refactor-localzarrentry-filepath-property

Conversation

@candleindark
Copy link
Copy Markdown
Member

Summary

LocalZarrEntry previously stored both filepath and (zarr_basepath, parts) — two representations of the entry's position within the zarr, kept in sync only by convention inside _get_subpath and parent. This PR makes filepath a @property derived as zarr_basepath.joinpath(*parts), so the two states can no longer disagree. Internally the only direct construction site, ZarrAsset.filetree, drops its now-redundant filepath= argument.

Compatibility

LocalZarrEntry is exported in dandi.files, so external code could in principle depend on filepath being a dataclass field. A GitHub code search suggests this is unlikely in practice:

  • gh search code 'LocalZarrEntry' returns only one match outside this repo (tushkum34-cloud/pylibsmeta), and it is an auto-generated JSON snapshot of public-symbol metadata — not source code that uses the class.
  • gh search code 'LocalZarrEntry(' finds zero direct constructor calls outside this repo.
  • Attribute reads (entry.filepath) remain source-compatible with a property; only callers that pass filepath= to the constructor or use dataclasses.replace(entry, filepath=...) would be affected, and none were found.

Private/unindexed code cannot be ruled out, but the realistic exposure surface is very small.

Verification

  • tox -e lint,typing — pass.
  • hatch run test:run dandi/tests/test_files.py -k "zarr or Zarr" — 14 passed.

Release Notes

Refactor LocalZarrEntry so filepath is derived from zarr_basepath and parts instead of being stored separately, eliminating the possibility of internal inconsistency between the two representations.

Previously LocalZarrEntry stored both filepath and (zarr_basepath, parts),
two representations of the same fact kept in sync only by convention
inside _get_subpath and parent.  Convert filepath to a @Property derived
as zarr_basepath.joinpath(*parts), removing the possibility of internal
inconsistency.  zarr_basepath is now the only on-disk state field; parts
(inherited from BasePath) is the single source of truth for the entry's
position within the zarr.

The only direct construction site, ZarrAsset.filetree, drops the now
unneeded filepath= argument.
@candleindark candleindark requested a review from yarikoptic April 28, 2026 06:16
@candleindark candleindark added zarr Python API code quality Nonfunctional improvements to the codebase labels Apr 28, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.26%. Comparing base (4a14a1f) to head (af831f5).
⚠️ Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
dandi/files/zarr.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1843      +/-   ##
==========================================
- Coverage   76.30%   76.26%   -0.05%     
==========================================
  Files          87       87              
  Lines       12484    12486       +2     
==========================================
- Hits         9526     9522       -4     
- Misses       2958     2964       +6     
Flag Coverage Δ
unittests 76.26% <66.66%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@candleindark candleindark added the patch Increment the patch version when merged label Apr 28, 2026
@yarikoptic
Copy link
Copy Markdown
Member

sounds good to me, but I am still a little cautious -- do history digging on how having both came about -- I wonder if there could be a case not covered by tests which require them to diverge somehow?

Copy link
Copy Markdown
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please provide historical context on how two came about

@candleindark
Copy link
Copy Markdown
Member Author

@yarikoptic — historical context as requested:

  • LocalZarrEntry was introduced in 5e561e9 as part of Support Zarr directories #853 ("Support Zarr directories", merged 2022-01-25). It has had both filepath (a stored field) and the inherited parts (from BasePath) since that very first commit — there was never a version where only one existed.
  • In that same initial commit, the helpers that derive new entries (_get_subpath, parent) already updated filepath and parts in lockstep via replace(self, filepath=..., parts=...). git log -G over the lifetime of dandi/files.py and dandi/files/zarr.py confirms no commit has ever set filepath independently of parts (or vice versa) — af831f5 (this PR) is the only other commit that touches that construction surface.
  • No test, no other code path, and no API consumer has ever required them to diverge. The redundancy appears to be an artifact of the original design — mirroring the filepath attribute that other LocalAsset subclasses expose as a stored field, and avoiding a small re-joinpath on each access. It was never load-bearing.

So the invariant filepath == zarr_basepath / "/".join(parts) has held since the class was introduced; this PR just makes it structural rather than relying on every helper to maintain it by hand.

@yarikoptic yarikoptic merged commit 96bf2df into master Apr 29, 2026
62 of 69 checks passed
@yarikoptic yarikoptic deleted the refactor-localzarrentry-filepath-property branch April 29, 2026 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

code quality Nonfunctional improvements to the codebase patch Increment the patch version when merged Python API zarr

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants