fix: add support for forecast datasets #84
Conversation
Walkthrough

Expanded excluded coordinate keys, made metadata-triggered resize conservative and thread-safe (delegating to resize_store), added retry with exponential backoff to full-shard fetch, made metadata deletion idempotent, added tests for forecast/step coords and idempotent delete, and bumped version to 3.3.1.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ❌ Failed checks (1 warning) | ✅ Passed checks (2 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
py_hamt/sharded_zarr_store.py (2)
299-337: Critical: Event cleanup missing on retry exhaustion.

The retry logic is well-designed with exponential backoff, but there's a critical flaw: if all retries fail, the _pending_shard_loads event is never cleaned up (lines 323-325 only execute on success). This leaves waiting coroutines blocked indefinitely on line 426.

Apply this diff to ensure cleanup on all paths:

```diff
 ) -> None:
     """
     Fetch a shard from CAS and cache it, with retry logic for transient errors.

     Args:
         shard_idx: The index of the shard to fetch.
         shard_cid: The CID of the shard.
         max_retries: Maximum number of retry attempts for transient errors.
         retry_delay: Delay between retry attempts in seconds.
     """
+    try:
-    for attempt in range(max_retries):
-        try:
-            shard_data_bytes = await self.cas.load(shard_cid)
-            decoded_shard = dag_cbor.decode(shard_data_bytes)
-            if not isinstance(decoded_shard, list):
-                raise TypeError(f"Shard {shard_idx} did not decode to a list.")
-            await self._shard_data_cache.put(shard_idx, decoded_shard)
-            # Always set the Event to unblock waiting coroutines
-            if shard_idx in self._pending_shard_loads:
-                self._pending_shard_loads[shard_idx].set()
-                del self._pending_shard_loads[shard_idx]
-            return  # Success
-        except (ConnectionError, TimeoutError) as e:
-            # Handle transient errors (e.g., network issues)
-            if attempt < max_retries - 1:
-                await asyncio.sleep(
-                    retry_delay * (2**attempt)
-                )  # Exponential backoff
-                continue
-            else:
-                raise RuntimeError(
-                    f"Failed to fetch shard {shard_idx} after {max_retries} attempts: {e}"
-                )
+        for attempt in range(max_retries):
+            try:
+                shard_data_bytes = await self.cas.load(shard_cid)
+                decoded_shard = dag_cbor.decode(shard_data_bytes)
+                if not isinstance(decoded_shard, list):
+                    raise TypeError(f"Shard {shard_idx} did not decode to a list.")
+                await self._shard_data_cache.put(shard_idx, decoded_shard)
+                return  # Success
+            except (ConnectionError, TimeoutError) as e:
+                # Handle transient errors (e.g., network issues)
+                if attempt < max_retries - 1:
+                    await asyncio.sleep(
+                        retry_delay * (2**attempt)
+                    )  # Exponential backoff
+                    continue
+                else:
+                    raise RuntimeError(
+                        f"Failed to fetch shard {shard_idx} after {max_retries} attempts: {e}"
+                    )
+    finally:
+        # Always clean up pending load, whether success or failure
+        if shard_idx in self._pending_shard_loads:
+            self._pending_shard_loads[shard_idx].set()
+            del self._pending_shard_loads[shard_idx]
```
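Beyond the diff itself, the underlying pattern is general: run the retry loop inside try and put the event cleanup in finally, so it executes on success and on exhausted retries alike. A minimal, self-contained asyncio sketch of that pattern (hypothetical fetch/pending names, not the store's actual API):

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_with_backoff(
    fetch: Callable[[int], Awaitable[bytes]],  # hypothetical async loader
    pending: dict[int, asyncio.Event],
    key: int,
    max_retries: int = 3,
    retry_delay: float = 0.5,
) -> bytes:
    """Retry a transient-failure-prone fetch, always releasing waiters."""
    try:
        for attempt in range(max_retries):
            try:
                return await fetch(key)
            except (ConnectionError, TimeoutError):
                if attempt < max_retries - 1:
                    # Exponential backoff: 0.5s, 1s, 2s, ...
                    await asyncio.sleep(retry_delay * (2**attempt))
                else:
                    raise
        raise RuntimeError("unreachable when max_retries >= 1")
    finally:
        # Runs on success *and* terminal failure, so coroutines waiting
        # on the event are never left blocked indefinitely.
        event = pending.pop(key, None)
        if event is not None:
            event.set()
```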
1-1: Run ruff format to fix formatting issues.

The pipeline reports that ruff-format reformatted 2 files. Run pre-commit run --all-files locally to apply formatting changes and re-commit.

```bash
#!/bin/bash
# Apply ruff formatting to fix the pipeline failure
ruff format py_hamt/sharded_zarr_store.py tests/test_sharded_zarr_store.py
```

Based on pipeline failure logs.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
py_hamt/sharded_zarr_store.py (3 hunks)
tests/test_sharded_zarr_store.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
{py_hamt,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
{py_hamt,tests}/**/*.py: Use type hints throughout the codebase (mypy enforced)
Python code must pass ruff linting (ruff check)
Files:
tests/test_sharded_zarr_store.py
py_hamt/sharded_zarr_store.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use Hypothesis property-based testing in the test suite
Gate IPFS-dependent tests behind the --ipfs pytest flag so they can be skipped when IPFS isn’t available
Place all tests within the /tests directory
Files:
tests/test_sharded_zarr_store.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Ensure all Python functions are fully typed and mypy-compliant
Prefer simple, functional, easy-to-understand functions
Use meaningful variable and function names
Add comments for complex logic to aid readability and maintenance
Files:
tests/test_sharded_zarr_store.py
py_hamt/sharded_zarr_store.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/zarr_hamt_store.py : ZarrHAMTStore must implement the zarr.abc.store.Store interface
📚 Learning: 2025-09-15T21:16:14.768Z
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/{store,hamt,zarr_hamt_store,encryption_hamt_store}.py : Make all storage-related operations async to accommodate IPFS/network calls
Applied to files:
tests/test_sharded_zarr_store.py
📚 Learning: 2025-09-15T21:16:14.768Z
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/zarr_hamt_store.py : ZarrHAMTStore must implement the zarr.abc.store.Store interface
Applied to files:
py_hamt/sharded_zarr_store.py
🪛 GitHub Actions: Triggered on push from 0xSwego to branch/tag forecast-ds-support
tests/test_sharded_zarr_store.py
[error] 1-1: pre-commit: ruff-format reformatted 2 files. The hook reported a formatting change and the commit/tests may fail until changes are staged. Run 'pre-commit run --all-files' locally to apply and re-commit.
py_hamt/sharded_zarr_store.py
[error] 1-1: pre-commit: ruff-format reformatted 2 files. The hook reported a formatting change and the commit/tests may fail until changes are staged. Run 'pre-commit run --all-files' locally to apply and re-commit.
🪛 Ruff (0.14.5)
tests/test_sharded_zarr_store.py
138-138: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🔇 Additional comments (4)
tests/test_sharded_zarr_store.py (2)
2-2: LGTM!

The math import is used appropriately at line 138 for the math.ceil calculation.
80-151: Good test coverage for forecast coordinate handling.

The test properly exercises the new forecast coordinate support:
- Creates a dataset with forecast_reference_time and step coordinates
- Verifies that the primary array geometry remains unchanged after writing (lines 136-139)
- Confirms round-trip data integrity
- Validates that coordinate arrays are stored and retrievable
This addresses the PR objective of adding support for forecast datasets.
py_hamt/sharded_zarr_store.py (2)
343-351: Appropriate expansion of coordinate exclusion list.

Adding forecast_reference_time and step to the excluded array prefixes correctly treats them as coordinate/metadata arrays rather than primary chunk data, enabling proper handling of forecast datasets.
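As a rough illustration of the exclusion mechanism (the constant, helper name, and key layout below are assumptions drawn from this review, not py-hamt's verbatim internals):

```python
# Sketch of prefix-based coordinate exclusion, assuming Zarr v3-style
# chunk keys like "precip/c/0/1/2" where the first path segment names
# the array.
EXCLUDED_ARRAY_PREFIXES: frozenset[str] = frozenset(
    {"time", "latitude", "longitude", "forecast_reference_time", "step"}
)


def is_coordinate_chunk(key: str) -> bool:
    """Return True if the key belongs to a coordinate/metadata array."""
    array_name = key.split("/", 1)[0]
    return array_name in EXCLUDED_ARRAY_PREFIXES


assert is_coordinate_chunk("step/c/0")
assert not is_coordinate_chunk("precip/c/0/0/0/0")
```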
630-656: Well-designed resize guard with proper TOCTOU protection.

The tightened resize logic correctly:
- Filters out root zarr.json from resize consideration (line 632)
- Validates dimensional compatibility before attempting resize (lines 639-641, 645-648)
- Double-checks conditions after lock acquisition to prevent race conditions
- Ensures the _resize_complete event is always set via try/finally (lines 651-655)

The dual validation (before and after lock acquisition) properly handles concurrent resize attempts.
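The check-lock-recheck structure generalizes beyond this store. A toy asyncio sketch of the same double-checked pattern (illustrative class and names; the real method delegates the actual work to resize_store per the walkthrough):

```python
import asyncio


class ResizableIndex:
    """Toy holder for an array shape guarded by double-checked locking."""

    def __init__(self, shape: tuple[int, ...]) -> None:
        self._array_shape = shape
        self._resize_lock = asyncio.Lock()
        self._resize_complete = asyncio.Event()
        self._resize_complete.set()

    def _needs_resize(self, new_shape: tuple[int, ...]) -> bool:
        # Same rank but different extents: only then resize.
        return (
            len(new_shape) == len(self._array_shape)
            and new_shape != self._array_shape
        )

    async def maybe_resize(self, new_shape: tuple[int, ...]) -> None:
        if not self._needs_resize(new_shape):  # cheap check, no lock held
            return
        async with self._resize_lock:
            # Re-check: another task may have resized while we waited.
            if not self._needs_resize(new_shape):
                return
            self._resize_complete.clear()
            try:
                self._array_shape = new_shape  # stand-in for the real resize
            finally:
                self._resize_complete.set()  # waiters are always released
```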
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/test_sharded_zarr_store.py (1)
135-137: Add strict=True to zip() for safety.

This was flagged in a previous review. The zip() call should use strict=True to catch dimension mismatches between array_shape_tuple and chunk_shape_tuple.

Apply this diff:

```diff
 assert store_write._chunks_per_dim == tuple(
-    math.ceil(a / c) for a, c in zip(array_shape_tuple, chunk_shape_tuple)
+    math.ceil(a / c) for a, c in zip(array_shape_tuple, chunk_shape_tuple, strict=True)
 )
```

Based on static analysis hints from Ruff (B905).
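For background, the behavior strict=True changes is standard zip() semantics from Python 3.10+, independent of this codebase:

```python
# zip() silently truncates to the shortest iterable by default;
# strict=True turns a length mismatch into a loud ValueError.
print(list(zip((10, 20, 30), (5, 5))))  # [(10, 5), (20, 5)] -- third dim dropped
try:
    list(zip((10, 20, 30), (5, 5), strict=True))
except ValueError as exc:
    print(exc)  # zip() argument 2 is shorter than argument 1
```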
🧹 Nitpick comments (1)
py_hamt/sharded_zarr_store.py (1)
630-653: Resize logic improvements look solid.The tightened conditions and dimension validation prevent inappropriate resizes and race conditions. The double-check pattern after lock acquisition is a good defensive practice.
For consistency, consider using != instead of not ==:

```diff
-    if key.endswith("zarr.json") and not key == "zarr.json":
+    if key.endswith("zarr.json") and key != "zarr.json":
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
py_hamt/sharded_zarr_store.py (3 hunks)
tests/test_sharded_zarr_store.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
{py_hamt,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
{py_hamt,tests}/**/*.py: Use type hints throughout the codebase (mypy enforced)
Python code must pass ruff linting (ruff check)
Files:
tests/test_sharded_zarr_store.py
py_hamt/sharded_zarr_store.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use Hypothesis property-based testing in the test suite
Gate IPFS-dependent tests behind the --ipfs pytest flag so they can be skipped when IPFS isn’t available
Place all tests within the /tests directory
Files:
tests/test_sharded_zarr_store.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Ensure all Python functions are fully typed and mypy-compliant
Prefer simple, functional, easy-to-understand functions
Use meaningful variable and function names
Add comments for complex logic to aid readability and maintenance
Files:
tests/test_sharded_zarr_store.py
py_hamt/sharded_zarr_store.py
🧠 Learnings (2)
📚 Learning: 2025-09-15T21:16:14.768Z
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/{store,hamt,zarr_hamt_store,encryption_hamt_store}.py : Make all storage-related operations async to accommodate IPFS/network calls
Applied to files:
tests/test_sharded_zarr_store.py
📚 Learning: 2025-09-15T21:16:14.768Z
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/zarr_hamt_store.py : ZarrHAMTStore must implement the zarr.abc.store.Store interface
Applied to files:
py_hamt/sharded_zarr_store.py
🧬 Code graph analysis (1)
tests/test_sharded_zarr_store.py (2)
py_hamt/sharded_zarr_store.py (2)
flush (523-566), exists (687-698)
py_hamt/zarr_hamt_store.py (2)
read_only (108-111), exists (193-199)
🪛 Ruff (0.14.5)
tests/test_sharded_zarr_store.py
136-136: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
py_hamt/sharded_zarr_store.py
634-634: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Create project environment, run all checks (3.12)
🔇 Additional comments (3)
py_hamt/sharded_zarr_store.py (1)
343-351: LGTM! Forecast coordinate support added correctly.

The addition of forecast_reference_time and step to the exclusion list ensures these coordinate arrays are properly treated as metadata rather than chunked data, which aligns with the PR objective to support forecast datasets.

tests/test_sharded_zarr_store.py (2)
2-2: LGTM! Import necessary for test assertion.

The math import is appropriately used in the new test for calculating expected chunks per dimension.
80-149: Comprehensive test for forecast dataset support.

This test effectively validates the PR objective by:
- Creating a dataset with forecast-specific coordinates (forecast_reference_time and step)
- Verifying data integrity through write/read cycles
- Confirming coordinate arrays are properly stored and retrievable
The test structure is clear and the assertions are appropriate.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main      #84   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            7         7
  Lines         1310      1308      -2
=========================================
- Hits          1310      1308      -2
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/test_sharded_zarr_store.py (1)
80-149: Use zip(..., strict=True) in the chunk-geometry assertion to catch shape mismatches.

The new forecast/step test looks good and nicely validates that writing a 4D forecast dataset doesn’t perturb the primary array geometry. To tighten the assertion and satisfy Ruff’s B905 recommendation, make the zip explicit about dimensional agreement:

```diff
-    assert store_write._array_shape == array_shape_tuple
-    assert store_write._chunks_per_dim == tuple(
-        math.ceil(a / c) for a, c in zip(array_shape_tuple, chunk_shape_tuple)
-    )
+    assert store_write._array_shape == array_shape_tuple
+    assert store_write._chunks_per_dim == tuple(
+        math.ceil(a / c)
+        for a, c in zip(array_shape_tuple, chunk_shape_tuple, strict=True)
+    )
```

That way, any accidental mismatch in the length of array_shape_tuple and chunk_shape_tuple will raise instead of silently truncating.

```bash
#!/bin/bash
# Quick grep to ensure all zip(array_shape_tuple, chunk_shape_tuple) calls
# in the repo use strict=True.
rg -n "zip\(array_shape_tuple,\s*chunk_shape_tuple" --type=py -C2
```
🧹 Nitpick comments (3)
py_hamt/sharded_zarr_store.py (3)
299-337: Clean up _pending_shard_loads on terminal fetch failure to avoid unnecessary timeouts.

The retry/backoff around cas.load is good, but when the final ConnectionError/TimeoutError occurs you raise RuntimeError without clearing _pending_shard_loads[shard_idx]. Callers coming through _load_or_initialize_shard_cache will then block up to 60 seconds on an event that is never set before hitting the timeout path that finally cleans it up.

You can unblock waiters immediately and avoid relying on the timeout branch by clearing the pending entry before raising on the last attempt:

```diff
@@
-        for attempt in range(max_retries):
+        for attempt in range(max_retries):
             try:
@@
-                return  # Success
+                return  # Success
             except (ConnectionError, TimeoutError) as e:
                 # Handle transient errors (e.g., network issues)
-                if attempt < max_retries - 1:
+                if attempt < max_retries - 1:
                     await asyncio.sleep(
                         retry_delay * (2**attempt)
                     )  # Exponential backoff
                     continue
-                else:
-                    raise RuntimeError(
-                        f"Failed to fetch shard {shard_idx} after {max_retries} attempts: {e}"
-                    )
+                else:
+                    # Ensure waiters on this shard index are unblocked on terminal failure.
+                    if shard_idx in self._pending_shard_loads:
+                        self._pending_shard_loads[shard_idx].set()
+                        del self._pending_shard_loads[shard_idx]
+                    raise RuntimeError(
+                        f"Failed to fetch shard {shard_idx} after {max_retries} attempts: {e}"
+                    )
```
343-373: Expanded excluded_array_prefixes correctly captures forecast coordinate arrays.

Including "forecast_reference_time" and "step" (plus "latitude"/"longitude") in excluded_array_prefixes matches the forecast/step coordinate behavior exercised in the new tests and keeps those coordinate arrays in the metadata namespace rather than the main chunk grid, which is what exists()/delete() expect.

If you ever profile this hot, you could consider hoisting excluded_array_prefixes to a module- or class-level constant instead of rebuilding the set on every _parse_chunk_key call, but that’s purely a micro-optimisation.
630-653: Resize trigger narrowing in set() looks correct; consider DRYing the shape check.

Restricting the resize path to keys ending with "zarr.json" but not the root "zarr.json", and only when the new shape has the same rank as _array_shape and actually differs, is a good way to prevent coordinate metadata (e.g., 1D coords) from resharding the main index. The double-check inside _resize_lock is also a nice concurrency guard.

To reduce duplication and make the intent a bit clearer, you could factor the repeated condition into a small helper or local needs_resize boolean:

```diff
-        if not new_array_shape:
+        if not new_array_shape:
             raise ValueError("Shape not found in metadata.")
-        # Only resize when the metadata shape represents the primary array.
-        if (
-            len(new_array_shape) == len(self._array_shape)
-            and tuple(new_array_shape) != self._array_shape
-        ):
+        # Only resize when the metadata shape represents the primary array.
+        needs_resize = (
+            len(new_array_shape) == len(self._array_shape)
+            and tuple(new_array_shape) != self._array_shape
+        )
+        if needs_resize:
             async with self._resize_lock:
@@
-                if (
-                    len(new_array_shape) == len(self._array_shape)
-                    and tuple(new_array_shape) != self._array_shape
-                ):
+                if needs_resize:
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
py_hamt/sharded_zarr_store.py (4 hunks)
tests/test_sharded_zarr_store.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
{py_hamt,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
{py_hamt,tests}/**/*.py: Use type hints throughout the codebase (mypy enforced)
Python code must pass ruff linting (ruff check)
Files:
py_hamt/sharded_zarr_store.py
tests/test_sharded_zarr_store.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Ensure all Python functions are fully typed and mypy-compliant
Prefer simple, functional, easy-to-understand functions
Use meaningful variable and function names
Add comments for complex logic to aid readability and maintenance
Files:
py_hamt/sharded_zarr_store.py
tests/test_sharded_zarr_store.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use Hypothesis property-based testing in the test suite
Gate IPFS-dependent tests behind the --ipfs pytest flag so they can be skipped when IPFS isn’t available
Place all tests within the /tests directory
Files:
tests/test_sharded_zarr_store.py
🧠 Learnings (1)
📚 Learning: 2025-09-15T21:16:14.768Z
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/zarr_hamt_store.py : ZarrHAMTStore must implement the zarr.abc.store.Store interface
Applied to files:
py_hamt/sharded_zarr_store.py
🧬 Code graph analysis (1)
tests/test_sharded_zarr_store.py (1)
py_hamt/sharded_zarr_store.py (5)
ShardedZarrStore (142-874), open (210-241), flush (523-566), exists (687-698), delete (712-731)
🪛 GitHub Actions: Triggered on push from 0xSwego to branch/tag forecast-ds-support
tests/test_sharded_zarr_store.py
[error] 165-178: ruff-format reformatted the file during pre-commit and the hook failed. Run 'pre-commit run --all-files' to re-check formatting.
🪛 Ruff (0.14.6)
py_hamt/sharded_zarr_store.py
634-634: Avoid specifying long messages outside the exception class
(TRY003)
tests/test_sharded_zarr_store.py
136-136: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🔇 Additional comments (3)
py_hamt/sharded_zarr_store.py (1)
716-721: Idempotent metadata/coordinate delete behavior is a good fit for callers.

Making the metadata branch of delete() use pop(key, None) and only toggling _dirty_root when something was actually removed gives you idempotent semantics for coordinate/metadata deletions, which matches how the new tests exercise repeated deletes without errors and is in line with common store APIs.

tests/test_sharded_zarr_store.py (2)
151-197: Good targeted test for coordinate-metadata delete idempotence.
test_coordinate_metadata_delete_idempotent nicely exercises repeated deletes for "forecast_reference_time/c/0" and "step/c/0", matching the new _parse_chunk_key exclusions and idempotent metadata delete logic in the store. This should guard against regressions where coordinate deletions start raising again.
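To spell out the contract the test guards, here is a toy dict-backed illustration of pop-based idempotent deletion (a sketch of the idea, not py-hamt's actual delete implementation):

```python
class ToyMetadataStore:
    """Minimal stand-in showing delete-twice-is-safe semantics."""

    def __init__(self) -> None:
        self._metadata: dict[str, bytes] = {}
        self._dirty_root = False

    async def delete(self, key: str) -> None:
        removed = self._metadata.pop(key, None)  # never raises KeyError
        if removed is not None:
            # Only mark the root dirty when something actually changed.
            self._dirty_root = True
```

Deleting the same key twice succeeds both times, and the second call leaves the dirty flag untouched, which is exactly the behavior the test asserts.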
1-10: The file is already properly formatted with no uncommitted changes required.

The file tests/test_sharded_zarr_store.py currently has proper formatting that complies with ruff-format standards:
- Imports are correctly organized (stdlib → third-party → local with blank line separators)
- No trailing whitespace or formatting issues detected
- Working tree is clean with no uncommitted changes
The review comment's suggestion to run pre-commit and commit diffs is unnecessary, as the file already meets all formatting requirements.
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/test_sharded_zarr_store.py (1)
81-138: Tighten zip() with strict=True in chunk geometry assertion

This test provides solid coverage of 4D forecast datasets and correctly keeps the store geometry aligned with the Dataset. The only concern is the zip() call in the _chunks_per_dim assertion:

```python
assert store_write._chunks_per_dim == tuple(
    math.ceil(a / c) for a, c in zip(array_shape_tuple, chunk_shape_tuple)
)
```

To satisfy Ruff B905 and also surface any future dimension mismatches between array_shape_tuple and chunk_shape_tuple, make the zip strict:

```diff
-    assert store_write._chunks_per_dim == tuple(
-        math.ceil(a / c) for a, c in zip(array_shape_tuple, chunk_shape_tuple)
-    )
+    assert store_write._chunks_per_dim == tuple(
+        math.ceil(a / c)
+        for a, c in zip(array_shape_tuple, chunk_shape_tuple, strict=True)
+    )
```

This keeps the test aligned with the intended invariants and avoids silent truncation if the tuples ever diverge. The async usage here also matches the fully async storage APIs for IPFS-backed stores. Based on learnings, ...
🧹 Nitpick comments (1)
tests/test_sharded_zarr_store.py (1)
152-197: Idempotent delete test is good; optional: assert post-conditions

The test correctly exercises coordinate delete idempotence (double-deleting the same keys without raising), which matches the store’s intended behavior for metadata/chunk deletions.

If you want to make this stronger, you could optionally assert that the coordinate chunks are actually gone after the first delete (e.g., via await store.exists("forecast_reference_time/c/0") is False) before re-deleting, but that’s not strictly necessary for catching regressions in idempotence itself.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
py_hamt/sharded_zarr_store.py (4 hunks)
tests/test_sharded_store_resizing.py (0 hunks)
tests/test_sharded_zarr_store.py (2 hunks)
tests/test_sharded_zarr_store_coverage.py (1 hunks)
💤 Files with no reviewable changes (1)
- tests/test_sharded_store_resizing.py
🚧 Files skipped from review as they are similar to previous changes (1)
- py_hamt/sharded_zarr_store.py
🧰 Additional context used
📓 Path-based instructions (3)
{py_hamt,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
{py_hamt,tests}/**/*.py: Use type hints throughout the codebase (mypy enforced)
Python code must pass ruff linting (ruff check)
Files:
tests/test_sharded_zarr_store_coverage.py
tests/test_sharded_zarr_store.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use Hypothesis property-based testing in the test suite
Gate IPFS-dependent tests behind the --ipfs pytest flag so they can be skipped when IPFS isn’t available
Place all tests within the /tests directory
Files:
tests/test_sharded_zarr_store_coverage.py
tests/test_sharded_zarr_store.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Ensure all Python functions are fully typed and mypy-compliant
Prefer simple, functional, easy-to-understand functions
Use meaningful variable and function names
Add comments for complex logic to aid readability and maintenance
Files:
tests/test_sharded_zarr_store_coverage.py
tests/test_sharded_zarr_store.py
🧠 Learnings (1)
📚 Learning: 2025-09-15T21:16:14.768Z
Learnt from: CR
Repo: dClimate/py-hamt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-15T21:16:14.768Z
Learning: Applies to py_hamt/{store,hamt,zarr_hamt_store,encryption_hamt_store}.py : Make all storage-related operations async to accommodate IPFS/network calls
Applied to files:
tests/test_sharded_zarr_store.py
🧬 Code graph analysis (1)
tests/test_sharded_zarr_store.py (2)
py_hamt/sharded_zarr_store.py (3)
flush (523-566), exists (689-700), delete (714-733)
py_hamt/zarr_hamt_store.py (3)
read_only (108-111), exists (193-199), delete (236-248)
🪛 Ruff (0.14.6)
tests/test_sharded_zarr_store.py
137-137: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Create project environment, run all checks (3.12)
🔇 Additional comments (3)
tests/test_sharded_zarr_store_coverage.py (1)
208-211: Coverage for metadata-without-shape path looks good

Exercising store.set() with shape-less metadata here is enough for this coverage test, since the geometry invariants are asserted in test_metadata_without_shape_does_not_resize.

tests/test_sharded_zarr_store.py (2)
2-3: New imports are appropriate and used

json and math are both exercised in the new tests below (metadata encoding and chunk-geometry assertions), so these additions are correct and won’t trip unused-import linting.
198-220: Metadata-without-shape resize guard is well covered

This test cleanly verifies the new behavior: writing a zarr.json metadata blob without a shape field no longer triggers resizing, and _array_shape/_chunks_per_dim stay unchanged. Using JSON encoding via json.dumps and the buffer prototype matches how other metadata paths operate, so this test is aligned with the implementation.
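A stripped-down model of the guard this test exercises may help readers; the helper below is an assumed simplification, not the store's real code path (which decodes buffers via the prototype and delegates resizing to resize_store):

```python
import json


def extract_resize_shape(key: str, raw: bytes) -> tuple[int, ...] | None:
    """Return a candidate new shape only for non-root zarr.json metadata
    that actually carries a shape field; otherwise signal "no resize"."""
    if not key.endswith("zarr.json") or key == "zarr.json":
        return None
    metadata = json.loads(raw)
    shape = metadata.get("shape")  # shape-less metadata must not resize
    return tuple(shape) if shape else None


# Shape-less metadata: no resize candidate is produced.
assert extract_resize_shape("precip/zarr.json", b'{"attributes": {}}') is None
# Metadata carrying a shape: a resize candidate is produced.
assert extract_resize_shape("precip/zarr.json", b'{"shape": [10, 4, 8]}') == (10, 4, 8)
```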
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/test_sharded_zarr_store.py (3)
81-152: Forecast/step coordinate test covers the new dataset shape & metadata behavior well

This test does a good job exercising a realistic 4‑D forecast layout, verifying both round‑trip integrity and that forecast_reference_time/step coordinate chunks are persisted as metadata. The geometry assertions using _array_shape and _chunks_per_dim (with zip(..., strict=True)) give strong protection against regressions in chunk layout.

If you expect the internal representation of _array_shape/_chunks_per_dim to evolve, consider adding a small public helper or factory in ShardedZarrStore for these invariants so tests depend less on private attributes. As per coding guidelines, this keeps tests resilient to internal refactors.

153-197: Idempotent coordinate metadata delete behavior is validated; consider asserting final state

The test correctly verifies that repeated deletes of forecast_reference_time and step coordinate metadata complete without error, matching the intended idempotent delete implementation.

For a slightly stronger check, you could also assert that await store.exists("forecast_reference_time/c/0") and await store.exists("step/c/0") are False after the deletions (optionally after a flush() and reopen) to confirm state as well as behavior.

199-221: Metadata-without-shape regression is covered; note reliance on private internals

This test cleanly captures the regression condition that metadata JSON lacking a shape must not trigger a resize or error, and the assertions on _array_shape and _chunks_per_dim make the intent explicit.

If _array_shape/_chunks_per_dim are likely to change or be encapsulated further, consider asserting via a small public accessor or helper instead of private attributes so the test remains stable across internal refactors.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_sharded_zarr_store.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
{py_hamt,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
{py_hamt,tests}/**/*.py: Use type hints throughout the codebase (mypy enforced)
Python code must pass ruff linting (ruff check)
Files:
tests/test_sharded_zarr_store.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use Hypothesis property-based testing in the test suite
Gate IPFS-dependent tests behind the --ipfs pytest flag so they can be skipped when IPFS isn’t available
Place all tests within the /tests directory
Files:
tests/test_sharded_zarr_store.py
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Ensure all Python functions are fully typed and mypy-compliant
Prefer simple, functional, easy-to-understand functions
Use meaningful variable and function names
Add comments for complex logic to aid readability and maintenance
Files:
tests/test_sharded_zarr_store.py
🧬 Code graph analysis (1)
tests/test_sharded_zarr_store.py (1)
py_hamt/sharded_zarr_store.py (5)
ShardedZarrStore (142-876), open (210-241), flush (523-566), exists (689-700), delete (714-733)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Create project environment, run all checks (3.12)
🔇 Additional comments (1)
tests/test_sharded_zarr_store.py (1)
2-3: New stdlib imports are appropriate and used correctly
json and math are both used in the new tests and placed correctly with other standard-library imports; no issues here.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores