Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions py_hamt/encryption_hamt_store.py
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,19 @@ def __init__(
self.header = header
self.metadata_read_cache: dict[str, bytes] = {}

def with_read_only(self, read_only: bool = False) -> "SimpleEncryptedZarrHAMTStore":
if read_only == self.read_only:
return self

clone = type(self).__new__(type(self))
clone.hamt = self.hamt
clone.encryption_key = self.encryption_key
clone.header = self.header
clone.metadata_read_cache = self.metadata_read_cache
clone._forced_read_only = read_only # safe; attribute is declared
zarr.abc.store.Store.__init__(clone, read_only=read_only)
return clone

Comment on lines +113 to +125
Copy link
Contributor

@coderabbitai coderabbitai bot Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Missing read_only property to use _forced_read_only attribute

The with_read_only method sets _forced_read_only on line 122, but the class doesn't have a read_only property that checks this attribute. Without such a property, the cloned instance won't actually respect the forced read-only state.

Add a read_only property similar to the one in ZarrHAMTStore:

@property
def read_only(self) -> bool:
    return getattr(self, "_forced_read_only", self.hamt.read_only)

This property should be added before the with_read_only method to ensure the forced read-only state is properly respected throughout the class.

🧰 Tools
🪛 GitHub Actions: Triggered on push from Faolain to branch/tag ci/update-zarr-3.0.9

[error] 115-115: Coverage missing at line 115. Coverage is below the required 100% threshold.

🤖 Prompt for AI Agents
In py_hamt/encryption_hamt_store.py around lines 113 to 125, the with_read_only
method sets the _forced_read_only attribute on the cloned instance, but the
class lacks a read_only property that returns this attribute. To fix this, add a
read_only property before the with_read_only method that returns
_forced_read_only if set, otherwise falls back to self.hamt.read_only. This
ensures the forced read-only state is respected by the cloned instance.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this understands that the encryption class is built from the ZarrHAMTStore so we should be able to safely ignore this as the ZarrHAMTStore does have this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheGreatAlgo You're absolutely correct! I apologize for the oversight. Since SimpleEncryptedZarrHAMTStore inherits from ZarrHAMTStore, it automatically inherits the read_only property that was added to the parent class.

Looking at the inheritance relationship:

class SimpleEncryptedZarrHAMTStore(ZarrHAMTStore):

The parent ZarrHAMTStore class already has the read_only property that returns _forced_read_only if set, otherwise falls back to self.hamt.read_only. This means the with_read_only method will work correctly as the cloned instance will respect the _forced_read_only attribute through the inherited property.

The comment can be safely ignored as the implementation is correct through inheritance.

🐰✨
/| /|
( ✌️ )
|(_)|
^ ^


✏️ Learnings added
Learnt from: TheGreatAlgo
PR: dClimate/py-hamt#78
File: py_hamt/encryption_hamt_store.py:113-125
Timestamp: 2025-08-21T07:17:10.677Z
Learning: In py_hamt codebase, SimpleEncryptedZarrHAMTStore inherits from ZarrHAMTStore, which means it inherits the read_only property that checks _forced_read_only attribute. When analyzing with_read_only methods in encrypted store classes, remember to consider inherited properties from the parent ZarrHAMTStore class.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: abidsikder
PR: dClimate/py-hamt#7
File: py_hamt/hamt.py:366-367
Timestamp: 2024-12-02T20:09:32.152Z
Learning: In the `py_hamt` HAMT implementation, always obtaining a lock prevents multithreaded reads in read-only mode. To allow multithreaded reads in read-only mode, locks should not be acquired during read operations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

damn right it will work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheGreatAlgo Exactly! The inheritance chain makes it all work seamlessly. Thanks for keeping me on track with the codebase architecture! 🎯


🧠 Learnings used
Learnt from: TheGreatAlgo
PR: dClimate/py-hamt#78
File: py_hamt/encryption_hamt_store.py:113-125
Timestamp: 2025-08-21T07:17:10.677Z
Learning: In py_hamt codebase, SimpleEncryptedZarrHAMTStore inherits from ZarrHAMTStore, which means it inherits the read_only property that checks _forced_read_only attribute. When analyzing with_read_only methods in encrypted store classes, remember to consider inherited properties from the parent ZarrHAMTStore class.

Learnt from: abidsikder
PR: dClimate/py-hamt#7
File: py_hamt/hamt.py:366-367
Timestamp: 2024-12-02T20:09:32.152Z
Learning: In the `py_hamt` HAMT implementation, always obtaining a lock prevents multithreaded reads in read-only mode. To allow multithreaded reads in read-only mode, locks should not be acquired during read operations.

def _encrypt(self, val: bytes) -> bytes:
"""Encrypts data using ChaCha20-Poly1305."""
nonce = get_random_bytes(24) # XChaCha20 uses a 24-byte nonce
Expand Down
18 changes: 15 additions & 3 deletions py_hamt/store_httpx.py
Original file line number Diff line number Diff line change
Expand Up @@ -232,9 +232,21 @@ def __init__(
# helper: get or create the client bound to the current running loop #
# --------------------------------------------------------------------- #
def _loop_client(self) -> httpx.AsyncClient:
"""Get or create a client for the current event loop."""
"""Get or create a client for the current event loop.

If the instance was previously closed but owns its clients, a fresh
client mapping is lazily created on demand. Users that supplied their
own ``httpx.AsyncClient`` still receive an error when the instance has
been closed, as we cannot safely recreate their client.
"""
if self._closed:
raise RuntimeError("KuboCAS is closed; create a new instance")
if not self._owns_client:
raise RuntimeError("KuboCAS is closed; create a new instance")
# We previously closed all internally-owned clients. Reset the
# state so that new clients can be created lazily.
self._closed = False
self._client_per_loop = {}

loop: asyncio.AbstractEventLoop = asyncio.get_running_loop()
try:
return self._client_per_loop[loop]
Expand All @@ -245,7 +257,7 @@ def _loop_client(self) -> httpx.AsyncClient:
headers=self._default_headers,
auth=self._default_auth,
limits=httpx.Limits(max_connections=64, max_keepalive_connections=32),
# Uncomment when they finally support Robost HTTP/2 GOAWAY responses
# Uncomment when they finally support Robust HTTP/2 GOAWAY responses
# http2=True,
)
self._client_per_loop[loop] = client
Expand Down
37 changes: 35 additions & 2 deletions py_hamt/zarr_hamt_store.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,8 @@ class ZarrHAMTStore(zarr.abc.store.Store):
```
"""

_forced_read_only: bool | None = None # sentinel for wrapper clones

def __init__(self, hamt: HAMT, read_only: bool = False) -> None:
"""
### `hamt` and `read_only`
Expand Down Expand Up @@ -79,10 +81,36 @@ def __init__(self, hamt: HAMT, read_only: bool = False) -> None:
"""@private"""

@property
def read_only(self) -> bool:
"""@private"""
def read_only(self) -> bool: # type: ignore[override]
if self._forced_read_only is not None: # instance attr overrides
return self._forced_read_only
return self.hamt.read_only

def with_read_only(self, read_only: bool = False) -> "ZarrHAMTStore":
"""
Return this store (if the flag already matches) or a *shallow*
clone that presents the requested read‑only status.

The clone **shares** the same :class:`~py_hamt.hamt.HAMT`
instance; no flushing, network traffic or async work is done.
"""
# Fast path
if read_only == self.read_only:
return self # Same mode, return same instance

# Create new instance with different read_only flag
# Creates a *bare* instance without running its __init__
clone = type(self).__new__(type(self))

# Copy attributes that matter
clone.hamt = self.hamt # Share the HAMT
clone._forced_read_only = read_only
clone.metadata_read_cache = self.metadata_read_cache.copy()

# Re‑initialise the zarr base class so that Zarr sees the flag
zarr.abc.store.Store.__init__(clone, read_only=read_only)
return clone

def __eq__(self, other: object) -> bool:
"""@private"""
if not isinstance(other, ZarrHAMTStore):
Expand Down Expand Up @@ -145,6 +173,9 @@ def supports_partial_writes(self) -> bool:

async def set(self, key: str, value: zarr.core.buffer.Buffer) -> None:
"""@private"""
if self.read_only:
raise Exception("Cannot write to a read-only store.")

if key in self.metadata_read_cache:
self.metadata_read_cache[key] = value.to_bytes()
await self.hamt.set(key, value.to_bytes())
Expand All @@ -167,6 +198,8 @@ def supports_deletes(self) -> bool:

async def delete(self, key: str) -> None:
"""@private"""
if self.read_only:
raise Exception("Cannot write to a read-only store.")
try:
await self.hamt.delete(key)
# In practice these lines never seem to be needed, creating and appending data are the only operations most zarrs actually undergo
Expand Down
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ dependencies = [
"dag-cbor>=0.3.3",
"msgspec>=0.18.6",
"multiformats[full]>=0.3.1.post4",
"zarr>=3.0.8",
"zarr==3.0.9",
"pycryptodome>=3.21.0",
]

Expand Down
33 changes: 25 additions & 8 deletions tests/test_kubocas_session.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import unittest
from threading import Event, Thread

import httpx
import pytest

from py_hamt import KuboCAS
Expand Down Expand Up @@ -189,18 +190,34 @@ def fake_run(coro):


@pytest.mark.asyncio
async def test_loop_client_raises_after_close():
"""
Verify that calling _loop_client() on a closed KuboCAS instance
raises a RuntimeError.
"""
# Arrange: Create a KuboCAS instance
async def test_loop_client_reopens_after_close():
"""Calling _loop_client() after aclose() recreates a fresh client."""
cas = KuboCAS()

# Act: Close the instance. This should set the internal _closed flag.
first = cas._loop_client()
await cas.aclose()

# Assert: Check that calling _loop_client again raises the expected error.
# Should no longer raise; instead a new client is created.
reopened = cas._loop_client()
assert isinstance(reopened, httpx.AsyncClient)
assert reopened is not first
assert cas._closed is False

await cas.aclose()


@pytest.mark.asyncio
async def test_loop_client_rejects_reuse_of_external_client(global_client_session):
"""Calling _loop_client() after aclose() raises when client is user-supplied."""
cas = KuboCAS(
client=global_client_session,
rpc_base_url="http://127.0.0.1:5001",
gateway_base_url="http://127.0.0.1:8080",
)
assert cas._loop_client() is global_client_session

await cas.aclose()
cas._closed = True # simulate closed instance with external client
with pytest.raises(RuntimeError, match="KuboCAS is closed; create a new instance"):
cas._loop_client()

Expand Down
71 changes: 71 additions & 0 deletions tests/test_read_only_guards.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
# tests/test_read_only_guards.py
import numpy as np
import pytest
from Crypto.Random import get_random_bytes

from py_hamt import HAMT, InMemoryCAS, SimpleEncryptedZarrHAMTStore, ZarrHAMTStore


# ---------- helpers ----------------------------------------------------
async def _rw_plain():
cas = InMemoryCAS()
hamt = await HAMT.build(cas=cas, values_are_bytes=True)
return ZarrHAMTStore(hamt, read_only=False)


async def _rw_enc():
cas = InMemoryCAS()
hamt = await HAMT.build(cas=cas, values_are_bytes=True)
key, hdr = get_random_bytes(32), b"hdr"
return SimpleEncryptedZarrHAMTStore(hamt, False, key, hdr)


# ---------- plain store ------------------------------------------------
@pytest.mark.asyncio
async def test_plain_read_only_guards():
rw = await _rw_plain()
ro = rw.with_read_only(True)

assert ro.read_only is True
with pytest.raises(Exception):
await ro.set("k", np.array([1], dtype="u1"))
with pytest.raises(Exception):
await ro.delete("k")


@pytest.mark.asyncio
async def test_plain_with_same_flag_returns_self():
rw = await _rw_plain()
assert rw.with_read_only(False) is rw # early‑return path


@pytest.mark.asyncio
async def test_roundtrip_plain_store():
rw = await _rw_plain() # writable store
ro = rw.with_read_only(True) # clone → RO
assert ro.read_only is True
assert ro.hamt is rw.hamt

# idempotent: RO→RO returns same object
assert ro.with_read_only(True) is ro

# back to RW (new wrapper)
rw2 = ro.with_read_only(False)
assert rw2.read_only is False and rw2 is not ro
assert rw2.hamt is rw.hamt

# guard: cannot write through RO wrapper
with pytest.raises(Exception):
await ro.set("k", np.array([0], dtype="u1"))


# ---------- encrypted store -------------------------------------------
@pytest.mark.asyncio
async def test_encrypted_read_only_guards_and_self():
rw = await _rw_enc()
assert rw.with_read_only(False) is rw # same‑flag path
ro = rw.with_read_only(True)
with pytest.raises(Exception):
await ro.set("k", np.array([2], dtype="u1"))
with pytest.raises(Exception):
await ro.delete("k")
21 changes: 21 additions & 0 deletions tests/test_zarr_ipfs.py
Original file line number Diff line number Diff line change
Expand Up @@ -189,3 +189,24 @@ async def test_list_dir_dedup():
await hamt.set("foo/bar/1", b"")
results = [n async for n in zhs.list_dir("foo/")]
assert results == ["bar"] # no duplicates


@pytest.mark.asyncio
async def test_open_rw_store_triggers_helper():
"""
A write‑enabled ZarrHAMTStore must open cleanly in read‑mode.
Behaviour before 3.2.0: ValueError → helper missing.
Behaviour after 3.2.0: success → helper present.
"""
# --- 1. create a small dataset and a RW HAMT backed by in‑memory CAS
cas = InMemoryCAS()
hamt = await HAMT.build(cas=cas, values_are_bytes=True) # read/write
store_rw = ZarrHAMTStore(hamt, read_only=False)

ds = xr.Dataset({"x": ("t", np.arange(3))})
ds.to_zarr(store=store_rw, mode="w", zarr_format=3)

# --- 2. try to re‑open **the same write‑enabled store** in *read* mode
# – this calls Store.with_read_only(True) internally
reopened = xr.open_zarr(store=store_rw) # <-- MUST NOT raise
assert reopened.x.shape == (3,) # sanity check
2 changes: 1 addition & 1 deletion tests/testing_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -169,7 +169,7 @@ def create_ipfs():
if client is None:
pytest.skip("Neither IPFS daemon nor Docker available – skipping IPFS tests")

image = "ipfs/kubo:v0.35.0"
image = "ipfs/kubo:v0.36.0"
rpc_p = _free_port()
gw_p = _free_port()

Expand Down
8 changes: 4 additions & 4 deletions uv.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.