perf: memoize encoded inner chunk for scalar complete-shard writes by d-v-b · Pull Request #177 · d-v-b/zarr-python

d-v-b · 2026-05-30T07:18:32Z

In ShardingCodec._encode_partial_sync's full-shard-rewrite loop, a scalar broadcast value produces byte-for-byte identical results for every complete inner chunk (same fill, same empty-check, same encoded bytes). Compute that outcome once and reuse it across all complete chunks instead of re-merging, re-checking write_empty_chunks, and re-encoding tens of thousands of identical chunks. Incomplete edge chunks still merge against their own data individually.

Target case (fused, memory, chunks=100/shards=1M, no compression): write time drops from 92.26 ms to 21.59 ms (4.3x faster). Pipeline parity holds (output is byte-identical to batched), and 956 tests pass under the fused pipeline; adversarial partial-overwrite/edge/compression/2D/aliasing checks also pass.
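The memoization described above can be illustrated with a small standalone sketch. This is not the code from the PR: `encode_chunk`, `encode_scalar_shard`, and the 1-D list-based shard model are hypothetical stand-ins chosen for illustration, and the real ShardingCodec operates on array buffers and a codec chain rather than `struct.pack`. The sketch only shows the idea of computing the outcome for a complete inner chunk once and reusing it, while edge chunks are still merged individually.

```python
import struct


def encode_chunk(values):
    # Hypothetical stand-in for the codec chain: pack float64, little-endian.
    return struct.pack(f"<{len(values)}d", *values)


def encode_scalar_shard(existing, chunk_len, n_written, value,
                        fill_value=0.0, write_empty_chunks=False):
    """Write `value` over the first `n_written` elements of a 1-D shard.

    Returns (chunks, encode_calls). `chunks` maps inner-chunk index to the
    encoded bytes, or to None when the chunk is all fill and is skipped.
    """
    encode_calls = 0

    def finalize(chunk):
        # Empty-chunk check followed by encoding, as done for each chunk.
        nonlocal encode_calls
        if not write_empty_chunks and all(v == fill_value for v in chunk):
            return None
        encode_calls += 1
        return encode_chunk(chunk)

    out = {}
    memo_ready = False
    memo = None
    for idx, start in enumerate(range(0, len(existing), chunk_len)):
        stop = min(start + chunk_len, len(existing))
        if stop - start == chunk_len and stop <= n_written:
            # Complete chunk: every element is `value`, so the outcome is
            # identical for all such chunks. Compute once, then reuse.
            if not memo_ready:
                memo = finalize([value] * chunk_len)
                memo_ready = True
            out[idx] = memo
        else:
            # Incomplete edge chunk: merge against its own existing data.
            chunk = list(existing[start:stop])
            n = max(0, min(n_written, stop) - start)
            chunk[:n] = [value] * n
            out[idx] = finalize(chunk)
    return out, encode_calls


existing = [float(i) for i in range(10)]
chunks, calls = encode_scalar_shard(existing, chunk_len=4, n_written=9, value=7.0)
print(calls)  # encodes performed: one shared by the complete chunks, one for the edge
```

In this toy model the two complete chunks share a single encoded bytes object, so the cost of the empty-check and encode step no longer scales with the number of complete inner chunks.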


TODO:

Add unit tests and/or doctests in docstrings
Add docstrings and API docs for any new/modified user-facing classes and functions
New/modified features documented in docs/user-guide/*.md
Changes documented as a new file in changes/
GitHub Actions have all passed
Test coverage is 100% (Codecov passes)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions Bot added the needs release notes label May 30, 2026

d-v-b merged commit c9c8c26 into perf/prepared-write-v2 May 30, 2026
2 checks passed

d-v-b deleted the perf/prepared-write-v2-scalar-memo branch May 30, 2026 07:19
