Implement iterating partitions in insertion order without global index by Copilot · Pull Request #254 · albe/node-event-storage

Copilot · 2026-03-21T09:01:50Z

readRange with index=false now iterates all partitions directly in sequenceNumber order, allowing cross-partition insertion-order iteration without a global index (useful when the global index is unavailable or corrupted). The default behavior — using the global index — is unchanged.

Changes

`ReadablePartition` — header exposure via `readFrom` and `readAll`

Rather than introducing a new method, the document header fields are now exposed through the existing partition read API:

- readFrom(position, size, headerOut): Accepts an optional headerOut object. When provided, it is mutated in place with { dataSize, sequenceNumber, time64 } from the document header, so the default (null) code path incurs no extra allocation.
- readAll(after, headerOut): Passes headerOut through to readFrom on each iteration and additionally sets headerOut.position to the document's byte offset within the partition. When headerOut is available, headerOut.dataSize is used to advance the position, which avoids a Buffer.byteLength call.
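The out-parameter pattern described above can be sketched with a small in-memory stand-in. This is only an illustration: `SketchPartition` is hypothetical, its `readFrom` drops the `size` argument, and `position` is an array index here rather than a byte offset as in the real `ReadablePartition`.

```javascript
// Hypothetical in-memory stand-in for a partition; NOT the library's ReadablePartition.
class SketchPartition {
    constructor(records) {
        // Each record: { data, sequenceNumber, time64 }
        this.records = records;
    }

    // Returns the document data; fills headerOut only if the caller supplied one.
    readFrom(position, headerOut = null) {
        const record = this.records[position];
        if (record === undefined) {
            return false;
        }
        if (headerOut !== null) {
            // Mutate the caller's object instead of allocating a new one per read.
            headerOut.dataSize = record.data.length;
            headerOut.sequenceNumber = record.sequenceNumber;
            headerOut.time64 = record.time64;
        }
        return record.data;
    }

    // Yields each document; headerOut (if any) also receives the current position.
    *readAll(headerOut = null) {
        for (let position = 0; position < this.records.length; position++) {
            const data = this.readFrom(position, headerOut);
            if (headerOut !== null) {
                headerOut.position = position;
            }
            yield data;
        }
    }
}

const partition = new SketchPartition([
    { data: 'foo', sequenceNumber: 0, time64: 1000 },
    { data: 'quux', sequenceNumber: 2, time64: 1002 },
]);

const header = {}; // allocated once, reused for every document
const seen = [];
for (const data of partition.readAll(header)) {
    // Copy the fields you need: header is overwritten on the next iteration.
    seen.push({ data, sequenceNumber: header.sequenceNumber, position: header.position });
}
```

The trade-off is that the caller must copy any header field it wants to keep, because the same object is overwritten on every read.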

`ReadableStorage` — partition scan mode

- readRange: The default (index = null) uses the global index as before. Passing index = false explicitly triggers partition scan mode.
- iterateRange: When index === false, delegates to iteratePartitionsBySequenceNumber and yields only the document. Otherwise it uses the provided index or falls back to the global index.
- iteratePartitionsBySequenceNumber(fromSeq, untilSeq) (public API): K-way merge across all partitions. It allocates one reusable headerOut object per partition and passes it to readAll, advances each partition iterator to the first document with sequenceNumber >= fromSeq, then repeatedly yields the document with the smallest sequenceNumber. Each entry is yielded as { document, sequenceNumber, partitionName, position }, which is all the information needed to rebuild index entries. Sequence numbers in headers are 0-based; from/until in the public API are 1-based (fromSeq = from - 1).

// Default: uses the global index as before
for (const doc of storage.readRange(1)) { ... }

// Explicit index: uses that index
for (const doc of storage.readRange(1, -1, storage.index)) { ... }

// Pass false to bypass the global index and iterate partitions by sequenceNumber:
for (const doc of storage.readRange(1, -1, false)) { ... }

// Direct access to enriched entries for reindexing:
for (const { document, sequenceNumber, partitionName, position } of storage.iteratePartitionsBySequenceNumber(0, Infinity)) {
    // rebuild index entries from partitionName + position + sequenceNumber
}
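For readers unfamiliar with the k-way merge mentioned above, here is a minimal sketch over plain arrays. It is not the code from this PR: `mergeBySequenceNumber` and the `{ name, entries }` input shape are hypothetical, and each input is assumed to be already sorted by sequenceNumber.

```javascript
// Hypothetical sketch of a k-way merge; names and input shape are illustrative only.
function* mergeBySequenceNumber(partitions, fromSeq = 0, untilSeq = Infinity) {
    // partitions: array of { name, entries }, each entries array sorted by sequenceNumber
    const heads = [];
    for (const { name, entries } of partitions) {
        const reader = entries[Symbol.iterator]();
        let result = reader.next();
        // Skip documents before the requested range.
        while (!result.done && result.value.sequenceNumber < fromSeq) {
            result = reader.next();
        }
        if (!result.done && result.value.sequenceNumber <= untilSeq) {
            heads.push({ reader, partitionName: name, entry: result.value });
        }
    }

    while (heads.length > 0) {
        // Linear scan for the smallest head; a heap may suit many partitions better.
        let min = 0;
        for (let i = 1; i < heads.length; i++) {
            if (heads[i].entry.sequenceNumber < heads[min].entry.sequenceNumber) {
                min = i;
            }
        }
        const head = heads[min];
        yield { partitionName: head.partitionName, ...head.entry };

        // Advance the partition we just consumed from, or drop it when exhausted.
        const next = head.reader.next();
        if (!next.done && next.value.sequenceNumber <= untilSeq) {
            head.entry = next.value;
        } else {
            heads.splice(min, 1);
        }
    }
}

const merged = [...mergeBySequenceNumber([
    { name: 'stream-a', entries: [{ sequenceNumber: 0, document: 'a0' }, { sequenceNumber: 3, document: 'a3' }] },
    { name: 'stream-b', entries: [{ sequenceNumber: 1, document: 'b1' }, { sequenceNumber: 2, document: 'b2' }] },
], 1, 3)];
```

Only one pending entry per partition is held in memory at a time, which is what makes this approach workable for large partitions.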

Breaking change

None — the default readRange(from, until) behavior is unchanged (uses the global index). Passing false as the third argument is new opt-in functionality.
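The opt-in nature of the third argument can be illustrated with a small dispatch sketch. `chooseReadMode` and `toHeaderSequence` are hypothetical helpers written for this example; the actual iterateRange may be structured differently.

```javascript
// Hypothetical dispatch helper; only index === false selects the partition scan.
function chooseReadMode(index, globalIndex) {
    if (index === false) {
        return { mode: 'partition-scan', index: null };
    }
    // null/undefined fall back to the global index; any other value is used as given.
    return { mode: 'index', index: index ?? globalIndex };
}

// Public `from` is 1-based, header sequence numbers are 0-based (fromSeq = from - 1).
function toHeaderSequence(from) {
    return from - 1;
}

const globalIndex = { name: 'global' };   // placeholder objects for illustration
const streamIndex = { name: 'stream-a' };

const defaultMode = chooseReadMode(null, globalIndex);
const explicitMode = chooseReadMode(streamIndex, globalIndex);
const scanMode = chooseReadMode(false, globalIndex);
```

Because existing callers pass either nothing or an index object, a strict comparison against false cannot change their behavior.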

Original prompt

This section details the original issue you should resolve

<issue_title>Return storage document header with reads to allow external sorting by sequenceNumber or time64</issue_title>
<issue_description>Since 0.7 the storage layer stores an external sequence number and a monotonic time64 timestamp into every document. Until now that information is not returned back when reading from the storage.

This requires an API change, which is breaking.</issue_description>

<agent_instructions>@copilot suggest a change on the iterateRange method that iterates partitions directly and orders documents by sequenceNumber from the document header (which by default is coming from the monotonic clock on write), unless an index is specified in which case it iterates that as it does currently. The goal is to be able to iterate documents in logical insertion order across streams/partitions, without having a global index. For example in the case the global index has been corrupted. Add tests that verify the order returned.</agent_instructions>

Comments on the Issue (you are @copilot in this section)
@albe The EventStore read API should not be dealing with internal document sequence numbers and timestamps, so that part should not change. The goal though is, that the storage sequence number can be used to replace the storage level global index for cross-stream (partition) reading purposes. At least the global index should only be optional for performance improvements and not mandatory to reconstruct the document order. See #24, which requires iterating all partitions in insertion order to reindex documents.
The Storage read API currently consists of two methods:
read(number, index): document
This API method does not need to change. If you want to read a single document from the storage, the sequence number is already known and timestamp is likely not of interest. For the case they are, a new method can be added.
*readRange(from, until = -1, index = null): Generator<document>
This API method is supposed to return all documents in the order they were written to the storage. If an index is specified only the documents in that index (stream) should be returned. Hence, technically this API also shouldn't change - a reader is likely not concerned with the individual document's sequence number (he only wants them in the given range and in order) or timestamp. Again, an additional API method can be added to allow this use-case.

So effectively, the *iterateRange(from, until, index) implementation should not read from the global index, but instead iterate over all partitions and return the documents in the sequenceNumber order.

A potential additional API method could be something like
*readTimeRange(fromTime, untilTime): Generator<document>
which would return all documents within a given time range, rather than sequence number range. Once a method that iterates all documents and orders by the document metadata is implemented, adding this API should be straightforward. The biggest issue to solve is how to efficiently find the start/end point for the range. That could be solved by indexing the document time</comment_new>
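As a rough illustration of the readTimeRange idea, the sketch below filters an ordered stream of entries by timestamp. It assumes each entry carries a `time64` field next to the document, which is an assumption for this example; `filterByTimeRange` is a hypothetical helper and does a full scan, so it does not address the efficient start/end lookup mentioned above.

```javascript
// Hypothetical helper: yields documents whose time64 lies within [fromTime, untilTime].
// Assumes entries look like { document, time64 }; this shape is illustrative only.
function* filterByTimeRange(entries, fromTime, untilTime) {
    for (const entry of entries) {
        if (entry.time64 >= fromTime && entry.time64 <= untilTime) {
            yield entry.document;
        }
    }
}

const sample = [
    { document: 'a', time64: 100 },
    { document: 'b', time64: 200 },
    { document: 'c', time64: 300 },
];
const inRange = [...filterByTimeRange(sample, 150, 300)];
```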

Fixes Return storage document header with reads to allow external sorting by sequenceNumber or time64 #150



coveralls · 2026-03-21T10:54:36Z

coverage: 97.863% (+0.04%) from 97.826%
when pulling eb379fe on copilot/return-storage-document-header
into 28d4e34 on main.

albe · 2026-03-21T22:21:30Z

@copilot reconsidering the choice of default behavior being partition scanning, this is probably a bad choice. The default should be the better behavior, which is using the global index if it's (still) there. Change the API so that in order to NOT use the global index, the index argument needs to be explicitly set to false. This will make this change b/c as well.


Copilot · 2026-03-21T22:23:19Z

@copilot reconsidering the choice of default behavior being partition scanning, this is probably a bad choice. The default should be the better behavior, which is using the global in...

Done in ff744f8. Default behavior (index = null) now uses the global index as before. Passing false explicitly triggers the partition scan ordered by sequenceNumber. The three new tests now use readRange(from, until, false) to exercise that path.

albe · 2026-03-22T10:11:16Z

@copilot Let's condense the ReadablePartition API surface such that we don't introduce a new method for reading with headers. We already do read the document header anyway in readFrom so we can also expose the full header in the read partition API layer. We keep the storage layer as is though, as it's higher level.
When implementing, consider the potential performance impact if we have to return an array or object structure to hold both data and header, since this implies a potential additional allocation on every read.


Copilot · 2026-03-22T10:14:48Z

@copilot Let's condense the ReadablePartition API surface such that we don't introduce a new method for reading with headers. We already do read the document header anyway in `readFr...

Done in fcd5390. Removed readAllWithHeaders; header fields are now exposed via an optional headerOut object parameter on the existing readFrom(position, size, headerOut) and readAll(after, headerOut) methods. Passing a pre-allocated object lets callers reuse it across reads (zero extra allocation per document). iteratePartitionsBySequenceNumber now allocates one headerOut = {} per partition and passes it to readAll, reading headerOut.sequenceNumber and headerOut.dataSize after each gen.next() call.

albe · 2026-03-22T11:03:13Z

@copilot looking good, one last thing: consider the case mentioned in #264 where the primary index lags behind actual documents in the partitions. In that case we want to use the functionality in this PR to bring the primary index (and eventually secondary indexes) back in sync by iterating documents by their document headers sequence number, then adding the missing index entries. This means we not only need to receive the document data, but the document sequence number, partition name and partition offset.
Make this information available with the least changes necessary.


Copilot · 2026-03-22T11:05:51Z

@copilot looking good, one last thing: consider the case mentioned in #264 where the primary index lags behind actual documents in the partitions. In that case we want to use the fun...

Done in d87beda.

Two minimal changes:

ReadablePartition.readAll — when headerOut is supplied, headerOut.position is now set to the document's byte offset within the partition before each yield (alongside the existing dataSize, sequenceNumber, time64). No extra allocation; the same pre-existing headerOut object is mutated.
ReadableStorage.iteratePartitionsBySequenceNumber — now public (@api); yields { document, sequenceNumber, partitionName, position } per entry instead of raw documents. iterateRange (called by readRange) now extracts .document from each entry, so the readRange interface is unchanged.

Callers doing reindexing can drive it via:

for (const { document, sequenceNumber, partitionName, position } of storage.iteratePartitionsBySequenceNumber(0, Infinity)) {
    // rebuild index entries from partitionName + position + sequenceNumber
}
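A sketch of how a caller might collect the index entries that a lagging index is missing. Everything here is illustrative: `sketchEntries` stands in for storage.iteratePartitionsBySequenceNumber(0, Infinity), and `lastIndexedSeq` plus the returned entry shape are assumptions, not the library's index format.

```javascript
// Hypothetical stand-in for storage.iteratePartitionsBySequenceNumber(0, Infinity).
function* sketchEntries() {
    yield { document: { n: 1 }, sequenceNumber: 0, partitionName: 'stream-a', position: 0 };
    yield { document: { n: 2 }, sequenceNumber: 1, partitionName: 'stream-b', position: 0 };
    yield { document: { n: 3 }, sequenceNumber: 2, partitionName: 'stream-a', position: 64 };
}

// Collects entries the index does not cover yet.
// lastIndexedSeq and the returned entry shape are assumptions for illustration.
function collectMissingIndexEntries(entries, lastIndexedSeq) {
    const missing = [];
    for (const { sequenceNumber, partitionName, position } of entries) {
        if (sequenceNumber <= lastIndexedSeq) {
            continue; // already present in the index
        }
        missing.push({ sequenceNumber, partitionName, position });
    }
    return missing;
}

// Pretend the index only knows about sequenceNumber 0.
const missingEntries = collectMissingIndexEntries(sketchEntries(), 0);
```

Since entries arrive in sequenceNumber order, the missing entries can be appended to the index in the order they are collected.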

albe · 2026-03-22T11:24:49Z

src/Storage/ReadableStorage.js

+            }
+
+            if (!result.done && headerOut.sequenceNumber <= untilSeq) {
+                iterators.push({ gen, headerOut, data: result.value, sequenceNumber: headerOut.sequenceNumber, position: headerOut.position, partitionName: partition.name });


@copilot code readability nitpick: rename the iterators variable to partitions and gen to reader here.
Name variables more to the what than the how.

Done in eb379fe — iterators → partitions, gen → reader throughout iteratePartitionsBySequenceNumber.


Initial plan

27eaecc

Copilot AI assigned Copilot and albe Mar 21, 2026

Copilot started work on behalf of albe March 21, 2026 09:01

Modify iterateRange to iterate partitions by sequenceNumber when no i…

81e06bf

…ndex is specified Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/992c92a7-025b-417b-9626-2ce2390beadd

Copilot AI changed the title ~~[WIP] Update iterateRange method to return storage document header~~ iterateRange: use k-way partition merge by sequenceNumber when no index specified Mar 21, 2026

Copilot AI requested a review from albe March 21, 2026 09:15

Copilot finished work on behalf of albe March 21, 2026 09:15

Copilot started work on behalf of albe March 21, 2026 22:21

Change partition scan to require index=false instead of index=null

ff744f8

Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/ab2b998e-7f96-4809-825a-70ea29c0246c

Copilot AI changed the title ~~iterateRange: use k-way partition merge by sequenceNumber when no index specified~~ iterateRange: use k-way partition merge by sequenceNumber when index=false Mar 21, 2026

Copilot finished work on behalf of albe March 21, 2026 22:23

Copilot started work on behalf of albe March 22, 2026 10:11

Condense ReadablePartition API: expose header via readFrom/readAll, r…

fcd5390

…emove readAllWithHeaders Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/09d5945b-45eb-4ae2-8637-fc9802c46b61

Copilot finished work on behalf of albe March 22, 2026 10:15

albe mentioned this pull request Mar 22, 2026

Implement auto-repair #107

Merged

Copilot AI mentioned this pull request Mar 22, 2026

Detect primary index lagging behind partition data at startup #264

Merged

Copilot started work on behalf of albe March 22, 2026 11:03

Expose sequenceNumber, partitionName, position from iteratePartitions…

d87beda

…BySequenceNumber Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/97833e10-26b5-48f7-96a3-aa626b11ec94

Copilot finished work on behalf of albe March 22, 2026 11:06

albe reviewed Mar 22, 2026


Copilot started work on behalf of albe March 22, 2026 11:24

Rename iterators→partitions, gen→reader in iteratePartitionsBySequenc…

eb379fe

…eNumber Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/0a7ac0a1-f311-40c2-8415-a1aae70291c3

Copilot AI requested a review from albe March 22, 2026 11:26

Copilot finished work on behalf of albe March 22, 2026 11:26

albe changed the title ~~iterateRange: use k-way partition merge by sequenceNumber when index=false~~ Implement iterating partitions in insertion order without global index Mar 22, 2026

albe marked this pull request as ready for review March 22, 2026 11:57

albe merged commit f923873 into main Mar 22, 2026
10 checks passed

albe deleted the copilot/return-storage-document-header branch March 22, 2026 11:57

albe mentioned this pull request Mar 22, 2026

Allow reindexing a Storage #24

Closed

Copilot AI mentioned this pull request Mar 22, 2026

Allow reindexing a Storage: auto-repair lagging index and public reindex() API #266

Merged

albe merged 6 commits into main from copilot/return-storage-document-header

3 participants
