Problem
When reading individual variables from a .om file stored on S3 (or any HTTP range-request backend) via OmFileReader.from_fsspec(), each get_child_by_index(i) call triggers a separate range request because each child's variable message is stored immediately after that child's compressed data — scattered throughout the file.
Even reading a single pixel from a handful of variables requires many small metadata fetches first. For a file with N children, the reader must make ~2N small range requests (one for each child's variable message, one for each LUT entry) just to discover where the data chunks live, before it can fetch the actual data. Each of these reads is tiny (77–131 bytes) but at unpredictable offsets spread across the entire file.
A pre-fetch of the file's tail region (e.g., last 2 MB) captures the root group message and trailer, but not the per-child metadata, because that metadata is interleaved with data chunks throughout the file body.
On S3, each range request has ~400ms of latency overhead, so these scattered metadata reads dominate total query time — turning what should be a fast point lookup into a multi-second operation.
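A back-of-the-envelope model makes the cost concrete (a sketch using the figures above: ~400 ms per request and ~2 metadata requests per child; the function name is illustrative, not part of omfiles):

```python
# Rough latency model for the metadata phase of a point lookup,
# using the figures from this report: ~400 ms per S3 range request
# and ~2 small metadata requests per child (variable msg + LUT entry).

def metadata_latency_s(n_children: int,
                       rtt_s: float = 0.4,
                       requests_per_child: int = 2) -> float:
    """Serial worst case: every metadata read is a separate round trip."""
    return n_children * requests_per_child * rtt_s

# Reading just 5 variables: 5 * 2 * 0.4 s = 4 s of metadata latency
# alone, before a single data chunk has been fetched.
print(metadata_latency_s(5))  # 4.0
```

Even with some request pipelining, these scattered reads dwarf the time spent on the actual data fetch.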
Root Cause
The .om file layout stores each child's metadata right after that child's compressed data chunks:
[child 0 data chunks] [child 0 variable msg] [child 0 LUT entries]
[child 1 data chunks] [child 1 variable msg] [child 1 LUT entries]
...
[child N data chunks] [child N variable msg] [child N LUT entries]
[root group message]
[trailer]
For local file access this layout is fine — the OS page cache handles it transparently. But for S3/HTTP range-request access, every metadata read at a new file offset requires a separate HTTP round trip.
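A toy model of this layout shows why the request count scales with the number of children (the sizes and the counting wrapper below are made up for illustration; this is not the real .om parser):

```python
# Toy model of the inline layout: each child's metadata sits right
# after its data chunks, so metadata offsets are scattered across the
# whole file body. A counting "remote file" tallies the round trips a
# metadata scan would need. All region sizes are illustrative.

class CountingRemoteFile:
    def __init__(self, size: int):
        self.size = size
        self.requests = 0

    def read_range(self, offset: int, length: int):
        self.requests += 1  # one HTTP round trip per call
        return (offset, length)

N = 8
CHUNKS, MSG, LUT = 1_000_000, 100, 30   # per-child region sizes (illustrative)
child_stride = CHUNKS + MSG + LUT

f = CountingRemoteFile(size=N * child_stride + 256)  # + root msg, trailer

# Inline layout: child i's variable message and LUT live at
# unpredictable offsets deep inside the file body, so each one
# costs its own request.
for i in range(N):
    base = i * child_stride
    f.read_range(base + CHUNKS, MSG)        # child i variable message
    f.read_range(base + CHUNKS + MSG, LUT)  # child i LUT entries

print(f.requests)  # 2 * N = 16 separate round trips before any data read
```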
Suggested Fix
Consolidate all child metadata into a contiguous region at the end of the file, before the root group message and trailer:
[child 0 data chunks]
[child 1 data chunks]
...
[child N data chunks]
[child 0 variable msg] [child 0 LUT entries] <- all metadata grouped together
[child 1 variable msg] [child 1 LUT entries]
...
[child N variable msg] [child N LUT entries]
[root group message]
[trailer]
This way, a single tail-region read would capture all child metadata + root + trailer. Any subsequent variable read would only need requests for the actual compressed data chunks — the metadata is already cached.
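Continuing the toy model, the consolidated layout collapses the whole metadata phase into one request whenever the tail prefetch reaches the start of the metadata region (sizes again illustrative, not the real .om layout):

```python
# Toy comparison: with all child metadata consolidated at the file
# tail, a single tail-region read (e.g. the last 2 MB) covers every
# variable message and LUT entry plus root message and trailer, so the
# metadata phase costs 1 round trip instead of ~2N. Sizes are
# illustrative, not the real .om layout.

N = 8
CHUNKS, MSG, LUT = 1_000_000, 100, 30
ROOT, TRAILER = 200, 40

file_size = N * (CHUNKS + MSG + LUT) + ROOT + TRAILER
meta_start = N * CHUNKS                 # consolidated metadata begins here
tail_read = 2 * 1024 * 1024             # prefetch the last 2 MB

# One tail read covers all metadata iff it starts at or before meta_start.
covers_all = (file_size - tail_read) <= meta_start
requests_for_metadata = 1 if covers_all else 2 * N
print(requests_for_metadata)  # 1
```

The same tail read fails to help under the current inline layout, because the per-child metadata sits far outside any fixed-size tail region.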
Alternatively, an option in OmFileWriter to control metadata placement would work. The current "inline after data" layout is fine for local access and could remain the default, with an opt-in "consolidated metadata" mode for files intended to be served over HTTP/S3.
Environment
- omfiles Python package version 1.1.0
- Reading via OmFileReader.from_fsspec() with an S3 range-request backend