
Improve REST Support for lazy snapshot loading #16207

Open

grantatspothero wants to merge 1 commit into apache:main from grantatspothero:gn/lazySnapshotLog

Conversation

@grantatspothero
Contributor

grantatspothero commented May 4, 2026

A previous PR added support for lazy snapshot loading: https://github.com/apache/iceberg/pull/6850/changes

This PR improves on that by also lazily loading the snapshotLog.

For tables with high numbers of snapshots (e.g. tables with low-latency commits), this can result in significant memory savings.

Considerations:

  • Wanted to maintain backwards compatibility, so setSnapshotsSupplier was kept and deprecated rather than removed. A sketch of the deferral pattern follows below.
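
As a rough illustration, here is a minimal sketch of the deferral pattern (the class and member names are stand-ins for illustration, not the PR's actual API): the snapshot log is materialized only on first access, mirroring how the existing snapshots supplier defers the snapshots list.

```java
import java.util.List;
import java.util.function.Supplier;

class LazySnapshotLog {
  // Stand-in for org.apache.iceberg.HistoryEntry: two longs per entry.
  record HistoryEntry(long timestampMillis, long snapshotId) {}

  private final Supplier<List<HistoryEntry>> snapshotLogSupplier;
  private volatile List<HistoryEntry> snapshotLog; // null until first access

  LazySnapshotLog(Supplier<List<HistoryEntry>> snapshotLogSupplier) {
    this.snapshotLogSupplier = snapshotLogSupplier;
  }

  List<HistoryEntry> snapshotLog() {
    // Double-checked locking: only the first caller pays the cost of
    // fetching and parsing the log from the REST catalog.
    List<HistoryEntry> result = snapshotLog;
    if (result == null) {
      synchronized (this) {
        result = snapshotLog;
        if (result == null) {
          snapshotLog = result = snapshotLogSupplier.get();
        }
      }
    }
    return result;
  }
}
```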

github-actions bot added the core label May 4, 2026
grantatspothero changed the title from Gn/lazy snapshot log to Improve REST Support for lazy snapshot loading May 4, 2026
grantatspothero force-pushed the gn/lazySnapshotLog branch 2 times, most recently from c9bc366 to c756769 May 4, 2026 17:11
@gaborkaszab
Contributor

Hi @grantatspothero,
I see this is a draft PR, but it grabbed my attention since I was investigating the lazy snapshot loading area recently. Could you help me understand what exactly it is about the snapshot log that worries you? Is it network traffic, memory consumption on the client side, or something else?
The reason I ask is that initially I'd think there isn't much to gain by lazily loading the snapshot log, because each log entry is just 2 longs. So even with thousands of snapshots we're still in low-kilobytes territory in terms of memory usage. Above that number of snapshots you're doomed anyway :)

@grantatspothero
Contributor Author

grantatspothero commented May 4, 2026

Our problem was excessive memory usage due to caching TableMetadata on the client side.

Storing a List<HistoryEntry> in memory is fine for small numbers of snapshots, but each entry takes ~32 bytes, and this grows quickly when a single coordinator service caches Iceberg metadata for many tables in memory.

Example:

  • 1,000 table metadata objects cached in memory
  • each table commits every 30s, with 30 days of snapshot retention = 2*60*24*30 = 86,400, call it ~100K snapshots in the Iceberg metadata
  • 32 bytes * 100K ≈ 3.2 MB of snapshotLog per table
  • 3.2 MB/table * 1,000 tables ≈ 3.2 GB

Note: this is "resident set size", not "total allocations"; the latter tends to be significantly higher due to intermediate allocations while parsing JSON.
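
For concreteness, here is the same arithmetic as a runnable sketch; the 32 bytes/entry figure is an assumption (two long fields plus object header and padding on a typical 64-bit JVM), not a measured value:

```java
public class SnapshotLogFootprint {
  public static void main(String[] args) {
    long snapshotsPerTable = 2L * 60 * 24 * 30; // one commit every 30s for 30 days = 86,400
    long bytesPerEntry = 32;                    // assumed per-HistoryEntry footprint
    long bytesPerTable = snapshotsPerTable * bytesPerEntry;
    long tables = 1000;
    // prints: per table: 2.8 MB, total: 2.8 GB (≈3.2 GB with the ~100K rounding)
    System.out.printf("per table: %.1f MB, total: %.1f GB%n",
        bytesPerTable / 1e6, bytesPerTable * tables / 1e9);
  }
}
```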

For multi-tenant coordinator services (e.g. query engines, cache services) this memory usage is a problem. The biggest memory hog is by far the snapshots array, but snapshotLog is the next biggest. Since Iceberg already defers snapshots, it seemed reasonable to defer snapshotLog as well.

@grantatspothero
Contributor Author

grantatspothero commented May 4, 2026

> Above that number of snapshots you're doomed anyway :)

It is becoming more common to have large numbers of snapshots in Iceberg due to the prevalence of streaming ingestion and low-latency commits.

See this mailing list discussion: https://www.mail-archive.com/dev@iceberg.apache.org/msg12764.html
Examples: the kafka-connect Iceberg sink, Confluent Tableflow, Starburst streaming ingestion.

This doesn't solve the full problem raised in that mailing list thread (writers still pay the full cost of writing snapshots/snapshotLog), but it does solve it for readers. And for query engine and caching use cases, reads >> writes, so this could be beneficial.

Previously only lazily loaded snapshots
grantatspothero marked this pull request as ready for review May 4, 2026 22:03
@gaborkaszab
Contributor

Thank you for the explanation, @grantatspothero!
I feel that ~100K-snapshot tables are at the very extreme end of use cases. I'm also wondering: if the table changes every 30 seconds, is there any point in storing it in a cache? We'd need to reload it frequently anyway.
I was going to advise you to reach out to the dev@ list for wider community feedback on this, but I see you've already done so, thanks!

I can take a look at the code; if the improvement is simple enough, I don't see why not to include it. If it's messy or complicated, we might need some community support to get it through.

@grantatspothero
Contributor Author

grantatspothero commented May 5, 2026

> If the table changes every 30 seconds, is there any point in storing it in a cache?

Two different definitions of cache:

  1. "Within query metadata caching". Within a single query's lifetime, TableMetadata must live in coordinator memory. Queries are usually short but sometimes can take hours, wasting coordinator memory for hours for long running queries. This wasted memory is exacerbated by: # of concurrent queries and # of tables per query. Compare this to the hive table model where coordinator memory is mostly bounded.
  2. "Cross-query metadata caching". I believe this is what you are talking about. Trino does not support cross-query table metadata caching today, but some engines do and have problems. With a cross-query cache it is difficult to control caching at a fine granularity. "Cache these long lived table metadatas but not these constantly changing ones"
