s3_scrubber snapshot can't consistently snapshot sharded tenants #7573
This test/debug tool was recently added in #7444.
This ticket tracks a limitation in the tool when used for sharded tenants which are being written to.
It works well enough for shards in that it fetches all the data for all the shards, and one can start up a pageserver. However, because shards advance their disk_consistent_lsn independently, trying to run an endpoint against the downloaded data has a couple of problems:
The impact is that you get a set of shards that are each independently valid, so they can be used for e.g. testing data consistency or compaction at the pageserver level, but they can't be used to run a postgres instance.
To solve this, we probably need to make the `tenant-import` command smart enough to trim back imported data to a specific LSN (the lowest disk_consistent_lsn of the shards), including trimming layer files. This could either be done in the scrubber or as a pageserver API (perhaps as part of the `tenant-import` flow).
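To make the trimming idea concrete, here is a rough sketch of the selection step, not the scrubber's or pageserver's actual code. It assumes each shard's disk_consistent_lsn is available in the usual Postgres `hi/lo` hex text form, and that each layer file covers a half-open LSN range. The `Layer` type, the shard names, and the `trim_target` and `classify` helpers are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


def parse_lsn(text: str) -> int:
    """Parse a Postgres-style LSN such as '0/16B3748' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


class Action(Enum):
    KEEP = "keep"  # layer lies entirely at or below the cutoff
    TRIM = "trim"  # layer straddles the cutoff and would need rewriting
    DROP = "drop"  # layer lies entirely above the cutoff


@dataclass(frozen=True)
class Layer:
    # Hypothetical stand-in for a layer file's metadata.
    name: str
    lsn_start: int  # inclusive
    lsn_end: int    # exclusive


def trim_target(shard_lsns: dict[str, int]) -> int:
    """The cutoff is the lowest disk_consistent_lsn across all shards."""
    return min(shard_lsns.values())


def classify(layer: Layer, cutoff: int) -> Action:
    """Decide what to do with one layer, keeping only data with LSN <= cutoff."""
    if layer.lsn_start > cutoff:
        return Action.DROP
    if layer.lsn_end <= cutoff + 1:
        return Action.KEEP
    return Action.TRIM


# Example: three shards that have advanced to different LSNs (made-up values).
shards = {
    "shard-0": parse_lsn("0/1A00"),
    "shard-1": parse_lsn("0/1800"),
    "shard-2": parse_lsn("0/1900"),
}
cutoff = trim_target(shards)

layers = [
    Layer("old", parse_lsn("0/1000"), parse_lsn("0/1700")),
    Layer("straddling", parse_lsn("0/1700"), parse_lsn("0/1A00")),
    Layer("new", parse_lsn("0/1900"), parse_lsn("0/1A00")),
]
plan = {layer.name: classify(layer, cutoff) for layer in layers}
```

In this sketch only the `TRIM` case requires rewriting a layer file. `KEEP` and `DROP` can be handled by including or excluding whole files, which is why the straddling layers are the hard part wherever the trimming ends up living.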