Description
Background
Before `index_part.json` became always-authoritative, we had to handle duplicate layers. See the RFC for an illustration of how duplicate layers could happen:
`neon/docs/rfcs/027-crash-consistent-layer-map-through-index-part.md`, lines 41 to 50 at commit `a8e6d25`
As of #5198, we should not be exposed to that problem anymore.
Problem 1
But we still have:
- code in the Pageserver that handles duplicate layers
- tests in the test suite that demonstrate the problem using a failpoint
However, the test doesn't use the failpoint to induce a crash that could legitimately happen in production. What it does instead is return early with an `Ok()`, so that the Pageserver code that handles duplicate layers (item 1) actually gets exercised. That early return would be a bug in the routine if it happened in production. So, the tests in the test suite exist for their own sake; they don't regression-test any production behavior.
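For concreteness, a minimal sketch of that failpoint-driven early return, using the `fail` crate's closure form (the failpoint name and the routine are hypothetical, not the actual pageserver code):

```rust
/// Hypothetical routine that writes a layer file to disk.
fn write_layer_file(/* ... */) -> std::io::Result<()> {
    // Under test, this failpoint makes the routine report success without
    // doing any of the work below. In production, an early return here
    // would itself be a bug; no legitimate crash produces this state.
    fail::fail_point!("skip-layer-write", |_| Ok(()));

    // ... actually write and fsync the layer file ...
    Ok(())
}
```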
Problem 2
Further, if production code did create a duplicate layer (nowadays it doesn't!), I think the Pageserver code that handles that condition (item 1 above) is too little, too late:
- the code handles it by discarding the newer `struct Layer`
- however, on disk, we have already overwritten the old layer file with the new one
- the fact that we do the overwrite atomically doesn't matter, because if the new layer file is not bit-identical, we have a cache coherency problem:
  - the PS `PageCache` block cache still caches the old bit pattern
  - `blob_io` offsets stored in variables are based on the pre-overwrite bit pattern / offsets
  - => reading at these offsets from the new file might yield different data than before (see the sketch after this list)
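To make the coherency hazard concrete, here is a small self-contained sketch (file names and contents are made up, not pageserver code) of a reader applying a blob offset captured against the old bit pattern after the file has been atomically replaced:

```rust
use std::fs;
use std::io::{Read, Seek, SeekFrom};

fn main() -> std::io::Result<()> {
    // "Old" layer file: a blob lives at byte offset 4.
    fs::write("layer", b"hdr-OLDBLOB")?;
    let blob_offset = 4; // remembered by the reader, like blob_io offsets

    // A non-bit-identical "duplicate" is moved into place atomically.
    fs::write("layer.tmp", b"header-NEWBLOB")?;
    fs::rename("layer.tmp", "layer")?; // atomic, but that doesn't help

    // Re-reading at the remembered offset now yields different bytes,
    // because the offset was computed against the old bit pattern.
    let mut f = fs::File::open("layer")?;
    f.seek(SeekFrom::Start(blob_offset))?;
    let mut buf = [0u8; 7];
    f.read_exact(&mut buf)?;
    assert_ne!(&buf, b"OLDBLOB"); // we read "er-NEWB" instead
    Ok(())
}
```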
Solution
- Remove the test suite code
- Remove the Pageserver code that handles duplicate layers too late
- Add a panic/abort in the Pageserver code for when we'd overwrite a layer
- Use `RENAME_NOREPLACE` to detect this correctly; a sketch follows below

Concern originally raised in #7707 (comment)
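A minimal sketch of the `RENAME_NOREPLACE` idea, assuming Linux and calling `renameat2(2)` through the `libc` crate (the helper name and error handling are illustrative, not the actual patch). The flag makes the existence check and the rename a single atomic step, so there is no window in which we could still overwrite:

```rust
use std::ffi::CString;
use std::io;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Hypothetical helper: atomically move a freshly written temp file into
/// its final place, crashing if a layer file already exists there.
fn rename_noreplace_or_die(tmp: &Path, dst: &Path) -> io::Result<()> {
    let tmp_c = CString::new(tmp.as_os_str().as_bytes()).unwrap();
    let dst_c = CString::new(dst.as_os_str().as_bytes()).unwrap();
    // SAFETY: both arguments are valid NUL-terminated C strings.
    let ret = unsafe {
        libc::renameat2(
            libc::AT_FDCWD,
            tmp_c.as_ptr(),
            libc::AT_FDCWD,
            dst_c.as_ptr(),
            libc::RENAME_NOREPLACE as libc::c_uint,
        )
    };
    if ret == 0 {
        return Ok(());
    }
    let err = io::Error::last_os_error();
    if err.raw_os_error() == Some(libc::EEXIST) {
        // We were about to create a duplicate layer. That must never
        // happen anymore, so crash loudly instead of overwriting.
        panic!("would overwrite existing layer file: {}", dst.display());
    }
    Err(err)
}
```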