pageserver: fix cont lsn jump on vectored read path #7412

VladLazar · 2024-04-17T17:31:19Z

Problem

Vectored read path may return an image that's newer than the request lsn under certain circumstances.

  LSN
    ^
    |
    |
500 | ------------------------- -> branch point
400 |        X
300 |        X
200 | ------------------------------------> requested lsn
100 |        X
    |---------------------------------> Key

Legend:
* X - page images

The vectored read path inspects each ancestor timeline one by one starting from the current one.
When moving into the ancestor timeline, the current code resets the current search lsn (called cont_lsn in code)
to the lsn of the ancestor timeline (here).

For instance, if the request lsn was 200, we would:

Look into the current timeline and find nothing for the key
Descend into the ancestor timeline and set cont_lsn=500
Return the page image at LSN 400

Myself and Christian find it very unlikely for this to have happened in prod since the vectored read path
is always used at the last record lsn.

This issue was found by a regress test during the work to migrate get page handling to use the vectored
implementation. I've applied my fix to that wip branch and it fixed the issue.

Summary of changes

The fix is to set the current search lsn to the min between the requested LSN and the ancestor lsn.
Hence, at step 2 above we would set the current search lsn to 200 and ignore the images above that.

A test illustrating the bug is also included. Fails without the patch and passes with it.

Checklist before requesting a review

I have performed a self-review of my code.
If it is a core feature, I have added thorough tests.
Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

Do not forget to reformat commit message to not include the above checklist

github-actions · 2024-04-17T17:40:32Z

2766 tests run: 2648 passed, 0 failed, 118 skipped (full report)

Flaky tests (1)

Postgres 16

test_vm_bit_clear_on_heap_lock: debug

Code coverage* (full report)

functions: 28.0% (6443 of 22999 functions)
lines: 46.8% (45431 of 97156 lines)

* collected from Rust tests only

_{The comment gets automatically updated with the latest test results
11a1d23 at 2024-04-18T15:24:36.919Z :recycle:}

pageserver/src/tenant.rs

koivunej

The test looks convincing enough, but I cannot make the connection to the actual vectored read path change (min).

Does the old Timeline::get_value_reconstruct_data have this problem? There, we move to ancestor if cont_lsn is less than ancestor_lsn, and cont_lsn is originally request_lsn + 1. Okay, I think these are now aligned.

VladLazar · 2024-04-18T13:19:02Z

The test looks convincing enough, but I cannot make the connection to the actual vectored read path change (min).

Does the old Timeline::get_value_reconstruct_data have this problem? There, we move to ancestor if cont_lsn is less than ancestor_lsn, and cont_lsn is originally request_lsn + 1. Okay, I think these are now aligned.

It's quite tricky to see the alignment between the read paths since they're structured differently. I think going through the example in the cover letter is the best way of understanding the issue. The test produces a very similar setup, albeit with a lot of boilerplate. Let me know if you wish to chat about it - the bug is quite obvious once you know about it :)

VladLazar · 2024-04-18T17:40:13Z

I'll merge now to get some validation for the change before next weeks release.

hlinnaka · 2024-04-19T13:18:50Z

Great explanation and test case!

It took me a while to understand why the non-vectored path is not suffering from the same bug. I believe it's because `get_reconstruct_data' has this in the loop:


            // Recurse into ancestor if needed
            if is_inherited_key(key) && Lsn(cont_lsn.0 - 1) <= timeline.ancestor_lsn {
                trace!(
                    "going into ancestor {}, cont_lsn is {}",
                    timeline.ancestor_lsn,
                    cont_lsn
                );

                timeline_owned = timeline.get_ready_ancestor_timeline(ctx).await?;
                timeline = &*timeline_owned;
                prev_lsn = None;
                continue 'outer;
            }

Wouldn't hurt to explicitly test for the same scenario in the non-vectored case too. Or maybe we already have a test for that?

It would be useful to cross-check that get_reconstruct_data produces the same result as get_vectored. For example, at the end of get_reconstruct_data, call get_vectored with a corresponding single-item KeySpace, and assert that the result is the same.

VladLazar · 2024-04-19T13:36:41Z

It took me a while to understand why the non-vectored path is not suffering from the same bug. I believe it's because `get_reconstruct_data' has this in the loop:


            // Recurse into ancestor if needed
            if is_inherited_key(key) && Lsn(cont_lsn.0 - 1) <= timeline.ancestor_lsn {
                trace!(
                    "going into ancestor {}, cont_lsn is {}",
                    timeline.ancestor_lsn,
                    cont_lsn
                );

                timeline_owned = timeline.get_ready_ancestor_timeline(ctx).await?;
                timeline = &*timeline_owned;
                prev_lsn = None;
                continue 'outer;
            }

Right. The non-vectored read path does not touch the cont_lsn when descending into the ancestor timeline. cont_lsn should never increase for it.

Wouldn't hurt to explicitly test for the same scenario in the non-vectored case too. Or maybe we already have a test for that?

It would be useful to cross-check that get_reconstruct_data produces the same result as get_vectored. For example, at the end of get_reconstruct_data, call get_vectored with a corresponding single-item KeySpace, and assert that the result is the same.

We already have this cross checking mechanism in place (see validate_get_vectored_impl). It is currently enabled for regression tests and and in staging, but disabled in prod.

For context, I caught this bug with the cross validation mechanism on my wip branch to migrate the get page requests to use the vectored path. The non-vectored path was returning the correct result, but it's a fair point that the test could be updated to exercise the non-vectored read path too.

VladLazar force-pushed the vlad/fix-cont-lsn-jump branch 3 times, most recently from 383699f to 5cd17cf Compare April 17, 2024 18:15

VladLazar requested a review from problame April 17, 2024 18:30

VladLazar marked this pull request as ready for review April 17, 2024 18:30

VladLazar requested a review from a team as a code owner April 17, 2024 18:30

pageserver: fix cont lsn jump on vectored read path

ee25fdf

VladLazar force-pushed the vlad/fix-cont-lsn-jump branch from 5cd17cf to ee25fdf Compare April 18, 2024 10:58

VladLazar requested a review from koivunej April 18, 2024 11:50

koivunej reviewed Apr 18, 2024

View reviewed changes

pageserver/src/tenant.rs Outdated Show resolved Hide resolved

koivunej reviewed Apr 18, 2024

View reviewed changes

pageserver/src/tenant.rs Outdated Show resolved Hide resolved

koivunej reviewed Apr 18, 2024

View reviewed changes

pageserver/src/tenant.rs Outdated Show resolved Hide resolved

koivunej reviewed Apr 18, 2024

View reviewed changes

pageserver/src/tenant.rs Show resolved Hide resolved

koivunej approved these changes Apr 18, 2024

View reviewed changes

review: adress Joonas' test comments

11a1d23

jcsp approved these changes Apr 18, 2024

View reviewed changes

VladLazar merged commit 6eb946e into main Apr 18, 2024
53 checks passed

VladLazar deleted the vlad/fix-cont-lsn-jump branch April 18, 2024 17:40

VladLazar mentioned this pull request Apr 19, 2024

[DNM ]pageserver: coalesce read paths testing branch #7398

Closed

5 tasks

VladLazar mentioned this pull request Apr 23, 2024

pageserver: coalesce read paths #7477

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pageserver: fix cont lsn jump on vectored read path #7412

pageserver: fix cont lsn jump on vectored read path #7412

VladLazar commented Apr 17, 2024 •

edited

github-actions bot commented Apr 17, 2024 •

edited

Postgres 16

koivunej left a comment

VladLazar commented Apr 18, 2024

VladLazar commented Apr 18, 2024

hlinnaka commented Apr 19, 2024

VladLazar commented Apr 19, 2024

pageserver: fix cont lsn jump on vectored read path #7412

pageserver: fix cont lsn jump on vectored read path #7412

Conversation

VladLazar commented Apr 17, 2024 • edited

Problem

Summary of changes

Checklist before requesting a review

Checklist before merging

github-actions bot commented Apr 17, 2024 • edited

2766 tests run: 2648 passed, 0 failed, 118 skipped (full report)

Postgres 16

Code coverage* (full report)

koivunej left a comment

Choose a reason for hiding this comment

VladLazar commented Apr 18, 2024

VladLazar commented Apr 18, 2024

hlinnaka commented Apr 19, 2024

VladLazar commented Apr 19, 2024

VladLazar commented Apr 17, 2024 •

edited

github-actions bot commented Apr 17, 2024 •

edited