Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_lsn_by_timestamp: clamp commit_lsn to be >= min_lsn #7488

Merged
merged 1 commit into from
Apr 23, 2024

Conversation

arpad-m
Copy link
Member

@arpad-m arpad-m commented Apr 23, 2024

There was an edge case where get_lsn_by_timestamp/find_lsn_for_timestamp could have returned an lsn that is before the limits we enforce: when we did find SLRU entries with timestamps before the one we search for.

The API contract of get_lsn_by_timestamp is to not return something before the anchestor lsn.

cc https://neondb.slack.com/archives/C03F5SM1N02/p1713871064147029

@arpad-m arpad-m requested a review from a team as a code owner April 23, 2024 16:22
@arpad-m arpad-m requested review from jcsp and koivunej April 23, 2024 16:22
Copy link
Contributor

@jcsp jcsp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Is there any way to craft a test case for this?

Copy link

2766 tests run: 2645 passed, 0 failed, 121 skipped (full report)


Flaky tests (1)

Postgres 16

  • test_timeline_size_quota_on_startup: release

Code coverage* (full report)

  • functions: 28.1% (6478 of 23035 functions)
  • lines: 47.0% (46008 of 97944 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
623c3a6 at 2024-04-23T17:06:16.779Z :recycle:

@arpad-m
Copy link
Member Author

arpad-m commented Apr 23, 2024

This is an edge case: found_smaller needs to be true (otherwise we already return min_lsn), so the search must have found a timestamp before the specified timestamp at the min_lsn. Also, the search may not yield any lsn after the lsn we are looking for.

I'll be thinking about a test case. Depending on urgency, @koivunej feel free to merge this if you think a fix is urgent.

@arpad-m
Copy link
Member Author

arpad-m commented Apr 23, 2024

Hmmm tried for two hours to create a reproducer but the issue is that this is really an edge case.

This (added to test_lsn_mapping) is still successful before this patch:

        # Regression test for an edge case fixed by #7488
        # Create a bunch of branches and probe the returned lsn at that timestamp
        found_with_present = False
        for i in range(10, len(tbl), 100):
            probe_timestamp_before = cast(datetime, tbl[i - 1][1])
            probe_timestamp = cast(datetime, tbl[i][1])
            result_before = client.timeline_get_lsn_by_timestamp(
                tenant_id, timeline_id, probe_timestamp_before
            )
            assert result_before["kind"] not in ["past", "nodata"]
            parent_lsn_before = Lsn(result_before["lsn"])

            result = client.timeline_get_lsn_by_timestamp(tenant_id, timeline_id, probe_timestamp)
            assert result["kind"] not in ["past", "nodata"]
            parent_lsn = Lsn(result["lsn"])

            for branch_lsn in range(int(parent_lsn_before), int(parent_lsn), 8):
                branch_lsn = Lsn(branch_lsn)
                timeline_id_child = env.neon_cli.create_branch(
                    f"test_lsn_mapping_child_loop_{i}_{branch_lsn}",
                    tenant_id=tenant_id,
                    ancestor_branch_name="test_lsn_mapping",
                    ancestor_start_lsn=branch_lsn,
                )

                result = client.timeline_get_lsn_by_timestamp(
                    tenant_id, timeline_id_child, probe_timestamp_before
                )
                assert result["kind"] != "nodata"
                if result["kind"] in ["present", "future"]:
                    found_with_present = True
                lsn = Lsn(result["lsn"])
                assert lsn >= branch_lsn, "Grandchild lsn needs to be valid"
                timeline_id_child = env.neon_cli.create_branch(
                    f"test_lsn_mapping_child_loop_{i}_{branch_lsn}_grandchild",
                    tenant_id=tenant_id,
                    ancestor_branch_name="test_lsn_mapping",
                    ancestor_start_lsn=lsn,
                )
        assert found_with_present

@arpad-m arpad-m merged commit 18fd73d into main Apr 23, 2024
53 checks passed
@arpad-m arpad-m deleted the arpad/max_by_min_lsn branch April 23, 2024 22:46
arpad-m added a commit that referenced this pull request Apr 26, 2024
…7520)

Implements an approach different from the one #7488 chose: We now return
`past` instead of `present` (or`future`) when encountering the edge case
where commit_lsn < min_lsn. In my opinion, both `past` and `present` are
correct responses, but past is slightly better as the lsn returned by
`present` with #7488 is one too "new". In practice, this shouldn't
matter much, but shrug.

We agreed in slack that this is the better approach:
https://neondb.slack.com/archives/C03F5SM1N02/p1713871064147029
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants