Support plan replayer for historical stats #45038

Closed
time-and-fate opened this issue Jun 28, 2023 · 0 comments · Fixed by #44592

Background

Historical stats are already supported and enabled by default, so it would be convenient for users to be able to download a plan replayer file that contains historical stats.
To achieve this, we need to extend the syntax of the PLAN REPLAYER DUMP statement so that it can specify the expected time, and make the dump logic fetch and dump the historical stats for that time. The code of the historical stats HTTP API can be reused for this.

Design

Syntax

Add a new clause WITH STATS AS OF TIMESTAMP [datetime | TSO] after PLAN REPLAYER DUMP. The AS OF TIMESTAMP part is consistent with the stale read syntax.

For example:

plan replayer dump with stats as of timestamp '442012134592479233' explain [analyze] select ...
plan replayer dump with stats as of timestamp '2023-06-14 17:00:00' explain [analyze] select ...
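
The two accepted forms can be related to each other with a small sketch. It relies on the TSO layout of physical milliseconds shifted left by 18 bits plus a logical counter. The function names, the "digits only means TSO" rule, and the use of UTC in place of the session time zone are assumptions made for illustration, not the actual parser logic.

```python
from datetime import datetime, timedelta, timezone

PHYSICAL_SHIFT_BITS = 18  # TSO = (physical milliseconds << 18) | logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tso_to_datetime(tso: int) -> datetime:
    """Return the physical (wall-clock) part of a TSO as a UTC datetime."""
    physical_ms = tso >> PHYSICAL_SHIFT_BITS
    return EPOCH + timedelta(milliseconds=physical_ms)


def datetime_to_tso(dt: datetime, logical: int = 0) -> int:
    """Build a TSO from a timezone-aware datetime and a logical counter."""
    physical_ms = (dt - EPOCH) // timedelta(milliseconds=1)
    return (physical_ms << PHYSICAL_SHIFT_BITS) | logical


def parse_as_of(value: str) -> int:
    """Hypothetical helper: normalize either literal form to a TSO.

    Assumption: a digits-only literal is a TSO; anything else is a
    'YYYY-MM-DD HH:MM:SS' datetime, interpreted here as UTC.
    """
    if value.isdigit():
        return int(value)
    dt = datetime.strptime(value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return datetime_to_tso(dt)
```

For example, `tso_to_datetime(parse_as_of('2023-06-14 17:00:00'))` gives back the same wall-clock time, which is the kind of consistency the tests below check against sql_meta.toml.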

Behavior

  • If the user specified a valid time and the corresponding historical stats are available, the stats in the replayer file would be the expected historical stats.
  • The TS used to fetch the historical stats would be recorded in sql_meta.toml in the plan replayer zip file.
  • If the expected historical stats are not available, it would dump the latest stats (the same as when there's no specified time), and the corresponding table names would be recorded in errors.txt in the plan replayer zip file.
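
The fallback rule described above can be sketched as follows. This is a minimal model, not the dump code itself: `collect_stats`, `fetch_historical`, and `fetch_latest` are hypothetical names, and the error message text is made up.

```python
from typing import Callable, Dict, List, Optional, Tuple


def collect_stats(
    tables: List[str],
    as_of_ts: Optional[int],
    fetch_historical: Callable[[str, int], Optional[dict]],
    fetch_latest: Callable[[str], dict],
) -> Tuple[Dict[str, dict], List[str]]:
    """Pick the stats to dump for each table.

    Returns (stats per table, lines for errors.txt). Both fetch callbacks
    are hypothetical stand-ins for the real stats lookups.
    """
    stats: Dict[str, dict] = {}
    errors: List[str] = []
    for table in tables:
        historical = None
        if as_of_ts is not None:
            historical = fetch_historical(table, as_of_ts)
        if historical is not None:
            stats[table] = historical
            continue
        # No time specified, or no historical stats found: use the latest stats.
        stats[table] = fetch_latest(table)
        if as_of_ts is not None:
            # Only a failed historical lookup is reported in errors.txt.
            errors.append(f"failed to fetch historical stats for table {table}")
    return stats, errors
```

Recording the error per table, rather than failing the whole dump, matches the behavior above: the replayer file is still produced, and the reader can tell which tables carry the latest stats instead of the requested ones.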

Tests

  • Create a table (mark as time1), insert some data, then analyze (mark as time2), then insert some data, then analyze again (mark as time3).
    • Try to use the PLAN REPLAYER DUMP ... AS OF TIMESTAMP time1 ... to get a plan replayer file. In the zip file:
      • The recorded time in the sql_meta.toml should be consistent with time1.
      • The stats should be the same as the stats of time3.
      • It should be recorded in errors.txt that it failed to fetch historical stats of the table.
    • Try to get plan replayer file at time2. In the zip file:
      • The recorded time in the sql_meta.toml should be consistent with time2.
      • The stats should be the same as the stats of time2.
      • There should be no errors.txt.
    • Try to get plan replayer file at time3. In the zip file:
      • The recorded time in the sql_meta.toml should be consistent with time3.
      • The stats should be the same as the stats of time3.
      • There should be no errors.txt.
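
The expected outcomes of the three cases can be checked against a small in-memory model. It assumes a lookup returns the newest snapshot taken at or before the requested time, which is an assumption about the lookup semantics rather than something stated above. `StatsHistory`, `dump`, and the integer timestamps are all hypothetical.

```python
from bisect import bisect_right


class StatsHistory:
    """Toy model of one table's stats history."""

    def __init__(self):
        self._times = []
        self._snapshots = []

    def analyze(self, ts, stats):
        # Assumes analyze calls arrive in increasing timestamp order.
        self._times.append(ts)
        self._snapshots.append(stats)

    def as_of(self, ts):
        """Newest snapshot taken at or before ts, or None if there is none."""
        i = bisect_right(self._times, ts)
        return self._snapshots[i - 1] if i else None

    def latest(self):
        return self._snapshots[-1] if self._snapshots else None


def dump(history, ts):
    """Model of the zip contents: recorded ts, dumped stats, errors.txt lines."""
    snapshot = history.as_of(ts)
    if snapshot is None:
        return {"ts": ts, "stats": history.latest(),
                "errors": ["failed to fetch historical stats"]}
    return {"ts": ts, "stats": snapshot, "errors": []}


# time1: table created; time2: first analyze; time3: second analyze.
TIME1, TIME2, TIME3 = 100, 200, 300
history = StatsHistory()
history.analyze(TIME2, "stats@time2")
history.analyze(TIME3, "stats@time3")
```

In this model, `dump(history, TIME1)` falls back to the time3 stats and reports an error, while dumps at TIME2 and TIME3 return their own snapshots with no errors.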