Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stats: add explicit AOST syntax to ANALYZE #96430

Open
dbist opened this issue Feb 2, 2023 · 4 comments
Open

stats: add explicit AOST syntax to ANALYZE #96430

dbist opened this issue Feb 2, 2023 · 4 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation T-sql-queries SQL Queries Team
Projects

Comments

@dbist
Copy link
Contributor

dbist commented Feb 2, 2023

Is your feature request related to a problem? Please describe.

When you run an explain command, the engine may report back saying to run ANALYZE and not CREATE STATS. If you run ANALYZE and look at the schema jobs, you'll see CREATE STATS WITH OPTIONS AS OF SYSTEM TIME. Running ANALYZE with AOST explicitly will break however, even though they are analogous.

Describe the solution you'd like

I understand that ANALYZE runs with AOST in the background and adding syntactic sugar is redundant but perhaps a bit confusing to the end user as executing CREATE STATS WITH AOST explicitly works. There's also an opportunity to run ANALYZE with a different AOST interval than default. Yes, then you can just run CREATE STATS with that interval but we also should strive for being consistent.

From Yahor:

there could be value in overriding AOST that we add automatically. The main goal of adding AOST for stats collection is to use “inconsistent scan”, and for that purpose AOST -0.001ms is sufficient. However, one might be ok with running ANALYZE further into the past in order to utilize follower reads, and then ANALYZE AOST -5s or something would be good.

The syntax I imagine should be ANALYZE t1 AS OF SYSTEM TIME. I'm surprised CREATE STATS syntax is CREATE STATISTICS "" FROM t1 WITH OPTIONS AS OF SYSTEM TIME but there must be a good reason to use WITH OPTIONS, which is not common to the other uses of AOST for backups, SELECT, etc.

Describe alternatives you've considered

Run CREATE STATS WITH AOST instead.

Additional context

CREATE STATISTICS "" FROM t1 WITH OPTIONS AS OF SYSTEM TIME '-0.001ms';
and
ANALYZE t1 WITH OPTIONS AS OF SYSTEM TIME '-0.001ms'; is not possible? I did try ANALYZE t1 AS OF SYSTEM TIME '-0.001ms'; and it doesn’t work either.

The following however works

BEGIN;
SET TRANSACTION AS OF SYSTEM TIME follower_read_timestamp();
ANALYZE t1;
COMMIT;

Jira issue: CRDB-24121

@dbist dbist added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Feb 2, 2023
@yuzefovich yuzefovich added this to Triage in SQL Queries via automation Feb 2, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Feb 2, 2023
@yuzefovich
Copy link
Member

From Alex Entin:

then ANALYZE AOST -5s or something would be good

Adding an angle, fwiw: more and more frequently getting into situations when need to increase the closed ts into minutes, to avoid persistent retries that do lots of harm. That defeats follower reads in stats because of hard coded AOST. A hard coded number seems inflexible, yet putting something like [conceptually] kv.closed_timestamp.target_duration + 1ms creates other potential problems e.g. stale / irrelevant stats. Not sure what a good solution is, but needs to be better than any coded number perhaps

From Yahor Yuzefovich:
The main rationale for adding AOST '-0.001ms' automatically to ANALYZE was not the desire to use the follower reads, but to trigger a special “inconsistent scan” behavior. -0.001ms is the smallest we could go. Before, we had ANALYZE perform stats collection as of current time, so we wanted the smallest change in behavior. CREATE STATISTICS gives enough flexibility to specify AOST as determined by the user. However, the automatic table collection uses AOST -30s and currently not configurable, so we probably could be more flexible there.

@mgartner
Copy link
Collaborator

mgartner commented Feb 2, 2023

I see no reason to add an AOST option to ANALAYZE. It's a Postgres command that we support for compatibility reasons. The AOST is as small as possible on purpose so that stats are as up-to-date as possible, and users aren't surprised by stale stats and bad plans immediately after running ANALYZE. If they need more control, they should reach for CREATE STATISTICS.

@mgartner
Copy link
Collaborator

mgartner commented Feb 7, 2023

One use-case for a custom AOST with ANALYZE is to use follower reads for stats collection, which would require a AOST further in the past than the default AOST of ANALYZE.

@michae2 michae2 moved this from Triage to Backlog in SQL Queries Feb 7, 2023
@michae2
Copy link
Collaborator

michae2 commented Feb 7, 2023

Tagging @taroface during triage discussion about documenting ANALYZE vs CREATE STATISTICS re AOST.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) docs-done docs-known-limitation T-sql-queries SQL Queries Team
Projects
Status: Backlog
SQL Queries
Backlog (DO NOT ADD NEW ISSUES)
Development

No branches or pull requests

5 participants