Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlstats: set default value of sql.stats.flush.minimum_interval to 10m #123466

Closed
wants to merge 3 commits into from

Conversation

xinhaoz
Copy link
Member

@xinhaoz xinhaoz commented May 2, 2024

sqlstats: cleanup flush, move min flush interval check

This commit moves the check previously in MaybeFlush that
aborts the flush if not enough time has passed since the
last flush operation began to the sql stats flush loop.

MaybeFlush should be called by other functions to trigger
a flush of sql stats immediately (e.g. on server drain). The
minumum flush interval is only relevant to the sql stats flush
worker. Other checks, such as aborting due to flush being
disabled or due to the maximum number of rows being reached,
remain with MaybeFlush.

Epic: none
Part of: #123091

Release note: None

sqlstats: add metric tracking skipped sql stats flushes

This commit adds the metric sql.stats.flushes.skipped.count
which tracks the number of flushes skipped (triggered but
not started).

Epic: none
Part of : #123091

Release note (ops change): New metric, sql.stats.flushes.skipped.count
tracks the number of sql stats flushes that were skipped either due to
not meeting the minimum flush duration or the system tables being full.

sqlstats: set default value of sql.stats.flush.minimum_interval to 10m

This commit sets the minimum flush interval for sql stats to be
equivalent to the default flush interval, 10m. This ensures that
the flush will not be allowed to trigger consecutively when there
is constant memory pressure (ex. due to hitting in-memory fingerprint
limit) on the sql stats system. We want to prevent the flush from
constantly running in the BG since it can use up resources.

Fixes: #123091
Epic: CRDB-37544

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@xinhaoz xinhaoz marked this pull request as ready for review May 2, 2024 15:23
@xinhaoz xinhaoz requested a review from a team as a code owner May 2, 2024 15:23
@xinhaoz xinhaoz requested review from nkodali, dhartunian and abarganier and removed request for a team May 2, 2024 15:23
@dhartunian dhartunian added backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.1.0-rc FROZEN: requires AlexL's approval via #db-backports-24-1-0-release labels May 2, 2024
SELECT count(*)
FROM system.transaction_statistics
WHERE app_name = 'min_flush_test'
`, [][]string{{"1"}})
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try using flushes skipped metric to check here.

query += `,1`
}
sqlConn.Exec(t, query)
}

observerConn.CheckQueryResults(t, `
SELECT count(*)
FROM crdb_internal.statement_statistics
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check persisted table here

This commit sets the minimum flush interval for sql stats to be
equivalent to the default flush interval, 10m. This ensures that
the flush will not be allowed to trigger consecutively when there
is constant memory pressure (ex. due to hitting in-memory fingerprint
limit) on the sql stats system. We want to prevent the flush from
constantly running in the BG since it can use up resources.

Fixes: cockroachdb#123091
Epic: CRDB-37544

Release note: None
@xinhaoz
Copy link
Member Author

xinhaoz commented May 3, 2024

As per discussion, opened #123567 with just the CS change. I'll modify this PR as a follow-up to that.

@xinhaoz xinhaoz changed the title sqlstats: set default value of sql.stats.flush.minimum_interval to 10m [wip] sqlstats: correct timer reset value when flushing too early May 3, 2024
@xinhaoz xinhaoz removed the backport-24.1.0-rc FROZEN: requires AlexL's approval via #db-backports-24-1-0-release label May 3, 2024
This commit moves the check previously in `MaybeFlush` that
aborts the flush if not enough time has passed since the
last flush operation began to the sql stats flush loop.

`MaybeFlush` should be called by other functions to trigger
a flush of sql stats immediately (e.g. on server drain). The
minumum flush interval is only relevant to the sql stats flush
worker. Other checks, such as aborting due to flush being
disabled or due to the maximum number of rows being reached,
remain with `MaybeFlush`.

Epic: none
Part of: cockroachdb#123091

Release note: None
This commit adds the metric `sql.stats.flushes.skipped.count`
which tracks the number of flushes skipped (triggered but
not started).

Epic: none
Part of : cockroachdb#123091

Release note (ops change): New metric, `sql.stats.flushes.skipped.count`
tracks the number of sql stats flushes that were skipped either due to
not meeting the minimum flush duration or the system tables being full.
@xinhaoz xinhaoz changed the title [wip] sqlstats: correct timer reset value when flushing too early sqlstats: correct timer reset value when flushing too early May 3, 2024
@xinhaoz xinhaoz changed the title sqlstats: correct timer reset value when flushing too early sqlstats: set default value of sql.stats.flush.minimum_interval to 10m May 3, 2024
@xinhaoz
Copy link
Member Author

xinhaoz commented May 3, 2024

@dhartunian Reviving this PR and closed the one linked above. I omited the small portion that was technically a bug fix, but I think all these other changes can be batched together. Thoughts?

@xinhaoz xinhaoz requested a review from dhartunian May 3, 2024 19:17
Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: just some nits

Reviewed 1 of 2 files at r3, 7 of 7 files at r4, 2 of 3 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @abarganier, @nkodali, and @xinhaoz)


pkg/sql/sqlstats/persistedsqlstats/flush_test.go line 376 at r5 (raw file):

	testutils.SucceedsSoon(t, func() error {
		if srv.SQLServer().(*sql.Server).ServerMetrics.StatsMetrics.SQLStatsFlushesSkipped.Count() < 0 {

can you add confirmation prior to the execution of above statements, that skipped count is zero?


pkg/sql/sqlstats/persistedsqlstats/flush_test.go line 402 at r5 (raw file):

	srv, conn, _ := serverutils.StartServer(t, base.TestServerArgs{
		Knobs: base.TestingKnobs{
			SQLStatsKnobs: sqlstats.CreateTestingKnobs(),

does this do anything?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-24.1.x Flags PRs that need to be backported to 24.1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sqlstats: default sql.stats.flush.minimum_interval to be equal to sql.stats.flush.interval
3 participants