release-21.2: sql: use adaptive sampling rate for telemetry logging #70960
Backport 1/1 commits from #70786.
/cc @cockroachdb/release
Resolves #70553
Release justification: category 4
Previously, telemetry logging used a configurable QPS threshold
and sampling rate: we would log all statements while under the
QPS threshold, and then start sampling at the given rate once at
the threshold. Using this technique meant that we would often see
a sharp decrease in telemetry logging once sampling kicked in, as
sampling rates would typically need to be at low values to
accommodate a high QPS.
This commit replaces the above technique with an adaptive sampling
rate which merely logs events to telemetry at a maximum frequency.
Rather than relying on QPS, we will simply track when we have
last logged to the telemetry channel, and decide whether or not to
log a given event accordingly.
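The idea described above can be illustrated with a minimal sketch. This is not the code from the commit: the names `eventSampler`, `newEventSampler`, `shouldLog`, and `simulate` are hypothetical, and the sketch assumes the maximum frequency is a positive number of events per second. It only shows the general technique of remembering the last emission time and dropping events that arrive too soon after it.

```go
package main

import (
	"sync"
	"time"
)

// eventSampler is a hypothetical helper that allows at most one event
// per minInterval, regardless of how many events arrive.
type eventSampler struct {
	mu          sync.Mutex
	minInterval time.Duration
	lastLogged  time.Time
}

// newEventSampler derives the minimum spacing between logged events from
// a maximum frequency. Assumes maxEventsPerSec > 0.
func newEventSampler(maxEventsPerSec int) *eventSampler {
	return &eventSampler{minInterval: time.Second / time.Duration(maxEventsPerSec)}
}

// shouldLog reports whether an event observed at time now should be
// logged, and records now as the last logged time if so.
func (s *eventSampler) shouldLog(now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	// The first event is always logged; later events must be at least
	// minInterval after the previously logged one.
	if !s.lastLogged.IsZero() && now.Sub(s.lastLogged) < s.minInterval {
		return false
	}
	s.lastLogged = now
	return true
}

// simulate feeds events arriving at the given millisecond offsets to a
// fresh sampler and returns the logging decision for each event.
func simulate(maxEventsPerSec int, offsetsMillis []int) []bool {
	s := newEventSampler(maxEventsPerSec)
	start := time.Unix(0, 0)
	decisions := make([]bool, len(offsetsMillis))
	for i, ms := range offsetsMillis {
		decisions[i] = s.shouldLog(start.Add(time.Duration(ms) * time.Millisecond))
	}
	return decisions
}
```

For example, with a maximum of 10 events per second (a 100 ms minimum spacing), calling `simulate(10, []int{0, 50, 100, 150, 250})` logs the events at 0, 100, and 250 ms and drops the other two. Unlike a fixed sample rate, the volume of logged events stays bounded however high the QPS climbs.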
Release note (sql change): The cluster settings
sql.telemetry.query_sampling.qps_threshold and
sql.telemetry.query_sampling.sample_rate have been removed. A new
setting, sql.telemetry.query_sampling.max_event_frequency, has been
introduced, with a default value of 10 events per second.
roachprod results
cluster created with:
results of tpcc 1000
Telemetry OFF
Telemetry ON