-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
telemetry: add copr-cache, tiflash, cluster index, async commit #23454
telemetry: add copr-cache, tiflash, cluster index, async commit #23454
Conversation
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
/cc @breeswish |
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
session/session.go
Outdated
length := len(telemetry.CoprocessorCacheTelemetry.MinuteWindow) | ||
if length == 0 || time.Since(*telemetry.CoprocessorCacheTelemetry.MinuteWindow[length-1].BeginAt) >= time.Minute { | ||
var i int | ||
for i = 0; i < length && time.Since(*telemetry.CoprocessorCacheTelemetry.MinuteWindow[i].BeginAt) >= 6*time.Hour; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May use some meaningful definition to instead of 6, such as TELEMETRY_UPDATE_INTERVAL ?
session/session.go
Outdated
if length == 0 || time.Since(*telemetry.CoprocessorCacheTelemetry.MinuteWindow[length-1].BeginAt) >= time.Minute { | ||
var i int | ||
for i = 0; i < length && time.Since(*telemetry.CoprocessorCacheTelemetry.MinuteWindow[i].BeginAt) >= 6*time.Hour; i++ { | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any efficient way to skip the expired data? Forgive me , because I have no information about the refresh method of the telemetry verb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way is to move the logic to getTelemetryFeatureUsageInfo
. It's a background thread.
session/session.go
Outdated
} | ||
if ratio >= 0.01 { | ||
telemetry.CoprocessorCacheTelemetry.MinuteWindow[length-1].P1.Add(1) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest add else to skip following un-hit check.
E.G. ratio = 0.005, we have checked ratio > 0 and ratio < 0.01, all the following check could be passed such as whether ratio >= 0.1, ratio >= 0.2, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestion, thx.
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
d67710e
to
f80be36
Compare
} | ||
|
||
// async commit | ||
stmt, err = exec.ParseWithParams(context.TODO(), `show config where name = 'storage.enable-async-apply-prewrite'`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Async-commit is controlled by a session variable: tidb_enable_async_commit
. You can remove codes about this part. I will add it later.
telemetry/telemetry.go
Outdated
@@ -34,6 +34,8 @@ const ( | |||
Prompt = "telemetry" | |||
// ReportInterval is the interval of the report. | |||
ReportInterval = 24 * time.Hour |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change it to 6 hour and change window to 1 hour~
telemetry/telemetry.go
Outdated
@@ -34,6 +34,8 @@ const ( | |||
Prompt = "telemetry" | |||
// ReportInterval is the interval of the report. | |||
ReportInterval = 24 * time.Hour | |||
// UpdateInterval means the max time window for saved data. | |||
UpdateInterval = 6 * time.Hour |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can call it WindowInterval for better readability?
session/session.go
Outdated
return recordSet, nil | ||
} | ||
|
||
// CountTelemetry records the telemetry. | ||
func CountTelemetry(s *session) { | ||
telemetry.CoprocessorCacheTelemetry.Lock.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This lock seems to be very destructive.. I'm very worried about its performance.
sessionctx/variable/session.go
Outdated
// CoprCacheHitNum is to record coprocessor cache hit times for one statement. | ||
CoprCacheHitNum atomic2.Uint64 | ||
// CopRespTimes is to record coprocessor response times for one statement. | ||
CopRespTimes atomic2.Uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CopRespTimes atomic2.Uint64 | |
CoprRespTimes atomic2.Uint64 |
By the way what is response times? Is it number of received coprocessor responses?
telemetry/data_feature_usage.go
Outdated
// cluster index | ||
exec := ctx.(sqlexec.RestrictedSQLExecutor) | ||
stmt, err := exec.ParseWithParams(context.TODO(), ` | ||
SELECT sha2(TABLE_NAME, 256) name, TIDB_PK_TYPE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's truncate it~
SELECT sha2(TABLE_NAME, 256) name, TIDB_PK_TYPE | |
SELECT left(sha2(TABLE_NAME, 256), 6) name, TIDB_PK_TYPE |
/bench |
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
/merge |
This pull request has been accepted and is ready to merge. Commit hash: c876042
|
/merge |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a performance related question.
Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>
…into dev/telemetry_copr_cache
/run-all-tests |
data race:
It's not about the PR. |
/merge |
This pull request has been accepted and is ready to merge. Commit hash: e4dfe12
|
/run-sqllogic-test-2 |
/run-unit-test |
//run-unit-test |
Leak test fail:
|
/merge |
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
cherry pick to release-5.0 in PR #23489 |
Signed-off-by: lzmhhh123 lzmhhh123@gmail.com
What problem does this PR solve?
Problem Summary: as title says
What is changed and how it works?
How it Works:
Related changes
pingcap/docs
/pingcap/docs-cn
:Check List
Tests
admin show telemetry
Side effects
Release note