ddl: refine ddl log and add more useful message in log #3223
Summary of Changes
Hello @hongyunyan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request focuses on refining the DDL (Data Definition Language) barrier mechanism and its associated logging. It reduces log verbosity for routine operations, adjusts the resend intervals for certain tasks, and significantly enhances the precision of table ID tracking within the DDL barrier. These changes collectively aim to improve the clarity of system logs, optimize the timing of resend operations, and ensure more robust and accurate synchronization of DDL events, particularly in how the system manages and reports on table-specific events.
Code Review
This pull request primarily refines logging by reducing log levels from INFO/WARN to DEBUG in several places, which should help reduce log noise in production. It also includes a nice refactoring of TableCountChecker to provide more detailed debug messages, which aligns well with the goal of improving log messages. However, there are also some significant changes to the retry logic for DDL-related messages. While one change improves responsiveness, another one might introduce latency. I've added a few comments with suggestions for improvement.
  		callback: callback,
  	}
- 	t.taskHandle = taskScheduler.Submit(t, time.Now().Add(50*time.Millisecond))
+ 	t.taskHandle = taskScheduler.Submit(t, time.Now().Add(resendTimeInterval))
The initial resend delay for this task has been increased from 50ms to 10 seconds. This is a significant change that could increase DDL processing latency if the first TableSpanBlockStatus message from the dispatcher to the maintainer is lost due to a transient network issue. Was this change intentional? A 10-second delay for the first retry seems quite long and could be perceived as a performance regression.
  func (be *BarrierEvent) resend(mode int64) []*messaging.TargetMessage {
- 	if time.Since(be.lastResendTime) < 10*time.Second {
+ 	if time.Since(be.lastResendTime) < time.Second {
The resend interval is hardcoded here. It would be better to define this as a constant to improve maintainability and consistency, similar to how resendTimeInterval was introduced in downstreamadapter/dispatcher/helper.go.
For example, you could define a constant at the package level:
	const barrierEventResendInterval = time.Second
Then use this constant here.
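As a rough illustration of the suggestion above, the throttle check could look something like the sketch below. The `barrierEvent` type and the `shouldResend` helper are hypothetical stand-ins invented for this example and are not the PR's actual code; only the constant name and the one-second value come from the review comment and the diff.

```go
package main

import (
	"fmt"
	"time"
)

// barrierEventResendInterval uses the name proposed in the review comment;
// the value matches the one-second literal in the diff.
const barrierEventResendInterval = time.Second

// barrierEvent is a hypothetical stand-in for the real BarrierEvent.
// Only the field needed for throttling is modeled here.
type barrierEvent struct {
	lastResendTime time.Time
}

// shouldResend reports whether at least barrierEventResendInterval has
// elapsed since the last resend. When it has, now is recorded as the new
// resend time. Taking now as a parameter (an assumption of this sketch,
// the diff calls time.Since directly) keeps the check easy to test.
func (be *barrierEvent) shouldResend(now time.Time) bool {
	if now.Sub(be.lastResendTime) < barrierEventResendInterval {
		return false
	}
	be.lastResendTime = now
	return true
}

func main() {
	start := time.Now()
	be := &barrierEvent{lastResendTime: start}

	fmt.Println(be.shouldResend(start.Add(500 * time.Millisecond))) // false
	fmt.Println(be.shouldResend(start.Add(time.Second)))            // true
}
```

With the interval in one named constant, a later change to the resend cadence touches a single line rather than every comparison site.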
  func (rc *TableIDRangeChecker) Detail() string {
- 	return fmt.Sprintf("reported count: %d, require count: %d", len(rc.reportedMap), rc.needCount)
+ 	buf := &strings.Builder{}
+ 	buf.WriteString("reported count: ")
+ 	buf.WriteString(strconv.FormatInt(int64(len(rc.reportedMap)), 10))
+ 	buf.WriteString(", require count: ")
+ 	buf.WriteString(strconv.FormatInt(int64(rc.needCount), 10))
+ 	buf.WriteString(", uncovered tables: ")
+ 	for _, id := range rc.tableIDs {
+ 		_, ok := rc.reportedMap[id]
+ 		if !ok {
+ 			buf.WriteString(fmt.Sprintf("%d,\n", id))
+ 		}
+ 	}
+
+ 	return buf.String()
  }
The enhanced Detail method is great for debugging! However, using fmt.Sprintf inside a loop can be inefficient. Also, the current implementation will leave a trailing comma and newline, and it will print , uncovered tables: even if there are no uncovered tables.
A more efficient and cleaner approach would be to collect the uncovered table IDs and then join them. This avoids repeated string formatting and the trailing separator.
	func (rc *TableIDRangeChecker) Detail() string {
		buf := &strings.Builder{}
		buf.WriteString("reported count: ")
		buf.WriteString(strconv.FormatInt(int64(len(rc.reportedMap)), 10))
		buf.WriteString(", require count: ")
		buf.WriteString(strconv.FormatInt(int64(rc.needCount), 10))
		var uncoveredIDs []string
		for _, id := range rc.tableIDs {
			if _, ok := rc.reportedMap[id]; !ok {
				uncoveredIDs = append(uncoveredIDs, strconv.FormatInt(id, 10))
			}
		}
		if len(uncoveredIDs) > 0 {
			buf.WriteString(", uncovered tables: ")
			buf.WriteString(strings.Join(uncoveredIDs, ", "))
		}
		return buf.String()
	}
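To see what the suggested `Detail` would print, here is a small self-contained sketch. The `tableIDRangeChecker` struct is a hypothetical stand-in: its field types (`[]int64`, `map[int64]struct{}`, `int`) are assumptions inferred from how the fields are used in the diff, not taken from the real definition.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// tableIDRangeChecker is a hypothetical stand-in for TableIDRangeChecker.
// The field types are assumptions made for this example.
type tableIDRangeChecker struct {
	tableIDs    []int64
	reportedMap map[int64]struct{}
	needCount   int
}

// Detail follows the reviewer's suggestion: collect the IDs that have not
// been reported yet, then join them once, so there is no trailing separator
// and the "uncovered tables" label is omitted when every table is covered.
func (rc *tableIDRangeChecker) Detail() string {
	buf := &strings.Builder{}
	buf.WriteString("reported count: ")
	buf.WriteString(strconv.Itoa(len(rc.reportedMap)))
	buf.WriteString(", require count: ")
	buf.WriteString(strconv.Itoa(rc.needCount))

	var uncoveredIDs []string
	for _, id := range rc.tableIDs {
		if _, ok := rc.reportedMap[id]; !ok {
			uncoveredIDs = append(uncoveredIDs, strconv.FormatInt(id, 10))
		}
	}
	if len(uncoveredIDs) > 0 {
		buf.WriteString(", uncovered tables: ")
		buf.WriteString(strings.Join(uncoveredIDs, ", "))
	}
	return buf.String()
}

func main() {
	rc := &tableIDRangeChecker{
		tableIDs:    []int64{1, 2, 3},
		reportedMap: map[int64]struct{}{2: {}},
		needCount:   3,
	}
	fmt.Println(rc.Detail())
	// Prints: reported count: 1, require count: 3, uncovered tables: 1, 3
}
```

When every ID in `tableIDs` is present in `reportedMap`, the output stops after the two counts, which keeps fully covered barriers from logging an empty list.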
/test all
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: asddongmen, lidezhu, tenfyzhong
What problem does this PR solve?
Issue Number: ref #3178
What is changed and how it works?
table_count_range_checker_test.go to construct the checker with explicit table-ID slices, aligning the tests with the new TableIDRangeChecker API.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note