br: advancer owner failed to update global checkpoint #50445

Closed
BornChanger opened this issue Jan 16, 2024 · 4 comments · Fixed by #50539
Labels
affects-6.5, affects-7.1, affects-7.5, component/br (This issue is related to BR of TiDB.), report/customer (Customers have encountered this bug.), severity/major, type/bug (The issue is confirmed as a bug.)

Comments

@BornChanger
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Replace a TiKV instance by copying its files from one EC2 instance to another EC2 instance.

2. What did you expect to see? (Required)

Log backup continues running.

3. What did you see instead (Required)

Log backup is stuck, and the following alert fires:

"Alert: RPO of log backup is too high - critical
 Description: The log data is not persisted to the storage for more than 30 minutes. This alert often indicates anomalies. You can check the TiKV logs to find the cause."

The TiDB log of the advancer owner contains:

 [2024/01/15 16:55:52.168 +00:00] [WARN] [advancer.go:315] ["failed to update global checkpoint: stale"] [old=447044033015185426] [new=0]
[2024/01/15 16:55:52.172 +00:00] [WARN] [owner_daemon.go:63] ["failed on tick"] [error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 10.29.45.253:20160: i/o timeout\""] 
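
For anyone triaging the two warnings above, it can help to decode the `old` checkpoint value. The sketch below assumes the checkpoint is a standard TiDB TSO (physical milliseconds in the high bits, an 18-bit logical counter in the low bits). The helper names `decode_tso` and `is_stale` are hypothetical, and the "new is older than old" reading of `stale` is an assumption based on the log message, not taken from `advancer.go`.

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # low bits of a TSO hold the logical counter


def decode_tso(ts: int):
    """Split a TSO into (physical_ms, logical, UTC wall-clock time)."""
    physical_ms = ts >> LOGICAL_BITS
    logical = ts & ((1 << LOGICAL_BITS) - 1)
    # Build the datetime from integer parts to avoid float rounding.
    wall = datetime.fromtimestamp(physical_ms // 1000, tz=timezone.utc)
    wall += timedelta(milliseconds=physical_ms % 1000)
    return physical_ms, logical, wall


def is_stale(old: int, new: int) -> bool:
    """Assumed meaning of "stale": the proposed checkpoint moves backwards."""
    return new < old
```

With the values from the log, `decode_tso(447044033015185426)` gives a wall-clock time of 2024-01-15 16:54:10.357 UTC, about 100 seconds before the warning, while `new=0` carries no timestamp at all. Under the assumption above, `is_stale(447044033015185426, 0)` is true, which matches an update being rejected because no fresh checkpoint could be collected.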

4. What is your TiDB version? (Required)

@BornChanger added the type/bug label on Jan 16, 2024
@BornChanger
Contributor Author

/component br

@ti-chi-bot added the component/br label on Jan 16, 2024
@BornChanger
Contributor Author

/severity major

@BornChanger
Contributor Author

BornChanger commented Jan 16, 2024

Replacing a TiKV instance by file copy reuses the old store ID, but the advancer owner has cached the old store ID together with the old IP address, so the connection is unavailable.
To work around this, we can: 1. restart TiDB (the advancer owner); 2. rebuild the PiTR task.
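
The failure mode described above can be illustrated with a small model. This is only a sketch: `Conn`, `StoreConnCache`, `simulate`, and the addresses are hypothetical stand-ins, not TiDB's actual types. The revalidation branch is one possible mitigation and is not claimed to be what the fix in #50539 does.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Conn:
    """Stand-in for a gRPC connection; it only remembers the dialed address."""
    addr: str


class StoreConnCache:
    """Hypothetical connection cache keyed by store ID."""

    def __init__(self, lookup_addr: Callable[[int], str], revalidate: bool):
        self._lookup_addr = lookup_addr  # returns the store's current address
        self._revalidate = revalidate
        self._conns: Dict[int, Conn] = {}

    def get(self, store_id: int) -> Conn:
        cached = self._conns.get(store_id)
        if cached is not None:
            # Without revalidation the cached entry is trusted forever.
            if not self._revalidate:
                return cached
            # With revalidation, reuse it only if the address is unchanged.
            if cached.addr == self._lookup_addr(store_id):
                return cached
        conn = Conn(self._lookup_addr(store_id))
        self._conns[store_id] = conn
        return conn


def simulate(revalidate: bool) -> str:
    """Move store 1 to a new host and report which address is dialed next."""
    registry = {1: "10.0.0.1:20160"}  # made-up addresses
    cache = StoreConnCache(registry.__getitem__, revalidate)
    cache.get(1)                      # first use caches the old address
    registry[1] = "10.0.0.2:20160"    # same store ID, new IP
    return cache.get(1).addr
```

Here `simulate(False)` keeps returning the old address, which corresponds to the dial timeout in the log, while `simulate(True)` picks up the new one. Restarting the advancer owner has the same effect as emptying such a cache.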

@seiya-annie

/found customer

@ti-chi-bot added the report/customer label on Jun 4, 2024