PiTR checkpoint ts lag reached more than 8h after injecting a network partition between one TiKV node and the PD leader #16469
Comments
/severity critical
/assign BornChanger
/component br
/component backup-restore
/assign YuJuncen
This should be a mistake in #16008, which added a stale check to every The timeline is: Region R becomes leader -> Then, the The solution is to always add a phantom record in the subscription tracer, if there isn't one, when we are starting.
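To make the proposed fix easier to picture, here is a minimal sketch of the "phantom record" idea. It is not TiKV's actual code: `SubscriptionTracer`, `Record`, `start`, `activate` and `get` are hypothetical names, and the stale check is assumed, purely for illustration, to drop any follow-up step for a region that has no record yet.

```rust
use std::collections::HashMap;

/// Hypothetical per-region record; not TiKV's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Record {
    /// Phantom placeholder: a start is in progress, nothing is observed yet.
    Pending,
    /// The region is actively observed from `checkpoint_ts` onwards.
    Active { checkpoint_ts: u64 },
}

/// Hypothetical stand-in for the subscription tracer, keyed by region id.
#[derive(Debug, Default)]
struct SubscriptionTracer {
    records: HashMap<u64, Record>,
}

impl SubscriptionTracer {
    /// Handle a start request: add a phantom record only if none exists.
    /// Returns true when a new phantom record was inserted.
    fn start(&mut self, region_id: u64) -> bool {
        if self.records.contains_key(&region_id) {
            return false;
        }
        self.records.insert(region_id, Record::Pending);
        true
    }

    /// Assumed stale check: a follow-up step is applied only when a record
    /// for the region already exists; otherwise it is dropped as stale.
    fn activate(&mut self, region_id: u64, checkpoint_ts: u64) -> bool {
        match self.records.get_mut(&region_id) {
            Some(record) => {
                *record = Record::Active { checkpoint_ts };
                true
            }
            None => false,
        }
    }

    /// Look up the current record of a region, if any.
    fn get(&self, region_id: u64) -> Option<&Record> {
        self.records.get(&region_id)
    }
}
```

In this sketch, calling `activate` before `start` is rejected because no record exists, while calling `start` first guarantees that the later step finds a record. An existing record is never overwritten by `start`, matching the "if there isn't one" condition.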
This will only affect
close #16469 Now, `Start` will always put a phantom record in subscription tracer if there isn't one. Signed-off-by: Yu Juncen <yu745514916@live.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
close tikv#16469 Now, `Start` will always put a phantom record in subscription tracer if there isn't one. Signed-off-by: Yu Juncen <yu745514916@live.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> Signed-off-by: hillium <yujuncen@pingcap.com>
close tikv#16469, ref tikv#16554 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
close tikv#16469 Now, `Start` will always put a phantom record in subscription tracer if there isn't one. Signed-off-by: Yu Juncen <yu745514916@live.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> Signed-off-by: dbsid <chenhuansheng@pingcap.com>
close tikv#16469, ref tikv#16554 Signed-off-by: Yu Juncen <yu745514916@live.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> Signed-off-by: dbsid <chenhuansheng@pingcap.com>
Bug Report
What version of TiKV are you using?
./tikv-server -V
TiKV
Release Version: 8.0.0-alpha
Edition: Community
Git Commit Hash: 43d0e06
Git Commit Branch: heads/refs/tags/v8.0.0-alpha
UTC Build Time: 2024-01-26 11:47:08
Rust Version: rustc 1.77.0-nightly (89e2160c4 2023-12-27)
Enable Features: pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine cloud-aws cloud-gcp cloud-azure trace-async-tasks openssl-vendored
Profile: dist_release
What operating system and CPU are you using?
8c/32g
Steps to reproduce
1. Start PiTR.
2. Run the workload:
go-tpc tpcc run -D tpcc20000 --host tc-tidb.endless-ha-test-oltp-pitr-tps-6570032-1-702 -P4000 --warehouses 20000 -T 32 --ignore-error '2013,1213,1105,1205,8022,8028,9004,9007,1062' --user root --password '' --interval '10s'
3. Inject a network partition between one TiKV node and the PD leader.
What did you expect?
The PiTR checkpoint ts lag should drop below 10 minutes after the fault recovers.
What happened?
The PiTR checkpoint ts lag reached more than 8h after injecting a network partition between one TiKV node and the PD leader.
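For readers who want to check the lag figures themselves, the sketch below shows one way to derive a lag from a checkpoint ts. It assumes the checkpoint ts is a regular PD TSO, whose upper bits hold a physical Unix timestamp in milliseconds and whose lowest 18 bits hold a logical counter. The function names and the idea of comparing against a millisecond limit are illustrative assumptions, not part of TiKV or BR.

```rust
/// Number of low bits a PD TSO reserves for the logical counter.
const TSO_LOGICAL_BITS: u32 = 18;

/// Extract the physical part (Unix time in milliseconds) from a TSO.
fn tso_physical_ms(tso: u64) -> u64 {
    tso >> TSO_LOGICAL_BITS
}

/// Lag of a checkpoint ts relative to `now_ms` (Unix time in milliseconds).
/// Saturates at zero if the checkpoint is ahead of `now_ms`.
fn checkpoint_lag_ms(checkpoint_ts: u64, now_ms: u64) -> u64 {
    now_ms.saturating_sub(tso_physical_ms(checkpoint_ts))
}

/// Hypothetical helper: true when the lag is strictly above `limit_ms`.
fn lag_exceeds(checkpoint_ts: u64, now_ms: u64, limit_ms: u64) -> bool {
    checkpoint_lag_ms(checkpoint_ts, now_ms) > limit_ms
}
```

With a 10 minute limit (600,000 ms), a checkpoint that is 8 hours behind the current time would be flagged by `lag_exceeds`, while one that is 5 minutes behind would not.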