Abnormal add_learner #189

Closed
qinzuoyan opened this issue Oct 17, 2018 · 2 comments
Labels
type/bug This issue reports a bug.

Comments

@qinzuoyan
Contributor

qinzuoyan commented Oct 17, 2018

Background

On 2018/09/29, multiple nodes in the c3srv-xiaomi cluster restarted.
Server version: Pegasus Server 1.10.0 (3ad6fe7) Release

Coredump

(gdb) bt
#0  0x0000003847a328a5 in raise () from /lib64/libc.so.6
#1  0x0000003847a34085 in abort () from /lib64/libc.so.6
#2  0x00007f5748db61ae in dsn_coredump() () at /home/sunweijie/source/pegasus/rdsn/src/core/core/service_api_c.cpp:99
#3  0x00007f5748c47610 in dsn::replication::replica::update_local_configuration(dsn::replication::replica_configuration const&, bool) ()
    at /home/sunweijie/source/pegasus/rdsn/src/dist/replication/lib/replica_config.cpp:758
#4  0x00007f5748ceb28a in dsn::replication::replica::on_add_learner(dsn::replication::group_check_request const&) () at /home/sunweijie/source/pegasus/rdsn/src/dist/replication/lib/replica_learn.cpp:1374
#5  0x00007f5748cc6ca3 in dsn::replication::replica_stub::on_add_learner(dsn::replication::group_check_request const&) ()
    at /home/sunweijie/source/pegasus/rdsn/src/dist/replication/lib/replica_stub.cpp:959
#6  0x00007f5748cdd67e in std::_Function_handler<void ()(void*), bool dsn::serverlet<dsn::replication::replica_stub>::register_rpc_handler<dsn::replication::group_check_request>(dsn::task_code, char const*, void (dsn::replication::replica_stub::*)(dsn::replication::group_check_request const&))::{lambda(void*)#1}>::_M_invoke(std::_Any_data const&, void*) ()
    at /home/sunweijie/source/pegasus/rdsn/include/dsn/cpp/serverlet.h:170
#7  0x00007f5748db1941 in dsn::task::exec_internal() () at /home/sunweijie/source/pegasus/rdsn/src/core/core/task.cpp:177
#8  0x00007f5748e03bcd in dsn::task_worker::loop() () at /home/sunweijie/source/pegasus/rdsn/src/core/core/task_worker.cpp:323
#9  0x00007f5748e03d99 in dsn::task_worker::run_internal() () at /home/sunweijie/source/pegasus/rdsn/src/core/core/task_worker.cpp:302
#10 0x00007f5746b1c600 in execute_native_thread_routine () at /home/qinzuoyan/git.xiaomi/pegasus/toolchain/objdir/../gcc-4.8.2/libstdc++-v3/src/c++11/thread.cc:84
#11 0x0000003848207851 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003847ae811d in clone () from /lib64/libc.so.6
(gdb)

log

D2018-09-29 15:38:57.202 (1538206737202143690 27b5c) replica.rep_long7.04010000000002f9: replica_stub.cpp:1434:on_gc(): gc_shared: gc condition for 23.23@x.x.x.x:xxxxx, status = replication::partition_status::PS_PRIMARY, garbage_max_decree = 80530851, last_durable_decree= 80530851, plog_max_commit_on_disk = 80530874
D2018-09-29 15:39:07.622 (1538206747622587975 27b23) replica.replica0.04010001180428f1: replica_config.cpp:953:on_config_sync(): 23.23@x.x.x.x:xxxxx: configuration sync
... ...
D2018-09-29 15:39:26.236 (1538206766236494687 27b23) replica.replica0.04007afd458ee1de: replica_stub.cpp:955:on_add_learner(): 23.23@x.x.x.x:xxxxx: received add learner, primary = 10.136.133.8:31801, ballot = 63, status = replication::partition_status::PS_POTENTIAL_SECONDARY, last_committed_decree = 80530876
D2018-09-29 15:39:26.236 (1538206766236506724 27b23) replica.replica0.04007afd458ee1de: replica_learn.cpp:1365:on_add_learner(): 23.23@x.x.x.x:xxxxx: process add learner, primary = x.x.x.x:xxxxx, ballot = 63, status = replication::partition_status::PS_POTENTIAL_SECONDARY, last_committed_decree = 80530876
D2018-09-29 15:39:26.236 (1538206766236801383 27b23) replica.replica0.04007afd458ee1de: replication_app_base.cpp:95:store(): store replica_init_info to /home/work/ssd7/pegasus/c3srv-xiaomi/replica/reps/23.23.pegasus/.init-info succeed, time_used_ns = 288575: init_ballot = 63, init_durable_decree = 80530851, init_offset_in_shared_log = 127917214287498, init_offset_in_private_log = 24974930792
D2018-09-29 15:39:26.236 (1538206766236813483 27b23) replica.replica0.04007afd458ee1de: replica_config.cpp:700:update_local_configuration(): 23.23@x.x.x.x:xxxxx: update ballot to init file from 61 to 63 OK
F2018-09-29 15:39:26.236 (1538206766236816962 27b23) replica.replica0.04007afd458ee1de: replica_config.cpp:758:update_local_configuration(): assertion expression: false
F2018-09-29 15:39:26.236 (1538206766236841282 27b23) replica.replica0.04007afd458ee1de: replica_config.cpp:758:update_local_configuration(): invalid execution path
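
To follow the sequence that leads up to the fatal assertion, it can help to pull out only the records written by the crashing thread. The snippet below is a minimal sketch, not part of Pegasus or rDSN. The line layout encoded in `LOG_LINE` is an assumption inferred solely from the log lines quoted above, and the helper names `parse_line` and `fatal_context` are hypothetical.

```python
import re

# Assumed layout, inferred only from the log excerpt in this issue:
# <level><date> <time> (<ns timestamp> <thread id>) <node>: <file>:<line>:<function>(): <message>
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z])(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\((?P<ts_ns>\d+) (?P<thread>[0-9a-f]+)\) "
    r"(?P<node>\S+): (?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\): (?P<msg>.*)$"
)


def parse_line(text):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LOG_LINE.match(text)
    if m is None:
        return None
    rec = m.groupdict()
    rec["line"] = int(rec["line"])
    return rec


def fatal_context(lines, before=3):
    """Return up to `before` records from the same thread that precede the
    first fatal (level 'F') record, followed by that fatal record."""
    records = [r for r in map(parse_line, lines) if r is not None]
    for i, rec in enumerate(records):
        if rec["level"] == "F":
            same_thread = [r for r in records[:i] if r["thread"] == rec["thread"]]
            return same_thread[-before:] + [rec]
    return []
```

Calling `fatal_context(open(path))` on a saved copy of the log (where `path` is a placeholder) returns the last few records from the crashing thread, ending with the first `F` line. Lines that do not fit the assumed layout, such as the elided `... ...` marker, are skipped.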
@qinzuoyan qinzuoyan added the type/bug This issue reports a bug. label Oct 17, 2018
@hycdong hycdong closed this as completed Jan 9, 2019
@neverchanje
Contributor

Background

c4srv-store 1.11.3 (6/17)

Core Dump

(gdb) bt
#0  0x00007f238aabc1d7 in raise () from /lib64/libc.so.6
#1  0x00007f238aabd8c8 in abort () from /lib64/libc.so.6
#2  0x00007f238e5883ee in dsn_coredump () at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/service_api_c.cpp:73
#3  0x00007f238e4892b7 in dsn::replication::replica::update_local_configuration (this=this@entry=0x29cad80, config=..., same_ballot=same_ballot@entry=true)
    at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/dist/replication/lib/replica_config.cpp:807
#4  0x00007f238e4ee1cb in dsn::replication::replica::on_add_learner (this=0x29cad80, request=...) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/dist/replication/lib/replica_learn.cpp:1377
#5  0x00007f238e44fce2 in dsn::replication::replica_stub::on_add_learner (this=0x227e580, request=...) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/dist/replication/lib/replica_stub.cpp:1004
#6  0x00007f238e46a9e0 in operator() (request=<optimized out>, __closure=0x9349840) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/include/dsn/cpp/serverlet.h:169
#7  std::_Function_handler<void (dsn::message_ex*), bool dsn::serverlet<dsn::replication::replica_stub>::register_rpc_handler<dsn::replication::group_check_request>(dsn::task_code, char const*, void (dsn::replication::replica_stub::*)(dsn::replication::group_check_request const&))::{lambda(dsn::message_ex*)#1}>::_M_invoke(std::_Any_data const&, dsn::message_ex*) (__functor=..., 
    __args#0=<optimized out>) at /home/work/qinzuoyan/Pegasus/toolchain/output/include/c++/4.8.2/functional:2071
#8  0x00007f238e5d9ce9 in dsn::task::exec_internal (this=this@entry=0x13dbbc28) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task.cpp:180
#9  0x00007f238e65a42d in dsn::task_worker::loop (this=0x26f4580) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task_worker.cpp:211
#10 0x00007f238e65a5f9 in dsn::task_worker::run_internal (this=0x26f4580) at /home/work/qinzuoyan/Pegasus/pegasus/rdsn/src/core/core/task_worker.cpp:191
#11 0x00007f238b414600 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>)
    at /home/qinzuoyan/git.xiaomi/pegasus/toolchain/objdir/../gcc-4.8.2/libstdc++-v3/src/c++11/thread.cc:84
#12 0x00007f238c081dc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f238ab7e73d in clone () from /lib64/libc.so.6

@neverchanje neverchanje reopened this Jun 17, 2019
@neverchanje
Contributor

This bug is fixed in release 1.11.6 (https://github.com/XiaoMi/pegasus/releases/tag/v1.11.6).
Closing this issue now.
