osd/PGLog: assert out on performing overflowed log trimming #21580

xiexingguo · 2018-04-21T03:02:17Z

Performing an overflowed log trim can be a sign of big trouble: e.g.,
once the trimming is done, the complete_to iterator will point to an
invalid position in the original pg-log list and hence randomly
trigger segmentation faults like the one below:

2018-03-07 17:38:46.109018 7f274a4ed700 -1 *** Caught signal (Segmentation fault) **
1: (()+0xa51f31) [0x7f278290bf31]
2: (()+0xf370) [0x7f277fb4f370]
3: (PrimaryLogPG::recover_got(hobject_t, eversion_t)+0x266) [0x7f2782512786]
4: (PrimaryLogPG::on_local_recover(hobject_t const&, ObjectRecoveryInfo const&, std::shared_ptr<ObjectContext>, bool, ObjectStore::Transaction*)+0x2a4) [0x7f278251f3b4]
5: (ReplicatedBackend::handle_push(pg_shard_t, PushOp const&, PushReplyOp*, ObjectStore::Transaction*)+0x2e2) [0x7f2782690f82]
6: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x194) [0x7f2782691224]
7: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2f1) [0x7f278269fd41]
8: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f27825c2470]

The root cause of why PGs start trimming more log entries than we
expect is still unknown to me, but setting the trap here should generally
do no harm and will hopefully expose the above problem a little more often.
We'll see.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
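The idea of the trap can be illustrated with a toy model. This is a minimal sketch and not the code from this PR: `ToyLog`, `is_overflowed_trim` and `trim_front` are hypothetical names, and the log is reduced to a list of version numbers plus a `complete_to` cursor that has to remain valid across trims.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Toy stand-in for a pg-log (hypothetical, not Ceph's PGLog).
struct ToyLog {
  std::list<unsigned> versions;               // entry versions, oldest first
  std::list<unsigned>::iterator complete_to;  // cursor that must stay valid

  explicit ToyLog(unsigned n) {
    for (unsigned v = 1; v <= n; ++v) versions.push_back(v);
    complete_to = versions.begin();
  }
  // Copying would leave complete_to pointing into the source list.
  ToyLog(const ToyLog&) = delete;
  ToyLog& operator=(const ToyLog&) = delete;

  // True if the request asks for more entries than the log holds.
  bool is_overflowed_trim(std::size_t count) const {
    return count > versions.size();
  }

  // Drop `count` entries from the oldest end of the log.
  void trim_front(std::size_t count) {
    // The trap: fail at the point of the bad request instead of
    // crashing later through a dangling iterator.
    assert(!is_overflowed_trim(count));
    for (std::size_t i = 0; i < count; ++i) {
      // Step the cursor off an entry before that entry is erased.
      if (complete_to == versions.begin()) ++complete_to;
      versions.pop_front();
    }
  }
};
```

Without the assertion, a request larger than the list would call `pop_front()` on an empty list, which is undefined behaviour and can surface much later as an unrelated-looking crash.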

tchaikov · 2018-04-23T10:53:52Z

http://pulpito.ceph.com/kchai-2018-04-23_07:38:14-rados-wip-kefu-testing-2018-04-23-1357-distro-basic-smithi/

test failures are tracked / addressed by

In ceph#21580 I set a trap to catch some weird and random segmentation faults, and in a recent QA run I observed it being triggered by one of the test cases, see:

```
http://qa-proxy.ceph.com/teuthology/xxg-2018-07-30_05:25:06-rados-wip-hb-peers-distro-basic-smithi/2837916/teuthology.log
```

The root cause is that there might be holes in the log versions, so the approx_size() method will (almost) always overestimate the actual number of log entries. As a result, we risk an access violation while later searching for the oldest log entry to keep in the log list.

ceph#18338 reveals a probably easier way to fix the above problem, but unfortunately it can also cause a big performance regression, hence this PR.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
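A small worked example may make the overestimate concrete. It is a sketch under a stated assumption: the size estimate is taken to be the difference between the newest and oldest version numbers, which is exact only when versions are dense. The function bodies and the sample versions are illustrative and not taken from the Ceph source.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Assumed form of the estimate: distance between version numbers.
// Exact only if every version in (tail, head] has an entry.
std::size_t approx_size(unsigned tail_version, unsigned head_version) {
  return head_version - tail_version;
}

// How many of the oldest entries to drop so that at most `keep`
// remain, given a belief about the current size of the log.
std::size_t entries_to_trim(std::size_t believed_size, std::size_t keep) {
  return believed_size > keep ? believed_size - keep : 0;
}
```

For a log holding versions 11, 12, 15, 18 and 20 with a tail of 10, the estimate is 10 while the list holds 5 entries. Keeping 3 entries then yields a trim count of 7 from the estimate but 2 from the real size, and 7 exceeds the number of entries that exist.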

In ceph#21580 I set a trap to catch some weird and random segmentation faults, and in a recent QA run I observed it being triggered by one of the test cases, see:

```
http://qa-proxy.ceph.com/teuthology/xxg-2018-07-30_05:25:06-rados-wip-hb-peers-distro-basic-smithi/2837916/teuthology.log
```

The root cause is that there might be holes in the log versions, so the approx_size() method will (almost) always overestimate the actual number of log entries. As a result, we risk overtrimming log entries.

ceph#18338 reveals a probably easier way to fix the above problem, but unfortunately it can also cause a big performance regression, hence this PR.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 3654d56)

Conflicts:
src/osd/PrimaryLogPG.cc: trivial resolution

Same commit message as above, with these trailers:

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 3654d56)
Conflicts: src/osd/PrimaryLogPG.cc: trivial resolution
(cherry picked from commit 85a029a)
Resolves: rhbz#1608060

Same commit message as above, with these trailers:

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit 3654d56)
Conflicts: src/osd/PrimaryLogPG.cc: trivial resolution
(cherry picked from commit 54b04ba)
Resolves: rhbz#1608060

xiexingguo added the core label Apr 21, 2018

xiexingguo requested a review from liewegas April 21, 2018 03:02

liewegas approved these changes Apr 23, 2018

liewegas added this to the mimic milestone Apr 23, 2018

liewegas requested a review from jdurgin April 23, 2018 02:55

liewegas added the needs-qa label Apr 23, 2018

tchaikov added the wip-kefu-testing label Apr 23, 2018

xiexingguo mentioned this pull request Apr 23, 2018

osd: calc_min_last_complete_ondisk() should use actingset #21508

Closed

tchaikov removed the needs-qa and wip-kefu-testing labels Apr 23, 2018

tchaikov self-assigned this Apr 23, 2018

liewegas merged commit 7003d41 into ceph:master Apr 23, 2018

xiexingguo deleted the wip-add-assert branch April 24, 2018 00:45

xiexingguo mentioned this pull request Jul 30, 2018

osd/PrimaryLogPG: fix potential pg-log overtrimming #23317

Merged
