msg/async/rdma: fix a potential coredump when handling tx_buffers under heavy RDMA #18036

Merged
merged 3 commits into from Sep 29, 2017

Contributor

@ownedu ownedu commented Sep 29, 2017

Under heavy RDMA traffic, current_chunk can be accessed beyond the range of
the pre-allocated Tx buffer pool, which causes a coredump.

Signed-off-by: Yan Lei <yongyou.yl@alibaba-inc.com>
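
The failure mode described above, indexing past the end of a pre-allocated buffer pool while copying data into it, can be illustrated with a minimal C++ sketch. This is not the patch from this PR and does not reproduce Ceph's code: `Chunk`, `fill_chunks`, and the pool layout are hypothetical stand-ins, assumed here only to show the general guard of re-checking the chunk index before each access.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for one fixed-size, pre-allocated Tx buffer.
struct Chunk {
  std::vector<char> data;
  std::size_t used = 0;

  explicit Chunk(std::size_t capacity) : data(capacity) {
    assert(capacity > 0);  // a zero-sized chunk could never accept data
  }

  bool full() const { return used == data.size(); }

  // Copy as much of [src, src + len) as fits; return the bytes taken.
  std::size_t write(const char* src, std::size_t len) {
    const std::size_t n = std::min(len, data.size() - used);
    std::memcpy(data.data() + used, src, n);
    used += n;
    return n;
  }
};

// Copy `payload` into the pool without ever indexing past its end.
// Returns the number of bytes copied, which may be less than
// payload.size() when the pool runs out of space.
std::size_t fill_chunks(std::vector<Chunk>& pool, const std::string& payload) {
  std::size_t copied = 0;
  std::size_t idx = 0;
  while (copied < payload.size() && idx < pool.size()) {
    copied += pool[idx].write(payload.data() + copied,
                              payload.size() - copied);
    if (pool[idx].full()) {
      ++idx;  // the loop condition re-checks idx before the next access
    }
  }
  return copied;
}
```

In this sketch the caller compares the return value with the payload size to detect a short copy and can retry the remainder once more buffers are available.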

@ownedu
Contributor Author

ownedu commented Sep 29, 2017

@yuyuyu101 Pls help take a look at this PR; thanks!

heavy RDMA traffic, there are chances to access a current_chunk which can
be beyond the range of pre-allocated Tx buffer pool thus causes a coredump

Signed-off-by: Yan Lei <yongyou.yl@alibaba-inc.com>
@ownedu ownedu changed the title from "Fix a potential coredump when handling tx_buffers under heacy RDMA" to "msg/async/rdma: fix a potential coredump when handling tx_buffers under heavy RDMA" Sep 29, 2017
@ownedu
Contributor Author

ownedu commented Sep 29, 2017

During OSD backfill, some OSD processes can crash with the following stack:

 1: (()+0x90da02) [0x7f4aa3403a02]
 2: (()+0xf100) [0x7f4aa1471100]
 3: (RDMAConnectedSocketImpl::submit(bool)+0xa3b) [0x7f4aa3638d7b]
 4: (RDMAWorker::handle_pending_message()+0xce) [0x7f4aa362608e]
 5: (EventCenter::process_events(int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x87c) [0x7f4aa35bbf4c]
 6: (()+0xb07ade) [0x7f4aa35fdade]
 7: (()+0xb5220) [0x7f4aa0174220]
 8: (()+0x7dc5) [0x7f4aa1469dc5]
 9: (clone()+0x6d) [0x7f4a9f8dbced]
 NOTE: a copy of the executable, or objdump -rdS <executable> is needed to interpret this.

@yuyuyu101 yuyuyu101 merged commit fd704ab into ceph:master Sep 29, 2017
@tchaikov
Contributor

@yuyuyu101 and @ownedu please avoid introducing merge commits.

@yuyuyu101
Member

My bad, I didn't notice the extra merge commit...

@ownedu ownedu deleted the wip-fix-asyncrdma-coredump branch September 29, 2017 22:19
@ownedu
Contributor Author

ownedu commented Sep 29, 2017

Thanks @yuyuyu101

@ownedu
Contributor Author

ownedu commented Sep 30, 2017

@xiexingguo Could you pls add the RDMA label to this PR? Thanks!
