msg/async/rdma: fix a coredump introduced by PR #18053, #18204

Merged
merged 1 commit into from Oct 13, 2017

4 participants
@ownedu
Contributor

commented Oct 10, 2017

where the iterator is not working properly after erase().

introduced by #18053

Signed-off-by: Yan Lei <yongyou.yl@alibaba-inc.com>

@ownedu

Contributor Author

commented Oct 10, 2017

@yuyuyu101 @tchaikov pls help review this fix; thanks.

@tchaikov tchaikov requested a review from yuyuyu101 Oct 10, 2017

@@ -244,17 +244,20 @@ void RDMADispatcher::polling()
       perf_logger->set(l_msgr_rdma_inflight_tx_chunks, inflight);
       if (num_dead_queue_pair) {
         Mutex::Locker l(lock); // FIXME reuse dead qp because creating one qp costs 1 ms
-        for (auto &i : dead_queue_pairs) {
+        auto it = dead_queue_pairs.begin();

@tchaikov

tchaikov Oct 10, 2017

Contributor

my bad, i didn't realize that we were removing elements while iterating thru the vector.

@ownedu

ownedu Oct 10, 2017

Author Contributor

Thanks for your quick review, anyway.
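
For readers following this thread: erasing from a std::vector inside a range-based for loop invalidates the loop's hidden iterators, which is undefined behavior and can surface as a crash like the coredump described here. The sketch below shows the usual iterator-based pattern, where erase() hands back the next valid iterator. It is an illustration only, not the code from this PR: FakeQueuePair and reap_idle are hypothetical names, and the real queue pair type in Ceph is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a queue pair; only the pending-work count matters here.
struct FakeQueuePair {
  int pending_tx_wr;  // non-zero means Tx completions are still outstanding
  int get_tx_wr() const { return pending_tx_wr; }
};

// Deletes and removes every entry with no pending work, and returns how many
// entries were removed. Entries that still have pending work stay in the vector.
std::size_t reap_idle(std::vector<FakeQueuePair*>& dead) {
  std::size_t reaped = 0;
  auto it = dead.begin();
  while (it != dead.end()) {
    FakeQueuePair* qp = *it;
    assert(qp != nullptr);
    if (qp->get_tx_wr()) {
      ++it;                 // keep this entry: advance manually
    } else {
      delete qp;
      it = dead.erase(it);  // erase() returns an iterator to the next element
      ++reaped;
    }
  }
  return reaped;
}
```

The key point is that the loop never touches an iterator after erase() has invalidated it: the iterator is either incremented or replaced by the return value of erase(), never both.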

@tchaikov tchaikov changed the title from "msg/async/rdma: fix a coredump bug which is introduced by PR #18053," to "msg/async/rdma: fix a coredump introduced by PR #18053," on Oct 10, 2017

perf_logger->dec(l_msgr_rdma_active_queue_pair);
--num_dead_queue_pair;
if (i->get_tx_wr()) {
ldout(cct, 10) << __func__ << " bypass qp=" << i << " tx_wr=" << i->get_tx_wr() << dendl;

@yuyuyu101

yuyuyu101 Oct 10, 2017

Member

I suggest raising this to level 20

@ownedu

ownedu Oct 10, 2017

Author Contributor

ok
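
As background for the request above: with level-gated logging, a message tagged with a higher level only appears when the configured verbosity is at least that level, so moving the "bypass qp" message from 10 to 20 keeps it out of logs unless more verbose debugging is enabled. The following is a minimal sketch of that gating idea. LeveledLog is a hypothetical class written for illustration and is not Ceph's ldout implementation.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical minimal logger illustrating verbosity gating; not Ceph's ldout.
class LeveledLog {
 public:
  explicit LeveledLog(int configured_level)
      : configured_level_(configured_level) {}

  // Records the message only if its level does not exceed the configured
  // verbosity. Returns true when the message was recorded.
  bool log(int level, const std::string& msg) {
    assert(level >= 0);
    if (level > configured_level_) return false;
    out_ << level << " " << msg << "\n";
    return true;
  }

  // Everything recorded so far, one message per line.
  std::string str() const { return out_.str(); }

 private:
  int configured_level_;
  std::ostringstream out_;
};
```

With a logger configured at 10, a level-20 message is dropped, while a logger configured at 20 records it.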

@ownedu

Contributor Author

commented Oct 11, 2017

@yuyuyu101 Log level is bumped up, pls help double check; thanks.

@tchaikov tchaikov added the needs-qa label Oct 11, 2017

@ownedu

Contributor Author

commented Oct 11, 2017

@tchaikov could you pls help me clarify the needs-qa label? Does that mean this commit will go through some automatic QA tests before merging? Thanks.

@tchaikov

Contributor

commented Oct 11, 2017

@ownedu see http://docs.ceph.com/docs/master/dev/#integration-tests-aka-ceph-qa-suite

also, could you squash your commits into a single one? i think you've addressed @yuyuyu101 's concern.

@ownedu

Contributor Author

commented Oct 11, 2017

@tchaikov Regarding the needs-qa, got it and thanks; and the commits combination is done.

msg/async/rdma: fix a coredump bug which is introduced by PR #18053,
where the iterator is not working properly after erase().

Signed-off-by: Yan Lei <yongyou.yl@alibaba-inc.com>

@ownedu ownedu force-pushed the ownedu:wip-fix-async-rdma-coredump branch from 8f90d71 to 322f87f Oct 11, 2017

@ownedu

Contributor Author

commented Oct 11, 2017

@yuyuyu101 @tchaikov The latest commit failed "make check", probably because of a bad connection. Could you pls help take a look? And if so, how do I rerun "make check"?

The following tests FAILED:
2 - run-cli-tests (Failed)
Errors while running CTest
Build step 'Execute shell' marked build as failure
[PostBuildScript] - Execution post build scripts.
[ceph-pull-requests] $ /bin/sh -xe /tmp/jenkins7902536043534808187.sh

+ sudo reboot
Agent went offline during the build
ERROR: Connection was broken: java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2638)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3113)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:853)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:349)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)

@tchaikov

Contributor

commented Oct 11, 2017

> how to redo "make check"

jenkins, retest this please.

@ownedu

Contributor Author

commented Oct 11, 2017

Passed; thanks @tchaikov

@ownedu

Contributor Author

commented Oct 11, 2017

@yuyuyu101 @tchaikov Will the QA teuthology testing be started automatically once it is scheduled?

@tchaikov

Contributor

commented Oct 11, 2017

@ownedu please define "schedule".

@ownedu

Contributor Author

commented Oct 11, 2017

@tchaikov I see "Job stats for this page: 2080 queued, 158 failed, 72 waiting, 102 running, 47 dead, 1348 passed, (3807 total)" in http://pulpito.ceph.com, if I understand correctly this PR's QA is queued and waiting for test.

Plz correct me if I misunderstood anything.

@tchaikov

Contributor

commented Oct 11, 2017

AFAICT, pulpito is not triggered by any label on github.

> is queued

no

> and waiting for test.

yes.

some kind developer will collect the "needs-qa" PRs for testing them in batch, and analyze the test result, then hopefully merge the tested PRs if all goes well. and it does not happen automatically. normally, this takes up to one week.

@ownedu

Contributor Author

commented Oct 11, 2017

@tchaikov gotcha; thanks.

@tchaikov tchaikov merged commit 5f021b2 into ceph:master Oct 13, 2017

5 checks passed

Docs: build check OK - docs built
Signed-off-by all commits in this PR are signed
Unmodified Submodules submodules for project are unmodified
make check make check succeeded
make check (arm64) make check succeeded