jewel: rbd: rbd-nbd IO hang #11467

Merged
4 commits merged into from Oct 25, 2016

@ghost ghost commented Oct 13, 2016

@ghost ghost self-assigned this Oct 13, 2016
@ghost ghost added this to the jewel milestone Oct 13, 2016
@ghost ghost added bug-fix rbd labels Oct 13, 2016
@ghost
Author

ghost commented Oct 13, 2016

  CXX      test/librbd/operation/unittest_librbd-test_mock_SnapshotRemoveRequest.o
  CXX      test/librbd/operation/unittest_librbd-test_mock_SnapshotRollbackRequest.o
test/librbd/operation/test_mock_ResizeRequest.cc: In member function ‘virtual void librbd::operation::TestMockOperationResizeRequest_FlushCacheError_Test::TestBody()’:
test/librbd/operation/test_mock_ResizeRequest.cc:298:315: error: no matching function for call to ‘librbd::operation::TestMockOperationResizeRequest_FlushCacheError_Test::when_resize(librbd::MockImageCtx&, uint64_t, bool, int, bool)’
   ASSERT_EQ(-EINVAL, when_resize(mock_image_ctx, ictx->size / 2, true, 0, false));
                                                                                                                                                                                                                                                                                                                           ^
test/librbd/operation/test_mock_ResizeRequest.cc:298:315: note: candidate is:
test/librbd/operation/test_mock_ResizeRequest.cc:136:7: note: int librbd::operation::TestMockOperationResizeRequest::when_resize(librbd::MockImageCtx&, uint64_t, uint64_t, bool)
   int when_resize(MockImageCtx &mock_image_ctx, uint64_t new_size,
       ^
test/librbd/operation/test_mock_ResizeRequest.cc:136:7: note:   candidate expects 4 arguments, 5 provided

MockImageCtx should probably be MockImageTestCtx or something similar, which was introduced after this commit but backported before it.

@ghost ghost changed the title jewel: rbd-nbd IO hang DNM: jewel: rbd-nbd IO hang Oct 13, 2016
@dillaman

@dachary Do you need help with this PR so we can get the nbd tests functional?

@ghost
Author

ghost commented Oct 20, 2016

I think I can manage it, I'll ask for help if I'm stuck. Will work on it in the next 24h.

Jason Dillaman added 4 commits October 21, 2016 12:16
Fixes: http://tracker.ceph.com/issues/16921
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit ce7c152)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit c6cfb61)
Any potential writeback outside the extents of a shrunk image
would result in orphaned objects.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 3f93a19)
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
(cherry picked from commit 4ce6638)

Conflicts:
     src/test/librbd/operation/test_mock_ResizeRequest.cc:
     when_resize does not have the allow_shrink argument because
     d1f2c55 has not been backported
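
The conflict resolution noted above can be sketched as follows. This is a minimal, hypothetical illustration and not the actual test code: the `MockImageCtx` stand-in, its `flush_result` field, the helper body, and the parameter names `journal_op_tid` and `disable_journal` are all assumptions. Only the four-parameter shape of `when_resize` (reference, `uint64_t`, `uint64_t`, `bool`) and the five-argument call come from the compiler output earlier in this thread.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for librbd::MockImageCtx (the real mock is far richer).
struct MockImageCtx {
  uint64_t size;
  int flush_result;  // assumed knob: result of a simulated cache flush
};

// Jewel-era helper shape taken from the compiler note: four parameters and
// no allow_shrink flag. Parameter names and the body are illustrative only.
int when_resize(MockImageCtx &mock_image_ctx, uint64_t new_size,
                uint64_t journal_op_tid, bool disable_journal) {
  (void)journal_op_tid;
  (void)disable_journal;
  // Simulate a shrink failing when the cache flush reports an error.
  if (new_size < mock_image_ctx.size && mock_image_ctx.flush_result < 0) {
    return mock_image_ctx.flush_result;
  }
  mock_image_ctx.size = new_size;
  return 0;
}

// The cherry-picked test called the helper with five arguments:
//   when_resize(mock_image_ctx, ictx->size / 2, true, 0, false);
// Against the jewel signature the extra `true` (allow_shrink) has to go:
int run_backported_call(MockImageCtx &mock_image_ctx) {
  return when_resize(mock_image_ctx, mock_image_ctx.size / 2, 0, false);
}
```

Dropping the extra argument keeps the backport small; the alternative would be to backport d1f2c55 first so that the original five-argument call compiles unchanged.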
@ghost ghost changed the title DNM: jewel: rbd-nbd IO hang jewel: rbd-nbd IO hang Oct 21, 2016
@ghost
Author

ghost commented Oct 21, 2016

jenkins test this please (rbd core)

@ghost
Author

ghost commented Oct 21, 2016

building in gitbuilder under wip-17262-jewel

@ghost
Author

ghost commented Oct 21, 2016

jenkins test this please (rbd failure again) http://tracker.ceph.com/issues/17642

@dillaman

retest this please

@ghost
Author

ghost commented Oct 24, 2016

filter="rbd/thrash/{base/install.yaml clusters/{fixed-2.yaml openstack.yaml} fs/xfs.yaml msgr-failures/few.yaml thrashers/cache.yaml workloads/rbd_fsx_nbd.yaml}"
teuthology-suite --priority 101 --suite rbd --filter="$filter" --suite-branch jewel --email loic@dachary.org --ceph wip-17262-jewel --machine-type smithi

@dillaman

@dachary hmm -- assertion failure in the daemon on one test run but no coredump generated.

@ghost
Author

ghost commented Oct 24, 2016

Re-running the above to ascertain how transient (or not) the failure is.

@dillaman

@dachary I'm hoping it's just due to the fact that PR #10869 hasn't been merged. Hard to tell since the assertion failure reason wasn't logged.

@ghost
Author

ghost commented Oct 24, 2016

@dillaman the thrashers/default job passed once and the thrashers/cache job passed once. Do you think this is good enough to merge? Or should we wait for PR #10869 to be merged, which might improve the situation?

@dillaman

@dachary Let me re-run it a few times w/ rbd debugging enabled to see if I can repeat it today.

@dillaman

@dachary This failure is something different -- I am going to try to get a coredump out of this.

@dillaman

@dachary This PR is good to merge -- but I need to open a new ticket for the associated crash since it's high-pri and isn't related to this change.

@ghost ghost merged commit 7714689 into ceph:jewel Oct 25, 2016
@theanalyst theanalyst changed the title jewel: rbd-nbd IO hang "jewel: rbd: rbd-nbd IO hang" Nov 17, 2016
@theanalyst theanalyst changed the title "jewel: rbd: rbd-nbd IO hang" jewel: rbd: rbd-nbd IO hang Nov 17, 2016
This pull request was closed.