Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librbd: asynchronous image removal state machine #12102

Merged
merged 15 commits into from Feb 24, 2017

Conversation

yangdongsheng
Copy link
Contributor

Thanx @vshankar give me the honor to open a separate Pull Request for Async remove state machine. This will be referred by #10896 and #12041

@yangdongsheng
Copy link
Contributor Author

Seems it does not compile because there is some other dependent commits. Seems not difficult, Will address it tomorrow .

@yangdongsheng
Copy link
Contributor Author

anyone can help to add tags of rbd and feature? Thanx :)

@trociny
Copy link
Contributor

trociny commented Nov 23, 2016

@yangdongsheng @vshankar It looks like it needs some work -- see Jenkins failure.

@trociny trociny changed the title librbd: Async remove [DNM] librbd: asynchronous image removal state machine Nov 23, 2016
@vshankar
Copy link
Contributor

@trociny mostly yes -- probably due to changes in other parts of librbd.

@yangdongsheng I can take a look at this next week but feel free to fix it!

@yangdongsheng
Copy link
Contributor Author

Thanx @trociny

@vshankar I will try to fix it later after I finish my another work, maybe tomorrow. Thanx

@yangdongsheng
Copy link
Contributor Author

@trociny and @vshankar It's green now. I missed one commit about force remove. It's added back now.

trociny pushed a commit that referenced this pull request Nov 28, 2016
[DNM] librbd: asynchronous image removal state machine #12102
@trociny
Copy link
Contributor

trociny commented Nov 30, 2016

@yangdongsheng @vshankar
When removing non-existent image, it does not return error:

zhuzha:~/ceph/ceph.upstream/build% ./bin/rbd remove NOTEXIST ; echo $?
2016-11-30 11:28:06.677754 7fe8feffd700 -1 librbd::image::RemoveRequest: 0x7fe940170370 handle_open_image error opening image: (2) No such file or directory
Removing image: 100% complete...done.
0

zhuzha:~/ceph/ceph.upstream/build% RBD_FEATURES=127 LD_LIBRARY_PATH=./lib PYTHONPATH=../src/pybind:../src/test/pybind:./lib/cython_modules/lib.2  nosetests -v test_rbd -m 'test_remove_dne'
test_rbd.test_remove_dne ... FAIL

======================================================================
FAIL: test_rbd.test_remove_dne
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/SSD/ceph/ceph.upstream/build/test_rbd.py", line 255, in test_remove_dne
AssertionError: ImageNotFound not raised

trociny pushed a commit that referenced this pull request Nov 30, 2016
[DNM] librbd: asynchronous image removal state machine #12102
ldout(m_cct, 20) << ": r=" << *result << dendl;

if (*result < 0) {
lderr(m_cct) << "error opening image: " << cpp_strerror(*result) << dendl;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trociny The bug about NOTEXIST is about there should be a "m_retval = *result;". I did not fix it immediately, because I found there is no steps to remove journal and objectmap. @vshankar is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the missing of m_retval assignment is definitely a bug. but why the journal and objectmap are not removed in RemoveRequest. @vshankar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind. I found there is a step for disable features. These work would be done here. So I will fix the m_retval bug then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yangdongsheng Missing m_retval assignment is definitely a bug, nothing intentional. While that needs to be fixed, make sure m_retval is not clobbered again in handle_mirror_image_remove().

Journal and ObjectMap are removed -- see disable_features which invokes DisableFeaturesRequest state machine.

@yangdongsheng
Copy link
Contributor Author

@trociny @vshankar The bug about removing NOTEXIST was fixed. And I squashed this fix into the main commit by adding a comment in commit message above my singed-off-by.

trociny pushed a commit that referenced this pull request Dec 2, 2016
[DNM] librbd: asynchronous image removal state machine #12102
@trociny
Copy link
Contributor

trociny commented Dec 3, 2016

@yangdongsheng @vshankar This test from qa/workunits/rbd/copy.sh fails:

test_remove() {
   ...
    # remove with header missing (old format)
    rbd create --image-format 1 -s 1 test1
    rados rm -p rbd test1.rbd
    rbd rm test1

Now it returns:

2016-12-03 19:19:21.688776 7fec30ff9700 -1 librbd::image::RemoveRequest: 0x7fec6e1b0480 handle_open_image error opening image: (2) No such file or directory
Removing image: 0% complete...failed.
rbd: delete error: (2) No such file or directory

@yangdongsheng
Copy link
Contributor Author

@trociny When I look into more about the RemoveRequest, I think there are some more work we can do before merging it. So I propose we hold a while on this pull request. I will send a commit about my change about it for review, if that's okey, I will squash it into the original commit. Sounds good? @trociny @vshankar

@yangdongsheng
Copy link
Contributor Author

@trociny The new commit will address the problem you mentioned. @vshankar please help to review it.

@vshankar
Copy link
Contributor

vshankar commented Dec 5, 2016

@trociny The new commit will address the problem you mentioned. @vshankar please help to review it.

Will review it soon (currently I'm in mid of pushing my PR). Thanks.

ldout(m_cct, 20) << ": r=" << *result << dendl;

if ((*result < 0) && (*result != -ENOENT)) {
if (m_retval == 0) {
Copy link
Contributor

@vshankar vshankar Dec 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like an unnecessary check. When we are here m_retval is always -ve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be non-zero here. I added a check here just be sure we will not overwrite it. And make this function safe enough even when we move it to another locate in state machine.

That means, yes in this context, we don't need this block of code. But from the insight of making this function logically consistent. what about keep these code here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, its good to put it up as a comment and keep the function without introducing changes which are not necessary now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment sounds good. thanx

Context *RemoveRequest<I>::handle_mirror_image_remove(int *result) {
ldout(m_cct, 20) << ": r=" << *result << dendl;

if ((*result < 0) && (*result != -ENOENT)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look at the original code. We need to check for -EOPNOTSUPP too, just in case we are running on an older cluster. Also, I don't like very much unnecessary extra parenthesis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct, EOPNOTSUPP is important.

@yangdongsheng
Copy link
Contributor Author

@trociny and @vshankar retest this please, I have addressed these comments and squashed my new commit to the original commit.

@dillaman
Copy link

@yangdongsheng It actually might just be an existing issue with the OpenStack tempest test case in how it handles cleanup [1].

[1] https://bugs.launchpad.net/cinder/+bug/1648885

@yangdongsheng
Copy link
Contributor Author

@dillaman sounds yes. Thanx for your information. :) Then I think we can continue what we need on this PR.

Copy link

@dillaman dillaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dillaman
Copy link

retest this please

@dillaman
Copy link

@yangdongsheng there is a unit test failure with the latest re-build request

@yangdongsheng
Copy link
Contributor Author

@dillaman sounds like a race condition here.


send_trim_image()
   - RWLock::RLocker owner_lock(m_image_ctx->owner_lock)       |
   - TrimRequest request->send();                                                     |              m_on_finish->complete()
   - ~ owner_locker automatically in call stack being destroyed.     |              send_close_image()
                                                                                                              |              ceph_assert(!m_image_ctx->owner_lock)
                                                                                                                            

So if the gc in c++ for the instance of owner_lock is after send_close_image(), this ceph_assert() could fail.

Sounds possible?

@dillaman
Copy link

dillaman commented Feb 17, 2017

@yangdongsheng You should never assert on a lock not being held because another thread could always be using the lock in the background (in another thread). In the case of TrimRequest, if the image is zero bytes, I'd image that it would immediately exit while still holding the lock. You can wrap the callback provided to TrimRequest with librbd::util::create_async_context_callback to ensure that the "handle" method is invoked in a lock-free context.

@yangdongsheng
Copy link
Contributor Author

@dillaman thanx jason. updated as below:

diff --git a/src/librbd/image/RemoveRequest.cc b/src/librbd/image/RemoveRequest.cc
index 11633c5..3898ab3 100644
--- a/src/librbd/image/RemoveRequest.cc
+++ b/src/librbd/image/RemoveRequest.cc
@@ -24,6 +24,7 @@ namespace image {

 using librados::IoCtx;
 using util::create_context_callback;
+using util::create_async_context_callback;
 using util::create_rados_ack_callback;

 static const std::string IMAGE_CLIENT_ID("");
@@ -288,7 +289,8 @@ void RemoveRequest<I>::trim_image() {
   ldout(m_cct, 20) << dendl;

   using klass = RemoveRequest<I>;
-  Context *ctx = create_context_callback<klass, &klass::handle_trim_image>(this);
+  Context *ctx = create_async_context_callback(
+    *m_image_ctx, create_context_callback<klass, &klass::handle_trim_image>(this));

   RWLock::RLocker owner_lock(m_image_ctx->owner_lock);
   librbd::operation::TrimRequest<I> *req = librbd::operation::TrimRequest<I>::create(
diff --git a/src/test/librbd/image/test_mock_RemoveRequest.cc b/src/test/librbd/image/test_mock_RemoveRequest.cc
index 6b202a4..74c66c1 100644
--- a/src/test/librbd/image/test_mock_RemoveRequest.cc
+++ b/src/test/librbd/image/test_mock_RemoveRequest.cc
@@ -240,9 +240,9 @@ TEST_F(TestMockImageRemoveRequest, SuccessV1) {

   InSequence seq;
   expect_state_open(*m_mock_imctx, 0);
-  expect_op_work_queue(*m_mock_imctx);
   expect_get_group(*m_mock_imctx, 0);
   expect_trim(*m_mock_imctx, mock_trim_request, 0);
+  expect_op_work_queue(*m_mock_imctx);
   expect_state_close(*m_mock_imctx);
   expect_wq_queue(op_work_queue, 0);

@@ -290,10 +290,10 @@ TEST_F(TestMockImageRemoveRequest, SuccessV2) {

   InSequence seq;
   expect_state_open(*m_mock_imctx, 0);
-  expect_op_work_queue(*m_mock_imctx);
   expect_mirror_image_get(*m_mock_imctx, 0);
   expect_get_group(*m_mock_imctx, 0);
   expect_trim(*m_mock_imctx, mock_trim_request, 0);
+  expect_op_work_queue(*m_mock_imctx);
   expect_remove_child(*m_mock_imctx, 0);
   expect_mirror_disable(*m_mock_imctx, mock_mirror_disable_request, 0);
   expect_state_close(*m_mock_imctx);
@@ -324,10 +324,10 @@ TEST_F(TestMockImageRemoveRequest, NotExistsV2) {

   InSequence seq;
   expect_state_open(*m_mock_imctx, 0);
-  expect_op_work_queue(*m_mock_imctx);
   expect_mirror_image_get(*m_mock_imctx, 0);
   expect_get_group(*m_mock_imctx, 0);
   expect_trim(*m_mock_imctx, mock_trim_request, 0);
+  expect_op_work_queue(*m_mock_imctx);
   expect_remove_child(*m_mock_imctx, 0);
   expect_mirror_disable(*m_mock_imctx, mock_mirror_disable_request, 0);
   expect_state_close(*m_mock_imctx);

@dillaman
Copy link

@yangdongsheng Looks like the negative assert is still present and I hit it when I run the unit tests locally:

8: [ RUN ] TestMockImageRemoveRequest.SuccessV1
8: /home/jdillaman/ceph/src/librbd/image/RemoveRequest.cc: In function 'void librbd::image::RemoveRequest::send_close_image(int) [with ImageCtxT = librbd::MockImageCtx]' thread 7fa3b8b17700 time 2017-02-20 18:43:24.537008
8: /home/jdillaman/ceph/src/librbd/image/RemoveRequest.cc: 390: FAILED assert(!m_image_ctx->owner_lock.is_locked())
8: ceph version 12.0.0-619-g5389cae (5389caea91f9881085571952f7b71dda3a0255f7)
8: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fa3c0902e82]
8: 2: (librbd::image::RemoveRequestlibrbd::MockImageCtx::send_close_image(int)+0x17f) [0x555d4b0b309f]
8: 3: (librbd::image::RemoveRequestlibrbd::MockImageCtx::handle_trim_image(int*)+0xa3) [0x555d4b0b45f3]
8: 4: (librbd::util::detail::C_StateCallbackAdapter<librbd::image::RemoveRequestlibrbd::MockImageCtx, &librbd::image::RemoveRequestlibrbd::MockImageCtx::handle_trim_image, true>::complete(int)+0x1b) [0x555d4b0b708b]
8: 5: (ContextWQ::process(Context*)+0xe1) [0x555d4afdb561]
8: 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0x1025) [0x7fa3c090c275]
8: 7: (ThreadPool::WorkThread::entry()+0x10) [0x7fa3c090d490]
8: 8: (()+0x75ca) [0x7fa3c93955ca]
8: 9: (clone()+0x6d) [0x7fa3be08c0ed]

yangdongsheng and others added 4 commits February 21, 2017 19:17
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
@yangdongsheng
Copy link
Contributor Author

@dillaman there is a fix for it. And I tested it by hacking some code in trim to delay the releasing of owner_lock to reproduce the race condition.

--- a/src/librbd/image/RemoveRequest.cc
+++ b/src/librbd/image/RemoveRequest.cc
@@ -302,12 +302,16 @@ template<typename I>
 Context *RemoveRequest<I>::handle_trim_image(int *result) {
   ldout(m_cct, 20) << ": r=" << *result << dendl;
 
+  {
+    // Wait the RLocker released in trim_image()
+    RWLock::WLocker owner_lock(m_image_ctx->owner_lock);
+  }
+
   if (*result < 0) {
     lderr(m_cct) << "warning: failed to remove some object(s): " << cpp_strerror(*result)
                  << dendl;
   }

ldout(m_cct, 20) << ": r=" << *result << dendl;

{
// Wait the RLocker released in trim_image()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why this is even needed. librbd::image::CloseRequest will eventually take a write-lock on ImageCtx::owner_lock, so any potential for thread races should be cleared at that point long before the lock is destroyed. Did you ever hit a race where something failed because the trim request process was still holding a read-lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a second thought, I think there is no necessary for a assert at https://github.com/ceph/ceph/pull/12102/files#diff-7b8bedbcd26c203c38e02a5bd9dbfa1cR394 . This assert was moved from internal.cc which was a single-threading implementation. As you mentioned, CloseRequest will take the write-lock if they want. So we don't have to make sure the owner_lock is not hold outside. So fix as below:

--- a/src/librbd/image/RemoveRequest.cc
+++ b/src/librbd/image/RemoveRequest.cc
@@ -302,11 +302,6 @@ template<typename I>
 Context *RemoveRequest<I>::handle_trim_image(int *result) {
   ldout(m_cct, 20) << ": r=" << *result << dendl;
 
-  {
-    // Wait the RLocker released in trim_image()
-    RWLock::WLocker owner_lock(m_image_ctx->owner_lock);
-  }
-
   if (*result < 0) {
     lderr(m_cct) << "warning: failed to remove some object(s): " << cpp_strerror(*result)
                  << dendl;
@@ -391,8 +386,6 @@ void RemoveRequest<I>::send_close_image(int r) {
   ldout(m_cct, 20) << dendl;
 
   m_ret_val = r;
-  assert(!m_image_ctx->owner_lock.is_locked());
-
   using klass = RemoveRequest<I>;
   Context *ctx = create_context_callback<klass, &klass::handle_send_close_image>(this);

And I tested it by for ((i=0;i<100000;i++)); do CEPH_LIB=./lib ./bin/unittest_librbd --gtest_filter='TestMockImageRemoveRequest.*' >> /tmp/output; done; grep FAIL /tmp/output with hacking this code:

--- a/src/librbd/image/RemoveRequest.cc
+++ b/src/librbd/image/RemoveRequest.cc
@@ -296,6 +296,11 @@ void RemoveRequest<I>::trim_image() {
   librbd::operation::TrimRequest<I> *req = librbd::operation::TrimRequest<I>::create(
     *m_image_ctx, ctx, m_image_ctx->size, 0, m_prog_ctx);
   req->send();
+  int i = 0;
+  for (i = 0; i < 10000; i++) {
+    lderr(m_cct) << "delay delay delay the lock releasing: "
+                 << dendl;
+  }
 }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yangdongsheng just a hint: --gtest_repeat=1000000

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trociny , Oh great!! thanx for your information. :)

yangdongsheng and others added 11 commits February 23, 2017 11:10
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
The copy.sh is not only testing the rbd copy, but also
others such as rbd ls, rbd remove. Then rename it to generic.sh

Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants