rbd: when Ceph cluster is full, return -ENOSPC for creating image command. #14167

liupan1111 · 2017-03-27T14:38:45Z

As we discussed in #14116, I improved the "create" path so that it returns -ENOSPC when the cluster is full.

liupan1111 · 2017-03-27T14:48:23Z

@dillaman Please take a look, thanks!

yangdongsheng · 2017-03-28T07:53:21Z

Just curious, how do we decide whether an action should call set_osdmap_full_try() or not? So far, remove and snap remove look reasonable to me. But why create? If create should not be blocked, what about other actions? resize? list? What's the policy for this?

liupan1111 · 2017-03-28T08:14:50Z

list and resize won't be blocked at the moment, because they are not in the "write" path.

yangdongsheng · 2017-03-28T08:33:31Z

Wait a minute, resize will not be blocked? How can I request a larger size when the cluster is full?

liupan1111 · 2017-03-28T08:36:19Z

An rbd image is thin-provisioned, so "resize" only changes a value in the metadata; it doesn't actually allocate any space.

yangdongsheng · 2017-03-28T08:49:56Z

But it's a "write" on the header object, see:
https://github.com/ceph/ceph/blob/master/src/librbd/operation/ResizeRequest.cc#L383

and the command blocked in my testing:

$ rbd resize test -s 10G
2017-03-28 16:44:52.547655 7f98217fa700  0 client.4152.objecter  FULL, paused modify 0x7f9800008a70 tid 7
^C
$ ^C
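
To illustrate the behavior in the log above, here is a toy Python model of what this thread describes: on a full cluster a modify op is paused unless it carries the full-try flag, in which case it is attempted and fails with -ENOSPC unless it frees space. This is only a sketch under those assumptions. `Op`, `submit`, and the string results are hypothetical names, not Ceph APIs, and the real objecter/OSD logic has more cases.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    is_write: bool             # modifies an object (data or header)
    frees_space: bool = False  # e.g. remove, snap remove
    full_try: bool = False     # stands in for CEPH_OSD_FLAG_FULL_TRY


def submit(op: Op, cluster_full: bool) -> str:
    """Return a label for what happens to the op in this toy model."""
    if not op.is_write or not cluster_full:
        return "ok"
    if not op.full_try:
        # Matches the "FULL, paused modify" message: the client waits.
        return "paused"
    # Assumption: with full-try set, only space-freeing writes succeed.
    return "ok" if op.frees_space else "-ENOSPC"


if __name__ == "__main__":
    ops = [
        Op("list", is_write=False),
        Op("resize", is_write=True),  # header write, no full-try
        Op("create", is_write=True, full_try=True),
        Op("remove", is_write=True, frees_space=True, full_try=True),
    ]
    for op in ops:
        print(op.name, submit(op, cluster_full=True))
```

In this model resize ends up "paused" even though the image is thin-provisioned, because the header update still counts as a write.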

liupan1111 · 2017-03-28T08:56:57Z

Cool, good catch. I want to fix create for now, and fix the other write-related commands together later.

yangdongsheng · 2017-03-28T09:23:09Z

I am just wondering what kind of actions should be blocked. Just data writes?
So, should we call set_osdmap_full_try() on ImageCtx.md_ctx?

I don't think it's a good idea to call set_osdmap_full_try() for actions one by one.

dillaman · 2017-03-28T13:28:50Z

@yangdongsheng While I also agree that this continued effort has the potential to get beyond ridiculous real fast when the simple solution is don't run with clusters at 100% capacity, we cannot just set the flag on ImageCtx::md_ctx unconditionally since some IO things also hit it (like object map updates).

dillaman · 2017-03-28T13:37:32Z

@liupan1111 That first commit against the osdc should be a separate PR for the core team to review. What exactly is the bug you are fixing?

liupan1111 · 2017-03-28T13:40:51Z

@dillaman, the first commit completes the incomplete fix in #12627, which is also about full try: there I added op->target.flags |= CEPH_OSD_FLAG_FULL_TRY in the wrong place.

I included that commit in this PR because it is a prerequisite for the create fix that follows.

liupan1111 · 2017-03-28T13:42:28Z

> @yangdongsheng While I also agree that this continued effort has the potential to get beyond ridiculous real fast when the simple solution is don't run with clusters at 100% capacity, we cannot just set the flag on ImageCtx::md_ctx unconditionally since some IO things also hit it (like object map updates).

One thing I just want to clarify: it is not 100% capacity, just "full", which is controlled by mon_osd_full_ratio (default 95%).
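
As a small worked example of that threshold, the check can be sketched like this. It is an illustration only: `full_osds` and the usage numbers are made up, and it assumes the ratio is compared against each OSD's used/total space, which is a simplification of how Ceph tracks fullness.

```python
def full_osds(usage, full_ratio=0.95):
    """Return the names of OSDs whose used/total ratio reaches full_ratio.

    usage maps a (hypothetical) OSD name to a (used, total) pair in the
    same unit, e.g. GiB.
    """
    return sorted(
        name
        for name, (used, total) in usage.items()
        if used / total >= full_ratio
    )


if __name__ == "__main__":
    usage = {
        "osd.0": (960, 1000),  # 96% used, over the 95% threshold
        "osd.1": (500, 1000),  # 50% used
    }
    print(full_osds(usage))
```

So an OSD at 96% usage already counts as full here, well before it reaches 100% capacity.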

dillaman · 2017-03-28T13:43:38Z

@liupan1111 Understood -- but it's one thing to support freeing up space from your cluster when you are full and it's another to try to work around every possible way to touch images when full.

liupan1111 · 2017-03-28T13:45:56Z

@dillaman Agreed. So how about I leave just the second commit in this PR and move the first one into a new PR?

dillaman · 2017-03-28T13:56:07Z

@liupan1111 Perfect, thanks

liupan1111 · 2017-03-28T14:05:23Z

@dillaman done, thanks.

liupan1111 · 2017-04-02T04:24:41Z

@dillaman ping

liupan1111 · 2017-04-02T04:27:08Z

#14193

dillaman · 2017-04-03T15:16:43Z

@liupan1111 I agree w/ @yangdongsheng in that we don't want to keep adding these things one-by-one for each CLI action. It was one thing when it was just to help remove RBD images when the cluster is full, but it's going to be another thing to support any arbitrary RBD action. I think the only way to solve this would be to have a new librados::IoCtx within ImageCtx that is used for all operations throughout librbd w/o the need to only address this fix in the rbd CLI.
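
One possible reading of that suggestion, sketched as a toy Python model rather than librbd code: the image context carries a dedicated I/O context with full-try enabled for image operations, while the context shared with the IO path (for example object map updates) stays untouched. `op_ctx` and `ctx_for` are hypothetical names, and the classes only mimic `librados::IoCtx` and `ImageCtx` in name.

```python
from dataclasses import dataclass, field


@dataclass
class IoCtx:
    pool: str
    full_try: bool = False

    def set_osdmap_full_try(self) -> None:
        # Mimics the librados call of the same name; here it just sets a flag.
        self.full_try = True


@dataclass
class ImageCtx:
    pool: str
    md_ctx: IoCtx = field(init=False)  # shared with the IO path, flag left off
    op_ctx: IoCtx = field(init=False)  # hypothetical dedicated context

    def __post_init__(self) -> None:
        self.md_ctx = IoCtx(self.pool)
        self.op_ctx = IoCtx(self.pool)
        self.op_ctx.set_osdmap_full_try()

    def ctx_for(self, kind: str) -> IoCtx:
        """Pick the context by request kind instead of per CLI action."""
        return self.op_ctx if kind == "operation" else self.md_ctx


if __name__ == "__main__":
    ictx = ImageCtx("rbd")
    print(ictx.ctx_for("operation").full_try)          # flag set once, centrally
    print(ictx.ctx_for("object_map_update").full_try)  # IO path unaffected
```

The point of the sketch is that the flag is set in one place inside the library, so the rbd CLI would not need a separate change for each action.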

liupan1111 · 2017-04-29T16:17:55Z

@dillaman, I agree. I also talked with dongsheng offline, and I will think about it.

liupan1111 added feature rbd core common labels Mar 27, 2017

liupan1111 requested a review from dillaman March 27, 2017 14:48

rbd: when Ceph cluster is full, return -ENOSPC for creating image com…

af6106f

…mand. Signed-off-by: Pan Liu <liupan1111@gmail.com>

liupan1111 removed common core labels Mar 28, 2017

dillaman closed this May 1, 2017
