
os/bluestore: fix the allocate in bluefs #19030

Merged
1 commit merged into ceph:master on Nov 23, 2017

Conversation

@tangwenjun3 (Contributor) commented Nov 20, 2017:

When BlueFS succeeds in reserving space but then fails to allocate it on the DB device, it hits an assert caused purely by space fragmentation. In this situation it cannot fall back to the slow device's space. This can happen with the stupid or bitmap allocator.

Signed-off-by: tangwenjun <tang.wenjun3@zte.com.cn>
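
To make the failure mode concrete, here is a standalone sketch (not the actual BlueFS code; MockAllocator and its fields are invented for illustration): reserve() only checks total free space, while allocate() needs contiguous chunks, so fragmentation can leave allocate() short even after a successful reserve().

    // Hypothetical stand-in for Ceph's Allocator: reserve() tracks raw free
    // space, while allocate() is limited by the largest contiguous chunk.
    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <vector>

    struct Extent { uint64_t offset, length; };

    struct MockAllocator {
      uint64_t free_bytes;      // total free space (all that reserve() checks)
      uint64_t max_contiguous;  // largest free chunk (what allocate() can find)

      int reserve(uint64_t want) {
        if (want > free_bytes) return -1;  // -ENOSPC in the real code
        free_bytes -= want;
        return 0;
      }
      void unreserve(uint64_t unused) { free_bytes += unused; }

      // May return less than `want` when the free space is fragmented.
      int64_t allocate(uint64_t want, std::vector<Extent>* out) {
        uint64_t got = std::min(want, max_contiguous);
        if (got > 0) out->push_back(Extent{0, got});
        return static_cast<int64_t>(got);
      }
    };

    int main() {
      // 1 MiB free in total, but fragmented into chunks of at most 4 KiB.
      MockAllocator db{1 << 20, 4096};
      std::vector<Extent> extents;
      const uint64_t left = 64 * 1024;

      int r = db.reserve(left);                        // ok: enough in total
      int64_t alloc_len = db.allocate(left, &extents); // short: fragmentation

      // Before this fix, BlueFS asserted here instead of unreserving,
      // releasing the partial extents, and falling back to the slow device.
      assert(r == 0 && alloc_len < static_cast<int64_t>(left));
      return 0;
    }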

@tchaikov (Contributor) commented:

src/os/bluestore/BlueFS.cc 100644 → 100755

@tangwenjun3 please do not add the "x" permission bit to the source file unless it's supposed to be executable.

@tangwenjun3 (Contributor, author) commented:

@tchaikov fixed.

      extents.reserve(4); // 4 should be (more than) enough for most allocations
      alloc_len = alloc[id]->allocate(left, min_alloc_size, hint, &extents);
    }
    if (r < 0 || (alloc_len < (int64_t)left)) {
@xiexingguo (Member) commented on the diff:

@tangwenjun3 If reserve() is okay but we fail to allocate(), we should call unreserve() first before we switch to try other devices (e.g., slow), right? Otherwise there is a potential space leak.

@tangwenjun3 force-pushed the wip-fix-bluefs-allocate branch 7 times, most recently from 4550cfd to c8899f2 on November 20, 2017.
@xiexingguo (Member) reviewed:

Good catch

    }
    if (r < 0 || (alloc_len < (int64_t)left)) {
      if (r == 0)
        alloc[id]->unreserve(left);
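        // (reserve() succeeded but allocate() came up short; the reservation
        //  is handed back here before falling through to the next device)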
@ifed01 (Contributor) commented Nov 20, 2017:
You should probably unreserve only the difference between left and alloc_len, and release the already allocated extents.
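
Concretely, the cleanup described here could look roughly like the sketch below (a sketch only; the exact release() signature has varied across Ceph versions):

    if (r < 0 || (alloc_len < (int64_t)left)) {
      if (r == 0) {
        // give back only the portion that reserve() still holds...
        alloc[id]->unreserve(left - alloc_len);
        // ...and return any partially allocated extents to the free pool
        for (auto& p : extents)
          alloc[id]->release(p.offset, p.length);
        extents.clear();
      }
      // then fall through and retry the allocation on the next (slower) device
    }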

@xiexingguo (Member) replied:
Ahh, right.

> release already allocated extents

But instead of doing this, I think we can just push the allocated extents into ev, as we technically support mixed usage of space from different devices:

    if (r == 0) {
      alloc[id]->unreserve(left - alloc_len);
      for (auto& p : extents) {
        bluefs_extent_t e = bluefs_extent_t(id, p.offset, p.length);
        if (!ev->empty() &&
            ev->back().bdev == e.bdev &&
            ev->back().end() == (uint64_t)e.offset) {
          ev->back().length += e.length;
        } else {
          ev->push_back(e);
        }
      }
    }

@xiexingguo (Member) commented:
Well, on second thought, "release already allocated extents" is perhaps the better choice, as supporting a single request that allocates space from multiple devices is a little too complicated...

@tangwenjun3 force-pushed the wip-fix-bluefs-allocate branch 5 times, most recently from 0c8f27d to 22188dd on November 21, 2017.
When BlueFS succeeds in reserving space but then fails to allocate it on the DB device, it hits an assert caused purely by space fragmentation. In this situation it cannot fall back to the slow device's space. This can happen with the stupid or avl allocator.

Signed-off-by: tangwenjun <tang.wenjun3@zte.com.cn>
@tangwenjun3 (Contributor, author) commented:

@ifed01 @xiexingguo After thinking it over again, I went with "release already allocated extents".

@xiexingguo (Member) commented:

lgtm

@tchaikov tchaikov merged commit f21ef22 into ceph:master Nov 23, 2017
@tchaikov (Contributor) commented:

@tangwenjun3 could you please help prepare the luminous backport? The conflict resolution is non-trivial.

@tangwenjun3 (Contributor, author) commented:

@tchaikov ok

tangwenjun3 added a commit to tangwenjun3/ceph that referenced this pull request Nov 23, 2017
backport from pr ceph#19030

Signed-off-by: tangwenjun <tang.wenjun3@zte.com.cn>
@Aran85 (Contributor) commented Apr 20, 2018:

@xiexingguo should we add a perf counter to BlueFS to track how often allocations fall back to the slow device because of fragmentation? If it happens frequently, I think we should do something else to avoid it.

@xiexingguo (Member) replied:

Yeah, makes sense to me.
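
For reference, such a counter could be wired up through Ceph's PerfCountersBuilder, which BlueFS already uses in its _init_logger(); the id range, counter name, and helper below are hypothetical, not part of this PR:

    // Hypothetical sketch against Ceph's perf-counter API
    // (common/perf_counters.h); the ids and names below are made up.
    #include "common/perf_counters.h"

    enum {
      l_bluefs_fallback_first = 973000,  // arbitrary unused id range
      l_bluefs_slow_fallbacks,           // allocs pushed to the slow device
      l_bluefs_fallback_last,
    };

    PerfCounters* init_fallback_counter(CephContext* cct) {
      PerfCountersBuilder b(cct, "bluefs-fallback",
                            l_bluefs_fallback_first, l_bluefs_fallback_last);
      b.add_u64_counter(l_bluefs_slow_fallbacks, "slow_fallbacks",
                        "allocations that fell back to the slow device "
                        "due to fragmentation");
      PerfCounters* logger = b.create_perf_counters();
      cct->get_perfcounters_collection()->add(logger);
      return logger;
    }

    // In _allocate(), at the point where the fast device comes up short:
    //   logger->inc(l_bluefs_slow_fallbacks);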
