os/bluestore: fix the allocate in bluefs #19030
Conversation
@tangwenjun3 please do not add the "x" permission bit to the source file, unless it's supposed to be executable.
@tchaikov fixed
```cpp
  extents.reserve(4); // 4 should be (more than) enough for most allocations
  alloc_len = alloc[id]->allocate(left, min_alloc_size, hint, &extents);
}
if (r < 0 || (alloc_len < (int64_t)left)) {
```
@tangwenjun3 If reserve() is okay but we fail to allocate(), we should call unreserve() first, before we switch to trying other devices (e.g., slow), right? Otherwise there is a potential space leak.
(force-pushed from 4550cfd to c8899f2)
Good catch
src/os/bluestore/BlueFS.cc (outdated)
```cpp
}
if (r < 0 || (alloc_len < (int64_t)left)) {
  if (r == 0)
    alloc[id]->unreserve(left);
```
You should probably unreserve the difference between left and alloc_len, and release the already allocated extents.
Ahh, right.
> release already allocated extents
But instead of doing this, I think we can just push the allocated extents into ev, as we technically support mixed usage of space from different devices:
```cpp
if (r == 0) {
  alloc[id]->unreserve(left - alloc_len);
  for (auto& p : extents) {
    bluefs_extent_t e = bluefs_extent_t(id, p.offset, p.length);
    if (!ev->empty() &&
        ev->back().bdev == e.bdev &&
        ev->back().end() == (uint64_t)e.offset) {
      ev->back().length += e.length;
    } else {
      ev->push_back(e);
    }
  }
}
```
Well, on a second thought, "release already allocated extents" perhaps is a better choice as supporting a single request of allocating space from multiple devices is a little bit too complicated...
(force-pushed from 0c8f27d to 22188dd)
when bluefs succeeds to reserve but fails to allocate in db space, it causes an assert, just because of space fragmentation. In this situation it cannot use the slow device space; this can happen with the stupid or avl allocator. Signed-off-by: tangwenjun <tang.wenjun3@zte.com.cn>
(force-pushed from 22188dd to d4f868a)
@ifed01 @xiexingguo
lgtm
@tangwenjun3 could you please help prepare the luminous backport? The conflict resolution is non-trivial.
@tchaikov ok
backport from pr ceph#19030 Signed-off-by: tangwenjun <tang.wenjun3@zte.com.cn>
@xiexingguo should we add a perf counter in bluefs to track how often we fall back to the slow drive because of fragmentation? If it happens frequently, I think we should do something else to avoid it.
Yeah, makes sense to me
when bluefs succeeds to reserve but fails to allocate in db space,
it causes an assert, just because of space fragmentation.
In this situation it cannot use the slow device space; this can
happen with the stupid or bitmap allocator.
Signed-off-by: tangwenjun <tang.wenjun3@zte.com.cn>