Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/bluestore/BlueFS: move release unused extents work in _flush_and_syn_log #17684

Merged
merged 1 commit into from Sep 28, 2017

Conversation

majianpeng
Copy link
Member

Now only in sync_metadata release unused extents. So if allocate extents
failed, we should try relase unused extents.

Signed-off-by: Jianpeng Ma jianpeng.ma@intel.com

alloc[i]->release(p.get_start(), p.get_len());
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to retry the reserve() call, then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actualy, no, this still isn't right.. we have to flush the log before we can release and potentially re-use (i.e. write to) previously referenced extents or else we may crash and have overwritten data we previously referenced.

If we want to improve this then we should be able to move that release dance in sync_metadata() into _sync_and_flush_log(). I.e., we need to repvent a case where an unlink causes us to release extents, reallocate, and overwrite old data before the unlink event has committed to the log..

Copy link
Member Author

@majianpeng majianpeng Sep 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. But i have a question: your commit : e0ec06e which introduce deferred release unused extents. I'm not understand. Why? Thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unlink A: release extents(a,b,c...)
A allocate: a,b.
But unlink A will update log_t before log_t update when allocate(a,b). If crash, those logs all missed. Which case cause log(allocate) exist but log(release) missed.

@liewegas
Copy link
Member

Earlier versions of the allocators would handle the delay of releases. Released extents would go on a pending_release lease, and after each commit there was a call into allocator telling it prevoiusly queued releases are now safe to actaully release. That commit moved that logic into the caller.

In your example, the problem is that after we alllocate teh blocks we will presumably (over)write them before queueing the log_t record.. which will overwrite content that was there before. We we crash and the unlink (and allocate) are lost, we still have a valid pointer to those blocsk from the old version of A and it's contents has been corrupted.

@majianpeng
Copy link
Member Author

@liewegas , Thanks your detail explain. By your suggestion: move release into _flush_and_sync_log. And call _flush(log_write, true) if reserve faild.

@liewegas
Copy link
Member

The first commit looks good.

The second commit is problematic, because _flush will drop the lock, and most of the _allocate() callers aren't prepared for that. I suspect what actually needs to happen is each of those callers independently needs to handle _allocate() failure instead of asserting, and be adjusted one-by-one.

Honestly, though, I'm not sure it is worth the effort. With the first patch it seems very unlikely that we'll be in a situation where we need to allocate and have enough space in the pending free list to continue.

@majianpeng majianpeng changed the title os/bluestore/BlueFS: if allocate failed, try release unused extents. os/bluestore/BlueFS: move release unused extents work in _flush_and_syn_log Sep 15, 2017
ASAP to release unused extents.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
@liewegas
Copy link
Member

This looks right to me but bluefs data safety is critical so I'd like a few more eyes on it! Thanks

Copy link
Member

@xiexingguo xiexingguo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm also

@liewegas liewegas merged commit bf8a109 into ceph:master Sep 28, 2017
runsisi pushed a commit to runsisi/ceph that referenced this pull request Mar 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants