-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crimson/os/seastore/rbm: make rbm support multiple shards #51770
Conversation
src/crimson/os/seastore/random_block_manager/block_rb_manager.h
Outdated
Show resolved
Hide resolved
src/crimson/os/seastore/random_block_manager/block_rb_manager.h
Outdated
Show resolved
Hide resolved
7bec68c
to
a8549f3
Compare
a8549f3
to
4303ba5
Compare
@cyx1231st @liu-chunmei Can you take a look? The goal of this PR is to align RBM with recent multi shard updates. |
src/crimson/os/seastore/random_block_manager/nvme_block_device.cc
Outdated
Show resolved
Hide resolved
src/crimson/os/seastore/random_block_manager/nvme_block_device.cc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change commit type error: multiple shards
ae44dea
to
9d367e5
Compare
src/crimson/os/seastore/random_block_manager/block_rb_manager.cc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except for one error need be fixed.
9d367e5
to
8e061ad
Compare
@liu-chunmei Done. |
@cyx1231st Please take a look when you are available. |
return RBM_SUPERBLOCK_SIZE; | ||
} | ||
|
||
rbm_abs_addr get_journal_start() { | ||
return shard_info.start_offset + get_journal_start_off(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why each shard's journal needs to reserve a RBM_SUPERBLOCK_SIZE
?
IIUC there is only one superblock for all the shards, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. There is a superblock in RBM. Please see the below.
I'm thinking of a backup super block like other filesystems. This might help to check inconsistency and recovery. So, this part is reserved for now. Also, this can be used as the extension of the existing primary's superblock if the superblock size exceeds the block size. If we want to save the amount of space (super block size x #shard number) by removing a super block in other shards, this is also fine with me for now. But, we need to change the disk layout entirely if those blocks are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering we still need to adjust the layout in the future.
We need to finally implement dynamic partition across shards, which means the data size for each shard may become dynamic in the future.
In that case, probably starting from | superblock(s) | circular journals | data |
and static partition the data for each shard can be more generic ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on that we need a further consideration for dynamic partition. I think there are two options as below. But, it seems that I need a study on the dynamic partition to find out pros and cons---there are already many studies, such as [1], that did a research on sharded partitions.
Option 1 (initial version, but working): | super 1 | journal 1 | data 1 | - | super 2 (reserved) | journal 2 | data 2 |
Option 2: | superblocks(s) | journal(s) | data |
Before doing that, how about merging this PR as is. I'll continue doing this work to find what the better solution is.
[1] Scaling a file system to many cores using an operation log. SOSP 2017
src/crimson/os/seastore/random_block_manager/nvme_block_device.cc
Outdated
Show resolved
Hide resolved
8e061ad
to
e0d9b60
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for comments
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
… multi shard change Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
e0d9b60
to
7a15d48
Compare
@cyx1231st Addressed. Besides, please note the the following work items I have planed to working on.
|
jenkins test api |
1 similar comment
jenkins test api |
Signed-off-by: Myoungwon Oh myoungwon.oh@samsung.com
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows