New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os/bluestore: remove unused parameters. #18635
Conversation
@xiexingguo . Can you review this PR? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, sounds reasonable to me
retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update PR description why o->flush() is removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the flushes are necessary!
src/os/bluestore/BlueStore.cc
Outdated
@@ -10733,7 +10733,6 @@ int BlueStore::_do_remove( | |||
bool is_gen = !o->oid.is_no_gen(); | |||
_do_truncate(txc, c, o, 0, is_gen ? &maybe_unshared_blobs : nullptr); | |||
if (o->onode.has_omap()) { | |||
o->flush(); | |||
_do_omap_clear(txc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_do_omap_clear has to iterator over the keys in order to delete them, so the flush() is needed here. We could move it into _do_omap_clear for better clarity, perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now it use rm_range_keys . so we only consider rm_range_keys
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sorry, I am unfamiliar with RocksDB)
@liewegas The root cause here is because there are still chances we are manipulating the omap of the same object in concurrent transaction groups, such as pglogs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for example, if following osd_op rm the key which previous op add,but the previous don't submit .so db don't know key exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, jianpeng.
if following osd_op rm the key which previous op add,but the previous don't submit .so db don't know key exist.
Then add_omap_keys and rm_omap_keys should be two separate ops, right? Generally PG should order write ops manipulating the same object, which means rm_omap_keys should not be handled by BlueStore until add_omap_keys is stable, but I guess there might be exceptions(e.g., pglogs?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly.. when i fixed this the test failure i saw was that the pg log was removed but left a single omap key behind, confusing things down the line.
the problem is that in order to implement the rm_range_keys we have to enumerate them which means we have to flush prior writes.
someday when rocksdb let's us enable the RangeDelete we can take these out.. but not until then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's fine to have an add followed by a delete.. just not a range delete that requires us to enumerate in order to name the keys to delete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liewegas Thanks for the excellent clarifications:-)
src/os/bluestore/BlueStore.cc
Outdated
@@ -11078,7 +11076,6 @@ int BlueStore::_omap_rmkey_range(TransContext *txc, | |||
{ | |||
const string& prefix = | |||
o->onode.is_pgmeta_omap() ? PREFIX_PGMETA_OMAP : PREFIX_OMAP; | |||
o->flush(); | |||
get_omap_key(o->onode.nid, first, &key_first); | |||
get_omap_key(o->onode.nid, last, &key_last); | |||
txc->t->rm_range_keys(prefix, key_first, key_last); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rm_range_keys impl enumerates keys, so we need the flush.
we could perhaps have a KeyValueDB method to tell us whether a proper range delete is implemented and if so skip the flush, but we have that off by default still (rocksdb recommends not using yet iirc?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
the second cleanup patch is good, though! |
@majianpeng Mind dropping the flush commit and repushing? |
7db291e
to
a140996
Compare
@xiexingguo . sorry for later response. updated by your suggestions. BTW, i has a quesiton: You said "Then add_omap_keys and rm_omap_keys should be two separate ops, right? Generally PG should order write ops manipulating the same object, which means rm_omap_keys should not be handled by BlueStore until add_omap_keys is stable, but I guess there might be exceptions(e.g., pglogs?)". How pg do this? by on_disk_lock or others? Thanks! |
@majianpeng I mean flush only works if we want to make sure that the omap changes made by a previous transaction(-group) are now visible in DB and hence the relevant keys can be safely enumerated. It is okay (and possible) to manipulate omap multiple times in a single transaction but it's then caller's responsibility to make sure there is no conflicits. |
@xiexingguo . FileStore don't care this (in a osop: set_omap_keys, rm_omap_keys). |
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
In some situations the IP address the Monitor wants to bind to might not be available yet. This might for example be a IPv6 Address which is still performing DAD or waiting for a Router Advertisement to be send by the Router(s). Have systemd wait for 10s before starting the Mon and increase the amount of times it does so to 5. This allows the system to bring up IP Addresses in the mean time while systemd waits with restarting the Mon. Fixes: ceph#18635 Signed-off-by: Wido den Hollander <wido@42on.com> (cherry picked from commit e73eb8c)
In some situations the IP address the Monitor wants to bind to might not be available yet. This might for example be a IPv6 Address which is still performing DAD or waiting for a Router Advertisement to be send by the Router(s). Have systemd wait for 10s before starting the Mon and increase the amount of times it does so to 5. This allows the system to bring up IP Addresses in the mean time while systemd waits with restarting the Mon. Fixes: ceph#18635 Signed-off-by: Wido den Hollander <wido@42on.com> (cherry picked from commit e73eb8c)
In some situations the IP address the Monitor wants to bind to might not be available yet. This might for example be a IPv6 Address which is still performing DAD or waiting for a Router Advertisement to be send by the Router(s). Have systemd wait for 10s before starting the Mon and increase the amount of times it does so to 5. This allows the system to bring up IP Addresses in the mean time while systemd waits with restarting the Mon. Fixes: ceph#18635 Signed-off-by: Wido den Hollander <wido@42on.com> (cherry picked from commit e73eb8c)
No description provided.