New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os/bluestore: dynamic CF configuration; put pglog omap in separate CF #18224
Conversation
I only scanned this briefly so please feel free to disregard if I'm smoking crack. This feels a bit off to me. I don't like leaking the RocksDB column family implementation details out into bluestore via the KeyValueDB interface unless we have a really good reason (Maybe we do!). I guess I'd favor keeping this out of bluestore and simply having configurables at the rocksdb KeyvalueDB level to put different prefixes in different column families (or even shard the DB or whatever). Personally I would want the KeyValueDB to look like as much of a black box as possible (that ultimately we still control, but bluestore knows nothing about other than the simplest API necessary for storing keys/values). OTOH, if there are good reasons for doing it this way, I'm cool with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed in perf meeting, Adam and Sage think the abstraction is right (ie bluestore knows what should be put into CFs). I don't like it, but I'm willing to approve. :)
@@ -10914,8 +10955,10 @@ int BlueStore::_omap_setheader(TransContext *txc, | |||
} else { | |||
txc->note_modified_object(o); | |||
} | |||
const string& prefix = | |||
o->onode.is_pgmeta_omap() ? PREFIX_PGMETA_OMAP : PREFIX_OMAP; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to set FLAG_PGMETA_OMAP first, see BlueStore::_omap_setkeys
_do_omap_clear(txc, | ||
newo->onode.is_pgmeta_omap() ? PREFIX_PGMETA_OMAP | ||
: PREFIX_OMAP, | ||
newo->onode.nid); | ||
} | ||
if (oldo->onode.has_omap()) { | ||
dout(20) << __func__ << " copying omap data" << dendl; | ||
if (!newo->onode.has_omap()) { | ||
newo->onode.set_omap_flag(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to set FLAG_PGMETA_OMAP too. Do we really get the chance to ever clone a pgmeta object?
@@ -952,7 +952,8 @@ struct bluestore_onode_t { | |||
uint8_t flags = 0; | |||
|
|||
enum { | |||
FLAG_OMAP = 1, | |||
FLAG_OMAP = 1, ///< object may have omap data | |||
FLAG_PGMETA_OMAP = 2, ///< omap data is in meta omap prefix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't see the necessity of introducing an extra FLAG_PGMETA_OMAP flag.
Wouldn't the "o->oid.is_pgmeta()" checking be sufficient for the same purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have to distinguish between existing objects (in the normal omap space) and new objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, right.
Signed-off-by: Sage Weil <sage@redhat.com>
…ksdb_cfs Signed-off-by: Sage Weil <sage@redhat.com>
36e7a6f
to
234700a
Compare
No description provided.