-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osd: add 'cache drop' and 'cache status' commands #24270
Conversation
@batrick there is still the cached buffers in Bluestore that can be cleared. I'll make sure to add those as well, but it can be a separate PR. |
src/os/ObjectStore.h
Outdated
@@ -1459,6 +1459,9 @@ class ObjectStore { | |||
virtual void generate_db_histogram(Formatter *f) { } | |||
virtual void flush_cache() { } | |||
virtual void dump_perf_counters(Formatter *f) {} | |||
virtual int get_cache_obj_count() { | |||
return -1; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not too happy about implementing it that way, but wasn't sure if making it a pure method would make things cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You know more about the nitty-gritty of the particular situation than I do, but I'd suggest a pure virtual member function in preference to something that always fails. Always failing /suggests/ that the behavior just isn't valid or available (or belongs in a parent class that may just not exist). That's just my $0.25, and again you know more about the specific situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah especially since it's returning an integer and the sole user is just blindly adding the value...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RIght, these are good points. This function is just going to disappear, I'm working on a different implementation that takes the Formatter as an argument and prints to it what is appropriate for each object store.
retest this please |
FWIW, I'd prefer the object store cache also be cleared in this PR. Note for filestore, it's necessary to write |
ce75fba
to
96f3870
Compare
Just as an update, I've added the implementation of clearing Filestore's buffer cache to this PR, and I'm implementing the same for Bluestore (clearing Bluestore's data cache). |
src/os/filestore/FileStore.cc
Outdated
int FileStore::flush_cache() | ||
{ | ||
string drop_caches_file = "/proc/sys/vm/drop_caches"; | ||
int drop_caches_fd = ::open(drop_caches_file.c_str(), O_WRONLY), ret = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add O_CLOEXEC
too please.
src/osd/OSD.cc
Outdated
else if (prefix == "clear_cache") { | ||
dout(20) << "clearing all caches" << dendl; | ||
// Clear the objectstore's cache - onode for Bluestore, pagecache for Filestore | ||
r = store->flush_cache(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend passing a ostream
here which flush_cache
can write errors too (like that it can't open /proc/sys/vm/drop_caches
). This can then be returned in the MCommandReply
message and displayed in the output of ceph tell osd.* clear_cache
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The derr
calls do print out the messages:
> ./bin/ceph tell osd.* drop_cache
Error EACCES:
osd.0: Error flushing objectstore cache: (13) Permission denied
Error EACCES:
osd.1: Error flushing objectstore cache: (13) Permission denied
Would going the ostream
way cover any case where the derr
might be missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're just seeing the messages printed to the terminal from a vstart cluster? (IOW, all of the daemons' stderr is connected to your terminal so you see all of these derr messages.) I don't think it's supposed to be automatic like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, and it makes sense since these derr
messages aren't passed around anywhere. I updated the PR.
src/osd/OSD.cc
Outdated
@@ -6287,6 +6293,50 @@ int OSD::_do_command( | |||
probe_smart(devid, ds); | |||
} | |||
|
|||
else if (prefix == "clear_cache") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use drop cache
to be consistent with the MDS #21566.
src/os/filestore/FileStore.cc
Outdated
string drop_caches_file = "/proc/sys/vm/drop_caches"; | ||
int drop_caches_fd = ::open(drop_caches_file.c_str(), O_WRONLY), ret = 0; | ||
char buf[2] = "3"; | ||
int len = strlen(buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size_t len
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise looks okay to me.
src/osd/OSD.cc
Outdated
// Clear the objectcontext cache (per PG) | ||
vector<PGRef> pgs; | ||
_get_pgs(&pgs); | ||
for (auto pg: pgs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto& pg
I'm just changing the |
@@ -12415,12 +12415,14 @@ void BlueStore::_flush_cache() | |||
// We use a best-effort policy instead, e.g., | |||
// we don't care if there are still some pinned onodes/data in the cache | |||
// after this command is completed. | |||
void BlueStore::flush_cache() | |||
int BlueStore::flush_cache(ostream *os) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the parameter "os" is just ignored. Consider just not taking it, or perhaps using std::optional<>. (Bare pointer parameters make me sad.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signature of the virtual method changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mogeb please add override
tot his method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There already is the override
keyword in the header files, this is the .cc file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh whoops. Sorry for the noise.
src/os/ObjectStore.h
Outdated
@@ -1459,6 +1459,9 @@ class ObjectStore { | |||
virtual void generate_db_histogram(Formatter *f) { } | |||
virtual void flush_cache() { } | |||
virtual void dump_perf_counters(Formatter *f) {} | |||
virtual int get_cache_obj_count() { | |||
return -1; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah especially since it's returning an integer and the sole user is just blindly adding the value...
src/osd/OSD.cc
Outdated
// Clear osd map cache | ||
{ | ||
Mutex::Locker l(service.map_cache_lock); | ||
service.map_cache.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of dropping the map caches? This is not a data cache; it's an OSD functionality cache and while probably anybody running a "drop cache" command is going to be okay with what happens, it could cause some problems on an overloaded cluster running at the end...(the OSD map cache is never emptied!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a good point. Especially that it doesn't impact IO directly. I'll remove this part.
src/os/ObjectStore.h
Outdated
@@ -1457,8 +1457,11 @@ class ObjectStore { | |||
|
|||
virtual void get_db_statistics(Formatter *f) { } | |||
virtual void generate_db_histogram(Formatter *f) { } | |||
virtual void flush_cache() { } | |||
virtual int flush_cache(ostream *os = NULL) { return 0; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should probably return -1 — eg MemStore isn't ever going to be able to flush the cache and may not implement it.
src/osd/OSD.cc
Outdated
f->open_object_section("caches_object_count"); | ||
f->dump_int("object_ctx", obj_ctx_count); | ||
f->dump_int("objectstore_onode", store_cache_count); | ||
f->dump_int("osd_map", osd_map_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gregsfortytwo should we keep printing the stats of the osdmap cache here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess as a counterpart to "drop cache" we probably shouldn't? And I think the number of cached osd maps is exposed as a perfcounter anyway
@@ -2334,6 +2334,24 @@ class BlueStore : public ObjectStore, | |||
int _fsck(bool deep, bool repair); | |||
|
|||
void set_cache_shards(unsigned num) override; | |||
void dump_cache_stats(Formatter *f) override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gregsfortytwo @chardan printing the cache stats is now offloaded to the object store. There is no more function get_cache_object_count
that can be called from the OSD to get the store's cache stats. I think this makes more sense, since the cache stats mean different things for different object stores.
Great work! Thanks @mogeb ! |
@mogeb i was trying out this PR and messed up when I asked you to change the command to "drop cache" to match the MDS. We're actually using "cache drop" in the MDS (to group commands by what they manipulate). Think it's reasonable to match? :) |
@batrick no problem at all. :) This one is also on me since I missed that, even though I did go through that PR. The MDS also has a |
@mogeb |
@batrick done! |
src/osd/OSD.cc
Outdated
@@ -6020,6 +6020,12 @@ COMMAND("compact", | |||
COMMAND("smart name=devid,type=CephString,req=False", | |||
"runs smartctl on this osd devices. ", | |||
"osd", "rw", "cli,rest") | |||
COMMAND("cache drop", | |||
"Drop all OSD caches", | |||
"osd", "rw", "cli,rest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually can we make this require rwx? We don't want normal clients to be able to force a cache drop command since it's got performance implications on all the others as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack and done.
Fixes: http://tracker.ceph.com/issues/24176 Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
…nent Rename 'drop cache' to 'cache drop' and 'get cache stats' to 'cache status'. Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Rebased. |
I still think this is fine as-is but the next step here is to also deliver a "cache drop" command via the OSDMap (as an epoch). We want to be able to tell RADOS clients to drop their caches too. |
Woah that is getting to a whole other level of thing @batrick. If you want that let's do it as another PR — that's going to be a bit contentious. |
What kind of testing do we want for this? Just stick it in one of the simple "make check" command-running tests, or actually try and verify that caches get dropped? |
@gregsfortytwo that's why I originally added the |
I wouldn't suggest putting it in this PR. I talked about this in the tracker ticket originally. We need a solution in Ceph to drop the multiple cache levels we have (client, osd, mds, kernel when possible). That's a next step. |
* refs/pull/24270/head: osd: make 'cache drop' command require 'executable' permission osd: rename 'drop cache' and 'get cache stats' to group them by component doc: add documentation for 'drop cache' and 'get cache stats' osd: don't print osdmap cache stats in 'get cache stats' command osd: do not clear osdmap cache on 'drop cache' command osd: offload dumping cache stats to the object store osd: pass a stream to flush_cache commands for more verbosity osd: implement flush_cache() method for Filestore osd: add clear_cache and get_cache_object_count commands Reviewed-by: Gregory Farnum <gfarnum@redhat.com> Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Fixes: http://tracker.ceph.com/issues/24176
Signed-off-by: Mohamad Gebai mgebai@suse.com