Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: add 'cache drop' and 'cache status' commands #24270

Merged
merged 9 commits into from
Oct 12, 2018

Conversation

mogeb
Copy link
Contributor

@mogeb mogeb commented Sep 25, 2018

Fixes: http://tracker.ceph.com/issues/24176

Signed-off-by: Mohamad Gebai mgebai@suse.com

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

@mogeb
Copy link
Contributor Author

mogeb commented Sep 25, 2018

@batrick there is still the cached buffers in Bluestore that can be cleared. I'll make sure to add those as well, but it can be a separate PR.

@@ -1459,6 +1459,9 @@ class ObjectStore {
virtual void generate_db_histogram(Formatter *f) { }
virtual void flush_cache() { }
virtual void dump_perf_counters(Formatter *f) {}
virtual int get_cache_obj_count() {
return -1;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too happy about implementing it that way, but wasn't sure if making it a pure method would make things cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know more about the nitty-gritty of the particular situation than I do, but I'd suggest a pure virtual member function in preference to something that always fails. Always failing /suggests/ that the behavior just isn't valid or available (or belongs in a parent class that may just not exist). That's just my $0.25, and again you know more about the specific situation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah especially since it's returning an integer and the sole user is just blindly adding the value...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RIght, these are good points. This function is just going to disappear, I'm working on a different implementation that takes the Formatter as an argument and prints to it what is appropriate for each object store.

@neha-ojha
Copy link
Member

retest this please

@batrick
Copy link
Member

batrick commented Sep 26, 2018

@batrick there is still the cached buffers in Bluestore that can be cleared. I'll make sure to add those as well, but it can be a separate PR.

FWIW, I'd prefer the object store cache also be cleared in this PR.

Note for filestore, it's necessary to write 3 to /proc/sys/vm/drop_caches. If the OSD cannot write to this file then it should return an appropriate error to the caller. Bluestore does not have this problem because the device is opened with O_DIRECT and there are no dentries/inodes to worry about in the Linux dircache.

@mogeb mogeb force-pushed the clear-cache branch 2 times, most recently from ce75fba to 96f3870 Compare September 27, 2018 17:49
@mogeb
Copy link
Contributor Author

mogeb commented Sep 28, 2018

Just as an update, I've added the implementation of clearing Filestore's buffer cache to this PR, and I'm implementing the same for Bluestore (clearing Bluestore's data cache).

int FileStore::flush_cache()
{
string drop_caches_file = "/proc/sys/vm/drop_caches";
int drop_caches_fd = ::open(drop_caches_file.c_str(), O_WRONLY), ret = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add O_CLOEXEC too please.

src/osd/OSD.cc Outdated
else if (prefix == "clear_cache") {
dout(20) << "clearing all caches" << dendl;
// Clear the objectstore's cache - onode for Bluestore, pagecache for Filestore
r = store->flush_cache();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend passing a ostream here which flush_cache can write errors too (like that it can't open /proc/sys/vm/drop_caches). This can then be returned in the MCommandReply message and displayed in the output of ceph tell osd.* clear_cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The derr calls do print out the messages:

> ./bin/ceph tell osd.* drop_cache
Error EACCES: 
osd.0: Error flushing objectstore cache: (13) Permission denied
Error EACCES: 
osd.1: Error flushing objectstore cache: (13) Permission denied

Would going the ostream way cover any case where the derr might be missed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're just seeing the messages printed to the terminal from a vstart cluster? (IOW, all of the daemons' stderr is connected to your terminal so you see all of these derr messages.) I don't think it's supposed to be automatic like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, and it makes sense since these derr messages aren't passed around anywhere. I updated the PR.

src/osd/OSD.cc Outdated
@@ -6287,6 +6293,50 @@ int OSD::_do_command(
probe_smart(devid, ds);
}

else if (prefix == "clear_cache") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use drop cache to be consistent with the MDS #21566.

string drop_caches_file = "/proc/sys/vm/drop_caches";
int drop_caches_fd = ::open(drop_caches_file.c_str(), O_WRONLY), ret = 0;
char buf[2] = "3";
int len = strlen(buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t len

@mogeb mogeb changed the title osd: add clear_cache and get_cache_object_count commands osd: add 'drop cache' and 'get_cache_object_count' commands Oct 1, 2018
Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise looks okay to me.

src/osd/OSD.cc Outdated
// Clear the objectcontext cache (per PG)
vector<PGRef> pgs;
_get_pgs(&pgs);
for (auto pg: pgs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto& pg

@mogeb
Copy link
Contributor Author

mogeb commented Oct 3, 2018

I'm just changing the get_cache_object_count command to get cache stats and making it a bit better.

@@ -12415,12 +12415,14 @@ void BlueStore::_flush_cache()
// We use a best-effort policy instead, e.g.,
// we don't care if there are still some pinned onodes/data in the cache
// after this command is completed.
void BlueStore::flush_cache()
int BlueStore::flush_cache(ostream *os)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the parameter "os" is just ignored. Consider just not taking it, or perhaps using std::optional<>. (Bare pointer parameters make me sad.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature of the virtual method changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mogeb please add override tot his method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There already is the override keyword in the header files, this is the .cc file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops. Sorry for the noise.

@@ -1459,6 +1459,9 @@ class ObjectStore {
virtual void generate_db_histogram(Formatter *f) { }
virtual void flush_cache() { }
virtual void dump_perf_counters(Formatter *f) {}
virtual int get_cache_obj_count() {
return -1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah especially since it's returning an integer and the sole user is just blindly adding the value...

src/osd/OSD.cc Outdated
// Clear osd map cache
{
Mutex::Locker l(service.map_cache_lock);
service.map_cache.clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of dropping the map caches? This is not a data cache; it's an OSD functionality cache and while probably anybody running a "drop cache" command is going to be okay with what happens, it could cause some problems on an overloaded cluster running at the end...(the OSD map cache is never emptied!).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a good point. Especially that it doesn't impact IO directly. I'll remove this part.

@@ -1457,8 +1457,11 @@ class ObjectStore {

virtual void get_db_statistics(Formatter *f) { }
virtual void generate_db_histogram(Formatter *f) { }
virtual void flush_cache() { }
virtual int flush_cache(ostream *os = NULL) { return 0; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should probably return -1 — eg MemStore isn't ever going to be able to flush the cache and may not implement it.

src/osd/OSD.cc Outdated
f->open_object_section("caches_object_count");
f->dump_int("object_ctx", obj_ctx_count);
f->dump_int("objectstore_onode", store_cache_count);
f->dump_int("osd_map", osd_map_count);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregsfortytwo should we keep printing the stats of the osdmap cache here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess as a counterpart to "drop cache" we probably shouldn't? And I think the number of cached osd maps is exposed as a perfcounter anyway

@@ -2334,6 +2334,24 @@ class BlueStore : public ObjectStore,
int _fsck(bool deep, bool repair);

void set_cache_shards(unsigned num) override;
void dump_cache_stats(Formatter *f) override {
Copy link
Contributor Author

@mogeb mogeb Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregsfortytwo @chardan printing the cache stats is now offloaded to the object store. There is no more function get_cache_object_count that can be called from the OSD to get the store's cache stats. I think this makes more sense, since the cache stats mean different things for different object stores.

@mogeb mogeb changed the title osd: add 'drop cache' and 'get_cache_object_count' commands osd: add 'drop cache' and 'get cache stats' commands Oct 3, 2018
@batrick
Copy link
Member

batrick commented Oct 4, 2018

Great work! Thanks @mogeb !

@batrick
Copy link
Member

batrick commented Oct 4, 2018

@mogeb i was trying out this PR and messed up when I asked you to change the command to "drop cache" to match the MDS. We're actually using "cache drop" in the MDS (to group commands by what they manipulate). Think it's reasonable to match? :)

@mogeb
Copy link
Contributor Author

mogeb commented Oct 5, 2018

@batrick no problem at all. :) This one is also on me since I missed that, even though I did go through that PR. The MDS also has a cache status command, should we match that too instead of get cache stats?

@batrick
Copy link
Member

batrick commented Oct 5, 2018

@mogeb cache status sounds great. Let's match that too :)

@mogeb mogeb changed the title osd: add 'drop cache' and 'get cache stats' commands osd: add 'cache drop' and 'cache status' commands Oct 8, 2018
@mogeb
Copy link
Contributor Author

mogeb commented Oct 8, 2018

@batrick done!

src/osd/OSD.cc Outdated
@@ -6020,6 +6020,12 @@ COMMAND("compact",
COMMAND("smart name=devid,type=CephString,req=False",
"runs smartctl on this osd devices. ",
"osd", "rw", "cli,rest")
COMMAND("cache drop",
"Drop all OSD caches",
"osd", "rw", "cli,rest")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm actually can we make this require rwx? We don't want normal clients to be able to force a cache drop command since it's got performance implications on all the others as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack and done.

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
…nent

Rename 'drop cache' to 'cache drop' and 'get cache stats' to 'cache
status'.

Signed-off-by: Mohamad Gebai <mgebai@suse.com>
Signed-off-by: Mohamad Gebai <mgebai@suse.com>
@mogeb
Copy link
Contributor Author

mogeb commented Oct 10, 2018

Rebased.

@batrick
Copy link
Member

batrick commented Oct 10, 2018

I still think this is fine as-is but the next step here is to also deliver a "cache drop" command via the OSDMap (as an epoch). We want to be able to tell RADOS clients to drop their caches too.

@gregsfortytwo
Copy link
Member

Woah that is getting to a whole other level of thing @batrick. If you want that let's do it as another PR — that's going to be a bit contentious.

@gregsfortytwo
Copy link
Member

What kind of testing do we want for this? Just stick it in one of the simple "make check" command-running tests, or actually try and verify that caches get dropped?

@mogeb
Copy link
Contributor Author

mogeb commented Oct 10, 2018

or actually try and verify that caches get dropped?

@gregsfortytwo that's why I originally added the cache status command. Looking at rados bench throughput also reflects the caches being dropped.

@batrick
Copy link
Member

batrick commented Oct 10, 2018

Woah that is getting to a whole other level of thing @batrick. If you want that let's do it as another PR — that's going to be a bit contentious.

I wouldn't suggest putting it in this PR. I talked about this in the tracker ticket originally. We need a solution in Ceph to drop the multiple cache levels we have (client, osd, mds, kernel when possible). That's a next step.

@liewegas liewegas merged commit e8f28c2 into ceph:master Oct 12, 2018
liewegas added a commit that referenced this pull request Oct 12, 2018
* refs/pull/24270/head:
	osd: make 'cache drop' command require 'executable' permission
	osd: rename 'drop cache' and 'get cache stats' to group them by component
	doc: add documentation for 'drop cache' and 'get cache stats'
	osd: don't print osdmap cache stats in 'get cache stats' command
	osd: do not clear osdmap cache on 'drop cache' command
	osd: offload dumping cache stats to the object store
	osd: pass a stream to flush_cache commands for more verbosity
	osd: implement flush_cache() method for Filestore
	osd: add clear_cache and get_cache_object_count commands

Reviewed-by: Gregory Farnum <gfarnum@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants