Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/filestore/HashIndex: be loud about splits #12421

Merged
merged 1 commit into from Mar 3, 2017

Conversation

cernceph
Copy link

@cernceph cernceph commented Dec 9, 2016

This ML thread "[ceph-users] filestore_split_multiple hardcoded maximum" has coincided with ~1 week of slow requests on one of our clusters. After much debugging we finally tracked it down to filestore splitting.

It would have been much easier to understand the root cause if filestore splitting was verbose. Splitting is a rare but rather important event -- the OSD should be super verbose when it happens. (Otherwise, it simply looks like a filestore write is taking many 10s of seconds, which seems absurd).

So I'm sending this patch for comments. I'm not sure if the dout is needed in the _create function, and I didn't make the merges verbose yet (which I guess is also needed?)

If this makes it to master, I'd appreciate a backport to jewel. (filestore will lose relevance anyway in L and beyond).

@smithfarm
Copy link
Contributor

smithfarm commented Dec 9, 2016

@cernceph Could you open a tracker issue for this? You can flag the issue for jewel backport while doing that, just fill in Backport: jewel.

If you open the tracker, please also add a line Fixes: http://tracker.ceph.com/issues/... right above the Signed-off-to line.

@liewegas liewegas added the core label Dec 12, 2016
@liewegas
Copy link
Member

I would use dout(1) instead of dout(0) (which will also go to the log by default). Otherwise, LGTM!

Copy link
Member

@liewegas liewegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls change dout(0) to dout(1)

Filestore splits are a rare yet important enough event that an
OSD should visibly report when they happen.

Without this reporting an operator could spend hours trying to
understand the cause of any split-induced slow requests.

Fixes: http://tracker.ceph.com/issues/18235
Signed-off-by: Dan van der Ster <daniel.vanderster@cern.ch>
@tchaikov
Copy link
Contributor

@liewegas @cernceph has addressed your comment, so it's ready for needs-qa, i guess?

@liewegas liewegas changed the title RFC: os/filestore/HashIndex: be loud about splits os/filestore/HashIndex: be loud about splits Feb 28, 2017
@liewegas liewegas merged commit 39e6ad1 into ceph:master Mar 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants