Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/bluestore: simplify Onode pin/unpin logic. #32852

Merged
merged 6 commits into from Jul 11, 2020

Conversation

ifed01
Copy link
Contributor

@ifed01 ifed01 commented Jan 24, 2020

This patch makes pin/unpin logic more straightforward and
hopefully safe in multi-threading.

Also it eliminates Onode's pin_list as one doesn't need it to track
pinned Onodes.

OnodeCacheShard reference has been removed from onode too.

cached/pinned flags have been introduced. They're accessed under cache lock
hence original race has been totally addressed.

Signed-off-by: Igor Fedotov ifedotov@suse.com

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard backend
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@ifed01 ifed01 force-pushed the wip-ifed-simplify-pin branch 3 times, most recently from 2186a13 to dd34e97 Compare January 24, 2020 23:15
@liewegas
Copy link
Member

This looks great! I think it's strictly better than the current master.

Reviewing this made me realize one thing though that I missed before: currently (and with this PR), if you have an onode somewhere in the middle of the LRU, and you take and release a reference, we forget its LRU position. I'm not sure offhand if we have any really important callers that behave that way, but certainly things like scrub come to mind.

I think what we want is something a bit more targetted at trimming: if trim encounters a pinned inode, only then does it change it to the pinned state. That way, when we put(), we can safely put it back at the tail of the LRU. And/or, we could even make _touch set a flag so that the unpin step puts it at the LRU top instead. What do you think? (Perhaps for later?)

@markhpc
Copy link
Member

markhpc commented Jan 24, 2020

@ifed01 I haven't looked closely at this yet, but after a quick skim, does this go back to (potentially) scanning the entire onode list again during trim? That was exactly the problem the original PR was trying to address (though I question if it's worth the complexity at this point).

@ifed01
Copy link
Contributor Author

ifed01 commented Jan 24, 2020

@ifed01 I haven't looked closely at this yet, but after a quick skim, does this go back to (potentially) scanning the entire onode list again during trim? That was exactly the problem the original PR was trying to address (though I question if it's worth the complexity at this point).

@markhpc - no, this PR 'physically' removes pinned onode from cache while logically (i.e. corresponding 'cached' flag in onode is set) it's still their. And doesn't put pinned onode into pin_list but marks it with pinned flag instead.
Hence trim doesn't see pinned onodes

Copy link
Member

@jdurgin jdurgin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks simpler to me. In terms of trimming, this keeps the unpinned onodes always at a maximum count, and lets pinned ones build up outside the cache. If there was a bug that ended up not unpinning things this could cause memory growth, but that seems unlikely.

In practice I think this will work well for the cache trimming + keeping memory use down, since the amount of pinned onodes should be small (related to in-flight I/O).

I like the validator function idea, it keeps the related code together nicely.

@jdurgin
Copy link
Member

jdurgin commented Jan 25, 2020

@liewegas I think we'd be better off addressing scrub or other background/single-use callers specifically by making them not cache (https://trello.com/c/3OZ417Si/516-bluestore-prevent-scrub-from-polluting-onode-cache)

It'd be good to run this through some perf tests anyway to see if there's any detectable difference (I'm not sure whether the lack of lru position will matter in practice)

@ifed01 ifed01 changed the title os/bluestore: simplify Onode pin/unpin logic. DNM: os/bluestore: simplify Onode pin/unpin logic. Jan 25, 2020
@ifed01
Copy link
Contributor Author

ifed01 commented Jan 25, 2020

Marking DNM/WIP as I can still see some occasional failures.

@ifed01 ifed01 changed the title DNM: os/bluestore: simplify Onode pin/unpin logic. DNM/WIP: os/bluestore: simplify Onode pin/unpin logic. Jan 25, 2020
@markhpc
Copy link
Member

markhpc commented Jan 25, 2020

@ifed01 I haven't looked closely at this yet, but after a quick skim, does this go back to (potentially) scanning the entire onode list again during trim? That was exactly the problem the original PR was trying to address (though I question if it's worth the complexity at this point).

@markhpc - no, this PR 'physically' removes pinned onode from cache while logically (i.e. corresponding 'cached' flag in onode is set) it's still their. And doesn't put pinned onode into pin_list but marks it with pinned flag instead.
Hence trim doesn't see pinned onodes

Oh, getting them out of the cache entirely is quite clever! I will try to look at this more closely next week but that sounds excellent.

@ifed01 ifed01 changed the title DNM/WIP: os/bluestore: simplify Onode pin/unpin logic. DNM: os/bluestore: simplify Onode pin/unpin logic. Jan 28, 2020
@ifed01 ifed01 force-pushed the wip-ifed-simplify-pin branch 4 times, most recently from 34e9f81 to be070c7 Compare January 28, 2020 15:43
@ifed01 ifed01 changed the title DNM: os/bluestore: simplify Onode pin/unpin logic. os/bluestore: simplify Onode pin/unpin logic. Jan 28, 2020
@ifed01
Copy link
Contributor Author

ifed01 commented Jan 28, 2020

This passed ceph_test_objectstore tests, so looks ready for final review.

@liewegas
Copy link
Member

liewegas commented Feb 2, 2020

2020-02-01T23:27:11.523 INFO:tasks.ceph.osd.1.smithi037.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/huge/release/15.1.0-157-g6c94e43/rpm/el8/BUILD/ceph-15.1.0-157-g6c94e43/src/os/bluestore/BlueStore.cc: In function 
'void BlueStore::_flush_cache()' thread 14800700 time 2020-02-01T23:27:11.520837+0000
2020-02-01T23:27:11.523 INFO:tasks.ceph.osd.1.smithi037.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/huge/release/15.1.0-157-g6c94e43/rpm/el8/BUILD/ceph-15.1.0-157-g6c94e43/src/os/bluestore/BlueStore.cc: 15293: FAILE
D ceph_assert(i->empty())
2020-02-01T23:27:11.576 INFO:tasks.ceph.osd.1.smithi037.stderr: ceph version 15.1.0-157-g6c94e43 (6c94e430dfa71cec909b03cd5b299abfbf1a3943) octopus (rc)
2020-02-01T23:27:11.576 INFO:tasks.ceph.osd.1.smithi037.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x614444]
2020-02-01T23:27:11.576 INFO:tasks.ceph.osd.1.smithi037.stderr: 2: (()+0x50c65e) [0x61465e]
2020-02-01T23:27:11.576 INFO:tasks.ceph.osd.1.smithi037.stderr: 3: (BlueStore::_flush_cache()+0xe1c) [0xbdb5ac]
2020-02-01T23:27:11.576 INFO:tasks.ceph.osd.1.smithi037.stderr: 4: (BlueStore::umount()+0x294) [0xbfaa54]
2020-02-01T23:27:11.576 INFO:tasks.ceph.osd.1.smithi037.stderr: 5: (OSD::shutdown()+0xe44) [0x7087e4]
2020-02-01T23:27:11.577 INFO:tasks.ceph.osd.1.smithi037.stderr: 6: (OSD::handle_signal(int)+0x14c) [0x70905c]
2020-02-01T23:27:11.577 INFO:tasks.ceph.osd.1.smithi037.stderr: 7: (SignalHandler::entry()+0x783) [0xcfa513]
2020-02-01T23:27:11.577 INFO:tasks.ceph.osd.1.smithi037.stderr: 8: (()+0x82de) [0xc23a2de]
2020-02-01T23:27:11.577 INFO:tasks.ceph.osd.1.smithi037.stderr: 9: (clone()+0x43) [0xd4d54b3]
2020-02-01T23:27:11.581 INFO:tasks.ceph.osd.1.smithi037.stderr:*** Caught signal (Aborted) **

/a/sage-2020-02-01_20:59:41-rados-wip-sage-testing-2020-02-01-1055-distro-basic-smithi/4725021

@liewegas
Copy link
Member

liewegas commented Feb 2, 2020

also from ObjectStore/StoreTest.AttrSynthetic/2
/a/sage-2020-02-01_20:59:41-rados-wip-sage-testing-2020-02-01-1055-distro-basic-smithi/4725032

@stale
Copy link

stale bot commented Jul 2, 2020

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Jul 2, 2020
This patch makes in/unpin logic more straightforward and
hopefully safe in multi-threading.

Also it eliminates Onode's pin_list as one doesn't need it to track
pinned Onodes.

OnodeCacheShard reference has been removed from onode too.

cached/pinned flags have been introduced. They're accessed under cache lock
hence original race has been totally addressed.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
Onode pinning does the same for us.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
This to be squashed with HEAD~2 later, left standalone to ease
reviewing for now.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
@stale stale bot removed the stale label Jul 6, 2020
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
@ifed01
Copy link
Contributor Author

ifed01 commented Jul 6, 2020

@tchaikov - would you give this another QA try, please?

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
Signed-off-by: Igor Fedotov <ifedotov@suse.com>
@ifed01
Copy link
Contributor Author

ifed01 commented Jul 9, 2020

jenkins test make check

@ifed01
Copy link
Contributor Author

ifed01 commented Jul 9, 2020

@tchaikov - honestly I don't understand what's broken in scrubbing if any. But I've made some changes so could you please give this PR another try?

@tchaikov
Copy link
Contributor

jenkins test make check

@ifed01
Copy link
Contributor Author

ifed01 commented Jul 10, 2020

jenkins test make check

@tchaikov
Copy link
Contributor

jenkins test make check

@tchaikov tchaikov merged commit ca77d2e into ceph:master Jul 11, 2020
@ifed01 ifed01 deleted the wip-ifed-simplify-pin branch July 11, 2020 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants