Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/bluestore: Fix race condition in Onode:put() #48566

Closed
wants to merge 1 commit into from

Conversation

taodd
Copy link
Contributor

@taodd taodd commented Oct 20, 2022

This race condition happens when an Onode is unpinned in one "put" thread and being trimmed right away(after the cache lock is released) by another thread

The race happens like this:

  1. Firstly nref = 2, pinned = true
  2. Thread A calls put() and --nref =1, entering into the if condtion. then Get the cache lock, and unpin the Onode, release the cache lock, then at the same time another thread B start to trim this Onode before A reaches if(nref==0 && pn==0) {}, at this point of time the nref = 1 and put_nref = 0
  3. Thread B calls put(), and then decrease the nref to 0 and then we get nref == 0 and put_nref = 0 at the same time for both thread A and thread B when they enter into the body of if(nref==0 && pn==0).
  4. The same Onode will be deleted twice in thread A or B (the order isn't defined)

Fixes: https://tracker.ceph.com/issues/57895

Signed-off-by: dongdong tao dongdong.tao@canonical.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

The race condition happens when an Onode is unpinned in one "put" thread and being trimmed
right away(after the cache lock is released) by another thread

Fixes: https://tracker.ceph.com/issues/57895

Signed-off-by: dongdong tao <dongdong.tao@canonical.com>
@taodd taodd requested a review from a team as a code owner October 20, 2022 09:34
@taodd
Copy link
Contributor Author

taodd commented Oct 20, 2022

Please help to review carefully, I really don't want to break anything, as this is the core code of bluestore.
About put_nref change, I think I didn't break the fix for #43770.
But I'm not so sure about this #44311. at least, I'm not able to poke a hole in it yet, I may miss something

@markhpc
Copy link
Member

markhpc commented Oct 20, 2022

@taodd Thanks a ton for digging into this. You are right, this is fairly core code and this isn't the first time we've hit issues here. Added a couple of the relevant folks as potential reviewers.

@markhpc
Copy link
Member

markhpc commented Oct 20, 2022

@ifedo01 mentioned he has a (bigger) PR that also may fix this issue, but I haven't looked it over yet: #47702

@taodd taodd mentioned this pull request Oct 24, 2022
14 tasks
ocs->lock.unlock();
}
auto pn = --put_nref;
if (nref == 0 && pn == 0) {
if (n == 0 && put_nref == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all - it's not mandatory that onode is cached and hence potentially two threads might own onode independently with no references from c->onode_map.
So we might have two threads owning onode with nref == 2. Finally both threads call put()....
Case 1:
Thread A makes nref ==1 and goes through if(n == 1) block but before it reaches --put_nref thread B falls through and deletes onode... At this point thread A operates on a released onode.

Case 2:
Thread A makes nref == 1 and reaches n = nref. At this point thread B makes nref == 0 and falls through to put() return - with no onode release due to put_nref != 0. Then thread A continues and bypasses delete due to n == 1 as well. Hence Onode is leaking...

Generally my idea behind put_ref increment/decrement is that it has to "wrap" other manipulations on onode within put(). So its increment has to be the first op in put() and decrement to be the last one before the delete. I haven't achieved that completely but looks like you're moving even further from the original idea...

So we don't have good enough fix for now :(

Copy link
Contributor Author

@taodd taodd Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your review.
I did really assume caching onode is mandatory -- an Onode had to be added to the cache before it could be referenced.
Could you please give me some examples that the Onode might be referenced without adding to cache ?( is it deep-scrub ? ) thanks a lot :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taodd - I don't have any real-life examples under my hand atm. But generally:
a) it's available with the current onode design.
b) it's a bad practice to have a dependency between Onode use case (e.g. whether we put it into the cache or not) and its life-cycle tracking(aka ref counting). That latter has to be completely use case agnostic. Even if we don't actively use this mode at the moment - one can start using it in the future...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taodd - please take another look at #47702
I reworked my patch and it looks better now ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

such as two onode cached in two threads, but the onode is trimed.

@yanqiang-ux
Copy link
Contributor

@taodd I have a question, why onode put do not use a onode lock to make judgement and delete a atomic operation ?

@taodd
Copy link
Contributor Author

taodd commented Dec 29, 2022

This PR can be closed actually, Please see #47702

@taodd taodd closed this Dec 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants