Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cephfs: potential adjust failure in lru_expire #19277

Merged
merged 1 commit into from Dec 18, 2017
Merged

Conversation

taodd
Copy link
Contributor

@taodd taodd commented Dec 1, 2017

Fix: the first 'adjust' is not needed,it will never take real effect.
the second 'adjust' may never get the chance to be executed
suppose we can reach the second 'adjust', it will crash because the bottom list is empty here.

Signed-off-by: dongdong tao tdd21151186@gmail.com

@ukernel ukernel added cephfs Ceph File System cleanup labels Dec 1, 2017
@ukernel
Copy link
Contributor

ukernel commented Dec 1, 2017

The first adjust() can move item from 'top' to 'bottom'. The second adjust() seems uesless.

@taodd
Copy link
Contributor Author

taodd commented Dec 1, 2017

in the first adjust, the toplen and topwant never changed,

@taodd
Copy link
Contributor Author

taodd commented Dec 1, 2017

so the item from toplist would not move to bottom list

@ukernel
Copy link
Contributor

ukernel commented Dec 1, 2017

I think you are right. but we should call adjust() at very beginning of lru_get_next_expire(). because lru_pin() and lru_unpin() do not call adjust()/

@taodd
Copy link
Contributor Author

taodd commented Dec 1, 2017

I see your point, will add it to the beginning..

// look through tail of bot
while (bottom.size()) {
LRUObject *p = bottom.back();
if (!p->lru_pinned) return p;

// move to pintail
pintail.push_front(&p->lru_link);
adjust();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this adjust() is necessary because the size of bottom is changing so it may be necessary to move objects from top to bottom.

@@ -174,7 +174,6 @@ class LRU {

// move to pintail
pintail.push_front(&p->lru_link);
adjust();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop probably isn't well tested because bottom will be empty if we are in this loop. I think you're right it may be possible for adjust() to crash if we remove an item from top here. Is that the problem you found originally? If that's the case, then we should fix adjust() to handle bottom.size() == 0, not remove this call here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int adjust():
uint64_t toplen = top.size();
uint64_t topwant = (midpoint * (double)(lru_get_size() - num_pinned));

both lru_get_size() and num_pinned do not change when moving item to pintail

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. Thanks Zheng.

Copy link
Contributor Author

@taodd taodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi batrick
please refer to zheng's comment for the first adjust .
about the second adjust
I know if we could hit the second adjust we need to handle bottom_lru.size==0 in adjust. but the thing is i think the second adjust will never be hitted.
the only condition would make that happen is the bottom lru does not contain any unpined item and the top lru contains at least one pined item. according to the adjust algoritm, we can do some math, that condition would never happen. let's suppose we got a coner case by setting midpoint to 1.0, which means size of top lru would be the number of unpined item, and, since the bottom does not contain any unpined item , so the unpined items would be all in top lru, which proves there is no room for pinned item in top lru. so even this case we can not get that condition.

Copy link
Contributor Author

@taodd taodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi batrick
please refer to zheng's comment for the first adjust .
about the second adjust
I know if we could hit the second adjust we need to handle bottom_lru.size==0 in adjust. but the thing is i think the second adjust will never be hitted.
the only condition would make that happen is the bottom lru does not contain any unpined item and the top lru contains at least one pined item. according to the adjust algoritm, we can do some math, that condition would never happen. let's suppose we got a coner case by setting midpoint to 1.0, which means size of top lru would be the number of unpined item, and, since the bottom does not contain any unpined item , so the unpined items would be all in top lru, which proves there is no room for pinned item in top lru. so even this case we can not get that condition.

@batrick
Copy link
Member

batrick commented Dec 7, 2017

You are right the adjust(); is unnecessary there. I think however we can actually remove all of them because we are not moving items between top and bottom or pinning objects (which would require adjustment of the midpoint). So I think the proper patch is to just remove all of the calls.

I agree with you otherwise. Thanks for the explanation.

@taodd
Copy link
Contributor Author

taodd commented Dec 7, 2017

@batrick do you mean i should remove the "adjust();" at the beginning of this function too ?

@batrick
Copy link
Member

batrick commented Dec 10, 2017

No, nevermind. We don't call adjust() after pinning an object; i.e. if lru_pin() is called before lru_get_next_expire().. Your patch is right as-is.

@taodd
Copy link
Contributor Author

taodd commented Dec 10, 2017

@batrick thank you !

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taodd please update the commit message to add:

Fixes: http://tracker.ceph.com/issues/22458

Fix: the first adjust is no needed,it will never take real effect.
     the second 'adjust' may never get the chance to be executed
     suppose we can reach the second 'adjust', it will crash because the bottom list is empty now.

Fixes: http://tracker.ceph.com/issues/22458

Signed-off-by: dongdong tao <tdd21151186@gmail.com>
@taodd
Copy link
Contributor Author

taodd commented Dec 16, 2017

@batrick updated

@batrick batrick merged commit 590c39e into ceph:master Dec 18, 2017
batrick added a commit that referenced this pull request Dec 18, 2017
* refs/pull/19277/head:
	cephfs: potential adjust failure in lru_expire

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Zheng Yan <zyan@redhat.com>
@taodd taodd deleted the wip-lru branch December 26, 2017 04:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cephfs Ceph File System cleanup
Projects
None yet
3 participants