Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: RGWMetaSyncCR holds refs to stacks instead of crs #10301

Merged
merged 1 commit into from Jul 16, 2016

Conversation

cbodley
Copy link
Contributor

@cbodley cbodley commented Jul 14, 2016

because RGWCoroutine::wakeup() calls RGWCoroutinesStack::wakeup(), the
stack must also stay alive. change the reference from
RGWMetaSyncShardControlCR to RGWCoroutinesStack and wake up the stacks
directly

Fixes: http://tracker.ceph.com/issues/16666

Signed-off-by: Casey Bodley cbodley@redhat.com

@yehudasa
Copy link
Member

@cbodley looks fine, should get it through teuthology + test/rgw

@cbodley
Copy link
Contributor Author

cbodley commented Jul 15, 2016

@mattbenjamin
Copy link
Contributor

So this is the version that takes a different ref, not 2 refs. Was this the most conservative option?

@cbodley
Copy link
Contributor Author

cbodley commented Jul 15, 2016

So this is the version that takes a different ref, not 2 refs. Was this the most conservative option?

That was the plan we discussed yesterday, yeah. But I was able to goose the sync process to test this case directly yesterday, and verified that we don't need to keep the RGWMetaSyncShardControlCR around in order to call wakeup() on its RGWCoroutineStack.

I did this by modifying RGWMetaSyncShardCR to return an error on startup for all shards > 0, so that shard 0 would keep the parent RGWMetaSyncCR open after the other shards were drained and freed. I then created buckets on another gateway to trigger the mdlog_notify calls, and the ones that mapped onto the finished shards were successful - wakeup() was called on the stack without segfaulting.

@yehudasa does that make sense to you? if not, i'm happy to keep the existing reference on RGWMetaSyncShardControlCR to be safe

@mattbenjamin
Copy link
Contributor

If holding the extra ref is harmless, it seems potentially sensible to do it at least temporarily, unless there seems like greater risk of not returning all refs. I'm not arguing for magic incantations, however :(

@yehudasa
Copy link
Member

@cbodley @mattbenjamin yeah, I agree with Matt on this one. Just to be on the safe side, let's hold the extra ref for now, get it backported to Jewel like that and open a ticket to remind us to change it. Once we're happy with stability on master we can remove the second ref and see how things are going.

because RGWCoroutine::wakeup() calls RGWCoroutinesStack::wakeup(), the
stack must also stay alive

Fixes: http://tracker.ceph.com/issues/16666

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Copy link
Contributor Author

cbodley commented Jul 15, 2016

updated to hold both refs. passing test_multi.py

@yehudasa yehudasa merged commit 3e57754 into ceph:master Jul 16, 2016
@cbodley cbodley deleted the wip-rgw-meta-stack-wakeup branch July 18, 2016 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants