tests: Thrasher: eliminate a race between kill_osd and __init__ #13237

Merged
merged 1 commit into from Feb 5, 2017

2 participants
@smithfarm
Contributor

smithfarm commented Feb 2, 2017

If Thrasher.__init__() spawns the do_thrash thread before initializing the
ceph_objectstore_tool property, do_thrash races with the rest
of Thrasher.__init__() and in some cases do_thrash can call kill_osd() before
Thrasher.__init__() progresses much further. This can lead to an exception
("AttributeError: Thrasher instance has no attribute 'ceph_objectstore_tool'")
being thrown in kill_osd().

This commit eliminates the race by making sure the ceph_objectstore_tool
attribute is initialized before the do_thrash thread is spawned.

Fixes: http://tracker.ceph.com/issues/18799
Signed-off-by: Nathan Cutler <ncutler@suse.com>
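
The ordering problem described above can be sketched in isolation. This is a minimal illustration, not the actual qa task code: the `Thrasher` class here is a toy stand-in, the `init_before_spawn` flag, the `worker_done` event, and the `run` helper are hypothetical names added for the demo, and the standard-library `threading` module is an assumption that may differ from the concurrency mechanism the real tests use. The racy branch deliberately waits for the worker so that the worst-case interleaving happens every time.

```python
import threading


class Thrasher:
    """Toy stand-in for the test thrasher; not the real Ceph class."""

    def __init__(self, init_before_spawn):
        self.error = None
        self.worker_done = threading.Event()
        if init_before_spawn:
            # Fixed order: every attribute the worker reads exists first.
            self.ceph_objectstore_tool = False
            self.thread = self._spawn()
        else:
            # Racy order: the worker may run before the attribute exists.
            self.thread = self._spawn()
            # Force the worst-case interleaving so the demo is deterministic.
            self.worker_done.wait()
            self.ceph_objectstore_tool = False

    def _spawn(self):
        thread = threading.Thread(target=self.do_thrash)
        thread.start()
        return thread

    def kill_osd(self):
        # Reads an attribute that __init__ is responsible for setting.
        return self.ceph_objectstore_tool

    def do_thrash(self):
        try:
            self.kill_osd()
        except AttributeError as exc:
            # Record the failure instead of letting the thread die silently.
            self.error = exc
        finally:
            self.worker_done.set()


def run(init_before_spawn):
    """Build a Thrasher, wait for its worker, and return any recorded error."""
    thrasher = Thrasher(init_before_spawn)
    thrasher.thread.join()
    return thrasher.error
```

With these assumptions, `run(False)` returns the `AttributeError` raised inside `kill_osd()`, while `run(True)` returns `None` because the attribute is set before the worker starts.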

@smithfarm smithfarm requested a review from liewegas Feb 2, 2017

@liewegas liewegas requested a review from zmc Feb 2, 2017

@liewegas liewegas removed the request for review from zmc Feb 2, 2017

@liewegas

Member

liewegas commented Feb 2, 2017

awesome, thanks!

@liewegas liewegas added the needs-qa label Feb 2, 2017

@smithfarm

Contributor

smithfarm commented Feb 3, 2017

Pushed wip-18799 to ceph-ci and will run some thrash tests on it.

@smithfarm

Contributor

smithfarm commented Feb 3, 2017

This job is the closest one I could find in master to the one that exhibited the failure. Running it 10 times:

./virtualenv/bin/teuthology-suite -k distro --priority 101 --suite rados/thrash --email ncutler@suse.com --ceph wip-18799 --machine-type smithi --filter="{0-size-min-size-overrides/2-size-1-min-size.yaml 1-pg-log-overrides/normal_pg_log.yaml clusters/{fixed-2.yaml openstack.yaml} fs/btrfs.yaml hobj-sort.yaml msgr-failures/fastclose.yaml msgr/async.yaml objectstore/bluestore.yaml rados.yaml rocksdb.yaml thrashers/default.yaml workloads/cache-agent-big.yaml}" --num 10

pass http://pulpito.ceph.com:80/smithfarm-2017-02-03_18:32:49-rados:thrash-wip-18799-distro-basic-smithi/

@liewegas liewegas merged commit 5fc3dd3 into ceph:master Feb 5, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodifed Submodules: submodules for project are unmodified
default: Build finished.

@smithfarm smithfarm deleted the smithfarm:wip-18799 branch Feb 5, 2017
