os/bluestore: fix fsck deferred_replay #15295

Merged
liewegas merged 9 commits into ceph:master from liewegas:wip-bluestore-fsck on May 31, 2017

2 participants
@liewegas
Member

liewegas commented May 25, 2017

_deferred_replay needs the kv_sync_thread to complete IOs; start them
just for that, but then shut them down again. (We might revisit that
later if/when fsck does any sort of repair.)

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas commented May 26, 2017

2017-05-26T09:20:59.890 INFO:teuthology.orchestra.run.smithi184.stderr:/build/ceph-12.0.2-1672-gfb8de47/src/common/Thread.cc: In function 'int Thread::join(void**)' thread 7f7bbfb8cc80 time 2017-05-26 09:20:59.889745
2017-05-26T09:21:00.104 INFO:teuthology.orchestra.run.smithi184.stderr:/build/ceph-12.0.2-1672-gfb8de47/src/common/Thread.cc: 159: FAILED assert("join on thread that was never started" == 0)
2017-05-26T09:21:00.106 INFO:teuthology.orchestra.run.smithi184.stderr: ceph version 12.0.2-1672-gfb8de47 (fb8de4772870beb90f0abb75406ea2abb706dcd5)
2017-05-26T09:21:00.106 INFO:teuthology.orchestra.run.smithi184.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x5639a990e7f2]
2017-05-26T09:21:00.106 INFO:teuthology.orchestra.run.smithi184.stderr: 2: (Thread::join(void**)+0xc9) [0x5639a9a03c49]
2017-05-26T09:21:00.106 INFO:teuthology.orchestra.run.smithi184.stderr: 3: (BlueStore::_kv_stop()+0x8e) [0x5639a97e03ee]
2017-05-26T09:21:00.106 INFO:teuthology.orchestra.run.smithi184.stderr: 4: (BlueStore::fsck(bool)+0x514) [0x5639a97d6984]
2017-05-26T09:21:00.106 INFO:teuthology.orchestra.run.smithi184.stderr: 5: (BlueStore::mkfs()+0x10f0) [0x5639a97d51e0]
2017-05-26T09:21:00.107 INFO:teuthology.orchestra.run.smithi184.stderr: 6: (OSD::mkfs(CephContext*, ObjectStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, uuid_d, int)+0x166) [0x5639a9333116]
2017-05-26T09:21:00.107 INFO:teuthology.orchestra.run.smithi184.stderr: 7: (main()+0xf88) [0x5639a9282b58]
2017-05-26T09:21:00.107 INFO:teuthology.orchestra.run.smithi184.stderr: 8: (__libc_start_main()+0xf0) [0x7f7bbd001830]
2017-05-26T09:21:00.107 INFO:teuthology.orchestra.run.smithi184.stderr: 9: (_start()+0x29) [0x5639a930b1a9]

@liewegas liewegas requested a review from markhpc May 28, 2017

liewegas added some commits May 26, 2017

os/bluestore: fix fsck deferred_replay
_deferred_replay needs the kv_sync_thread to complete IOs; start them
just for that, but then shut them down again.  (We might revisit that
later if/when fsck does any sort of repair.)

Signed-off-by: Sage Weil <sage@redhat.com>

os/bluestore: wait for kv thread to start before stopping it
Otherwise we can assert out when we try to join a thread that
hasn't started.

- move everything into _kv_start() and _kv_stop()
- separate stop bools for each thread
- wait until thread starts before signalling stop (and potentially calling
join()).

Signed-off-by: Sage Weil <sage@redhat.com>
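
The start/stop handshake described in this commit can be sketched as follows. This is a minimal illustration using `std::thread` and a condition variable, not Ceph's `Thread` class or the actual `_kv_start()`/`_kv_stop()` code; the `Worker` class and all of its member names are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for one background thread (for example a kv sync
// thread). Each worker owns its own stop flag and its own started flag.
class Worker {
public:
  // Precondition: the worker is not currently running.
  void start() {
    {
      std::lock_guard<std::mutex> l(lock_);
      started_ = false;
      stop_ = false;
    }
    thread_ = std::thread(&Worker::run, this);
  }

  void shutdown() {
    {
      std::unique_lock<std::mutex> l(lock_);
      // Wait until the thread has really entered its loop before
      // signalling stop, so that stop (and join) never race with startup.
      cond_.wait(l, [this] { return started_; });
      stop_ = true;
      cond_.notify_all();
    }
    thread_.join();
    ++completed_runs_;
  }

  bool joinable() const { return thread_.joinable(); }
  int completed_runs() const { return completed_runs_; }

private:
  void run() {
    std::unique_lock<std::mutex> l(lock_);
    started_ = true;          // announce that the thread is up
    cond_.notify_all();
    cond_.wait(l, [this] { return stop_; });  // idle until asked to stop
  }

  std::mutex lock_;
  std::condition_variable cond_;
  bool started_ = false;
  bool stop_ = false;
  int completed_runs_ = 0;
  std::thread thread_;
};
```

Calling `start()` followed immediately by `shutdown()` is safe in this sketch because `shutdown()` blocks until the worker has set `started_`. The same object can then be started and stopped again, which mirrors the fsck case of starting the threads only for a short-lived step.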
buffer: make wasted() const
Remove useless assert (we'll segv on the next line anyway).

Signed-off-by: Sage Weil <sage@redhat.com>

os/bluestore: rebuild Buffer buffers with too much waste
Avoid pinning extra memory by rebuilding Buffer buffers when we waste too
much.

Signed-off-by: Sage Weil <sage@redhat.com>
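
The idea of rebuilding a buffer once it wastes too much memory can be illustrated with a plain `std::vector<char>`. This is only a sketch under assumptions: it does not use Ceph's `bufferlist` or `bufferptr`, and the free functions `wasted()` and `maybe_rebuild()` as well as the `max_waste` threshold are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: bytes that are allocated but not used by the payload.
inline std::size_t wasted(const std::vector<char>& v) {
  return v.capacity() - v.size();
}

// Copy the payload into a freshly sized allocation when the current one
// wastes more than max_waste bytes. Returns true if a rebuild happened.
inline bool maybe_rebuild(std::vector<char>& v, std::size_t max_waste) {
  if (wasted(v) <= max_waste) {
    return false;  // waste is acceptable, keep the existing allocation
  }
  // Range construction allocates for the payload only on common standard
  // library implementations (the standard does not strictly promise an
  // exact fit).
  std::vector<char> tight(v.begin(), v.end());
  v.swap(tight);   // the oversized allocation is freed when tight dies
  return true;
}
```

A cache entry holding a few bytes inside a large allocation would be rebuilt by the first call and left alone by later calls, so the copy cost is paid only once per oversized buffer.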
ceph-bluestore-tool: init deep = false
Signed-off-by: Sage Weil <sage@redhat.com>

include/cpp-btree/btree_set: add btree_set
Signed-off-by: Sage Weil <sage@redhat.com>

os/bluestore: fsck: use btree_set to replace set<uint64_t>
Signed-off-by: Sage Weil <sage@redhat.com>

os/bluestore: deep decode onode value
In particular, we want the attrs (map<string,bufferptr>) to be a deep
decode so that we do not pin this buffer, and so that any changed attr
will free the previous memory.

Signed-off-by: Sage Weil <sage@redhat.com>
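
The difference between a shallow and a deep decode can be sketched with a reference-counted raw buffer. The code below is an illustration only: it does not use Ceph's `bufferptr` or decode functions, and `Raw`, `ShallowAttr`, `decode_shallow()` and `decode_deep()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical reference-counted raw buffer, e.g. a value read from the kv store.
using Raw = std::shared_ptr<const std::string>;

// A shallow-decoded attribute keeps a reference to the whole raw buffer,
// so the buffer stays alive ("pinned") as long as the attribute does.
struct ShallowAttr {
  Raw raw;
  std::size_t off;
  std::size_t len;
  std::string str() const { return raw->substr(off, len); }
};

inline ShallowAttr decode_shallow(const Raw& raw, std::size_t off, std::size_t len) {
  return ShallowAttr{raw, off, len};
}

// A deep decode copies just the bytes it needs; the raw buffer can be
// freed as soon as its other owners drop it.
inline std::string decode_deep(const Raw& raw, std::size_t off, std::size_t len) {
  return raw->substr(off, len);
}
```

In this sketch a four-byte attribute decoded shallowly from a 1 MiB value keeps the full megabyte allocated, while the deep copy holds only its own four bytes. Replacing a deep-decoded attribute also frees its previous memory immediately, which is the behaviour the commit message asks for.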
os/bluestore: bluestore_debug_fsck_abort
Abort fsck early to get a massif result.

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas commented May 31, 2017

passes tests, needs review

@ifed01

ifed01 approved these changes May 31, 2017

@liewegas liewegas merged commit e4f156f into ceph:master May 31, 2017

3 checks passed

- Signed-off-by: all commits in this PR are signed
- Unmodified Submodules: submodules for project are unmodified
- default: Build finished.

@liewegas liewegas deleted the liewegas:wip-bluestore-fsck branch May 31, 2017
