New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wip 18533 tool #13423

Merged
merged 17 commits into from Mar 27, 2017

Conversation

Projects
None yet
7 participants
@dzafman
Member

dzafman commented Feb 14, 2017

Improve DBObjectMap testing
Improvements for ceph_osdomap_tool
Add ceph_test_object_map to make check
Simplify rm_keys() by eliminating the use of complete mappings and just copy on clone

@dzafman

This comment has been minimized.

Member

dzafman commented Feb 14, 2017

@athanatos If this is merged before your branch, then Jenkins testing for your DBOjbectMap fixes will be meaningful.

@dzafman dzafman requested a review from athanatos Feb 15, 2017

@dzafman

This comment has been minimized.

Member

dzafman commented Feb 17, 2017

retest this please.

@dzafman dzafman changed the title from Wip 18533 tool to [DNM] Wip 18533 tool Feb 22, 2017

@dzafman

This comment has been minimized.

Member

dzafman commented Feb 22, 2017

@athanatos @jdurgin I will just copy all keys on clone and add a rename. I will keep the commit from Sam in case we ever wanted to revert the clone/rename change.

@athanatos

This comment has been minimized.

Contributor

athanatos commented Feb 23, 2017

@dzafman You have to deal with existing objects which are children of parents even if you don't create new ones, so you still need to fix the bug in rmkeys.

@dzafman

This comment has been minimized.

Member

dzafman commented Feb 23, 2017

@athanatos Conceivably, at the start of rm_keys I could replace the full set of keys, remove the complete mapping and then perform the removal.

@dzafman

This comment has been minimized.

Member

dzafman commented Mar 1, 2017

@jdurgin I tried to make the whole DBObjectMap code easier to work with. We are moving to BlueStore so I'm less concern about extra clone overhead for filestore. I'm leaving @athanatos commits that fix rm_keys() in case we wanted to revert d1b021d . Another idea would be to take a hybrid approach and keep clone fast, but require the extra overhead in rm_keys() by copying and breaking parent connection.

@@ -161,6 +161,7 @@ int main(int argc, char **argv) {
r = omap.check(std::cout);

This comment has been minimized.

@dmick

dmick Mar 2, 2017

Member

commit msg has typo "ceoh"

@tchaikov tchaikov self-requested a review Mar 6, 2017

@dzafman

This comment has been minimized.

Member

dzafman commented Mar 6, 2017

Testing passed:

dzafman-2017-03-03_16:32:03-rados-wip-18533-tool-distro-basic-smithi
879980 BUG #15912 Bluestore ENOSPC - already fixed
880000 BUG #18929 - already fixed
880012 BUG #19099 valgrind OSD slow start - already fixed

@dzafman

This comment has been minimized.

Member

dzafman commented Mar 6, 2017

I renamed old clone call to legacy_clone() so my commit that we could reverse is c92c9f2 if we want the original clone performance.

@dzafman dzafman changed the title from [DNM] Wip 18533 tool to Wip 18533 tool Mar 6, 2017

@dzafman dzafman removed the request for review from athanatos Mar 6, 2017

desc.add_options()
("help", "produce help message")
("omap-path", po::value<string>(&store_path),
"path to mon directory, mandatory (current/omap usually)")
("paranoid", "use paranoid checking")
("oid", po::value<string>(&oid), "Restrict to this object id")

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

add ' when dumping objects' since it only applies to dump-objects and dump-objects-with-keys

@@ -276,7 +276,7 @@ class DBObjectMap : public ObjectMap {
uint64_t parent;
uint64_t num_children;
coll_t c;
coll_t c; // Not used

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

could remove and use a dummy coll_t in encode/decode

t->rmkeys_by_prefix(USER_PREFIX + header_key(header.seq) + COMPLETE_PREFIX);
db->submit_transaction(t);
out << "Cleared complete mapping to repair" << std::endl;
} else

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

need braces for the else clause - easy to mistake it for part of the loop otherwise

} else if (cmd == "check") {
r = omap.check(std::cout);
} else if (cmd == "check" || cmd == "repair") {
bool repair = false;

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

repair = (cmd == "repair")

@@ -180,6 +180,12 @@ int main(int argc, char **argv) {
}
for (auto i : headers)
std::cout << i << std::endl;
} else if (cmd == "corrupt") {

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

where is this used? do you have a test script using it that isn't in this branch?

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

it would be helpful to have a test that uses this, and checks that the repair op works on the corrupted store

This comment has been minimized.

@dzafman

dzafman Mar 9, 2017

Member

I haven't added testing of the "corrupt" command of ceph-osdomap-tool. It doesn't mean anything after my last commit because we no longer use complete mappings.

KeyValueDB::Transaction t = db->get_transaction();
t->rmkeys_by_prefix(USER_PREFIX + header_key(header.seq) + COMPLETE_PREFIX);
db->submit_transaction(t);
out << "Cleared complete mapping to repair" << std::endl;

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

the commit message should have an explanation of why clearing the complete mapping is the right way to repair this:

  • why do we not need to remove the reference to the parent?
  • why will queries be correct after this?
  • what are our assumptions about the state of the keyvaluedb here?

This comment has been minimized.

@dzafman

dzafman Mar 9, 2017

Member

@jdurgin I added this to the commit comment for the repair:

Under some situations the previous rm_keys() code would
generated a corrupt complete table. There is no way to
figure out what the table should look like now. By removing
the entries we fix the corruption and aren't much worse off
because the corruption caused some deleted keys to re-appear.

This doesn't breaking the parent/child relationship during
repair because some of the keys may still be contained
in the parent.

if (destination) {
if (check_spos(to, destination, spos))
return 0;
return -EEXIST;

This comment has been minimized.

@jdurgin

jdurgin Mar 8, 2017

Member

returning -EEXIST in this case differs from clone - can the caller in filestore handle that? is that case possible?

This comment has been minimized.

@dzafman

dzafman Mar 9, 2017

Member

I put the same code here as in clone() when the target already exists. I also added to a test case for this code path.

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Mar 20, 2017

@dzafman ack, the PR passed my rados qa run also.

@dzafman

This comment has been minimized.

Member

dzafman commented Mar 24, 2017

@tchaikov Please review on Monday. If you have questions maybe we can discuss Monday night (my morning).

@dzafman

This comment has been minimized.

Member

dzafman commented Mar 24, 2017

@dmick Please indicate if the LRC is running stable with the 11.2.0 version of this change.

@dmick

This comment has been minimized.

Member

dmick commented Mar 25, 2017

No problems so far

dzafman and others added some commits Feb 7, 2017

tools: Add --oid option to ceph-osdomap-tool
Signed-off-by: David Zafman <dzafman@redhat.com>
osd: Remove unnecessary assert and assignment in DBObjectMap
Fix and add comment(s)

Signed-off-by: David Zafman <dzafman@redhat.com>
tools: Add dump-headers command to ceph-osdomap-tool
Signed-off-by: David Zafman <dzafman@redhat.com>
tools: Check for overlaps in internal "complete" table for DBObjectMap
Changed check to return an error count and fix tool error message

Signed-off-by: David Zafman <dzafman@redhat.com>
test_object_map: Use ASSERT_EQ() for check() so failure doesn't stop …
…testing

Signed-off-by: David Zafman <dzafman@redhat.com>
osd: DBOjectMap::check: Dump complete mapping when inconsistency found
Signed-off-by: David Zafman <dzafman@redhat.com>
ceph-osdomap-tool: Add --debug and only show internal logging if enabled
Signed-off-by: David Zafman <dzafman@redhat.com>
test_object_map: add tests to trigger some bugs related to 18533
Signed-off-by: Samuel Just <sjust@redhat.com>
DBObjectMap: fix next_parent()
The previous implementation assumed that
lower_bound(parent_iter->key()) always leaves the iterator
on_parent().  There isn't any guarantee, however, that that
key isn't present on the child as well.

Signed-off-by: Samuel Just <sjust@redhat.com>
DBObjectMap: strengthen in_complete_region post condition
Previously, in_complete_region didn't guarantee anything about
where it left complete_iter pointing.  It will be handy for
complete_iter to be pointing at the lowest interval which ends
after to_test.  Make it so.

Signed-off-by: Samuel Just <sjust@redhat.com>
DBObjectMap: rewrite rm_keys and merge_new_complete
Leverage the updated in_complete_region and needs_parent to simplify
these methods.

Signed-off-by: Samuel Just <sjust@redhat.com>
test: Add ceph_test_object_map to make check tests
Signed-off-by: David Zafman <dzafman@redhat.com>
ceph-osdomap-tool: Fix tool exit status
Signed-off-by: David Zafman <dzafman@redhat.com>
osd: Add automatic repair for DBObjectMap bug
Add repair command to ceph-osdomap-tool too

Under some situations the previous rm_keys() code would
generated a corrupt complete table.  There is no way to
figure out what the table should look like now.  By removing
the entries we fix the corruption and aren't much worse off
because the corruption caused some deleted keys to re-appear.

This doesn't breaking the parent/child relationship during
repair because some of the keys may still be contained
in the parent.

Signed-off-by: David Zafman <dzafman@redhat.com>
ceph-osdomap-tool: Fix seg fault with large amount of check error output
Signed-off-by: David Zafman <dzafman@redhat.com>
osd: Simplify DBObjectMap by no longer creating complete tables
Bump the version for new maps to 3
Make clone less efficient but simpler
Add rename operation (use instead of clone/unlink)
For now keep code that understands version 2 maps

Signed-off-by: David Zafman <dzafman@redhat.com>
filestore, tools: Fix logging of DBObjectMap check() repairs
Signed-off-by: David Zafman <dzafman@redhat.com>
@dzafman

This comment has been minimized.

Member

dzafman commented Mar 27, 2017

@jdurgin Rebased due to new conflicts. This has had only a review by you. It seems to be ok running on the LRC. Should we merge it if it passes Jenkins run?

@jdurgin jdurgin added the core label Mar 27, 2017

@dzafman dzafman merged commit 4c8ec8a into ceph:master Mar 27, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

@dzafman dzafman deleted the dzafman:wip-18533-tool branch Mar 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment