Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: check new crush for unknown name/type #4726

Merged
merged 8 commits into from Jun 12, 2015

Conversation

tchaikov
Copy link
Contributor

  • check to see if all OSDs in current osdmap are covered by provided
    crush map, reject it if any OSD is absent.

Fixes: #11680
Signed-off-by: Kefu Chai kchai@redhat.com

@tchaikov
Copy link
Contributor Author

per @jecluis this pull request only addresses a small portion of the problem, we should do more check on the new crush map. IIUC, which means we should try to map the concepts of clusters and pools to the ones in crush map, to see if it is capable to help us to route the request to the right place.

@jecluis
Copy link
Member

jecluis commented May 20, 2015

@tchaikov I think we should try to keep this simple and get concerned only about empty maps. I think those will be the main source of potential user errors, as I can see it being more likely for a user to mistakenly inject an empty map than a modified crush map.

Besides, I think we can't assume erroneous behavior when the user purposefully (even if my mistake) removes osds or any buckets from the crushmap and injects it to the cluster. But I'm sure we can assume it was not the user's intention to inject an empty map.

My take on this is to check if the new crushmap is empty, and if so deny that to the user. We can add a '--force-empty-crushmap' and allow it if the user specifies said option, but never by default.

Does someone else has an opposing view? Maybe @liewegas or @athanatos ?

@tchaikov
Copy link
Contributor Author

I think we should try to keep this simple and get concerned only about empty maps. I think those will be the main source of potential user errors, as I can see it being more likely for a user to mistakenly inject an empty map than a modified crush map.
I think we can't assume erroneous behavior when the user purposefully (even if my mistake) removes osds or any buckets from the crushmap and injects it to the cluster.

agreed. but i think monitor should never crash even because of user's mistake. so i added this vulnerable check at the expense of making things more complicated, in hope that monitor can reject a bad crush map, which could bring down this daemon when handling "ceph osd tree".

@jecluis
Copy link
Member

jecluis commented May 21, 2015

Letting the monitor die/crash when the harm is already done is, imo, the only way out of a really really bad situation. I'd rather the monitor dies on 'ceph osd tree' because we have a badly formed crushmap than to cope with it and cause who knows how much damage in the log run.

But the problem here is not the monitor crashing on 'ceph osd tree' when the crush is badly formed; the problem is allowing said map in the first place. But in order to prevent a bad crushmap from being injected, we must know what a bad crushmap looks like, and I for one am not entirely sure what such a map looks like besides "not being empty" -- at least of that one I'm pretty sure.

Yeah, we should make sure the monitor does not die on a whim, just because we expected something on the crushmap and got a null pointer. Your patch does address a portion of that by checking if the crushmap has the osds the osdmap expects. I think we can even make that a hard requirement: if you want to remove osds from your crushmap, you first have to remove them from the osdmap -- makes sense to me and I don't see a scenario in which we would want to allow otherwise.

But I still think this only solves part of the problem.

@tchaikov
Copy link
Contributor Author

Letting the monitor die/crash when the harm is already done is, imo, the only way out of a really really bad situation. I'd rather the monitor dies on 'ceph osd tree' because we have a badly formed crushmap than to cope with it and cause who knows how much damage in the log run.

yes (reluctantly). if we have no better way to prevent our user from doing so.

But the problem here is not the monitor crashing on 'ceph osd tree' when the crush is badly formed; the problem is allowing said map in the first place. But in order to prevent a bad crushmap from being injected, we must know what a bad crushmap looks like, and I for one am not entirely sure what such a map looks like besides "not being empty" -- at least of that one I'm pretty sure.

i start to understand you. actually we do some dry run with the crush rules using CrushTester. but the empty crush map short circuits the tests. yes, i will add a check for empty crush map.

Yeah, we should make sure the monitor does not die on a whim, just because we expected something on the crushmap and got a null pointer. Your patch does address a portion of that by checking if the crushmap has the osds the osdmap expects. I think we can even make that a hard requirement: if you want to remove osds from your crushmap, you first have to remove them from the osdmap -- makes sense to me and I don't see a scenario in which we would want to allow otherwise.

so we can keep my check against osdmap, right ? will see what i can do to make it a hard requirement.

@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from fa5d7de to d49d712 Compare May 25, 2015 12:49
@tchaikov tchaikov changed the title mon: validate new crush against osdmap mon: check new crush to for unknown name/type May 25, 2015
@tchaikov
Copy link
Contributor Author

just because we expected something on the crushmap and got a null pointer.

@jecluis i realized that i should check for dangling name references in crush map. if an OSD does not exists in the crush map, there is chance that it is just a stray OSD, and is not added to the crush map yet. so this pull request is updated:

  1. check for dangling references to name/type from items and OSD.0[0] in crush map
  2. put the check code into CrushTester
  3. expose the check function from crushtool
  4. and add the check in OSDMonitor.

[0] this basically checks for an empty crush map.

@tchaikov tchaikov changed the title mon: check new crush to for unknown name/type [DNM] mon: check new crush to for unknown name/type May 25, 2015
@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from d49d712 to d0f0d3c Compare May 25, 2015 14:13
@tchaikov tchaikov changed the title [DNM] mon: check new crush to for unknown name/type mon: check new crush to for unknown name/type May 25, 2015
@ghost ghost self-assigned this May 25, 2015
@ghost
Copy link

ghost commented May 25, 2015

I think the ceph-erasure-code-corpus change is not intended.

run_osd $dir 0 || return 1

local empty_map=$dir/empty_map
:> $empty_map.txt
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cute ;-)

@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from d0f0d3c to 5ac55b6 Compare May 25, 2015 16:58
local empty_map=$dir/empty_map
:> $empty_map.txt
./crushtool -c $empty_map.txt -o $empty_map.map || return 1
./ceph osd setcrushmap -i $empty_map.map
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should remove this line.

@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from 5ac55b6 to c29f00f Compare May 25, 2015 17:00
@tchaikov
Copy link
Contributor Author

@dachary thanks for your review. i updated this pull request to address your comments. but joao's assertion still holds, this change only solves part of the problem. per the latest log from Harish, there is chance that we can fail to find an OSD in odsmap, but the OSD does exist in the crush map.

@ghost
Copy link

ghost commented May 25, 2015

After a rados run (needs-qa or equivalent).

Reviewed-by: Loic Dachary <ldachary@redhat.com>

It would be great to have a summary of the discussions in the commit message for the record.

@ghost
Copy link

ghost commented May 25, 2015

@tchaikov the bot failure is because of the rebase which jenkins git can't handle.

@tchaikov
Copy link
Contributor Author

thanks @dachary ! will revise the commit message to help future readers.

@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from c29f00f to c971422 Compare May 26, 2015 10:27
@tchaikov
Copy link
Contributor Author

@dachary @jecluis , i added more commits on top of this pull request:

  1. also fail the test if the item id is greater or equal to the given max_id. the max_id comes from osdmap.get_max_osd().
  2. rename "--check-names" to "--check" and pass it to crushtool, in CrushTester::test_with_crushtool().

it is possible that CrushTester::test() crashes when running against a malformed crush map? if yes, the call "crushtool --test" from CrushTester::test_with_crushtool() does not help much, IMO. because CrushTester::test() always returns 0.

@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch 2 times, most recently from c406416 to 4fc2dce Compare May 26, 2015 13:37
@ghost
Copy link

ghost commented May 26, 2015

After a rados run (needs-qa or equivalent).

Reviewed-by: Loic Dachary <ldachary@redhat.com>

this series is a nice adventure to read. It could be argued that it needs to be squashed but I like it this way because it explains how the --check option came to be.

@tchaikov
Copy link
Contributor Author

@ktdreyer thanks for your review =) the latest commits should address your comments.

@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from 4fc2dce to 9955b04 Compare May 27, 2015 02:02
@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from 9955b04 to 49e1cdd Compare May 27, 2015 02:06
@tchaikov tchaikov changed the title mon: check new crush to for unknown name/type mon: check new crush for unknown name/type May 27, 2015
* check for dangling bucket name or type names referenced by the
  buckets/items in the crush map.
* also check for the references from Item(0, 0, 0) which does not
  necessarily exist in the crush map under testing. the rationale
  behind this is: the "ceph osd tree" will also print stray OSDs
  whose id is greater or equal to 0. so it would be useful to
  check if the crush map offers the type name indexed by "0"
  (the name of OSDs is always "OSD.{id}", so we don't need to
  look up the name of an OSD item in the crushmap).

Signed-off-by: Kefu Chai <kchai@redhat.com>
* so one is able to verify that the "ceph osd tree" won't chock on the
  new crush map because of dangling name/type references

Signed-off-by: Kefu Chai <kchai@redhat.com>
* the "osd tree dump" command enumerates all buckets/osds found in either the
  crush map or the osd map. but the newly set crushmap is not validated for
  the dangling references, so we need to check to see if any item in new crush
  map is referencing unknown type/name when a new crush map is sent to
  monitor, reject it if any.

Fixes: ceph#11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
add an argument "max_id" for "--check-names" to check if any item
has an id greater or equal to given "max_id" in crush map.

Signed-off-by: Kefu Chai <kchai@redhat.com>
Fixes: ceph#11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
* because "--check" also checks for the max_id

Signed-off-by: Kefu Chai <kchai@redhat.com>
so we don't need to call CrushTester::check_name_maps() in OSDMonitor.cc
anymore.

Fixes: ceph#11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
@tchaikov tchaikov force-pushed the wip-11680-check-empty-crushmap branch from 49e1cdd to c6e6348 Compare May 31, 2015 17:25
jecluis added a commit that referenced this pull request Jun 1, 2015
mon: check new crush for unknown name/type
@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 2, 2015

@jecluis @dachary tested at http://pulpito.ceph.redhat.com/kchai-2015-06-01_00:05:54-rados-wip-kefu-testing---basic-magna, all of the tests but 1 passed.
and the failed test passed at http://pulpito.ceph.redhat.com/kchai-2015-06-02_05:49:24-rados-wip-kefu-testing---basic-magna

$ teuthology-suite --verbose --priority 101 --suite rados --subset 0/20 --suite-branch master --machine-type magna  --distro ubuntu --email tchaikov@gmail.com --ceph wip-kefu-testing
$ teuthology-suite --filter="rados/thrash/{0-size-min-size-overrides/3-size-2-min-size.yaml 1-pg-log-overrides/short_pg_log.yaml clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/fastclose.yaml thrashers/mapgap.yaml workloads/rgw_snaps.yaml}" --priority 101 --suite rados --subset 0/18 --suite-branch master --machine-type magna --distro ubuntu --email tchaikov@gmail.com --ceph wip-kefu-testing

@ghost
Copy link

ghost commented Jun 2, 2015

there is one job still running but it's rados/basic/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/many.yaml tasks/repair_test.yaml} and I don't see how it could be related to this series.

Good to merge unless @jecluis wants to wait for his own run of testing to complete.

Reviewed-by: Loic Dachary <ldachary@redhat.com>

@tchaikov
Copy link
Contributor Author

tchaikov commented Jun 2, 2015

thanks @dachary , will wait for @jecluis 's ack.

@ghost
Copy link

ghost commented Jun 5, 2015

@tchaikov you can merge whenever you feel like it ;-)

tchaikov added a commit that referenced this pull request Jun 12, 2015
mon: check new crush for unknown name/type

Reviewed-by: Loic Dachary <ldachary@redhat.com>
@tchaikov tchaikov merged commit 95ba967 into ceph:master Jun 12, 2015
@tchaikov tchaikov deleted the wip-11680-check-empty-crushmap branch June 12, 2015 01:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants