mon: mon crashes when "ceph osd tree 85 --format json" #4936

Merged
merged 8 commits into hammer from wip-11975-hammer Jul 10, 2015

4 participants
@tchaikov
Contributor

tchaikov commented Jun 12, 2015

@tchaikov tchaikov added this to the hammer milestone Jun 12, 2015

@ghost ghost assigned theanalyst Jun 12, 2015

@ghost ghost added bug fix core labels Jun 12, 2015

@ktdreyer

Member

ktdreyer commented Jun 29, 2015

I think the make-check bot failure above is spurious. Can you please re-push so it will trigger a new build attempt?

ktdreyer referenced this pull request Jul 2, 2015

mon: remove unused variable
* as a side effect, this change silences
  http://tracker.ceph.com/issues/11576

Fixes: #11576
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e7b196a)

tchaikov added some commits May 25, 2015

mon: validate new crush for unknown names
* the "osd tree dump" command enumerates all buckets/osds found in either
  the crush map or the osd map, but a newly set crush map was not validated
  for dangling references. so when a new crush map is sent to the monitor,
  check whether any item in it references an unknown type/name, and reject
  the map if so.

Fixes: #11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a955f36)

crush/CrushTester: add check_name_maps() method
* check for dangling bucket names or type names referenced by the
  buckets/items in the crush map.
* also check the references from Item(0, 0, 0), which does not
  necessarily exist in the crush map under test. the rationale
  behind this is: "ceph osd tree" also prints stray OSDs
  whose id is greater than or equal to 0, so it is useful to
  check that the crush map offers the type name indexed by "0"
  (the name of an OSD is always "osd.{id}", so we don't need to
  look up the name of an OSD item in the crush map).

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit b75384d)
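The check described in this commit message can be illustrated with a minimal Python sketch. It is not the `CrushTester::check_name_maps()` code from the commit: the function signature, the plain-dict map layout, and the problem strings are all hypothetical stand-ins chosen for illustration. It walks the buckets reachable from the roots, reports any bucket whose name or type name cannot be resolved, and always checks type `0` because stray OSDs are printed with that type.

```python
def check_name_maps(type_names, bucket_names, bucket_types, children, roots):
    """Return a list of dangling-reference problems (empty means the map is OK).

    All arguments are hypothetical plain-dict stand-ins, not Ceph's real types:
      type_names   -- type id -> type name, e.g. {0: "osd", 1: "host"}
      bucket_names -- bucket id (negative) -> bucket name
      bucket_types -- bucket id -> type id
      children     -- bucket id -> list of child item ids
      roots        -- ids of the top-level buckets
    """
    problems = []

    # Stray OSDs are printed with type 0 even when the crush map holds no
    # devices, so type 0 must always resolve to a name.
    if 0 not in type_names:
        problems.append("unknown type name: type#0")

    seen = set()
    stack = list(roots)
    while stack:
        item = stack.pop()
        if item in seen:
            continue
        seen.add(item)
        if item >= 0:
            # Devices: the name is derived from the id, and type 0 was
            # already checked above.
            continue
        if item not in bucket_names:
            problems.append(f"unknown item name: item#{item}")
        if bucket_types.get(item) not in type_names:
            problems.append(f"unknown type name: item#{item}")
        stack.extend(children.get(item, []))
    return problems


type_names = {0: "osd", 1: "host", 10: "root"}
bucket_names = {-1: "default", -2: "host-a"}
bucket_types = {-1: 10, -2: 1}
children = {-1: [-2], -2: [0, 1]}

ok = check_name_maps(type_names, bucket_names, bucket_types, children, [-1])
# Same map, but the name of bucket -2 is missing.
bad = check_name_maps(type_names, {-1: "default"}, bucket_types, children, [-1])
print(ok)
print(bad)
```

A map that yields a non-empty problem list is the kind of map the monitor change in this series is meant to reject.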
crushtool: add the "--check-names" option
* so one is able to verify that "ceph osd tree" won't choke on the
  new crush map because of dangling name/type references

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d6b46d4)

crush/CrushTester: check if any item id is too large
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e640d89)

crushtool: enable check against max_id
add an argument "max_id" for "--check-names" to check if any item
has an id greater than or equal to the given "max_id" in the crush map.

Note: edited since we do not have the fix introduced in 46103b2 in
      hammer.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d0658dd)
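The max_id rule above amounts to a simple bound check. The following Python sketch is only an illustration under assumptions: the function name and the flat list of item ids are hypothetical, and it does not reproduce the CrushTester code. It relies on the fact that bucket ids in a crush map are negative, so only device ids can reach the limit.

```python
def items_at_or_above_max_id(item_ids, max_id):
    """Return the sorted item ids that are >= max_id.

    Bucket ids are negative in a crush map, so only device ids can trip
    this check. Both parameters are hypothetical stand-ins for illustration.
    """
    return sorted(i for i in item_ids if i >= max_id)


# Two buckets (-1, -2) and three devices; the limit plays the role of max_id.
item_ids = [-1, -2, 0, 1, 7]
offenders = items_at_or_above_max_id(item_ids, 4)
print(offenders)
```

An empty result means every item id is below the limit; any id in the result would make the check fail.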
mon: check the new crush map against osdmap.max_osd
Fixes: #11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 22e6bd6)

crushtool: rename "--check-names" to "--check"
* because "--check" also checks for the max_id

Note: edited since we do not have the fix introduced in 46103b2 in
      hammer.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9381d53)

mon: add "--check" to CrushTester::test_with_crushtool()
so we don't need to call CrushTester::check_name_maps() in OSDMonitor.cc
anymore.

Fixes: #11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit c6e6348)
@tchaikov

Contributor

tchaikov commented Jul 10, 2015

@ktdreyer @theanalyst I removed the commit included in #5122 from this PR and re-pushed.

@tchaikov

Contributor

tchaikov commented Jul 10, 2015

Since this PR was tested per http://tracker.ceph.com/issues/11990 before the commit from #5122 was removed, it's good to merge along with #5122.

tchaikov added a commit that referenced this pull request Jul 10, 2015

Merge pull request #4936 from ceph/wip-11975-hammer
mon crashes when "ceph osd tree 85 --format json"

Reviewed-by: Kefu Chai <kchai@redhat.com>

@tchaikov tchaikov merged commit 7f1fb57 into hammer Jul 10, 2015

@ghost

ghost commented Jul 10, 2015

It looks like the bot failure is an actual problem. See also http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-tarball-trusty-i386-basic/log.cgi?log=552772025cb8d5f51ffb3a069d1bd93bc73f1123. I seem to remember a pull request that fixed racy code in ceph-helpers or something similar. I'll dig into this.

tchaikov added a commit to tchaikov/ceph that referenced this pull request Jul 11, 2015

tests: TEST_crush_reject_empty must not run a mon
* Back in Hammer, the osd-crush.sh individual tests did not run the
  monitor; that was taken care of by the run() function. An attempt to run
  another mon fails with:

  error: IO lock testdir/osd-crush/a/store.db/LOCK: Resource temporarily
  unavailable

  This problem was introduced by cc1cc03
  from ceph#4936
* replace test/mon/mon-test-helpers.sh with test/ceph-helpers.sh as
  we need run_osd() in this newly added test

http://tracker.ceph.com/issues/11975 Refs: ceph#11975

Signed-off-by: Loic Dachary <ldachary@redhat.com>

tchaikov added a commit to tchaikov/ceph that referenced this pull request Jul 11, 2015

tests: TEST_crush_reject_empty must not run a mon
* Back in Hammer, the osd-crush.sh individual tests did not run the
  monitor; that was taken care of by the run() function. An attempt to run
  another mon fails with:

  error: IO lock testdir/osd-crush/a/store.db/LOCK: Resource temporarily
  unavailable

  This problem was introduced by cc1cc03
  from ceph#4936
* replace test/mon/mon-test-helpers.sh with test/ceph-helpers.sh as
  we need run_osd() in this newly added test
* update the run-dir of commands: ceph-helpers.sh uses a different
  convention for the run-dir of daemons.

http://tracker.ceph.com/issues/11975 Refs: ceph#11975

Signed-off-by: Loic Dachary <ldachary@redhat.com>
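The "Resource temporarily unavailable" error quoted above is what a non-blocking lock attempt reports (EAGAIN) when another holder already owns the lock file. The Python sketch below reproduces that failure mode in isolation. It rests on assumptions: it uses `flock` on a scratch file named `LOCK`, which may differ from the locking call the monitor's store actually uses, and the helper name `try_lock` is hypothetical. It needs a POSIX system.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open path and take a non-blocking exclusive lock; return the fd.

    Raises BlockingIOError (EAGAIN) if the lock is already held through
    another open file description.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


with tempfile.TemporaryDirectory() as store_dir:
    lock_path = os.path.join(store_dir, "LOCK")
    first = try_lock(lock_path)      # stands in for the mon started by run()
    try:
        try_lock(lock_path)          # stands in for a second mon on the same store
        second_error = None
    except BlockingIOError as exc:
        second_error = exc
    os.close(first)                  # closing the fd releases the lock
    third = try_lock(lock_path)      # succeeds once the first holder is gone
    os.close(third)

print(type(second_error).__name__)
```

This matches the fix described in the commit message: the individual test must reuse the monitor that is already running instead of starting a second one on the same store directory.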

@ghost ghost changed the title from mon crashes when "ceph osd tree 85 --format json" to mon: mon crashes when "ceph osd tree 85 --format json" Aug 4, 2015

@tchaikov tchaikov deleted the wip-11975-hammer branch Aug 11, 2015
