
cephfs: Permit recovering metadata into a new RADOS pool #10636

Merged

jcsp merged 3 commits into ceph:master from fullerdj:wip-djf-15069 on Apr 10, 2017

Conversation

@fullerdj (Contributor) commented Aug 9, 2016

Add a procedure that permits reconstructing metadata in a potentially
damaged cephfs metadata pool and writing the results into a
freshly-initialized pool that refers to the same data pool. Add option
flags to override checks that would ordinarily prevent this and add
options to the recovery tools to write output to a separate pool instead of
the one selected for recovery. See docs/cephfs/disaster-recovery.rst for
details.

cf. https://github.com/fullerdj/ceph-qa-suite/tree/wip-djf-15069

Fixes: http://tracker.ceph.com/issues/15068
Fixes: http://tracker.ceph.com/issues/15069
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
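As an illustration of the alternate-pool options the PR adds to the recovery tools (the --alternate-pool flag and placeholders follow the PR's documentation; treat the exact invocation as a sketch, not a verified transcript)::

cephfs-data-scan scan_extents --alternate-pool recovery --filesystem <original fs name> <original data pool name>
cephfs-data-scan scan_inodes --alternate-pool recovery --filesystem <original fs name> --force-corrupt --force-init <original data pool name>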

::

ceph fs flag set enable_multiple true --yes-i-really-mean-it
rados mkpool recovery

@jcsp (Contributor) commented Aug 10, 2016

Let's not encourage people to use "rados mkpool", "ceph osd pool create" is nicer and forces them to choose sane PG counts
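For example, the suggested alternative would look like this (the pool name matches the excerpt above; the PG count of 64 is illustrative, not from the PR)::

ceph osd pool create recovery 64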

cephfs-journal-tool --rank recovery-fs:0 journal reset --force

After recovery, some recovered directories will have incorrect link counts. Set
mds_debug_scatterstat=false to prevent the MDS from checking the link counts,

@jcsp (Contributor) commented Aug 10, 2016

mds_debug_scatterstat is false by default, so we should say something like "ensure mds_debug_scatterstat is set to false (the default)" to avoid confusing people who have never heard of the setting, and leave it out of the ceph-mds commandline so that people can just start the MDS as normal
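To confirm the current value on a running daemon (assuming an MDS with ID "a"), the admin socket can be queried::

ceph daemon mds.a config get mds_debug_scatterstat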


::

ceph-mds -i a --mds_debug_scatterstat=false

@jcsp (Contributor) commented Aug 10, 2016

In general MDSs will still be running even if the system is damaged, so let's just note here that we assume that you have a running MDS and should start one if it isn't already running, and use that MDS's ID in the subsequent scrub_path daemon command
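For example, with a running MDS whose ID is "a" (this scrub_path form follows the disaster-recovery docs of the era; the exact options are an assumption)::

ceph daemon mds.a scrub_path / recursive repair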

if ((data_pools.find(data) != data_pools.end()
     || fs.second->mds_map.metadata_pool == metadata)
    && ((!cmd_getval(g_ceph_context, cmdmap, "allow_overlay", sure)
         || sure != "--allow-dangerous-metadata-overlay"))) {

@jcsp (Contributor) commented Aug 10, 2016

My original thought was that we would only permit this if one of the filesystems is marked damaged, so that there was no way for the user to have two running MDSs pointing at the same pool. We should probably still do that, or at least enforce that cluster_down is set for one of the filesystems at all times -- users are careless with the --this-is-so-dangerous-i-cant-even flags.
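A hypothetical sketch of the stricter gate jcsp describes, alongside the overlay check above (treat CEPH_MDSMAP_DOWN, test_flag(), and get_damaged() as assumed MDSMap API names, not code from this PR):

// Hypothetical: permit the overlay only when the existing filesystem
// cannot have active MDS daemons, i.e. its cluster_down flag is set or
// one of its ranks is marked damaged.
bool existing_fs_is_down =
    fs.second->mds_map.test_flag(CEPH_MDSMAP_DOWN) ||  // cluster_down set
    !fs.second->mds_map.get_damaged().empty();         // damaged rank(s)
if (!existing_fs_is_down) {
  ss << "refusing to overlay the pools of a filesystem that may still "
     << "have active MDS daemons";
  return -EPERM;
}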

@fullerdj fullerdj force-pushed the fullerdj:wip-djf-15069 branch from 08e78a0 to 5ffcf40 Aug 11, 2016

@jcsp (Contributor) commented Sep 6, 2016

I'm seeing failures on TestJournalRepair.test_inject_to_empty and TestDataScan.test_stashed_layout that are probably attributable to this PR:
http://pulpito.ceph.com/jspray-2016-09-05_10:01:52-fs-wip-jcsp-testing-20160902-testing-basic-mira/400996
http://pulpito.ceph.com/jspray-2016-09-05_10:01:52-fs-wip-jcsp-testing-20160902-testing-basic-mira/400938

@jcsp jcsp assigned fullerdj and unassigned jcsp Sep 6, 2016

@fullerdj fullerdj force-pushed the fullerdj:wip-djf-15069 branch from 5ffcf40 to 27d00d3 Sep 6, 2016

@fullerdj (Contributor, Author) commented Sep 6, 2016

Fixed, thanks.

@fullerdj fullerdj assigned jcsp and unassigned fullerdj Sep 6, 2016


@jcsp jcsp assigned fullerdj and unassigned jcsp Sep 14, 2016

@jcsp (Contributor) commented Sep 20, 2016

@fullerdj can you look into that test failure? Is that test passing locally for you?

@fullerdj (Contributor, Author) commented Sep 20, 2016

Okay, I've updated the ceph-qa-suite PR. It passes for me locally either way, so I'll try to fire off a teuthology run to verify.

Failure; I'll continue to investigate.

@jcsp (Contributor) commented Mar 8, 2017

@fullerdj could you revive this one please?

@fullerdj (Contributor, Author) commented Mar 8, 2017

@fullerdj fullerdj force-pushed the fullerdj:wip-djf-15069 branch 5 times, most recently from 28be6e5 to cad706a Mar 23, 2017

fullerdj added some commits Jul 5, 2016

mon/cephfs: Warn about overlaying CephFS metadata pools
Don't allow creation of filesystems with overlaying metadata pools unless
the user passes --allow-dangerous-pool-overlay.

Signed-off-by: Douglas Fuller <dfuller@redhat.com>

cephfs: Permit recovering metadata into a new RADOS pool
Add a procedure that permits reconstructing metadata in a potentially
damaged cephfs metadata pool and writing the results into a
freshly-initialized pool that refers to the same data pool. Add option
flags to override checks that would ordinarily prevent this and add
options to the recovery tools to write output to a separate pool instead of
the one selected for recovery. See docs/cephfs/disaster-recovery.rst for
details.

Fixes: http://tracker.ceph.com/issues/15068
Fixes: http://tracker.ceph.com/issues/15069
Signed-off-by: Douglas Fuller <dfuller@redhat.com>

qa/cephfs: Add test for rebuilding into an alternate metadata pool
Add a test to validate the ability of cephfs_data_scan and friends to
recover metadata from a damaged CephFS installation into a fresh metadata
pool.

cf: http://tracker.ceph.com/issues/15068
cf: http://tracker.ceph.com/issues/15069
Signed-off-by: Douglas Fuller <dfuller@redhat.com>

@fullerdj fullerdj force-pushed the fullerdj:wip-djf-15069 branch from cad706a to 37bafff Apr 4, 2017

@jcsp jcsp merged commit d529121 into ceph:master Apr 10, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
default: Build finished.