MDSMonitor: avoid health checks on new file system #21810

batrick · 2018-05-04T05:09:41Z

Fixes: http://tracker.ceph.com/issues/23885

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

batrick · 2018-05-04T05:10:33Z

I'm not really a fan of this solution but it works. @jcsp, any better ideas?

jcsp · 2018-05-04T08:26:11Z

For the case where the user creates MDSs first and then the filesystem, we could perhaps do a tick() after creating the FS in pending_fsmap (but before committing it) so that it has daemons assigned to it from the beginning? That way no need to explicitly give anything a pass on the health checks.

In the case where they create a filesystem before creating any MDSs, I think the current behaviour is OK -- the health warning acts as a cue to new users that "fs new" on its own is not sufficient to have a working filesystem.
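The two cases above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyPendingMap`, `ToyFilesystem`, `toy_fs_new` and `toy_has_health_warning` are hypothetical names invented here and are not part of Ceph's monitor code. The idea shown is that a standby already known when the file system is created gets rank 0 in the pending state, so no warning would be raised, while a file system created with no daemons keeps its warning.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy stand-ins for monitor state; these are NOT Ceph's FSMap/Filesystem types.
struct ToyFilesystem {
  std::string name;
  std::string rank0;  // name of the daemon holding rank 0; empty if unassigned
};

struct ToyPendingMap {
  std::deque<std::string> standbys;        // standby MDS daemons already registered
  std::vector<ToyFilesystem> filesystems;  // file systems in the pending state
};

// Create a file system in the pending map. If a standby is already
// registered, give it rank 0 before anything is committed.
// Returns the index of the new file system.
inline std::size_t toy_fs_new(ToyPendingMap& pending, const std::string& name) {
  assert(!name.empty() && "a file system needs a name");
  ToyFilesystem fs{name, ""};
  if (!pending.standbys.empty()) {
    fs.rank0 = pending.standbys.front();
    pending.standbys.pop_front();
  }
  pending.filesystems.push_back(fs);
  return pending.filesystems.size() - 1;
}

// In this toy model, a warning applies to any file system without a rank 0 holder.
inline bool toy_has_health_warning(const ToyFilesystem& fs) {
  return fs.rank0.empty();
}
```

With one standby registered before `toy_fs_new` is called, the new file system starts with rank 0 filled; with none, `toy_has_health_warning` stays true, which matches the "fs new on its own is not sufficient" cue described above.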

batrick · 2018-05-04T17:48:54Z

> For the case where the user creates MDSs first and then the filesystem, we could perhaps do a tick() after creating the FS in pending_fsmap (but before committing it) so that it has daemons assigned to it from the beginning? That way no need to explicitly give anything a pass on the health checks.

I tried that but the mon froze due to proposal request logic I think. I've tried a different approach which almost works. See the commit message.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

This avoids unnecessary health warnings. However, the original issue in i23885 still exists because the standbys are not available at fs creation time. If you create a new file system after these standbys are available, then you will observe that the promotion works to silence the warnings.

Fixes: http://tracker.ceph.com/issues/23885
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

batrick · 2018-05-04T18:01:56Z

Adding a sleep before creating the fs in vstart.sh works. I think this is done. @jcsp what do you think?

batrick · 2018-05-04T21:07:13Z

retest this please

jcsp · 2018-05-08T09:16:11Z

src/mon/FSCommands.cc

    ss << "new fs with metadata pool " << metadata << " and data pool " << data;
+
+    // assign a standby to rank 0 to avoid health warnings
+    std::string _name;


Does it work to just call "maybe_promote_standby(fs)" here? I see that calling tick() was problematic because it does a propose_pending(), but maybe_promote_standby in isolation should be alright.

batrick · 2018-05-08

We would actually use maybe_resize_cluster because there are no ranks in. I didn't want to use that because it needs to check the current epoch FSMap to see if a rank has become active:

ceph/src/mon/MDSMonitor.cc

Lines 1754 to 1762 in eb5ca24

    /* Check that both the current epoch mds_map is resizeable as well as the
     * current batch of changes in pending. This is important if an MDS is
     * becoming active in the next epoch.
     */
    if (!fsmap_mds_map.is_resizeable() ||
        !pending_mds_map.is_resizeable()) {
      dout(5) << __func__ << " mds_map is not currently resizeable" << dendl;
      return false;
    }

maybe_resize_cluster could be restructured to work but I'm not sure it's worth it.
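The shape of the quoted guard can be sketched in isolation. This is a minimal illustration, not Ceph code: `ToyMdsMap` and `toy_can_resize` are hypothetical names, and the toy definition of "resizeable" (every rank that is in has reached active) is an assumption made for the example rather than the behaviour of `MDSMap::is_resizeable()`.

```cpp
#include <cassert>

// Toy MDS map. "Resizeable" here means no rank is still on its way to
// active; this definition is an assumption for illustration only.
struct ToyMdsMap {
  int ranks_in;      // ranks that exist in this map
  int ranks_active;  // ranks that have reached the active state

  bool is_resizeable() const {
    assert(ranks_active <= ranks_in);
    return ranks_in == ranks_active;
  }
};

// Same shape as the quoted guard: refuse to resize unless both the
// committed (current epoch) map and the pending map are stable.
inline bool toy_can_resize(const ToyMdsMap& committed, const ToyMdsMap& pending) {
  return committed.is_resizeable() && pending.is_resizeable();
}
```

The point of checking both maps is that a rank may be stable in the committed epoch while a change queued in pending is about to move it, so either map alone can give a misleading answer.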

batrick added bug-fix cephfs Ceph File System needs-review labels May 4, 2018

batrick requested a review from jcsp May 4, 2018 05:09

batrick force-pushed the i23885 branch from e1b167c to 5077906 Compare May 4, 2018 17:48

batrick added 2 commits May 4, 2018 11:01

MDSMonitor: always prints standbys even if no fs

ad75128

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

batrick force-pushed the i23885 branch from cf591ec to 93bc8c5 Compare May 4, 2018 18:01

batrick added wip-pdonnell-testing and removed wip-pdonnell-testing labels May 4, 2018

batrick requested a review from jecluis May 7, 2018 23:02

batrick added the wip-pdonnell-testing label May 7, 2018

jcsp reviewed May 8, 2018

jcsp approved these changes May 8, 2018

batrick merged commit 93bc8c5 into ceph:master May 8, 2018

batrick added a commit that referenced this pull request May 8, 2018

Merge PR #21810 into master

f8aa12a

* refs/pull/21810/head:
    MDSMonitor: promote standby after fs creation
    MDSMonitor: always prints standbys even if no fs

Reviewed-by: John Spray <john.spray@redhat.com>

batrick deleted the i23885 branch May 23, 2018 18:45
