MDSMonitor: avoid health checks on new file system #21810

Merged
merged 2 commits into from May 8, 2018

batrick
Member

@batrick batrick commented May 4, 2018

Fixes: http://tracker.ceph.com/issues/23885

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

@batrick batrick requested a review from jcsp May 4, 2018 05:09
@batrick
Member Author

batrick commented May 4, 2018

I'm not really a fan of this solution but it works. @jcsp, any better ideas?

@jcsp
Contributor

jcsp commented May 4, 2018

For the case where the user creates MDSs first and then the filesystem, we could perhaps do a tick() after creating the FS in pending_fsmap (but before committing it) so that it has daemons assigned to it from the beginning? That way no need to explicitly give anything a pass on the health checks.

In the case where they create a filesystem before creating any MDSs, I think the current behaviour is OK -- the health warning acts as a cue to new users that "fs new" on its own is not sufficient to have a working filesystem.

@batrick
Member Author

batrick commented May 4, 2018

> For the case where the user creates MDSs first and then the filesystem, we could perhaps do a tick() after creating the FS in pending_fsmap (but before committing it) so that it has daemons assigned to it from the beginning? That way no need to explicitly give anything a pass on the health checks.

I tried that, but the mon froze, I think because of the proposal request logic. I've tried a different approach, which almost works. See the commit message.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This avoids unnecessary health warnings. However, the original issue in i23885
still exists because the standbys are not available at fs creation time. If you
create a new file system after these standbys are available, then you will
observe that the promotion works to silence the warnings.

Fixes: http://tracker.ceph.com/issues/23885

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
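
The ordering effect described in the commit message can be sketched with a toy model. This is not the MDSMonitor code: `ToyFilesystem`, `ToyPendingMap`, `create_fs_and_promote` and `has_rank0_warning` are all hypothetical names made up for illustration. The sketch only assumes that a standby registered before file system creation can be given rank 0 in the same pending change, while a file system created with no standbys is left with an unfilled rank.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Toy stand-ins; none of these are Ceph's real FSMap/MDSMap classes.
struct ToyFilesystem {
  std::string name;
  std::map<int, std::string> ranks;  // rank -> name of the daemon holding it
};

struct ToyPendingMap {
  std::deque<std::string> standbys;                  // daemons not yet assigned
  std::map<std::string, ToyFilesystem> filesystems;  // keyed by fs name
};

// Create a file system in the pending map and, if a standby is already
// registered, give it rank 0 as part of the same pending change.
// Returns the promoted daemon's name, or std::nullopt if none was available.
std::optional<std::string> create_fs_and_promote(ToyPendingMap& pending,
                                                 const std::string& fs_name) {
  assert(pending.filesystems.count(fs_name) == 0);  // name must be unused
  ToyFilesystem fs{fs_name, {}};
  std::optional<std::string> promoted;
  if (!pending.standbys.empty()) {
    promoted = pending.standbys.front();
    pending.standbys.pop_front();
    fs.ranks[0] = *promoted;
  }
  pending.filesystems.emplace(fs_name, std::move(fs));
  return promoted;
}

// Toy health check: warn when a file system has no daemon on rank 0.
bool has_rank0_warning(const ToyPendingMap& pending,
                       const std::string& fs_name) {
  auto it = pending.filesystems.find(fs_name);
  return it != pending.filesystems.end() && it->second.ranks.count(0) == 0;
}
```

In this model, creating the file system after a standby exists yields no warning, and creating it first leaves the warning in place until something else fills rank 0.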
@batrick
Member Author

batrick commented May 4, 2018

Adding a sleep before creating the fs in vstart.sh works. I think this is done. @jcsp what do you think?

@batrick
Member Author

batrick commented May 4, 2018

retest this please

ss << "new fs with metadata pool " << metadata << " and data pool " << data;

// assign a standby to rank 0 to avoid health warnings
std::string _name;
Contributor

Does it work to just call "maybe_promote_standby(fs)" here? I see that calling tick() was problematic because it does a propose_pending(), but maybe_promote_standby in isolation should be alright.

Member Author

We would actually use maybe_resize_cluster because there are no ranks in. I didn't want to use that because it needs to check the current epoch FSMap to see if a rank has become active:

ceph/src/mon/MDSMonitor.cc

Lines 1754 to 1762 in eb5ca24

  /* Check that both the current epoch mds_map is resizeable as well as the
   * current batch of changes in pending. This is important if an MDS is
   * becoming active in the next epoch.
   */
  if (!fsmap_mds_map.is_resizeable() ||
      !pending_mds_map.is_resizeable()) {
    dout(5) << __func__ << " mds_map is not currently resizeable" << dendl;
    return false;
  }

maybe_resize_cluster could be restructured to work but I'm not sure it's worth it.
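
One reading of why that guard is awkward for a brand-new file system, sketched with toy types: the check consults both the committed (current epoch) map and the pending one, and a file system created in the current pending batch has nothing committed yet. `ToyMdsMap` and `may_resize` are hypothetical names, the real `is_resizeable()` logic is not reproduced, and returning false when no committed map exists is an assumption of this sketch rather than documented Ceph behaviour.

```cpp
#include <cassert>
#include <optional>

// Toy stand-in for an MDS map; only carries a precomputed flag.
struct ToyMdsMap {
  bool resizeable;
  bool is_resizeable() const { return resizeable; }
};

// Same shape as the quoted guard: resizing is allowed only when both the
// committed map and the pending map say so. `committed` is empty when the
// file system exists only in the pending batch (assumption of this sketch).
bool may_resize(const std::optional<ToyMdsMap>& committed,
                const ToyMdsMap& pending) {
  if (!committed) {
    return false;  // nothing in the current epoch to consult
  }
  return committed->is_resizeable() && pending.is_resizeable();
}
```

Under these assumptions the guard can never pass in the same batch that creates the file system, which fits the choice to promote a standby directly at creation time instead.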

@batrick batrick merged commit 93bc8c5 into ceph:master May 8, 2018
batrick added a commit that referenced this pull request May 8, 2018
* refs/pull/21810/head:
	MDSMonitor: promote standby after fs creation
	MDSMonitor: always prints standbys even if no fs

Reviewed-by: John Spray <john.spray@redhat.com>
@batrick batrick deleted the i23885 branch May 23, 2018 18:45