
pacific: MDSMonitor: monitor crash after upgrade from ceph 15.2.13 to 16.2.4 #42536

Merged 9 commits from batrick:i51940 into ceph:pacific on Aug 16, 2021

Conversation

@batrick (Member) commented Jul 28, 2021

@batrick batrick added cephfs Ceph File System needs-qa labels Jul 28, 2021
@batrick batrick added this to the pacific milestone Jul 28, 2021
@batrick (Member, Author) commented Jul 28, 2021

jenkins test make check

This adds an upgrade suite to ensure that a Ceph cluster without a
CephFS file system does not blow up on upgrade (in particular, that the
MDSMonitor does not trip). This was developed to potentially reproduce
tracker 51673 but the actual cause for that issue was an old encoding
for the MDSMap which was obsoleted in Pacific. You must create a cluster
older than the FSMap (~Hammer or Infernalis) to reproduce. In any case,
this upgrade suite may be useful in the future so let's keep it!

Related-to: https://tracker.ceph.com/issues/51673
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 9941188)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 147c27c)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 9297690)
… epoch

The PaxosService code already excludes the value returned by
PaxosService::get_trim_to as the upper bound of the range of epochs to
trim. Without this fix, you need to set mon_mds_force_trim_to to one
greater than the epoch you want to trim _and_ force the current epoch to
be one greater than that; the net result being that you can only force
trimming up to 2 epochs behind the current epoch.

This change is helpful for resolving issue 51673, but not strictly
necessary.
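The behavior described above can be modeled in a short sketch. This is illustrative Python, not Ceph code: `trim_range` is a hypothetical helper standing in for the PaxosService trim logic, showing that the value returned by `get_trim_to` is treated as an exclusive upper bound, so the named epoch itself is never trimmed.

```python
# Hypothetical model (not Ceph code) of the exclusive-upper-bound trim
# semantics: epochs in [first_committed, trim_to) are trimmed, and the
# epoch returned by get_trim_to() itself is excluded.

def trim_range(first_committed, trim_to):
    """Return the epochs that would be trimmed, treating trim_to as an
    exclusive upper bound."""
    return list(range(first_committed, trim_to))

# Asking to trim "up to" epoch 5 leaves epoch 5 untrimmed:
assert trim_range(1, 5) == [1, 2, 3, 4]
```

This is why, before the fix, `mon_mds_force_trim_to` had to be set one past the target epoch, and the current epoch had to be one past that again.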

Related-to: https://tracker.ceph.com/issues/51673
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit d9dc2f1)

Conflicts:
	src/common/options/mon.yaml.in: doc change dropped
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ee899d9)
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 5ddaa36)

Conflicts:
	qa/tasks/cephfs/test_admin.py: trivial conflict
This throws a proper exception which can be handled.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 47a8273)
Fixes: https://tracker.ceph.com/issues/51673
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 4298f97)
To flush older versions of the struct (which, for clusters that have never used CephFS, may still be an empty MDSMap), we need to force a proposal so those older versions are trimmed.

This is the main fix of this branch. We removed the code which processed old encodings of the MDSMap in the mon store via 60bc524. That broke old Ceph clusters which never used CephFS (see the cited ticket below), because the initial epoch is an empty MDSMap (dating back to Infernalis/Hammer) that is never updated. So the fix here is to do proposals periodically until all of the old structs are automatically trimmed by the mons.
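The periodic-proposal approach can be sketched as a loop. This is a toy model, not Ceph code: the function name, the `retention_window` parameter, and the trimming rule are all illustrative assumptions standing in for the monitor's real paxos trimming, which keeps a bounded window of recent committed versions.

```python
# Hypothetical model (names and the retention rule are illustrative,
# not Ceph APIs): keep proposing new map epochs until normal trimming
# has discarded every epoch that predates the current encoding.

def flush_old_epochs(first_committed, current_epoch, oldest_new_format,
                     retention_window=500):
    """Propose empty updates until every epoch older than
    oldest_new_format has been trimmed; return the proposal count."""
    proposals = 0
    while first_committed < oldest_new_format:
        current_epoch += 1  # each proposal commits a new epoch
        proposals += 1
        # modeled trimming: keep only the most recent epochs
        first_committed = max(first_committed,
                              current_epoch - retention_window)
    return proposals
```

The point of the model is only that the old, un-decodable epochs age out of the store through ordinary trimming; no special-case decoding of the old format is needed.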

Fixes: 60bc524
Fixes: https://tracker.ceph.com/issues/51673
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit 56c3fc8)
@neha-ojha neha-ojha requested a review from ajarr August 3, 2021 00:12
@ajarr (Contributor) commented Aug 3, 2021

PR looks good. Waiting for test results before approving.

@ajarr (Contributor) left a comment


@yuriw yuriw merged commit 0b5500a into ceph:pacific Aug 16, 2021
@batrick batrick deleted the i51940 branch March 27, 2023 19:01
4 participants