tools/cephfs: add tmap_upgrade #7003
Conversation
(testing this could be kind of interesting... dunno if any of the upgrade tests cover a version old enough to still use tmaps)
LGTM. Can we get a simple unit test that forces tmap to be used, then cleans up with "tmap_upgrade"?
return overall_r;
I guess overall_r can never be 0, because the metadata pool does not store only dirfrag objects. I think we should filter out non-dirfrag objects.
To keep it super-simple I've just changed it to ignore the EINVALs that we get on non-omap/tmap objects.
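The "ignore EINVAL" strategy can be sketched as below. This is a hypothetical Python model, not the actual C++ from the PR: `try_tmap2omap` stands in for issuing an OSD TMAP2OMAP op on one object, and `-errno.EINVAL` models the error returned for objects that are neither tmap nor omap.

```python
# Hypothetical sketch of tolerating EINVAL while upgrading every object
# in a pool. try_tmap2omap is a stand-in for a per-object TMAP2OMAP op.
import errno

def upgrade_pool(objects, try_tmap2omap):
    """Run the upgrade op on every object; tolerate EINVAL, keep real errors."""
    overall_r = 0
    for obj in objects:
        r = try_tmap2omap(obj)
        if r == 0 or r == -errno.EINVAL:
            continue  # success, or a non-tmap/omap object: skip it
        overall_r = r  # remember the last real error
    return overall_r

# Example: a plain data object returns -EINVAL and is silently skipped.
results = {"1.00000000": 0, "some_data_obj": -errno.EINVAL}
print(upgrade_pool(results, lambda o: results[o]))  # prints 0
```

Note that with this approach `overall_r` stays 0 even when most objects in the pool are not dirfrags, which is what makes the filtering question above moot.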
Force-pushed from 15f09f0 to fecbc77
I've added a test now. It's kind of awkward because the only way to create tmaps is using librados, so instead of being a normal ceph-qa-suite python test this is a C++ test in the style of the librados tests.
test this please
I built v0.74 and created a cluster with an MDS, and created a whole bunch of directories. After that I checked out hammer and built it; I just started the OSDs and didn't do anything with the MDS. Then I fetched your branch, built that, started up the cluster, and executed:

    ./cephfs-data-scan tmap_upgrade metadata
    ./ceph mds stat

I did note that a bunch of files in the filestore that originally had 230-byte length (objects for empty directories with tmaps) are now empty, which means the conversion from tmap to omap must have happened.
@dzafman hmm, did you happen to keep the MDS logs? Would be useful to know what triggered the 'damaged' state.
I guess we're blocked until @jcsp gets those logs? |
@jcsp Here's the log output of the MDS thread that declared damage, after executing "./cephfs-data-scan tmap_upgrade metadata" and starting up the MDS. I actually started all daemons.
OK, I think I was overestimating the safety of running TMAP2OMAP on objects. The TMAP loading code will accept anything that looks vaguely decode-able as a couple of bufferlists, and if it sees something that it thinks it could load, it will truncate the body of the object.
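To illustrate why a lenient decoder is dangerous here, the sketch below uses a simplified length-prefixed "header + key/value map" layout (this is a toy model, not Ceph's exact tmap encoding): any blob whose leading length fields happen to fit will decode "successfully", and the caller may then go on to truncate the object body.

```python
# Simplified model of a lenient length-prefixed decode, showing how
# arbitrary bytes can be mistaken for a valid tmap-like structure.
import struct

def try_decode_tmap(blob):
    """Return (header, {key: value}) if the blob parses, else None."""
    try:
        off = 0
        (hlen,) = struct.unpack_from("<I", blob, off); off += 4
        header = blob[off:off + hlen]
        if len(header) != hlen:
            return None
        off += hlen
        (count,) = struct.unpack_from("<I", blob, off); off += 4
        entries = {}
        for _ in range(count):
            (klen,) = struct.unpack_from("<I", blob, off); off += 4
            key = blob[off:off + klen]; off += klen
            (vlen,) = struct.unpack_from("<I", blob, off); off += 4
            entries[key] = blob[off:off + vlen]; off += vlen
        return header, entries
    except struct.error:
        return None

# Eight zero bytes are NOT a tmap, yet they decode as an empty one:
print(try_decode_tmap(b"\x00" * 8))  # prints (b'', {}) -- looks valid!
```

The eight-zero-bytes case is exactly the trap: a non-tmap object full of low-valued bytes can look "vaguely decode-able", so the conversion has to be restricted to objects known to be dirfrags rather than attempted blindly.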
Force-pushed from 1d4a668 to 43b4271
Updated: should work properly this time!
@jcsp After the tmap upgrade the MDS crashes with an assert. I've attached a log in tracker: http://tracker.ceph.com/issues/13768 . Also, after the upgrade I found that the objects with 230-byte length, which I believe to be the tmaps, are now zero length as they should be, all except for inode 2. Looks like the code missed the root directory.

    dzafman$ find dev -type f -ls | grep " 230 "
Force-pushed from 43b4271 to fefe7c0
Ah, it's MDS_INO_CEPH (i.e. /.ceph), I've added that now, thanks for spotting it. However, I don't see a connection between that and the assertion. I'm suspicious that that could be from some other unrelated upgrade bug. @dzafman do you by any chance have a "rados export" of the metadata pool from before tmap_upgrade was run so that I can debug this?
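For context on why inode 2 was easy to miss: CephFS names dirfrag objects in the metadata pool as "&lt;inode-hex&gt;.&lt;frag-hex&gt;", so the root directory and the legacy /.ceph directory (MDS_INO_CEPH, inode 2, per the discussion above) live in objects that a scan keyed only on ordinary directory inodes would skip. A small sketch of the naming scheme (frag defaults to the root fragment, 0; real trees can have more fragments):

```python
# Sketch of CephFS dirfrag object naming: "<inode-hex>.<frag-hex>".
def dirfrag_object_name(ino, frag=0):
    return f"{ino:x}.{frag:08x}"

# Root directory (inode 1) and /.ceph (inode 2):
print(dirfrag_object_name(1))  # prints 1.00000000
print(dirfrag_object_name(2))  # prints 2.00000000
```

This matches the earlier observation that the object for inode 2 still had its 230-byte tmap after the first run of the tool.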
@jcsp No, I don't have a rados export, but I will generate one. I'm going to upgrade to Jewel and start the MDS to see if it crashes; if it does not, I'll create exports and then switch to your branch and try the tmap_upgrade.
@jcsp Manual testing passed: after a Dumpling build (create a bunch of directories) -> Firefly -> Hammer -> wip-cephfs-tmap-migrate, running "./cephfs-data-scan tmap_upgrade metadata" appeared to work. I stopped the MDS and added an assert in do_osd_ops() for CEPH_OSD_OP_TMAPGET; it was never called when traversing the entire directory tree.
@dzafman great, thanks for your persistence.
@gregsfortytwo @jcsp Is there a reason we don't do the tmap upgrade automatically, since we have to restart the MDS anyway?
The MDS already switches anything it sees and changes over to omap.
Right: the tool is so that the MDS doesn't have to traverse the entire metadata tree and do tmap upgrades. We could have built it into forward scrub and done it in the background, but this was way simpler. |
@gregsfortytwo We can't remove tmap in jewel+1 unless we require an upgrade to jewel (and running the tool) before upgrading to jewel+1. Will we want to allow an upgrade from Hammer directly to Jewel+1?
I think we should require upgraders stop at jewel, as we did with hammer. We'll need to document loudly that the tool is required for any clusters created before.. whenever it was that we stopped using tmap.
> I think we should require upgraders stop at jewel, as we did with hammer. We'll need to document loudly that the tool is required for any clusters created before.. whenever it was that we stopped using tmap.

Should probably backport it to Hammer (I assume that's feasible) once we're done; then they won't need the jewel stop.

Realistically I would be surprised if more than a couple people need this anyway, unless there are a lot of home users with rsync archives they haven't touched in years...
…into greg-fs-testing #7003
Assigned to self to make doc updates
Because TMAP support will go away in Jewel+1, we need a way to ensure anyone upgrading an ancient CephFS filesystem can make sure all their TMAPs have gone away. Signed-off-by: John Spray <john.spray@redhat.com>
Force-pushed from fefe7c0 to b322873
Added docs, this is good to go. I have added a recurring note to my calendar to send out an email and make sure the release notes mention it at the point we're releasing Jewel.
I think you can just update the PendingReleaseNotes file?
This is part of the run-up to removing all TMAP code in the Jewel+1 cycle. Signed-off-by: John Spray <john.spray@redhat.com>
Force-pushed from b322873 to e564111
@gregsfortytwo good point, forgot about that file. Updated.
…into greg-fs-testing #7003
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
http://qa-proxy.ceph.com/teuthology/gregf-2016-03-07_23:12:27-fs-greg-fs-testing-3-7-safe---basic-mira/
tools/cephfs: add tmap_upgrade
Reviewed-by: David Zafman <dzafman@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Because TMAP support will go away in Jewel+1, we need a way to ensure anyone upgrading an ancient CephFS filesystem can make sure all their TMAPs have gone away.

Signed-off-by: John Spray <john.spray@redhat.com>