os/bluestore: preserve source collection cache during split #12574

Merged
liewegas merged 1 commit into ceph:master from liewegas:wip-bluestore-split on Dec 20, 2016

@liewegas
Member

OSD split transactions look something like

 mkcoll new
 split old
 ...
 omap_rmkey_range old
 omap_setkeys old
 omap_setkeys new

The last three operations split the pg log into two
pieces. The problem is that omap_rmkey_range needs to
wait for earlier omap transactions to flush. Those are
tracked on the old onode, but split clears the cache and
drops that onode. The result is that we don't wait,
omap_rmkey_range leaves some recent pg log keys behind,
and on OSD restart we get an error because the object
doesn't belong to the (old) collection.
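
The failure mode described above can be modeled in a few lines. This is only a toy sketch, not BlueStore code: `Onode`, `Cache`, `rmkey_range_must_wait`, and `waits_after_split` are hypothetical names, and the single counter stands in for whatever bookkeeping the real onode uses to track unflushed omap transactions.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an in-memory onode: it remembers how many
// omap transactions are still in flight for the object.
struct Onode {
  int pending_omap_txns = 0;
};

// Hypothetical stand-in for a collection's onode cache.
struct Cache {
  std::map<std::string, std::shared_ptr<Onode>> onodes;

  // Return the cached onode, or create a fresh one. A fresh onode knows
  // nothing about transactions queued against an evicted instance.
  std::shared_ptr<Onode> get(const std::string& oid) {
    assert(!oid.empty());
    auto& slot = onodes[oid];
    if (!slot) slot = std::make_shared<Onode>();
    return slot;
  }
};

// A range delete has to wait only if the onode it sees reports pending
// omap writes.
bool rmkey_range_must_wait(Cache& cache, const std::string& oid) {
  return cache.get(oid)->pending_omap_txns > 0;
}

// Queue one omap write against "pgmeta", optionally clear the cache the
// way a split would, then report whether the range delete waits.
bool waits_after_split(bool split_clears_cache) {
  Cache cache;
  cache.get("pgmeta")->pending_omap_txns = 1;  // recent pg log write
  if (split_clears_cache) cache.onodes.clear();
  return rmkey_range_must_wait(cache, "pgmeta");
}
```

In this model `waits_after_split(false)` reports that the delete waits, while `waits_after_split(true)` reports that it does not, which corresponds to the case where recent pg log keys are left behind.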

Fix this by preserving cached objects that stay in the
old collection and clearing out only the objects that
are moving to the newly split collections. The preserved
objects include the pgmeta object that we care about.
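
A minimal sketch of the selective eviction idea follows. It assumes a simplified membership rule (an object belongs to a collection when the low `bits` bits of its hash match the collection's `seed`); the real placement logic in Ceph is more involved. `Collection`, `contains`, and `split_cache` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical collection: owns every object whose low `bits` hash bits
// match `seed`, and caches some of them (the int is a placeholder for a
// cached onode).
struct Collection {
  std::uint32_t seed;
  unsigned bits;
  std::map<std::uint32_t, int> cache;  // object hash -> cached onode

  bool contains(std::uint32_t hash) const {
    std::uint32_t mask = (bits >= 32) ? ~0u : ((1u << bits) - 1);
    return (hash & mask) == (seed & mask);
  }
};

// Narrow the old collection to `new_bits` and evict only the cached
// entries that no longer belong to it. Entries that stay (and any state
// attached to them) are preserved. Returns the number of evictions.
std::size_t split_cache(Collection& old_c, unsigned new_bits) {
  assert(new_bits >= old_c.bits);
  old_c.bits = new_bits;
  std::size_t evicted = 0;
  for (auto it = old_c.cache.begin(); it != old_c.cache.end();) {
    if (old_c.contains(it->first)) {
      ++it;  // stays in the old collection: keep the cached entry
    } else {
      it = old_c.cache.erase(it);  // moves to a new collection: drop it
      ++evicted;
    }
  }
  return evicted;
}
```

For example, a collection with seed 0 and 0 bits that caches hashes 0 through 3 keeps hashes 0 and 2 after `split_cache(c, 1)` and evicts hashes 1 and 3. An object that stays behind keeps its cached entry, so anything waiting on that entry still sees its pending state.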

(Note that we are one step closer to preserving the
cache contents across the split, but not quite there
yet: at this point we don't have all of the destination
collections. A change in the ObjectStore interface is
probably needed to do that without it being extremely
awkward.)

Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas os/bluestore: preserve source collection cache during split
ec5ba4e
@liewegas liewegas added this to the kraken milestone Dec 19, 2016
@liewegas liewegas requested review from ifed01 and xiexingguo Dec 20, 2016
@xiexingguo
Contributor

lgtm

@liewegas liewegas merged commit d5d8641 into ceph:master Dec 20, 2016

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodifed Submodules: submodules for project are unmodified
default: Build finished.
@liewegas liewegas deleted the liewegas:wip-bluestore-split branch Dec 20, 2016