Commit 5b75c7f

ethanwu-syno authored and Sasha Levin committed
ceph: supply snapshot context in ceph_uninline_data()
[ Upstream commit 305ff6b ]

The ceph_uninline_data() function was missing proper snapshot context handling for its OSD write operations. Both the CEPH_OSD_OP_CREATE and CEPH_OSD_OP_WRITE requests passed NULL instead of the appropriate snapshot context, which could lead to an unnecessary object clone.

Reproducer:

../src/vstart.sh --new -x --localhost --bluestore
// turn on cephfs inline data
./bin/ceph fs set a inline_data true --yes-i-really-really-mean-it
// allow fs_a client to take snapshot
./bin/ceph auth caps client.fs_a mds 'allow rwps fsname=a' mon 'allow r fsname=a' osd 'allow rw tag cephfs data=a'
// mount cephfs with fuse, since kernel cephfs doesn't support inline write
ceph-fuse --id fs_a -m 127.0.0.1:40318 --conf ceph.conf -d /mnt/mycephfs/
// bump snapshot seq
mkdir /mnt/mycephfs/.snap/snap1
echo "foo" > /mnt/mycephfs/test
// umount and mount it again using kernel cephfs client
umount /mnt/mycephfs
mount -t ceph fs_a@.a=/ /mnt/mycephfs/ -o conf=./ceph.conf
echo "bar" >> /mnt/mycephfs/test
./bin/rados listsnaps -p cephfs.a.data $(printf "%x\n" $(stat -c %i /mnt/mycephfs/test)).00000000

The output shows that the object was cloned unnecessarily:

1000000000a.00000000 (seq:2):
cloneid  snaps  size  overlap
2        2      4     []
head     -      8

but the expected output is:

10000000000.00000000 (seq:2):
cloneid  snaps  size  overlap
head     -      8

since there is no snapshot between these two writes.

The clone happened because the first OSD request, CEPH_OSD_OP_CREATE, doesn't pass a snap context, so the object is created with snap seq 0, while the later data writeback is equipped with a snapshot context. Since snap.seq(1) > object snap seq(0), the OSD clones the object.

Fix this by properly acquiring the snapshot context before performing the write operations.

Signed-off-by: ethanwu <ethanwu@synology.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
1 parent 57633b4 commit 5b75c7f

1 file changed: +22 -2 lines changed

fs/ceph/addr.c

Lines changed: 22 additions & 2 deletions
@@ -2203,6 +2203,7 @@ int ceph_uninline_data(struct file *file)
 	struct ceph_osd_request *req = NULL;
 	struct ceph_cap_flush *prealloc_cf = NULL;
 	struct folio *folio = NULL;
+	struct ceph_snap_context *snapc = NULL;
 	u64 inline_version = CEPH_INLINE_NONE;
 	struct page *pages[1];
 	int err = 0;
@@ -2230,6 +2231,24 @@ int ceph_uninline_data(struct file *file)
 	if (inline_version == 1) /* initial version, no data */
 		goto out_uninline;
 
+	down_read(&fsc->mdsc->snap_rwsem);
+	spin_lock(&ci->i_ceph_lock);
+	if (__ceph_have_pending_cap_snap(ci)) {
+		struct ceph_cap_snap *capsnap =
+				list_last_entry(&ci->i_cap_snaps,
+						struct ceph_cap_snap,
+						ci_item);
+		snapc = ceph_get_snap_context(capsnap->context);
+	} else {
+		if (!ci->i_head_snapc) {
+			ci->i_head_snapc = ceph_get_snap_context(
+				ci->i_snap_realm->cached_context);
+		}
+		snapc = ceph_get_snap_context(ci->i_head_snapc);
+	}
+	spin_unlock(&ci->i_ceph_lock);
+	up_read(&fsc->mdsc->snap_rwsem);
+
 	folio = read_mapping_folio(inode->i_mapping, 0, file);
 	if (IS_ERR(folio)) {
 		err = PTR_ERR(folio);
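The selection and reference-counting pattern in this hunk can be sketched in userspace as follows. All names here (`model_snapc`, `model_inode`, `model_get`, `model_put`, `model_pick_snapc`) are hypothetical and exist only for illustration; the locking done with `snap_rwsem` and `i_ceph_lock` is deliberately omitted, and the pending cap snap list is reduced to a single pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reference-counted snapshot context. */
struct model_snapc {
	int refs;
	uint64_t seq;
};

/* Take a reference; a NULL context is passed through unchanged. */
static struct model_snapc *model_get(struct model_snapc *sc)
{
	if (sc)
		sc->refs++;
	return sc;
}

/* Drop a reference; tolerating NULL keeps the cleanup path simple. */
static void model_put(struct model_snapc *sc)
{
	if (sc)
		sc->refs--;
}

/* Hypothetical stand-in for the inode fields the hunk consults. */
struct model_inode {
	struct model_snapc *pending_capsnap_ctx; /* NULL: no pending cap snap */
	struct model_snapc *head_snapc;          /* NULL until first use */
	struct model_snapc *realm_cached_ctx;    /* the realm's cached context */
};

/*
 * Pick the context for the upcoming writes: prefer the newest pending
 * cap snap's context, otherwise use the head context, populating it
 * from the realm's cached context if it is not set yet.
 * The caller owns one reference on the returned context.
 */
static struct model_snapc *model_pick_snapc(struct model_inode *ci)
{
	assert(ci != NULL);
	if (ci->pending_capsnap_ctx)
		return model_get(ci->pending_capsnap_ctx);
	if (!ci->head_snapc)
		ci->head_snapc = model_get(ci->realm_cached_ctx);
	return model_get(ci->head_snapc);
}
```

In this model, every path that calls `model_pick_snapc()` should later call `model_put()` exactly once on the result, mirroring the single `ceph_put_snap_context(snapc)` added at the `out:` label in the diff.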
@@ -2245,7 +2264,7 @@ int ceph_uninline_data(struct file *file)
 	req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout,
 				    ceph_vino(inode), 0, &len, 0, 1,
 				    CEPH_OSD_OP_CREATE, CEPH_OSD_FLAG_WRITE,
-				    NULL, 0, 0, false);
+				    snapc, 0, 0, false);
 	if (IS_ERR(req)) {
 		err = PTR_ERR(req);
 		goto out_unlock;
@@ -2261,7 +2280,7 @@ int ceph_uninline_data(struct file *file)
 	req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout,
 				    ceph_vino(inode), 0, &len, 1, 3,
 				    CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE,
-				    NULL, ci->i_truncate_seq,
+				    snapc, ci->i_truncate_seq,
 				    ci->i_truncate_size, false);
 	if (IS_ERR(req)) {
 		err = PTR_ERR(req);
@@ -2324,6 +2343,7 @@ int ceph_uninline_data(struct file *file)
 		folio_put(folio);
 	}
 out:
+	ceph_put_snap_context(snapc);
 	ceph_free_cap_flush(prealloc_cf);
 	doutc(cl, "%llx.%llx inline_version %llu = %d\n",
 	      ceph_vinop(inode), inline_version, err);
