osd/PrimaryLogPG: fix the oi size mismatch with real object size #21408

hsepeng · 2018-04-13T08:19:24Z

The oi (object_info_t) size can mismatch the real object size on the
persistent backend. The mismatch is introduced when an old write with
a smaller truncate_seq wrongly modifies the oi size.

Fixes: http://tracker.ceph.com/issues/23701
Signed-off-by: Peng Xie peng.hse@xtaotech.com

hsepeng · 2018-04-18T08:57:29Z

Hi @batrick, is there anyone who can review my fix? It is important, since the bug introduces an inconsistency between data and metadata, potentially causing data corruption.

tchaikov · 2018-04-18T12:08:43Z

src/osd/PrimaryLogPG.cc

@@ -7899,7 +7899,8 @@ void PrimaryLogPG::write_update_size_and_usage(object_stat_sum_t& delta_stats, o
  } else if (length)
    ch.insert(offset, length);
  modified.union_of(ch);
-  if (write_full || offset + length > oi.size) {
+  if (write_full ||


i don't think this issue exists in master. see 732a950#diff-fb41013d27e932534adb50eb3de2aaa5R7790

my guess is that the backend filestore did a short read, so the returned result was less than left. in the case of trimtrunc, this is expected. because the size of file representing the object is 0. so i think we should probably backport part of 732a950#diff-fb41013d27e932534adb50eb3de2aaa5R7790

@jdurgin what do you think?

@tchaikov
I have pinpointed the exact sequence of events that causes this bug; please follow my tracker link.
In summary:

The CephFS client first issues a write op with truncate_seq 0 to the corresponding object on the OSD.
The write is held in the client-side objectcacher (writeback cache) for a later flush.

Later on, the client sends a truncate op to the MDS, and the MDS issues the truncate op with truncate_seq 2 to the same object.

The OSD receives the truncate op first, sets oi.truncate_seq to 2, then truncates
the corresponding object to size 0.

The old client-side write then arrives (with truncate_seq 0). Its write offset starts at 4063232,
with a length of, for example, 100. However, it finds that the object has already been truncated to 0,
so it shrinks its write data buffer length to 0.

Finally, during write_update_size_and_usage, the code wrongly updates oi.size to the
offset (4063232).

The old write op from the CephFS FUSE client, which is out of range of the truncated object, should not modify oi.size and persist the oi metadata to disk, since that causes a mismatch among oi.size, oi.truncated_size and the real object data size on the backend. It is
not relevant to whether the read is a sparse async read or not. @jdurgin
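The sequence above can be modeled in a few lines of standalone C++. This is only a sketch under stated assumptions, not the Ceph code: `ObjectInfo`, `WriteOp`, `apply_truncate`, `clip_stale_write`, `update_size` and `simulate_race` are hypothetical names, and the `length > 0` guard is one plausible way to express "a write clipped to nothing must not grow the size", not necessarily the exact condition in the patch (the diff quoted in this thread is cut off after `if (write_full ||`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for object_info_t: only the fields
// discussed in this thread.
struct ObjectInfo {
  std::uint64_t size = 0;
  std::uint64_t truncate_size = 0;
  std::uint32_t truncate_seq = 0;
};

// Hypothetical write request carrying the client's view of truncate_seq.
struct WriteOp {
  std::uint64_t offset;
  std::uint64_t length;
  std::uint32_t truncate_seq;
};

// The truncate from the MDS: record the new seq and shrink the object.
void apply_truncate(ObjectInfo& oi, std::uint32_t seq, std::uint64_t new_size) {
  oi.truncate_seq = seq;
  oi.truncate_size = new_size;
  oi.size = new_size;
}

// A write with an older truncate_seq is clipped to the truncated range.
// Returns the number of bytes that may still be written.
std::uint64_t clip_stale_write(const ObjectInfo& oi, const WriteOp& op) {
  if (op.truncate_seq >= oi.truncate_seq) return op.length;  // not stale
  if (op.offset >= oi.truncate_size) return 0;               // fully cut off
  const std::uint64_t room = oi.truncate_size - op.offset;
  return op.length < room ? op.length : room;
}

// Size update step. With guarded == false this mirrors the condition on the
// removed diff line; with guarded == true a zero-length write is ignored
// (assumed guard, see the note above).
void update_size(ObjectInfo& oi, bool write_full, std::uint64_t offset,
                 std::uint64_t length, bool guarded) {
  const bool extends = offset + length > oi.size;
  const bool has_data = length > 0;
  if (write_full || (extends && (!guarded || has_data))) {
    oi.size = offset + length;
  }
}

// Replays the race: truncate (seq 2) lands first, then the old write (seq 0).
std::uint64_t simulate_race(bool guarded) {
  ObjectInfo oi;
  const WriteOp old_write{4063232, 100, 0};
  apply_truncate(oi, 2, 0);
  const std::uint64_t len = clip_stale_write(oi, old_write);
  assert(len == 0);  // the stale write has nothing left to write
  update_size(oi, false, old_write.offset, len, guarded);
  return oi.size;
}
```

In this model `simulate_race(false)` returns 4063232 although no data was written, which is the mismatch described above, while `simulate_race(true)` leaves the size at 0.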

@tchaikov @hsepeng You're both right. The particular assert won't be triggered in luminous or later due to the changes to COPY_GET handling to update the cursor with the logical size, not the size read off disk. However, it doesn't make sense to change the logical object size when the write has been superseded by a trimtrunc.

Thus, I think this patch does make sense for master, even though it won't trigger the same assert in cache tiering. It fixes a flaw in the usage accounting for this trimtrunc + write race case. Also, it points out an area where we need more testing - trimtrunc handling.

tchaikov · 2018-04-23T05:27:38Z

http://pulpito.ceph.com/kchai-2018-04-22_04:45:00-rados-wip-kefu-testing-2018-04-22-0053-distro-basic-mira/

tchaikov · 2018-04-23T05:28:41Z

@jdurgin since it's a bug fix and relatively low-risk, can we have it in mimic?

jdurgin · 2018-04-23T05:49:52Z

@tchaikov yeah I think this is fine for mimic

hsepeng force-pushed the oisizemismatch-osd-bugfix branch from 98c013e to 4a34f11 April 13, 2018 08:42

batrick added core needs-review backport and removed backport labels Apr 15, 2018

tchaikov reviewed Apr 18, 2018

tchaikov added the bug-fix label Apr 18, 2018

tchaikov requested a review from jdurgin April 18, 2018 12:09

tchaikov added needs-qa wip-kefu-testing and removed needs-review labels Apr 20, 2018

tchaikov approved these changes Apr 23, 2018

tchaikov self-assigned this Apr 23, 2018

tchaikov removed needs-qa wip-kefu-testing labels Apr 23, 2018

tchaikov merged commit 9f35318 into ceph:master Apr 23, 2018

hsepeng commented Apr 13, 2018 •

edited

hsepeng commented Apr 18, 2018

tchaikov Apr 18, 2018 •

edited

hsepeng Apr 18, 2018 •

edited

hsepeng Apr 20, 2018

jdurgin Apr 20, 2018

tchaikov commented Apr 23, 2018

tchaikov commented Apr 23, 2018

jdurgin commented Apr 23, 2018
