osd: force promote for ops which ec base pool can't handle #5776
Conversation
Awesome (and really quick ;-). ./virtualenv/bin/teuthology-suite --priority 101 --suite rbd --filter='rbd/qemu/{cache/writethrough.yaml cachepool/ec-cache.yaml clusters/fixed-3.yaml fs/btrfs.yaml msgr-failures/few.yaml workloads/qemu_xfstests.yaml}' --suite-branch master --distro ubuntu --ceph wonzhq-tmap-update --machine-type plana,burnupi,mira should confirm it works, assuming you have built wonzhq-tmap-update with this pull request in the gitbuilders.
http://pulpito.ceph.com/loic-2015-09-02_15:41:18-rbd-master---basic-multi/ is running the suite above on master to confirm it always fails and to provide a simple way to reproduce the error.
@wonzhq this should target infernalis, not master.
@dachary though I can access sepia to run teuthology, I'm not able to create a branch in the ceph repo. I'll need to ask someone else to do it for me.
@wonzhq I'll push the branch for you
@wonzhq wonzhq-tmap-update is building in http://ceph.com/gitbuilder.cgi
Please hold on this. The commit only addresses the problem for v1 images. For v2 images, we need to play around the
@wonzhq I'll repush to the gitbuilders whenever you tell me to
@liewegas can you take a look and check whether the fix makes sense? Thanks!
shouldn't this apply to all ops ec pools can't handle, i.e. everything that's not a class write, aligned append, or writefull?
@jdurgin you are right. For all ops which the ec base pool can't handle, we can't proxy them. Besides WRITE, ZERO and TRUNCATE, the others are SYNC_READ, MAPEXT, SPARSE_READ, CLONERANGE, TMAPGET, TMAPPUT, TMAPUP, OMAPSETVALS, OMAPSETHEADER, OMAPCLEAR, OMAPRMKEYS. I'll make the changes for them.
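The force-promote check being described can be sketched as a small predicate over the op codes listed above. This is an illustrative mock, not Ceph's actual code: the enum values and the helper name are hypothetical stand-ins for the real CEPH_OSD_OP_* constants and the cache-tier decision logic in the OSD.

```cpp
// Hypothetical stand-ins for the real CEPH_OSD_OP_* constants.
enum OsdOp {
  OP_READ, OP_APPEND, OP_WRITEFULL,
  OP_WRITE, OP_ZERO, OP_TRUNCATE,
  OP_SYNC_READ, OP_MAPEXT, OP_SPARSE_READ, OP_CLONERANGE,
  OP_TMAPGET, OP_TMAPPUT, OP_TMAPUP,
  OP_OMAPSETVALS, OP_OMAPSETHEADER, OP_OMAPCLEAR, OP_OMAPRMKEYS,
};

// Returns true when an op cannot be proxied to an EC (rollback-requiring)
// base pool and must instead force a promote into the cache tier.
bool must_promote_for_ec_base(OsdOp op, bool base_requires_rollback) {
  if (!base_requires_rollback)
    return false;  // a replicated base pool can handle any proxied op
  switch (op) {
  case OP_WRITE: case OP_ZERO: case OP_TRUNCATE:
  case OP_SYNC_READ: case OP_MAPEXT: case OP_SPARSE_READ: case OP_CLONERANGE:
  case OP_TMAPGET: case OP_TMAPPUT: case OP_TMAPUP:
  case OP_OMAPSETVALS: case OP_OMAPSETHEADER: case OP_OMAPCLEAR: case OP_OMAPRMKEYS:
    return true;   // EC base pool cannot execute these; promote first
  default:
    return false;  // e.g. plain reads can be proxied
  }
}
```

As the later review comments note, a blacklist like this is fragile; flipping it into a whitelist of ops the EC pool does support would be easier to keep up to date.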
@wonzhq wonzhq-tmap-update is building in http://ceph.com/gitbuilder.cgi
@dachary thanks! I've run the teuthology test for this in http://pulpito.ceph.com/yuan-2015-09-03_19:24:10-rbd-wonzhq-tmap-update---basic-multi/. It ran successfully.
@jdurgin do you think it is good to merge?
iter->op.op == CEPH_OSD_OP_TMAPGET ||
iter->op.op == CEPH_OSD_OP_TMAPPUT ||
iter->op.op == CEPH_OSD_OP_TMAPUP) {
if (base_pool->require_rollback()) { |
let's move the base_pool if so that it's the outer one?
I think a safer approach would be to whitelist the things that the erasure pool can do (instead of blacklisting stuff it can't). And make a helper to determine that.. like pg_pool_t::support_osd_op(int opcode)? Similar to supports_omap()...
Eh, actually a helper is probably overkill since this method is the only user, and it may need to take additional information into account (like the whole OSDOp).
I still think flipping this around to be a whitelist is an improvement, though!
yeah, a whitelist would be easier to keep up to date. For appends I think we need to check the length against the EC pool's alignment requirement.
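The whitelist approach suggested in this review thread could look roughly like the sketch below. Everything here is illustrative, not Ceph's real API: the op enum, the helper name ec_pool_supports_op, and the stripe_width parameter are hypothetical stand-ins for something like the proposed pg_pool_t::support_osd_op(int opcode), extended to take the append length into account.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the real CEPH_OSD_OP_* constants.
enum EcOp { EC_READ, EC_STAT, EC_DELETE, EC_APPEND, EC_WRITEFULL, EC_WRITE, EC_TMAPUP };

// Whitelist: can an EC base pool execute this op directly (without a
// promote into the cache tier)? 'len' only matters for appends, which
// must land on the EC pool's stripe alignment.
bool ec_pool_supports_op(EcOp op, uint64_t len, uint64_t stripe_width) {
  switch (op) {
  case EC_READ:
  case EC_STAT:
  case EC_DELETE:
  case EC_WRITEFULL:   // full-object overwrite is re-encodable as a whole
    return true;
  case EC_APPEND:      // only aligned appends are acceptable
    return stripe_width != 0 && len % stripe_width == 0;
  default:
    return false;      // overwrites, tmap/omap, etc. need a promote
  }
}
```

Compared with the blacklist in the patch, new op codes added later default to "unsupported" here, which fails safe: the worst case is an unnecessary promote rather than an ENOTSUPP from the EC pool.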
(iter->op.op != CEPH_OSD_OP_COPY_GET_CLASSIC) &&
(iter->op.op != CEPH_OSD_OP_COPY_GET) &&
(iter->op.op != CEPH_OSD_OP_COPY_FROM)) {
op->set_promote(); |
we can remove
- cache_* (can't use ec for cache pools)
- notify, watch (no watch/notify on ec pool)
from the list
   -29> 2015-09-07 00:46:37.840258 7fa36b07b700  5 -- op tracker -- seq: 13141, time: 2015-09-07 00:46:37.840192, event: throttled, op: osd_op(client.4547.0:3 0.obj [watch unwatch cookie 94197997405968] 174.db6ee63a ondisk+write+known_if_redirected e1013)
   -28> 2015-09-07 00:46:37.840265 7fa36b07b700  5 -- op tracker -- seq: 13141, time: 2015-09-07 00:46:37.840227, event: all_read, op: osd_op(client.4547.0:3 0.obj [watch unwatch cookie 94197997405968] 174.db6ee63a ondisk+write+known_if_redirected e1013)
   -27> 2015-09-07 00:46:37.840271 7fa36b07b700  5 -- op tracker -- seq: 13141, time: 0.000000, event: dispatched, op: osd_op(client.4547.0:3 0.obj [watch unwatch cookie 94197997405968] 174.db6ee63a ondisk+write+known_if_redirected e1013)
   -26> 2015-09-07 00:46:37.840282 7fa36b07b700 20 osd.2 1016 should_share_map client.4547 10.214.136.114:0/1011485 1013
   -25> 2015-09-07 00:46:37.840287 7fa36b07b700 20 osd.2 1016 client session last_sent_epoch: 0 versus osdmap epoch 1016
     0> 2015-09-07 00:46:37.843087 7fa36b07b700 -1 *** Caught signal (Segmentation fault) ** in thread 7fa36b07b700
 ceph version 9.0.3-1469-g2d7e480 (2d7e480)
 1: (()+0x7ec80a) [0x55d2c7a0b80a]
 2: (()+0x10340) [0x7fa372bc5340]
 3: (OSD::init_op_flags(std::shared_ptr&)+0x125) [0x55d2c75374c5]
 4: (OSD::handle_op(std::shared_ptr&, std::shared_ptr&)+0xbab) [0x55d2c7565c0b]
 5: (OSD::dispatch_op_fast(std::shared_ptr&, std::shared_ptr&)+0x19e) [0x55d2c756695e]
 6: (OSD::dispatch_session_waiting(OSD::Session, std::shared_ptr)+0x88) [0x55d2c7566c18]
 7: (OSD::ms_fast_dispatch(Message)+0x234) [0x55d2c7566f64]
 8: (AsyncConnection::process()+0x125a) [0x55d2c7bf3a0a]
 9: (EventCenter::process_events(int)+0x41b) [0x55d2c7bb2d8b]
 10: (Worker::entry()+0xd8) [0x55d2c7b92498]
 11: (()+0x8182) [0x7fa372bbd182]
 12: (clone()+0x6d) [0x7fa370f0447d]
That's the version prior to your last push, but I'm not sure the change would affect it crashing?
@liewegas is it possible
For ops which the ec base pool can't handle, if they are proxied to the base ec pool, ENOTSUPP is returned. Need to force promote the objects into the cache pool.

Fixes: ceph#12903
Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
Looks like it's possible
@wonzhq wonzhq-tmap-update is building in http://ceph.com/gitbuilder.cgi . You should be able to run teuthology-suite --priority 101 --suite rbd --filter='rbd/qemu/{cache/writethrough.yaml cachepool/ec-cache.yaml clusters/fixed-3.yaml fs/btrfs.yaml msgr-failures/few.yaml workloads/qemu_xfstests.yaml}' --suite-branch master --distro ubuntu --ceph wonzhq-tmap-update --machine-type plana,burnupi,mira to confirm it does the right thing once it's finished.
On Mon, 7 Sep 2015, wonzhq wrote:
Hmm, only if this is called before we do the osdmap epoch check in the
The rbd test case ran successfully at http://pulpito.ceph.com/yuan-2015-09-08_19:06:22-rbd-wonzhq-tmap-update---basic-multi/ |
lgtm |
osd: force promote for ops which ec base pool can't handle

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Loic Dachary <ldachary@redhat.com>
For ops which the ec base pool can't handle, if they are proxied to the
base ec pool, ENOTSUPP is returned. Need to force promote the objects
into the cache pool.
Fixes: #12903