New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: EC read handling: don't grab an objectstore error to use as the … #16663

Merged
merged 1 commit into from Aug 1, 2017

Conversation

Projects
None yet
5 participants
@dzafman
Member

dzafman commented Jul 28, 2017

…read error

A missing shard (ENOENT from objectstore) will currently get back to the client. Since ErasureCode::minimum_to_decode() returns EIO when there aren't enough shards to read an object, I'm going to always use that error code even though we've collected any and all errors from each shard's attempt to read.

If you use cmpext to read an object that doesn't exist, you should still get ENOENT I assume based on the primary having no shard. Also, if OSDs are down such that there aren't enough readable shards available that is handled earlier in the process (maybe that case hangs). Finally, if we get past those two cases and reading shards get ENOENT from the objectstore, we need to decide what error code to return. I hadn't thought about the fact that a client can't tell if ENOENT means that an object doesn't exist, or there are not enough shards because one or more shards got ENOENT when attempting to read from the objectstore.

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Jul 28, 2017

I don't think this has anything to do with CMPEXT (other than that having surfaced the issue again), does it?

In general yes, we should only return ENOENT if we believe that the object does not exist, not just because there happened to have been a missing shard somewhere.

@dzafman

This comment has been minimized.

Member

dzafman commented Jul 28, 2017

@gregsfortytwo This fixes CMPEXT getting confused by an ENOENT due to filestore error(s) for an object that should exist and doesn't contain all zeros.

non-existent object -> ENOENT
insufficient shards up -> hung until OSDs rejoin
filestore error(s) (ENOENT/EIO...) resulting in insufficient shards -> EIO

A scenario we aren't handling yet is a combination of down OSDs and filestore errors. If the non-error filestore reads plus the down OSDs yields sufficient shards, we should hang until those OSDs rejoin, otherwise it is a permanent error.

@dillaman

This comment has been minimized.

Contributor

dillaman commented Jul 28, 2017

@dzafman As long as CMPEXT can still filter out -ENOENT and any EC reads to non-existent objects don't cause cluster-wide log spamming, lgtm (note: if a user wanted CMPEXT to ensure the object exists, they could put a STAT op in front of it).

@dzafman

This comment has been minimized.

Member

dzafman commented Jul 28, 2017

@dillaman @gregsfortytwo This is why I didn't want you to suppress ENOENT errors in #16622

With this change and the removal all non-primary shards for object named "foo":

$ bin/rados -p ecpool get foo foo.out
error getting ecpool/foo: (5) Input/output error

With these clogs of objectstore errors.
2017-07-28 14:43:33.150217 osd.0 [ERR] handle_sub_read: Error -2 reading 1:602f83fe:::foo:head
2017-07-28 14:43:33.150825 osd.2 [ERR] handle_sub_read: Error -2 reading 1:602f83fe:::foo:head

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Jul 28, 2017

Without this change that scenario returns ENOENT to the client/rados tool, right?

Reviewed-by: Greg Farnum gfarnum@redhat.com

@dzafman

This comment has been minimized.

Member

dzafman commented Jul 28, 2017

@gregsfortytwo

Without this change that scenario returns ENOENT to the client/rados tool, right?

Yes

@dzafman dzafman requested a review from liewegas Jul 28, 2017

osd: EC read handling: don't grab an objectstore error to use as the …
…read error

Signed-off-by: David Zafman <dzafman@redhat.com>

@dzafman dzafman requested a review from tchaikov Jul 28, 2017

@tchaikov tchaikov self-assigned this Jul 29, 2017

@@ -0,0 +1,34 @@
#!/bin/bash

This comment has been minimized.

@tchaikov

tchaikov Jul 30, 2017

Contributor

this test fails http://pulpito.ceph.com/kchai-2017-07-30_00:01:51-rados-wip-kefu-testing-distro-basic-smithi/1460917/. because the test of rados/standalone/misc.yaml runs every executable under qa/standalone/misc/. and the executables under that directory are not launched by qa/run-standalone.sh.

This comment has been minimized.

@dzafman

dzafman Jul 31, 2017

Member

Actually, a test in misc that deliberately fails for testing purposes was run. I ignore that test in run-standalone.sh, but teuthology must be running it using another mechanism.

This comment has been minimized.

@tchaikov

tchaikov Jul 31, 2017

Contributor

yeah, exactly.

This comment has been minimized.

@dzafman

dzafman Jul 31, 2017

Member

I removed that script from standalone/misc. I'll but it somewhere that won't be run by teuthology. It isn't part of this pull request anymore.

@tchaikov

aside from the newly added always-fail test in misc directory. lgtm

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Jul 31, 2017

err = ec_impl->minimum_to_decode(want_to_read, have, &dummy_minimum) appears to be the key line. When I looked at this Friday I thought it was okay, but after standup today I think @jdurgin is correct — this will make us return errors based on the up set, not the set of OSDs which ought to be considered alive.

Of course, we already do return an error in that case, so this isn't worse. But we need to fix this, I think?

dryrun=true
shift
fi
#doregex=false

This comment has been minimized.

@tchaikov

tchaikov Jul 31, 2017

Contributor

might want to remove the not-yet-completed code

@ceph-jenkins

This comment has been minimized.

Collaborator

ceph-jenkins commented Jul 31, 2017

make check failed

@gregsfortytwo

This is better than the current behavior.

Reviewed-by: Greg Farnum gfarnum@redhat.com

@dzafman

This comment has been minimized.

Member

dzafman commented Aug 1, 2017

retest this please

@dzafman dzafman merged commit 901f14a into ceph:master Aug 1, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details

@dzafman dzafman deleted the dzafman:wip-ec-enoent branch Aug 1, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment