Merge upstream v12.0.3 plus https://github.com/ceph/ceph/pull/14997 #115

Merged
merged 1,057 commits into ses5 from wip-12-0-3-alt on May 26, 2017

Conversation

smithfarm

The upstream dmClock developer wants me to try this PR. He claims it will prevent the dmclock test binaries from being built.

liewegas added 30 commits May 5, 2017 13:38
No reason to wait for make_writeable(); ensure we have a valid snapc
from the start.

Signed-off-by: Sage Weil <sage@redhat.com>
If REQUIRE_LUMINOUS is set on the OSDMap, put the SnapSet on the head
and make it a whiteout.  This is simpler and will eventually (once all
the old snapdir objects have been fixed up) let us remove a ton of
snapdir-related headaches.

This is surprisingly simple.  Everywhere else that we work with snapdir,
we are already set up to handle the snapset on the head.  The only
difference is that we are preventing ourselves from moving from the
snapset-on-head
state to the snapset-on-snapdir-with-no-head state.  The only time this
happens is when the object is logically deleted in _delete_oid.

Signed-off-by: Sage Weil <sage@redhat.com>
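
A minimal sketch of the rule above, using simplified stand-in types (SnapSet and ObjectState here are not the real Ceph classes):

```cpp
// Hedged sketch only, not the actual PrimaryLogPG code. Illustrates the rule:
// with REQUIRE_LUMINOUS, a logically deleted head keeps its SnapSet and becomes
// a whiteout instead of moving the SnapSet to a separate snapdir object.
#include <cstdint>
#include <vector>

struct SnapSet { std::vector<uint64_t> clones; };  // simplified stand-in

struct ObjectState {
  bool exists = true;
  bool whiteout = false;
  SnapSet snapset;        // with the new scheme this always lives on the head
};

void logically_delete(ObjectState& head, bool require_luminous) {
  if (require_luminous) {
    head.whiteout = true; // head stays, SnapSet stays on the head
  } else {
    head.exists = false;  // legacy path: SnapSet would move to a snapdir object
  }
}
```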
We may still need to create a whiteout because clones still exist.

Arguably delete+ignore_cache is not the right way to remove whiteouts and
we should have a separate RADOS operation for this.  But we don't.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
This will make it easier to identify users of this field that need to
conditionally use either the old legacy_snaps or the new
SnapSet::clone_snaps vector.

Signed-off-by: Sage Weil <sage@redhat.com>
Store the per-clone snaps list in SnapSet if (1) REQUIRE_LUMINOUS is set
in the OSDMap and (2) the SnapSet isn't a 'legacy' SnapSet that hasn't
been converted yet by scrub.

Signed-off-by: Sage Weil <sage@redhat.com>
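
A hedged sketch of the two conditions, with a simplified stand-in for the SnapSet (the SnapSetSketch type and its is_legacy flag are illustrative, not the real API):

```cpp
// Sketch only: records per-clone snaps in the SnapSet when (1) REQUIRE_LUMINOUS
// is set and (2) the SnapSet has already been converted from legacy form.
#include <cstdint>
#include <map>
#include <vector>

using snapid_t = uint64_t;

struct SnapSetSketch {
  bool is_legacy = true;                                  // not yet converted by scrub
  std::map<snapid_t, std::vector<snapid_t>> clone_snaps;  // clone -> snaps
};

void record_clone_snaps(SnapSetSketch& ss, bool require_luminous,
                        snapid_t clone, const std::vector<snapid_t>& snaps) {
  if (require_luminous && !ss.is_legacy) {
    ss.clone_snaps[clone] = snaps;  // new location: SnapSet on the head
  }
  // otherwise the snaps stay in the clone's legacy_snaps field
}
```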
During scrub, we assemble the clone legacy_snaps values and put them
in the SnapSet.

Note that we do not bother clearing legacy_snaps in the clones; doing
that correctly would add complexity and is not worth the effort; we will
always have the SnapSet to indicate whether we need to look at the old
field or not.

If the rebuilt clone_snaps is not the correct size, we bail out; this can
happen if some clones are missing oi or missing entirely, or if there are
extra clones.

Signed-off-by: Sage Weil <sage@redhat.com>
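
A rough sketch of the rebuild-and-bail step, assuming the inputs are the clone list from the SnapSet and a per-clone legacy_snaps map (simplified types, not the real scrub code):

```cpp
// Sketch only: rebuild clone_snaps from legacy_snaps and bail out (nullopt)
// if any clone is missing its snaps or extra clones are present.
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

using snapid_t = uint64_t;
using clone_snaps_t = std::map<snapid_t, std::vector<snapid_t>>;

std::optional<clone_snaps_t>
rebuild_clone_snaps(const std::vector<snapid_t>& clones_in_snapset,
                    const clone_snaps_t& legacy_snaps_by_clone) {
  clone_snaps_t rebuilt;
  for (snapid_t clone : clones_in_snapset) {
    auto it = legacy_snaps_by_clone.find(clone);
    if (it == legacy_snaps_by_clone.end())
      return std::nullopt;  // clone missing its oi, or missing entirely
    rebuilt[clone] = it->second;
  }
  if (rebuilt.size() != legacy_snaps_by_clone.size())
    return std::nullopt;    // extra clones not listed in the SnapSet
  return rebuilt;
}
```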
Signed-off-by: Sage Weil <sage@redhat.com>
- Assume that any snapset we update, if not require_luminous, is a net-new
legacy SnapSet.  (It might be an existing one, which would be net-0, but
that is harder to tell.)

Then, during scrub,

- Any unreadable oi is assumed to include a legacy snapset
- Any snapset we encounter if !require_luminous is legacy
- Any object that should have a snapset but doesn't (corrupt or missing)
is assumed to be legacy.
- If we're trying to update a legacy SnapSet but have to abort, then it is
still legacy.

We could assume that a missing/broken snapset is not legacy, since it has
to be repaired anyway (and therefore shouldn't block upgrade), but I'm
not sure.  For now, we'll take the conservative approach of blocking the
upgrade if the snapset metadata is missing/corrupt.  (A sketch of this
accounting follows below.)

Signed-off-by: Sage Weil <sage@redhat.com>

# Conflicts:
#	src/osd/PrimaryLogPG.cc
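
A sketch of this accounting as a single hypothetical helper (ScrubFindings and counts_as_legacy_snapset are illustrative names, not the real scrub code):

```cpp
// Sketch only: "legacy" here means a SnapSet that still needs conversion
// before the upgrade can be considered complete.
struct ScrubFindings {
  bool require_luminous = false;
  bool oi_unreadable = false;       // object_info_t could not be decoded
  bool snapset_missing = false;     // snapset metadata corrupt or absent
  bool conversion_aborted = false;  // tried to convert but had to bail out
};

bool counts_as_legacy_snapset(const ScrubFindings& f) {
  if (f.oi_unreadable) return true;      // assume it hides a legacy snapset
  if (!f.require_luminous) return true;  // pre-luminous snapsets are legacy
  if (f.snapset_missing) return true;    // conservative: block the upgrade
  if (f.conversion_aborted) return true; // still legacy until converted
  return false;
}
```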
We include a check to make sure we do not delete a dirty whiteout if this
is a tier pool and the object is dirty.

Signed-off-by: Sage Weil <sage@redhat.com>
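
A minimal sketch of that check, with simplified flags (not the real cache-tier code):

```cpp
// Sketch only: a dirty whiteout on a tier pool must be flushed to the base
// pool before it can be removed.
struct ObjFlags { bool whiteout = false; bool dirty = false; };

bool can_remove_whiteout(const ObjFlags& obj, bool pool_is_tier) {
  if (pool_is_tier && obj.dirty)
    return false;        // keep the dirty whiteout until it has been flushed
  return obj.whiteout;   // only whiteouts are candidates for removal here
}
```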
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Replicas need this in order to store the clones in SnapMapper.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
In the new world, head_exists is always true.  Stop clearing it in
_delete_oid (except for legacy compatibility), and fix the various callers
and assertions to compensate.  Note that we now use head_exists as a flag
to guide us into the snapdir branch of finish_ctx() (which will go away
post-luminous).

Signed-off-by: Sage Weil <sage@redhat.com>
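
A hedged sketch of the new behaviour (HeadState and delete_oid_sketch are stand-ins, not the real _delete_oid):

```cpp
// Sketch only: head_exists stays true after a logical delete and is only
// cleared on the legacy-compat path; finish_ctx() can then use it to decide
// whether to take the (soon to be removed) snapdir branch.
struct HeadState { bool head_exists = true; bool whiteout = false; };

void delete_oid_sketch(HeadState& st, bool legacy_compat) {
  st.whiteout = true;
  if (legacy_compat)
    st.head_exists = false;  // only for pre-luminous compatibility
}
```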
- objects come in a different order, meh
- ss is on head, always, not snapdir.
- error messages on head, not snapdir

Signed-off-by: Sage Weil <sage@redhat.com>
If we encounter legacy snaps, add them to the snapmapper at that point;
otherwise, use the clone_snaps field in SnapSet to add all clones to the
snapmapper when we process the head object.

Signed-off-by: Sage Weil <sage@redhat.com>
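
A sketch of the two paths, with simplified types and a stubbed SnapMapper call (names here are illustrative):

```cpp
// Sketch only: legacy clones carry their own snap list, while new-style clones
// are all added when the head's SnapSet (clone_snaps) is processed.
#include <cstdint>
#include <map>
#include <vector>

using snapid_t = uint64_t;

// Stand-in for the SnapMapper update; the real code records clone -> snaps.
static void add_to_snapmapper(snapid_t /*clone*/,
                              const std::vector<snapid_t>& /*snaps*/) {}

void map_clone_snaps(snapid_t clone,
                     const std::vector<snapid_t>& legacy_snaps,
                     const std::map<snapid_t, std::vector<snapid_t>>& clone_snaps,
                     bool processing_head) {
  if (!legacy_snaps.empty()) {
    add_to_snapmapper(clone, legacy_snaps);      // legacy clone: use its own list
  } else if (processing_head) {
    for (const auto& entry : clone_snaps)        // new style: add all clones here
      add_to_snapmapper(entry.first, entry.second);
  }
}
```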
Do not touch the object before the put if we expect to see ENOENT.
We would get it anyway with older code, but with the snapset
changes we are on a whiteout and get ENODATA instead.

And really we shouldn't have been getting ENOENT after a touch
anyway!

Signed-off-by: Sage Weil <sage@redhat.com>
We no longer see dup entries on split.

Signed-off-by: Sage Weil <sage@redhat.com>
First, this is pointless--each test runs in a namespace so they don't
step on each other.  Second, leaving objects in place is an opportunity
for scrub to notice any issues we created.  Third, the cleanup asserts
that delete succeeds but if clones exist pgls will show whiteouts and then
delete will return ENOENT.  We could disable the assert, but why bother
even attempting a sloppy cleanup?

We need to preserve cleanup behavior for a few tests (notably the object
listing ones).

Signed-off-by: Sage Weil <sage@redhat.com>
Instead of deleting a pool, add a .NNN.DELETED suffix to the end.  This
keeps the data around long enough for it to be scrubbed later (in the
case of a teuthology job cleanup).

If you really want to delete the pool, then instead of the usual force
flag option you can pass --yes-i-really-really-mean-it-not-faking.  :)

Signed-off-by: Sage Weil <sage@redhat.com>
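
A hedged sketch of the renaming idea (fake_deleted_name is a hypothetical helper, not the actual mon command handler):

```cpp
// Sketch only: instead of removing the pool, rename it with a timestamped
// ".NNN.DELETED"-style suffix so its data survives long enough to be scrubbed.
#include <ctime>
#include <string>

std::string fake_deleted_name(const std::string& pool) {
  long long stamp = static_cast<long long>(std::time(nullptr));
  return pool + "." + std::to_string(stamp) + ".DELETED";
}
// A real deletion would then require the explicit
// --yes-i-really-really-mean-it-not-faking style override described above.
```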
… pool even if faking

Signed-off-by: Sage Weil <sage@redhat.com>
This breaks the upgrade test from jewel.  We can probably revert it later.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
John Spray and others added 7 commits May 16, 2017 10:51
qa/cephfs: Fix for test_data_scan

Reviewed-by: John Spray <john.spray@redhat.com>
… queue

Create an mClock priority queue, which can in turn be used for two new
implementations of the PG shards operator queue. The first
(mClockOpClassQueue) prioritizes operations based on which class they
belong to (recovery, scrub, snaptrim, client op, osd subop). The
second (mClockClientQueue) also incorporates the client identifier, in
order to promote fairness between clients.

In addition, remove OpQueue's remove_by_filter and all
associated subclass implementations and tests.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
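
A sketch of the difference between the two queues, reduced to the key used for QoS tracking (the types and parameter values below are illustrative, not the dmclock API):

```cpp
// Sketch only: mClockOpClassQueue tracks fairness per operation class, while
// mClockClientQueue keys on client plus class to also be fair across clients.
#include <cstdint>
#include <utility>

enum class OpClass { client_op, osd_subop, snaptrim, scrub, recovery };

using ClientId = uint64_t;

using OpClassKey = OpClass;                       // mClockOpClassQueue-style key
using ClientKey  = std::pair<ClientId, OpClass>;  // mClockClientQueue-style key

// Hypothetical per-class QoS parameters (reservation, weight, limit), just to
// show the shape of what an mClock-style queue consumes; the values are made up.
struct QosParams { double reservation, weight, limit; };

QosParams params_for(OpClass c) {
  switch (c) {
    case OpClass::client_op: return {1.0, 5.0, 0.0};
    case OpClass::osd_subop: return {1.0, 5.0, 0.0};
    case OpClass::recovery:  return {1.0, 1.0, 0.0};
    case OpClass::scrub:     return {0.5, 1.0, 0.0};
    case OpClass::snaptrim:  return {0.5, 1.0, 0.0};
  }
  return {1.0, 1.0, 0.0};
}
```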
a26d29e Modify cmake files to match naming of cmake's upstream gtest and ceph's boost.
e23e63f Reorganize CMakeLists.txt files, so that projects that incorporate dmclock can control whether add_test is called. Now, add_test is called only at the top level. Projects that incorporate dmclock can use add_subdirectory on subdirectories beneath the top level to select exactly what is included.

git-subtree-dir: src/dmclock
git-subtree-split: a26d29ef46ef8fbd5512a1f36637af7d3099c307
cmake set-up, specifically not allowing dmclock to call add_test.
Remove dmclock tests from being dependencies on ceph's "test" target.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
v12.0.3

Conflicts:
	ceph.spec.in
	src/test/librados/c_write_operations.cc
@smithfarm
Author

Test results: build succeeds on Tumbleweed, fails on Leap 42.3 (and presumably on SLE-12-SP3 as well).

Non-expurgated version, as reported upstream at ceph#14997:

Here [1] is the result of building this PR. If you click on the link at [1] and look at the "Build Results" area on the right: under the heading "ceph" are the WITH_TESTS=OFF builds, and the builds under "ceph-test" are WITH_TESTS=ON. And you're right - the build does succeed in the newer openSUSE Tumbleweed, but we also need it to succeed in Leap 42.3, which is exhibiting the https://paste2.org/paJcefcb failure.

When you click on any build result (e.g. "failed") it takes you to a page showing the last few lines of the build log. From there you can click on another link to get the full build log, at the beginning of which it shows which packages/versions are installed in the build environment. Possibly the different result is caused by different versions of cmake in Leap 42.3 and Tumbleweed?

It's interesting that the WITH_TESTS=ON build succeeds in CentOS 7.3 - maybe the failure is reproducible in CentOS 7.2, though?

[1] https://build.opensuse.org/package/show/home:smithfarm:branches:filesystems:ceph:luminous/ceph

ivancich and others added 2 commits May 23, 2017 16:54
WITH_DMCLOCK_TESTS are both set. This is so openSUSE Leap will build
correctly.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
smithfarm and others added 9 commits May 25, 2017 03:17
Fixes: http://tracker.ceph.com/issues/20052
Signed-off-by: Giacomo Comes <comes@naic.edu>
Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 88ac9da)

Conflicts:
	ceph.spec.in (trivial resolution)
SUSE ships this package under the name "rdma-core-devel". See
https://build.opensuse.org/package/view_file/openSUSE:Factory/rdma-core/rdma-core.spec?expand=1

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit ba641b5)
The correct package name in SUSE is python-PrettyTable

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 08bc5e6)

Conflicts:
	ceph.spec.in (no python-CherryPy downstream, yet)
Signed-off-by: Nathan Cutler <ncutler@suse.com>
Introduced in b7215b0

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit c9169a7)
When using WITH_SYSTEM_BOOST, don't set header-only packages in
BOOST_COMPONENTS. On some distros these packages don't exist.

Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com>
(cherry picked from commit 23b0732)
(cherry picked from commit f807387)
boost::context is currently (1.63) unsupported on s390x, and in any case
it makes sense to conditionalize Boost components so they are only
built with the Ceph components that need them (as is already being
done for mgr).

Signed-off-by: Nathan Cutler <ncutler@suse.com>
Signed-off-by: Tim Serong <tserong@suse.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit 7fe6305)
I'm not a fan of "if NOT x - then - else" blocks.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit d97ab76)
Since the Beast frontend uses boost::context which is not supported on
s390x.

Signed-off-by: Nathan Cutler <ncutler@suse.com>
(cherry picked from commit 84e80b7)

smithfarm commented May 25, 2017

Upstream developer added a patch to ceph#14997 addressing the build failure in Leap. In addition, I brought in the entire crop of upstream and downstream patches from this week addressing various aspects of 12.0.3, plus a fix for https://bugzilla.suse.com/show_bug.cgi?id=1040230

Now testing in https://build.opensuse.org/package/show/home:smithfarm:branches:filesystems:ceph:luminous/ceph


smithfarm commented May 25, 2017

Build in OBS is green. Now testing in IBS (including s390x) in https://build.suse.de/package/show/Devel:Storage:5.0:Staging:Testing/ceph

Result: also green

Signed-off-by: Dan Mick <dan.mick@redhat.com>
(cherry picked from commit bd6f2d9)

Conflicts:
	ceph.spec.in (trivial resolution)

smithfarm commented May 25, 2017

Now trying to get some green in teuthology. First failure is in ceph-mgr which tries to do "import cherrypy" - cherry-picked the upstream fix.

@smithfarm smithfarm merged commit 07282a4 into ses5 May 26, 2017
@smithfarm smithfarm deleted the wip-12-0-3-alt branch May 26, 2017 04:36
tserong pushed a commit that referenced this pull request Mar 8, 2023
MotrDeleteOp::delete_obj does not check or open mobj before trying to
delete it, so object deletion fails.

Fix: open the object by calling get_bucket_dir_ent() in
MotrDeleteOp::delete_obj(); it looks up the object and
initializes mobj if found.

Signed-off-by: saurabh jain <saurabh.jain2@seagate.com>
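
A hedged sketch of the fix, with simplified stand-in types (MotrObjSketch is not the real RGW SAL class; only the call order matters):

```cpp
// Sketch only: look the object up (which initializes mobj) before attempting
// the delete, and fail cleanly if the lookup does not find it.
#include <memory>

struct MotrObjSketch {
  std::unique_ptr<int> mobj;            // stand-in for the Motr object handle

  int get_bucket_dir_ent() {            // stand-in lookup: initializes mobj
    mobj = std::make_unique<int>(0);
    return 0;
  }

  int delete_obj() {
    if (!mobj) {
      int rc = get_bucket_dir_ent();    // the added lookup/open step
      if (rc < 0)
        return rc;                      // object not found: do not proceed
    }
    // ... delete using the now-initialized mobj ...
    return 0;
  }
};
```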