osd, test: reworks for manifest dedup test cases #39216

myoungwon · 2021-02-02T06:54:49Z

This PR implements what we planned here (https://docs.ceph.com/en/latest/dev/osd_internals/manifest/).
Its purpose is to rework and add manifest op tests based on the current
snapshot and flush scheme.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

Checklist

References tracker ticket
Updates documentation if necessary
Includes tests for new functionality or reproducer for bug

sebastian-philipp · 2021-02-08T10:52:52Z

jenkins test make check

myoungwon · 2021-02-11T14:04:16Z

@athanatos Based on https://docs.ceph.com/en/latest/dev/osd_internals/manifest/, this PR probably covers the scope of the tests we planned, except for RBD and RGW. Please take a look if you have time to review it, and let me know what I missed.

athanatos · 2021-02-16T00:57:04Z

I think the teuthology radosmodel tests need a portion at the end to run and validate that the references in the target pool are valid. We won't be able to catch refcount mistakes without that.
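As a rough illustration of the end-of-run check suggested here: tally how many chunk_map entries point at each object in the target pool, then compare that tally with the reference count recorded on the target. This is a toy model under a simplifying assumption (each chunk_map entry holds exactly one reference; the OSD's real accounting for clones may differ). `ChunkRef`, `count_expected_refs`, and `find_refcount_mismatches` are hypothetical names, not part of RadosModel or librados.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one chunk_map entry: a source range that
// points at an object in the target (chunk) pool.
struct ChunkRef {
  uint64_t src_offset;
  uint32_t length;
  std::string tgt_oid;
};

// Tally the references each target object should hold, given the
// chunk maps of all source objects (keyed by source oid).
std::map<std::string, uint64_t> count_expected_refs(
    const std::map<std::string, std::vector<ChunkRef>>& chunk_maps) {
  std::map<std::string, uint64_t> expected;
  for (const auto& entry : chunk_maps) {
    for (const ChunkRef& ref : entry.second) {
      assert(ref.length > 0);  // a zero-length chunk would be a model bug
      ++expected[ref.tgt_oid];
    }
  }
  return expected;
}

// Return the target oids whose recorded refcount differs from the tally,
// including targets that still hold references nobody accounts for.
std::vector<std::string> find_refcount_mismatches(
    const std::map<std::string, uint64_t>& expected,
    const std::map<std::string, uint64_t>& recorded) {
  std::vector<std::string> bad;
  for (const auto& entry : expected) {
    auto it = recorded.find(entry.first);
    if (it == recorded.end() || it->second != entry.second) {
      bad.push_back(entry.first);
    }
  }
  for (const auto& entry : recorded) {
    if (entry.second != 0 && expected.find(entry.first) == expected.end()) {
      bad.push_back(entry.first);  // leaked reference
    }
  }
  return bad;
}
```

An empty result from `find_refcount_mismatches` would mean the model and the target pool agree; any oid returned points at either a missing or a leaked reference.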

athanatos · 2021-02-16T00:58:08Z

Have you run the full suite on this yet?

myoungwon · 2021-02-17T02:06:29Z

@athanatos This PR has only been tested on my local machine. If you agree, I'll run it through the rados suite. Also, I added a commit that checks that the refcount is correct at the end of the test.

athanatos · 2021-02-17T03:09:39Z

Go ahead and run at least your added test cases. I'll have a closer look once you've got it passing consistently.

myoungwon · 2021-03-01T09:07:32Z

@athanatos It seems that this PR is ready for review.
https://pulpito.ceph.com/myoungwon-2021-02-28_16:32:35-rados-wip-manifest-dedup-test-distro-basic-smithi/
https://pulpito.ceph.com/myoungwon-2021-03-01_07:19:04-rados-wip-manifest-dedup-test-distro-basic-smithi/

athanatos · 2021-03-01T18:14:24Z

src/test/osd/RadosModel.h

  {}

+  void get_rand_off_len(uint64_t &rand_offset, uint32_t &rand_length, uint32_t max_len) {


return a pair.
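A minimal sketch of what the pair-returning form could look like. The body of the original helper is not shown in the diff, so the `rng` and `object_size` parameters and the sampling logic are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <utility>

// Pick a random (offset, length) inside an object of object_size bytes,
// with length capped at max_len. The result is returned as a pair
// instead of being written through reference parameters.
// NOTE: rng and object_size are hypothetical parameters for this sketch.
std::pair<uint64_t, uint32_t> get_rand_off_len(
    std::mt19937_64& rng, uint64_t object_size, uint32_t max_len) {
  assert(object_size > 0 && max_len > 0);
  std::uniform_int_distribution<uint64_t> off_dist(0, object_size - 1);
  const uint64_t offset = off_dist(rng);
  // Never let the range run past the end of the object.
  const uint64_t room = std::min<uint64_t>(object_size - offset, max_len);
  std::uniform_int_distribution<uint64_t> len_dist(1, room);
  return {offset, static_cast<uint32_t>(len_dist(rng))};
}
```

Callers can then read `.first` and `.second` (or use a structured binding where C++17 is available) rather than declaring two output variables up front.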

athanatos · 2021-03-01T18:15:46Z

src/test/osd/RadosModel.h

+    if (r == 0) {
+      // ok
+    } else if (r == -EINVAL) {
+      // probably this is not manifest object 


Don't we have enough information to validate this return value?

I added an explanation.

athanatos · 2021-03-01T18:16:00Z

src/test/osd/RadosModel.h

+    } else if (r == -EINVAL) {
+      // probably this is not manifest object 
+    } else if (r == -ENOENT) {
+      // may have raced with a remove?


Same here, don't we have state that should exclude racing removes?

I added a specific condition to check ENOENT.
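The point raised in these two comments can be pictured as follows: if the test model already tracks whether the object exists and whether it is a manifest object, the op's return value can be checked against one predicted value instead of accepting several. This is a sketch under assumptions; `ModelState`, `expected_result`, and `result_is_valid` are hypothetical names, and the mapping to -ENOENT and -EINVAL is taken from the code comments in the diff above.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical snapshot of what the test model knows about an object
// at the time the op was issued.
struct ModelState {
  bool exists;       // object is present in the model
  bool is_manifest;  // model has recorded a manifest/chunk op on it
};

// Return value the model predicts for an op that requires a manifest
// object. Assumed mapping: missing object -> -ENOENT, non-manifest
// object -> -EINVAL, otherwise success.
int expected_result(const ModelState& s) {
  if (!s.exists) return -ENOENT;
  if (!s.is_manifest) return -EINVAL;
  return 0;
}

// True only when the observed result matches the prediction exactly.
bool result_is_valid(const ModelState& s, int r) {
  assert(r <= 0);  // ops in this sketch return 0 or a negative errno
  return r == expected_result(s);
}
```

With a check of this shape, an unexpected -ENOENT on an object the model believes to exist fails the test instead of being written off as a possible race.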

athanatos · 2021-03-01T18:24:50Z

src/test/osd/RadosModel.h

+    } else if (r == -EBUSY) {
+      // could fail if snap is not oldest
+    } else if (r == -ENOENT) {
+      // could fail if obj is removed


I added a specific condition to check ENOENT.

athanatos · 2021-03-01T18:27:06Z

src/test/osd/RadosModel.h

+	if (snap == -1) {
+	  ChunkDesc info {tgt_offset, length, oid_tgt};
+	  context->update_object_chunk_target(oid, offset, info);
+	  context->update_object_version(oid, comp->get_version64());


Is this actually only valid for HEAD? Wouldn't we also need to update chunk_target for a clone if we call SET_CHUNK on a clone?

Related: how does chunk_target relate to FLUSH? We don't appear to populate it in that case.

ChunkDesc is only used by SetChunkReadOp to check whether chunk entries in chunk_map can be read correctly. Also, SetChunkReadOp is only used when the object is head (/qa/suites/rados/thrash/workloads/set-chunks-read.yaml).
Perhaps deleting SetChunkReadOp is the right way to address your comment?

athanatos · 2021-03-01T18:35:42Z

src/osd/PrimaryLogPG.cc

-	  rollback_to);
+      if (!rollback_to->obs.oi.has_manifest()) {
+	// rollback_to is not manifest object
+	tier_mode_result = cache_result_t::NOOP;


It might be cleaner to move this check into maybe_handle_manifest_detail. The only other caller (do_op via maybe_handle_manifest) does the same check.

You didn't update the call site in do_op.
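The refactoring being requested can be sketched like this: the manifest check lives at the top of the helper, so both callers can call it unconditionally. The types below are stand-ins invented for the example (the real ones live in src/osd/PrimaryLogPG.h and carry far more state), and treating a null object context as NOOP is an assumption of this sketch.

```cpp
#include <cassert>

// Stand-in for the OSD's result enum; only two values are modeled.
enum class cache_result_t { NOOP, HANDLED_PROXY };

// Stand-in for the object context; only the manifest flag is modeled.
struct ObjectContext {
  bool has_manifest;
};

// The "is this a manifest object?" check sits inside the helper, so
// neither the do_op path nor the rollback path has to repeat it.
cache_result_t maybe_handle_manifest_detail(const ObjectContext* obc) {
  if (obc == nullptr || !obc->has_manifest) {
    return cache_result_t::NOOP;  // nothing manifest-specific to do
  }
  // ... manifest-specific handling would go here ...
  return cache_result_t::HANDLED_PROXY;
}
```

Once the check is centralized, any caller that still performs its own `has_manifest()` test before calling the helper is doing redundant work and can drop it.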

myoungwon · 2021-04-01T16:04:47Z

@athanatos The rebase is done. I also added the last seven commits to fix issues found during QA and to clean up the code. Here are the test results. Can you take a look?

https://pulpito.ceph.com/myoungwon-2021-04-01_11:20:37-rados-wip-manifest-dedup-test-distro-basic-smithi/
https://pulpito.ceph.com/myoungwon-2021-04-01_14:00:55-rados-wip-manifest-dedup-test-distro-basic-smithi/

myoungwon · 2021-04-08T02:17:22Z

@athanatos Ping. Please take a look when you are available.

athanatos · 2021-04-08T07:07:30Z

The commit 'osd: move handling ref. counting to finish_ctx()' doesn't actually add anything to finish_ctx.

myoungwon · 2021-04-08T07:48:56Z

finish_ctx() already contains update_chunk_map_by_dirty() and dec_refcount_by_dirty().
The purpose of the commit is to move the ref count calculation there by removing the ctx->new_obs.oi.size != 0 check.

athanatos · 2021-04-08T08:28:56Z

In that case, change the commit message to "osd: remove unnecessary ref handling in _delete_oid" and fix the commit message body.

Consider the following case when handling a delete op:

1. Delete --> whiteouted
2. Make clone

In this case, the current code clears chunk_map and calls dec_all_manifest_refcount() in _delete_oid() even if the clone still holds the references. To fix this, this commit removes the unnecessary ref handling in _delete_oid() and lets finish_ctx() handle ref counting, since it is aware of whether a clone was created. It also removes the oi.size == 0 condition in finish_ctx() so that ref counting is handled on a delete op with a whiteouted clone.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
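The idea in this commit message can be modeled in a few lines: when head is deleted, only the references that a surviving clone does not share should be released. The sketch below is a simplified illustration, not the OSD code; `ChunkMap` and `targets_to_release` are hypothetical, and the chunk map is reduced to offset -> target oid.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified chunk map: source offset -> target object id.
using ChunkMap = std::map<uint64_t, std::string>;

// Decide which target references may be dropped when head is deleted.
// clone is nullptr when no clone exists; otherwise any entry the clone
// shares with head (same offset, same target) must be kept.
std::vector<std::string> targets_to_release(const ChunkMap& head,
                                            const ChunkMap* clone) {
  std::vector<std::string> out;
  for (const auto& entry : head) {
    if (clone != nullptr) {
      auto it = clone->find(entry.first);
      if (it != clone->end() && it->second == entry.second) {
        continue;  // the clone still refers to this chunk
      }
    }
    out.push_back(entry.second);
  }
  return out;
}
```

Releasing every entry unconditionally, as the delete path is described as doing before this commit, corresponds to always passing `nullptr` here, which is what leaves a clone pointing at chunks whose references are gone.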

myoungwon · 2021-04-08T08:39:15Z

Fixed.

github-actions bot added core tests labels Feb 2, 2021

myoungwon added tests and removed tests labels Feb 2, 2021

myoungwon force-pushed the wip-manifest-dedup-test branch 3 times, most recently from 06ab533 to 2459366 Compare February 8, 2021 03:02

myoungwon force-pushed the wip-manifest-dedup-test branch from d93a565 to 68a5a08 Compare February 10, 2021 03:34

myoungwon changed the title ~~WIP: test: reworks for manifest dedup test cases~~ test: reworks for manifest dedup test cases Feb 10, 2021

myoungwon force-pushed the wip-manifest-dedup-test branch from 68a5a08 to 40f3acb Compare February 10, 2021 03:51

github-actions bot added the build/ops label Feb 16, 2021

myoungwon force-pushed the wip-manifest-dedup-test branch from d56a6f8 to 9e27fbd Compare February 16, 2021 10:15

myoungwon force-pushed the wip-manifest-dedup-test branch from 26838ec to 14a25d0 Compare February 21, 2021 06:15

myoungwon force-pushed the wip-manifest-dedup-test branch from a7da94b to 78de94b Compare March 1, 2021 08:22

athanatos reviewed Mar 1, 2021

myoungwon added 8 commits March 29, 2021 17:22

osd: move check condition into maybe_handle_manifest_detail()

d9d0d31

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

osd: add manifest info when duplicating head upon manifest object

a2a0251

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

src/test: check flush hasn't been called regarding EBUSY

96f2ab1

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

osd: _do_rollback_to() refactor

b7d4ccf

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

osd: inc_refcount_by_set() refactor

0005a82

The current inc_refcount_by_set() only supports the case where a single entry is updated via SET_CHUNK. This commit makes inc_refcount_by_set() handle multiple entries.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
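To illustrate the multi-entry behaviour described in this commit message, here is a toy version that compares an old and a new chunk map and takes one reference per new or changed entry. It is a sketch, not the OSD implementation; `ChunkMap`, `RefTable`, and `inc_refcount_for_new_entries` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical simplified chunk map: source offset -> target object id.
using ChunkMap = std::map<uint64_t, std::string>;
// Hypothetical refcount table for the target pool.
using RefTable = std::map<std::string, uint64_t>;

// Take one reference for every entry of new_map that is absent from
// old_map or points at a different target, rather than for a single
// entry only. Returns the number of references taken.
uint64_t inc_refcount_for_new_entries(const ChunkMap& old_map,
                                      const ChunkMap& new_map,
                                      RefTable& refs) {
  uint64_t taken = 0;
  for (const auto& entry : new_map) {
    auto it = old_map.find(entry.first);
    if (it != old_map.end() && it->second == entry.second) {
      continue;  // unchanged entry already holds its reference
    }
    ++refs[entry.second];
    ++taken;
  }
  return taken;
}
```

A rollback that swaps in a whole chunk_map at once is the kind of caller that needs this form, since several entries can change in one operation.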

osd: fix reference mismatch on rollback

962fe03

Upon rollback, we should update the new chunk_map in head. To do so, all entries in the chunk_map need to be updated via inc_refcount_by_set().

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

src/test: add ManifestRollbackRefcount test

803e58e

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

src/test: generate useful log regarding ENOENT

6caa92d

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

myoungwon force-pushed the wip-manifest-dedup-test branch from 01d60c2 to 6caa92d Compare March 29, 2021 08:24

github-actions bot removed the needs-rebase label Mar 29, 2021

myoungwon changed the title ~~test: reworks for manifest dedup test cases~~ osd, test: reworks for manifest dedup test cases Mar 29, 2021

src/test: fix not updating the object state in the error case

f9ef3cb

This commit prevents the test from updating to a wrong state when TierFlush receives an error value.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

myoungwon force-pushed the wip-manifest-dedup-test branch 2 times, most recently from 605956a to 9a8c72f Compare April 1, 2021 03:16

myoungwon added 4 commits April 1, 2021 18:43

src/test: reset flushed to false when updating object

d5137e7

An object updated by the ops should be marked as unflushed.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

src/test: remove unnecessary log

ba3cd1e

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

osd: clear manifest info to fix reference mismatch

cd7a295

If the head is deleted in rollback(), the manifest info also needs to be cleared.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

qa: fix typo to call rollback op

b5f5649

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

myoungwon force-pushed the wip-manifest-dedup-test branch from 9a8c72f to 7823776 Compare April 1, 2021 09:46

myoungwon added 2 commits April 8, 2021 17:35

osd: do not assert() in the case of no obc

efd89f9

Upon rollback, we should handle the ENOENT case, so the right thing to do here is to return NOOP.

Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>

myoungwon force-pushed the wip-manifest-dedup-test branch from 7823776 to efd89f9 Compare April 8, 2021 08:38

athanatos self-requested a review April 9, 2021 19:41

athanatos merged commit 055ebe3 into ceph:master Apr 9, 2021
