osd, test: reworks for manifest dedup test cases #39216
06ab533
to
2459366
Compare
jenkins test make check
d93a565
to
68a5a08
Compare
68a5a08
to
40f3acb
Compare
@athanatos Based on https://docs.ceph.com/en/latest/dev/osd_internals/manifest/, this PR probably covers the scope of the tests we have planned, except for RBD and RGW. Please take a look if you have time to review it, and let me know what I missed.
I think the teuthology radosmodel tests need a portion at the end to run and validate that the references in the target pool are valid. We won't be able to catch refcount mistakes without that.
Have you run the full suite on this yet?
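For illustration only, the end-of-run validation requested above could look roughly like the following C++17 sketch. It is not the RadosModel code: `ChunkRef`, `ChunkMaps`, `RefCounts`, and both functions are hypothetical names, and the model counts every chunk_map entry as one reference, deliberately ignoring any sharing of references between adjacent clones that the OSD may apply.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of one chunk_map entry of a source object.
struct ChunkRef {
  uint64_t offset;
  uint32_t length;
  std::string target_oid;
};

// Source object id (head or clone) -> its chunk_map entries.
using ChunkMaps = std::map<std::string, std::vector<ChunkRef>>;
// Target (chunk pool) object id -> reference count.
using RefCounts = std::map<std::string, uint64_t>;

// Count how many chunk_map entries point at each target object.
// Assumption: every entry contributes exactly one reference.
RefCounts expected_refcounts(const ChunkMaps &sources) {
  RefCounts expected;
  for (const auto &[oid, entries] : sources) {
    for (const auto &entry : entries) {
      ++expected[entry.target_oid];
    }
  }
  return expected;
}

// Return the target oids whose actual refcount differs from the expected
// one, including targets that hold references nobody accounts for.
std::vector<std::string> find_refcount_mismatches(const ChunkMaps &sources,
                                                  const RefCounts &actual) {
  const RefCounts expected = expected_refcounts(sources);
  std::vector<std::string> bad;
  for (const auto &[oid, count] : expected) {
    auto it = actual.find(oid);
    if (it == actual.end() || it->second != count) {
      bad.push_back(oid);
    }
  }
  for (const auto &[oid, count] : actual) {
    if (count != 0 && expected.find(oid) == expected.end()) {
      bad.push_back(oid);  // leaked reference on the target
    }
  }
  return bad;
}
```

A test would call `find_refcount_mismatches()` once after all ops have drained and fail if the returned list is non-empty.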
d56a6f8
to
9e27fbd
Compare
@athanatos This PR has only been tested on my local machine. If you agree, I'll run it through the rados suite. Also, I added a commit that checks that the refcount is correct at the end of the test.
Go ahead and run at least your added test cases. I'll have a closer look once you've got it passing consistently.
26838ec
to
14a25d0
Compare
a7da94b
to
78de94b
Compare
src/test/osd/RadosModel.h
Outdated
{}

void get_rand_off_len(uint64_t &rand_offset, uint32_t &rand_length, uint32_t max_len) {
return a pair.
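A minimal C++17 sketch of what returning a pair could look like. This is not the code from the PR: the `rng` and `object_size` parameters are assumptions added so the example is self-contained, and the way the random values are drawn is illustrative only.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <utility>

// Hypothetical stand-in for the helper under review. Picks a random
// (offset, length) inside an object of `object_size` bytes, with the
// length capped at `max_len`, and returns both values as a pair instead
// of writing through reference out-parameters.
inline std::pair<uint64_t, uint32_t>
get_rand_off_len(std::mt19937_64 &rng, uint64_t object_size, uint32_t max_len) {
  if (object_size == 0 || max_len == 0) {
    return {0, 0};  // nothing to pick from
  }
  std::uniform_int_distribution<uint64_t> off_dist(0, object_size - 1);
  const uint64_t offset = off_dist(rng);
  const uint64_t remaining = object_size - offset;  // always >= 1
  const uint64_t cap = std::min<uint64_t>(remaining, max_len);
  std::uniform_int_distribution<uint64_t> len_dist(1, cap);
  // cap <= max_len, so the narrowing cast cannot overflow.
  return {offset, static_cast<uint32_t>(len_dist(rng))};
}
```

A caller can then unpack the result with structured bindings, e.g. `auto [offset, length] = get_rand_off_len(rng, size, max_len);`.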
src/test/osd/RadosModel.h
Outdated
if (r == 0) {
// ok
} else if (r == -EINVAL) {
// probably this is not manifest object
Don't we have enough information to validate this return value?
I added an explanation.
src/test/osd/RadosModel.h
Outdated
} else if (r == -EINVAL) {
// probably this is not manifest object
} else if (r == -ENOENT) {
// may have raced with a remove?
Same here, don't we have state that should exclude racing removes?
I added a specific condition to check ENOENT.
src/test/osd/RadosModel.h
Outdated
} else if (r == -EBUSY) {
// could fail if snap is not oldest
} else if (r == -ENOENT) {
// could fail if obj is removed
Or this?
I added a specific condition to check ENOENT.
if (snap == -1) {
ChunkDesc info {tgt_offset, length, oid_tgt};
context->update_object_chunk_target(oid, offset, info);
context->update_object_version(oid, comp->get_version64());
Is this actually only valid for HEAD? Wouldn't we also need to update chunk_target for a clone if we call SET_CHUNK on a clone?
Related: how does chunk_target relate to FLUSH? We don't appear to populate it in that case.
ChunkDesc exists only so that SetChunkReadOp can check whether the chunk entries in chunk_map can be read correctly. Also, SetChunkReadOp is only used when the object is head (/qa/suites/rados/thrash/workloads/set-chunks-read.yaml).
Perhaps deleting SetChunkReadOp is the right way to address your comment?
src/osd/PrimaryLogPG.cc
Outdated
rollback_to);
if (!rollback_to->obs.oi.has_manifest()) {
// rollback_to is not manifest object
tier_mode_result = cache_result_t::NOOP;
It might be cleaner to move this check into maybe_handle_manifest_detail. The only other caller (do_op via maybe_handle_manifest) does the same check.
Done.
You didn't update the call site in do_op.
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
The current inc_refcount_by_set only supports the case where a single entry is updated via SET_CHUNK. This commit makes the existing inc_refcount_by_set handle multiple entries. Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
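To illustrate the multi-entry case described in this commit message, here is a rough C++17 sketch that computes which targets need a reference increment when several chunk_map entries change at once. It is a simplified model, not the OSD implementation: `ChunkMap`, `RefDelta`, and `refs_to_inc_on_set` are hypothetical names, and a chunk_map is reduced to an offset-to-target mapping.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical simplified chunk_map: chunk offset -> target object id.
using ChunkMap = std::map<uint64_t, std::string>;
// Target object id -> number of references to add.
using RefDelta = std::map<std::string, int>;

// Walk every entry of the new chunk_map, not just one, and request an
// increment for each entry that is new or now points at another target.
RefDelta refs_to_inc_on_set(const ChunkMap &before, const ChunkMap &after) {
  RefDelta delta;
  for (const auto &[offset, target] : after) {
    auto it = before.find(offset);
    if (it == before.end() || it->second != target) {
      ++delta[target];
    }
  }
  return delta;
}
```

Entries that are unchanged between `before` and `after` produce no increment, so calling this with identical maps yields an empty delta.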
Upon rollback, we should update the new chunk_map in head. To do so, all entries in the chunk_map need to be updated via inc_refcount_by_set(). Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
01d60c2
to
6caa92d
Compare
This commit prevents updating the wrong state, which happens when TierFlush receives error values. Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
605956a
to
9a8c72f
Compare
The object updated by the Ops should be set unflushed. Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
If the head is deleted in rollback(), the manifest info also needs to be cleared. Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
9a8c72f
to
7823776
Compare
@athanatos Rebase is done. Also, I added the last seven commits to fix issues during QA and cleanup code. Here is the test result. Can you take a look? https://pulpito.ceph.com/myoungwon-2021-04-01_11:20:37-rados-wip-manifest-dedup-test-distro-basic-smithi/
@athanatos Ping. Please take a look when you are available.
The commit 'osd: move handling ref. counting to finish_ctx()' doesn't actually add anything to finish_ctx.
update_chunk_map_by_dirty() and dec_refcount_by_dirty() are already in finish_ctx().
In that case, change the commit message to "osd: remove unnecessary ref handling in _delete_oid" and fix the commit message body.
Consider the following case when handling a delete op: 1. Delete --> whiteout 2. Make clone. In this case, the current code clears chunk_map and calls dec_all_manifest_refcount() in _delete_oid() even if the clone still holds the references. To fix this, this commit removes the unnecessary ref handling in _delete_oid() and lets finish_ctx() handle it, since finish_ctx() is aware of whether the clone has been created. Also, remove the oi.size == 0 condition in finish_ctx() so that ref counting is handled upon a delete op with a whiteouted clone. Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
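The delete-then-clone case in this commit message can be modelled with a short C++17 sketch. It only shows why the decision has to be made once the clone is known; it is not the finish_ctx() logic, and `ChunkMap`, `RefDelta`, and `refs_to_dec_on_delete` are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical simplified chunk_map: chunk offset -> target object id.
using ChunkMap = std::map<uint64_t, std::string>;
// Target object id -> number of references to drop.
using RefDelta = std::map<std::string, int>;

// Decide which target references to drop when the head is deleted.
// `new_clone` holds the chunk_map of a clone created by the same op, if
// any. Assumption: an entry still held by that clone at the same offset
// keeps its reference, so it must not be decremented.
RefDelta refs_to_dec_on_delete(const ChunkMap &head,
                               const std::optional<ChunkMap> &new_clone) {
  RefDelta delta;
  for (const auto &[offset, target] : head) {
    bool held_by_clone = false;
    if (new_clone) {
      auto it = new_clone->find(offset);
      held_by_clone = (it != new_clone->end() && it->second == target);
    }
    if (!held_by_clone) {
      ++delta[target];
    }
  }
  return delta;
}
```

Dropping every reference unconditionally at delete time corresponds to always passing `std::nullopt` here, which is the behaviour the commit message describes as wrong when a clone exists.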
Upon rollback, we should handle the ENOENT case by returning NOOP. Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>
7823776
to
efd89f9
Compare
Fixed.
This PR is what we planned here (https://docs.ceph.com/en/latest/dev/osd_internals/manifest/).
The purpose of this PR is to rework and add manifest op tests based on the current
snapshot and flush scheme.
Signed-off-by: Myoungwon Oh <myoungwon.oh@samsung.com>