-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
recover metablk in subtype order #200
Conversation
4bcab54
to
b963af5
Compare
after this PR is merge , we can change the recovery logic in this patch to a simple way |
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #200 +/- ##
==========================================
- Coverage 68.98% 68.91% -0.08%
==========================================
Files 93 93
Lines 7581 7611 +30
Branches 969 979 +10
==========================================
+ Hits 5230 5245 +15
- Misses 1901 1907 +6
- Partials 450 459 +9
☔ View full report in Codecov by Sentry. |
I think it is better to clarify this PR again. 1 what is the current metaservice recovery process? the current recovery process of metaservice is recovering all the metablks according to BlkID, and know nothing about the upper layer logic. so from the perspective of upper layer , no recovery order guaranteeed. for the upper layer user, they should adapt the out-of-order of metaservice recovery and implement the upper layer recovery logic carefully by themselves. that is it. 2 what does this PR do? this PR changes the default recovery order of metaservice from out-of-order to the order of the presence of subsystem in the metablk chain. the presence of subsystem means the presence of the first metablk of the subsystem in the metablk chain. for example , suppose we have 3 subsystems A, B and C , and in each subsystem we have 3 metablk( A1, A2 ,A3 ),(B1,B2,B3),(C1,C2,C3). the matablk chain is as follows: since A1 comes before C1 and C1 comes before B1, so subsystem A comes before C and C come before B. after the PR, the recovery order will be this gives the upper layer a chance to untilize this recovery order to simplify the upper layer recovery logic. well , the upper lay can also ignore this change and do like before. note that , here, the upper layer can not control the order of metaservice recovery, and we also do not want to give the chance to upper layer to control until we really need. so , in fact , what this PR does is to give a default natural order of metaservice recovery, and no extra performance issue will be involved. 3 is it necessary for the upper layer to control the recover order of metaservice in fine granularity i don`t think it is necessary for now at least. the reason is : so my suggestion is we can delay to implement fine-granularity control of the metaservice recovery order until we have a clear scenario . for now, we can just use the newly added default recovery order to simplify HO Implementation |
ca7b8ff
to
be40b71
Compare
another example of ordering requirement |
4434840
to
3897ef7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm in implementation
A few improvement area in UT but the existing UT covered the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be better to use reference & for MetaBlkService::topological_sort parameter
aaa5d63
a0ea4bb
to
c238031
Compare
add dependencied for metaservice Signed-off-by: Jie Yao <jyao3@ebay.com>
c238031
to
9c7d1c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
src/lib/meta/meta_blk_service.cpp
Outdated
meta_subtype_vec_t independent_subtypes; | ||
meta_subtype_vec_t ordered_subtypes; | ||
|
||
if (hs_utils::topological_sort(m_dep_topo_graph, ordered_subtypes, &independent_subtypes)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this sort out of MetaBlkService::recover and put it into either MetaBlkService::start (before ::recover) or into a new function (e.g. validate_register_dep(...)) to indicate this is a validation of register dependency, since this is not part of recovery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now, we use hs_utils::topological_sort
to both check circle and get the ordered_subtypes. if we call this in another function out of recovery, we need to pass the ordered_subtypes to MetaBlkService::recover
, which need to change the signature of MetaBlkService::recover
, and thus we need to do a lot of changes in the whole metaservice UT. can we do it in a seperate PR?
if we do not want to change the signature of MetaBlkService::recover
, we can add ordered_subtypes as a member in MetaBlkService so that MetaBlkService::recover
can read it, but it is generate from the m_dep_topo_graph
and only be used once, i do not think it is a good fashion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the sort in recovery
or in validate_register_dep
doesnt have any behavior change, more for the code style. It looks to me also seems wired that validate_register_dep
returns a order and passed to MetaBlkService::recover
Maybe do the topo-sort twice is a proper solution for the sake of code-style...
Honestly I dont think we need to spend more cycle on this style/preference and blocking the PR (other comments needed to be addressed), as there is significant communication overhead to let other believe this is better compare to having another PR to fix to your prefer one and show the beauty of another implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why we want to put more stuff in a function that is apparent not part of its job, and also would cause some confusion to the reader. The validation simply is NOT part of recovery, it is part of registration, if it fails, it shouldn't start recover process. Another benefit is if some use case want to call ::recover multiple times (either to measure the speed of recovery or test the dep/non-dep order correctness), it shouldn't do the sort multiple times, but just do recover with its clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically it is , the point is the cycle-detection is actually a topo-sort, so we either
- do the topo-sort twice , one for validation at the end of registration(validate_register_dep) , and one for recover.
- let recovery() accept
ordered_subtypes
andindependent_subtypes
, which returned fromvalidate_register_dep
and passed in here. - Put the validation in recovery()
I think 2) will makes the flow more complicate and other caller of recovery
will be more confusing. I am fine with 1 or 3. Though honestly dont see too much benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need for independent_subtypes, where we can get from m_sub_info.
This is not a major comment. Feel free to merge as it as. The rest looks good to me.
But please expect some refactoring as to this part in next PR.
2802786
to
3bb9d3c
Compare
In HO , when metaservice restart, we actually want to recover repl-dev before PGinfo and recover PGinfo before Shardinfo.
but metaservice will recover all the metablk only according to blkid and we have to use some pending method to solve this problem.
https://github.com/eBay/HomeObject/blob/dbde1bb6b2eb705fc55f7539be08d9c9fdbbd508/src/lib/homestore_backend/hs_pg_manager.cpp#L126
https://github.com/eBay/HomeObject/blob/dbde1bb6b2eb705fc55f7539be08d9c9fdbbd508/src/lib/homestore_backend/hs_homeobject.cpp#L105
this PR do some change in metaservice recovery logic to make sure all the metablk will be recovered by subtype, and the subtype order is that appear order in the metablk chain.