New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd-mirror A/A: track images in policy map #15788
Conversation
items that need to be addressed:
|
src/cls/rbd/cls_rbd.cc
Outdated
@@ -3097,6 +3097,7 @@ static const std::string IMAGE_KEY_PREFIX("image_"); | |||
static const std::string GLOBAL_KEY_PREFIX("global_"); | |||
static const std::string STATUS_GLOBAL_KEY_PREFIX("status_global_"); | |||
static const std::string INSTANCE_KEY_PREFIX("instance_"); | |||
static const std::string MIRROR_IMAGE_MAP_KEY_PREFIX("map_image_"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: map_image_
-> image_map_
to match constant's naming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that result in these entries being picked by when IMAGE_KEY_PREFIX
is used as the key? I had faced this earlier...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are they used in the same object? I thought one was in rbd_mirroring and the other was in rbd_leader? Could also name it "instance_image_"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the sense, I used instance_map
earlier and that got these keys to be listed out where INSTANCE_KEY_PREFIX
was being used (leader object). So I stuck to keeping it distinct. image_map
would be safe to use here.
src/cls/rbd/cls_rbd.cc
Outdated
@@ -3654,6 +3659,84 @@ int instances_remove(cls_method_context_t hctx, const string &instance_id) { | |||
return 0; | |||
} | |||
|
|||
int mirror_image_map_list( | |||
cls_method_context_t hctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: hanging indent should be 4 spaces to differentiate the params from the function body
src/cls/rbd/cls_rbd.cc
Outdated
std::string &start_after, uint64_t max_return, | ||
std::map<std::string, cls::rbd::MirrorImageMap> *image_map) { | ||
|
||
int max_read = RBD_MAX_KEYS_READ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move into the while loop and properly initialize max_read
to the min of RBD_MAX_KEYS_READ
and the number of remaining items from max_return
src/cls/rbd/cls_rbd.cc
Outdated
@@ -3654,6 +3659,84 @@ int instances_remove(cls_method_context_t hctx, const string &instance_id) { | |||
return 0; | |||
} | |||
|
|||
int mirror_image_map_list( | |||
cls_method_context_t hctx, | |||
std::string &start_after, uint64_t max_return, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: const std::string &start_after
src/cls/rbd/cls_rbd_client.h
Outdated
librados::ObjectWriteOperation *op, | ||
const std::string &global_id, cls::rbd::MirrorImageMap &image_map); | ||
void mirror_image_map_remove( | ||
librados::ObjectWriteOperation *op, std::string &global_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: const std::string&
|
||
struct MappedImage { | ||
std::string id; | ||
std::string global_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: global_image_id
}; | ||
|
||
struct MappedImage { | ||
std::string id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this id used for? It's not obvious w/o tracing back through the code. As I mentioned in the (obsolete) ImageSpec
class, I think you really need to track a set of peer UUIDs (black for local pool, populated UUID for remote peer pools). Once you drop to zero UUIDs, you know you can unmap the image from an instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The id
here is not for tracking. This structure is used by PoolReplayer::handle_update()
and id
gets assigned to whatever we get in image_id in PoolWatcher
(although as you mentioned that passing id
to start image replayers is not really required as we already pass the global_id
). This can be taken away if required.
} | ||
}; | ||
|
||
struct RemappedImage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just track this all within the same data structure above? It would have a state of associating, unassociating, associated, and rebalancing. Probably should also track "next instance id" for the case of rebalancing so that your policy can re-execute while a rebalance is in-progress and make decisions based on the expected state, not current state. Also should add a "last mapped" timestamp to ensure you don't ping-pong images back-and-force (i.e. a given image that is already mapped cannot be re-mapped for X minutes -- config option).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These structures are used to pass a bunch of images to PoolReplayer::handle_update()
. Tracking next instance id (plus other things) should mostly be a part of some other "long lived" data structure such as the map itself (or a part of the map which tracks peer uuids).
Regarding the RemappedImage
structure, yeh, we have just stick to having just one structure for representing mapped, removed and remapped images.
std::ostream &, const MapSpec &mapspec); | ||
|
||
// in-memory map of tracked images | ||
struct InImage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you would only need this data structure on a transient basis when loading the map from disk and then the policy's in-memory map would be initialized from it. Of course, the global image id -> instance id map is so simple it could just be represented by a std::map
src/cls/rbd/cls_rbd_types.h
Outdated
@@ -364,6 +364,34 @@ struct TrashImageSpec { | |||
}; | |||
WRITE_CLASS_ENCODER(TrashImageSpec); | |||
|
|||
enum MirrorImageMapState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's drop this for now -- right now, the existence of the struct indicates the image is associated (or dissociating or associating).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've taken out the state tracking part and just maintain instance_id
on disk.
73599ae
to
2f7f293
Compare
updated pr with a minor chage in tracking async operations in policy. |
138e1c6
to
da35ebd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cls_rbd changes also need updates to ceph_test_cls_rbd
src/cls/rbd/cls_rbd.cc
Outdated
|
||
max_read = MIN(RBD_MAX_KEYS_READ, max_return - image_map->size()); | ||
if (!vals.empty()) { | ||
last_read = mirror_image_map_key(image_map->rbegin()->first); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key should already be good-to-go -- no need to prefix an additional prefix onto it.
src/cls/rbd/cls_rbd.cc
Outdated
const std::string& global_image_id = | ||
it->first.substr(MIRROR_IMAGE_MAP_KEY_PREFIX.size()); | ||
|
||
cls::rbd::MirrorImageMap imagemap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: image_map
src/cls/rbd/cls_rbd.cc
Outdated
int max_read = MIN(RBD_MAX_KEYS_READ, max_return); | ||
std::string last_read = mirror_image_map_key(start_after); | ||
|
||
while (max_read) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: max_read > 0
src/cls/rbd/cls_rbd.cc
Outdated
int mirror_image_map_list(cls_method_context_t hctx, | ||
const std::string &start_after, | ||
uint64_t max_return, | ||
std::map<std::string, cls::rbd::MirrorImageMap> *image_map) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: image_maps
or image_mapping
@vshankar ignore -- apparently I was looking at the wrong PR |
bf0c12e
to
37eacbf
Compare
src/cls/rbd/cls_rbd.cc
Outdated
const std::string &start_after, | ||
uint64_t max_return, | ||
std::map<std::string, cls::rbd::MirrorImageMap> *image_map) { | ||
int max_read = MIN(RBD_MAX_KEYS_READ, max_return); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move this inside the while
loop and pick the min of max keys read or max_return - image_map->size()
src/cls/rbd/cls_rbd.cc
Outdated
|
||
/** | ||
* Input: | ||
* @param std::map<std::string, cls::rbd::MirrorImageMap>: image map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: for the set and remove methods, you should just take a global id and an image map struct. You can combine multiple class method calls within the same librados operation. This will keep it consistent w/ all other RBD class methods.
@@ -0,0 +1,62 @@ | |||
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still would like to see the RemoveRequest
folded into UpdateRequest
where you give it a collection of global image id that need updated instance ids / data + a collection of global image ids to remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It kind of felt a bit odd to accept a map of images -> data
to be updated and a set of image ids to remove, so I made them into separate state machines (earlier I did have them clubbed up as they were accepting a map of image -> instance ids
with empty instance_id hinting removal).
* @endverbatim | ||
*/ | ||
LoadRequest(librados::IoCtx &ioctx, | ||
std::map<std::string, cls::rbd::MirrorImageMap> *image_map, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: see previous nit about swapping these to be something like image_maps
or image_mapping
|
||
if (!image_map.empty()) { | ||
m_start_after = image_map.rbegin()->first; | ||
image_map_list(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: wrap w/ if (image_map.size() == MAX_RETURN)
to avoid the extra empty call
template <typename I> | ||
void Policy<I>::update_images_removed(const std::string &mirror_uuid, | ||
const std::set<std::string> &global_image_ids) { | ||
dout(20) << dendl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only once the global image id has been removed from all peers (local and remote) should the remove remap logic execute
} | ||
|
||
void finish(int r) override { | ||
policy->finish_async_op(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: cannot finish the async op until after last dereference of policy
enum ShuffleType { | ||
SHUFFLE_TYPE_INSTANCES_ADDED = 0, // rebalance due to addition of instances | ||
SHUFFLE_TYPE_INSTANCES_REMOVED, // rebalance due to removal of instances | ||
SHUFFLE_TYPE_IMAGES_ADDED_REMOVED, // rebalance due to adding/removing images |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: case doesn't appear to be used (of course, not sure the ShuffleType
would need to live either)
} | ||
|
||
uint64_t images_per_instance = nr_images / nr_instances; | ||
if (nr_images % nr_instances) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: not a boolean result so append w/ != 0
|
||
bool instance_present(const std::vector<std::string> &instance_ids, | ||
const std::string &instance) { | ||
return !(std::find(instance_ids.begin(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: swap ==
for !=
and drop the negation
ccdbf46
to
741e3ae
Compare
minor update: added encoder test for |
a5a6231
to
5d92b41
Compare
src/common/options.cc
Outdated
|
||
Option("rbd_mirror_image_policy_update_check_interval", Option::TYPE_INT, Option::LEVEL_ADVANCED) | ||
.set_default(1) | ||
.set_description("interval (in seconds) to check images for mirror daemon peer updations"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "updations" -> "updates"
src/common/options.cc
Outdated
@@ -5276,6 +5276,14 @@ static std::vector<Option> get_rbd_mirror_options() { | |||
Option("rbd_mirror_leader_max_acquire_attempts_before_break", Option::TYPE_INT, Option::LEVEL_ADVANCED) | |||
.set_default(3) | |||
.set_description("number of failed attempts to acquire lock after missing heartbeats before breaking lock"), | |||
|
|||
Option("rbd_mirror_image_policy_update_check_interval", Option::TYPE_INT, Option::LEVEL_ADVANCED) | |||
.set_default(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: add a .set_min(1)
as well
src/common/options.cc
Outdated
@@ -5276,6 +5276,14 @@ static std::vector<Option> get_rbd_mirror_options() { | |||
Option("rbd_mirror_leader_max_acquire_attempts_before_break", Option::TYPE_INT, Option::LEVEL_ADVANCED) | |||
.set_default(3) | |||
.set_description("number of failed attempts to acquire lock after missing heartbeats before breaking lock"), | |||
|
|||
Option("rbd_mirror_image_policy_update_check_interval", Option::TYPE_INT, Option::LEVEL_ADVANCED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: check
-> throttle
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe a double instead of int so that you can advance the state machine faster?
src/tools/rbd_mirror/CMakeLists.txt
Outdated
@@ -1,6 +1,7 @@ | |||
add_library(rbd_mirror_types STATIC | |||
instance_watcher/Types.cc | |||
leader_watcher/Types.cc) | |||
leader_watcher/Types.cc | |||
image_map/Types.cc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: alphabetical order
src/tools/rbd_mirror/CMakeLists.txt
Outdated
@@ -21,6 +22,12 @@ set(rbd_mirror_internal | |||
types.cc | |||
image_map/LoadRequest.cc | |||
image_map/UpdateRequest.cc | |||
image_map/ImageAction.cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: alphabetical order
|
||
derr << ": invalid transition (action_type=" << action_type << ", state=" | ||
<< state << ")" << dendl; | ||
return {SUB_ACTION_TYPE_INVALID, STATE_UNASSIGNED}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: assert?
} | ||
}; | ||
|
||
Mutex m_lock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the lock required? Looks like the Policy::m_map_lock
is always locked when using the ImageAction
instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, its not required. But I'm thinking if at all locking in ImageAction
is required in future. Also, this lock is never contented as of now (due to all access is under Policy::m_map_lock
), so its pretty much lightweight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally -- I'd yank it since ImageAction
should always be used under the image map's lock
RWLock::WLocker map_lock(m_map_lock); | ||
|
||
for (auto const &global_image_id : global_image_ids) { | ||
ImageAction::SubActionType sub_action_type = m_image_action.sub_action_start(global_image_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invoking this method will invoke one of the subaction callbacks -- might be nice if the switch below could be eliminated and instead have those callbacks queue themselves in the set/remove update lists.
if (r == 0) { | ||
policy->notify_listener(to_map, to_unmap, to_purge); | ||
} else { | ||
// on failure, resume pending actions (if any) for this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On failure I don't think it would be safe to proceed to the action. If you failed to notify an instance, try again until that instance is removed (marked dead) so that you know it's safe to proceed.
// on an ack, finish the current sub-action and schedule the | ||
// next if available or start a next action if finished. | ||
// on a nack: directly finish the current action (not that | ||
// this will invoke the context callback if any with a -ve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: -ve
-> negative
5d92b41
to
1e438dc
Compare
@dillaman couldn't write up a list of changes in this current version (from the last version) of the PR. |
src/tools/rbd_mirror/ImageMap.h
Outdated
// retry friendly callback contexts for sub actions. note that | ||
// callback context when an action is completed need not be | ||
// of this type as its just invoked once, upon completion. | ||
struct SubActionContext : Context { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel there could be a better way of implementing this abstract class and how its used by the derived classes below (SubActionMapContext
etc..). Thoughts?
1e438dc
to
dc6d022
Compare
changes in the latest update (w/ comments incorporated)
|
src/tools/rbd_mirror/ImageAction.h
Outdated
|
||
// NOTE: this class assumes callers take care of serialization when | ||
// invoking class APIs | ||
class ImageAction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: since this actually is a per-image container of actions, perhaps it should be named ImageActions
(plural)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done -- this and moved to image_map
.
src/tools/rbd_mirror/ImageAction.h
Outdated
@@ -0,0 +1,124 @@ | |||
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move to image_map
subdirectory / namespace since it's only used by ImageMap
src/tools/rbd_mirror/CMakeLists.txt
Outdated
@@ -20,6 +23,9 @@ set(rbd_mirror_internal | |||
Threads.cc | |||
types.cc | |||
image_map/LoadRequest.cc | |||
image_map/Policy.cc | |||
image_map/SimplePolicy.cc | |||
image_map/Types.cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: duplicate -- already in rbd_mirror_types
above
@@ -0,0 +1,26 @@ | |||
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: move to image_map
directory / namespace (and drop the ImageMap
prefix since it would be duplicated by the namespace. Could also add it to the existing image_map/Types.h
file.
src/tools/rbd_mirror/ImageMap.h
Outdated
|
||
std::unique_ptr<image_map::Policy<ImageCtxT> > m_policy; // our mapping policy | ||
|
||
Context *m_timer_task; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: default initialize to nullptr
here instead of the constructor
src/tools/rbd_mirror/ImageMap.cc
Outdated
|
||
for (auto const &global_image_id : global_image_ids) { | ||
std::string instance_id = m_policy->lookup(global_image_id); | ||
assert(instance_id != Policy<I>::UNMAPPED_INSTANCE_ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: probably should just continue if the instance is already unmapped. It might be possible for a duplicate remove event to fire.
src/tools/rbd_mirror/ImageMap.cc
Outdated
|
||
// removal notification will be notified instantly. this is safe | ||
// even after scheduling action for images as we still hold m_lock | ||
notify_listener_remove_images(mirror_uuid, to_remove); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: wrap w/ if (!mirror_uuid.empty())
src/tools/rbd_mirror/ImageMap.cc
Outdated
<< peer_uuid << dendl; | ||
|
||
auto rm = m_image_info[global_image_id].registered_peers.erase(peer_uuid); | ||
assert(rm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: might be possible that peer sends duplicate removal event, so just ignore.
src/tools/rbd_mirror/ImageMap.h
Outdated
|
||
void complete(int r) override { | ||
finish(r); | ||
image_map->set_retry_instance_id(global_image_id, instance_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if ImageAction
and Policy
are better integrated, the policy would already know where the image is currently mapped and where the desired next mapping is located. Therefore, you wouldn't need to separately track where to re-send notifications.
src/tools/rbd_mirror/ImageMap.h
Outdated
instance_id(instance_id) { | ||
} | ||
|
||
void update_map_sub_action() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why have these "helper" methods instead of just directly invoking the methods from the appropriate finish
override?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong reason to have it this way... SubActionContext
invokes set_retry_instance_id()
which requires it to record global_image_id
and instance_id
. Alternatively, each derived class could have accessed the base class member variable (via SubActionContext::
) and invoked the appropriate methods in finish()
. Just found this a bit cleaner.
I'll relook at this w.r.t. your comment on integrating Policy
and ImageActions
.
f36c439
to
00b25b3
Compare
@dillaman PR updated with changes. |
a9fbd19
to
6d88a0b
Compare
std::map<std::string, ActionState> m_actions; | ||
std::map<std::string, utime_t> m_shuffled_timestamp; | ||
|
||
std::set<std::string> m_dead_instances; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to retain this as policy keeps the instance around till all images mapping to an instance are unmapped (migrated) when an instance is removed.
RWLock::WLocker map_lock(m_map_lock); | ||
|
||
for (auto const &it : image_mapping) { | ||
map(it.first, it.second.instance_id, utime_t(0, 0), m_map_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should persist the "mapped" timestamp within the on-disk image map so a failure of primary still honors the migration timeout upon recovery. This could be added in a add-on commit to this PR so I don't need to keep re-reviewing large sections.
virtual ~Policy() { | ||
} | ||
|
||
struct Action { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: does this need to be public? It also doesn't seem to be tied to Policy
so perhaps it could be moved to its own stand-alone .h/.cc files?
std::set<std::string> *remap_global_image_ids); | ||
|
||
// state machine related methods | ||
void init_state(const std::string &global_image_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK -- so here is my proposal to simplify this interface. The ImageMap
should never construct actions. Instead, provide add_image(global_image_id, Context*, ...)
, remove_image(global_image_id, Context*, ...)
, and shuffle_image(global_image_id, Context, ...)
. The add_image
method would replace init_state
, the internal action state machine would automatically handle releasing the image's state if no further actions are scheduled (removing the need for release_state
and actions_pending
).
When combined with the suggestions to eliminate the subactions, you would probably then only need to have start_next_action(global_image_id)
and finish_action(global_image_id, int r)
.
src/tools/rbd_mirror/ImageMap.cc
Outdated
Context *on_purge = new SubActionPurgeContext(this, global_image_id); | ||
|
||
Context *on_finish = new FunctionContext( | ||
[this, global_image_id, on_unmap, on_purge](int r) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: break this out into its own handle_remove_action
method instead of using a lambda. It has the added bonus that the dout
at the top of the handle
method helps state tracking
src/tools/rbd_mirror/ImageMap.cc
Outdated
Context *on_map = new SubActionMapContext(this, global_image_id); | ||
|
||
Context *on_finish = new FunctionContext([this, global_image_id, on_map](int r) { | ||
assert(m_lock.is_locked()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: break this out into its own handle_add_action
method instead of using a lambda. It has the added bonus that the dout
at the top of the handle
method helps state tracking
ActionType action_type; | ||
State current_state; | ||
Transition transition; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to have an error_transition
to automatically handle the failure path depending on the current action+state. For example, I think we should assume that acquire/release notifications to instances should only fail after the peer has been blacklisted. The only retry-able case would be when updating the on-disk map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarity -- retries with peer would be handled by the listener (of ImageMap
class)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes -- InstanceWatcher
will retry the peer until successful or until the peer has been blacklisted (this last part needs to be handled).
STATE_UNASSIGNED = 0, | ||
STATE_ASSOCIATED, | ||
STATE_DISASSOCIATED, | ||
STATE_PURGING, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: STATE_UPDATING_MAP
ACTION_TYPE_SHUFFLE, | ||
}; | ||
|
||
enum SubActionType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost wonder if we can just eliminate the subactions and promote the few transitory states to first-level states (i.e. associating, dissociating, updating_map). There should be two idle states (associated and unassociated) where you can start new actions and transition to new transitory states.
src/tools/rbd_mirror/ImageMap.cc
Outdated
} | ||
|
||
template <typename I> | ||
bool ImageMap<I>::start_next_action(const std::string &global_image_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it appears that start_next_action
is always associated w/ scheduling an action if possible, so perhaps instead of returning a bool
here, just schedule the action if possible.
|
||
// shuffle images when instances are added/removed | ||
void add_instances(const std::vector<std::string> &instance_ids, | ||
std::set<std::string> *remap_global_image_ids, bool shuffle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is shuffle
ever false
besides in the constructor? If not, I think it can be dropped since at initialization, there isn't anything to shuffle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually this can be removed, but not due to the reason of it always being false
but because the map is empty when Policy
is instantiated -- so do_shuffle_add_instances()
would not do anything.
876e273
to
9a47114
Compare
Also, a "simple" policy implementation that maps M images to N instances (M/N per rbd mirror daemon instance). Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Fixes: http://tracker.ceph.com/issues/18786 Signed-off-by: Venky Shankar <vshankar@redhat.com>
9a47114
to
4a00edf
Compare
changes in this update:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
this change breaks the "make check" on aarch64 consistently. see https://jenkins.ceph.com/job/ceph-pull-requests-arm64/9051/consoleFull#-2069175742d63714d2-c8d8-41fc-a9d4-8dee30be4c32
|
I'll look into it. |
|
||
struct PolicyMetaUnknown { | ||
static const PolicyMetaType TYPE = static_cast<PolicyMetaType>(-1); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vshankar i think this is suspicious. as the underlying type of enum is unsigned
by default. and casting it to and from uint32_t
does not always work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point here to to assign it to an unused ID (i.e. 0xFFFFFFFF)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my concern was that the MSBs of the number could be removed when being casted. but that's not the case.
Part of tracker: http://tracker.ceph.com/issues/18786